Download
The PhosCancer database provides downloadable files in TXT format, including:
- Phosphoproteomics: The preprocessed phosphoproteome data. Data from MS-GF+ and MASIC, combined with the phosphosite localization tool Ascore, were processed by replacing all 0 values with NA, followed by median centering and then log2 transformation. In the matrix, rows represent phosphosites and columns represent samples. "Corrected with Protein" indicates that the statistical analysis is based on phosphorylation data adjusted for protein expression.
- Proteomics: The preprocessed protein data. Data from MS-GF+ and MASIC were processed by replacing all 0 values with NA, followed by median centering and then log2 transformation. In the matrix, rows represent proteins and columns represent samples.
- Hallmark activity: The preprocessed matrix of hallmark activities, estimated from protein expression using the ssGSEA method. Data from MS-GF+ and MASIC were processed by replacing all 0 values with NA, followed by median centering and then log2 transformation. Proteins with more than 50% missing values were excluded, and the remaining missing values were imputed with a k-nearest neighbors procedure. ssGSEA was then run to estimate the hallmark activities. In the matrix, rows represent hallmarks and columns represent samples.
- Upstream kinase: The result of the correlation analysis between phosphorylation levels and kinase expression. Phosphosites with more than 80% missing values were excluded, as were kinases with fewer than 10 non-missing expression values among the samples in which the phosphosite is non-missing.
| Column | Description |
| --- | --- |
| Cancer | Cancer type |
| Site | Phosphosite |
| TumorSampleSize | The number of tumor samples with non-missing values |
| MostKinase | MostKinase is identified as the top 10 kinases with the highest absolute correlation values among those for which the correlation p-value is less than 0.05 |
| SpearmanP | P value from Spearman's correlation analysis |
| SpearmanFDR | FDR for p value from Spearman's correlation analysis |
| SpearmanE | Correlation coefficient from Spearman's correlation analysis |
| PearsonP | P value from Pearson's correlation analysis |
| PearsonFDR | FDR for p value from Pearson's correlation analysis |
| PearsonE | Correlation coefficient from Pearson's correlation analysis |
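
The exclusion rules and the MostKinase selection described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the database's own pipeline: the function names, the use of `None` for NA, and the `(kinase, rho, p_value)` tuple layout are all hypothetical, and the correlation values in the example are made up.

```python
from typing import Optional, Sequence

Value = Optional[float]  # None stands in for NA (assumption of this sketch)


def site_passes(site: Sequence[Value], max_missing: float = 0.80) -> bool:
    """Keep a phosphosite unless more than 80% of its values are missing."""
    missing = sum(v is None for v in site)
    return missing / len(site) <= max_missing


def kinase_passes(site: Sequence[Value], kinase: Sequence[Value],
                  min_pairs: int = 10) -> bool:
    """Keep a kinase that has at least 10 expression values among the
    samples in which the phosphosite itself is non-missing."""
    pairs = sum(s is not None and k is not None for s, k in zip(site, kinase))
    return pairs >= min_pairs


def most_kinases(results, top_n: int = 10, alpha: float = 0.05):
    """results: (kinase, rho, p_value) tuples for one phosphosite.

    Returns up to top_n kinase names with p < alpha, ordered by
    absolute correlation, largest first.
    """
    significant = [r for r in results if r[2] < alpha]
    significant.sort(key=lambda r: abs(r[1]), reverse=True)
    return [name for name, _, _ in significant[:top_n]]


# Made-up correlation results for a single phosphosite.
example = [("AKT1", 0.62, 0.001), ("CDK1", -0.71, 0.0004), ("MAPK1", 0.80, 0.20)]
print(most_kinases(example))  # ['CDK1', 'AKT1']
```

MAPK1 has the largest coefficient in the example but is dropped because its p-value is not below 0.05, which is the behavior the MostKinase description implies.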
- DE: The result of the differential analysis of phosphorylation levels between tumor and normal samples. This analysis excluded phosphosites with more than 80% missing values in either tumor or normal samples.
| Column | Description |
| --- | --- |
| Cancer | Cancer type |
| Site | Phosphosite |
| TumorSampleSize | The number of tumor samples with non-missing values |
| NormalSampleSize | The number of normal samples with non-missing values |
| TumorSampleMean | The mean phosphorylation level of tumor samples |
| NormalSampleMean | The mean phosphorylation level of normal samples |
| log2FC | log2 fold change (Tumor/Normal) |
| wilcoxP | P value from Wilcoxon test |
| wilcoxFDR | FDR for p value from Wilcoxon test |
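
The FDR columns in these tables are adjusted p-values. The text does not say which adjustment was used; the sketch below assumes the Benjamini-Hochberg procedure, a common default, purely to illustrate how a column such as wilcoxFDR relates to wilcoxP. The function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here for illustration only; the source does
    not name the FDR method.
    """
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(q, 4) for q in fdr])  # [0.04, 0.0533, 0.0533, 0.2]
```

Note that the adjustment depends on how many phosphosites are tested together, so recomputing it on a filtered subset of rows will generally not reproduce the downloaded FDR values.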
- Stage: Differential analysis result of phosphorylation levels across tumor stages. This analysis excluded phosphosites with missing values exceeding 80% in tumor samples.
| Column | Description |
| --- | --- |
| Cancer | Cancer type |
| Site | Phosphosite |
| TumorSampleSize | The number of tumor samples with non-missing values |
| TumorSampleMean | The mean phosphorylation level of tumor samples |
| kruskalP | P value from Kruskal-Wallis test |
| kruskalFDR | FDR for p value from Kruskal-Wallis test |
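
For readers unfamiliar with the Kruskal-Wallis test behind kruskalP, the following is a minimal sketch of its H statistic for one phosphosite across stage groups. The function names, the grouping into lists, and the use of `None` for NA are assumptions of this sketch. It computes only the statistic; the p-value comes from a chi-square distribution with (number of groups - 1) degrees of freedom, which in practice is easier to obtain from a statistics library.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups.

    Missing values (None) are dropped; every group must keep at least
    one value, and the pooled values must not all be identical.
    """
    groups = [[v for v in g if v is not None] for g in groups]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


# Made-up phosphorylation levels for three hypothetical stage groups.
stages = [[1.0, 2.0, 3.0], [4.0, None, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(round(kruskal_h(stages), 4))  # 7.2
```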
- Survival: The result of the survival analysis. A phosphosite was retained for this analysis only if no more than 80% of its values were missing in tumor samples and more than one death event occurred among the samples with non-missing values.
| Column | Description |
| --- | --- |
| Cancer | Cancer type |
| Site | Phosphosite |
| OSHazardRatio | Hazard ratio for overall survival |
| OSCoxP | P value from Cox regression analysis for overall survival |
| OSKMcutpointP | P value from the log-rank test using an optimal cutpoint for overall survival |
| OSKMMedianP | P value from the log-rank test using a median cutoff for overall survival |
| DFSHazardRatio | Hazard ratio for disease-free survival |
| DFSCoxP | P value from Cox regression analysis for disease-free survival |
| DFSKMcutpointP | P value from the log-rank test using an optimal cutpoint for disease-free survival |
| DFSKMMedianP | P value from the log-rank test using a median cutoff for disease-free survival |
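
The median-cutoff columns (OSKMMedianP, DFSKMMedianP) can be illustrated with a small sketch: apply the retention rule, split samples at the median phosphorylation level, and compare the two groups with a log-rank test. All function names are hypothetical, `None` stands in for NA, and assigning values strictly above the median to the "high" group is an assumption; the source does not say how samples equal to the median are handled. Cox regression and the optimal-cutpoint search are not covered here.

```python
import math
from statistics import median


def keep_for_survival(levels, events, max_missing=0.80):
    """At most 80% missing values and more than one death event among
    the samples with a non-missing phosphorylation level."""
    observed = [e for v, e in zip(levels, events) if v is not None]
    missing = len(levels) - len(observed)
    return missing / len(levels) <= max_missing and sum(observed) > 1


def median_split(levels):
    """True for levels above the median, False otherwise, None for NA."""
    cutoff = median(v for v in levels if v is not None)
    return [None if v is None else v > cutoff for v in levels]


def logrank_p(times, events, in_high):
    """Two-group log-rank test; p-value from chi-square with 1 df.

    times: follow-up times; events: 1 for death, 0 for censored;
    in_high: True/False group label per sample (no missing labels).
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)  # at risk, both groups
        n_high = sum(1 for u, g in zip(times, in_high) if u >= t and g)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d_high = sum(1 for u, e, g in zip(times, events, in_high)
                     if u == t and e and g)
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0  # groups cannot be compared
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of chi-square with 1 df, via the normal tail.
    return math.erfc(math.sqrt(chi2 / 2))


# Made-up data: four samples, all with an observed death.
levels = [0.1, 0.2, 0.8, 0.9]
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
if keep_for_survival(levels, events):
    print(logrank_p(times, events, median_split(levels)))
```

Samples with a missing phosphorylation level would need to be dropped from `times`, `events`, and the group labels before calling `logrank_p`.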
- Hallmark: Spearman's correlation analysis between phosphorylation levels and the activities of hallmark-based protein expression, using the ssGSEA method. This analysis excluded phosphosites with missing values exceeding 80% in tumor samples.
| Column | Description |
| --- | --- |
| Cancer | Cancer type |
| Site | Phosphosite |
| TumorSampleSize | The number of tumor samples with non-missing values |
| MostHallmark | MostHallmark is identified as the top hallmark with the highest absolute correlation value among those for which the correlation p-value is less than 0.05. |
| SpearmanP | P value from Spearman's correlation analysis |
| SpearmanFDR | FDR for p value from Spearman's correlation analysis |
| SpearmanE | Correlation coefficient from Spearman's correlation analysis |
| PearsonP | P value from Pearson's correlation analysis |
| PearsonFDR | FDR for p value from Pearson's correlation analysis |
| PearsonE | Correlation coefficient from Pearson's correlation analysis |
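
The preprocessing described at the top of this section for the Phosphoproteomics, Proteomics, and Hallmark activity matrices (zeros replaced with NA, median centering, log2 transformation, and the 50% missing-value filter) can be sketched as follows. Several details are assumptions, since the text does not specify them: centering is done per sample (column), it is done by dividing by the sample median so that the subsequent log2 is always defined, and `None` stands in for NA. The function names are hypothetical; k-nearest neighbors imputation and ssGSEA are outside the scope of this sketch.

```python
import math
from statistics import median


def preprocess(matrix):
    """matrix: rows are features, columns are samples, raw intensities.

    Returns a matrix of the same shape with zeros turned into None,
    each sample divided by its median, and values log2-transformed.
    """
    # Step 1: treat zeros as missing.
    cleaned = [[None if (v is None or v == 0) else v for v in row]
               for row in matrix]
    # Step 2: median of the observed values in each sample (column).
    n_samples = len(cleaned[0])
    medians = []
    for j in range(n_samples):
        observed = [row[j] for row in cleaned if row[j] is not None]
        medians.append(median(observed) if observed else None)
    # Step 3: center on the sample median, then log2-transform.
    return [[None if v is None else math.log2(v / medians[j])
             for j, v in enumerate(row)]
            for row in cleaned]


def drop_sparse_rows(matrix, max_missing=0.50):
    """Exclude rows with more than 50% missing values."""
    return [row for row in matrix
            if sum(v is None for v in row) / len(row) <= max_missing]


# Made-up intensities: three features by two samples.
raw = [[2.0, 8.0],
       [4.0, 0.0],
       [8.0, 8.0]]
print(preprocess(raw))  # [[-1.0, 0.0], [0.0, None], [1.0, 0.0]]
```

Because the downloadable matrices are already preprocessed, these steps should not be applied to them again; the sketch is only meant to clarify what the values represent.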