The colorectal cancer proteome


Colorectal cancer is the third most common cancer in the world and the fifth leading cause of cancer-related mortality. Environmental factors, including meat consumption, have been identified as important risk factors. The overall mortality is approximately 50%. The surgical stage at diagnosis is the most important factor for predicting prognosis, and the survival rate varies greatly depending on stage: the 5-year survival rate is more than 90% for stage I and less than 10% for stage IV. Most colorectal cancer cases are detected at an advanced stage. Bleeding and hematochezia are two of the most common symptoms associated with rectal lesions.

Colorectal cancer is considered to originate from normal colon epithelium that develops into precursor lesions termed adenomas, which may subsequently progress to invasive colorectal adenocarcinomas with metastatic potential. Colorectal cancers are divided into two subtypes, colon adenocarcinomas (COAD) and rectum adenocarcinomas (READ), depending on the site of the tumor.

Here, we explore the colorectal cancer proteome using TCGA transcriptomics data and antibody-based protein data. A total of 595 genes are suggested as prognostic based on transcriptomics data from 597 patients: 243 genes are associated with unfavourable prognosis and 352 genes with favourable prognosis.

TCGA data analysis


In this metadata study, we focus on combined colorectal cancer data from TCGA; separate data for colon (COAD) and rectum (READ) adenocarcinomas are also available and presented. Transcriptomics data were available from 597 patients in total: 438 COAD (204 female and 234 male) and 159 READ (71 female and 88 male). Most of the patients (473) were still alive at the time of data collection. The stage distribution was 103 patients at stage I, 215 at stage II, 174 at stage III and 85 at stage IV, with stage information missing for 20 patients.
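The cohort figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that only restates the numbers given in the text; the variable names are illustrative and not part of any TCGA tooling.

```python
# Cohort counts exactly as stated in the text above.
cohort = {
    "COAD": {"female": 204, "male": 234},
    "READ": {"female": 71, "male": 88},
}
stage_counts = {"I": 103, "II": 215, "III": 174, "IV": 85, "missing": 20}
alive_at_collection = 473

# Per-subtype totals and the overall number of patients.
subtype_totals = {name: sum(sexes.values()) for name, sexes in cohort.items()}
total_patients = sum(subtype_totals.values())

# Share of the cohort in each stage, and share still alive at data collection.
stage_fractions = {stage: n / total_patients for stage, n in stage_counts.items()}
alive_fraction = alive_at_collection / total_patients

print(subtype_totals, total_patients)
print({stage: f"{frac:.1%}" for stage, frac in stage_fractions.items()})
print(f"alive at data collection: {alive_fraction:.1%}")
```

The subtype totals and the stage counts both add up to the same overall patient number, which confirms that the figures in the paragraph are internally consistent.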

Unfavourable prognostic genes in colorectal cancer


For unfavourable genes, higher relative expression levels at diagnosis give significantly lower overall survival for the patients. There are 243 genes associated with unfavourable prognosis in colorectal cancer. In Table 1, the top 20 most significant genes related to unfavourable prognosis are listed.

ARHGAP4 is a gene associated with unfavourable prognosis in colorectal cancer. The best separation is achieved by an expression cutoff at 16.8 FPKM, which divides the patients into two groups with 44% 5-year survival for patients with high expression versus 67% for patients with low expression (p-value: 1.19e-5). A survival analysis in the different subtypes showed a significant association only in colon adenocarcinoma. Immunohistochemical staining using an antibody targeting ARHGAP4 (HPA001012) shows a differential expression pattern in colorectal cancer samples.

Figure panels: ARHGAP4 survival analysis (p<0.001); ARHGAP4 high expression; ARHGAP4 low expression.

JDP2 is another gene associated with unfavourable prognosis in colorectal cancer. The best separation is achieved by an expression cutoff at 4.5 FPKM, which divides the patients into two groups with 53% 5-year survival for patients with high expression versus 64% for patients with low expression (p-value: 8.58e-5). A survival analysis in the different subtypes showed a significant association only in colon adenocarcinoma. Immunohistochemical staining using an antibody targeting JDP2 (HPA059511) shows a differential expression pattern in colorectal cancer samples.

Figure panels: JDP2 survival analysis (p<0.001); JDP2 high expression; JDP2 low expression.
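The cutoff-based comparison described for ARHGAP4 and JDP2 can be illustrated with a small sketch: patients are split into high and low expression groups at a given FPKM cutoff, 5-year survival is estimated per group with the Kaplan-Meier method, and the groups are compared with a log-rank test. The text does not state which estimator or test produced the reported values, so both choices are assumptions here, and the patient data below are hypothetical toy values, not TCGA data.

```python
import math

def split_by_cutoff(expression, cutoff):
    """Indices of patients above the cutoff (high) and at or below it (low)."""
    high = [i for i, x in enumerate(expression) if x > cutoff]
    low = [i for i, x in enumerate(expression) if x <= cutoff]
    return high, low

def kaplan_meier(times, events, horizon):
    """Kaplan-Meier survival probability at `horizon`.
    events[i] is True for an observed death, False for censoring."""
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e and t <= horizon}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - deaths / at_risk
    return surv

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; p-value from a chi-square with 1 degree of freedom."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({t for t, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical toy data: expression (FPKM), follow-up (years), death observed?
expression = [22.1, 18.4, 30.0, 17.2, 5.3, 9.8, 12.0, 16.8]
years = [1.0, 2.5, 3.0, 4.0, 6.0, 7.5, 8.0, 5.5]
died = [True, True, True, False, False, True, False, False]

high, low = split_by_cutoff(expression, cutoff=16.8)
pick = lambda values, idx: [values[i] for i in idx]

surv_high = kaplan_meier(pick(years, high), pick(died, high), horizon=5.0)
surv_low = kaplan_meier(pick(years, low), pick(died, low), horizon=5.0)
p_value = logrank_p(pick(years, high), pick(died, high),
                    pick(years, low), pick(died, low))
print(f"5-year survival: high={surv_high:.0%}, low={surv_low:.0%}, p={p_value:.3f}")
```

Finding the "best separation" would additionally require repeating this comparison over a range of candidate cutoffs and keeping the one with the lowest p-value; that search is left out of the sketch.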

Table 1. The 20 genes with highest significance associated with unfavourable prognosis in colorectal cancer.

Gene    Description                                                          Predicted localization   mRNA (cancer)  p-value
LRCH4   leucine-rich repeats and calponin homology (CH) domain containing 4  Intracellular, Membrane  2.5            1.61e-7
POFUT2  protein O-fucosyltransferase 2                                       Intracellular, Secreted  4.8            4.55e-7
CLK3    CDC-like kinase 3                                                    Intracellular            2.2            2.05e-6
EGFL7   EGF-like-domain, multiple 7                                          Secreted                 6.9            4.22e-6
DPP7    dipeptidyl-peptidase 7                                               Intracellular, Secreted  40.4           6.07e-6

Favourable prognostic genes in colorectal cancer


For favourable genes, higher relative expression levels at diagnosis give significantly higher overall survival for the patients. There are 352 genes associated with favourable prognosis in colorectal cancer. In Table 2, the top 20 most significant genes related to favourable prognosis are listed.

ABCD3 is a gene associated with favourable prognosis in colorectal cancer. The best separation is achieved by an expression cutoff at 6.9 FPKM, which divides the patients into two groups with 71% 5-year survival for patients with high expression versus 36% for patients with low expression (p-value: 1.16e-5). A survival analysis in the different subtypes showed a significant association only in colon adenocarcinoma. Immunohistochemical staining using an antibody targeting ABCD3 (HPA032026) shows a differential expression pattern in colorectal cancer samples.

Figure panels: ABCD3 survival analysis (p<0.001); ABCD3 high expression; ABCD3 low expression.

MCM4 is another gene associated with favourable prognosis in colorectal cancer. The best separation is achieved by an expression cutoff at 16.5 FPKM, which divides the patients into two groups with 65% 5-year survival for patients with high expression versus 44% for patients with low expression (p-value: 2.19e-4). A survival analysis in the different subtypes showed a significant association only in colon adenocarcinoma. Immunohistochemical staining using an antibody targeting MCM4 (HPA004873) shows a differential expression pattern in colorectal cancer samples.

Figure panels: MCM4 survival analysis (p<0.001); MCM4 high expression; MCM4 low expression.

Table 2. The 20 genes with highest significance associated with favourable prognosis in colorectal cancer.

Gene   Description                              Predicted localization   mRNA (cancer)  p-value
RBM3   RNA binding motif (RNP1, RRM) protein 3  Intracellular, Secreted  85.3           3.07e-7
NOL11  nucleolar protein 11                     Intracellular            12.8           1.00e-6
USP53  ubiquitin specific peptidase 53          Intracellular            4.1            3.15e-6
TEX2   testis expressed 2                       Intracellular, Membrane  5.7            3.90e-6
HOOK1  hook microtubule-tethering protein 1     Intracellular            7.8            4.53e-6

The colorectal cancer transcriptome


The transcriptome analysis shows that 68% (n=13281) of all human genes (n=19571) are expressed in colorectal cancer. All genes were classified according to their colorectal cancer-specific expression into one of five categories, based on the ratio between mRNA levels in colorectal cancer and mRNA levels in the other 16 analyzed cancer tissues. In total, 168 genes show some level of elevated expression in colorectal cancer compared to other cancers (Figure 1). The elevated category is further subdivided into three categories, as shown in Table 3.

Figure 1. The distribution of all genes across the five categories based on transcript abundance in colorectal cancer as well as in all other cancer tissues.

Table 3. Number of genes in the subdivided categories of elevated expression in colorectal cancer

Category         Number of genes  Description
Tissue enriched  21               At least five-fold higher mRNA levels in a particular cancer as compared to all other cancers
Group enriched   74               At least five-fold higher mRNA levels in a group of 2-7 cancers
Tissue enhanced  73               At least five-fold higher mRNA levels in a particular cancer as compared to average levels in all cancers
Total            168              Total number of elevated genes in colorectal cancer
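The three elevated categories in Table 3 can be read as a sequence of fold-change tests on one gene's mRNA levels across the analyzed cancers. The sketch below is one possible reading of those definitions, not the pipeline actually used: the function name, the order in which the rules are applied, the way a "group" is formed (the top-ranked cancers compared against the highest level outside the group), and the example values are all assumptions made for illustration.

```python
def classify_elevation(levels, target, fold=5.0, max_group=7):
    """Classify one gene for the `target` cancer.

    levels: dict mapping cancer name -> mRNA level for this gene.
    Returns 'tissue enriched', 'group enriched', 'tissue enhanced'
    or 'not elevated'.
    """
    target_level = levels[target]
    if target_level <= 0:
        return "not elevated"

    # Tissue enriched: at least `fold` times every other cancer.
    others = [v for c, v in levels.items() if c != target]
    if target_level >= fold * max(others):
        return "tissue enriched"

    # Group enriched (assumed reading): the target belongs to a group of
    # 2..max_group top-ranked cancers whose lowest member is at least
    # `fold` times the highest level outside the group.
    ranked = sorted(levels, key=levels.get, reverse=True)
    for size in range(2, max_group + 1):
        if size >= len(ranked):
            break
        group, rest = ranked[:size], ranked[size:]
        lowest_in_group = levels[group[-1]]
        if (target in group and lowest_in_group > 0
                and lowest_in_group >= fold * levels[rest[0]]):
            return "group enriched"

    # Tissue enhanced: at least `fold` times the average over all cancers.
    mean_level = sum(levels.values()) / len(levels)
    if target_level >= fold * mean_level:
        return "tissue enhanced"
    return "not elevated"

# Hypothetical levels for one gene across 17 cancers (colorectal + 16 others).
background = {f"cancer_{i}": 1.0 for i in range(16)}
enriched = classify_elevation({**background, "colorectal": 10.0}, "colorectal")
grouped = classify_elevation(
    {**background, "cancer_0": 10.0, "colorectal": 10.0}, "colorectal")
enhanced = classify_elevation(
    {**background, "cancer_0": 3.0, "colorectal": 10.0}, "colorectal")
flat = classify_elevation({**background, "colorectal": 1.0}, "colorectal")
print(enriched, grouped, enhanced, flat, sep=" | ")
```

Checking the strictest rule first means each gene is assigned to exactly one category, which matches the way the three counts in Table 3 add up to the total.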

Additional information


Appropriate diagnosis and staging are crucial for determining the best choice of treatment. The surgical stage represents a classification system based on the extent and depth of tumor growth. Stage I colorectal cancer shows invasive growth into the anatomical layers of the large intestine, but the tumor has not spread beyond the tissue of origin. Stage II colorectal cancer shows extended growth through the outer layer of the large intestine (peritoneum) and may have extended into nearby organs, but has not spread to any lymph node. Stage III colorectal cancer has spread to nearby lymph nodes but has not yet metastasized to distant sites in the body. Finally, in Stage IV colorectal cancer the tumor has spread to distant organs such as the liver, lungs, or other sites. The Dukes classification is an older and less complicated staging system that predates the TNM system, and translates so that Dukes A = Stage I, Dukes B = Stage II, Dukes C = Stage III and Dukes D = Stage IV.
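The staging logic described above amounts to a short decision cascade plus a lookup table for the Dukes translation. The following sketch encodes only what the paragraph states; the function and parameter names are hypothetical, and the three yes/no inputs are a deliberate simplification of real staging, which relies on detailed TNM criteria.

```python
# Translation between surgical stage and Dukes class, as stated in the text.
STAGE_TO_DUKES = {"I": "A", "II": "B", "III": "C", "IV": "D"}

def simplified_stage(through_outer_layer, lymph_nodes_involved, distant_metastasis):
    """Assign a stage from three yes/no findings (illustrative simplification).

    The most advanced finding decides the stage, so the checks run from
    distant spread down to local growth.
    """
    if distant_metastasis:
        return "IV"
    if lymph_nodes_involved:
        return "III"
    if through_outer_layer:
        return "II"
    return "I"

def dukes_from_stage(stage):
    """Return the Dukes class corresponding to a stage I-IV."""
    return STAGE_TO_DUKES[stage]

stage = simplified_stage(through_outer_layer=True,
                         lymph_nodes_involved=True,
                         distant_metastasis=False)
print(f"Stage {stage} corresponds to Dukes {dukes_from_stage(stage)}")
```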

Early colorectal cancer, where tumor spread is restricted to the large intestine, is treated surgically, whereas chemotherapy is used for more advanced stages where the tumor has spread to other organs. Anti-EGFR treatment is one recently introduced therapy. Epidermal growth factor receptor (EGFR) is commonly expressed in colorectal tumors, and monoclonal antibodies inhibiting EGFR demonstrate clinical efficacy in patients with tumors that do not harbor downstream activating KRAS mutations. Today, KRAS mutation status is analyzed routinely before starting anti-EGFR treatment.

The vast majority of colorectal cancers are adenocarcinomas, with less than 10% of the cancers being distinguished by an abundant secretion of mucin. The tumors are classified according to the degree of morphological differentiation into well, moderately and poorly differentiated. About 80% are well or moderately differentiated, with a growth pattern consisting of tumor cells that form irregular glandular structures present at different layers of the bowel wall. Poorly differentiated colorectal cancers show no, or only slight, glandular formation. Generally, poor differentiation is associated with poor prognosis; however, there is no firmly established system for measuring the grade of differentiation. Therefore, treatment decisions are based on the surgical stage and not on morphological features. Apart from adenocarcinomas, endocrine tumors can also arise within the colorectal mucosa. Squamous and adenosquamous tumors are exceedingly rare.

In addition to microscopic examination of biopsies, immunohistochemistry can be used to determine the colorectal origin of a metastasis or to visualize the spread of tumor cells in surrounding tissues. Tumors of colorectal origin are immunoreactive toward cytokeratin 20, CDX-2, SATB2 and cadherin-17. Chromogranin-A antibodies can be used to distinguish endocrine tumors in the bowel from adenocarcinomas.

Relevant links and publications


Uhlén M et al, 2017. A pathology atlas of the human cancer transcriptome. Science.
PubMed: 28818916 DOI: 10.1126/science.aan2507

Cancer Genome Atlas Research Network et al, 2013. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet.
PubMed: 24071849 DOI: 10.1038/ng.2764

Uhlén M et al, 2015. Tissue-based map of the human proteome. Science.
PubMed: 25613900 DOI: 10.1126/science.1260419

Gremel G et al, 2014. The human gastrointestinal tract-specific transcriptome and proteome as defined by RNA sequencing and antibody-based profiling. J Gastroenterol.
PubMed: 24789573 DOI: 10.1007/s00535-014-0958-7

De Rosa M et al, 2015. Genetics, diagnosis and management of colorectal cancer (Review). Oncol Rep.
PubMed: 26151224 DOI: 10.3892/or.2015.4108

Histology dictionary - Colorectal cancer