RNA category is based on mRNA expression levels in the analyzed samples. The categories are: tissue/cell line enriched, group enriched, tissue/cell line enhanced, expressed in all, mixed, and not detected. RNA category is calculated separately for The Cancer Genome Atlas (TCGA) data from cancer tissues and for internally generated Human Protein Atlas (HPA) data from normal tissues and cell lines.
TCGA (cancer tissue):
HPA (cell line): Group enriched (HEL, HMC-1, K-562)
HPA (normal tissue): Tissue enriched (bone marrow)
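The text names the RNA categories but does not give the rules behind them. The sketch below shows one way such a classification could work. Everything numeric in it is an assumption made for illustration: the five-fold ratio (`fold`), the detection cutoff (`detection`), the maximum group size (`max_group`), and the expression values in `cell_lines` are hypothetical and are not taken from this page.

```python
from statistics import fmean


def rna_category(levels, fold=5.0, detection=1.0, max_group=7):
    """Classify one gene from a {sample_name: expression} mapping.

    All thresholds are assumptions for this sketch, not values from the text.
    Returns (category, list_of_samples_driving_the_category).
    """
    ranked = sorted(levels.items(), key=lambda kv: kv[1], reverse=True)
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]

    if not values or values[0] < detection:
        return "not detected", []

    # Enriched: the top sample is at least `fold` times the runner-up.
    if len(values) > 1 and values[0] >= fold * values[1]:
        return "enriched", names[:1]

    # Group enriched: the mean of the top k samples is at least `fold`
    # times the highest sample outside the group.
    for k in range(2, min(max_group, len(values) - 1) + 1):
        if fmean(values[:k]) >= fold * values[k]:
            return "group enriched", names[:k]

    # Enhanced: the top sample is at least `fold` times the overall mean.
    if values[0] >= fold * fmean(values):
        return "enhanced", names[:1]

    if values[-1] >= detection:
        return "expressed in all", []
    return "mixed", []


# Hypothetical expression values for five cell lines (not real HPA data).
cell_lines = {"HEL": 60.0, "HMC-1": 40.0, "K-562": 50.0, "HeLa": 0.5, "U-2 OS": 0.2}
category, members = rna_category(cell_lines)
print(category, sorted(members))
```

With these made-up values the three hematopoietic cell lines form the enriched group, which mirrors the shape of the cell line result listed above without reproducing how it was actually computed.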
Protein evidence scores are generated from several independent sources and are classified as evidence at i) protein level, ii) transcript level, iii) no evidence, or iv) not available.
Evidence at protein level
Protein expression normal tissue
A summary of the overall protein expression pattern across the analyzed normal tissues. The summary is based on knowledge-based annotation.
"Estimation of protein expression could not be performed. View primary data." is shown for genes analyzed with a knowledge-based approach where available RNA-seq and gene/protein characterization data has been evaluated as not sufficient in combination with immunohistochemistry data to yield a reliable estimation of the protein expression profile.
Selective nuclear expression in hematopoietic cells.
ANTIBODY IHC RELIABILITY
Data reliability description
Standardized explanatory sentences with additional information required for full understanding of the knowledge-based expression profile.
Antibody staining mainly consistent with RNA expression data.
The reliability score, classified as Supported, Approved, or Uncertain, is evaluated in normal tissues. It is based on how consistent the staining pattern of one or several antibodies is with RNA-seq data and available gene/protein characterization data.
This summary shows Kaplan-Meier plots for all cancers in which high expression of this gene has a significant (p<0.001) association with patient survival. Whether the prognosis is favourable or unfavourable is indicated in brackets. Each Kaplan-Meier plot is clickable and redirects to a detailed page that includes individual expression and survival data for patients with the selected cancer.
Gene product is not prognostic.
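For readers unfamiliar with the survival curves mentioned above, here is a minimal sketch of the standard Kaplan-Meier estimator. The function name and the five-patient data set are hypothetical, and the sketch does not compute the p-value used for the significance cutoff; it only shows how the survival probability steps down at each observed death.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with a death.

    times  : follow-up time per patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up data (years) for five patients.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

In practice one such curve would be estimated per expression group (for example high versus low expression) and the two curves compared with a statistical test.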
RNA EXPRESSION OVERVIEW
RNA expression overview shows RNA-seq data from The Cancer Genome Atlas (TCGA).
RNA-seq data in 17 cancer types are reported as median FPKM (Fragments Per Kilobase of exon per Million reads), generated by The Cancer Genome Atlas (TCGA). RNA cancer tissue category is calculated based on mRNA expression levels across all 17 cancer tissues and includes: cancer tissue enriched, cancer group enriched, cancer tissue enhanced, expressed in all, mixed, and not detected. To access cancer-specific RNA and prognostic data, click on the cancer name.
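The FPKM unit and the per-cancer median can be illustrated with a short calculation. This is a minimal sketch using the standard FPKM definition; the function name and the three sample records are hypothetical and do not come from TCGA.

```python
from statistics import median


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical counts for one gene in three samples of one cancer type.
samples = [
    {"fragments": 200, "exon_length_bp": 2000, "total_mapped_fragments": 20_000_000},
    {"fragments": 900, "exon_length_bp": 2000, "total_mapped_fragments": 30_000_000},
    {"fragments": 400, "exon_length_bp": 2000, "total_mapped_fragments": 10_000_000},
]

values = [fpkm(**s) for s in samples]
median_fpkm = median(values)
print(values, median_fpkm)
```

Normalizing by both transcript length and sequencing depth is what makes values comparable across genes and samples, and the median keeps a single outlier sample from dominating the reported value.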
Antibody staining in 20 different cancers is summarized by a selection of four standard cancer tissue samples representative of the overall staining pattern. From left: colorectal cancer, breast cancer, prostate cancer, and lung cancer. A fifth image can be added as a complement. Note that samples used for immunohistochemistry by the Human Protein Atlas do not correspond to samples in the TCGA dataset.
For each cancer, the fractions of samples with high, medium, low, or not detected protein expression are shown by the blue-scale color coding (as described by the color-coding scale in the box to the left). The length of the bar represents the number of patient samples analyzed (max=12 patients). The images and annotations can be accessed by clicking on the cancer name or protein expression bar. If more than one antibody is analyzed, the tabs at the top of the staining summary section can be used to toggle between the different antibodies. The mouse-over function displays additional data for the features in the staining summary view.
Next to the cancer staining data, the protein expression data of normal tissues or specific cell types corresponding to each cancer are shown and protein expression levels are indicated by the blue-scale color coding.
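The per-cancer bars described above boil down to counting how many samples fall into each staining level. A minimal sketch follows; the function name and the twelve annotations are hypothetical.

```python
from collections import Counter

LEVELS = ("high", "medium", "low", "not detected")


def staining_fractions(annotations):
    """Return the fraction of samples at each staining level."""
    if not annotations:
        raise ValueError("at least one annotated sample is required")
    counts = Counter(annotations)
    unknown = set(counts) - set(LEVELS)
    if unknown:
        raise ValueError(f"unexpected staining levels: {sorted(unknown)}")
    total = len(annotations)
    return {level: counts[level] / total for level in LEVELS}


# Hypothetical annotations for 12 patient samples of one cancer type.
annotations = ["medium"] * 2 + ["low"] * 4 + ["not detected"] * 6
fractions = staining_fractions(annotations)
print(len(annotations), fractions)
```

Here the bar length would correspond to `len(annotations)` and the colored segments to the values in `fractions`.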
A manually written summary of the overall protein expression pattern across the analyzed cancer tissues.
Malignant cells displayed weak to moderate cytoplasmic staining.
Gene information from Ensembl and Entrez, as well as links to available gene identifiers, is displayed here. Information was retrieved from Ensembl unless otherwise indicated.
GATA1 (HGNC Symbol)
ERYF1, GATA-1, GF1, NF-E1, NFE1
GATA binding protein 1 (globin transcription factor 1) (HGNC Symbol)
Entrez gene summary
This gene encodes a protein which belongs to the GATA family of transcription factors. The protein plays an important role in erythroid development by regulating the switch of fetal hemoglobin to adult hemoglobin. Mutations in this gene have been associated with X-linked dyserythropoietic anemia and thrombocytopenia. [provided by RefSeq, Jul 2008]
The protein browser displays the antigen location on the target protein(s) and the features of the target protein. The tabs at the top of the protein view section can be used to switch between the different splice variants to which an antigen has been mapped.
At the top of the view, the position of the antigen (identified by the corresponding HPA identifier) is shown as a green bar. A yellow triangle on the bar indicates a <100% sequence identity to the protein target.
Under the antigens, the maximum percent sequence identity of the protein to all other proteins from other human genes is displayed, using a sliding window of 10 aa residues (HsID 10) or 50 aa residues (HsID 50).
A signal peptide (turquoise) is displayed if it is predicted by a majority of the signal peptide predictors SPOCTOPUS, SignalP 4.0, and Phobius. Transmembrane regions (orange) are displayed if they are predicted by MDM.
Low complexity regions are shown in yellow and InterPro regions in green. Common (purple) and unique (grey) regions between different splice variants of the gene are also displayed, and at the bottom of the protein view is the protein scale.
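The sliding-window identity track described above can be approximated as follows. This is a simplified sketch that assumes ungapped, position-by-position comparison of equal-length windows; the function name and the toy sequences are hypothetical, and the actual pipeline behind HsID 10 and HsID 50 may differ.

```python
def max_window_identity(target, others, window=10):
    """For each window start in `target`, return the highest percent identity
    to any same-length window in any sequence in `others` (no gaps)."""
    profile = []
    for i in range(len(target) - window + 1):
        target_window = target[i:i + window]
        best = 0.0
        for other in others:
            for j in range(len(other) - window + 1):
                other_window = other[j:j + window]
                matches = sum(a == b for a, b in zip(target_window, other_window))
                best = max(best, 100.0 * matches / window)
        profile.append(best)
    return profile


# Toy sequences for illustration only (not real protein sequences).
target = "ACDEFGHIKLMN"
others = ["ACDEFGHIKLQQ"]
profile = max_window_identity(target, others, window=10)
print(profile)
```

A window of 10 residues highlights short stretches shared with other proteins, while a window of 50 residues reflects similarity over longer regions.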
The protein information section displays alternative protein-coding transcripts (splice variants) encoded by this gene according to the Ensembl database.
The ENSP identifier links to the Ensembl website protein summary, while the ENST identifier links to the Ensembl website transcript summary for the selected splice variant. The data in the UniProt column can be expanded to show links to all matching UniProt identifiers for this protein.
The protein classes assigned to this protein are shown when the protein class column is expanded. Parent protein classes are in bold font and subclasses are listed under the parent class.
The Gene Ontology terms assigned to this protein are listed when the Gene ontology column is expanded. The length of the protein (amino acid residues according to Ensembl), molecular mass (kDa), predicted signal peptide (according to a majority of the signal peptide predictors SPOCTOPUS, SignalP 4.0, and Phobius), and the number of predicted transmembrane regions (according to MDM) are also reported.
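The majority rule for signal peptide prediction described above can be written as a simple vote. In this sketch the boolean calls are hypothetical inputs; the code does not run SPOCTOPUS, SignalP 4.0, or Phobius, and the function name is illustrative.

```python
def majority_signal_peptide(predictions):
    """Return True if more than half of the predictors call a signal peptide.

    predictions: mapping of predictor name -> bool (hypothetical inputs).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    votes = sum(1 for called in predictions.values() if called)
    return votes > len(predictions) / 2


# Hypothetical calls for one protein.
calls = {"SPOCTOPUS": True, "SignalP 4.0": True, "Phobius": False}
has_signal_peptide = majority_signal_peptide(calls)
print(has_signal_peptide)
```

With three predictors, two agreeing calls are enough to report a signal peptide.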
Predicted intracellular proteins; Transcription factors; Zinc-coordinating DNA-binding domains; Cancer-related genes; COSMIC somatic mutations in cancer genes; COSMIC Somatic Mutations; COSMIC Missense Mutations; COSMIC Frameshift Mutations; Disease related genes; Protein evidence (Kim et al 2014); Protein evidence (Ezkurdia et al 2014)
GO:0000122 [negative regulation of transcription from RNA polymerase II promoter]
GO:0000976 [transcription regulatory region sequence-specific DNA binding]
GO:0000977 [RNA polymerase II regulatory region sequence-specific DNA binding]
GO:0000978 [RNA polymerase II core promoter proximal region sequence-specific DNA binding]
GO:0000979 [RNA polymerase II core promoter sequence-specific DNA binding]
GO:0000981 [RNA polymerase II transcription factor activity, sequence-specific DNA binding]
GO:0001047 [core promoter binding]
GO:0001077 [transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding]
GO:0001078 [transcriptional repressor activity, RNA polymerase II core promoter proximal region sequence-specific binding]
GO:0001085 [RNA polymerase II transcription factor binding]
GO:0001158 [enhancer sequence-specific DNA binding]
GO:0001228 [transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific binding]
GO:0001701 [in utero embryonic development]
GO:0002039 [p53 binding]
GO:0003677 [DNA binding]
GO:0003682 [chromatin binding]
GO:0003700 [transcription factor activity, sequence-specific DNA binding]
GO:0005515 [protein binding]
GO:0005634 [nucleus]
GO:0005654 [nucleoplasm]
GO:0005667 [transcription factor complex]
GO:0006355 [regulation of transcription, DNA-templated]
GO:0006366 [transcription from RNA polymerase II promoter]
GO:0007267 [cell-cell signaling]
GO:0007507 [heart development]
GO:0007596 [blood coagulation]
GO:0008270 [zinc ion binding]
GO:0008285 [negative regulation of cell proliferation]
GO:0008301 [DNA binding, bending]
GO:0008584 [male gonad development]
GO:0009887 [organ morphogenesis]
GO:0009888 [tissue development]
GO:0010559 [regulation of glycoprotein biosynthetic process]
GO:0010724 [regulation of definitive erythrocyte differentiation]
GO:0017053 [transcriptional repressor complex]
GO:0030099 [myeloid cell differentiation]
GO:0030218 [erythrocyte differentiation]
GO:0030219 [megakaryocyte differentiation]
GO:0030220 [platelet formation]
GO:0030221 [basophil differentiation]
GO:0030222 [eosinophil differentiation]
GO:0030502 [negative regulation of bone mineralization]
GO:0031490 [chromatin DNA binding]
GO:0033690 [positive regulation of osteoblast proliferation]
GO:0035162 [embryonic hemopoiesis]
GO:0035854 [eosinophil fate commitment]
GO:0043066 [negative regulation of apoptotic process]
GO:0043565 [sequence-specific DNA binding]
GO:0045648 [positive regulation of erythrocyte differentiation]
GO:0045893 [positive regulation of transcription, DNA-templated]
GO:0045944 [positive regulation of transcription from RNA polymerase II promoter]
GO:0048468 [cell development]
GO:0048565 [digestive tract development]
GO:0048821 [erythrocyte development]
GO:0050731 [positive regulation of peptidyl-tyrosine phosphorylation]
GO:0070527 [platelet aggregation]
GO:0070742 [C2H2 zinc finger domain binding]
GO:0071733 [transcriptional activation by promoter-enhancer looping]
GO:0097028 [dendritic cell differentiation]
GO:0097067 [cellular response to thyroid hormone stimulus]
GO:2000678 [negative regulation of transcription regulatory region DNA binding]
GO:2001240 [negative regulation of extrinsic apoptotic signaling pathway in absence of ligand]
Predicted intracellular proteins; Cancer-related genes; COSMIC somatic mutations in cancer genes; COSMIC Somatic Mutations; COSMIC Missense Mutations; COSMIC Frameshift Mutations; Protein evidence (Ezkurdia et al 2014)
GO:0003700 [transcription factor activity, sequence-specific DNA binding]
GO:0006355 [regulation of transcription, DNA-templated]
GO:0008270 [zinc ion binding]
GO:0043565 [sequence-specific DNA binding]
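Each Gene Ontology entry in the lists above is an identifier followed by the term name in square brackets, so the lists can be turned into a lookup table with a regular expression. This is a minimal sketch; the function name is hypothetical, and the sample text is the four-term list for the second splice variant.

```python
import re

# A GO identifier (GO: plus seven digits) followed by a bracketed term name.
GO_ENTRY = re.compile(r"(GO:\d{7})\s*\[([^\]]+)\]")


def parse_go_terms(text):
    """Map GO identifiers to term names, collapsing any wrapped whitespace."""
    return {go_id: " ".join(name.split()) for go_id, name in GO_ENTRY.findall(text)}


text = (
    "GO:0003700 [transcription factor activity, sequence-specific DNA binding] "
    "GO:0006355 [regulation of transcription, DNA-templated] "
    "GO:0008270 [zinc ion binding] "
    "GO:0043565 [sequence-specific DNA binding]"
)
go_terms = parse_go_terms(text)
for go_id, name in go_terms.items():
    print(go_id, name)
```

Because the pattern tolerates line breaks inside the brackets, it also handles term names that were wrapped across lines.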