The Landmark Genes

Figure: Benchmarking the landmark genes. Against 184 query:result pairs (*) from well-known published and unpublished CMap connections, 1,000 genes capture 80% of the information.

Gene expression is highly correlated: the expression levels of many genes
rise and fall together across samples. We take advantage of this high degree
of correlation to reduce the number of measurements needed to generate
meaningful gene expression data for the approximately 20,000 genes in the
human genome.
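
As a rough illustration of how correlation lets a measured subset of genes stand in for the rest, the sketch below fits an ordinary least-squares model that predicts unmeasured genes from measured ones. The linear model, the synthetic data, and the names `fit_inference_model` and `infer_targets` are assumptions made for illustration; the text here does not specify the statistical models actually used.

```python
import numpy as np

def fit_inference_model(landmarks, targets):
    """Fit a least-squares linear map from measured to unmeasured genes.

    landmarks: array of shape (n_samples, n_landmarks)
    targets:   array of shape (n_samples, n_targets)
    Returns weights of shape (n_landmarks + 1, n_targets); row 0 is the intercept.
    """
    design = np.column_stack([np.ones(len(landmarks)), landmarks])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def infer_targets(weights, landmarks):
    """Predict unmeasured gene expression from measured gene expression."""
    design = np.column_stack([np.ones(len(landmarks)), landmarks])
    return design @ weights

# Synthetic data (hypothetical): 50 target genes that depend linearly on
# 10 measured genes, plus a small amount of noise.
rng = np.random.default_rng(0)
n_samples, n_landmarks, n_targets = 200, 10, 50
landmark_expr = rng.normal(size=(n_samples, n_landmarks))
true_weights = rng.normal(size=(n_landmarks, n_targets))
noise = 0.1 * rng.normal(size=(n_samples, n_targets))
target_expr = landmark_expr @ true_weights + noise

# Fit on the first 150 samples, then infer the held-out 50.
train, test = slice(0, 150), slice(150, None)
weights = fit_inference_model(landmark_expr[train], target_expr[train])
predicted = infer_targets(weights, landmark_expr[test])

# Per-gene correlation between inferred and actual expression.
measured = target_expr[test]
per_gene_r = np.array(
    [np.corrcoef(predicted[:, j], measured[:, j])[0, 1] for j in range(n_targets)]
)
print(f"median per-gene correlation: {np.median(per_gene_r):.3f}")
```

On real data the relationship between genes is noisier than in this toy setup, so held-out accuracy would need to be checked gene by gene.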

We selected the landmark genes by analyzing gene expression profiles of cells
collected from normal tissues and various disease types. The selected genes
1) are minimally redundant, 2) are widely expressed in different cellular
contexts, and 3) possess inferential value in our statistical models.
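
The first two criteria can be sketched as a filter followed by a greedy pass. This is only an illustrative heuristic, not the selection procedure used for the landmark genes: the function name `select_candidates`, the thresholds, and the synthetic data are all hypothetical, and the third criterion (inferential value) is not modeled here.

```python
import numpy as np

def select_candidates(expr, n_select, min_expressed_frac=0.8,
                      expr_threshold=1.0, max_abs_corr=0.7):
    """Greedily pick widely expressed, mutually non-redundant genes.

    expr: array of shape (n_samples, n_genes).
    Returns a list of selected column indices (at most n_select).
    All thresholds are illustrative assumptions.
    """
    # "Widely expressed": above threshold in most samples.
    expressed_frac = (expr > expr_threshold).mean(axis=0)
    candidates = np.flatnonzero(expressed_frac >= min_expressed_frac)

    # Visit candidates from most to least variable.
    order = candidates[np.argsort(-expr[:, candidates].var(axis=0))]

    # "Minimally redundant": skip genes that correlate strongly with a pick.
    corr = np.corrcoef(expr, rowvar=False)
    selected = []
    for gene in order:
        if all(abs(corr[gene, s]) <= max_abs_corr for s in selected):
            selected.append(int(gene))
        if len(selected) == n_select:
            break
    return selected

# Synthetic example (hypothetical): gene 1 nearly duplicates gene 0,
# gene 2 is independent, gene 3 is expressed in only a minority of samples.
rng = np.random.default_rng(1)
n = 300
a = rng.normal(5, 1, n)
b = rng.normal(5, 1, n)
expr = np.column_stack([
    a,
    a + 0.05 * rng.normal(size=n),
    b,
    rng.normal(0, 1, n),
])
chosen = select_candidates(expr, n_select=2)
print(chosen)  # one of genes 0/1, plus gene 2
```

The greedy pass keeps only one of the two near-duplicate genes, which is the sense of "minimally redundant" assumed in this sketch.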

By benchmarking against 184 query:result pairs (* in graph) from well-known
published and unpublished CMap connections, we determined that 1,000 genes
capture approximately 80% of the information. The table below shows
examples of landmark genes used in the L1000 assay.

L1000 ID   Symbol    Entrez ID  Description
YYYE11F11  PSME1     5720       proteasome (prosome, macropain) activator subunit 1 (PA28 alpha)
BBBBC1D1   CISD1     55847      CDGSH iron sulfur domain 1
BBBBE1F1   SPDEF     25803      SAM pointed domain containing ets transcription factor
QQQA8B8    ATF1      466        activating transcription factor 1
ZZZC4D4    RHEB      6009       Ras homolog enriched in brain
BBBBC2D2   IGF1R     3480       insulin-like growth factor 1 receptor
TTTE12F12  FOXO3     2309       forkhead box O3
BBBBG2H2   GSTM2     2946       glutathione S-transferase mu 2 (muscle)
ZZZE4F4    RHOA      387        ras homolog gene family, member A
VVVE5F5    IL1B      3553       interleukin 1, beta
QQQG7H7    ASAH1     427        N-acylsphingosine amidohydrolase (acid ceramidase) 1
ZZZC2D2    RALA      5898       v-ral simian leukemia viral oncogene homolog A (ras related)
QQQG6H6    ARHGEF12  23365      Rho guanine nucleotide exchange factor (GEF) 12
AAAAC2D2   SOX2      6657       SRY (sex determining region Y)-box 2
ZZZC10D10  SERPINE1  5054       serpin peptidase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1
UUUG8H8    HLA-DMA   3108       major histocompatibility complex, class II, DM alpha
SSSG12H12  EGF       1950       epidermal growth factor
BBBBC5D5   SPTLC2    9517       serine palmitoyltransferase, long chain base subunit 2
QQQG5H5    APP       351        amyloid beta (A4) precursor protein
BBBBG5H5   TSKU      25987      tsukushi small leucine rich proteoglycan homolog (Xenopus laevis)


Download the full list of landmark genes.