Is there a database containing sequences of human cell lines?


I'm looking for the whole genome sequences of several human cell lines, e.g., A549 or Ea.hy.926. Is there a database specifically dedicated to human cell lines?

As pointed out by @kmm ( is an excellent database. Alternatively, you can look at the Broad Institute page ( in the Cancer Cell Line Encyclopedia section, then go to the Browse tab and select the cell line from a very comprehensive list of cell lines.


Genomic sequence variation
Data collection and a catalog of human variation

dbVar and Database of Genomic Variants

Online Mendelian Inheritance in Man
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. The full-text, referenced overviews in OMIM contain information on all known mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype. It is updated daily, and the entries contain copious links to other genetics resources.

The Exome Aggregation Consortium (ExAC)
ExAC is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data set provided on this website spans 61,486 unrelated individuals sequenced as part of various disease-specific and population genetic studies. We have removed individuals affected by severe pediatric disease, so this data set should serve as a useful reference set of allele frequencies for severe disease studies. All of the raw data from these projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.

Encyclopedia Of DNA Elements (ENCODE) Project
Links to ENCODE2 uniformly processed histone mark data:
Links to other ENCODE2 uniformly processed data:
Data collection, integrative analysis, and a comprehensive catalog of
all sequence-based functional elements

Roadmap Epigenomics Project (NIH Common Fund)

International Human Epigenome Consortium (IHEC)
Data collection and reference maps of human epigenomes for key
cellular states relevant to health and diseases

###Human BodyMap
Viewable with Ensembl ( or the Integrative Genomics Viewer (
Gene expression database from Illumina, from RNA-seq data

###Cancer Cell Line Encyclopedia (CCLE)
Array-based expression data, CNV, mutations, and perturbations over a huge collection of cell lines

###FANTOM5 Project
Large collection of CAGE-based expression data across multiple species (time series and perturbations)
Database supporting queries of condition-specific gene expression on
a curated subset of the Array Express Archive.

GNF Gene Expression Atlas

Viewable at BioGPS (
GNF (Genomics Institute of the Novartis Research Foundation) human and mouse gene expression array data.
Protein expression profiles based on immunohistochemistry for a large number of human tissues, cancers and cell lines, subcellular localization, transcript expression levels
A comprehensive, freely accessible database of protein sequence and
functional information
An integrated database of protein classification, functional domains,
and annotation (including GO terms).

Protein Capture Reagents Initiative
Resource generation: renewable, monoclonal antibodies and other reagents that target the full range of proteins

Knockout Mouse Program (KOMP)

The Connectivity Map (CMAP)
The Connectivity Map (also known as cmap) is a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules and simple pattern-matching algorithms that together enable the discovery of functional connections between drugs, genes and diseases through the transitory feature of common gene-expression changes. You can learn more about cmap from our papers in Science and Nature Reviews Cancer.

Library of Integrated Network-based Cellular Signatures (LINCS)
Data collection and analysis of molecular signatures that describe how
different types of cells respond to a variety of perturbing agents

Genomics of Drug Sensitivity in Cancer
Mutation, CNV, Affy expression and drug sensitivity in

The Drug Gene Interaction database (DGIdb)

Molecular Libraries Program (MLP)
Access to the large-scale screening capacity necessary to identify small molecules that can be optimized as chemical probes to study the functions of genes, cells, and biochemical pathways in health and disease
Data collection and an online public resource integrating extensive gene expression and neuroanatomical data for human and mouse, including variation of mouse gene expression by strain.
BrainCloud is a freely-available, biologist-friendly, stand-alone application for exploring the temporal dynamics and genetic control of transcription in the human prefrontal cortex across the lifespan. BrainCloud was developed through collaboration between the Lieber Institute and NIMH

The Human Connectome Project
Data collection and integration to create a complete map of the structural and functional neural connections, within and across individuals

Geuvadis RNA sequencing project of 1000 Genomes samples
mRNA and small RNA sequencing on 465 lymphoblastoid cell line (LCL) samples from 5 populations of the 1000 Genomes Project: the CEPH (CEU), Finns (FIN), British (GBR), Toscani (TSI) and Yoruba (YRI).

Project Achilles
Project Achilles is a systematic effort aimed at identifying and cataloging genetic vulnerabilities across hundreds of genomically characterized cancer cell lines. The project uses a genome-wide shRNA library to silence individual genes and identify those genes that affect cell survival. Large-scale functional screening of cancer cell lines provides a complementary approach to those studies that aim to characterize the molecular alterations (mutations, copy number alterations, etc.) of primary tumors, such as The Cancer Genome Atlas. The overall goal of the project is to link cancer genetic dependencies to their molecular characteristics in order to identify molecular targets and guide therapeutic development.

Human Ageing Genomic Resources

The Cancer Genome Atlas (TCGA)
Data collection and a data repository, including cancer genome sequence data

International Cancer Genome Consortium (ICGC)
Data collection and a data repository for a comprehensive description of genomic, transcriptomic and epigenomic changes of cancer

Genotype-Tissue Expression (GTEx) Project
Data collection, data repository, and sample bank for human gene expression and regulation in multiple tissues, compared to genetic variation

Knockout Mouse Phenotyping Program (KOMP2)
Data collection for standardized phenotyping of a genome-wide collection of mouse knockouts

Database of Genotypes and Phenotypes (dbGaP)
Data repository for results from studies investigating the interaction of genotype and phenotype

NHGRI Catalog of Published GWAS
Public catalog of published Genome-Wide Association Studies

Clinical Genomic Database
A manually curated database of conditions with known genetic causes, focusing on medically significant genetic data with available interventions.

NHGRI's Breast Cancer information core
ClinVar is designed to provide a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. ClinVar collects reports of variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data. The alleles described in submissions are mapped to reference sequences, and reported according to the HGVS standard. ClinVar then presents the data for interactive users as well as those wishing to use ClinVar in daily workflows and other local applications. ClinVar works in collaboration with interested organizations to meet the needs of the medical genetics community as efficiently and effectively as possible.

Human Gene Mutation Database (HGMD)
The Human Gene Mutation Database (HGMD®) represents an attempt to collate known (published) gene lesions responsible for human inherited disease

NHLBI Exome Sequencing Project (ESP) Exome Variant Server
The goal of the NHLBI GO Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung and blood disorders by pioneering the application of next-generation sequencing of the protein coding regions of the human genome across diverse, richly-phenotyped populations and to share these datasets and findings with the scientific community to extend and enrich the diagnosis, management and treatment of heart, lung and blood disorders.
Genetics Home Reference is the National Library of Medicine's web site for consumer information about genetic conditions and the genes or chromosomes related to those conditions.
GeneReviews are expert-authored, peer-reviewed disease descriptions presented in a standardized format and focused on clinically relevant and medically actionable information on the diagnosis, management, and genetic counseling of patients and families with specific inherited conditions.

Global Alzheimer's Association Interactive Network (GAAIN)
The Global Alzheimer’s Association Interactive Network (GAAIN) is a collaborative project that will provide researchers around the globe with access to a vast repository of Alzheimer’s disease research data and the sophisticated analytical tools and computational power needed to work with that data. Our goal is to transform the way scientists work together to answer key questions related to understanding the causes, diagnosis, treatment and prevention of Alzheimer’s and other neurodegenerative diseases.
In 2013, obtained WGS data for the largest cohort of 800 Alzheimer's patients

The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium
The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium was formed to facilitate genome-wide association study meta-analyses and replication opportunities among multiple large and well-phenotyped longitudinal cohort studies. They also have DNA methylation data alongside WGS and Exome Seq.

The NIMH Center for Collaborative Genomic Studies on Mental Disorders


ERVmap: A Bioinformatic Tool to Map RNA Sequencing Reads to Human Proviral ERVs.

To obtain a complete high-resolution genome-wide human ERV compendium, or ERVome, we compiled a curated list of 3,220 ERV proviral loci (Dataset S1). These ERVs were either transcribed in various disease contexts or identified as ERVs based on sequence analysis in silico (14, 38–46). We included ERV loci with unique chromosomal locations that had been described as autonomous/proviral ERVs and did not intentionally exclude any loci. For loci that overlapped between studies, we selected the one with longer sequence coverage. Unlike the RepeatMasker annotation in which the ERVs are mainly noncontiguous non-autonomous LTR elements with an average length of 368 bp, the ERVmap database contains ERVs that are mostly autonomous LTR elements with an average length of 7.5 kb (37). RepeatMasker annotation has 885 loci above 5 kb, whereas ERVmap has 2,722 loci above 5 kb. The average length of the rest of the ERVs below 5 kb for ERVmap is 3.6 kb, whereas for RepeatMasker it is 360 bp. ERVmap captures all known proviral ERV sequences to date and is ideal for analysis of specific autonomous ERV genomic loci throughout the host genome.

Using this database, we aligned processed RNA sequencing reads to the human genome (hg38) using Burrows-Wheeler Aligner (BWA), used scripts specifically designed for ERVmap to filter the mapped reads according to our stringent criteria, quantified filtered reads that mapped to the ERV coordinates from our database, and normalized the counts to size factors obtained through standard cellular gene-expression analysis (SI Appendix, Fig. S1). The pipeline yields normalized values based on filtered read counts mapped per locus. To obtain high-fidelity mapping of sequence reads to these repetitive ERV loci, we applied very stringent filtering criteria to the mapped reads, such that each mapped read (i) could have only one best match, (ii) had a second-best match with at least one more mismatch, and (iii) was excluded if it had more than three mismatches (14). These criteria are for 150-bp paired-end reads and are proportionally adjusted according to the read length of the sequencing data. In this algorithm, which we call ERVmap, we intentionally excluded reads that mapped to conserved regions in the proviral sequence and reads that mapped to polymorphic loci, both of which would fall under the third criterion of having more than three mismatches per sequenced read, to favor locus specificity over overall abundance. Our code is available through GitHub ( (SI Appendix, Fig. S2). Finally, we developed a web-based tool that is available for users to obtain the human ERVome in any RNA sequencing data by simply uploading raw RNA sequencing files (
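The three read-filtering criteria above can be sketched in a few lines of Python. This is only an illustration of the logic as described in the text, not the ERVmap scripts themselves: the `Alignment` record, the function names, and the choice to round the scaled mismatch limit down are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alignment:
    """Per-read alignment summary (hypothetical structure for illustration)."""
    best_mismatches: int                    # mismatches in the best hit
    n_best_hits: int                        # number of hits tied for best
    second_best_mismatches: Optional[int]   # None if no second hit exists


def max_mismatches(read_length: int, base_length: int = 150, base_limit: int = 3) -> int:
    """Scale the 3-mismatch limit for 150-bp reads in proportion to read length.

    Rounding down is an assumption; the text only says "proportionally adjusted".
    """
    return (base_limit * read_length) // base_length


def keep_read(aln: Alignment, read_length: int = 150) -> bool:
    """Apply criteria (i)-(iii) to one mapped read."""
    # (i) exactly one best match
    if aln.n_best_hits != 1:
        return False
    # (ii) the second-best match must have at least one more mismatch
    if (aln.second_best_mismatches is not None
            and aln.second_best_mismatches < aln.best_mismatches + 1):
        return False
    # (iii) no more than the allowed number of mismatches
    return aln.best_mismatches <= max_mismatches(read_length)
```

For example, `keep_read(Alignment(2, 1, 3))` keeps a read with two mismatches and a unique best hit, while a read with four mismatches is dropped regardless of uniqueness.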

ERV Expression Patterns in Human Cell Lines.

We obtained RNA sequencing data from ENCODE for several common cell lines (Fig. 1A) to analyze the ERVome in these cells. We selected cell lines that have accompanying ChIP-sequencing data. In all of the analyzed cells, we observed ∼40% of ERVs at detectable levels (Fig. 1B). K562 cells expressed the highest level of ERVs, not because they expressed more ERV loci but because the expressed ERVs are transcribed at higher levels (Fig. 1 B and C). In contrast, A549 cells expressed the lowest level of ERVs, roughly one-third of the amount expressed by K562 cells. Comparison of the ERVs between cell lines revealed clusters of ERVs that are uniquely expressed in each cell line (Fig. 1D). These ERVs clustered distinctly using a t-distributed stochastic neighbor-embedding (t-SNE) algorithm, implying that unique sets of ERVs are expressed in each cell line (Fig. 1E). Additionally, ERV expression alone was sufficient to segregate cell types based on principal component analysis (PCA), suggesting that ERV expression is unique enough to allow discrimination between cell types (Fig. 1F). We confirmed expression of a set of ERVs using qRT-PCR (SI Appendix, Fig. S3). Finally, we analyzed ChIP-sequencing data available for these cell lines and observed H3K4me3 and H3K27Ac histone marks at actively transcribed ERV loci, and also observed a positive correlation between active histone marks and higher ERV expression (SI Appendix, Fig. S4 A and B). In contrast, very few repressive histone marks H3K9me3 and H3K27me3 were present at the transcribed ERV loci, suggesting that the lack of silencing histone modifications and the presence of active histone modifications accompany ERV expression in these cells (SI Appendix, Fig. S4 C and D).

Cell-type–specific ERV expression in cell lines. (A) Description of cell lines used in the ERV analysis. (B) Histogram of the amount of reads attributed to each of the 3,220 ERV loci sorted in order of highest to lowest expressed ERVs for each cell line. (C) Sum of all ERV reads per cell line compared across cell types. (D) Heatmap of ERVs that are expressed across indicated cell types. ERVs with zero reads across all cell lines were excluded. A total of 1,704 ERVs are displayed. Two-dimensional t-SNE analysis (E) and PCA (F) of ERVs expressed by indicated cell types using the same set of 1,704 ERVs as in D. t-SNE analysis was performed using a perplexity of 30 and maximum iteration of 1,000. N/A, cell assignment not possible due to multiple cell lines expressing the same exact amount of the particular ERV.
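Two of the bookkeeping steps in this caption, dropping ERVs with zero reads in every cell line and summing ERV reads per cell line, are simple enough to sketch. The nested-dictionary layout and the locus names below are hypothetical; the authors' actual data structures are not described in the text.

```python
def drop_silent_loci(counts):
    """Keep only loci with at least one nonzero count across all cell lines.

    counts: {locus_name: {cell_line: read_count}} (assumed layout).
    """
    return {
        locus: per_line
        for locus, per_line in counts.items()
        if any(reads > 0 for reads in per_line.values())
    }


def total_reads_per_line(counts):
    """Sum ERV reads over all loci for each cell line."""
    totals = {}
    for per_line in counts.values():
        for cell_line, reads in per_line.items():
            totals[cell_line] = totals.get(cell_line, 0) + reads
    return totals


# Toy input with made-up locus names and counts
toy = {
    "locus_a": {"K562": 30, "A549": 10},
    "locus_b": {"K562": 0, "A549": 0},
    "locus_c": {"K562": 6, "A549": 2},
}
expressed = drop_silent_loci(toy)
totals = total_reads_per_line(expressed)
```

In the paper's data this kind of filter reduces the 3,220 loci to the 1,704 shown in the heatmap.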

In contrast to ERVmap, less resolution was observed when RepeatMasker annotation was used to analyze ERV expression using a published method called RepEnrich (47). RepEnrich quantifies LTR elements at the level of subfamilies, each of which contains hundreds of copies in the genome (SI Appendix, Table S1) and does not yield quantification of reads at specific ERV loci. RepEnrich analysis did not reveal clusters of cell-type–specific ERV elements (SI Appendix, Fig. S5A), but there were enough differences in the expression of ERV families between cell types to segregate cells based on hierarchical clustering and PCA analysis (SI Appendix, Fig. S5B). Thus, ERVmap provides locus-specific profiling of the ERVome using RNA-sequencing (RNA-seq) datasets that should facilitate downstream mechanistic studies.

Differential ERV Expression in Primary Cell Types.

We next used RNA-seq data from primary cells in ENCODE to analyze the ERVome in seven different cell types, both immune and nonimmune cells, to obtain ERV expression in a range of cell types (Fig. 2A). Similar to cell lines, roughly 50% of the ERV loci were expressed by any given cell type (Fig. 2B). All cells expressed similar total levels of ERV transcripts, but distinct sets of ERVs were transcribed in a given cell type (Fig. 2 C and D). Neurosphere embryos and B cells in particular expressed clusters of highly cell-type–specific ERVs. Using the t-SNE algorithm, we observed unique clusters of ERVs expressed in each cell type; however, we observed similar ERV clusters between CD4+ and CD8+ T cells, suggesting that ERVs expressed by these cell types are similar relative to other cell types (Fig. 2E). This likely reflects the biological similarity within the two T cell populations. Cell types segregated based solely on ERV expression profiles and revealed that the ERVome is largely distinct between lymphocytes, keratinocytes, and neurosphere embryos (Fig. 2F). Finally, in comparison with cell lines, primary cells expressed lower levels of ERVs overall, suggesting that the process of transformation or cell culture might lead to elevated ERV expression (SI Appendix, Fig. S6).

Cell-type–specific ERV expression in primary cells. (A) Cell types and associated information for each sample used in the ERV analysis. (B) Histogram of the amount of reads attributed to each of the 3,220 ERV loci sorted in order of highest to lowest expressed ERVs for each cell type. For cell types with multiple samples, the average number of reads per locus was plotted. (C) Sum of all ERV reads per sample compared across cell types. For cell types with multiple datasets, the average and SEM are graphed. (D) Heatmap of ERVs that are expressed across indicated cell types. The 500 most varying ERVs were used for the analysis to reduce noise. Two-dimensional t-SNE analysis (E) and PCA (F) of ERVs expressed by indicated cell types using the same set of 500 ERVs as in D. t-SNE analysis was performed using a perplexity of 30 and maximum iteration of 1,000.
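The caption mentions restricting the analysis to the 500 most varying ERVs but does not say how variability was measured. A minimal sketch, assuming population variance across samples as the ranking criterion (an assumption, as are the function name and input layout):

```python
from statistics import pvariance


def most_variable_loci(expr, n=500):
    """Return the names of the n loci with the highest variance across samples.

    expr: {locus_name: [expression value per sample]} (assumed layout).
    """
    ranked = sorted(expr, key=lambda locus: pvariance(expr[locus]), reverse=True)
    return ranked[:n]


# Toy example with made-up values
toy_expr = {
    "locus_a": [1.0, 1.0, 1.0],    # constant, variance 0
    "locus_b": [0.0, 5.0, 10.0],   # most variable
    "locus_c": [1.0, 2.0, 3.0],
}
top_two = most_variable_loci(toy_expr, n=2)
```

Other criteria (for example, median absolute deviation or coefficient of variation) would be equally plausible readings of "most varying".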

ERVome Is Elevated in SLE Patients.

ERVs have been implicated in various diseases, including cancer and autoimmunity. SLE is a multigenic autoimmune disease with diverse clinical manifestations and still lacks a cure. Many drugs that target various immune effectors have been tested, but they have had varying levels of success (48). One of the biggest hurdles in designing effective drugs is the poor understanding of the underlying cause for the diverse array of symptoms associated with the disease. While studies have observed elevated expression of ERV sequences in SLE patients (49–53), these studies have focused on one or two ERVs and the field could benefit from a genome-wide analysis of the ERVome in SLE patients to reveal relevance of ERVs in disease. Thus, we obtained peripheral blood mononuclear cells (PBMCs) from female SLE patients and healthy females (SI Appendix, Table S2), because SLE is a female-dominant disease, and performed RNA sequencing followed by ERVmap analysis. In this cohort, we identified 124 ERVs that were significantly elevated in SLE patients’ PBMCs compared with healthy controls, but none that were repressed (Fig. 3A). SLE patients expressed significantly higher levels of ERV transcripts as a whole as well as at the individual locus, and ERV expression largely segregated SLE patients from healthy controls (Fig. 3 B and C). Finally, we observed that ERV expression is not a direct correlate of the interferon (IFN) signature for many patients, as illustrated by comparison between total ERV expression and the total IFN-stimulated gene (ISG) expression per patient (Fig. 3D). The ISG expression was calculated using a previously published list of ISG signature observed in SLE patients (54). Together, ERVmap revealed a global elevation of the ERVome and identified specific ERV loci that are elevated in SLE patients that together may reflect an ISG-independent signature of SLE.

Patients with SLE have elevated ERV expression. (A) A volcano plot depicting differential expression of all 3,220 ERVs. Red ERVs are significantly elevated in SLE patients compared with healthy controls (padj < 0.05, log2 fold-change > 1.0). The top 30 significantly elevated ERVs are indicated by their names. (B) Comparison of the sum of all significantly different ERV reads between healthy and SLE donors (SLE, n = 20; healthy, n = 6). Error bars represent SEM and nonparametric Mann–Whitney U test was performed to calculate significance. ***P < 0.001. (C) Heatmap of the 124 significantly elevated ERVs in SLE patients compared with healthy controls as determined by using a cut-off of padj < 0.05. (D) Heatmap of the sum of reads for significantly elevated ERVs and the sum of reads for ISGs per patient sample.
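The volcano-plot coloring rule in this caption (padj < 0.05 and log2 fold-change > 1.0) amounts to a simple threshold test on each ERV. The sketch below is illustrative only; the function name, the symmetric "repressed" branch, and the handling of a missing adjusted P value are assumptions rather than anything stated in the text.

```python
def classify_erv(padj, log2_fold_change, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Label one ERV as 'elevated', 'repressed', or 'unchanged'.

    padj may be None when no adjusted P value is available (treated as
    not significant here, which is an assumption).
    """
    if padj is None or padj >= padj_cutoff:
        return "unchanged"
    if log2_fold_change > lfc_cutoff:
        return "elevated"
    if log2_fold_change < -lfc_cutoff:
        return "repressed"
    return "unchanged"
```

Calling `classify_erv(0.01, 2.0)` returns `"elevated"`, and passing `lfc_cutoff=1.5` applies a stricter fold-change threshold to the same test.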

Identification of Additional ERVs That Are Elevated and Correlate with Cytolytic Activity in Breast Cancer Tissues.

Cytotoxic T cells and natural killer cells are important effectors of tumor surveillance. They are armed with granzyme and perforin to directly kill tumor cells. A recent study using a set of 66 ERVs as a reference showed that 8 of the 66 ERVs positively correlated with the expression of granzyme and perforin in breast cancer tissues, implying a potential role of ERVs in immune surveillance (19). We applied ERVmap to the same breast cancer tissue dataset generated from The Cancer Genome Atlas (TCGA) Research Network to determine whether we are able to identify additional ERVs that are elevated in breast cancer tissues and associate with the granzyme and perforin cytolytic activity measure (CYT), as reported previously. We observed a large number of significantly elevated, as well as repressed, ERVs in breast cancer tissues compared with normal breast tissues (Fig. 4A). We confirmed elevated expression of two of the three tumor-specific ERVs (TSERVs) identified by the Hacohen group (19), ERVH48-1 and ERVE-4 (Fig. 4B). We also identified an additional 203 ERVs that were significantly elevated in breast cancer tissues, as well as 195 repressed ERVs (Fig. 4 A and C). Five of the eight ERVs that positively correlated with CYT in the Hacohen and colleagues (19) paper also showed a positive correlation using ERVmap, but none of these ERVs were significantly elevated in breast cancer tissues. Instead, we identified 38 ERVs that were both significantly elevated and showed positive correlation with CYT (Fig. 4D). We also identified 56 ERVs that were significantly repressed but showed a positive correlation with CYT (Fig. 4D). Together the data illustrated that ERVmap can reveal tumor-associated ERVs, which may play a role in tumor surveillance.

ERVmap reveals additional breast cancer-associated ERVs that correlate with cytolytic activity. (A) A volcano plot depicting all 3,220 ERVs. Blue ERVs are significantly repressed and red ERVs are significantly elevated in breast cancer tissues compared with healthy controls (padj < 0.05, log2 fold-change < −1.5 or > 1.5). The top 30 significantly elevated and repressed ERVs are indicated by their names. (B) Normalized DESeq ERV read counts for previously reported TSERVs or the top 10 significantly elevated or significantly repressed ERVs identified in this report (C) are plotted as dot plots for the indicated ERVs for each sample (brca, breast cancer, n = 1,246, red; normal, n = 221, gray). (D) Spearman’s r correlation was calculated between the significantly elevated or repressed ERVs (padj < 0.05, log2 fold-change > 1.5 or < −1.5) and the average expression level of granzyme and perforin (CYT) for all breast cancer tissue samples. (*P < 0.05; **P < 0.01; ***P < 0.001; ****P < 0.0001).
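Panel D correlates each ERV with CYT, described here as the average expression of granzyme and perforin. A dependency-free sketch of that calculation follows. It takes "average" as the arithmetic mean, which is an assumption (other definitions of CYT exist), and all function names and the toy values are made up for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cyt_score(granzyme, perforin):
    """Per-sample arithmetic mean of granzyme and perforin expression (assumed definition)."""
    return [(g + p) / 2 for g, p in zip(granzyme, perforin)]


# Toy example with made-up expression values for four samples
cyt = cyt_score([2.0, 4.0, 6.0, 9.0], [4.0, 8.0, 10.0, 11.0])
rho = spearman([5.0, 7.0, 8.0, 20.0], cyt)
```

In practice a statistics library would also supply the P values shown in the figure; this sketch covers only the correlation coefficient.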


SCPortalen is the first single-cell database that provides comprehensively curated metadata and analysis results of publicly available single-cell datasets. Furthermore, we attempted to make these datasets comparable by using a unified analysis pipeline.

Additional work will focus on analysis, such as (i) characterization of expression distribution and identification of multi-state genes, and (ii) expression of long non-coding RNA. The database design of SCPortalen will be scaled up to meet the increasing demands of single-cell omics research. Future database updates will include processing of single-cell whole-genome sequences, ATAC-Seq data, and FISH images. We believe that SCPortalen will be a useful resource for the single-cell research community.



Handling information

  1. Check all containers for leakage or breakage.
  2. Remove the frozen cells from the dry ice packaging and immediately place the cells at a temperature below -130°C, preferably in liquid nitrogen vapor, until ready for use.

To ensure the highest level of viability, thaw the vial and initiate the culture as soon as possible upon receipt. If continued storage of the frozen culture is necessary upon arrival, it should be stored in liquid nitrogen vapor phase and not at -70°C. Storage at -70°C will result in loss of viability.

  1. Thaw the vial by gentle agitation in a 37°C water bath. To reduce the possibility of contamination, keep the O-ring and cap out of the water. Thawing should be rapid (approximately 2 minutes).
  2. Remove the vial from the water bath as soon as the contents are thawed, and decontaminate by dipping in or spraying with 70% ethanol. All of the operations from this point on should be carried out under strict aseptic conditions.
  3. Transfer the vial contents to a 75 cm² tissue culture flask and dilute with the recommended complete culture medium (see the specific batch information for the recommended dilution ratio). It is important to avoid excessive alkalinity of the medium during recovery of the cells. It is suggested that, prior to the addition of the vial contents, the culture vessel containing the growth medium be placed into the incubator for at least 15 minutes to allow the medium to reach its normal pH (7.0 to 7.6).
  4. Incubate the culture at 37°C in a suitable incubator. A 5% CO2 in air atmosphere is recommended if using the medium described on this product sheet.

If it is desired that the cryoprotective agent be removed immediately, or that a more concentrated cell suspension be obtained, centrifuge the cell suspension at approximately 125 × g for 5 to 10 minutes. Discard the supernatant and resuspend the cells with fresh growth medium at the dilution ratio recommended in the specific batch information.
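Centrifuge speed is given here as relative centrifugal force (× g), while many benchtop centrifuges are set in RPM. The standard conversion is RCF = 1.118 × 10⁻⁵ × r × RPM², with r the rotor radius in centimeters. The sketch below applies it; the 10 cm radius is a placeholder assumption, so check the actual radius of your rotor.

```python
from math import sqrt

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in RPM


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed in RPM needed to reach a given relative centrifugal force."""
    return sqrt(rcf / (RCF_CONSTANT * radius_cm))


# 125 x g on a rotor with an assumed 10 cm radius
rpm_needed = rpm_from_rcf(125, radius_cm=10.0)
```

With the assumed 10 cm radius this comes out a little above 1,000 RPM; a larger rotor needs a lower speed for the same force.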

  1. Remove and discard culture medium.
  2. Briefly rinse the cell layer with 0.25% (w/v) Trypsin- 0.53 mM EDTA solution to remove all traces of serum that contains trypsin inhibitor.
  3. Add 2.0 to 3.0 ml of Trypsin-EDTA solution to flask and observe cells under an inverted microscope until cell layer is dispersed (usually within 5 to 15 minutes).
    Note: To avoid clumping do not agitate the cells by hitting or shaking the flask while waiting for the cells to detach. Cells that are difficult to detach may be placed at 37°C to facilitate dispersal.
  4. Add 6.0 to 8.0 ml of complete growth medium and aspirate cells by gently pipetting.
  5. Add appropriate aliquots of the cell suspension to new culture vessels.
  6. Incubate cultures at 37°C.


Legal disclaimers

The product is provided 'AS IS' and the viability of ATCC ® products is warranted for 30 days from the date of shipment, provided that the customer has stored and handled the product according to the information included on the product information sheet, website, and Certificate of Analysis. For living cultures, ATCC lists the media formulation and reagents that have been found to be effective for the product. While other unspecified media and reagents may also produce satisfactory results, a change in the ATCC and/or depositor-recommended protocols may affect the recovery, growth, and/or function of the product. If an alternative medium formulation or reagent is used, the ATCC warranty for viability is no longer valid. Except as expressly set forth herein, no other warranties of any kind are provided, express or implied, including, but not limited to, any implied warranties of merchantability, fitness for a particular purpose, manufacture according to cGMP standards, typicality, safety, accuracy, and/or noninfringement.

This product is intended for laboratory research use only. It is not intended for any animal or human therapeutic use, any human or animal consumption, or any diagnostic use. Any proposed commercial use is prohibited without a license from ATCC.

While ATCC uses reasonable efforts to include accurate and up-to-date information on this product sheet, ATCC makes no warranties or representations as to its accuracy. Citations from scientific literature and patents are provided for informational purposes only. ATCC does not warrant that such information has been confirmed to be accurate or complete and the customer bears the sole responsibility of confirming the accuracy and completeness of any such information.

This product is sent on the condition that the customer is responsible for and assumes all risk and responsibility in connection with the receipt, handling, storage, disposal, and use of the ATCC product including without limitation taking all appropriate safety and handling precautions to minimize health or environmental risk. As a condition of receiving the material, the customer agrees that any activity undertaken with the ATCC product and any progeny or modifications will be conducted in compliance with all applicable laws, regulations, and guidelines. This product is provided 'AS IS' with no representations or warranties whatsoever except as expressly set forth herein and in no event shall ATCC, its parents, subsidiaries, directors, officers, agents, employees, assigns, successors, and affiliates be liable for indirect, special, incidental, or consequential damages of any kind in connection with or arising out of the customer's use of the product. While reasonable effort is made to ensure authenticity and reliability of materials on deposit, ATCC is not liable for damages arising from the misidentification or misrepresentation of such materials.


PhosphoSitePlus ® provides comprehensive information and tools for the study of protein post-translational modifications (PTMs) including phosphorylation, acetylation, and more. The web use is free for everyone including commercial.

Protein, Sequence, or Reference Search: Protein searches retrieve lists of proteins and their modification types based on protein name or ID, protein type, domain, cellular component, MW, and pI range. Sequence searches retrieve lists of proteins and sequences containing specified sequences, degenerate motifs, and domains. Reference searches by author or protein retrieve lists of associated literature references. PubMed ID searches return information about the sites curated from the selected paper.

Site Search: Retrieves lists of modification sites that fulfill the selected search parameter(s). Data output includes the modified residue and flanking sequences, protein and gene names, and related information.

Comparative Site Search: Retrieves a list of modified sites that possess certain specified attributes and exclude others. Searches can be restricted by eight criteria: sites responsive to specific treatments, or those observed in specific protein types, domains, cellular components, disease states, cell lines, cell types, or tissues.

Browse MS2 Data by Disease: Allows the user to browse curated MS/MS records by selecting a disease type. Results show the number of records and associated modification sites observed in the selected disease.

Browse MS2 Data by Cell Line: Allows the user to browse curated MS/MS records by selecting a cell line. Results show the number of records and associated modification sites observed in the selected cell line.

Browse MS2 Data by Tissue: Allows the user to browse curated MS/MS records by selecting a tissue. Results show the number of records and associated modification sites observed in the selected tissue.
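The include/exclude behaviour described for the Comparative Site Search can be pictured as a simple set filter. The sketch below is illustrative only: it does not call any PhosphoSitePlus interface, and the record layout, the `comparative_search` function, and the sample site names are all hypothetical.

```python
from typing import Dict, List, Set


def comparative_search(sites: List[Dict], include: Set[str], exclude: Set[str]) -> List[str]:
    """Return names of sites that carry every 'include' attribute and no 'exclude' attribute."""
    hits = []
    for site in sites:
        attrs = site["attributes"]
        # Keep the site only if all required attributes are present
        # and none of the excluded attributes are.
        if include <= attrs and not (exclude & attrs):
            hits.append(site["site"])
    return hits


# Made-up records for illustration; these are not real curated sites.
EXAMPLE_SITES = [
    {"site": "PROT_A-S10", "attributes": {"cell_line:X", "treatment:T1"}},
    {"site": "PROT_B-Y20", "attributes": {"cell_line:X", "tissue:Z"}},
    {"site": "PROT_C-T30", "attributes": {"cell_line:Y", "treatment:T1"}},
]

if __name__ == "__main__":
    # Sites seen in cell line X but not in tissue Z.
    print(comparative_search(EXAMPLE_SITES, {"cell_line:X"}, {"tissue:Z"}))
```

Using sets for the attributes keeps both tests to a single comparison each: a subset check for the required attributes and an intersection check for the excluded ones.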


Use the "Explore" option to open a dataset in Tableau where you can interactively browse through the peptides, proteins and cell types right in your browser!

Annotation of cell types

Description and origin of all cell types and tissues used for the CSPA.

Matrix of all proteins and their detection in the different cell types.

Excel file containing 6 tables organized in different sheets:

  1. List of all proteins identified within the different cell types
  2. Matrix of 1492 human proteins against 47 human cell types
  3. Matrix of 1296 mouse proteins against 31 mouse cell types
  4. Table containing the number of identified proteins of each cell type
  5. Matrix with human surfaceome proteins and cells and their estimated relative quantities in log2 scale
  6. Matrix with mouse surfaceome proteins and cells and their estimated relative quantities in log2 scale
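As a rough illustration of how the tables listed above relate to each other, the sketch below builds a protein-by-cell-type detection matrix from a list of observations, counts the identified proteins per cell type, and converts a log2-scale quantity back to a linear value. It is a minimal sketch under stated assumptions: the function names and the placeholder protein and cell-type labels are hypothetical, and the actual layout of the CSPA Excel sheets is not reproduced here.

```python
from typing import Dict, Iterable, List, Tuple


def detection_matrix(observations: Iterable[Tuple[str, str]]):
    """Build a 0/1 matrix (rows: proteins, columns: cell types) from (protein, cell_type) pairs."""
    seen = set(observations)
    proteins = sorted({protein for protein, _ in seen})
    cell_types = sorted({cell for _, cell in seen})
    rows = [[1 if (p, c) in seen else 0 for c in cell_types] for p in proteins]
    return proteins, cell_types, rows


def proteins_per_cell_type(cell_types: List[str], rows: List[List[int]]) -> Dict[str, int]:
    """Count how many proteins were detected in each cell type (column sums)."""
    return {c: sum(row[j] for row in rows) for j, c in enumerate(cell_types)}


def log2_to_linear(value: float) -> float:
    """Convert a relative quantity given on a log2 scale to a linear scale."""
    return 2.0 ** value


if __name__ == "__main__":
    # Placeholder labels, not real CSPA entries.
    obs = [("PROT_A", "cell_1"), ("PROT_A", "cell_2"), ("PROT_B", "cell_2")]
    proteins, cell_types, rows = detection_matrix(obs)
    print(proteins, cell_types, rows)
    print(proteins_per_cell_type(cell_types, rows))
    print(log2_to_linear(3.0))
```

A difference of 1 on the log2 scale corresponds to a two-fold difference in estimated relative quantity, which is why the conversion is a plain power of two.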

Sisyphus CSPA

FileMaker-based database containing the easy-to-navigate Sisyphus database executable.

CSPA validated surfaceome proteins

Excel file containing all human and mouse surfaceome proteins in two tables and an additional table with all identified N-glycopeptides:

  1. List of 1492 human surfaceome proteins and their annotation.
  2. List of 1296 mouse surfaceome proteins and their annotation.
  3. List of 13942 mouse- and human-derived N-glycopeptides, including the identified modified forms.

Corrected topologies

PDF files showing the original topology pictures and the topologies corrected on the basis of N-glycopeptide identification, for 51 human proteins and 39 mouse proteins. The pictures were created with PROTTER, and identified N-glycopeptides are marked in yellow.

CSPA based spectral libraries for human proteins

ZIP file, containing a README.txt file and two subfolders with the respective spectral libraries:

  1. The .pepidx, .spidx and .splib files of the human spectral library for proteins within the CSPA. The sequence motif N-X-S/T has been modified to D-X-S/T, which corresponds to a deamidated asparagine (N). Methionines carry oxidation as a variable modification, and a decoy spectral library is appended.
  2. The .pepidx, .spidx and .splib files of the human spectral library for proteins within the CSPA. Asparagines and methionines can be searched with variable modifications of deamidation and oxidation, respectively, and a decoy spectral library is appended.
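The N-X-S/T to D-X-S/T rewrite used in the first library can be sketched with a regular expression. This is a minimal illustration, not the procedure used to build the CSPA libraries. One assumption goes beyond the text: X is taken to be any residue except proline, which is the usual sequon convention, whereas the description above states only "N-X-S/T". The function names are hypothetical.

```python
import re
from typing import List

# Asparagine followed by any residue (assumed: not proline) and then serine or threonine.
# A lookahead is used so that overlapping motifs such as "NNTS" are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide: str) -> List[int]:
    """Return 1-based positions of asparagines that sit in an N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide)]


def deamidate_sequons(peptide: str) -> str:
    """Write the asparagine of every N-X-S/T motif as D, i.e. N-X-S/T becomes D-X-S/T."""
    return SEQUON.sub("D", peptide)


if __name__ == "__main__":
    # Made-up peptide strings for illustration.
    for peptide in ["AANGSK", "AANGK", "NPSK", "NNTS"]:
        print(peptide, sequon_positions(peptide), deamidate_sequons(peptide))
```

Only the asparagine inside a motif is rewritten; asparagines elsewhere in the peptide are left unchanged, which is what distinguishes this fixed rewrite from the second library, where deamidation is searched as a variable modification on any asparagine.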

CSPA based spectral libraries for mouse proteins

ZIP file, containing a README.txt file and two subfolders with the respective spectral libraries:

  1. The .pepidx, .spidx and .splib files of the mouse spectral library for proteins within the CSPA. The sequence motif N-X-S/T has been modified to D-X-S/T, which corresponds to a deamidated asparagine (N). Methionines carry oxidation as a variable modification, and a decoy spectral library is appended.
  2. The .pepidx, .spidx and .splib files of the mouse spectral library for proteins within the CSPA. Asparagines and methionines can be searched with variable modifications of deamidation and oxidation, respectively, and a decoy spectral library is appended.

CSPA toolbox

Excel file containing tables for generating inclusion lists and transition lists of surfaceome proteins within the CSPA.

Interaction

This section provides information on the quaternary structure of a protein and on interaction(s) with other proteins or protein complexes.

Subunit structure

This subsection of the 'Interaction' section provides information about the protein quaternary structure and interaction(s) with other proteins or protein complexes (with the exception of physiological receptor-ligand interactions, which are annotated in the 'Function' section).

Found in a mRNP complex with UPF1, UPF2, UPF3B and XRN1 (PubMed:14527413). Associates with alpha and beta tubulins (By similarity).

Interacts with DIS3L2 (PubMed:23756462).

Interacts with ZC3HAV1 in an RNA-dependent manner (PubMed:21876179).

Interacts with ZFP36L1 (PubMed:15687258).

Interacts with TRIM71 (via NHL repeats) in an RNA-dependent manner (PubMed:23125361).

Interacts with YTHDC2 (via ANK repeats) (PubMed:29033321).

Protein-protein interaction databases

The Biological General Repository for Interaction Datasets (BioGRID)

CORUM comprehensive resource of mammalian protein complexes

Database of interacting proteins

Protein interaction database and analysis system

Molecular INTeraction database

STRING: functional protein association networks

Miscellaneous databases

RNAct, Protein-RNA interaction predictions for model organisms.

This brain cell database contains a survey of biological features derived from single cell data, from both human and mouse. It is part of a multi-year project to create a census of cells in the mammalian brain.

The database contains electrophysiological, morphological, and transcriptomic data measured from individual cells, as well as models simulating cell activity. Thus far, data generation has focused on select areas of cerebral cortex and on thalamic neurons.

Browse electrophysiological response data and reconstructed neuronal morphologies using the Cell Feature Search tool. Single cell gene expression data is described on the RNA-Seq Data page.

Use the Allen Software Development Kit (SDK) to programmatically access and analyze raw data, and to run models.

Data can be downloaded by selecting individual experiments in the Cell Feature Search tool, by accessing transcriptomic RNA-Seq files, or through the Allen SDK or API.

Single Cells from Human Brain

Cells are acquired from donated ex vivo brain tissue dissected from temporal or frontal lobes, based on anatomical annotations described in The Allen Human Brain Reference Atlas. For electrophysiological and morphological analyses in the cortex, cells are selected based on soma shape and laminar location.

For transcriptomic analysis, individual layers of cortex are dissected, and neuronal nuclei are isolated. Laminar sampling is guided by the relative number of neurons present in each layer.

Single Cells from Mouse Brain

Cells are acquired from selected brain areas in the adult mouse. Cells are identified for isolation using transgenic mouse lines harboring fluorescent reporters, with drivers that allow enrichment for cell classes based on marker genes. For electrophysiological and morphological analyses, excitatory cells with layer-enriched distribution and inhibitory cells expressing canonical markers were isolated. Brain areas selected for analysis include subregions of visual cortex, motor cortex, and the anterior lateral motor cortex (ALM), which lies in the secondary motor area (MOs). Subregions from the secondary visual areas are also included.

For transcriptomic analysis, regional and laminar dissections were performed on specimens from pan-neuronal, pan-excitatory, and pan-inhibitory transgenic lines, to sample comprehensively. Data from the lateral geniculate nucleus (LGd) is also included.
