Genomics Area

DNA and RNA sequencing has become an invaluable tool for fundamental and applied research in areas as diverse as cancer genetics, rare disorders, host-pathogen interactions, preservation of endangered species, evolutionary studies and the improvement of species of agricultural or farming interest.

The OmicsTech Genomics Area is formed by the Centro Nacional de Análisis Genómico (CNAG-CRG) sequencing platform and the Genomics and Transcriptomics facility of the Center for Omic Sciences (COS). The CNAG-CRG is one of the top European centres in terms of sequencing capacity and diversity of capabilities. It focuses its activities on second- and third-generation sequencing technologies and the corresponding analysis tools. The CNAG-CRG expertise is based on extensive experience in sample quality control, library construction methods, sequencing, lab automation, data banking and data analysis. The service portfolio is constantly evolving and includes state-of-the-art applications such as long-read nanopore sequencing, single-cell RNA sequencing and ATAC sequencing. Processes at the CNAG-CRG run under ISO 9001:2015 certification and ISO 17025:2005 accreditation. The services offered by the COS through its Genomics and Transcriptomics facility focus on small genome sequencing and metagenomic studies.

The OmicsTech Genomics Area can guide researchers at any phase of a project, from experimental design to data analysis.

For further information, or to discuss possible collaborations with the CNAG-CRG or COS facilities, please contact genomics@omicstech-icts.org.

Applications

Whole genome sequencing with short reads (Illumina)

Whole genome sequencing (WG-Seq) is the process of determining the complete DNA sequence of an organism's genome at a single time. It delivers a base-by-base view of all genomic alterations, including single nucleotide variants (SNVs), small insertions and deletions (indels), copy number variants (CNVs) and structural variants (SVs).
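As a small illustration of how the sequence-level variant classes named above differ, the following sketch labels a variant from its reference and alternative alleles. The function name and the 50 bp boundary between small indels and structural variants are assumptions made for this example (the boundary is a common convention, not a rule stated by the facility), and copy number variants cannot be recognised from a single allele pair in this way.

```python
def classify_variant(ref: str, alt: str, sv_min_length: int = 50) -> str:
    """Label a variant from its REF and ALT allele strings.

    `sv_min_length` is a hypothetical cutoff: length changes at or above it
    are reported as structural, smaller ones as small indels.
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) == len(alt):
        # Same length but more than one base: multi-nucleotide variant.
        return "MNV"
    size = abs(len(ref) - len(alt))
    kind = "insertion" if len(alt) > len(ref) else "deletion"
    if size >= sv_min_length:
        return f"structural {kind}"
    return f"small {kind}"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A")]:
        print(ref, alt, classify_variant(ref, alt))
```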

Whole genome sequencing with long reads (ONT)

Whole genome sequencing with ONT technology produces reads that are several kilobases long. They are extremely useful for improving genome assemblies, identifying large structural variations and phasing alleles to their respective parental homologues.

Whole genome bisulfite sequencing and/or oxidative bisulfite sequencing

DNA methylation sequencing by whole genome bisulfite sequencing (WGBS-Seq) is the gold standard for an unbiased assessment of DNA methylation at single-base resolution. Oxidative bisulfite sequencing (oxBS-Seq) differentiates between 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC).
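As a rough illustration of how such data are commonly summarised, the sketch below computes per-site modification levels from read counts. It assumes the usual interpretation that unconverted cytosines in a bisulfite library reflect 5mC plus 5hmC, whereas in an oxidative bisulfite library they reflect 5mC only, so that 5hmC can be estimated as the difference. The function names and the example counts are hypothetical and do not describe the facility's pipeline.

```python
def methylation_level(c_reads: int, t_reads: int) -> float:
    """Fraction of reads that still show C (unconverted) at one cytosine site."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("site has no coverage")
    return c_reads / total


def estimate_5mc_5hmc(bs_c: int, bs_t: int, oxbs_c: int, oxbs_t: int):
    """Return (5mC level, 5hmC level) for one site from BS and oxBS counts."""
    bs_level = methylation_level(bs_c, bs_t)        # 5mC + 5hmC
    oxbs_level = methylation_level(oxbs_c, oxbs_t)  # 5mC only
    # Sampling noise can push the difference below zero, so clamp it at 0.
    hmc_level = max(0.0, bs_level - oxbs_level)
    return oxbs_level, hmc_level


if __name__ == "__main__":
    mc, hmc = estimate_5mc_5hmc(bs_c=80, bs_t=20, oxbs_c=60, oxbs_t=40)
    print(f"5mC = {mc:.2f}, 5hmC = {hmc:.2f}")
```

In practice a statistical model is usually preferred over a plain subtraction at low coverage; the clamp above is only the simplest way to keep the estimate non-negative.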

Targeted bisulfite sequencing and/or oxidative bisulfite sequencing

Targeted methylation sequencing (Methyl-Seq) offers a balanced, cost-effective choice between whole-genome bisulfite sequencing and methylation arrays for both screening and biomarker discovery studies.

Whole exome sequencing

Whole exome sequencing is used to investigate all protein-coding regions of the genome with enhanced coverage for disease-associated genes. It is suitable for identifying nucleotide variants across coding regions and is a cost-effective alternative to whole genome sequencing that produces smaller, more manageable data sets.

Targeted sequencing

Targeted sequencing is a highly directed approach that enables the analysis of genetic variation in specific genomic regions, using pre-designed gene panels, custom gene panels or amplicon sequencing. Potential applications are the discovery of rare mutations in complex samples (such as highly heterogeneous tumour samples) or sequencing the bacterial 16S rRNA gene across multiple species, a widely used method for phylogeny and taxonomy studies.

RNA sequencing with short reads (Illumina)

RNA sequencing is a sensitive and accurate method for determining the primary sequence and relative abundance of all RNA molecules. It provides strand-specific information that allows transcripts to be assigned to the corresponding DNA strand. The most common applications are differential gene expression, allele-specific gene expression, alternative splicing, fusion transcripts, de novo transcriptome assembly and genome annotation, and single nucleotide variant identification.

RNA sequencing with long reads (ONT)

RNA sequencing with long reads allows full-length characterisation of native RNA or cDNA, as well as the identification of complex transcript isoforms, and chimeric or gene fusion transcripts.

Small RNA sequencing

Small RNA sequencing is a technique to isolate and sequence small RNA species, such as microRNAs (miRNAs). The study of small RNAs makes it possible to examine tissue-specific expression patterns and isoforms of miRNAs in order to understand how post-transcriptional regulation contributes to phenotype.

ChIP sequencing

ChIP sequencing (ChIP-Seq) combines chromatin immunoprecipitation (ChIP) assays with sequencing. It is a powerful method for identifying genome-wide DNA binding sites for transcription factors and other proteins, and it enables thorough examination of the interactions between proteins and nucleic acids on a genome-wide scale.

Genotyping by sequencing

Genotyping by sequencing (GBS) is a method to obtain genotype data from samples by using restriction enzyme digestion followed by massively parallel sequencing. It is a very robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species, even if no reference genome is available.
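The restriction digestion step can be pictured with a minimal in silico sketch that cuts a sequence at every occurrence of a recognition site. EcoRI (G^AATTC) is used here only because its site is well known; the enzyme choice, the function name and the example sequence are assumptions for illustration and are not taken from the facility's protocol. Only one strand is modelled.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut `sequence` at every occurrence of `site`.

    The cut is placed `cut_offset` bases into the site
    (EcoRI cuts after the first G: G^AATTC).
    """
    seq = sequence.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder after the last cut
    return fragments


if __name__ == "__main__":
    for fragment in digest("AAGAATTCTTTGAATTCCC"):
        print(len(fragment), fragment)
```

Because only fragments flanked by cut sites within a suitable size range are sequenced, the same loci tend to be sampled across individuals, which is what makes the approach economical for genotyping.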

Hi-C sequencing

The three-dimensional configuration of the genome is complex, dynamic and crucial for gene regulation. Hi-C sequencing is a chromosome conformation capture technique that reveals the interactions between different pieces of DNA.

Single cell RNA sequencing

Single cell RNA sequencing (scRNA-Seq) allows for transcriptome-wide analyses of individual cells. By examining complex systems, such as tissues or organs, at the single-cell level, it reveals cellular heterogeneity and the state of each cell type.

ATAC sequencing

The assay for transposase-accessible chromatin sequencing (ATAC-Seq) is a rapid and sensitive technique to assess genome-wide chromatin accessibility. It uses the Tn5 transposome to detect nucleosome-free regions of the genome. It is widely used for nucleosome mapping and to determine transcription factor occupancy.

Whole exome sequencing plus variant confirmation by Sanger Sequencing

Next-generation sequencing (NGS) has dramatically changed the molecular diagnostic area. Whole exome sequencing by NGS and confirmation of the reported candidate variants by Sanger Sequencing is a common practice in many clinical labs.

DNA fingerprinting by whole genome sequencing and fragment analysis

DNA fingerprinting enables identification of individuals using hair, blood, semen or other biological samples, based on unique patterns (polymorphisms) in their DNA. It can be done by whole genome sequencing in the discovery phase and/or by amplified fragment length polymorphism (AFLP, the detection of multiple DNA restriction fragments by means of PCR amplification) for larger numbers of samples.

Metagenomics sequencing by whole genome and amplicon-based (16S, 18S, or ITS) sequencing

Metagenomics is the study of genetic material recovered directly from environmental samples. Accurate identification of species is a major challenge and can be addressed by two complementary approaches: whole genome sequencing (WG-Seq) and 16S, 18S or ITS amplicon sequencing. The latter is a cost-effective approach for studying large numbers of samples, although it is limited to one specific gene, while the WG-Seq strategy is applied to smaller numbers of samples but works well for all organisms found in the same sample, both prokaryotes and eukaryotes.

Targeted sequencing of small fragments by capillary electrophoresis

The Sanger method is based on the sequential synthesis of a DNA strand complementary to a single-stranded template, in the presence of DNA polymerase, the four 2'-deoxynucleotides that make up the DNA sequence (dATP, dGTP, dCTP and dTTP) and four chain-terminating dideoxynucleotides. Using specific primers for known genes, different polymorphisms in the sequence can be detected.

Fragment analysis by capillary electrophoresis

Microsatellites, also known as simple sequence repeats (SSRs) or short tandem repeats (STRs), have been popular markers due to their high polymorphism. The PCR reaction is performed with fluorescent dye-labelled primers, the PCR fragments are then separated on a capillary DNA sequencing machine, and the data are analysed using GeneMapper™ software.

Targeted gene expression

Allelic expression of specific genes can be quantified on a real-time PCR (RT-PCR) system using TaqMan™ assays or SYBR Green intercalating dye assays.

TaqMan SNP genotyping analysis

TaqMan is a mature, validated and widely used SNP genotyping technology developed by Life Technologies that runs on a real-time PCR (RT-PCR) system. Each TaqMan genotyping assay contains two primers for amplifying the sequence of interest and two allele-specific, differently labelled TaqMan probes for allele detection. Each allele-specific MGB probe carries a fluorescent reporter dye (either FAM or VIC) at the 5’ end and a fluorescence quencher at the 3’ end.
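Because each allele is reported by a different dye, a genotype can in principle be read from the balance of the two end-point signals. The sketch below is a deliberately simplified illustration of that idea: the function name, the two thresholds and the assumption that the inputs are background-corrected, normalised intensities are all hypothetical, and instrument software typically relies on cluster analysis across many samples rather than on fixed cutoffs.

```python
def call_genotype(vic: float, fam: float,
                  min_signal: float = 0.5,
                  hom_cutoff: float = 0.8) -> str:
    """Call a genotype from normalised VIC and FAM end-point signals.

    min_signal  -- hypothetical minimum total signal for a reliable call
    hom_cutoff  -- hypothetical fraction of total signal from one dye
                   above which the sample is called homozygous
    """
    total = vic + fam
    if total < min_signal:
        return "undetermined"  # amplification failed or signal too weak
    vic_fraction = vic / total
    if vic_fraction >= hom_cutoff:
        return "VIC/VIC"       # homozygous for the VIC-labelled allele
    if vic_fraction <= 1 - hom_cutoff:
        return "FAM/FAM"       # homozygous for the FAM-labelled allele
    return "VIC/FAM"           # both probes report: heterozygous


if __name__ == "__main__":
    for vic, fam in [(2.0, 0.1), (0.1, 2.0), (1.0, 1.0), (0.1, 0.1)]:
        print(vic, fam, call_genotype(vic, fam))
```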

Gene expression or miRNA analysis by microarrays

DNA microarrays are microscope slides that are printed with thousands of tiny spots in defined positions, with each spot containing a known DNA sequence or gene. To perform a microarray analysis, RNA molecules (mRNA for gene expression and miRNA for miRNA analysis) are typically collected from both an experimental sample and a reference sample. The samples are then converted into complementary DNA (cDNA), and each sample is labelled with a fluorescent probe. The data gathered through microarrays can be used to create gene expression profiles, which show simultaneous changes in the expression of many genes in response to a particular condition or treatment.

Small genome sequencing

Whole genome sequencing of organisms with small genomes can be performed on a next-generation sequencing platform.

Small RNA and miRNA sequencing by Ion Torrent

The small RNA sequencing service covers sequencing of known small RNA molecules, discovery of novel small RNAs, mutation characterization and expression profiling of small RNAs, using NGS technologies and a dedicated data analysis pipeline.

Mitochondrial DNA sequencing

Mitochondria play a key role in essential cellular functions. Mitochondrial DNA sequencing is a useful tool for researchers studying human diseases, and can also be used in population genetics and biodiversity assessments.

Image analysis of visible spectra and chemiluminescence

Imaging systems are used for the detection, quantitation, and analysis of proteins and nucleic acids in gels and on membranes. They can be used for detection and automated data analysis with all common modes of protein and nucleic acid staining and labelling: colorimetric, fluorescent and chemiluminescent.

Bioinformatic Applications

Germline variant identification and annotation

Next-generation sequencing is extensively used to test for inherited disorders and to identify germline variants associated with complex disorders. Our extensively benchmarked pipeline identifies germline single nucleotide variants and small insertions and deletions from whole genome sequencing, whole exome sequencing or targeted sequencing data. The standard pipeline can be customised to take into account specific challenges such as polyploidy or distant reference genomes. A specific analysis pipeline for genotyping by sequencing data enables the analysis of organisms with or without a reference genome.

Somatic variant identification and annotation

Genomic characterization of tumours is increasingly being used to guide treatment decisions. Our extensively benchmarked pipeline can identify somatic single nucleotide variants and small insertions and deletions from whole genome sequencing, whole exome sequencing or targeted sequencing data. The standard pipeline can be customised to take into account specific challenges such as the absence of paired control samples.

Transcript quantification and differential expression analysis

RNA sequencing analysis is widely used in many labs to functionally characterize organisms, tissues or cells. Our RNA sequencing analysis pipeline includes transcript quantification, differential gene expression analysis, differential alternative splicing analysis, detection of gene fusion events and single nucleotide variant identification from transcripts.

Whole genome and whole transcriptome de novo assembly

De novo sequence assembly is challenging, not only because of the sheer size of the data and computational requirements, but also due to repetitive elements, polyploidy and variation (single-nucleotide, insertions/deletions, and larger structural variants). We aim to meet these challenges by optimizing and tuning our analysis strategy as each project demands.

Methylation analysis

Epigenetic changes, such as cytosine methylation, are known to play an important role in the regulation of gene expression. Our pipeline allows large scale, high performance analysis of DNA methylation from bisulfite sequencing datasets, as well as single nucleotide variant identification.

3D genome analysis

The three-dimensional organization of the genome plays important, yet poorly understood, roles in gene regulation. We have pioneered the development of hybrid methods for determining the structures of genomes and genomic domains from Hi-C sequencing data.

Single cell RNA sequencing analysis

Single-cell RNA sequencing data can be very useful to elucidate cellular heterogeneity and related dynamics in organs and organisms, in health and disease, in humans and model systems. We have sophisticated computational pipelines that support a variety of analyses, including distances between single cells, deconvolution, clustering, differential expression and hierarchical markers.

Microarray Analysis

Microarray data are analysed for gene expression (mRNA and miRNA) using the Gene Expression and miRNA workflows in GeneSpring GX 13.1. Pathway analysis can also be done using the same software.

Equipment

Illumina sequencing instruments
NovaSeq, HS4000, HS2500 and MiSeq

Illumina short-read sequencing instruments with diverse capabilities that support a range of standard and custom sequencing applications, from whole genome sequencing to ChIP sequencing.

ONT sequencing instruments
MinIONs and GridION

Oxford Nanopore Technologies (ONT) long read sequencing instruments with low and medium throughput capabilities. They produce ultra-long reads of several kilobases, only limited by the length of the molecules to be sequenced.

Single cell/DNA molecule capture system
Chromium Controller

Advanced microfluidics platform in which single cells or DNA molecules are encapsulated in nanoliter microreactor droplets. It combines large partition numbers with a massively diverse barcode library to generate >100,000 barcode-containing partitions.

Systems for automated liquid handling
Gilson PIPETMAX 268, Sciclone NGS Workstations, Zephyr SPE, Mantis, BRAVO

Automated liquid handling systems for processing up to 96 samples or sequencing libraries simultaneously, in pre-PCR, semi-pre-PCR and post-PCR restricted areas.

Systems for DNA/RNA fragmentation
Covaris E210, Covaris LE220-plus

Systems for automated high throughput ultrasonication that support up to 96 DNA samples simultaneously.

Systems for quantification and quality control of DNA/RNA samples and libraries
Synergy™ HT Multi-Mode Microplate Reader, Qubit, NanoDrop 2000, Bioanalyzers 2100, TapeStation, Fragment Analyzer, ChemiDoc Gel Imaging

Fluorometers, microvolume spectrophotometers and parallel capillary electrophoresis systems for quantification and integrity evaluation of DNA/RNA samples and sequencing libraries, and imaging systems for the detection and quantitation of nucleic acids.

SageHLS HMW library System

System for high molecular weight DNA extraction and large fragment capture.

Real-time PCR instruments
LightCycler 480 and ABI 7900HT real-time PCR

High-performance, medium- to high-throughput real-time PCR platforms that support mono- or multicolor applications, as well as multiplex protocols.

Laboratory Information Management System
LIMS

In-house developed LIMS for tracking projects, samples, libraries, sequencing runs and results.

CNAG-CRG informatics infrastructure

CNAG-CRG computing cluster with 3472 computing cores, 7.6 petabytes of data storage, an internal 56 Gb/s network and multiple 10 Gb/s direct physical connections to the Barcelona Supercomputing Center, which has over 48,000 compute cores.

Ion Torrent sequencing instruments
PGM and S5 System

Next-generation sequencing (NGS) is a high-throughput methodology that allows rapid sequencing of base pairs in DNA or RNA samples. It can be used in a wide range of applications, for example, in gene expression profiling, for the detection of epigenetic changes and for molecular analysis.

3500 Genetic Analyzer

A capillary electrophoresis instrument for Sanger sequencing, suitable for processing up to two 96-well sample plates at a time.

Agilent microarray scanner

The core principle behind microarrays is the hybridization between two DNA strands, the property of complementary nucleic acid sequences to specifically pair with each other by forming hydrogen bonds between complementary nucleotide base pairs. Fluorescently labelled target sequences that bind to a probe sequence generate a signal. Microarrays use relative quantitation in which the intensity of a feature (signal) is compared to the intensity of the same feature under a different condition, and the identity of the feature is known by its position.
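Relative quantitation of this kind is often expressed as a log2 ratio between the two conditions for each feature. The following sketch shows the arithmetic only; the function names, the intensity floor and the example feature positions and values are hypothetical, and real workflows apply background correction and normalisation before this step.

```python
import math


def log2_ratio(sample: float, reference: float, floor: float = 1.0) -> float:
    """log2(sample / reference) for one feature.

    `floor` is a hypothetical lower bound that keeps very low or zero
    intensities from producing infinite or undefined ratios.
    """
    return math.log2(max(sample, floor) / max(reference, floor))


def compare_arrays(sample: dict, reference: dict) -> dict:
    """Compute log2 ratios for features present in both arrays.

    Features are keyed by their position on the array, which is what
    identifies the probe sequence.
    """
    shared = sample.keys() & reference.keys()
    return {pos: log2_ratio(sample[pos], reference[pos]) for pos in sorted(shared)}


if __name__ == "__main__":
    treated = {(1, 1): 800.0, (1, 2): 100.0, (1, 3): 250.0}
    control = {(1, 1): 200.0, (1, 2): 400.0, (1, 3): 250.0}
    for position, ratio in compare_arrays(treated, control).items():
        print(position, ratio)
```

A ratio of 0 means no change between conditions, while positive and negative values indicate higher and lower signal in the sample, respectively.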