By Sarah Varghese
DNA is a double helix built from nucleotides, each consisting of one of four nitrogenous bases attached to a sugar and a phosphate group. The alternating sugars and phosphates form what is known as the sugar-phosphate backbone. The four bases are adenine, guanine, cytosine and thymine. They pair with specific partners to form units known as base pairs (bp): adenine (A) with thymine (T) and cytosine (C) with guanine (G). The human genome comprises approximately 3 billion such base pairs, which provide the instructions for building and maintaining a human being.
The method used to determine the exact order of nucleotides, or bases, in a DNA molecule is referred to as gene or DNA sequencing. Gene sequencing is considered a revolutionary development in the history of modern science because it reveals the genetic information carried in a particular DNA segment. It can be equated to hitting the jackpot, since the nucleotide sequence is the most fundamental level of knowledge of a gene or genome. It is the blueprint that contains the instructions for building an organism, and no understanding of genetic function or evolution could be complete without it.
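The pairing rule described above (A with T, C with G) can be illustrated with a minimal Python sketch. The function names are made up for illustration, and the reversal step assumes the standard convention that the two strands of the helix run in opposite directions, which the text above does not discuss.

```python
# Pairing rule from the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the pairing partner of every base in `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5' to 3' direction.

    Assumption: the two strands are antiparallel, so the partner strand
    is read in the opposite direction.
    """
    return complement(strand)[::-1]

print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Because each base has exactly one partner, knowing the sequence of one strand is enough to write down the other.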
Gene sequencing finds applications in various domains and has played a pivotal role in their advancement.
It has medical applications in disease diagnosis, treatment and epidemiological studies. Gene sequencing can reveal the genetic variations that underlie diseases, thereby helping in their early detection. It serves as a valuable tool for diagnosing genetic diseases, identifying tumorigenic gene mutations and detecting microbial infections, among other uses, all of which contribute to the advancement of personalized medicine.
It improves agriculture by enabling more effective animal and plant breeding, which in turn helps prevent disease outbreaks and increases yield.
It supports evolutionary studies across populations and species through comparative analysis of homologous DNA sequences from different organisms, which helps establish how those organisms are related.
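As a rough illustration of comparing homologous sequences, the sketch below computes the percentage of matching positions between two sequences. It assumes the sequences have already been aligned to the same length, and the three sequences are made-up examples rather than real data. Real comparative analyses use proper alignment and evolutionary models.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)

# Hypothetical homologous segments from three organisms.
reference = "ATGCCGTA"
relative = "ATGCTGTA"
distant = "TTGACGAA"

print(percent_identity(reference, relative))  # 87.5
print(percent_identity(reference, distant))   # 62.5
```

A higher identity between two homologous segments is commonly read as a sign of a closer evolutionary relationship.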
DNA sequencing also plays a crucial role in investigations and parentage testing in the field of forensic sciences.
DNA sequencing helps decode an organism’s entire genome sequence, and with it its genetic makeup. It is also used to analyse the transcriptional activity of all genes in an organism, enabling researchers to spot patterns of gene expression and the regulatory mechanisms behind them. Predicting protein-coding regions from sequence data also supports the study of protein structure and function. DNA sequencing therefore plays a crucial role in genomics and proteomics research.
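One very simplified way to look for candidate protein-coding regions is to scan for open reading frames: stretches that begin with the start codon ATG and run, three bases at a time, to a stop codon. The sketch below is an assumption-laden illustration and not the method used by real gene-prediction tools. It scans only the given strand, and the function name and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna: str, min_codons: int = 2):
    """Return (start, end, sequence) for each open reading frame on the given strand.

    `min_codons` is the minimum number of codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                       # three possible reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):  # step through whole codons
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                        # first start codon opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None                     # look for the next start codon
    return orfs

print(find_orfs("CCATGAAATGGTAACC"))  # [(2, 14, 'ATGAAATGGTAA')]
```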
Methods of Gene Sequencing
Gene sequencing methodologies can be broadly classified into two categories: the classical, or first-generation, sequencing technologies and the next-generation sequencing technologies.
First-generation sequencing technology
Having emerged in the 1970s, the first-generation sequencing technologies include the Maxam-Gilbert method, developed by and named after the American molecular biologists Allan M. Maxam and Walter Gilbert, and the Sanger method, developed by the English biochemist Frederick Sanger. Sanger sequencing is among the earliest DNA sequencing technologies and became the more commonly employed of the two approaches.
The method relies on DNA polymerase and on terminator nucleotides: dideoxynucleotides (ddNTPs), one for each of the four bases, which lack the 3’ hydroxyl group needed to extend a DNA chain. Briefly, the DNA sequence of interest serves as the template for a chain-termination polymerase chain reaction (PCR), in which a low ratio of ddNTPs is mixed with the normal dNTPs. Whenever the DNA polymerase happens to incorporate a ddNTP instead of a dNTP, elongation stops, so the reaction produces DNA fragments of varying length. In the classical setup each of the four ddNTPs is used in its own reaction, so the reaction in which a fragment appears identifies its terminal base. These chain-terminated oligonucleotides are then separated by size via gel electrophoresis, which orders them from smallest to largest. The last step is simply to read the gel. Because DNA polymerase synthesizes in the 5’ to 3’ direction, reading the terminal bases from the shortest fragment to the longest gives the 5’ to 3’ sequence of the newly synthesized strand, and each terminal ddNTP is the complement of a specific nucleotide in the original template. In this way, the sequence of the original DNA strand can be determined.
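The logic of chain termination and gel reading can be sketched in a few lines of Python. This is a simplified model and not a simulation of the chemistry. It assumes one reaction per ddNTP, and it assumes that enough molecules are present for termination to occur at every position, so the random incorporation is replaced by listing every possible fragment length. The template and all function names are made up for illustration.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesize(template_3to5: str) -> str:
    """Strand built 5'->3' by the polymerase, complementary to a template written 3'->5'."""
    return "".join(PAIRS[base] for base in template_3to5)

def terminated_fragments(new_strand: str) -> dict:
    """For each ddNTP reaction, list the lengths of the fragments that end in that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes

def read_gel(lanes: dict) -> str:
    """Read bands from shortest to longest; the lane of each band names its terminal base."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)

template = "TACGGTCA"                     # hypothetical template, written 3'->5'
new_strand = synthesize(template)
lanes = terminated_fragments(new_strand)
read = read_gel(lanes)

print(read)              # ATGCCAGT  (the new strand, 5'->3')
print(synthesize(read))  # TACGGTCA  (the template, recovered by base pairing)
```

Sorting by fragment length plays the role of the gel: the shortest fragment ends at the first base added, the next shortest at the second, and so on.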
Next-generation sequencing technology
Next-generation sequencing, also known as second-generation or massively parallel sequencing, has largely superseded the Sanger method thanks to its cost-effectiveness, speed and ability to determine the sequence of millions of fragments simultaneously. Advances in bioinformatics that make it possible to store and handle large databases have further improved the utility of next-generation sequencing technologies. Second-generation sequencing platforms include SOLiD and Illumina. Third-generation sequencing platforms include Oxford Nanopore and Helicos. Third-generation sequencing, also known as long-read sequencing, interrogates billions of DNA and RNA templates while detecting variable methylation. It enables the detection of more variations, including those that cannot be detected using short-read sequencing methods alone.
Single-Cell Sequencing: A journey with plenty of milestones
Almost all cells in the human body carry the same genetic material, but the transcriptome of each cell reflects the activity of only a subset of genes. Profiling gene expression in cells is considered one of the most reliable approaches to probing cell identity, state, function and response. Huge technological breakthroughs have been made in single-cell transcriptomics during the last decade. With single-cell RNA sequencing, it is now possible to analyse the transcriptome of millions of individual cells in a single study. This allows us to classify, characterize and distinguish each cell at the transcriptome level, and so to identify cell populations that are rare but functionally important.
Single-cell sequencing examines the genetic material of individual cells from a cell population, as opposed to conventional next-generation sequencing methods, which examine the cell population as a whole. Though it is a relatively recent development, it is opening doors in our understanding of cell populations and tissues. Single-cell technologies are currently used to measure the genome, the DNA methylome or the transcriptome of each cell in a population. This has enabled us to identify novel mutations in cancerous cells, explore the progressive epigenome variations that occur during embryonic development and investigate how a seemingly homogeneous population of cells expresses specific genes. First performed in 2009 on a mouse blastomere, the technique has since undergone numerous developments and modifications, like any other methodology, and now follows this protocol:
Isolation of single cells from a cell population.
Extraction, processing and amplification of the genetic material of each isolated cell.
Preparation of a “sequencing library” including the genetic material of an isolated cell.
Sequencing of the library using a next-generation sequencer.
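The difference between sequencing a population as a whole and sequencing each isolated cell can be sketched as follows. The example assumes that every sequenced read has already been assigned to a cell and to a gene. The cell labels, gene names and read counts are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical reads after sequencing: (cell the read came from, gene it maps to).
reads = [
    ("cell_1", "GeneA"), ("cell_1", "GeneA"), ("cell_1", "GeneB"),
    ("cell_2", "GeneB"), ("cell_2", "GeneB"), ("cell_2", "GeneC"),
]

def pooled_counts(reads):
    """Population-level view: reads per gene, ignoring which cell they came from."""
    return Counter(gene for _, gene in reads)

def counts_per_cell(reads):
    """Single-cell view: a separate reads-per-gene table for every cell."""
    table = defaultdict(Counter)
    for cell, gene in reads:
        table[cell][gene] += 1
    return table

print(pooled_counts(reads))
for cell, counts in counts_per_cell(reads).items():
    print(cell, dict(counts))
```

In the pooled view all three genes appear to be expressed by the population. The per-cell view shows that GeneA is seen only in `cell_1` and GeneC only in `cell_2`, which is the kind of difference the single-cell approach is meant to expose.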
Single-cell sequencing finds applications in fields such as drug discovery, neuroscience, embryogenesis, organogenesis, profiling of human health and disease, and computational sciences. It has brought researchers a few steps closer to understanding the mystery that is the living form.
The Human Genome Project: A guide to mankind’s emergence
The Human Genome Project is among the most remarkable biomedical research efforts undertaken in the 20th century. Carried out by a consortium of scientists across the globe, it changed the entire landscape of our understanding of the human genome. The project had a single goal: generating the first sequence of the human genome. Running from 1990 to 2003, it was one of the most ambitious and important scientific endeavours in the history of mankind. It was atypical for biomedical research in that it was driven by a desire to explore an unknown part of the biological world, rather than by a specific theory or hypothesis. After a period of trial and evaluation, the project ultimately relied on one method for DNA sequencing, the Sanger method, but not before greatly advancing it through a series of crucial technical innovations.
The genome sequence constructed by the project was a composite of several people whose identities were kept anonymous, with the majority of the sequence coming from one person of blended ancestry. More specifically, 70% of the reference genome was generated from that individual’s DNA, while the remaining 30% came from a combination of 19 other individuals of mostly European ancestry. Just as Rome wasn’t built in a day, the human genome sequence took more than a decade to assemble, with researchers limited by the technology available at the time. However, nothing worth having comes easy, and by the year 2000 the International Human Genome Sequencing Consortium had produced a draft that accounted for 90% of the human genome, with about 150,000 gaps that could not be determined accurately.
With further advances they achieved a breakthrough, announcing in April 2003 that they had generated an essentially complete human genome sequence that accounted for 92% of the human genome with fewer than 400 gaps. This was significantly better than the draft, and it paved the way for interdisciplinary science and new advances in many fields, changing the lives of many forever. Nineteen years later, on March 31, 2022, the Telomere-to-Telomere consortium announced that it had filled in the remaining gaps and produced the first truly complete human genome sequence.
Gene sequencing can therefore rightfully be considered a treasure discovered by mankind, one that heralded a new era in scientific research and its numerous applications.