From the Human Genome Project to the Present: A Brief History of Genome Sequencing

The Human Genome Project (HGP) was a global research initiative that aimed to determine the sequence of human DNA and locate all the genes within the human genome.

In May 1985, Robert Sinsheimer, the Chancellor of the University of California, Santa Cruz (UCSC), gathered a group of 12 experts to discuss the benefits and drawbacks of a potential project called the Human Genome Project. The idea was controversial. Some people felt that big science like the HGP would divert resources from smaller scientific projects, that sequencing the genome was not worth it because most of it was junk, and that the project was too complex and would not attract good scientists. In the early years of advocacy for the HGP, about 80% of biologists, as well as the National Institutes of Health (NIH), were against it.

The US Department of Energy (DOE) was one of the initial advocates for the HGP. It argued that knowing the human genome sequence could help us understand the effects of radiation on the genome from events like atomic bomb explosions and other forms of energy transmission.

The DOE played a critical role in advocating for the HGP and stimulating debate around its potential benefits. Interestingly, there was more support for the project from the US Congress than from most biologists. Congress understood the potential for international competitiveness, economic benefits, and more effective approaches to dealing with disease. A National Academy of Sciences committee report in 1988 endorsed the project, and by 1990 it had been initiated. The project was completed ahead of schedule and under budget, with the finished sequence published in 2004.

The HGP had a significant impact on the field of biology, and it is currently driving a revolution in medicine. The project’s completion has given scientists a better understanding of the genetic basis of human diseases, allowing for the development of new diagnostic tools and therapies. This has led to a paradigm shift in medicine, with personalized, genomics-based approaches becoming increasingly common. The HGP has also opened up new avenues for research, including the study of genetic variations and their impact on human health. Overall, the HGP has transformed our understanding of the human genome and has the potential to significantly improve human health in the years to come.

The Human Genome Project had two main principles from the start.

  • The first was to include researchers from all over the world to understand our shared genetic heritage and benefit from diverse approaches.
  • The second was to make all human genome sequence information freely available to the public within 24 hours.

Around 200 labs in the US and over 18 countries contributed to the project. The project had two early goals: to build genetic and physical maps of the human and mouse genomes, and to sequence the smaller yeast and worm genomes as a test run for sequencing the larger, more complex human genome. When the yeast and worm sequencing efforts proved successful, the project moved on to sequencing the human genome.

Objectives of the Human Genome Project (HGP)

The Human Genome Project had several primary objectives between 1998 and 2003. These objectives included:

(i) sequencing the entire human genome,

(ii) developing new sequencing technologies,

(iii) analyzing variations in human genomic sequences,

(iv) developing technologies to understand genomic functions,

(v) addressing ethical, legal, and social issues related to genomics,

(vi) advancing the fields of bioinformatics and computational biology, and

(vii) providing research training opportunities.

Bioinformatics is a field of modern biotechnology where human genomics research has a central role.

The HGP has the potential to greatly benefit medical science and improve human quality of life. However, it is important to consider the ethical, legal, and social implications that may arise from this research. To ensure that the benefits of the HGP are sustained, a comprehensive framework should be established and followed by all scientists and researchers involved. This framework, known as the Ethical, Legal, and Social Implications (ELSI) model, is crucial in preventing violations of human values and dignity; if it is not adhered to diligently, misuse of genomic research could cause serious harm worldwide.

First-generation DNA sequencing

First-generation DNA sequencing, also known as Sanger sequencing, is a method of sequencing DNA that was developed by Frederick Sanger in the 1970s. It was among the first technologies to enable the sequencing of DNA, and in 1977 Sanger's group sequenced the first complete DNA genome, that of the bacteriophage Phi X174.

In Sanger sequencing, DNA is replicated in the presence of modified nucleotides (dideoxynucleotides), which terminate DNA synthesis when incorporated into the growing strand. The resulting fragments are separated by size using gel electrophoresis, and the sequence of the DNA is read by determining the identity of the terminating nucleotide at each position.
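
The chain-termination idea described above can be sketched as a small simulation. This is only an illustrative toy, not a model of any real instrument: the function names, the termination probability, and the number of simulated molecules are all assumptions chosen for the example, and strand orientation is simplified so that the new strand is just the base-by-base complement of the template.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sanger_fragments(template, ddntp_fraction=0.2, n_molecules=5000, seed=0):
    """Simulate chain-terminated synthesis on many copies of one template.

    Returns a dict mapping fragment length -> terminating base.
    ddntp_fraction, n_molecules and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Simplification: the template is written so that its complement
    # can be read left to right as the newly synthesized strand.
    new_strand = [COMPLEMENT[base] for base in template]
    fragments = {}
    for _ in range(n_molecules):
        for length, base in enumerate(new_strand, start=1):
            # With some probability a terminating nucleotide is incorporated
            # and synthesis stops, leaving a fragment of this length.
            if rng.random() < ddntp_fraction:
                fragments[length] = base
                break
    return fragments


def read_gel(fragments):
    """Order fragments by size (as a gel would) and read the end bases."""
    return "".join(fragments[length] for length in sorted(fragments))


if __name__ == "__main__":
    template = "TACGGATTCAGC"  # hypothetical example sequence
    print(read_gel(sanger_fragments(template)))
```

Sorting the fragments by length plays the role of gel electrophoresis here: the shortest fragment reveals the first base of the new strand, the next shortest reveals the second, and so on.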

Sanger sequencing is considered a “first-generation” sequencing technology, and it remained the gold standard for DNA sequencing for several decades. However, it is a relatively slow and expensive process compared to newer sequencing technologies.

Despite being surpassed by newer technologies such as next-generation sequencing, Sanger sequencing is still widely used today for applications such as validating the results of next-generation sequencing, sequencing individual genes, and analyzing specific regions of the genome.

Second-generation DNA sequencing

The genomics revolution refers to the tremendous advancements made in the field of genomics, which is the study of an organism’s complete set of DNA, including all of its genes. These advancements have largely been driven by improvements in DNA sequencing technology, which have made it faster, cheaper, and more accessible to sequence DNA.

The speed and efficiency of DNA sequencing have improved at a remarkable rate, even faster than the rate of improvement in computing technology predicted by Moore’s law. Specifically, between 2004 and 2010, sequencing capabilities doubled every five months, compared to the doubling of microchip complexity every two years predicted by Moore’s law.
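
To make the comparison concrete, the two doubling times can be turned into cumulative growth factors. The short calculation below is a sketch that assumes the period from 2004 to 2010 spans exactly six years and that both trends are perfectly exponential; the function name is hypothetical.

```python
def fold_increase(months_elapsed, doubling_time_months):
    """Growth factor after months_elapsed for a given doubling time."""
    return 2 ** (months_elapsed / doubling_time_months)


MONTHS = 6 * 12  # 2004 to 2010, taken here as six years (an assumption)

sequencing = fold_increase(MONTHS, 5)   # doubling every five months
moore = fold_increase(MONTHS, 24)       # doubling every two years

print(f"Sequencing capability: about {sequencing:,.0f}-fold")
print(f"Moore's law:           about {moore:,.0f}-fold")
```

Under these assumptions, a five-month doubling time compounds to a gain of more than 20,000-fold over six years, while a two-year doubling time gives only an 8-fold gain over the same period.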

There are many different types of DNA sequencing technologies available today, each with its unique capabilities, chemistries, and specifications. Researchers can choose from a diverse toolbox of sequencing technologies to design experiments that suit their needs.

Illumina’s sequencing platform has emerged as the most successful technology, to the point of near-monopoly. Illumina’s success can be attributed to its high-throughput capabilities, accuracy, and low cost per base. These features have made it the preferred choice for researchers, and thus it can be considered to have made the greatest contribution to the second generation of DNA sequencers.

Third-generation DNA sequencing

Some examples of third-generation sequencing technologies include:

PacBio sequencing: This technology is based on the principle of single-molecule real-time (SMRT) sequencing. A DNA polymerase fixed at the bottom of a tiny well incorporates fluorescently labeled nucleotides, and the light emitted at each incorporation is recorded to read the sequence of a single DNA molecule in real time.

Oxford Nanopore sequencing: This technology threads DNA through a protein nanopore embedded in a membrane and detects changes in electric current as each nucleotide passes through the pore. These changes in current are then analyzed to identify the sequence of the DNA molecule.

Ion Torrent sequencing (often classified as a second-generation platform, since it reads amplified templates rather than single molecules): The principle behind this technology involves sensing the release of hydrogen ions during DNA synthesis. With the addition of each nucleotide to the growing DNA strand, a hydrogen ion is released, and this is detected by a sensor to deduce the sequence of the DNA molecule.
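
The hydrogen-ion detection principle can be illustrated with a toy model. In this kind of chemistry the four nucleotides are supplied one at a time in a repeating cycle, and the signal in each step grows with the number of bases incorporated. The sketch below assumes an idealized, noise-free signal; the flow order "TACG", the function names, and the example sequence are assumptions made for illustration.

```python
def flow_signals(new_strand, flow_order="TACG", max_flows=400):
    """Return (nucleotide, ions_released) for each nucleotide flow.

    new_strand is the sequence being synthesized. The ion count in a flow
    equals the number of consecutive matching bases incorporated, so runs
    of the same base give a proportionally larger signal.
    """
    if set(new_strand) - set(flow_order):
        raise ValueError("sequence contains a base not present in flow_order")
    signals = []
    position = 0
    flow = 0
    while position < len(new_strand) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        ions = 0
        # Incorporate as many copies of this nucleotide as the strand allows.
        while position < len(new_strand) and new_strand[position] == nucleotide:
            ions += 1
            position += 1
        signals.append((nucleotide, ions))
        flow += 1
    return signals


def call_bases(signals):
    """Reconstruct the sequence from idealized per-flow ion counts."""
    return "".join(nucleotide * ions for nucleotide, ions in signals)


if __name__ == "__main__":
    signals = flow_signals("TTACGGA")  # hypothetical example sequence
    print(signals)
    print(call_bases(signals))
```

A flow that produces no signal simply means the offered nucleotide did not match the next position, so the sequence is deduced from which flows produced ions and how many.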

Conclusion

The Human Genome Project finished in 2003, and its sequence has served as a reference for genetic research ever since. Today, genomics research focuses on understanding the practical consequences of genetic variations and their contribution to human disease. New sequencing technologies, such as long-read sequencing, continue to be developed with the aim of providing more accurate and comprehensive information about the genome. Large-scale initiatives, such as the 100,000 Genomes Project and the All of Us Research Program, aim to sequence the genomes of large populations to develop a deeper understanding of the genetic basis of diseases. The HGP was a significant accomplishment, and research in genomics continues to advance rapidly.

References

  1. Khan FA (2011). Biotechnology fundamentals. CRC Press, Boca Raton, Florida, USA.
  2. Hood L: Acceptance remarks for Fritz J. and Delores H. Russ Prize. The Bridge. 2011, 41: 46-49.
  3. Sinsheimer RL: The Santa Cruz workshop – May 1985. Genomics. 1989, 5: 954-956. 10.1016/0888-7543(89)90142-0.
  4. Cook-Deegan RM: The Gene Wars: Science, Politics and the Human Genome. 1994, New York: WW Norton.
  5. Report on the Human Genome Initiative for the Office of Health and Environmental Research.
  6. National Academy of Sciences: Report of the Committee on Mapping and Sequencing the Human Genome. 1988, Washington DC: National Academy Press.
  7. Human Genome Sequencing Consortium: Finishing the euchromatic sequence of the human genome. Nature. 2004, 431: 931-945. 10.1038/nature03001
  8. Prabarna Ganguly, https://www.genome.gov/news/news-release/the-Human-Genome-Project-turns-the-big-3-0
  9. Collins FS, Patrinos A, Jordan E, Chakravarti A, Gesteland R, and Walters L (1998). New goals for the U.S. Human Genome Project: 1998-2003. Science, 282: 682-689.
  10. Engku Ahmad Zaki Engku Alwi et al., “Review of human genome project (HGP) from ethical perspectives,” International Journal of Advanced and Applied Sciences 4(12):125-132, October 2017.
  11. Holley R.W., Apgar J., Merrill S.H., Zubkoff P.L. Nucleotide and oligonucleotide compositions of the alanine-, valine-, and tyrosine-acceptor soluble ribonucleic acids of yeast. J. Am. Chem. Soc. 1961;83:4861–4862.
  12. C. Luo, D. Tsementzi, N. Kyrpides, T. Read, K.T. Konstantinidis, Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample, PLoS One 7 (2012) e30087
  13. J. Eid et al., Real-time DNA sequencing from single polymerase molecules, Science, 323 (2009), pp. 133-138
  14. Alice Maria Giani et al., “Long walk to genomics: History and current approaches to genome sequencing and assembly”