Single cell sequencing

1. Introduction

Cell theory added a new dimension to our understanding of biology and disease by establishing that cells are the fundamental unit of life. The subsequent discovery that DNA is the heritable information encoding the proteins that carry out cellular functions led to the development of genomics and proteomics. Methods for studying genetic variation have identified many new unicellular species and uncovered genetic causes of human diseases. However, the diversity of unicellular species in an ecosystem is far greater than can be measured accurately by examining mixed sets of organisms, and the genomes of the cells within an individual multicellular organism are not necessarily identical. Single-cell genomics offers new perspectives on genetics by bringing the study of genomes down to the cellular level. These approaches are opening new frontiers by revealing the contribution of individual cells. Single-cell genomics can be used to identify and assemble the genomes of microorganisms that cannot be cultured, and to assess the roles of genetic elements in the regulation of physiology and related diseases. It also helps to determine how genetic heterogeneity within tumours influences cancer progression and treatment outcome. However, this area of study relies on the ability to analyse DNA isolated from individual cells, which is a challenging procedure.

Populations of cells were previously assumed to be homogeneous; however, recent research has shown that heterogeneity exists even within small populations of cells. Gene expression measurements that assume a homogeneous cell population can be misleading, because they report an average and seldom capture the small but critical fluctuations that occur in individual cells. Individual cells differ from one another in size, protein expression and RNA transcript expression, and these differences are key to answering previously intractable questions in cancer research, stem cell biology, immunology and developmental biology. Most biological studies are carried out on cell populations that appear genetically identical and morphologically similar but are in fact heterogeneous, consisting of individual cells with distinct expression profiles. Measured nucleic acid and protein expression levels are therefore averages over the population of individual cells. Single-cell analysis permits the study of cell-to-cell variation within a population. It also allows detailed study of stem cell differentiation and of cancer. Single cells are generally isolated by fluorescence-activated cell sorting (FACS), micromanipulation or laser-based microdissection. Single-cell biology is currently a very active area of research. With single-cell DNA sequencing, the genomic heterogeneity of cell populations can be explored at the level of the individual cell. Genetic variations such as point mutations and copy number variations that arise during disease and normal development are profiled using the small quantities of DNA obtained from single cells.
Applications include analysis of genetic heterogeneity in unicellular and multicellular organisms, detection of chromosomal aberrations in germline cells, preimplantation genetic screening of embryos, and characterization of the genetic make-up of tumours to support the development of targeted therapeutic agents. Overall, single-cell analysis can reveal information hidden in the gene expression profiles of individual cells, avoid the pitfall of averaging over whole cell populations, identify previously hidden subpopulations and expose novel regulatory pathways.
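
The point about averaging can be illustrated with a minimal sketch. The expression values, the two subpopulations and the threshold below are hypothetical numbers chosen for illustration only; they are not data from any study.

```python
from statistics import mean

# Hypothetical expression values (arbitrary units) for one gene in ten cells:
# eight low expressers and two high expressers.
cells = [1.0] * 8 + [21.0] * 2


def bulk_average(values):
    """What a bulk assay effectively reports: the population mean."""
    return mean(values)


def split_subpopulations(values, threshold):
    """Naive single-cell view: partition cells by a fixed expression threshold."""
    low = [v for v in values if v < threshold]
    high = [v for v in values if v >= threshold]
    return low, high


bulk = bulk_average(cells)
low, high = split_subpopulations(cells, threshold=10.0)

print(f"bulk mean: {bulk}")
print(f"low subpopulation: {len(low)} cells, mean {mean(low)}")
print(f"high subpopulation: {len(high)} cells, mean {mean(high)}")
```

In this toy case the bulk mean matches neither subpopulation, which is the situation single-cell measurements are meant to resolve.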

Only a small fraction (estimated to be less than 1%) of microbial species on Earth can be cultivated in the laboratory; thus, standard microbial research methods based on pure culture isolation and observation can provide only very limited information about an environmental microbial community. The development and successful application of PCR analysis of the microbial small subunit ribosomal RNA (16S rRNA) gene has greatly expanded our knowledge of the diversity and phylogeny of microorganisms. Novel, as-yet uncultivated microorganisms have been continually discovered by the 16S rRNA gene approach, revealing an “uncultured microbial majority”, which is estimated to comprise 40–50 as-yet uncultivated candidate phyla of bacteria and a similar number of as-yet uncultivated major lineages of archaea. Recent achievements in metagenomics (genomic sequences from the entire environmental community) and single-cell genomics are now opening a window onto this “biological dark matter”. Single-cell sequencing analyzes the genomic information of individual cells with the aid of rapidly advancing sequencing methodology. This technology provides the genomic information of an individual cell within its microenvironment. Generally, single-cell sequencing comprises two parts: single-cell isolation and whole genome amplification.

2. Recent developments

Owing to its sensitivity and dynamic range, RT-qPCR has traditionally been used for expression analysis. However, it is currently limited to interrogating one type of analyte (DNA, RNA or protein) at a time, because of limited sample availability and differing assay conditions. Dr Anders Stahlberg and his team at the University of Gothenburg have recently addressed this shortcoming by introducing a protocol that allows the analysis of multiple analytes in a single cell. One of the important developments in the field is single-cell RNA sequencing. Traditional RNA sequencing processes many cells at once and thereby averages out the variation between them. However, since no two cells are identical, single-cell analysis can reveal the subtle variations that make each cell unique, and it has the potential to uncover completely novel cell types.

3. Technological challenges

Obtaining high-quality single-cell sequencing data involves the following main technical challenges:

Efficient isolation of individual cells, followed by amplification of the genome of each single cell to obtain enough material for downstream analyses; cost-effective interrogation of the genome in a way that can identify the differences needed to test the study hypotheses; and finally, interpretation of the data in light of the errors and biases introduced during the preceding steps. To maximize the quality of single-cell data and to ensure a good ratio of signal to technical noise, each of these variables needs careful consideration when designing single-cell studies.

4. Methods of Single Cell Isolation

There are three principal single-cell isolation strategies: micromanipulation, flow cytometry and microfluidics chips.

4.1 Micromanipulation

Micromanipulation is a precise but laborious method to obtain a single cell. With the aid of micromanipulator devices, micrometer-level precision of movement can be achieved manually in operations such as holding, injecting and cutting cells. Glass micropipettes, optical tweezers and laser microdissection are the major tools used to manipulate single cells under the microscope.

4.2 Cell Isolation by Glass Micropipette

In this method, a target cell is captured with a disposable glass micropipette and then transferred to chips or tubes for subsequent analysis. The glass micropipette can be fabricated by commercial puller devices to the desired diameter. A major drawback of this method is the long distance to be covered in transferring cells from a growth or storage medium into a tube or microtiter plate (MTP) for molecular analysis. The amount of time thus required for cell transfer limits the throughput of the method. In addition, once the micromanipulator is taken out of the field of view of the microscope it is no longer possible to visually control correct transfer of the single cell or bacterium into a tube or MTP. As a result, some cells fail to transfer correctly into the bottom of a tube or MTP well.

4.3 Cell Isolation by Optical Tweezers

Optical tweezers (OT) are capable of trapping and manipulating nanometer and micron-sized dielectric particles by exerting extremely small forces via a highly focused laser beam. In principle, a single selected cell is fixed with the laser beam and is separated from the mixed culture by moving the computer-controlled microscope stage and transferred into a predetermined separation chamber on the slide. Although this method has been successfully used in the isolation and culture of thermophilic bacteria and archaea, it has not been used for whole genome amplification and sequencing. In addition, this method may damage the target cells due to heating and photo damage, as power intensities rise to megawatts per square centimeter at the highly focused spot used for OT.

4.4 Cell Isolation by Laser Microdissection

Laser microdissection and pressure catapulting (LMPC) is a method for isolating specific cells of interest from microscopic regions of tissue or cell samples or organisms. Cells are spread on a polyethylene naphthalate membrane, and the target cells are located under the microscope on the basis of morphological or histological criteria. Then, the membrane surrounding the target cells is cut by laser dissection. After microdissection, a laser shot of increased energy is used to catapult the target cells and the surrounding membrane into a common microfuge tube positioned above the sample for genomic amplification. Cells for LMPC sorting must be suspended in deionized water and dried on a polyethylene naphthalate membrane to prevent the formation of salt crystals that can cause problems for laser microdissection and cell localization. This method can isolate whole specific cells or even a single chromosomal region by cutting away the unwanted parts. In addition to genome amplification, available downstream applications include DNA genotyping and loss-of-heterozygosity analysis, RNA transcript profiling, cDNA library generation, proteomics discovery and signal-pathway profiling. Although it offers many advantages in terms of speed, ease of use and versatility, LMPC shares with micromanipulation the complication that proper cell placement into tubes or MTPs is difficult to control.

4.5 Flow Cytometry

In flow cytometry, cells are suspended in a stream of fluid and passed by an electronic detection apparatus, allowing both analysis and sorting of up to thousands of particles per second based on multiple physical and chemical characteristics. A flow cytometer provides “high-throughput” (for a large number of cells) automated quantification of set parameters of the cell.

Flow cytometry suffers from several limitations. First, the cells must be in a single-cell suspension, which poses a problem for microbial cells that grow as biofilms. Second, the number of parameters per cell that can be measured simultaneously is limited by the number of detectors that can be used at the same time. In practice, this number is less than two dozen. The validation of results requires the simultaneous detection of multiple markers to increase specificity. There is also a surprising lack of standardization in assay and instrument set-up for flow cytometry, and standards are likewise lacking for how flow data are analyzed and reported. Lastly, because of the massive amount of data generated, flow cytometry data analysis can become very complicated and relies heavily on gating by a human expert.

4.6 Microfluidic Chips

Microfluidic chips provide a useful interface for the manipulation of single cells. Cell separation and sorting on a microfluidic chip can be achieved using a variety of microscale filters and fluid dynamics mechanisms, including field-flow fractionation, hydrodynamic filtration and inertial microfluidics. This method has been applied intensively to the analysis of blood cells. The major challenge in cell sorting by microfluidic chip is to design and fabricate chips for different samples. Complications may include heterogeneous populations of cells and the presence of noncellular particles, such as sediments and minerals.

5. Cell isolation

The first step in isolating individual cells is to prepare a suspension of viable single cells from the primary sample. This can be laborious for complex solid tissues, which require enzymatic or mechanical dissociation that keeps the cells viable and does not bias the analysis toward particular subpopulations. Moreover, diseased tissues may dissociate at different rates from their normal counterparts, and dissociation rates may vary among samples from the same disease. Standardised digestion protocols for commonly studied tissues, along with methods for improving the dissociation of uncommon or diseased tissues, are areas that need further development. Laser-capture microdissection provides a low-throughput technique for isolating DNA from single cells in their native spatial context; however, the quality of sequencing data obtained from microdissected single cells has been comparatively low. Microfluidic and bead-based approaches have also been adopted to specifically target individual circulating tumour cells (CTCs). Environmental microbial samples additionally require efficient bacterial lysis, the requirements for which can vary greatly among species. Numerous methods have been established for isolating single cells from suspension, including labor-intensive manual procedures such as serial dilution, microwell dilution, optical tweezers and micropipetting. In addition, several protocols have been established for isolating whole cells or nuclei by fluorescence-activated cell sorting (FACS). Nuclear isolation has the advantage of permitting single-cell sequencing from frozen tissue samples, which cannot be achieved by other methods.
For microbial samples, additional sample preparation and calibration of FACS settings may be required, depending on the environmental source. Automated micromanipulation approaches that use micromechanical valves or droplets in microfluidic devices are becoming accessible for mainstream applications. Whatever the method, it is essential to verify that exactly one cell has been physically isolated, so as to avoid drawing spurious biological conclusions from chambers that are empty or contain multiple cells. Ideally, this is done by acquiring microscopy images of each chamber to confirm that it contains a single cell. Technologies for single-cell isolation have recently been reviewed, with emphasis on accuracy, reproducibility, throughput and ease of use. Most studies using these tools have been proof-of-concept analyses of small numbers of cells. Many important biological questions that are particularly suited to single-cell genomics will require the examination of many cells at once, making it likely that technologies amenable to parallelization, such as microfluidic approaches, will be adopted in the long term. Furthermore, the development of scalable approaches to single-cell isolation is an area of intensive research that is expected to yield improved tools that advance all metrics of capture performance.

6. Whole Genome Amplification Methods

Genome sequencing requires micrograms of DNA; however, single cells contain only picograms. Therefore, various methods of whole genome amplification (WGA) have been developed. Modified polymerase chain reaction (PCR) is the classic WGA method. This method requires thermocycling, random primers, degenerate or universal primers, and Taq DNA polymerase or similar enzymes. Taq DNA polymerase lacks 3′→5′ proofreading activity and hence has a high error rate. Newer WGA methods, multiple displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC), have provided improvements over PCR.
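
The gap between picograms and micrograms can be made concrete with a rough calculation. The sketch below assumes an average mass of about 650 g/mol per base pair (a common approximation), approximate genome sizes of 6.4 Gb for a diploid human cell and 4.6 Mb for E. coli, and a hypothetical target of 1 microgram; none of these figures comes from the text above.

```python
AVOGADRO = 6.022e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650   # assumed average mass of one base pair
PG_PER_GRAM = 1e12
PG_PER_UG = 1e6


def dna_mass_pg(base_pairs):
    """Approximate mass in picograms of double-stranded DNA of a given length."""
    return base_pairs * GRAMS_PER_MOL_PER_BP / AVOGADRO * PG_PER_GRAM


def required_fold_amplification(input_pg, target_ug):
    """Fold amplification needed to go from input_pg picograms to target_ug micrograms."""
    return target_ug * PG_PER_UG / input_pg


# Approximate genome sizes (assumptions for illustration).
human_diploid_pg = dna_mass_pg(6.4e9)
ecoli_pg = dna_mass_pg(4.6e6)

human_fold = required_fold_amplification(human_diploid_pg, target_ug=1.0)
ecoli_fold = required_fold_amplification(ecoli_pg, target_ug=1.0)

print(f"human diploid cell: {human_diploid_pg:.2f} pg, fold needed: {human_fold:.3g}")
print(f"E. coli cell: {ecoli_pg:.4f} pg, fold needed: {ecoli_fold:.3g}")
```

Under these assumptions a single human cell needs on the order of a hundred-thousand-fold amplification, and a single bacterial cell several orders of magnitude more, which is why amplification bias and error rates matter so much in the sections that follow.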

7. Multiple Displacement Amplification (MDA)

Multiple displacement amplification (MDA) is a non-PCR based DNA amplification technique. MDA still uses random primers; however, this method amplifies gDNA without thermocycling and generates larger products with a lower error frequency compared with conventional PCR amplification techniques.

This method enables the rapid amplification of samples containing very small amounts of DNA, providing a sufficient amount for genomic analysis. The reaction starts by annealing random hexamer primers to the template; DNA synthesis is then carried out at a constant temperature by a high-fidelity enzyme, preferentially Φ29 DNA polymerase. This enzyme readily synthesizes DNA strands of 0.5 Mb length, and its high fidelity and 3′→5′ proofreading activity reduce the amplification error rate to 1 in 10^6–10^7 bases, compared to the reported error rate for conventional Taq polymerase of 1 in 9,000.
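
To put the quoted error rates side by side, the following sketch computes the expected number of misincorporated bases per megabase of newly synthesized DNA. The error rates are the ones stated above; the 1 Mb figure is an arbitrary unit chosen for illustration.

```python
def expected_errors(bases_synthesized, error_rate):
    """Expected number of misincorporated bases for a given per-base error rate."""
    return bases_synthesized * error_rate


TAQ_ERROR_RATE = 1 / 9_000       # from the text: 1 error in 9,000 bases
PHI29_ERROR_RATE_HIGH = 1e-6     # from the text: 1 error in 10^6 bases
PHI29_ERROR_RATE_LOW = 1e-7      # from the text: 1 error in 10^7 bases

BASES = 1_000_000                # illustrative unit: 1 Mb of synthesized DNA

taq_errors = expected_errors(BASES, TAQ_ERROR_RATE)
phi29_errors_high = expected_errors(BASES, PHI29_ERROR_RATE_HIGH)
phi29_errors_low = expected_errors(BASES, PHI29_ERROR_RATE_LOW)

print(f"Taq: about {taq_errors:.0f} errors per Mb")
print(f"phi29: about {phi29_errors_low:.1f} to {phi29_errors_high:.1f} errors per Mb")
```

On these figures the proofreading enzyme introduces roughly two to three orders of magnitude fewer errors per base synthesized.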

MDA generates sufficient yield of DNA products for sequencing from a single cell and is therefore a powerful tool. The large size of MDA-amplified DNA products also provides desirable sample quality for identifying the size of polymorphic repeat alleles. Its high fidelity also makes it reliable enough to be used in single-nucleotide polymorphism (SNP) allele detection. Due to the strand displacement that occurs during amplification, the amplified DNA has sufficient coverage of the source DNA, providing a high quality product for genomic analysis. The products of displaced strands can also be subsequently cloned into vectors to construct libraries for sequencing (Zhang et al. 2006). These advantages make MDA the most widely used method for WGA. The major drawback of MDA is amplification bias. Most studies on MDA have reported that this issue occurs due to over-amplification and allelic dropout. Another reported issue is that primer-primer interactions result in a sequenced product even in the absence of input template during MDA amplification. Therefore, there are problems regarding negative controls in the MDA reaction.

8. Multiple Annealing and Looping Based Amplification Cycles (MALBAC)

MALBAC is a PCR-based genome amplification method that introduces a step of quasilinear preamplification to reduce the bias associated with nonlinear amplification. In the preamplification phase, single-cell genomic DNA is melted at 94°C and then annealed randomly with MALBAC primers at 0°C, synthesizing semiamplicons. In the subsequent five temperature cycles, full amplicons are generated by a series of steps: quenching at 0°C, extension with DNA polymerase at 65°C, melting at 94°C and self-looping at 58°C. Self-looping of the full amplicons at the end of every cycle prevents these full amplicons from being used as a template for amplification during MALBAC, thereby reducing the amplification bias that is commonly associated with the uneven exponential amplification of DNA fragments by PCR. After the preamplification, only the full amplicons can be exponentially amplified in the following PCR using the common 27-nucleotide sequence as the primer. The PCR reaction generates microgram levels of DNA material for sequencing experiments.

MALBAC has resulted in many significant advances over MDA amplification. MDA does not utilize DNA looping and amplifies DNA in an exponential fashion, resulting in bias. Amplification bias results in low coverage of the genome. The reduced bias associated with MALBAC has provided better genome sequence coverage and lower rates of false-positive and false-negative mutation calls than other single-cell sequencing methods. However, the DNA polymerase used in the first cycle is error prone and can introduce errors that are propagated to the product DNA.
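
Why exponential amplification magnifies bias more than quasilinear amplification can be seen in a toy model. This is a deliberately simplified sketch, not a model of the MALBAC or MDA chemistry: it assumes two loci with hypothetical per-cycle copying efficiencies and compares the case where every existing copy is a template (exponential) with the case where only the original template is copied in each cycle (quasilinear).

```python
def exponential_copies(efficiency, cycles):
    """Every existing copy serves as a template: growth is (1 + efficiency)^cycles."""
    return (1.0 + efficiency) ** cycles


def quasilinear_copies(efficiency, cycles):
    """Only the original template is copied each cycle: growth is 1 + cycles * efficiency."""
    return 1.0 + cycles * efficiency


def bias_ratio(copy_fn, eff_a, eff_b, cycles):
    """Ratio of copy numbers between locus A and locus B after amplification."""
    return copy_fn(eff_a, cycles) / copy_fn(eff_b, cycles)


CYCLES = 5                # the text describes five preamplification cycles
EFF_A, EFF_B = 0.9, 0.7   # hypothetical per-cycle efficiencies of two loci

exp_bias = bias_ratio(exponential_copies, EFF_A, EFF_B, CYCLES)
lin_bias = bias_ratio(quasilinear_copies, EFF_A, EFF_B, CYCLES)

print(f"exponential bias after {CYCLES} cycles: {exp_bias:.2f}")
print(f"quasilinear bias after {CYCLES} cycles: {lin_bias:.2f}")
```

In the exponential case the ratio between the two loci compounds with every cycle, whereas in the quasilinear case it stays bounded by the ratio of the efficiencies themselves.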

9. Whole-genome amplification

A critical component of obtaining genetic data from single cells is amplifying the single copy of the genome while minimizing artefacts such as amplification bias, genome loss, chimaeras and mutations. This field has progressed extensively over the past decade. The earliest approaches for amplifying entire human genomes relied on PCR amplification from single cells, using common sequences scattered throughout the genome, sheared genomes ligated to a common sequence, or random or degenerate oligonucleotide priming. In practice, these approaches lose signal from most of the genome during amplification, owing to differences in the density of the common sequences or to variability in PCR efficiency between loci, both of which are exacerbated when starting from a single copy of the genome. Genome recovery can be improved to approximately 10% by starting with two genome copies, obtained by sorting tetraploid nuclei, and using degenerate oligonucleotide-primed PCR. However, it remains unclear whether selection biases are introduced by choosing rapidly dividing cells. A recent modification of this technique has extended its application to diploid cells. It is worth noting that these approaches use thermostable polymerases, which have higher error rates than thermolabile polymerases and therefore introduce additional mutations during amplification. The second group of WGA methods is based on isothermal procedures. The most commonly used is multiple displacement amplification (MDA), which uses isothermal random priming followed by extension with Φ29 polymerase, an enzyme with high processivity, a low error rate and strand displacement activity.
Isothermal approaches are known to yield better genome coverage than the purely PCR-based procedures. Nevertheless, exponential amplification over-represents the loci that are amplified first, and this effect is exacerbated at higher fold amplification. It is unclear whether the over-representation of specific loci is due to systematic or stochastic biases. In addition, Φ29 polymerase activity produces a low proportion of chimeric sequence side products, which can be reduced by endonuclease treatment that debranches the reaction and allows the amplicons to be physically separated. To overcome the low coverage of PCR-based approaches and the lack of uniformity of isothermal methods, two similar hybrid methods have been established. Both apply a limited isothermal amplification followed by PCR amplification of the amplicons produced in the isothermal step. More recently, multiple annealing and looping-based amplification cycles (MALBAC) has used an analogous protocol, with random primers carrying novel common sequences and temperature cycles that are thought to promote looping of the isothermal amplicons, further inhibiting their amplification before the PCR step and resulting in more uniform amplification. Experimentally, isothermal and hybrid methods are the most commonly used WGA approaches in current single-cell studies. MDA and MALBAC have been reported to be equally successful in amplifying genomes from single cells, although when amplification was done in low reaction volumes, a substantial quantity of contaminant DNA was also amplified. A microfluidic platform was shown to remove this contaminant DNA to a large extent. MALBAC performs better at detecting copy number variation, whereas MDA yields fewer false positives.
These findings suggest that the amplification technique should be chosen carefully for each experiment according to the type of genetic variation under study. Recent studies largely support the conclusion that, for the analysis of individual genomes, each of the approaches discussed has its advantages and disadvantages, and the choice should depend on the question of interest. Earlier reports have shown a substantial reduction in contamination when single-cell WGA is performed on a microfluidic platform. It has also been shown that microfluidic devices using nanolitre sample volumes give more uniform MDA than conventional microlitre-volume reactions. A microfluidic device has recently been developed to perform MALBAC; however, it is unclear whether MALBAC results will be improved by running the reactions in nanolitre amplification chambers. Performing single-cell MDA in microfluidic emulsions appears to clearly increase the uniformity of amplification, and several successes have been reported with this approach.

10. Interrogation of WGA products

The next step in single-cell genomic analysis is to decide how to interrogate the amplified genomes. For complex eukaryotic genomes such as the human genome, one can choose to examine specific loci of interest (<1 Mb), the entire protein-coding region (the exome), or the whole genome. The type of genomic interrogation should be chosen carefully in light of the questions addressed by the study and the biases of the WGA method used. Targeting specific regions of single-cell genomes can focus the analysis on the areas of greatest biological relevance to the system under study while reducing sequencing costs and false mutation calls. Smaller target regions are less likely to contain errors that were introduced during the first few rounds of WGA and that would otherwise lead to the incorrect identification of a genetic variant. Moreover, using the bulk sample as a reference can reduce false-positive variant calls. Targeting is carried out by target-specific PCR amplification or by target capture through hybridization; target-specific amplification gives more consistent target coverage than capture-based methods. This is important when the aim is to maximize coverage of a genome that has already undergone non-uniform amplification. Target capture more readily provides broader coverage, although running target-specific amplification in parallel on microfluidic devices can substantially increase coverage with little additional labour. Single-cell exome sequencing permits broader interrogation of the genome, which is useful for identifying variants specific to each cell.
Nevertheless, as the area of the genome interrogated increases, so does the likelihood of detecting false variants, particularly when polymerases with higher error rates are used. The whole genomes of single cells can also be examined. This carries a higher risk of false mutation detection and higher cost, in exchange for the ability to examine a larger proportion of the genome. Single-cell whole-genome sequencing (WGS) also avoids the additional loss of uniformity that occurs during exome or targeted capture, enabling the detection of both copy number variants (CNVs) and single-nucleotide variants (SNVs). Furthermore, WGS can catalogue structural and non-coding variants that may contribute to the biology under study. However, this comes at the cost of requiring approximately 30-fold more sequencing per cell than exome sequencing, which can become limiting when working with larger numbers of cells.

11. Single-cell sequencing errors

One of the key challenges in analysing single-cell genomics data is to design methods that can distinguish technical artefacts and noise introduced during single-cell isolation, WGA and genome interrogation from true biological variation. During single-cell isolation, the population of cells examined can be biased by selecting cells on the basis of viability, size or propensity to enter the cell cycle. It is therefore essential to relate the variant alleles detected in single cells to the bulk population in order to confirm that there was no selection bias. This can be done by comparing the fraction of single cells that carry a variant with the frequency of the variant allele in the original bulk sample. Several types of error are introduced during WGA, including loss of coverage, reduced coverage uniformity, allelic imbalance and genome amplification errors. Most published studies have attempted to quantify the rates of these errors. However, many studies use cell lines to establish quality control metrics and then carry out experiments on primary samples, which makes it difficult to compare protocols between studies, as it is unclear whether performance on the primary samples matches that on the cell lines used to optimize the methods. Particular care should be taken because some cell lines or cell types may not be diploid and can harbour substantial aneuploidy or even polyploidy, which strongly affects the experimental results. In addition, numerous criteria are applied at the quality control step, in which cells are divided into a subset that meets the chosen criteria and is used to draw biological conclusions, and a subset that is rejected.
These quality control criteria include visual confirmation of a singly isolated cell, qualification and/or quantification of the WGA product, and genome coverage. Two recently developed approaches that predict the breadth of genome coverage from low-pass sequencing offer an inexpensive way to assess cell lysate quality in larger genomes, such as those of eukaryotes. Nevertheless, comparing single-cell genomics studies is currently difficult, as most studies do not report the total number of cells evaluated, the quality of the data from the rejected cells, or the criteria used for the quality control classification. In practice, the assessment of clonal structures is hindered by the loss of somatic variants, which occurs in approximately 50% of locus dropout events in which one of the two alleles drops out. A recent study compared error rates and assembly performance across species. More standardized analysis and reporting approaches are needed to enable comparison of data between single-cell studies and to provide accurate performance metrics for each method.
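
The comparison between the fraction of single cells carrying a variant and the bulk variant allele frequency can be sketched as follows. This is a simplified illustration under stated assumptions, not an established protocol: it assumes diploid cells and a heterozygous variant (so the expected fraction of carrier cells is twice the bulk allele frequency), treats allelic dropout as a hypothetical per-allele loss probability `ado_rate`, and uses a normal approximation to the binomial for the z-score. All function names are hypothetical.

```python
from math import sqrt


def expected_cell_fraction(bulk_vaf, ado_rate=0.0):
    """Expected fraction of single cells in which a heterozygous variant is detected.

    Assumes diploid cells: carriers make up 2 * VAF of the population, and the
    variant allele is lost to allelic dropout with probability ado_rate.
    """
    carrier_fraction = min(1.0, 2.0 * bulk_vaf)
    return carrier_fraction * (1.0 - ado_rate)


def selection_z_score(cells_with_variant, cells_total, bulk_vaf, ado_rate=0.0):
    """Z-score of the observed carrier fraction against the bulk-derived expectation."""
    p = expected_cell_fraction(bulk_vaf, ado_rate)
    if not 0.0 < p < 1.0:
        raise ValueError("expected fraction must lie strictly between 0 and 1")
    observed = cells_with_variant / cells_total
    standard_error = sqrt(p * (1.0 - p) / cells_total)
    return (observed - p) / standard_error


# Hypothetical example: bulk VAF of 10%, 50 single cells sequenced.
print(selection_z_score(10, 50, bulk_vaf=0.10))   # matches expectation
print(selection_z_score(25, 50, bulk_vaf=0.10))   # strong over-representation
```

A large absolute z-score would suggest that the isolated cells are not a representative sample of the bulk population for that variant, although aneuploidy or copy number changes at the locus would violate the diploid assumption made here.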

12. Single-cell variant calling

Although WGA introduces several kinds of error, methods are now available to overcome the additional technical noise generated during amplification and so enable the identification of true variation. SNV calling requires that the variant allele is covered at a level that exceeds the sum of the sequencing and amplification error rates. More precisely, both the mutations introduced during amplification and the allelic imbalance that arises during genome amplification must be taken into account when calling variants. Two basic approaches help to exclude false-positive variants that are artefacts of amplification. First, the bulk sample can be used as a reference to reduce the false discovery rate. Second, when only single-cell data are used, two to three cells can be required to carry the same variant at the same position, because it is unlikely that the errors introduced during single-cell WGA would coincide at the same position in several cells by chance, given the enormous size of the human genome. However, the exact number of cells needed to call a mutation has not been rigorously established and depends on the size of the genomic region interrogated. To overcome allelic imbalance, variant calling algorithms designed to account for this technical noise are required.
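
The two filtering approaches described above can be illustrated with a short Python sketch. It is a simplified illustration rather than an established variant caller: a variant is kept if it is seen in at least `min_cells` cells or if it is also present in the bulk reference call set. Combining the two rules with a logical OR, the tuple representation of a variant and all names and coordinates in the example are assumptions made here for illustration.

```python
from collections import Counter


def consensus_variants(cell_calls, min_cells=2, bulk_variants=None):
    """Filter per-cell variant calls to remove likely amplification artefacts.

    cell_calls    : dict mapping cell ID -> iterable of variants, where each
                    variant is a (chrom, pos, ref, alt) tuple.
    min_cells     : minimum number of cells that must share a variant.
    bulk_variants : optional set of variants called in the bulk sample; a
                    variant found there is kept even if seen in one cell.
    """
    support = Counter()
    for calls in cell_calls.values():
        support.update(set(calls))  # count each variant once per cell
    bulk = set(bulk_variants or ())
    kept = [v for v, n in support.items() if n >= min_cells or v in bulk]
    return sorted(kept)


# Hypothetical calls from three cells.
cell_calls = {
    "cell1": [("chr1", 100, "A", "T"), ("chr2", 500, "G", "C")],
    "cell2": [("chr1", 100, "A", "T"), ("chr3", 42, "C", "A")],
    "cell3": [("chr1", 100, "A", "T"), ("chr7", 900, "T", "G")],
}
bulk_calls = {("chr7", 900, "T", "G")}

print(consensus_variants(cell_calls, min_cells=2))
print(consensus_variants(cell_calls, min_cells=2, bulk_variants=bulk_calls))
```

In this example only the variant shared by all three cells passes the cell-count rule, and the variant seen in a single cell is rescued only when the bulk call set supports it. A real pipeline would also need to model allelic dropout and read-level evidence, which this sketch does not attempt.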

13. Applications

13.1 Sorting of the microbial dark matter

Sequencing can reduce the sampling bias that arises when researchers rely on conventional culturing methods to isolate microorganisms. For example, sequencing of 16S ribosomal RNA has identified bacterial species and major groups of archaea that cannot be cultured. However, the rest of the genomes of these putative new species are difficult to assemble from sequencing data derived from samples that contain a mixture of many species. For such metagenomic samples, single-cell genomics can assemble the genomes of species that occur at low frequencies and yield genome assemblies of entirely new, uncharacterized microorganisms. Single-cell studies that focus on sequencing species detected by 16S ribosomal RNA sequencing but lacking whole-genome assemblies have shown the greatest potential to advance our knowledge of microbial ecosystems in the near future. The first single-cell genomes from the environment to be sequenced were from members of the phylum TM7. In that work, the species were isolated from the human oral cavity and amplified by MDA in a microfluidic device. In a similar study, cells were sorted by FACS and then subjected to MDA and sequencing. Practically all studies to date have used MDA. In a comparison study using raw reads from single E. coli cells, MDA outperformed MALBAC. A large part of the genome coverage for MALBAC was lost owing to contamination when the reaction was performed in tubes. 
If only mapped reads are considered, MALBAC would assemble a larger percentage of the genome, which further suggests that reducing contamination with a microfluidic-based MALBAC approach could deliver even better microbial genome assemblies. Techniques have recently been developed to systematically evaluate the quality of single-cell microbial sequencing data, including the presence of contaminating sequence. Another way to improve amplification quality and the resulting assemblies is to capture and culture individual bacteria inside droplets and then amplify the thousands of cells derived from the original bacterium. Alternatively, researchers have focused on species with polyploid genomes to obtain better assemblies, starting with bacteria that contain 200–900 genome copies per cell. Recent developments in single-cell genomics have allowed the description of entirely novel phyla and are beginning to provide biological insights that were not possible with metagenomic methods. In addition, improved knowledge of the microbiome is generating information that points towards commercial applications. For example, single-cell sequencing of ocean samples collected after the Deepwater Horizon oil spill identified new members of the order Oceanospirillales whose genomes encode enzymes that can digest crude oil. Single-cell genomics also has potential for identifying unculturable human microbial pathogens and for determining differences in pathogenicity among strains of the same species within an individual. 
Moreover, although most studies have concentrated on the 16S rRNA gene sequences of known bacterial phyla, single-cell genomics can be used to assemble the genomes of bacteria or archaea that can be seen but whose rRNA genes cannot be detected by PCR because their sequences have diverged from the commonly used amplification primers. An emerging application of single-cell genomics is the identification of new viruses that are difficult to assemble from metagenomic samples. Computational techniques are being developed to improve the approaches for separating novel viral sequences from those of their hosts. These approaches are starting to be applied to the study of host–phage interactions, an effort that will probably be augmented by single-cell transcriptome sequencing. Another study has examined protist–virus interactions using single-cell sequencing, and further studies will probably clarify the relationship between a phage, virus or bacterium and its host by resolving the cell-to-cell variation in these interactions, which is partly missed by bulk sequencing approaches. Still, several challenges remain in increasing the throughput and quality of microbial single-cell genomes. More efficient tools for isolating and lysing individual microorganisms, more uniform and less error-prone amplification methods, and more robust assembly algorithms that account for the additional uncertainty introduced by technical artefacts during single-cell WGA are required to yield high-quality genome assemblies. The challenge of providing a more consistent method for producing, analysing and evaluating single-cell microbial genomes is being addressed by the Human Microbiome Project, which is working towards sequencing the genomes of 3,000 cultured and uncultured bacteria from various human anatomical sites.

13.2 Identifying genetic mosaicism in multicellular organisms

The development of cytogenetic methods during the 1950s led to the discovery that cells within the same individual can contain different numbers of chromosomes. Patients with mosaic expression of dominant Mendelian diseases were then recognized by unusual patterns of the classical cutaneous manifestations of certain diseases, including hereditary haemorrhagic telangiectasia and neurofibromatosis type I. It was later shown that for other diseases, such as McCune–Albright syndrome, germline mutations are lethal, so the disease is seen only in mosaic form. More recently, the development of variant discovery approaches based on next-generation sequencing and microarrays has allowed the identification of numerous new diseases that result from mosaic CNVs or SNVs. However, previous studies of human mosaicism have been restricted to genetic abnormalities that occur at comparatively high frequencies because of the low sensitivity of existing tools. Yet a human cell is estimated to acquire an SNV in its coding region once every 300 cell divisions. As the human body is estimated to contain 37 trillion cells, every position in our genome acquires thousands of mutations in different cells as we develop from a zygote into an adult. In addition, studies that have examined tissues from several sites in the same individual suggest that mosaic SNV and CNV rates are higher than previously appreciated. However, the role of this low-level genetic variation in the predisposition to and pathogenesis of human disease remains largely unknown. Recent studies have begun to characterize mosaic genetic variation in human samples using single-cell sequencing. It has also been reported that a considerable proportion of single human neurons from healthy individuals contain megabase-scale CNVs, although these conclusions have been questioned. 
More recently, single-cell whole-genome sequencing was used to identify mosaic SNVs in the brain. The investigators found an enrichment of mutations at actively transcribed sites, suggesting that transcription is a primary source of mutation in those cells. Low-level mosaic genetic variants will probably be increasingly associated with human disease as experimental and analytical methods continue to reduce the technical artefacts of single-cell WGA and so improve the ability to distinguish true biological variants from experimental errors. Moreover, these approaches are expected to find direct clinical applications. Single-cell genomic methods have been used to screen embryos for in vitro fertilization and, more recently, to detect aneuploidy in polar bodies before implantation.
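
The estimate quoted in this section, one coding SNV per roughly 300 cell divisions in a body of about 37 trillion cells, can be checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation only. It counts the minimum number of divisions needed to produce the cells of the body (one division adds one cell) and ignores cell death and turnover, so it is a lower bound. The size of the coding region (about 30 Mb) is an assumption introduced here and is not taken from the text.

```python
import math

CELLS_IN_BODY = 37e12            # from the text: about 37 trillion cells
DIVISIONS_PER_CODING_SNV = 300   # from the text: one coding SNV per 300 divisions
CODING_BASES = 30e6              # assumption: roughly 30 Mb of coding sequence


def mutation_burden(cells=CELLS_IN_BODY,
                    divisions_per_snv=DIVISIONS_PER_CODING_SNV,
                    coding_bases=CODING_BASES):
    """Lower-bound estimate of coding SNV events accumulated during development."""
    divisions = cells - 1  # each division adds exactly one cell to the body
    coding_snv_events = divisions / divisions_per_snv
    return {
        "divisions": divisions,
        "coding_snv_events": coding_snv_events,
        "events_per_coding_base": coding_snv_events / coding_bases,
        # fewest rounds of doubling that can reach this many cells
        "min_generations": math.ceil(math.log2(cells)),
    }


for name, value in mutation_burden().items():
    print(f"{name}: {value:.3g}")
```

Under these assumptions the body accumulates on the order of 10^11 coding SNV events, or a few thousand events per coding position summed over all cells, which is consistent with the statement that every position acquires thousands of mutations in different cells.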

13.3 Cancer

Cancer is the most studied example of genetic mosaicism. Tumour initiation, evolution, progression and maintenance are driven by the sequential acquisition of genetic variants in single cells. The goal of the large ongoing cancer sequencing projects is to catalogue those variants to improve our understanding of tumour biology. However, as in other studies of genetic mosaicism, the limit of detection in a bulk sample comprising a large number of cells is restricted to variants that are present in approximately 20% of the cells or more. The use of variant allele frequency distributions in regional and bulk sequencing studies has shown that most cancers have substantial genetic heterogeneity. However, those approaches do not assign mutations to distinct clones, which is required to determine the clonal structure of a sample definitively and to infer the evolutionary history of the malignancy. Single-cell sequencing studies have begun to reveal genetic heterogeneity within tumours at single-cell resolution. The inconsistent descriptions of clonal structure in these studies highlight the need for a common taxonomy as the field of single-cell cancer genomics develops. Circulating tumour cells (CTCs) can be isolated and interrogated as a possible window into tumour genetics through non-invasive testing. The unique technical challenges associated with isolating and analysing the genomes of single CTCs have been addressed. These studies have also begun to show promise in identifying and characterizing CTCs as an alternative to existing diagnostic and disease monitoring approaches. Furthermore, recent studies have aimed to advance experimental and computational tools so that single-cell investigations of malignancies provide a higher-resolution view of the disease. 
A recent study of breast cancer that applied MDA to tetraploid nuclei determined the clonal structure of the sample using SNVs. The investigators also performed CNV profiling, although not on the same cells, and found that most CNVs were acquired before the SNVs. MDA was also recently used to amplify the genomes of nearly 1,500 acute lymphoblastic leukaemia cells. With this large number of cells, it was possible to improve the methods for determining clonal structure. The robust validation of clonal structures by these methods made it possible to draw new conclusions about the events that lead to the development of the disease, including the presence of co-dominant clones at diagnosis, the acquisition of clone-specific punctuated cytosine mutagenesis and the observation that KRAS mutations are acquired late in disease development.
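
The limitation of bulk allele frequencies noted in this section, namely that they do not assign mutations to distinct clones, can be illustrated with a toy Python example. The two hypothetical data sets below give identical per-mutation cell fractions, yet they correspond to different clonal structures, which become visible only when cells are grouped by their individual mutation profiles. The mutation names, cell IDs and function names are invented for illustration, and the sketch assumes error-free genotypes with no allelic dropout or false-positive calls, which real single-cell data do not satisfy.

```python
from collections import defaultdict


def mutation_cell_fractions(genotypes):
    """Fraction of cells carrying each mutation (what bulk data approximate)."""
    counts = defaultdict(int)
    for mutations in genotypes.values():
        for mutation in set(mutations):
            counts[mutation] += 1
    return {m: n / len(genotypes) for m, n in counts.items()}


def group_cells_by_genotype(genotypes):
    """Group cells that share exactly the same set of mutations into clones."""
    clones = defaultdict(list)
    for cell, mutations in genotypes.items():
        clones[frozenset(mutations)].append(cell)
    return {profile: sorted(cells) for profile, cells in clones.items()}


# Two hypothetical tumours with the same mutation frequencies.
exclusive = {"c1": {"mutA"}, "c2": {"mutA"}, "c3": {"mutB"}, "c4": {"mutB"}}
shared = {"c1": {"mutA", "mutB"}, "c2": {"mutA", "mutB"}, "c3": set(), "c4": set()}

for label, data in (("exclusive", exclusive), ("shared", shared)):
    print(label, mutation_cell_fractions(data))
    for profile, cells in group_cells_by_genotype(data).items():
        print("  clone", sorted(profile), "->", cells)
```

In the first data set the two mutations mark two separate clones, whereas in the second they occur together in a single clone alongside unmutated cells. Both give a cell fraction of 0.5 for each mutation, so frequency data alone cannot tell the two structures apart.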

13.4 Future applications in cancer research

With new experimental and computational developments, the field of single-cell genomics is expected to begin contributing significant new insights into cancer progression and evolution. At present, only CNVs or SNVs can be accurately identified from a single cell, using low-pass or targeted sequencing, respectively. Improvements in WGA methods could further reduce sequencing requirements, which would permit cheaper whole-genome interrogation of single cells for genomic variation, including SNVs and structural variants in non-coding regions. The approach used to examine cancer genomes after amplification, such as WGS, whole-exome sequencing or targeted sequencing, should be chosen carefully on the basis of the hypotheses under study, as well as cost, throughput and the data quality obtained. In addition, more consistent reporting across cancer sequencing studies is necessary to permit accurate comparisons between studies.

13.5 Other Applications

  1. Rare cells or events
  2. Circulating metastatic cells
  3. Fetal cells in maternal blood
  4. Events within a library
  5. Scarce, precious sample
  6. Archival tissue
  7. Tumor sample
  8. Biomarker discovery
  9. Single-cell precision in populations
  10. Drug and candidate screening
  11. Cell differentiation (e.g., stem cells)
  12. Stochastic responses to stimuli