Jumping libraries

Jumping libraries, or junction-fragment libraries, are collections of genomic DNA fragments generated by chromosome jumping. These libraries make it possible to analyze large areas of the genome and to overcome the distance limitations of most common cloning techniques. Each clone in a jumping library is composed of two stretches of DNA that are separated by many kilobases in the original genome. The DNA located between these two “ends” is removed by a series of biochemical manipulations carried out before the cloning step.

Many new clone libraries have been developed to cope with the large physical distances separating markers and genes in mammalian genomes. As these technologies are still at a developmental stage, many problems remain to be overcome. A number of jumping libraries have been constructed from human and mouse genomes, and they have found many applications in cloning and mapping in mammalian genetics.

Invention and early improvements

Origin

Chromosome jumping was first described in 1984 by Collins and Weissman. At that time, cloning techniques were limited to about 240 kb per cloning step. Such clones could be mapped by cytogenetic techniques to a small region of a particular chromosome, but only at a resolution of around 5-10 Mb. This left a major gap in resolution between the available technologies, and no methods were available for mapping larger areas of the genome.

Molecular and genetic analytic techniques have greatly aided the understanding of many organisms, ranging from bacteria to Drosophila melanogaster. Applying these techniques to mammals, however, has been very difficult because of the large size of the mammalian genome and the abundance of repetitive sequences. In Drosophila, both genetic and cytogenetic (hybridization to polytene chromosomes) analysis can resolve distances comparable to the length of DNA recovered in single λ or cosmid clones, but in mammals these techniques can only resolve or provide markers at distances of millions of base pairs, at least 20 times the capacity of the largest-capacity cloning vectors.

To bridge this gap between the resolution of mammalian genetics or cytogenetics and the range covered by molecular cloning and DNA analysis, a number of techniques are being developed to cope with the large distances in the mammalian genome. These techniques fall into two types: serial techniques (chromosome walking and chromosome jumping), and approaches allowing the parallel analysis of chromosome sub-regions, entire chromosomes or genomes (linking clone analysis and the establishment of ordered cosmid libraries).

Chromosome walking

At present, the capacity of the available cloning vectors limits the distances that can be analyzed by serial (walking) techniques. Chromosome walking, that is, the repeated use of fragments from the end of a clone to identify adjacent overlapping clones, proceeds only in steps of less than (and often much less than) the average length of an insert. Although possibilities for the cloning of larger DNA fragments in other vector systems have been discussed, essentially all cloning of long stretches of DNA is carried out in either cosmid or λ replacement vectors, limiting the insert sizes to less than 50 or less than 24 kb, respectively. If cosmid clones are used, a hypothetical chromosome walk of a million base pairs (roughly 1 cM in the human genome) will therefore require, on a purely arithmetic basis, at least 20 and probably more than 40 steps, making this approach essentially unusable for these or larger distances. In addition, there are reasons for assuming that some mammalian sequences might not be clonable in E. coli, which would terminate any chromosome walk across such a region.
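The step count quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the function name and the `useful_fraction` parameter (the share of each insert that extends the walk after overlap is subtracted) are illustrative assumptions; only the 50 kb cosmid limit and the one-million-base-pair distance come from the text.

```python
import math

def walking_steps(distance_kb, insert_kb, useful_fraction=1.0):
    """Number of walking steps needed to cover distance_kb.

    useful_fraction is a hypothetical parameter: the share of each insert
    that actually extends the walk once the overlap with the previous
    clone is subtracted (1.0 = no overlap, the purely arithmetic best case).
    """
    step_kb = insert_kb * useful_fraction
    return math.ceil(distance_kb / step_kb)

# Best case: every cosmid insert (50 kb) advances the walk by its full length.
best_case = walking_steps(1000, 50)

# Assumed, more realistic case: half of each insert overlaps the previous clone.
half_insert = walking_steps(1000, 50, useful_fraction=0.5)

print(best_case, half_insert)
```

Under these assumptions the best case gives 20 steps and the half-insert case gives 40, in line with the "at least 20 and probably more than 40" estimate.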

Chromosome jumping

Chromosome jumping, in contrast, clones only the ends of large DNA fragments; the rest of the DNA is deleted by a series of biochemical manipulations carried out before the cloning step. The cloning vector still limits the length of the jumping clone insert, which is the sum of the two end fragments, but the distance between the two fragments in the genome can be hundreds of kilobases. In this approach, only the end sequences of the original fragment, which constitute the final jumping clone, have to be clonable or nonrepetitive.

This makes the method less sensitive to sequences that are either unclonable or too repetitive to be crossed by chromosome walking.

Long DNA, prepared by lysis of cells embedded in blocks of low melting-point agarose, is digested very lightly within the blocks with, for example, BamHI. Partial digestion products are then separated by pulsed field gradient gel electrophoresis, and DNA is isolated from slices containing specific size cuts ranging up to hundreds of kilobases. In order to permit the subsequent selective deletion of the middle of these large DNA fragments without concomitant random reassortment of end fragments, the DNAs are then circularized at very low DNA concentration, usually in the presence of a plasmid sequence carrying a suppressor gene or a suppressor gene fragment. In a fraction of the molecules, the suppressor gene will be included during circularization and can be used later as a ‘tag’ to isolate or recognize the fragment containing the junction between the two ends.

The large circular molecules are then cleaved into small fragments by complete digestion with a second enzyme (e.g. EcoRI) which does not cleave within the chosen suppressor plasmid. The resulting fragments are ligated into a λ vector carrying an amber mutation. Clones of the ligation product of the two ends of the large DNA fragments, identifiable by the suppressor-containing sequence, are selected by plating on bacterial hosts not carrying a suppressor gene.

Clones identified by hybridization in a library of jumping fragments should, therefore, be derived from both ends of the original large DNA fragment after deletion of all sequences between the outermost sites for the enzyme used for the second cleavage.
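The deletion step can be illustrated with a toy coordinate model. The sketch below assumes a large fragment defined by two first-enzyme cut positions and a sorted list of second-enzyme sites; the function name, the coordinates and the site positions are all hypothetical. It returns the two genomic segments that remain joined through the tag after circularization and complete second digestion.

```python
from bisect import bisect_left, bisect_right

def jumping_clone_ends(frag_start, frag_end, second_sites):
    """Return the two genomic segments retained in a jumping clone.

    frag_start, frag_end: coordinates (bp) of the ends of the large fragment
        produced by the first, partial digest.
    second_sites: sorted coordinates of sites for the second enzyme, which
        is assumed not to cut within the tag.
    After circularization the two fragment ends are joined through the tag;
    complete digestion with the second enzyme trims each end back to the
    outermost second-enzyme site inside the fragment.
    """
    lo = bisect_right(second_sites, frag_start)  # first site strictly inside
    hi = bisect_left(second_sites, frag_end)     # one past the last site inside
    if hi - lo < 2:
        raise ValueError("need at least two second-enzyme sites inside the fragment")
    left_end = (frag_start, second_sites[lo])
    right_end = (second_sites[hi - 1], frag_end)
    return left_end, right_end

# Hypothetical 300 kb fragment with made-up second-enzyme site positions.
sites = [50_000, 104_000, 180_000, 260_000, 397_500, 450_000]
left, right = jumping_clone_ends(100_000, 400_000, sites)
deleted_bp = right[0] - left[1]
print(left, right, deleted_bp)
```

In this example the clone keeps 4 kb from one end and 2.5 kb from the other, while 293.5 kb between the outermost second-enzyme sites is deleted.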

A number of factors have to be taken into account in optimizing the conditions for each step of the procedure. In particular, conditions favoring intramolecular ligation over intermolecular ligation have to be chosen for the circularization step. The ratio between circle formation and intermolecular ligation events is governed by two parameters: the effective local concentration (j) of one end of a molecule experienced by the other end of the same molecule, and the concentration (i) of the ends of all other long DNA molecules, determined by their molar concentration. The parameter j can be derived from the Jacobson-Stockmayer equation:

$$j = \frac{9.55 \times 10^{-8}}{(\mathrm{kb})^{3/2}}\ \mathrm{M}$$

or, expressed as the equivalent concentration of linear DNA,

$$j = \frac{63.4}{(\mathrm{kb})^{1/2}}\ \mu\mathrm{g/ml}$$

where kb is the length of the molecule in kilobase pairs. The fraction of intramolecular ligation events is then given by the ratio j/(i+j). These predictions have been verified experimentally in a model system. With increasing dilution, the ratio will asymptotically approach one, while the fraction of intermolecular events will approach but never reach zero. Circularization therefore has to be carried out at low DNA concentration. So that the sequence used as a tag can also be incorporated with a reasonable probability during the circularization step, the molar concentration of the tag has to be chosen to be roughly equivalent to j.
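The two forms of j and the ratio j/(i+j) can be explored numerically. The sketch below makes several assumptions: all molecules have the same length, so that i can be given as a mass concentration of DNA of that length; the mass form of j is in µg/ml, which is what converting the molar form at roughly 660 g/mol per base pair gives; and the function names and the example values of 100 kb and 0.1 µg/ml are illustrative only.

```python
def j_molar(length_kb):
    """Effective local end concentration j in mol/l for a molecule of length_kb."""
    return 9.55e-8 / length_kb ** 1.5

def j_ug_per_ml(length_kb):
    """The same j, expressed as the equivalent mass concentration of linear DNA."""
    return 63.4 / length_kb ** 0.5

def molar_to_ug_per_ml(molar, length_kb, g_per_mol_per_bp=660.0):
    """Convert a molar concentration of DNA molecules to ug/ml.

    g_per_mol_per_bp is an assumed average mass of one base pair.
    mol/l * g/mol gives g/l, and 1 g/l equals 1000 ug/ml.
    """
    return molar * length_kb * 1000 * g_per_mol_per_bp * 1000

def intramolecular_fraction(length_kb, dna_ug_per_ml):
    """Fraction j/(i+j) of ligation events that are circularizations.

    Assumes every molecule has length length_kb, so i can be given as the
    mass concentration of the DNA in ug/ml.
    """
    j = j_ug_per_ml(length_kb)
    return j / (dna_ug_per_ml + j)

j_100 = j_ug_per_ml(100)                                # about 6.34 ug/ml
j_100_converted = molar_to_ug_per_ml(j_molar(100), 100) # about 6.3 ug/ml
fraction = intramolecular_fraction(100, 0.1)            # about 0.98
print(j_100, j_100_converted, fraction)
```

The two expressions for j agree to within about one percent under the assumed base-pair mass, and even at 0.1 µg/ml roughly 1.5% of the ligation events in this example remain intermolecular.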

Since ligation in dilute solution will reduce, but not eliminate, intermolecular ligation, prospective jumping clones usually have to be checked, for example by hybridization to appropriate panels of somatic cell hybrids or to pulsed field gradient gels, to ensure that both ends of the jumping clone come from the same large DNA fragment. The fraction of wrong clones can be considerably higher than expected statistically if, especially in attempts to jump very long distances, a significant fraction of the large DNA is broken and therefore unable to circularize, while still contributing to the background of intermolecular ligation events.

Chromosome jumping is inherently directional. This is because an EcoRI-BamHI fragment used as a probe will only identify clones derived from partial BamHI fragments extending in the direction of the EcoRI site. In a complementary library, constructed from fragments generated by partial digestion with EcoRI and re-cleaved with BamHI, only clones extending from the EcoRI site in the direction of the BamHI site should be identified. A clone identified in the complementary library by the end of the previous jump will therefore inherently extend in the same direction. Alternating between complementary libraries should give clones extending further and further in one or the other direction.

As in chromosome walking experiments, libraries of more than three genome equivalents (e.g. 3 million clones for a BamHI partial library) have to be used to give a greater than 95% probability of finding a clone for the next step. In practice, even larger libraries will be preferable, since a significant fraction of the identified end points will contain repetitive sequences. Clones with highly repetitive endpoints could be detected by hybridization with radioactively labeled total DNA, and thus at least these clones could be excluded from further analysis.
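The link between three genome equivalents and a 95% probability can be checked with a simple sampling model. The sketch assumes that clones are drawn randomly and independently, so that the number of copies of any given clone follows a Poisson distribution; this model and the function names are assumptions, not something stated in the text.

```python
import math

def prob_clone_present(genome_equivalents):
    """Probability that a given clone occurs at least once in the library.

    Poisson approximation: with a mean of genome_equivalents copies per
    clone, the chance of zero copies is exp(-genome_equivalents).
    """
    return 1.0 - math.exp(-genome_equivalents)

def equivalents_needed(probability):
    """Genome equivalents required to reach the given probability."""
    return -math.log(1.0 - probability)

p_three = prob_clone_present(3)       # about 0.950
n_for_95 = equivalents_needed(0.95)   # about 3.0
print(p_three, n_for_95)
```

Under this model three genome equivalents sit right at the 95% threshold, which is why larger libraries are preferable once repetitive end points are also taken into account.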

Rare cutter jumping libraries

With the aim of reducing the otherwise very great complexity of the libraries to be constructed and screened, our laboratory has especially concentrated on the use of enzymes, such as NotI, that cut rarely in mammalian DNA. While a library constructed from DNA fragments generated by partial digestion with an enzyme such as BamHI will typically have to contain millions of clones to be representative (requiring the ligation and packaging of the equivalent of hundreds of millions of clones), we expect that a library of a few (perhaps 3-10) thousand clones will contain most (clonable) NotI jumping clones (assuming that NotI fragments are on average a million bases long). In addition, the use of rare cutting sites as start points and/or endpoints of chromosome jumps allows the distance covered by the jumping clone to be estimated by hybridizing the end fragments to blots of pulsed field gradient gels. However, such an estimate has to be checked, since a jumping clone could end in a restriction site that might be cut only very infrequently in a digest of the whole genome (for example, because of methylation of the DNA); if this were the case, a hybridizing band larger than the true jump size would be observed. One way of checking the estimate would be to analyze double-digest hybridizations.
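A rough count shows where a figure of a few thousand clones comes from. The sketch assumes a haploid genome of about 3 × 10^9 base pairs (a commonly used approximation, not a value given in the text) and one jumping clone per NotI fragment; the function name is illustrative.

```python
GENOME_BP = 3e9  # assumed approximate size of a haploid mammalian genome

def distinct_jumping_clones(genome_bp, mean_fragment_bp):
    """Approximate number of distinct clones if each fragment yields one clone."""
    return round(genome_bp / mean_fragment_bp)

# NotI fragments assumed to average one million base pairs, as in the text.
one_fold = distinct_jumping_clones(GENOME_BP, 1e6)
three_fold = 3 * one_fold  # library size at three genome equivalents
print(one_fold, three_fold)
```

With these assumptions a single genome equivalent corresponds to about 3000 clones and three equivalents to about 9000, both within the range of 3-10 thousand clones.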

Rare cutter libraries will be a very powerful tool for many applications, as long as the appropriate end fragments can be obtained. One difficulty is that problems can be expected in circularizing the DNA fragments created by enzymes such as NotI, since some of these fragments will be very long. Also, the choice between different possible endpoints that is available in a partial digest library is lost in complete digest libraries, making them more sensitive to repetitive or unclonable sequences. Some of these problems can be overcome in a jumping library constructed from DNA fragments generated by digesting long DNA partially with an enzyme cutting commonly (e.g. EcoRI) and completely with an enzyme cutting very rarely in the genome (e.g. NotI), and circularizing over a plasmid cleaved with both enzymes. Such a library will both preserve the advantage of the small number of NotI sites constituting one of the endpoints and allow many more possible start points, if screened with fragments adjacent to EcoRI sites.

Using these strategies and a number of newly constructed vectors, we have constructed several jumping libraries. Those recently completed and currently being tested are a human NotI/BamHI library, a human NotI complete-EcoRI partial/BamHI re-cleaved library, and mouse NotI/BamHI and MluI/EcoRI libraries. Randomly picked clones from these libraries in all cases give the expected restriction digest patterns. In addition, for a number of clones picked randomly from the human NotI/BamHI library (containing 100000 clones, corresponding to 10–30 genomes), both endpoints of the clone have been shown to hybridize to NotI fragments of identical size. Work on the identification and analysis of specific clones is in progress.

Linking libraries

If a jumping library is constructed with an enzyme that cuts rarely in the genome, a strategy is needed for making ordered jumps through the library. It should, in theory, be possible to do this by making a complementary library. However, it is likely that the efficiency of constructing such a library will be low, because only a small fraction of circularized partial BamHI digestion products re-cleaved with NotI will be within a clonable size range.

There are a number of possible solutions to the problem. A fragment for the next step can be provided by a λ or cosmid clone isolated by using the jumping library clone as a probe. Alternatively, special ‘linking libraries’, containing only genomic sequences surrounding, for example, the NotI sites, can be constructed and screened. Each clone isolated from such a NotI linking library will overlap two neighboring NotI restriction fragments, which can be identified by hybridization of the clone or sub-fragments to appropriate pulsed field gradient gels. Therefore, if chromosome regions short enough to comprise only a small number of distinguishable NotI restriction fragments are saturated with linking clones, the order of the clones can be established. This method allows combined physical and genetic maps to be made of regions surrounding specific mutations, and eventually maps of chromosomes and even entire genomes can be obtained.
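The ordering logic can be sketched as a chain-building problem: each linking clone names the two neighboring fragments it hybridizes to, and fragments shared between clones fix the order. The clone names and fragment labels below are hypothetical, and the sketch assumes that the fragments are distinguishable and form a single unbranched stretch.

```python
from collections import defaultdict

def order_fragments(clone_fragments):
    """Order restriction fragments from linking-clone hybridization data.

    clone_fragments maps each linking clone to the two neighboring
    fragments it hybridizes to. Returns the fragments in chromosomal
    order (in one of the two possible orientations).
    """
    neighbours = defaultdict(set)
    for frag_a, frag_b in clone_fragments.values():
        neighbours[frag_a].add(frag_b)
        neighbours[frag_b].add(frag_a)
    if any(len(n) > 2 for n in neighbours.values()):
        raise ValueError("a fragment has more than two neighbours")
    ends = sorted(f for f, n in neighbours.items() if len(n) == 1)
    if len(ends) != 2:
        raise ValueError("fragments do not form a single linear chain")
    order, previous = [ends[0]], None
    while True:
        following = [f for f in neighbours[order[-1]] if f != previous]
        if not following:
            break
        previous = order[-1]
        order.append(following[0])
    if len(order) != len(neighbours):
        raise ValueError("some fragments are not connected to the chain")
    return order

# Hypothetical data: three linking clones spanning four fragments.
clones = {
    "link1": ("F850", "F400"),
    "link2": ("F400", "F1200"),
    "link3": ("F1200", "F300"),
}
fragment_order = order_fragments(clones)
print(fragment_order)
```

Real data would also have to cope with fragments of indistinguishable size and with gaps where no linking clone was recovered, which this sketch simply rejects.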

In a first step, DNA from a hamster cell line containing parts of the human chromosomes 5 and 4, including the Huntington’s chorea locus, is partially cleaved with Sau3A and sized on sucrose gradients to isolate fragments of 10-20 kb. This DNA is then circularized at low concentration in the presence of a BamHI-cut plasmid containing a suppressor gene, re-cleaved with NotI, and ligated into NotI-cleaved NotEMBL3A, an analog of the λ replacement vector EMBL3A carrying NotI sites in the polylinker sequence. Linking clones are selected by plating on a suppressor-free host, and clones from the human chromosome fragments are identified by hybridization to human repetitive DNA probes. By use of this protocol a number of clones have been isolated, characterized by restriction analysis and mapped to either chromosome 4 or chromosome 5 by hybridization to appropriate somatic cell hybrids. Two clones located on chromosome 4 have been analyzed further by hybridization to NotI digests of human DNA separated by pulsed field gradient gel electrophoresis. As expected, each clone can be shown to hybridize to two separate NotI fragment bands.

Other techniques likely to be useful in the construction of linking libraries are also being developed. For instance, we have constructed a NotI insertion vector to enable us to use, for example, complete EcoRI digests derived from either sorted chromosomes or chromosome libraries as starting material. Also, a number of plasmids carrying polylinkers with rare-cutting sites (NotI, MluI, SacII) have been constructed (but not yet tested in this application) with the aim of allowing cosmid or λ clones carrying these sites to be selected from pre-existing libraries.