Knockin Targeting Vectors

Design of Commonly Used Knockin Targeting Vectors

The key strategy in a knockin experiment is to place the expression of an exogenous gene, or of a modified version of the endogenous gene, under the transcriptional control of cis-acting elements belonging to the endogenous gene. Knockin targeting vectors should therefore be constructed so that the knockin gene faithfully reproduces the normal transcription pattern yet encodes a mutated or otherwise “different” coding region. Using a strong, ubiquitously expressed promoter, such as the phosphoglycerate kinase (PGK) promoter, to drive the drug-selection marker ensures that the recombined cells will grow appropriately under drug selection.

In order to restore the state of the targeted gene to as close to undisturbed as possible, it is best to enable subsequent removal of the drug-selection marker by loxP flanking of the selection cassette. The cloning sites used for replacement by knockin and selection marker sequences can be either exonic or intronic, or a combination of both. If intronic cloning is used, the coding sequence of the knocked-in gene should be flanked on its 5′ end with a splice acceptor and at its 3′ end with a splice donor sequence to ensure proper stable expression of its transcript within cells. When a knocked-in coding sequence is designed to direct synthesis of a fusion with an endogenous gene, it is essential to make certain that the fusion will be in-frame and will include all necessary transcription and translation signals (including polyadenylation sequence, initiation codon, internal ribosome entry site (IRES), and so on). Furthermore, cDNA, or genomic sequences, or a combination of both can be incorporated into the targeting construct as part of the knockin. This depends largely on the size, the organization, and the complexity of the genes involved (of both the endogenous gene to be targeted and the knockin gene), as well as the objectives of the knockin project. If the knockin is designed to introduce a subtle mutation within a small, easily manipulated region, one selects an appropriate genomic fragment containing the mutation of interest so that the knockin gene is a close mimic of the wildtype gene. On the other hand, if the goal is to replace an entire gene (with a homolog from another species or with a different gene), often a cDNA sequence is used to replace the targeted coding sequence.
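The in-frame requirement for fusion knockins can be checked mechanically before a construct is built. The following is a minimal sketch, assuming both sequences are given as plain coding DNA with introns already removed. The function name and the example sequences are hypothetical and serve only as illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def fusion_is_in_frame(upstream_cds: str, insert_cds: str) -> bool:
    """Return True if insert_cds continues the reading frame of upstream_cds
    and the joined sequence has no stop codon before its final codon.

    Assumption: upstream_cds starts at the first base of a codon.
    """
    upstream = upstream_cds.upper()
    insert = insert_cds.upper()
    # The junction must fall on a codon boundary and the insert must be whole codons.
    if len(upstream) % 3 != 0 or len(insert) % 3 != 0:
        return False
    joined = upstream + insert
    codons = [joined[i:i + 3] for i in range(0, len(joined), 3)]
    # A stop codon is acceptable only as the very last codon of the fusion.
    return not any(codon in STOP_CODONS for codon in codons[:-1])

# Hypothetical toy sequences, not taken from any real gene.
print(fusion_is_in_frame("ATGGCC", "GGATCCTAA"))
print(fusion_is_in_frame("ATGGC", "GGATCCTAA"))
```

A check of this kind covers only the reading frame. Splice signals, the polyadenylation sequence, and the other elements listed above still have to be verified separately.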

Recombination-Mediated Cassette Exchange

The successful application of knockin approaches for studying gene function often invites researchers to generate several different modifications of the same gene for a more subtle analysis of its function. With conventional targeting, the laborious, time-consuming, and costly procedure of targeting a construct into embryonic stem (ES) cells, and of isolating and characterizing the ES cell clones that result from proper homologous recombination, has to be repeated for every modification (mutation) that is introduced. Recombinase-mediated cassette exchange (RMCE) can be a time- and money-saving alternative when multiple different modifications need to be knocked in. RMCE refers to a highly efficient recombinase-mediated exchange of an initial cassette introduced into a gene for secondary cassettes encoding different modifications of this gene. To apply RMCE to knockins, the first cassette is introduced into the gene of interest by homologous recombination. The resulting parental ES cell line can then be used repeatedly to introduce different modifications into the gene of interest.

Principle of RMCE

Mutant FLP recognition target (FRT) sites were shown to recombine efficiently with identical mutant FRT sites but no longer with wildtype FRT sites. This finding led to the design of an FRT/FLP recombinase RMCE method: an integrated expression cassette consisting of a HygTK (hygromycin B-positive/ganciclovir-negative) selection marker flanked by two such heterospecific FRT sites could be efficiently exchanged for a Neo expression cassette flanked by the same two heterospecific FRT sites. The exchange consists of two FLP recombination events: an initial insertion event followed by an excision event. Positive and/or negative selection is used to select for the intended exchange. Applying only negative selection even allows replacement by constructs that carry no selection marker. This method was also shown to work in ES cells with a single, randomly integrated HygTK cassette. Mutant LoxP sites that recombine with each other, but not with wildtype LoxP sites, have also been described, and similar RMCE methods exist for the LoxP/Cre system. These FRT/FLP- and LoxP/Cre-based RMCE methods were shown to work efficiently in ES cells and were recognized as potentially powerful tools for the generation of modified and knockin mice.
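The site-compatibility rule behind RMCE can be illustrated with a toy model. The sketch below assumes that two recombination sites recombine only when they are identical, as stated above. It represents each cassette as a (left site, payload, right site) tuple. The site labels "FRT_wt" and "FRT_mut" and both function names are hypothetical, and the model ignores the insertion/excision intermediate and all selection steps.

```python
from typing import Optional, Tuple

Cassette = Tuple[str, str, str]  # (left site, payload, right site)

def can_recombine(site_a: str, site_b: str) -> bool:
    """Toy rule: a site recombines only with an identical site."""
    return site_a == site_b

def rmce_exchange(genomic: Cassette, donor: Cassette) -> Optional[Cassette]:
    """Return the locus after exchange, or None if no clean exchange is possible."""
    g_left, _g_payload, g_right = genomic
    d_left, d_payload, d_right = donor
    # Identical flanking sites could recombine with each other and simply
    # excise the cassette, so the flanks must be heterospecific.
    if can_recombine(g_left, g_right):
        return None
    # Both flanks of the donor must match the corresponding genomic flanks.
    if can_recombine(g_left, d_left) and can_recombine(g_right, d_right):
        return (g_left, d_payload, g_right)
    return None

parental = ("FRT_wt", "HygTK", "FRT_mut")
print(rmce_exchange(parental, ("FRT_wt", "Neo", "FRT_mut")))
print(rmce_exchange(parental, ("FRT_wt", "Neo", "FRT_wt")))
```

Note that the sites are retained in the returned locus, which matches the observation that the recombination sites remain present after the exchange.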

Potential Application of RMCE for the Generation of Knockin Mice

The starting point for applying RMCE to the generation of knockin mice is the introduction, by homologous recombination in ES cells, of an exchangeable cassette that replaces parts of the target gene. After targeting of a HygTK cassette flanked by heterospecific FRT sites, such a parental ES cell line can be used repeatedly to generate different knockins in the same gene by exchanging the HygTK cassette for different sequences. The HygTK cassette with its heterospecific FRT sites is preferably placed in an intron at the 5′ end and in an intron or downstream of the last exon at the 3′ end. Because the FRT sites remain in the locus after the exchange, they should at least not lie within the protein-coding region. As with the commonly used knockin targeting vectors, the RMCE exchange plasmid should contain the appropriate knockin sequences and preferably an excisable selection marker gene flanked by LoxP sites. Note, however, that if only negative selection with ganciclovir against the HygTK cassette is applied, the selection marker gene can be omitted from the exchange plasmid.


The pig is an important dual-purpose animal model for agricultural and biomedical applications. With a growing global population and an increasing demand for animal protein, domesticated animals such as pigs are critical for tackling the emerging global food security crisis. Unlike domestic ruminants, the pig has a short gestation (114 days), is a litter-bearing animal carrying an average of 14 piglets per pregnancy, and in a commercial setting can give rise to three pregnancies in one year. These attributes make it valuable not only for agriculture but also for genetic engineering applications, where a relatively short pregnancy and a large litter size are preferred for generating and propagating genetically modified animals. From a biomedical standpoint, there is increasing awareness that mouse models cannot meet the complete spectrum of biomedical needs, and that an alternative animal model such as the pig is required to address the shortcomings of the mouse model.

In domestic pigs, the preferred means for generating genetically engineered animals is somatic cell nuclear transfer (SCNT), in which somatic cells, typically fetal fibroblasts, are modified to carry the intended genetic modification and used as nuclear donors for generating genetically modified offspring. The most common genetic modification is transgenesis, in which the transgene of interest is introduced into somatic cells and the cells are selected for stable integration of the transgene prior to SCNT. However, random integration of transgenes has several potential limitations, including insertional mutagenesis (the transgene inserts into an existing gene, potentially disrupting the endogenous gene’s expression or function), lack of control over transgene copy number, silencing or aberrant expression of transgenes in non-target tissues depending on the site of integration (position effect variegation), and random assortment and segregation in subsequent generations. Inserting the transgene into a specific locus by gene targeting (knockin) is therefore preferable. However, homologous recombination-mediated gene targeting is inefficient in somatic cells (1 in 10⁶–10⁷ cells). An additional limitation of the most commonly used somatic cells (fetal fibroblasts) is their limited viability in culture for screening recombinants. Site-specific nucleases or genome editors, such as ZFNs (zinc finger nucleases), TALENs (transcription activator-like effector nucleases), and the CRISPR (clustered regularly interspaced short palindromic repeats)/CRISPR-associated (Cas) nuclease system (CRISPR/Cas), engineer a double-strand break (DSB) at the target site and promote gene targeting by homologous recombination. They can improve efficiencies by nearly 1000-fold and thus could offer a solution.
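The size of this improvement is easier to appreciate as a worked estimate. The arithmetic below uses only the two figures quoted above (a baseline of 1 in 10⁶–10⁷ cells and a roughly 1000-fold gain). The symbols f_HR and f_nuclease are labels introduced here for illustration, and the result is an order-of-magnitude estimate, not a measured value.

```latex
\[
f_{\mathrm{HR}} \approx 10^{-7}\ \text{to}\ 10^{-6}
\quad\Longrightarrow\quad
f_{\mathrm{nuclease}} \approx 10^{3} \times f_{\mathrm{HR}}
\approx 10^{-4}\ \text{to}\ 10^{-3}
\]
```

In other words, the expected frequency of targeted cells moves from roughly 1 in 10⁶–10⁷ to roughly 1 in 10³–10⁴, which brings screening within reach of primary cells with a limited lifespan in culture.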

Among the available editors, the CRISPR/Cas system has emerged as the tool of choice in most laboratories because of its ease of design, assembly, and delivery and its high degree of reliability in gene modification. In pigs and other domestic animals, the CRISPR/Cas system has been employed successfully for the generation of edited animals. In these studies, a mammalian codon-optimized Type II Cas9 from Streptococcus pyogenes, together with a chimeric synthetic single-guide RNA (sgRNA) containing Cas9 binding sites and a 20 nt guide sequence specific to the target site, has been used to introduce DSBs. The DSBs generated by CRISPR/Cas (and other editors) activate endogenous DNA repair pathways: the predominant, error-prone non-homologous end joining (NHEJ) pathway or the high-fidelity homology-directed repair (HDR) pathway. For generating gene ablation models, NHEJ is the preferred pathway, whereas for introducing gene knockins and point mutations, the HDR pathway is preferred. Approaches used to achieve HDR at high frequencies include the use of a single-stranded oligonucleotide as a targeting template, small-molecule inhibitors of DNA ligase IV such as SCR7, transient incubation of cells at low temperatures, and, finally, the choice of CRISPR reagents. CRISPR reagents can be delivered as DNA expression vectors, as RNA preparations, or, most recently, as a ribonucleoprotein complex of Cas9 protein and sgRNA.
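Selecting the 20 nt guide sequence can be sketched in a few lines. The example below assumes the standard S. pyogenes Cas9 requirement of an NGG protospacer-adjacent motif (PAM) immediately 3′ of the target, a detail not stated in the text above. It scans only the forward strand, does no off-target or efficiency scoring, and uses a hypothetical function name and a made-up input sequence.

```python
import re
from typing import List, Tuple

def find_guides(seq: str, guide_len: int = 20) -> List[Tuple[str, str, int]]:
    """List candidate guides on the forward strand.

    Returns (guide, pam, guide_start) for every NGG PAM that has at least
    guide_len nucleotides upstream of it.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= guide_len:
            guide = seq[pam_start - guide_len:pam_start]
            hits.append((guide, match.group(1), pam_start - guide_len))
    return hits

# Made-up sequence: 20 nt of filler, one TGG PAM, then 4 nt of filler.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for guide, pam, start in find_guides(example):
    print(start, guide, pam)
```

A real design would also search the reverse complement and screen each candidate against the genome for off-target matches.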

Materials and Methods

Plasmid Construction and Production of sgRNA

An expression plasmid for Cas9 nuclease (pMJ920) was used. Two complementary sgRNA oligo DNAs (22 nucleotides in length) were commercially synthesized, annealed to form double-stranded DNA, and cloned into a BsaI-digested in-house vector to yield a U6 promoter-driven sgRNA expression cassette. The cloned fragments were sequenced to confirm their fidelity. The sequence-confirmed sgRNA expression vector was transcribed in vitro using the MEGAshortscript T7 kit to generate chimeric sgRNA. The sgRNA was purified with the MEGAclear kit for complexing with Cas9 protein prior to nucleofection.

Nucleofection and Knockin Experiments

Mixtures of Cas9 plasmid + sgRNA DNA + oligo, or of Cas9 protein + sgRNA RNA + oligo, were prepared to a final volume of less than 5 µL. The Cas9 protein mixture was incubated for 10 min at room temperature to allow ribonucleoprotein complex formation, as per the manufacturer’s recommendation. Approximately 1 × 10⁶ cultured fetal fibroblast cells (culture conditions as described below for sorted cells) were harvested, washed once in PBS, and resuspended in nucleofection buffer. About 5 µL of the Cas9 plasmid mixture or Cas9 protein mixture was combined with the cell suspension in a Lonza 4D strip nucleocuvette. Reaction mixtures were electroporated using the DO113 setting and immediately plated into one well of a 6-well plate at high density to facilitate recovery. Electroporated cells were incubated (with 10 µM/mL SCR7 or without SCR7) at 30 °C for 72 h or at 38.5 °C for 18 h and dissociated with 0.05% trypsin for sorting.

Single Cell Sorting and Culture For Colony Screening

Nucleofected cells were sorted for GFP expression on a flow cytometer at a density of 1 cell/well into 96-well plates, which had been gelatinized by adding 100 µL of 0.1% gelatin solution for 1–3 h to facilitate attachment. Each well received 50 µL of 40% FCS high-glucose DMEM that had been conditioned (CM) by incubating 20 mL with 2.5 × 10⁶ irradiated CF1 mouse embryonic feeder cells per T75 flask. The CM was supplemented with 5 ng/mL bFGF and filtered before being added to the 96-well plate. All wells were fed with 50 µL of 10% FCS CM + bFGF immediately after the sort, and with 50 µL of 20% FCS CM after 18 h. The sorted plates were incubated in 5% CO2 + 5% O2 at 38.5 °C for 7–10 days. Colonies that were 80%–100% confluent were split with 0.05% trypsin, with one well split into 2 wells of a 48-well plate (1:4; passage 1, P1), and further incubated at 38.5 °C for 3–5 days. At that point, 1 well was collected for DNA and the 2nd was either split into 1 well of a 12-well plate (1:4) for further propagation (P2) or frozen in 92% FCS and 8% DMSO.

Somatic Cell Nuclear Transfer (SCNT)

Cumulus-oocyte complexes (COCs) were purchased from a commercial supplier. Briefly, matured oocytes were enucleated by aspirating the polar body and MII chromosomes with an enucleation pipette. After enucleation, a donor cell was introduced into the perivitelline space of the enucleated oocyte. Fusion of injected oocytes was induced by a DC pulse (2.0 kV/cm for 30 µs using a BTX-Cell Manipulator 2001). After fusion, the reconstructed oocytes were activated by an electric pulse (1.0 kV/cm for 60 µs), followed by 4 h of incubation in PZM3 medium containing 2 mM 6-dimethylaminopurine. Approximately 120–130 reconstructed oocytes were surgically transferred into the oviducts of naturally cycling gilts on the first day of standing estrus. Following transfer, pregnancies were confirmed on Day 30 by ultrasound. Fetuses were harvested on Day 45 of pregnancy from a euthanized sow. Tissues from four of the 11 fetuses were trypsin digested and incubated in T175 plates (P0) for subsequent culture and cryostorage.

Genotyping of Fibroblast Cell Colonies, and Edited Nuclear Transfer Fetuses

Single cell-derived colonies cultured for 10–15 days were washed three times with PBS-PVA (pH 7.4) medium. About 2–3 µL of colony suspension was transferred into 18 µL of colony lysis buffer (50 mM KCl, 1.5 mM MgCl2, 10 mM Tris pH 8.0, 0.5% NP-40, 0.5% Tween-20, and 100 µg/mL proteinase K) and incubated for 1 h at 65 °C. The digestion was terminated by heating the mixture at 95 °C for 10 min, and 2 µL of supernatant was used as a PCR template. Tissue biopsies from fetuses were digested in a tissue lysis buffer (50 mM Tris pH 8.0, 0.1 M NaCl, 20 mM EDTA, 1% SDS, 50 µg/mL RNase A, 100 µg/mL proteinase K) overnight at 65 °C. Following the overnight digest, genomic DNA was extracted from the tissue lysate using phenol-chloroform, ethanol precipitated, and resuspended in 100 µL of 10 mM Tris-HCl, pH 7.4. Purified genomic DNA was amplified by PCR (primers in Table), and the products were cloned into pCR2.1 vectors and transformed into E. coli DH5α-maximum competent cells. Five to ten colonies were picked and cultured, and plasmid DNA was extracted and sequenced. Sequences were aligned using BioEdit software.

Targeted phiC31 Integrase Mediated Integration of GFP Transgene into Pseudo attP Sites in COL1A Locus

Porcine fetal fibroblasts containing pseudo attP sites downstream of the COL1A site were nucleofected with a previously published GFP transgene (pDB2; 0.7 µg) containing a consensus attB site for phiC31 integrase, together with a CMV promoter-driven integrase gene (1.3 µg). A control nucleofection was performed with the GFP transgene (pDB2) without the CMV-integrase plasmid. Following nucleofection, the cells were selected for stable integration with 500 ng/mL of G418 for 7–10 days, followed by flow cytometry. A PCR screen was performed with primers specific to the transgene and to the flanking COL1A sequence to show integration of the plasmid at the target site.

Primers table

Target    Primer     Sequence
Gene Knockin Primers
COL1A     Forward    AGCCAGGCTGCCTTGTTTG
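Basic properties of a primer such as the COL1A forward primer can be checked with a short script. The sketch below computes length, GC content, and a rough melting temperature using the Wallace rule (2 °C per A/T, 4 °C per G/C). That rule is an approximation for short oligonucleotides and is an assumption of this example, not the method used in the study. The function name is hypothetical.

```python
def primer_stats(primer: str) -> dict:
    """Return length, GC percentage, and a Wallace-rule Tm estimate (in °C)."""
    p = primer.upper()
    gc = p.count("G") + p.count("C")
    at = p.count("A") + p.count("T")
    return {
        "length": len(p),
        "gc_percent": round(100 * gc / len(p), 1),
        # Wallace rule: Tm = 2*(A+T) + 4*(G+C); rough estimate only.
        "tm_wallace_c": 2 * at + 4 * gc,
    }

# COL1A forward primer from the table above.
print(primer_stats("AGCCAGGCTGCCTTGTTTG"))
# → {'length': 19, 'gc_percent': 57.9, 'tm_wallace_c': 60}
```

Nearest-neighbor models give more accurate melting temperatures and would be the better choice when matching primer pairs.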