Introduction
Combinatorial protein libraries help to explain the intricate relationship between a protein's amino acid sequence and its shape, activity and stability. The properties of a protein change when particular residues, and hence their side chains, are mutated. Chemical synthesis and directed mutagenesis have been used to probe enzymatic activity and to measure bonding strengths in proteins. Libraries of such mutant proteins allow the functions of side chains and proteins to be studied rapidly. Both random and site-specific mutagenesis can be used to create combinatorial libraries of mutant proteins. In saturation mutagenesis, a specific amino acid is replaced by every other naturally occurring amino acid. Substituting only selected amino acids can also provide a picture of amino acid function, and using fewer amino acids simplifies the data analysis.
The residues of a protein determine the non-covalent binding between receptor and ligand molecules. The residues in contact at the binding interface form the structural epitope. Biophysical techniques such as X-ray crystallography and NMR reveal these binding contacts. However, the structural picture revealed by biophysical techniques is not sufficient on its own to explain how a protein works. Most of the binding energy is contributed by a subset of residues through hydrogen bonds, salt bridges, dipole-dipole interactions and hydrophobic interactions. These energetically favorable contacts define the functional epitope, and a cluster of functional-epitope residues forms a hotspot of binding energy. Protein engineering methods are used to quantify the contribution of individual residues and to identify functional epitopes. Functional epitopes reveal how a protein works: identifying the residues whose side chains take part in direct contacts helps to explain protein function. Residues are also responsible for the folding of the protein, which determines its stability. Combinatorial alanine scanning is well suited to identifying these residues.
Types
Alanine mutagenesis
Systematic alanine substitution, known as alanine-scanning mutagenesis, is useful for identifying functional epitopes. Substituting alanine removes all side-chain atoms past the beta-carbon, so any function that the side chain performs in the native protein is lost. Alanine is preferred because it lacks unusual backbone dihedral angle preferences. Glycine would also nullify the side chain, but it introduces flexibility into the protein backbone. Alanine-scanning mutagenesis has been applied to human growth hormone (hGH) binding to hGH-binding protein (hGHbp), CD4 binding to HIV gp120, the enzymatic activity of kinases, and lysozyme stability.
Although alanine mutagenesis allows detailed mapping of functional epitopes, it is laborious. Each alanine-substituted protein must be expressed separately and, if necessary, refolded. The effect of each mutation is then assessed in an in vitro assay; in vivo assays are available only for a subset of proteins of interest. Combinatorial alanine-scanning libraries can be used to overcome this limitation.
Two methodological conditions must be satisfied to apply a combinatorial library.
- Either alanine or the wild-type amino acid must be introduced at each specified position of the protein.
- A diversity of 10¹¹ alanine-substituted proteins must be achievable.
Combinatorial, site-specific mutagenesis
This method is an alternative to conventional alanine scanning because it gives access to multiple alanine substitutions at once. In a single round of site-specific oligonucleotide-directed mutagenesis, a binomial substitution of either alanine or the wild-type amino acid can be made. This substitution is readily accessible through oligonucleotide synthesis for seven amino acids, for which altering a single nucleotide of the codon yields an alanine codon. For example, one codon for serine is TCC; if T is replaced by G, the resulting codon GCC encodes alanine in the translated protein library. The translated library will contain alanine and serine in the same ratio as the G and T added to the reaction mixture. Libraries with alanine substitutions at multiple positions can be encoded by degenerate oligonucleotides carrying mutations at multiple positions.
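The single-nucleotide relationship described above can be checked against the standard genetic code. The following is a minimal sketch for illustration, not part of the published method; the helper names `one_step_from_alanine` and `binomial_codons` are hypothetical, and the codon table is the standard genetic code written out in TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def one_step_from_alanine():
    """Return amino acids whose codons differ from an alanine codon by one base."""
    neighbours = set()
    for codon, aa in CODON_TABLE.items():
        if aa != "A":
            continue
        for pos in range(3):
            for base in BASES:
                mutant = codon[:pos] + base + codon[pos + 1:]
                neighbour = CODON_TABLE[mutant]
                if neighbour not in ("A", "*"):
                    neighbours.add(neighbour)
    return neighbours


def binomial_codons(wild_type_codon, position, alternative_base):
    """Return the two codons (and their amino acids) from mixing two bases at one position."""
    mutant = (
        wild_type_codon[:position] + alternative_base + wild_type_codon[position + 1:]
    )
    return {
        wild_type_codon: CODON_TABLE[wild_type_codon],
        mutant: CODON_TABLE[mutant],
    }


print(sorted(one_step_from_alanine()))
print(binomial_codons("TCC", 0, "G"))
```

Under the standard genetic code, the first call returns the seven amino acids for which a wild-type/alanine mixture needs only one degenerate base: Asp, Glu, Gly, Pro, Ser, Thr and Val. The second call reproduces the serine example, with TCC and GCC as the two members of the mixture.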
This combinatorial mutagenesis has been used in the following applications. The function and stability of the DNA-binding protein λ repressor have been studied by multiple alanine substitutions, which in this case served as an index of the robustness of the protein; almost 25% of the mutants retained activity. The technique can also address whether the energetic contributions of individual amino acids are additive. For the most part, individual amino acid side chains contribute to the binding energy of the receptor-ligand interaction in an additive fashion. The approach has also been applied to identify the amino acid side chains that are essential to the activity of tRNA synthetase.
When binomial mutagenesis is extended beyond the seven amino acids for which a single mutation yields alanine, split-pool synthesis is used. One pool encodes alanine while the other encodes the wild type. Split-pool synthesis has been used to investigate the interface between the heavy and light variable domains of an antibody. The conservation of wild-type side chains bordering the antigen-binding site is described as a secondary-sphere effect. Secondary-sphere effects illustrate the usefulness of combinatorial scanning for identifying residues that contribute directly to protein function. The resilience of protein folding to multiple alanine substitutions has been demonstrated in the above example, T4 lysozyme, Arc repressor, and many other systems. Multiple alanine substitutions of this kind led to another technique, shotgun scanning: combinatorial alanine scanning applied to more than 20 positions. The protein libraries produced by this method are diverse.
Shotgun scanning
The libraries of alanine-substituted proteins are displayed on the surface of filamentous phage particles for in vitro selection. Successive rounds of binding selection enrich for residues that contribute binding energy to the receptor-ligand interaction. Conventional automated DNA synthesis is used to make the oligonucleotides encoding shotgun-scanning libraries.
Phage display simplifies the construction and screening of protein libraries in many ways.
(i) Large libraries of proteins (>10¹⁰ unique clones) are readily accessible, with each unique protein variant fused to the surface of a different phage particle.
(ii) After selection for displayed proteins that bind to a receptor, the phage can be amplified in an Escherichia coli host.
(iii) The phage particles encapsulate DNA encoding the displayed protein; thus, standard DNA sequencing can be used to identify selected proteins.
In shotgun-scanning libraries, each mutated position is substituted with the wild-type amino acid, alanine, or up to two other amino acids. Shotgun scanning differs from previous combinatorial alanine-mutagenesis techniques in its oligonucleotide synthesis and in allowing four possible side chains at some positions. The analysis of the selection rests on a simplifying assumption: the energetic contribution of a specific side chain to receptor-ligand binding is assessed entirely from the distribution of alanine or wild type at each substituted position. This simplification makes it possible to use standard oligonucleotide synthesis for combinatorial alanine mutagenesis. It also leaves open the possibility of a secondary analysis of the non-wild-type, non-alanine substitutions at each position.
Strong selection for non-alanine, non-wild-type amino acids at a position would indicate an unexpected interaction, such as a mutation that improves affinity for the ligand. To validate the focus on the distribution of wild type or alanine at each position, the approach was tested on hGH. The 19 residues of hGH comprising the high-affinity binding site for hGHbp were shotgun-scanned in a single library. After multiple rounds of selection and amplification in an E. coli host, individual hGHbp-binding phage clones were identified by a high-throughput binding assay and subjected to DNA sequencing. The sequence analysis focused entirely on the distribution of alanine or wild type at each scanned position. This revealed specific positions that were highly conserved as the wild-type amino acid, whereas other positions showed a roughly even distribution of alanine and wild type.
Application 1: Rapid mapping of protein functional epitopes by combinatorial alanine scanning
This application uses a combinatorial alanine-scanning technique to study the function of 19 side chains buried at the interface between human growth hormone and the extracellular domain of its receptor. The substitutions are made using a split-pool technique, and each of the 19 side chains is allowed to vary only as the wild type or alanine. These mutations are used to construct a phage-display protein library. Selection based on binding is applied to isolate functional clones. Clones isolated from the library pool are then subjected to DNA sequencing, which is used to determine the alanine/wild-type ratio at each varied position. This ratio is used to calculate the effect of each alanine mutation relative to wild type, expressed as a change in free energy. Of the 19 side chains, only seven contribute significantly to the binding interaction. These conserved residues form a compact cluster in the tertiary structure of human growth hormone. The methodology for this application is described below.
Materials and Methods
Step 1: Shotgun-Scanning Library Construction
(i) Phagemid pW1205a was used as the template for library construction. pW1205a is identical to a previously described phagemid designed to display hGH on the surface of M13 bacteriophage as a fusion to the amino terminus of the major coat protein, except for the following changes.
(a) First, the hGH–P8 fusion moiety has a peptide epitope flag (amino acid sequence: MADPNRFRGKDLGG) fused to its amino terminus, allowing detection with an anti-flag antibody.
(b) Second, codons encoding residues 41, 42, 43, 61, 62, 63, 171, 172, and 173 of hGH have been replaced by TAA stop codons.
(ii) Briefly, pW1205a was used as the template for the Kunkel mutagenesis method with three mutagenic oligonucleotides designed to repair simultaneously the stop codons and introduce mutations at the desired sites.
(iii) The mutagenic oligonucleotides had the following sequences: oligo 1 (mutant hGH codons 41, 42, 45, and 48), 5′-ATC CCC AAG GAA CAG RMA KMT TCA TTC SYT CAG AAC SCA CAG ACC TCC CTC TGT TTC-3′; oligo 2 (mutant hGH codons 61, 62, 63, 64, 67, and 68), 5′-TCA GAA TCG ATT CCG ACA SCA KCC RMC SST GAG GAA RCT SMA CAG AAA TCC AAC CTA GAG-3′; oligo 3 (mutant hGH codons 164, 167, 168, 171, 172, 175, 176, 178, and 179), 5′-AAC TAC GGG CTG CTC KMY TGC TTC SST RMA GAC ATG GMT RMA GTC GAG RCT KYT CTG SST RYT GTG CAG TGC CGC TCT-3′. (Note that standard single-letter codes for amino acids are used, and DNA degeneracies are represented by IUB code: K = G/T, M = A/C, N = A/C/G/T, R = A/G, S = G/C, W = A/T, Y = C/T.)
(iv) The library contained 1.2 × 10¹¹ unique members, and DNA sequencing of the naive library revealed that 45% of these had incorporated all three mutagenic oligonucleotides. Thus, the library had a diversity of approximately 5.4 × 10¹⁰.
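The degenerate positions in the oligonucleotides above can be expanded mechanically using the IUB code given in the note. The sketch below is an illustration rather than part of the original protocol: it lists the degenerate codons in each oligonucleotide, counts how many distinct DNA sequences each one encodes, and reproduces the library-diversity arithmetic. The function names are hypothetical, and the oligonucleotide strings are copied from the text as written.

```python
from itertools import product
from math import prod

# IUB degeneracy codes used in the text (unambiguous bases map to themselves).
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "M": "AC", "N": "ACGT", "R": "AG",
    "S": "GC", "W": "AT", "Y": "CT",
}

OLIGOS = {
    "oligo 1": "ATC CCC AAG GAA CAG RMA KMT TCA TTC SYT CAG AAC SCA "
               "CAG ACC TCC CTC TGT TTC",
    "oligo 2": "TCA GAA TCG ATT CCG ACA SCA KCC RMC SST GAG GAA RCT "
               "SMA CAG AAA TCC AAC CTA GAG",
    "oligo 3": "AAC TAC GGG CTG CTC KMY TGC TTC SST RMA GAC ATG GMT "
               "RMA GTC GAG RCT KYT CTG SST RYT GTG CAG TGC CGC TCT",
}


def expand(codon):
    """List every concrete codon encoded by a (possibly degenerate) codon."""
    return ["".join(bases) for bases in product(*(IUB[b] for b in codon))]


def degenerate_codons(oligo):
    """Return the codons of an oligonucleotide that contain a degenerate base."""
    return [c for c in oligo.split() if len(expand(c)) > 1]


def dna_diversity(oligo):
    """Number of distinct DNA sequences encoded by one oligonucleotide."""
    return prod(len(expand(c)) for c in oligo.split())


for name, seq in OLIGOS.items():
    print(name, degenerate_codons(seq), dna_diversity(seq))

# Library-size arithmetic from the text:
# 45% of 1.2e11 members incorporated all three oligonucleotides.
total_members = 1.2e11
fraction_with_all_oligos = 0.45
print(f"{total_members * fraction_with_all_oligos:.1e}")
```

Codons that expand to two sequences correspond to binomial (wild-type or alanine) positions, while those that expand to four allow up to two additional amino acids, as described for shotgun scanning.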
Step 2: Shotgun Library Sorting and Binding Assays
(i) Phage from the library described above were cycled through rounds of binding selection with hGHbp or anti-hGH monoclonal antibody coated on 96-well Nunc Maxisorp immunoplates as the capture target.
(ii) E. coli XL1-blue was used for the propagation of phage. M13-VCS helper phage (Stratagene) was added during propagation.
(iii) After one (antibody sort) or three (hGHbp sort) rounds of selection, individual clones were grown in a 96-well format in 500 µl of 2YT broth supplemented with carbenicillin (10 µg/ml) and M13-VCS (10¹⁰ phage per ml).
(iv) Phage ELISAs were used to detect phage-displayed hGH variants in the culture supernatants that bound to either hGHbp or anti-hGH antibody immobilized on a 96-well Maxisorp immunoplate.
Step 3: DNA Sequencing and Analysis
(i) Culture supernatant containing phage particles was used as the template for a PCR that amplified the hGH gene and incorporated M13(-21) and M13R universal sequencing primers.
(ii) The amplified DNA fragment was used as the template in Big-Dye terminator sequencing reactions, which were analyzed on an automated DNA sequencer.
(iii) All reactions were performed in a 96-well format.
(iv) The program SGCOUNT aligned each DNA sequence against the wild-type (wt) DNA sequence by using a Needleman–Wunsch pairwise alignment algorithm, translated each aligned sequence of acceptable quality, and then tabulated the occurrence of each natural amino acid at each position.
(v) Additionally, SGCOUNT reported the presence of any sequences containing identical amino acids at all mutated positions (siblings). The antibody sort (175 total sequences) did not contain any siblings, but the hGHbp sort (330 total sequences) contained 16 siblings representing 5 unique sequences. SGCOUNT was written in C and compiled and tested.
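The tabulation and sibling detection described for SGCOUNT can be illustrated with a short sketch. This is not the SGCOUNT program, which was written in C; it assumes the reads have already been aligned and translated, and that each clone is represented as a string of the amino acids found at the scanned positions. The function names and the example clones are hypothetical.

```python
from collections import Counter


def tabulate(clones):
    """Count the occurrence of each amino acid at each scanned position."""
    n_positions = len(clones[0])
    if any(len(clone) != n_positions for clone in clones):
        raise ValueError("all clones must cover the same scanned positions")
    return [Counter(clone[i] for clone in clones) for i in range(n_positions)]


def find_siblings(clones):
    """Return sequences seen more than once, with their copy numbers."""
    counts = Counter(clones)
    return {seq: n for seq, n in counts.items() if n > 1}


# Made-up clones over three scanned positions (illustrative data only).
clones = ["KYL", "AYL", "KYA", "AYL", "KAL"]
table = tabulate(clones)
print(table[0]["K"], table[0]["A"])  # wt (K) and alanine counts at position 1
print(find_siblings(clones))
```

The per-position counters provide the wt and alanine counts used in the data analysis, and the sibling report flags clones that may be over-represented in a selection.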
Step 4: Shotgun-Scanning Data Analysis
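For each selection, the ratio of wt to alanine at each position was calculated as follows, where $n_{\mathrm{wt}}$ and $n_{\mathrm{Ala}}$ are the numbers of selected sequences with wt or alanine at that position:
$$\mathrm{wt/Ala} = n_{\mathrm{wt}} / n_{\mathrm{Ala}}$$
We assumed that $\mathrm{wt/Ala} = K_{a,\mathrm{wt}} / K_{a,\mathrm{Ala}}$, where $K_{a,\mathrm{wt}}$ and $K_{a,\mathrm{Ala}}$ are the association equilibrium constants for hGHbp binding to wt or alanine-substituted hGH, respectively. With this assumption, we calculated a $\Delta\Delta G$ value for the hGHbp selection ($\Delta\Delta G_{\mathrm{bp}}$) and the antibody selection ($\Delta\Delta G_{\mathrm{a}}$) by substituting wt/Ala for $K_{a,\mathrm{wt}} / K_{a,\mathrm{Ala}}$ in the standard equation:
$$\Delta\Delta G = RT \ln(K_{a,\mathrm{wt}} / K_{a,\mathrm{Ala}}) = RT \ln(\mathrm{wt/Ala})$$
In this way, we obtained a measure of each alanine mutant's effect on each selection as a change in free energy relative to that of wt. Finally, we defined the contribution to binding free energy attributable to each side chain ($\Delta\Delta G_{\mathrm{mut-wt}}$) as follows:
$$\Delta\Delta G_{\mathrm{mut-wt}} = \Delta\Delta G_{\mathrm{bp}} - \Delta\Delta G_{\mathrm{a}}$$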
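As a worked illustration of the equations above, the following sketch converts wt and alanine counts into free-energy values. The temperature (298 K) and the use of the gas constant in kcal/(mol·K) are assumptions, since the text does not state them; the function names are hypothetical and the counts in the example are invented.

```python
import math

R_KCAL = 1.987e-3    # gas constant in kcal/(mol*K)
TEMPERATURE = 298.0  # assumed temperature in K; not given in the text


def ddg(n_wt, n_ala, temperature=TEMPERATURE):
    """Free-energy change RT ln(wt/Ala) for one position in one selection."""
    if n_wt <= 0 or n_ala <= 0:
        raise ValueError("both counts must be positive to take the log ratio")
    return R_KCAL * temperature * math.log(n_wt / n_ala)


def ddg_mut_wt(bp_counts, antibody_counts, temperature=TEMPERATURE):
    """Side-chain contribution: hGHbp-selection value minus antibody-selection value.

    Each argument is a (n_wt, n_ala) pair for the same position.
    """
    return ddg(*bp_counts, temperature) - ddg(*antibody_counts, temperature)


# Invented counts for a single position: strongly conserved as wt in the
# hGHbp selection, roughly even in the antibody selection.
print(round(ddg_mut_wt((300, 30), (90, 85)), 2))
```

Positions where alanine (or wt) is never observed give an undefined log ratio; the sketch raises an error in that case, and how such positions are treated in practice is not specified in the text.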