2.11: Rules of Protein Structure - Biology


The function of a protein is determined by its shape. A number of agents can disrupt this shape, thus denaturing the protein:

  • changes in pH (alters electrostatic interactions between charged amino acids)
  • changes in salt concentration (does the same)
  • changes in temperature (higher temperatures reduce the strength of hydrogen bonds)
  • presence of reducing agents (break S-S bonds between cysteines)

Often when a protein has been gently denatured and then is returned to normal physiological conditions of temperature, pH, salt concentration, etc., it spontaneously regains its function (e.g. enzymatic activity or ability to bind its antigen). This tells us that the protein has spontaneously resumed its native three-dimensional shape. Moreover, this ability is intrinsic; no outside agent was needed to get it to refold properly.

However, some proteins need help to fold. Certain enzymes add sugars to particular amino acids, and these sugars may be essential for proper folding. Other proteins, called molecular chaperones, enable a newly-synthesized protein to acquire its final shape faster and more reliably than it otherwise would.


Although the three-dimensional (tertiary) structure of a protein is determined by its primary structure, it may need assistance in achieving its final shape.

  • As a polypeptide is being synthesized, it emerges (N-terminal first) from the ribosome and the folding process begins.
  • However, the emerging polypeptide finds itself surrounded by the watery cytosol and many other proteins.
  • As hydrophobic amino acids appear, they must find other hydrophobic amino acids to associate with. Ideally, these should be their own, but there is the danger that they could associate with nearby proteins instead — leading to aggregation and a failure to form the proper tertiary structure.

Despite the importance of chaperones, the rule still holds: the final shape of a protein is determined by only one thing: the precise sequence of amino acids in the protein. And the sequence of amino acids in every protein is dictated by the sequence of nucleotides in the gene encoding that protein. So the function of each of the thousands of proteins in an organism is specified by one or more genes.

The NKCC and NCC Genes

E. Predicted but not Demonstrated Topologies of SLC12A1, 2 and 3 Proteins

A protein topology predicted in silico is halfway from the peptide sequence to the real three-dimensional structure of the protein (von Heijne, 2006). Hence, computer algorithms developed to predict protein topology or structure from the physicochemical properties of amino acid sequences, as well as by comparison with known protein structures (e.g. threading and homology modeling), are invaluable tools for inferring topology and/or function–structure relationships.

Most of the SLC12A proteins appear to share similar predicted structures, with several transmembrane domains and long intracellular N- or C-termini. This assumption is based on the estimated hydrophilicity/hydrophobicity profiles of deduced SLC12A protein sequences according to the Kyte-Doolittle algorithm (Kyte and Doolittle, 1982). A key feature of this algorithm is the so-called “window size”, i.e. the number of amino acids examined at a time to determine a point of hydrophobic character (Kyte and Doolittle, 1982). Hence, it is critical to choose a window size that corresponds to the expected size of the structural motif under investigation: a window size of 19–21 (about the size of a membrane-spanning α-helix) makes hydrophobic, membrane-spanning domains stand out on the Kyte-Doolittle scale (typically >1.6). However, window sizes ranging from 11 to 15 amino acids were used to generate the hydropathy plots predicting 12 transmembrane (TM) domains in mammalian members of the SLC12A family (Caron et al., 2000; Delpire et al., 1994; Gamba et al., 1994; Gillen et al., 1996; Hiki et al., 1999; Moore-Hoon and Turner, 1998; Payne and Forbush, 1994; Payne et al., 1996; Yerby et al., 1997). Although alternative topological models for members of the SLC12A family have been proposed (Park and Saier, 1996) and several transport protein families include members that probably have more or fewer than 12 TM domains (Espanol and Saier, 1995; Paulsen and Skurray, 1993), it is generally accepted that the members of the SLC12A family are proteins with 12 TM domains.
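The effect of the window size can be illustrated with a minimal sketch of a sliding-window hydropathy profile. The hydropathy values are those of the Kyte-Doolittle scale; the function names, the toy sequence in the usage note and the decision to report only full windows are assumptions made for illustration, not the procedure used by the cited authors.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    if not 1 <= window <= len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD_SCALE[aa] for aa in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(sequence, window=19, threshold=1.6):
    """Return 0-based start indices of windows whose mean exceeds the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, value in enumerate(profile) if value > threshold]
```

For a toy sequence such as `"D" * 10 + "L" * 19 + "K" * 10`, calling `hydrophobic_windows(toy, window=19)` flags the windows that overlap the leucine stretch, whereas a shorter window would respond to shorter hydrophobic runs as well, which is why the window choice changes the number of predicted segments.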

It is now clear that the most important factor in determining membrane insertion is the hydrophobicity of 19–21 amino acid sequences (Zhao and London, 2006). This concept is better represented by using the experimentally determined transfer free energies (ΔG) for each amino acid (i.e. a thermodynamic scale of hydrophobicity) originally proposed by Wimley and White (Wimley and White, 1996). Hence, the Wimley-White hydrophobicity plot (also known as the octanol plot) identifies the position of transmembrane α-helices in protein sequences with less ambiguity than the Kyte-Doolittle plot. As shown in Fig. 11.3, the octanol plots obtained for SLC12A1 (NKCC2), SLC12A2 (NKCC1) and SLC12A3 (NCC) differ from the ones originally proposed for these gene products using the Kyte-Doolittle algorithm with a window size of 11–15 (Delpire et al., 1994; Gamba et al., 1994; Payne and Forbush, 1994; Yerby et al., 1997). However, the octanol plot correlates very well with the Kyte-Doolittle plot if the latter is constructed using a window size of 19–21 amino acids (Fig. 11.3).

Figure 11.3. Kyte-Doolittle and Wimley-White plots of NKCC2 and NKCC1 protein sequences. A. Predicted NKCC protein topology. Putative transmembrane domains (TM) are indicated as gray boxes across the lipid bilayer. The positions of the NKCC2 amino acids predicted to lie in the TM domains are numbered underneath each potential TM domain. The continuous gray line represents the amino acid chain of the NKCC2 proteins. Colored dots located at the cytoplasmic N-terminal and C-terminal portions of NKCC2s represent the location of residues predicted to be phosphorylated (blue: Ser, green: Thr and black: Tyr) and potential N-glycosylation sites (red dots). The potential site for tyrosine sulfation at the N-terminus of NKCC2 is indicated with an arrowhead. Phosphorylation and sulfation sites on NKCC2 proteins were predicted using NetPhos and Sulfinator, respectively. B. Hydropathy plots of hNKCC2A (ABU69043) (top), rNKCC2A (ABU63482) (center) and hNKCC1a (AAC50561) (bottom). These analyses were performed using a window size of 19 residues. Window sizes of 19 or 21 make hydrophobic, membrane-spanning domains stand out clearly (typically, values >1.6 on the Kyte-Doolittle scale). Under these conditions, hNKCC2 proteins are predicted to have 10 TM regions: 174–198, 208–228, 259–279, 298–318, 323–349, 380–402, 413–441, 489–512, 551–579 and 604–627. Each TM is ∼20 residues in length and highly conserved among species. All predicted TMs in NKCC2s have energetic preferences for being in the lipid environment, as characterized by a total free energy (ΔG) above zero in the Wimley-White interface hydropathy plot. The mean charges of the amino acids are calculated by giving the residues D (Asp) and E (Glu) a charge of −1, K (Lys) and R (Arg) a charge of +1, and the residue H (His) a charge of +0.5.
The represented data were obtained using jEMBOSS for Linux, TMap, TMPred, ProtScale (at the ExPASy molecular biology server) and the PROTEUS Structure Prediction Server v2.0.
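The charge assignment described in the caption can be written out as a short sketch. The per-residue charges are the ones stated above; the function names and the use of a sliding window (with a default of 19 residues, borrowed from the hydropathy analysis) are assumptions, since the caption does not say over which span the mean was taken.

```python
# Charges assigned in the figure caption; all other residues count as 0.
RESIDUE_CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0, "H": 0.5}


def mean_charge(segment):
    """Mean charge per residue of a sequence segment."""
    if not segment:
        raise ValueError("segment must not be empty")
    total = sum(RESIDUE_CHARGE.get(aa, 0.0) for aa in segment.upper())
    return total / len(segment)


def mean_charge_profile(sequence, window=19):
    """Mean charge of every full window (window size is an assumed parameter)."""
    return [
        mean_charge(sequence[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]
```

For example, the segment `"DEKRH"` carries charges of −1, −1, +1, +1 and +0.5, so its mean charge is 0.5 / 5 = 0.1.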

Prediction algorithms based solely on hydrophobicity plots (Kyte and Doolittle, 1982) or thermodynamic scales of hydrophobicity (Wimley and White, 1996) are somewhat incomplete and inaccurate. The fact that ∼5% of the transmembrane α-helices in the known structures are very short (<15 residues) and only partially span the membrane, together with the lack of critical thermodynamic data, has made transmembrane prediction algorithms somewhat unsatisfactory. It was not until recently that the free energy contributions from individual amino acids in different positions along the membrane were reported (Hessa et al., 2007). Hence, the accuracy of algorithms predicting TM helices has recently been improved by the development of new tools such as MemBrain (Shen and Chou, 2008), TopPred ΔG (Hessa et al., 2007), SCAMPI (Bernsel et al., 2008), ZPRED (Granseth et al., 2006) and PRO/PRODIV-TMHMM (Viklund and Elofsson, 2004). Most of these algorithms are part of the TOPCONS protein topology prediction server. Using MemBrain or SCAMPI, the human SLC12A1, SLC12A2 and SLC12A3 proteins (i.e. NKCC2, NKCC1 and NCC) are predicted to have 13 TM domains, whereas PRODIV, PRO or OCTOPUS predicts 12 TM domains (Fig. 11.4). It should be mentioned that the model with 13 TM domains places the N- and C-termini in different compartments (inside and outside, respectively), which is not supported by current experimental evidence.

Figure 11.4. Consensus prediction of membrane protein topology. The topological information of hNKCC2A, hNKCC1a and hNCCa proteins (GenBank ABU69043, AAC50561 and AAC50355, respectively) was generated by using five different algorithms: SCAMPI, OCTOPUS, ZPRED, PRO/PRODIV-TMHMM and MemBrain, an algorithm used for predicting the ends of TM domains that are shorter than 15 residues. A. Predicted topology of NKCC and NCC proteins according to the algorithms used (MemBrain (red), SCAMPI (blue), PRO/PRODIV and TOPCONS (green)). Predicted TM domains are indicated as gray boxes across the lipid bilayer. The positions of the NKCC/NCC amino acids predicted to lie in each TM are numbered underneath each transmembrane domain and vary according to the algorithm used. The continuous gray line represents the amino acid chain of the NKCC/NCC proteins, whereas dotted lines represent the potential topologies according to the algorithm used. The cytoplasmic N-terminal and C-terminal portions of NKCCs/NCC are indicated. B. Predicted total free energy (ΔG) values of each residue in hNKCC2A (top), hNKCC1a (center) and hNCCa (bottom) protein sequences.

Molecular evolutionary and structural analysis of the cytosolic DNA sensor cGAS and STING

Cyclic GMP-AMP (cGAMP) synthase (cGAS) was recently identified as a cytosolic DNA sensor and generates a non-canonical cGAMP that contains G(2',5')pA and A(3',5')pG phosphodiester linkages. cGAMP activates STING, which triggers innate immune responses in mammals. However, the evolutionary functions and origins of cGAS and STING remain largely elusive. Here, we carried out comprehensive evolutionary analyses of the cGAS-STING pathway. Phylogenetic analysis of the cGAS and STING families showed that their origins could be traced back to a choanoflagellate, Monosiga brevicollis. Modern cGAS and STING may have acquired structural features, including the zinc-ribbon domain and critical amino acid residues for DNA binding in cGAS as well as the carboxy-terminal tail domain for transducing signals in STING, only recently in vertebrates. In invertebrates, cGAS homologs may not act as DNA sensors. Both proteins cooperate extensively, have similar evolutionary characteristics, and thus may have co-evolved during metazoan evolution. cGAS homologs and a prokaryotic dinucleotide cyclase for canonical cGAMP share conserved secondary structures and catalytic residues. Therefore, non-mammalian cGAS may function as a nucleotidyltransferase and could produce cGAMP and other cyclic dinucleotides. Taken together, assembling signaling components of the cGAS-STING pathway onto the eukaryotic evolutionary map illuminates the functions and origins of this innate immune pathway.

© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.


Distribution of cGAS and STING homologs across a choanoflagellate and 61 metazoan species.…

ML phylogenetic trees showing the relationships of cGAS (A) or STING (B) homologs.…

Evolution of functional domains in cGAS proteins. (A) A diagram of the domain…

Multiple sequence alignment of DncV and the representative sequences of cGAS homologs, OAS1…

Evolution of functional domains in STING proteins. (A) A diagram of the domain…

Structural modeling of STING from Japanese medaka Oryzias latipes binding with 2′3′-cGAMP. (A)…

cGAS-STING signaling to trigger type I IFN and phylogenetic profiles of its molecular…

Structural insight into the role of novel SARS-CoV-2 E protein: A potential target for vaccine development and other therapeutic strategies

The outbreak of COVID-19 across the world has posed unprecedented global challenges on multiple fronts. Most vaccine and drug development has focused on the spike proteins, the viral RNA polymerases and the main protease needed for viral replication. Using a bioinformatics and structural modelling approach, we modelled the structure of the envelope (E) protein of the novel SARS-CoV-2. The E protein of this virus shares sequence similarity with that of SARS-CoV-1 and is highly conserved in the N-terminal regions. Incidentally, compared to spike proteins, E proteins demonstrate lower disparity and mutability among the isolated sequences. Using homology modelling, we found that the most favorable structure could function as a gated ion channel conducting H+ ions. Combining pocket estimation and docking with water, we determined that GLU 8 and ASN 15 in the N-terminal region were close enough to form H-bonds, which was further validated by insertion of the E protein in an ERGIC-mimic membrane. Additionally, two distinct "core" structures were visible, the hydrophobic core and the central core, which may regulate the opening and closing of the channel. We propose this as a mechanism of viral ion channeling activity, which plays a critical role in viral infection and pathogenesis. In addition, it provides a structural basis and additional avenues for vaccine development and for generating therapeutic interventions against the virus.

Conflict of interest statement

The authors have declared that no competing interests exist.


Fig 1. Sequence alignment of SARS CoV E proteins, disparity index and mutability.

Fig 2. Pentameric homology model of the E protein of SARS-CoV-2.

Fig 3. Pore volume estimation of E-put protein by GALAXY-WEB and docking with water…

Fig 4. Pore volume estimation of E-put protein by SWISS MODEL and docking with…

Fig 5. E-put protein interacts with the lipid molecule components of ERGIC membrane.

Fig 6. Proposed mechanism of proton channeling activity in E-put protein.

Fig 7. Membrane insertion of E-put protein and structural morphing from open to closed state.


A Dataset of Protein Complexes

We retrieved all Biological Units from the PDB (October 2005), which are the protein complexes in their physiological state, according to the PDB curators. This information is obtained from a combination of statements from the authors of the structures, literature curation, and the automatic predictions made by the Protein Quaternary Structure (PQS) server [17,18]. The PDB Biological Unit is explained in more detail in Protocol S1. Inferring the Biological Unit from a crystallographic structure is a difficult, error-prone process [17,19,20]. In Ponstingl et al. (2003), an automatic prediction method was estimated to have a 16% error rate. We discuss later how our classification of protein complexes can facilitate this process and how we used it to pinpoint possible errors in Biological Units.

We filtered Biological Units according to the following criteria. We only considered the structures present in SCOP 1.69 [4], because our methodology requires SCOP superfamily domain assignments. We removed virus capsids and any complex containing more than 62 protein chains, because PDB files cannot handle more than 62 distinct chain references (a–z, A–Z, 0–9), and also because of the high computational cost. We discarded structures that were split into two or more complexes when removing nonbiological interfaces, as defined in the next section. When two or more copies of a complex are present in the asymmetric unit, the PDB curators create many copies of the same Biological Unit. In these cases, we retain only one copy.
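The limit of 62 chains follows directly from the character set available for chain references: 26 lowercase letters, 26 uppercase letters and 10 digits. A small sketch makes the arithmetic explicit; the constant and function names are hypothetical.

```python
import string

# One-character chain references: a-z, A-Z and 0-9.
CHAIN_IDS = string.ascii_lowercase + string.ascii_uppercase + string.digits


def can_label_chains(n_chains):
    """True if every chain can receive its own one-character reference."""
    return n_chains <= len(CHAIN_IDS)
```

Here `len(CHAIN_IDS)` is 26 + 26 + 10 = 62, so a complex with 63 chains would fail this check.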

After applying these filters, we obtained 21,037 structures, which we use throughout this study.

Extracting Fundamental Structural Features from Protein Complexes

A prerequisite for creating a hierarchical classification of protein complexes is a fast way of comparing complexes with each other. The full atomic representation is not practical, because automatic structural superposition is difficult, if not impossible, for divergent pairs of structures [21]. Instead, we need to summarize the fundamental structural features of protein complexes into a representation easier to manipulate.

Which subset of features shall we choose? A natural way to break down a complex is into its constituent chains, each of which is a gene product. The pattern of interactions between the chains determines the QS and hence function of the complex. Unlike large-scale proteomic experiments, where complexes consist of a list of constituent subunits, PDB structures provide us with the QS: the exact stoichiometry of the subunits and the pattern of interfaces between them. The QS often plays a role in regulating protein function, and its disruption can be associated with diseases [22,23]. For example, in the case of the superoxide dismutase, the disruption of the QS destabilizes the protein and is linked with a neuropathology [23].

To extract the pattern of interfaces from the structures, we calculate the contacts between pairs of atomic groups. We define a protein–protein interface by a threshold of at least ten residues in contact, where the number of residues is the sum of the residues contributed to the interface by both chains. A residue–residue contact is counted if any pair of atomic groups is closer than the sum of their van der Waals radii plus 0.5 Å [24]. We investigated the effect of changing the threshold of ten residues at the interface and found that it had only a minor effect on the classification. Please refer to Table S1 for details.
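The interface definition above can be sketched as follows. The distance rule (sum of van der Waals radii plus 0.5 Å) and the threshold of ten residues come from the text; the data layout, the function names and the use of single atoms in place of "atomic groups" are simplifying assumptions, and the radii would have to be supplied by the caller.

```python
import math


def residues_in_contact(chain_a, chain_b, margin=0.5):
    """Return the residues of each chain that touch the other chain.

    Each chain maps a residue id to a list of (x, y, z, radius) tuples.
    Two residues are in contact if any pair of their atoms is closer than
    the sum of the two radii plus `margin` (in angstroms).
    """
    hits_a, hits_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            touching = any(
                math.dist(pa[:3], pb[:3]) < pa[3] + pb[3] + margin
                for pa in atoms_a
                for pb in atoms_b
            )
            if touching:
                hits_a.add(res_a)
                hits_b.add(res_b)
    return hits_a, hits_b


def is_interface(chain_a, chain_b, min_residues=10, margin=0.5):
    """Apply the threshold to the residues contributed by both chains."""
    hits_a, hits_b = residues_in_contact(chain_a, chain_b, margin)
    return len(hits_a) + len(hits_b) >= min_residues
```

This all-against-all loop is quadratic in the number of atoms; a spatial grid or neighbour list would be the usual choice for real structures, but the rule being applied is the same.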

As one of our goals is to compare the evolutionary conservation of protein chains both within and across complexes, we must include information that allows us to relate the chains to each other. To do this, we use structural information, as defined by the SCOP superfamily domains, as well as sequence information. The N- to C-terminal order of SCOP superfamily domains enables us to detect distant relationships, while the sequence similarity allows comparisons at a finer level, e.g., filtering of identical chains.

We chose the chain domain architecture, the sequence, and the chain–chain contacts to represent protein complexes because these are universal attributes of complexes. In contrast, other attributes such as the presence of a catalytic site, or the transient or obligate nature of an interface, are neither universal nor always available from the structure. However, these attributes can be easily projected onto our classification scheme to see how they relate among protein complexes sharing evolutionarily related chains.

To this core representation we add symmetry information, which refines the description of the subunits' arrangement beyond the interaction pattern. We process the symmetry of each complex using an exhaustive search approach. Briefly, we centre the coordinates of the complex on its centre of mass. We then generate 600 evenly spaced axes passing through the centre of mass. We check whether the complex, rotated at different angles around each of the axes, superposes onto the unrotated complex. From this, we deduce the symmetry type. For a more detailed description, please refer to the Methods section and to Figure S1.
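The exhaustive search can be sketched in a few lines, under several stated assumptions: the complex is reduced to a handful of equally weighted points (for instance one per chain), the evenly spaced axes are approximated with a Fibonacci spiral, and "superposes" means every rotated point lands within a tolerance of some original point. None of these choices is taken from the Methods section, and all names and default values are hypothetical.

```python
import math


def centre(points):
    """Translate points so their centroid (equal masses assumed) is the origin."""
    n = len(points)
    c = [sum(p[i] for p in points) / n for i in range(3)]
    return [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in points]


def sphere_axes(count=600):
    """Roughly evenly spaced unit vectors from a Fibonacci spiral."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    axes = []
    for k in range(count):
        z = 1.0 - 2.0 * (k + 0.5) / count
        r = math.sqrt(1.0 - z * z)
        axes.append((r * math.cos(golden_angle * k),
                     r * math.sin(golden_angle * k), z))
    return axes


def rotate(point, axis, angle):
    """Rotate a point about a unit axis through the origin (Rodrigues formula)."""
    kx, ky, kz = axis
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    dot = kx * x + ky * y + kz * z
    cross = (ky * z - kz * y, kz * x - kx * z, kx * y - ky * x)
    return (x * c + cross[0] * s + kx * dot * (1.0 - c),
            y * c + cross[1] * s + ky * dot * (1.0 - c),
            z * c + cross[2] * s + kz * dot * (1.0 - c))


def superposes(points, axis, angle, tol=0.1):
    """True if every rotated point lies within tol of some original point."""
    rotated = [rotate(p, axis, angle) for p in points]
    return all(min(math.dist(q, p) for p in points) <= tol for q in rotated)


def rotation_order(points, axis, max_order=12, tol=0.1):
    """Largest n (up to max_order) for which a 360/n degree turn superposes."""
    for n in range(max_order, 1, -1):
        if superposes(points, axis, 2.0 * math.pi / n, tol):
            return n
    return 1


def highest_order(points, count=600, max_order=12, tol=0.1):
    """Scan all sampled axes and report the highest rotation order found."""
    centred = centre(points)
    return max(rotation_order(centred, axis, max_order, tol)
               for axis in sphere_axes(count))
```

Because the sampled axes rarely coincide exactly with a true symmetry axis, the tolerance has to be chosen relative to the size of the complex and the spacing of the axes; the symmetry type would then be deduced from the set of axes and orders found.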

A graph is simple and well-suited to store and visualize this information (Figure 2A). The graph itself provides what we call the topology of the complex, i.e., the number of polypeptide chains (nodes) and their pattern of interfaces (edges). A label on the graph carries the symmetry information. A label on each edge indicates the number of residues at the interface. Two further pieces of information are associated with each node in the graph: the amino acid sequence and the SCOP domain architecture of the chain. These two attributes provide information on the sequence and structural similarity and evolutionary relationships between chains. We then compare graph representations of complexes to build the hierarchical classification.
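A minimal sketch of such a graph is given below. The attributes stored (symmetry label, per-edge residue count, per-node sequence and domain architecture) are the ones listed in the text; the class and method names, the placeholder superfamily identifier in the usage note and the use of a degree list as a topology summary are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Chain:
    """A node: one polypeptide chain."""
    sequence: str
    domain_architecture: tuple  # SCOP superfamilies in N- to C-terminal order


@dataclass
class ComplexGraph:
    """Chains as nodes, interfaces as edges, symmetry as a graph-level label."""
    symmetry: str
    chains: dict = field(default_factory=dict)      # chain id -> Chain
    interfaces: dict = field(default_factory=dict)  # frozenset({a, b}) -> residues

    def add_chain(self, chain_id, sequence, domain_architecture):
        self.chains[chain_id] = Chain(sequence, tuple(domain_architecture))

    def add_interface(self, a, b, n_residues):
        if a not in self.chains or b not in self.chains:
            raise KeyError("both chains must be added before their interface")
        self.interfaces[frozenset((a, b))] = n_residues

    def topology(self):
        """Number of chains plus the sorted number of interfaces per chain."""
        degree = {chain_id: 0 for chain_id in self.chains}
        for pair in self.interfaces:
            for chain_id in pair:
                degree[chain_id] += 1
        return len(self.chains), sorted(degree.values())
```

A homodimer would be built with two `add_chain` calls using the same sequence and architecture (for example the placeholder `("sf_1",)`) and one `add_interface` call; a monomer is simply a graph with one node and no edges. The degree list is only a coarse summary, since two different contact patterns can share it.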

Note that we also include monomeric proteins in the classification, and we represent them by a single node. Though monomeric proteins are not complexes, their inclusion allows us to compare their frequency and other properties to those of protein complexes.

Comparison of Complexes and Overview of the Classification

An advantage of the graph representation is that it allows fast and easy comparison using a graph-matching algorithm. As the graphs carry specific attributes about the structure and sequence of the chains, and about the symmetry of the complex, we had to implement a customized version of a graph-matching procedure to take this information into account. For algorithmic details please refer to the Methods section.

Importantly, our graph-matching procedure allows different attributes to be considered, as illustrated in Table 1 with “Y” and “N” tags. The table shows that the 12 levels of the hierarchical classification are created using one or more of the following five criteria to compare the complexes with each other: (i) the topology, represented by the number of nodes and their pattern of contacts, (ii) the structure of each constituent chain in the form of a SCOP domain architecture, (iii) the number of nonidentical chains per domain architecture within each complex, (iv) the amino acid sequence of each constituent chain for comparison between complexes, and (v) the symmetry of the complex.
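The idea of switching criteria on and off can be sketched by building a comparison key from whichever of the five criteria are enabled. This is not the customized graph-matching procedure described in the Methods: the degree list stands in for true graph matching, exact sequence identity stands in for sequence similarity, and the dictionary layout and names are hypothetical.

```python
from collections import Counter


def classification_key(cx, topology=True, architecture=True,
                       chain_counts=False, sequence=False, symmetry=False):
    """Build a hashable key from the enabled criteria.

    `cx` is a dict with "chains" (id -> {"arch": tuple, "seq": str}),
    "edges" (list of chain id pairs) and "symmetry" (str).
    """
    chains = cx["chains"]
    key = []
    if topology:  # (i) number of nodes and a summary of their contacts
        degree = Counter()
        for a, b in cx["edges"]:
            degree[a] += 1
            degree[b] += 1
        key.append((len(chains), tuple(sorted(degree[c] for c in chains))))
    if architecture:  # (ii) domain architecture of each chain
        key.append(tuple(sorted(ch["arch"] for ch in chains.values())))
    if chain_counts:  # (iii) nonidentical chains per domain architecture
        distinct = {}
        for ch in chains.values():
            distinct.setdefault(ch["arch"], set()).add(ch["seq"])
        key.append(tuple(sorted((arch, len(s)) for arch, s in distinct.items())))
    if sequence:  # (iv) sequences, compared here by exact identity
        key.append(tuple(sorted(ch["seq"] for ch in chains.values())))
    if symmetry:  # (v) symmetry label
        key.append(cx["symmetry"])
    return tuple(key)
```

Two complexes fall into the same group at a given level when their keys are equal, so enabling more criteria can only split groups further, which gives the classification its hierarchical character.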


To keep a newly-synthesized polypeptide from aggregating with nearby proteins before it has folded, the cells of all organisms contain molecular chaperones that stabilize newly-formed polypeptides while they fold into their proper structure. The chaperones use the energy of ATP to do this work.


Some proteins are so complex that a subset of molecular chaperones, called chaperonins, is needed.

Chaperonins are hollow cylinders into which the newly-synthesized protein fits while it folds.

Chaperonins also use ATP as the energy source to drive the folding process.

As mentioned above, high temperatures can denature proteins, and when a cell is exposed to high temperatures, several types of molecular chaperones swing into action. For this reason, these chaperones are also called heat-shock proteins (HSPs).

Not only do molecular chaperones assist in the folding of newly-synthesized proteins, but some of them can also unfold aggregated proteins and then refold the protein properly. Protein aggregation is the cause of disorders such as Alzheimer's disease, Huntington's disease, and prion diseases (e.g., "mad-cow" disease). Perhaps some day ways will be found to treat these diseases by increasing the efficiency of disaggregating chaperones.


Homology Modeling and Ligand-Based Molecule Design

4.5 Summary

Homology modeling expands the structural coverage of the druggable genome. The reliability of a homology model increases with the level of sequence homology between the target protein and its template. Other important factors to consider in template selection include the consistency of the activation state between the target protein and the template, especially for kinase and GPCR targets. A homology model built on a properly selected template sheds light on the structural characteristics of the target protein, and high-quality homology models have proven useful in molecular docking-based virtual screening (VS). Mutagenesis results are valuable resources for evaluating the hypotheses proposed by docking approaches. On the other hand, ligand-based drug discovery is suited to targets whose structures are unavailable but for which rich information has accumulated on the ligands. Pharmacophore models derived from high-quality ligands are capable of identifying biologically relevant compounds efficiently and effectively. Pharmacophore-based virtual screening (PBVS) provides an important complement to high-throughput screening (HTS). To avoid wasting time and resources on false positives, an accurate and specific pharmacophore query is highly desirable. Genetic algorithm (GA)-guided query optimization offers an attractive avenue for sharpening a pharmacophore query derived from a single structure, such that it can discriminate subtle structural variations among the positive and negative compounds.


Proteins are biological molecules that serve as cellular machines in living organisms. These large molecules have specific three-dimensional structures and are involved in biological processes such as cellular signaling, catalysis of chemical reactions, molecular transport, and many other functions. Proteins are polymers, consisting of long chains of monomers called amino acids.

Amino acids

Figure 11. General structure of an amino acid. An amino acid is composed of an amine group, a central carbon, a carboxyl group and an R-group. R-groups vary from amino acid to amino acid.

Of the more than 500 amino acids known, only 20 appear in proteins of living organisms. An amino acid is a relatively simple organic molecule (Fig 11). Attached to a central carbon by single covalent bonds are: 1) a hydrogen atom, 2) an amine group (NH3+), 3) a carboxylic acid (COO-) group and 4) an R-group, also known as a side chain.

At around pH 7, as in water, the amine group of an amino acid attracts a proton, becoming NH3+, and so acts as a base. The carboxyl group is negatively charged in water because its two highly electronegative oxygens pull electrons away from the hydrogen, which is lost as a proton. Different amino acids differ in their R-group. Among the protein-building amino acids, the R-groups vary in size, shape and polarity. Proteins, being made up of chains of amino acids, vary according to the interactions of the atoms within the amino acids with one another and with water. These interactions dictate the shape of the protein, which in turn determines its function.

R-groups vary in their polarity. Non-polar molecules have a relatively equal distribution of electrons across their covalent bonds, while polar molecules have an unequal distribution of electrons. The unequal distribution of electrons in polar molecules creates partially charged atoms (δ+, δ-). Polar R-groups are hydrophilic, meaning they have an affinity for water due to hydrogen bonds between the partial charges of the R-group and the water molecules. Non-polar R-groups are hydrophobic, meaning they are repelled by water. Therefore, in a chain of amino acids, those with polar R-groups bend towards water and those with non-polar R-groups bend away from it, affecting the eventual shape of the protein.

Peptide bonding

Figure 12. A polypeptide is composed of several amino acids connected by peptide bonds. Note: on one end of the polypeptide is an amine group, whereas a carboxyl group is on the opposite end.

Proteins are polymers of amino acids chained together by covalent bonds known as peptide bonds (Fig 12). A peptide bond forms through a condensation reaction, in which an oxygen (O-) from the carboxyl group of one amino acid is removed (leaving a carbonyl group) and combines with two hydrogen atoms (2H) from the amine group of an adjacent amino acid to produce water (H2O). A covalent bond then forms between the carbon that lost the oxygen and the nitrogen (N) that lost the hydrogen atoms, joining the two adjacent amino acids. This is a peptide bond. Amino acids linked by peptide bonds form long-chained molecules, or polypeptides.
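Because each peptide bond releases one water molecule, the mass of a polypeptide is the sum of the masses of its free amino acids minus one water per bond. The sketch below works through that arithmetic for a few residues; the molar masses are approximate average values in g/mol, and the table and function names are hypothetical.

```python
WATER = 18.015  # g/mol, approximate

# Approximate average molar masses of the free amino acids, in g/mol.
FREE_AMINO_ACID_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}


def polypeptide_mass(sequence):
    """Mass of a linear chain: free amino acids minus one water per bond."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    n_bonds = len(sequence) - 1
    total = sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence.upper())
    return total - n_bonds * WATER
```

For the dipeptide Gly-Ala, one peptide bond forms, so the mass is 75.07 + 89.09 − 18.015 ≈ 146.15 g/mol; a chain of n amino acids contains n − 1 peptide bonds.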


Function

V region of the variable domain of immunoglobulin light chains that participates in the antigen recognition (PubMed:24600447).

Immunoglobulins, also known as antibodies, are membrane-bound or secreted glycoproteins produced by B lymphocytes. In the recognition phase of humoral immunity, the membrane-bound immunoglobulins serve as receptors which, upon binding of a specific antigen, trigger the clonal expansion and differentiation of B lymphocytes into immunoglobulin-secreting plasma cells. Secreted immunoglobulins mediate the effector phase of humoral immunity, which results in the elimination of bound antigens (PubMed:20176268, PubMed:22158414).

The antigen binding site is formed by the variable domain of one heavy chain, together with that of its associated light chain. Thus, each immunoglobulin has two antigen binding sites with remarkable affinity for a particular antigen. The variable domains are assembled by a process called V-(D)-J rearrangement and can then be subjected to somatic hypermutations which, after exposure to antigen and selection, allow affinity maturation for a particular antigen (PubMed:17576170, PubMed:20176268).
