
Transmembrane Protein Problem


Problem: A transmembrane protein has 1000 aa. The 5th aa is found on the external side of the cell membrane, where it interacts with the aqueous environment outside the cell. Amino acid 90 is inside the membrane bilayer. Aa 100-600 are intracellular, and 200-400 make a tight ball with minimal exposure to the aqueous cytoplasm. Amino acid 979 is found on the extracellular side of the protein, where it forms a weak ionic bond with Cl-.

a. Can you draw the protein and mark positions of all mentioned aa in it?

b. What are the properties of these amino acids? 


I don't necessarily want/need the exact answers to these questions; rather, I would like some guidance on what principles I would need to understand and conceptualize to attack this problem. Thanks all!


I made a quick sketch on the basis of the information you gave (this is not to scale):

Amino acid 5 (aa5) of the protein is on the outside, aa90 is inside the membrane. What we don't know here is where the transmembrane part starts (directly with aa6 or later) and how it is organised (I indicated this as a transmembrane helix, but this can of course be different). Then we know that the border between the transmembrane part and the cytosolic part is somewhere between aa90 and aa100, and that aa100 to aa200 seems to be some connecting part. The stretch from aa200 to aa400 is a globular domain which shields hydrophobic amino acids from the cytosol, so this part has to contain a high percentage of hydrophobic amino acids.

The part from 400 to 600 is again cytosolic but with no further information about the structure. After aa600 starts a second transmembrane part of the protein, but here we don't know how long it is. The maximum possibility would be until aa978, since we know that aa979 is outside of the membrane.
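The constraints used in this answer can be written down as a small lookup table. This is only a rough sketch: the names `KNOWN_REGIONS`, `locate`, and `minimum_crossings` are made up for illustration, and the table contains only what the problem statement gives, so every other residue is reported as unknown.

```python
# Facts stated in the problem (residue numbers are 1-based, inclusive).
# Each entry is (first residue, last residue, location).
KNOWN_REGIONS = [
    (5, 5, "extracellular"),
    (90, 90, "membrane"),
    (100, 600, "intracellular"),
    (979, 979, "extracellular"),
]


def locate(residue):
    """Return the stated location of a residue, or 'unknown' if the
    problem does not say where it is."""
    for first, last, location in KNOWN_REGIONS:
        if first <= residue <= last:
            return location
    return "unknown"


def minimum_crossings(regions):
    """Count how often the chain must cross the bilayer at the very least:
    once for every change of side between consecutive known regions."""
    sides = [location for _, _, location in regions if location != "membrane"]
    return sum(1 for a, b in zip(sides, sides[1:]) if a != b)


for aa in (5, 50, 90, 300, 700, 979):
    print(aa, locate(aa))
print("minimum membrane crossings:", minimum_crossings(KNOWN_REGIONS))
```

The count is a lower bound only. Any segment could cross the membrane additional times, as long as it ends up on the side the problem specifies.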


Membrane proteins are usually drawn as topological diagrams with two parallel horizontal lines representing the membrane, with the region above the lines labelled as 'extracellular' and the region below the lines as 'intracellular', for instance, but that is not what is being asked for here.

It's not clear to me that the number of transmembrane spans is fully described, nor are the boundaries of the transmembrane spans. I think what is being asked for could be a domain diagram of the amino acid chain.

Draw a line on graph paper representing the peptide and use a scale to label each point described. For regions draw a box around the peptide line to describe a domain, and label it.
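If you would rather sketch this on a screen than on graph paper, something like the following could work. It is only an illustration: `domain_diagram` is a made-up helper, the domain labels are my own wording, and the 50-column scale is arbitrary.

```python
def domain_diagram(length, domains, width=50):
    """Return one text row per domain, drawn to scale on `width` columns.

    `domains` is a list of (first residue, last residue, label) tuples
    with 1-based, inclusive residue numbers.
    """
    rows = []
    for first, last, label in domains:
        start = (first - 1) * width // length
        # Always draw at least one column so single residues stay visible.
        end = max(start + 1, last * width // length)
        bar = "." * start + "#" * (end - start) + "." * (width - end)
        rows.append(bar + " " + label)
    return rows


# Positions taken from the problem statement; labels are illustrative.
protein_domains = [
    (5, 5, "aa5, extracellular"),
    (90, 90, "aa90, in the membrane"),
    (100, 600, "aa100-600, intracellular"),
    (200, 400, "aa200-400, tight ball"),
    (979, 979, "aa979, extracellular"),
]

for row in domain_diagram(1000, protein_domains):
    print(row)
```

Each row is the same peptide line, so stacking them gives the "box around the line" effect for each labelled region.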

I think it's hard to talk about amino acid properties other than hydrophilic/hydrophobic. Hope this helps?


Origins of Osteoclasts

Deborah L. Galson , G. David Roodman , in Osteoimmunology , 2011

DC-STAMP

DC-STAMP is a seven-transmembrane-spanning receptor with no homology to any other known protein or multimembrane-spanning receptor. DC-STAMP is expressed both in immature and mature dendritic cells (DCs), and its mRNA levels fall upon activation of DCs with CD40 ligand (CD40L). DC-STAMP has been demonstrated to be an NFATc1 direct target gene and is involved in the fusion of both osteoclasts and macrophage giant cells [88, 89] . Mice deficient in DC-STAMP had mild osteopetrosis with mononucleated osteoclasts due to lack of fusion rather than a defect in differentiation, and therefore resorption was inefficient. Use of bone marrow monocytes from mice with GFP replacing DC-STAMP mixed with unlabeled WT bone marrow cells revealed that the WT cells could fuse with the GFP+DC-STAMP cells confirming that only one partner in osteoclast fusion needs to be expressing DC-STAMP [88] . The ligand(s) for DC-STAMP is still unknown and the requirement for its expression is unclear, including the issue of whether osteoclasts and macrophages express different DC-STAMP ligands. Furthermore, the molecular mechanism by which DC-STAMP exerts its effects in osteoclasts remains unknown.



Classification by structure

There are two basic types of transmembrane proteins: [3] alpha-helical and beta-barrels. Alpha-helical proteins are present in the inner membranes of bacterial cells or the plasma membrane of eukaryotes, and sometimes in the outer membranes. [4] This is the major category of transmembrane proteins. In humans, 27% of all proteins have been estimated to be alpha-helical membrane proteins. [5] Beta-barrel proteins are so far found only in outer membranes of gram-negative bacteria, cell walls of gram-positive bacteria, and outer membranes of mitochondria and chloroplasts, or they can be secreted as pore-forming toxins. All beta-barrel transmembrane proteins have the simplest up-and-down topology, which may reflect their common evolutionary origin and similar folding mechanism.

In addition to the protein domains, there are unusual transmembrane elements formed by peptides. A typical example is Gramicidin A, a peptide that forms a dimeric transmembrane β-helix. [6] This peptide is secreted by Gram-positive bacteria as an antibiotic. A transmembrane polyproline-II helix has not been reported in natural proteins. Nonetheless, this structure was experimentally observed in specifically designed artificial peptides. [7]

Classification by topology

This classification refers to the position of the protein N- and C-termini on the different sides of the lipid bilayer. Types I, II, III and IV are single-pass molecules. Type I transmembrane proteins are anchored to the lipid membrane with a stop-transfer anchor sequence and have their N-terminal domains targeted to the endoplasmic reticulum (ER) lumen during synthesis (and the extracellular space, if mature forms are located on cell membranes). Type II and III are anchored with a signal-anchor sequence, with type II being targeted to the ER lumen with its C-terminal domain, while type III have their N-terminal domains targeted to the ER lumen. Type IV is subdivided into IV-A, with their N-terminal domains targeted to the cytosol and IV-B, with an N-terminal domain targeted to the lumen. [8] The implications for the division in the four types are especially manifest at the time of translocation and ER-bound translation, when the protein has to be passed through the ER membrane in a direction dependent on the type.

Membrane protein structures can be determined by X-ray crystallography, electron microscopy or NMR spectroscopy. [10] The most common tertiary structures of these proteins are the transmembrane helix bundle and the beta barrel. The portion of a membrane protein that is attached to the lipid bilayer (see annular lipid shell) consists mostly of hydrophobic amino acids. [11]

Membrane proteins have hydrophobic surfaces, are relatively flexible, and are expressed at relatively low levels. This creates difficulties in obtaining enough protein and then growing crystals. Hence, despite the significant functional importance of membrane proteins, determining atomic resolution structures for these proteins is more difficult than for globular proteins. [12] As of January 2013, less than 0.1% of protein structures determined were membrane proteins, despite their being 20–30% of the total proteome. [13] Because of this difficulty and the importance of this class of proteins, methods of protein structure prediction based on hydropathy plots, the positive inside rule and other methods have been developed. [14] [15] [16]
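A hydropathy plot can be sketched in a few lines. The example below is a minimal illustration, not one of the published prediction methods cited above: it uses the Kyte-Doolittle hydropathy scale with a sliding-window average, and the 19-residue window, the 1.6 cutoff, the function names, and the test sequence are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Average hydropathy of every full window along the sequence.

    Raises KeyError for letters outside the 20 standard amino acids.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """1-based start positions of windows whose average hydropathy
    reaches the (assumed) threshold for a membrane-spanning helix."""
    profile = hydropathy_profile(sequence, window)
    return [i + 1 for i, avg in enumerate(profile) if avg >= threshold]


# Hypothetical sequence: polar ends flanking a 20-residue leucine stretch.
example = "MKTSDE" + "L" * 20 + "RKKDEQ"
print(candidate_tm_windows(example))
```

High-scoring windows only suggest where a hydrophobic helix could sit. A real prediction would combine this with other signals such as the positive inside rule.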

Stability of α-helical transmembrane proteins

Transmembrane α-helical proteins are unusually stable judging from thermal denaturation studies, because they do not unfold completely within the membranes (the complete unfolding would require breaking down too many α-helical H-bonds in the nonpolar media). On the other hand, these proteins easily misfold, due to non-native aggregation in membranes, transition to the molten globule states, formation of non-native disulfide bonds, or unfolding of peripheral regions and nonregular loops that are locally less stable. [ citation needed ]

It is also important to properly define the unfolded state. The unfolded state of membrane proteins in detergent micelles is different from that in the thermal denaturation experiments. [ citation needed ] This state represents a combination of folded hydrophobic α-helices and partially unfolded segments covered by the detergent. For example, the "unfolded" bacteriorhodopsin in SDS micelles has four transmembrane α-helices folded, while the rest of the protein is situated at the micelle-water interface and can adopt different types of non-native amphiphilic structures. Free energy differences between such detergent-denatured and native states are similar to stabilities of water-soluble proteins (< 10 kcal/mol). [ citation needed ]

Folding of α-helical transmembrane proteins

Refolding of α-helical transmembrane proteins in vitro is technically difficult. There are relatively few examples of the successful refolding experiments, as for bacteriorhodopsin. In vivo, all such proteins are normally folded co-translationally within the large transmembrane translocon. The translocon channel provides a highly heterogeneous environment for the nascent transmembrane α-helices. A relatively polar amphiphilic α-helix can adopt a transmembrane orientation in the translocon (although it would be at the membrane surface or unfolded in vitro), because its polar residues can face the central water-filled channel of the translocon. Such mechanism is necessary for incorporation of polar α-helices into structures of transmembrane proteins. The amphiphilic helices remain attached to the translocon until the protein is completely synthesized and folded. If the protein remains unfolded and attached to the translocon for too long, it is degraded by specific "quality control" cellular systems. [ citation needed ]

Stability and folding of β-barrel transmembrane proteins

Stability of β-barrel transmembrane proteins is similar to stability of water-soluble proteins, based on chemical denaturation studies. Some of them are very stable even in chaotropic agents and at high temperature. Their folding in vivo is facilitated by water-soluble chaperones, such as the protein Skp. It is thought that β-barrel membrane proteins come from one ancestor, despite having different numbers of sheets, which could have been added or doubled during evolution. Some studies show strong sequence conservation among different organisms and also conserved amino acids which hold the structure and help with folding. [17]

Light absorption-driven transporters

Oxidoreduction-driven transporters

  • Transmembrane cytochrome b-like proteins from bacteria and mitochondria: coenzyme Q - cytochrome c reductase (cytochrome bc1), cytochrome b6f complex, formate dehydrogenase, respiratory nitrate reductase, succinate - coenzyme Q reductase (fumarate reductase), and succinate dehydrogenase. See electron transport chain.

Electrochemical potential-driven transporters

P-P-bond hydrolysis-driven transporters

  • P-type calcium ATPase (five different conformations)
  • Calcium ATPase regulators phospholamban and sarcolipin
  • General secretory pathway (Sec) translocon (preprotein translocase SecY)

Porters (uniporters, symporters, antiporters)

    carrier proteins
  • Major Facilitator Superfamily (Glycerol-3-phosphate transporter, Lactose permease, and Multidrug transporter EmrD) (multidrug efflux transporter AcrB, see multidrug resistance)
  • Dicarboxylate/amino acid:cation symporter (proton glutamate symporter)
  • Monovalent cation/proton antiporter (Sodium/proton antiporter 1 NhaA) sodium symporter
  • Ammonia transporters
  • Drug/Metabolite Transporter (small multidrug resistance transporter EmrE - the structures are retracted as erroneous)

Alpha-helical channels including ion channels

    like, including potassium channels KcsA and KvAP, and inward-rectifier potassium ion channel Kirbac of neurotransmitter receptors (acetylcholine receptor)
  • Outer membrane auxiliary proteins (polysaccharide transporter) - α-helical transmembrane proteins from the outer bacterial membrane

Enzymes

Proteins with alpha-helical transmembrane anchors

    transmembrane dimerization domain ]
  • Cytochrome c nitrite reductase complex
  • Steryl-sulfate sulfohydrolase
  • Stannin A dimer
  • Inovirus (filamentous phage) major coat protein -associated protein A and B [18] .
  • Membrane protease specific for a stomatin homolog

Β-barrels composed of a single polypeptide chain

    Beta barrels from eight beta-strands and with "shear number" of ten (n=8, S=10). They include:
      (OmpA) (OmpX) (OmpW) (PagP) (NspA)

    Note: n and S are, respectively, the number of beta-strands and the "shear number" [19] of the beta-barrel



    New research by Yuyang Lei and colleagues published in the journal Circulation Research sheds new light on how the spike protein might play a critical role in the widespread damage caused by SARS-CoV2, and offers insight into treating the complications of COVID-19.

    Vaccine skeptics have seized on the study to cast doubt on the safety of vaccines. But a review of the study’s findings shows that the concerns raised by vaccine doubters are much ado about nothing.



    The vascular endothelium is an important player in the illness and death associated with COVID-19. The endothelium is a system of cells that line and protect the inside of blood vessels. SARS-CoV2 injures the endothelium leading to blood clots, heart attack, pulmonary embolism, and stroke. Despite the established link between COVID-19 and these cardiovascular complications, the mechanism by which they develop is unknown.

    Researchers from Jiaotong University, the University of California, San Diego, and the Salk Institute used a pseudovirus coated with spike protein to investigate the effects of the viral protein on endothelial cells. Pseudoviruses – which were first developed over 50 years ago – contain the outer shell of the virus, but they lack the viral genes needed to reproduce.

    Hamsters treated with the spike protein coated pseudovirus showed lung damage similar to that seen in humans infected with SARS-CoV2. When researchers added pseudovirus to cultured endothelial cells they found that the mitochondria inside the cells were injured. Since mitochondria are responsible for providing energy to cells, their dysfunction can cause cell death.

    When isolated pulmonary arteries were exposed to the spike protein carrying pseudovirus there was some disruption in the ability of the blood vessels to dilate. The decreased ability to expand blood vessels that serve the lungs could impair the ability of the body to take up oxygen from lungs that are damaged by the virus.

    The novelty of this study was the discovery that the spike protein itself causes damage, and that the pathway triggered by the spike protein could explain the widespread cardiovascular complications that develop in COVID-19 patients.

    A Twisted Tale

    Shortly after Lei and colleagues published their study, vaccine skeptics touted the findings as proof that newly developed COVID-19 vaccines are dangerous. After all, if COVID-19 vaccines produce spike protein to trigger immunity, and that same spike protein causes injury, then vaccines are really no different than the disease they are designed to prevent.

    The problem with these claims is that science doesn’t support their arguments.

    The Long Road to Perdition

    COVID19 vaccines are injected into the deltoid where they are taken up by muscle cells. The vaccine remains largely contained near the site of injection. Local muscle cells that take in the vaccine produce the spike protein and place it on the surface of the cell where it is recognized by the immune system. Vaccine that is not taken up by muscle is drained into the local lymph nodes where lymphatic cells absorb the vaccine and similarly make spike protein. The lymphatic cells are responsible for activating T and B cells, which are important steps in generating immunity.

    In order to damage the endothelium of blood vessels, COVID-19 vaccines would have to enter the vascular system and be taken up by cells that circulate in the blood. Data collected by the European Medicines Agency show that no significant amount of vaccine enters the circulation. The confinement of the expressed spike protein away from the circulatory system largely prevents it from causing damage to the vascular endothelium.

    Redesigning the Spike Protein

    The spike protein attaches SARS-CoV2 to cells through a receptor called ACE2. In order to fully interact, the spike protein must undergo a conformational change.

    A research team led by Dr. Barney Graham from the Vaccine Research Center at the NIH National Institute of Allergy and Infectious Diseases created an engineered form of the spike protein that is unable to make the shape change required to effectively bind to cells. The Pfizer/BioNTech, Moderna, Novavax, and Johnson & Johnson vaccines all use this inactivated spike protein, which means any spike protein that is produced by the vaccine is not able to be activated. This safety switch limits the ability of the spike protein to bind ACE2 and limits its ability to cause damage.

    Stuck in a Hole

    In addition to engineering the spike protein so it cannot be fully activated, the protein is tagged with an extra piece called a “transmembrane anchor”. The transmembrane anchor allows the spike protein to appear on the surface – or membrane – of the cell, but it is held in place by the anchor. This prevents the spike protein from drifting away and creates a fixed target for the immune system to recognize the foreign protein.

    Three Strikes Against Misinformation

    The significance of the work by Lei and colleagues has been overshadowed by the concerns raised by vaccine skeptics. Their claims of a looming vaccine catastrophe brought about by vaccine-induced spike proteins fail to consider that the spike protein of vaccines is different from the natural form, that its engineered shape prevents activation, and that multiple elements confine spike protein expression to a highly localized collection of cells whose purpose is to activate the immunity vaccines are designed to produce.

    Ironically, the same study cited by vaccine skeptics as proof of their arguments draws a very different conclusion than the negative ones they espouse. Lei and colleagues conclude their paper by noting that their study “suggests that vaccination-generated antibody and/or exogenous antibody against [spike] protein not only protects the host from SARS-CoV-2 infectivity but also inhibits [spike] protein imposed endothelial injury.” In other words, the spike proteins used by currently available vaccines actually offer a double layer of protection.

    W. Glen Pyle, PhD is a Professor in the Department of Biomedical Sciences at the University of Guelph and an Associate Member of the IMPART Team.


    Conclusions

    We determined that the most useful properties for discriminating transmembrane segments from non-transmembrane segments and for discriminating intrinsically unstructured segments from intrinsically structured segments in transmembrane proteins were hydropathy, polarity, and flexibility, and based on these properties, constructed a number of classifiers to identify transmembrane segments with an out-of-sample accuracy of approximately 75%. Several interesting observations emerged from our study:

    • Intrinsically unstructured segments and transmembrane segments tend to have opposite properties, as summarized in Table 5. For example, unstructured segments tended to have a low hydropathy value, whereas transmembrane segments tended to have a high hydropathy value. These results are in agreement with previous work that found that transmembrane segments tend to be more hydrophobic than non-transmembrane segments, due to the fact that transmembrane α-helices require a stretch of 12-35 hydrophobic amino acids to span the hydrophobic region inside the membrane [26].

    Table 5

    Tendencies of various properties for transmembrane (TM) and intrinsically unstructured (IU) segments.

    Property              Segment type
                          TM      IU
    Hydropathy            High    Low
    Polarity              Low     High
    Bulkiness             High    Low
    Flexibility           Low     High
    Electronic effects    High    Low

    Reproduced with permission from [38]

    • Transmembrane proteins appear to be much richer in intrinsically unstructured segments than other proteins: about 70% of transmembrane proteins contain intrinsically unstructured regions, as compared to about 35% of other proteins.

    • In approximately 70% of transmembrane proteins that contain intrinsically unstructured segments, the intrinsically unstructured segments are close to transmembrane segments.

    These observations may provide insight into the structural and functional roles that intrinsically unstructured segments play in membrane proteins, and may also aid in the identification of intrinsically unstructured and transmembrane segments from primary protein structure.


    Active and Passive Transmembrane Transport

    Transport across the membrane can be classified as passive or active based on whether the transport process is exergonic or endergonic. Passive transport is the exergonic movement of substances across the membrane. In contrast, active transport is the endergonic movement of substances across the membrane.

    Passive transport

    Passive transport does not require the cell to expend energy. In passive transport, substances move from an area of higher concentration to an area of lower concentration, down their concentration gradient.

    Depending on the chemical nature of the substance, we may associate different processes with passive transport.

    Diffusion

    Diffusion is a passive process of transport. A single substance moves from an area of high concentration to an area of low concentration until the concentration is equal across a space. You are familiar with diffusion of substances through the air. For example, think about someone opening a bottle of ammonia in a room filled with people. The ammonia gas is at its highest concentration in the bottle; its lowest concentration is at the edges of the room. The ammonia vapor will diffuse, or spread away, from the bottle; gradually, more and more people will smell the ammonia as it spreads. Materials move within the cell's cytosol by diffusion, and certain materials move through the plasma membrane by diffusion.

    Figure 2. Diffusion through a permeable membrane moves a substance from an area of high concentration (extracellular fluid, in this case) down its concentration gradient (into the cytoplasm). Each separate substance in a medium, such as the extracellular fluid, has its own concentration gradient, independent of the concentration gradients of other materials. In addition, each substance will diffuse according to that gradient. Within a system, there will be different rates of diffusion of the different substances in the medium.(credit: modification of work by Mariana Ruiz Villareal)

    Factors that affect diffusion

    If unconstrained, molecules will move through and explore space randomly at a rate that depends on their size, their shape, their environment, and their thermal energy. This movement underlies the diffusive movement of molecules through whatever medium they are in. The absence of a concentration gradient does not mean that this movement will stop, just that there may be no net movement of molecules from one area to another, a condition known as a dynamic equilibrium.
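The approach to dynamic equilibrium can be illustrated with a very small model. This is a sketch under simplifying assumptions that are not part of the text above: two compartments of equal volume, and a net flux in each time step that is a fixed fraction (`rate`, an arbitrary value) of the concentration difference. The function name is made up for illustration.

```python
def simulate_two_compartments(c_left, c_right, rate=0.1, steps=100):
    """Track concentrations in two equal-volume compartments.

    In every step the net flux from left to right is `rate` times the
    concentration difference, so it shrinks as the gradient flattens.
    """
    history = [(c_left, c_right)]
    for _ in range(steps):
        net_flux = rate * (c_left - c_right)
        c_left -= net_flux
        c_right += net_flux
        history.append((c_left, c_right))
    return history


history = simulate_two_compartments(10.0, 0.0)
for step in (0, 1, 5, 20, 100):
    left, right = history[step]
    print(step, round(left, 3), round(right, 3))
```

In this model the net flux falls toward zero as the two concentrations converge, even though individual molecules in a real system would keep moving in both directions.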

    Factors influencing diffusion include:

    • Extent of the concentration gradient: The greater the difference in concentration, the more rapid the diffusion. The closer the distribution of the material gets to equilibrium, the slower the rate of diffusion becomes.
    • Shape, size and mass of the molecules diffusing: Larger and heavier molecules move more slowly; therefore, they diffuse more slowly. The reverse is typically true for smaller, lighter molecules.
    • Temperature: Higher temperatures increase the energy and therefore the movement of the molecules, increasing the rate of diffusion. Lower temperatures decrease the energy of the molecules, thus decreasing the rate of diffusion.
    • Solvent density: As the density of a solvent increases, the rate of diffusion decreases. The molecules slow down because they have a more difficult time getting through the denser medium. If the medium is less dense, rates of diffusion increase. Since cells primarily use diffusion to move materials within the cytoplasm, any increase in the cytoplasm's density will decrease the rate at which materials move in the cytoplasm.
    • Solubility: As discussed earlier, nonpolar or lipid-soluble materials pass through plasma membranes more easily than polar materials, allowing a faster rate of diffusion.
    • Surface area and thickness of the plasma membrane: Increased surface area increases the rate of diffusion, whereas a thicker membrane reduces it.
    • Distance traveled: The greater the distance that a substance must travel, the slower the rate of diffusion. This places an upper limitation on cell size. A large, spherical cell will die because nutrients or waste cannot reach or leave the center of the cell, respectively. Therefore, cells must either be small, as with many prokaryotes, or flattened.
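The effect of distance in the list above can be made concrete with a rough estimate. The sketch below assumes the standard one-dimensional relation between mean diffusion time and distance, t ≈ x² / (2D), which is not stated in the text; the diffusion coefficient of 1e-9 m²/s is an illustrative order of magnitude for a small molecule in water, not a measured value for any particular substance, and the function name is hypothetical.

```python
def diffusion_time(distance_m, diffusion_coefficient=1e-9):
    """Approximate time in seconds to diffuse `distance_m` metres in one
    dimension, using t = x**2 / (2 * D) with D in m^2/s (assumed value)."""
    return distance_m ** 2 / (2 * diffusion_coefficient)


# Distances roughly spanning a bacterium, a large cell, and a centimetre.
for label, distance in (("1 um", 1e-6), ("100 um", 1e-4), ("1 cm", 1e-2)):
    print(label, diffusion_time(distance), "s")
```

Because the time grows with the square of the distance, a tenfold larger cell needs about a hundred times longer for the same substance to reach its center.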

    Facilitated transport

    In facilitated transport, also called facilitated diffusion, materials diffuse across the plasma membrane with the help of membrane proteins. A concentration gradient exists that allows these materials to diffuse into or out of the cell without expending cellular energy. If the materials are ions or polar molecules (compounds that are repelled by the hydrophobic parts of the cell membrane), facilitated transport proteins help shield these materials from the repulsive force of the membrane, allowing them to diffuse into the cell.

    Channels

    The integral proteins involved in facilitated transport are collectively referred to as transport proteins, and they function as either channels for the material or carriers. In both cases, they are transmembrane proteins. Different channel proteins have different transport properties. Some have evolved to have very high specificity for the substance that is being transported, while others transport a variety of molecules sharing some common characteristic. The interior "passageway" of channel proteins has evolved to provide a low energetic barrier for transport of substances across the membrane through the complementary arrangement of amino acid functional groups (of both backbone and side chains). Passage through the channel allows polar compounds to avoid the nonpolar central layer of the plasma membrane that would otherwise slow or prevent their entry into the cell. While at any one time significant amounts of water cross the membrane both in and out, the rate of individual water molecule transport may not be fast enough to adapt to changing environmental conditions. For such cases, nature has evolved a special class of membrane proteins called aquaporins that allow water to pass through the membrane at a very high rate.

    Figure 3. Facilitated transport moves substances down their concentration gradients. They may cross the plasma membrane with the aid of channel proteins. (credit: modification of work by Mariana Ruiz Villareal)

    Channel proteins are either open at all times or they are "gated." In the latter case, a gate controls the opening of the channel.

    Various mechanisms may be involved in gating. For instance, the attachment of a specific ion or small molecule to the channel protein may trigger opening. Changes in local membrane "stress" or changes in voltage across the membrane may also be triggers to open or close a channel.

    Different organisms and tissues in multicellular species express different channel proteins in their membranes depending on the environments they live in or the specialized function they play in an organism. This provides each type of cell with a unique membrane permeability profile that is evolved to complement its "needs" (note the anthropomorphism). For example, in some tissues, sodium and chloride ions pass freely through open channels, whereas in other tissues a gate must open to allow passage. This occurs in the kidney, where both forms of channels are found in different parts of the renal tubules. Cells involved in the transmission of electrical impulses, such as nerve and muscle cells, have gated channels for sodium, potassium, and calcium in their membranes. Opening and closing of these channels changes the relative concentrations of these ions on opposing sides of the membrane, resulting in a change in electrical potential across the membrane that leads to message propagation in nerve cells or to contraction in muscle cells.

    Carrier proteins

    Another type of protein embedded in the plasma membrane is a carrier protein. This aptly named protein binds a substance and, in doing so, triggers a change of its own shape, moving the bound molecule from the outside of the cell to its interior; depending on the gradient, the material may move in the opposite direction. Carrier proteins are typically specific for a single substance. This selectivity adds to the overall selectivity of the plasma membrane. The molecular-scale mechanism of function for these proteins remains poorly understood.

    Figure 4. Some substances move down their concentration gradient across the plasma membrane with the aid of carrier proteins. Carrier proteins change shape as they move molecules across the membrane.

    Carrier proteins play an important role in the function of kidneys. Glucose, water, salts, ions, and amino acids needed by the body are filtered in one part of the kidney. This filtrate, which includes glucose, is then reabsorbed in another part of the kidney with the help of carrier proteins. Because there are only a finite number of carrier proteins for glucose, if more glucose is present in the filtrate than the proteins can handle, the excess is excreted from the body in the urine. In a diabetic individual, this is described as "spilling glucose into the urine." A different group of carrier proteins called glucose transport proteins, or GLUTs, are involved in transporting glucose and other hexose sugars through plasma membranes within the body.
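The idea that a finite number of carriers caps how much glucose can be recovered can be sketched as a simple saturation rule. This is a deliberately crude model: the function name is hypothetical, the numbers are arbitrary units rather than physiological values, and real carriers saturate gradually rather than at a sharp limit.

```python
def handle_filtered_glucose(filtered_load, transport_maximum):
    """Split a filtered glucose load into reabsorbed and excreted parts.

    `transport_maximum` stands for the most glucose the finite pool of
    carrier proteins can move in the same time (illustrative units).
    """
    reabsorbed = min(filtered_load, transport_maximum)
    excreted = filtered_load - reabsorbed
    return reabsorbed, excreted


# Below the carrier capacity everything is reabsorbed; above it, the
# excess ends up in the urine.
print(handle_filtered_glucose(100, 375))
print(handle_filtered_glucose(500, 375))
```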

    Channel and carrier proteins transport material at different rates. Channel proteins transport much more quickly than do carrier proteins. Channel proteins facilitate diffusion at a rate of tens of millions of molecules per second, whereas carrier proteins work at a rate of a thousand to a million molecules per second.

    Active transport

    Active transport mechanisms require the use of the cell's energy, usually in the form of adenosine triphosphate (ATP). If a substance must move into the cell against its concentration gradient (that is, if the concentration of the substance inside the cell is greater than its concentration in the extracellular fluid, and vice versa), the cell must use energy to move the substance. Some active transport mechanisms move small-molecular weight materials, such as ions, through the membrane. Other mechanisms transport much larger molecules.

    Moving against a gradient

    To move substances against a concentration or electrochemical gradient, the cell must use energy. Transporters harvest this energy from ATP generated through the cell's metabolism. Active transport mechanisms, collectively called pumps, work against electrochemical gradients. Small substances constantly pass through plasma membranes. Active transport maintains concentrations of ions and other substances needed by living cells in the face of these passive movements. Much of a cell's supply of metabolic energy may be spent maintaining these processes. (Most of a red blood cell's metabolic energy is used to maintain the imbalance between exterior and interior sodium and potassium levels required by the cell.) Because active transport mechanisms depend on a cell's metabolism for energy, they are sensitive to many metabolic poisons that interfere with the supply of ATP.

    Two mechanisms exist for the transport of small-molecular-weight material and small molecules. Primary active transport moves ions across a membrane and creates a difference in charge across that membrane, which directly depends on ATP. Secondary active transport describes the movement of material due to the electrochemical gradient established by primary active transport, which does not directly require ATP.

    Carrier proteins for active transport

    An important membrane adaptation for active transport is the presence of specific carrier proteins or pumps to facilitate movement. There are three types of these proteins or transporters. A uniporter carries one specific ion or molecule. A symporter carries two different ions or molecules, both in the same direction. An antiporter also carries two different ions or molecules, but in different directions. These transporters can also transport small, uncharged organic molecules like glucose.

    These three types of carrier proteins are also found in facilitated diffusion, but they do not require ATP to work in that process. Some examples of pumps for active transport are Na+-K+ ATPase, which carries sodium and potassium ions, and H+-K+ ATPase, which carries hydrogen and potassium ions. Both are antiporter carrier proteins. Two other carrier proteins are Ca2+ ATPase and H+ ATPase, which carry only calcium and only hydrogen ions, respectively. Both are pumps.

    Figure 5. A uniporter carries one molecule or ion. A symporter carries two different molecules or ions, both in the same direction. An antiporter also carries two different molecules or ions, but in different directions. (credit: modification of work by "Lupask"/Wikimedia Commons)

    Primary active transport

    In primary active transport, the energy is derived directly from the hydrolysis of ATP. Often, primary active transport, such as that shown below, which transports sodium and potassium ions, allows secondary active transport to occur (discussed in the section below).

    The second transport method is still considered active because it depends on the use of energy from the primary transport.

    Figure 6. Primary active transport moves ions across a membrane, creating an electrochemical gradient (electrogenic transport). (credit: modification of work by Mariana Ruiz Villareal)

    One of the most important pumps in animal cells is the sodium-potassium pump (Na+-K+ ATPase), which maintains the electrochemical gradient (and the correct concentrations of Na+ and K+) in living cells. The sodium-potassium pump moves K+ into the cell while moving Na+ out at a ratio of three Na+ for every two K+ ions moved in. The Na+-K+ ATPase exists in two forms depending on its orientation to the interior or exterior of the cell and its affinity for either sodium or potassium ions. The process comprises the following six steps.

    1. With the enzyme oriented towards the interior of the cell, the carrier has a high affinity for sodium ions. Three ions bind to the protein.
    2. ATP

    Several things have happened because of this process. There are more sodium ions outside of the cell than inside and more potassium ions inside than out. For every three ions of sodium that move out, two ions of potassium move in. This results in the interior being slightly more negative relative to the exterior. This difference in charge is important in creating the conditions necessary for the secondary process. The sodium-potassium pump is, therefore, an electrogenic pump (a pump that creates a charge imbalance), creating an electrical imbalance across the membrane and contributing to the membrane potential.
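
    The charge bookkeeping described above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and its parameters are invented for this example, and it simply counts elementary charges per pump cycle rather than modelling the membrane potential.

```python
# Charge carried by each ion, in units of the elementary charge.
NA_CHARGE = +1
K_CHARGE = +1


def net_charge_exported(cycles, na_out_per_cycle=3, k_in_per_cycle=2):
    """Net positive charge moved out of the cell after `cycles` pump cycles."""
    charge_out = cycles * na_out_per_cycle * NA_CHARGE  # Na+ leaving the cell
    charge_in = cycles * k_in_per_cycle * K_CHARGE      # K+ entering the cell
    return charge_out - charge_in


# One positive charge leaves the cell per cycle, so the interior
# becomes slightly more negative with every turn of the pump.
print(net_charge_exported(1))
print(net_charge_exported(100))
```

    With the 3:2 ratio, each cycle exports one net positive charge, which is why the pump is described as electrogenic.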


    Secondary active transport

    Secondary active transport brings sodium ions, and possibly other compounds, into the cell. As sodium ion concentrations build outside of the plasma membrane because of the action of the primary active transport process, an electrochemical gradient is created. If a channel protein exists and is open, the sodium ions will return through the membrane down the gradient. This movement is used to transport other substances that can attach themselves to the transport protein through the membrane. Many amino acids, and glucose, enter a cell this way. This secondary process is also used to store high energy hydrogen ions in the mitochondria of plant and animal cells for the production of ATP. The potential energy that accumulates in the stored hydrogen ions is translated into kinetic energy as the ions surge through the channel protein ATP synthase, and that energy is used to convert ADP into ATP.


    Missense mutations in transmembrane domains of proteins: Phenotypic propensity of polar residues for human disease

    Anthony W. Partridge and Alex G. Therien contributed equally to the work.

    Division of Structural Biology and Biochemistry, Research Institute, Hospital for Sick Children, Toronto, and Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada





    Cotranslational membrane protein insertion

    The pathway for the insertion of membrane proteins has been highly conserved across organisms and has two distinct steps: recognition and targeting of a nascent IMP and then its insertion into the membrane.

    Recognition and Targeting

    Recognition occurs on the ribosome: as the nascent polypeptide chain emerges from the exit channel, it is recognized by the Signal Recognition Particle (SRP). This protein has a nonspecific hydrophobic channel that interacts with amino acids that would form a transmembrane domain based on the presence of numerous hydrophobic residues. Upon binding, translation is arrested and the SRP-ribosome complex searches for the eukaryotic SRP Receptor (SR) or bacterial FtsY. Once the ribosome-SRP-receptor complex is formed, it diffuses through the membrane until it interacts with the Sec translocon. The nascent chain is then transferred to the channel. The critical aspect of this process is that it prevents the hydrophobic region of the polypeptide from being exposed to the hydrophilic cytosol, thereby avoiding misfolding and aggregation (Grudnik et al. 2009).

    Figure 2: General steps of cotranslational membrane protein insertion. (Shao & Hegde 2011)

    Sec translocon

    In both prokaryotes and eukaryotes, there are protein-conducting channels generally referred to as Sec translocases. These channels have two fundamental functions that enable the insertion of proteins into the membrane. They contain a hydrophilic channel that allows hydrophilic residues of a polypeptide to pass through the membrane as well as a lateral gate that opens to expose the interior of the channel to the lipid acyl chains. This allows hydrophobic residues to enter through the channel and then interact directly with the lipid tails while avoiding the polar head groups. These channels therefore allow the nascent chain that is being translated to effectively "thread" in and out of the membrane as many times as is required to reach the final structure (van den Berg et al. 2004).

    Figure 3: The Sec61 translocon and potential accessory factors. (Shao & Hegde 2011)


    Methods

    Datasets

    All datasets used for analysis are listed in Table 1. Transmembrane protein sequences and annotations were taken from TOPDB [50] and UniProt [49]. UniProt-derived datasets are the most comprehensive datasets, built with (1) robust transmembrane prediction methods, providing the limit of today’s achievable accuracy with regard to hydrophobic core localisation, and (2) subcellular location annotation that can be used for orientation determination. However, they mostly rely on predicted transmembrane regions. TOPDB has meticulous experimental verifications of the orientation from the literature that are independent of prediction algorithms [50]. Unfortunately, this dataset is much smaller with too few entries to have it divided with regard to taxonomy or subcellular locations.

    UniProt database files were downloaded by querying the server for different taxonomic groups as well as different subcellular membrane locations: UniHuman (human representative proteome), UniCress (Arabidopsis thaliana, otherwise known as mouse-ear cress, representative proteome), UniER (human endoplasmic reticulum representative proteome), UniPM (human plasma membrane representative proteome) and UniGolgi (human Golgi representative proteome). To enforce a level of quality control, the queries were restricted to manually reviewed records and transmembrane proteins with manually asserted TRANSMEM annotation [49]. Proteins were then sorted into multi-pass and single-pass groups according to whether they had more than one or exactly one TRANSMEM region, respectively. TRANSMEM regions are validated by either experimental evidence [49] or according to a robust transmembrane consensus of the predictors TMHMM [23], Memsat [80], Phobius [21, 22] and the hydrophobic moment plot method of Eisenberg and co-workers [54]. TMHs and flanking regions were oriented using the UniProt TOPO_DOM annotation, based on the keyword “cytoplasmic”. If “cytoplasmic” was found in the TOPO_DOM preceding the TRANSMEM region, then the sequence remained the same. If “cytoplasmic” was found in the TOPO_DOM following the TRANSMEM region, then the sequence was reversed. Proteins without the “cytoplasmic” keyword in their TOPO_DOM annotation were omitted from further analysis.
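
    The orientation rule in the last few sentences can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function name is hypothetical, and the two TOPO_DOM descriptions are assumed to be available as plain strings.

```python
def orient_inside_to_outside(sequence, prev_topo, next_topo):
    """Return a TMH-plus-flanks sequence with its cytoplasmic side first.

    prev_topo and next_topo are the TOPO_DOM descriptions (assumed here to
    be plain strings) of the domains before and after the TRANSMEM region.
    Returns None when neither neighbour is annotated as cytoplasmic.
    """
    if "cytoplasmic" in prev_topo.lower():
        return sequence        # already runs inside -> outside
    if "cytoplasmic" in next_topo.lower():
        return sequence[::-1]  # reverse so the cytoplasmic end comes first
    return None                # omitted from further analysis


print(orient_inside_to_outside("KRLLLLVVD", "Extracellular", "Cytoplasmic"))
```

    After this step every retained sequence reads from the cytoplasmic flank, through the TMH, to the non-cytoplasmic flank, so that alignment columns are comparable across proteins.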

    The TOPDB database [50] is a manually curated database composed of experimental records from the literature that allow determination of the protein topology. Experiments include fusion proteins, posttranslational modifications, protease experiments, immunolocalisation, chemical modifications as well as revertants, sequence motifs with known mandatory membrane-embedded topologies and tailoring mutants (Additional file 3: Table S1). Length cut-offs for the TMH were set with 16 as the shortest length and 38 as the longest.

    The datasets described in the following subsections are used throughout this work.

    ExpAll

    TOPDB contained 4190 manually annotated transmembrane proteins at the time of download [50]. CD-HIT [81] identified 3857 representative sequences using sequence clusters of >90% sequence identity. This similarity threshold was chosen since CD-HIT ultimately underlies the clustering behind UniRef. Unlike the other datasets, which by definition contain reasonably typical TMHs, many of the transmembrane segments annotated in TOPDB are extremely short or long, and this would cause severe unrealistic hydrophobic mismatches. The short segments in particular could be the result of misannotation, TMHs broken into pieces due to kinks or segments that peripherally insert only into the interface of the membrane bilayer. To remove the atypical lengths, cut-offs were set with 16 as the lower cut-off and 38 as the upper cut-off after inspecting the length histogram. We found that, for the single-pass TMHs in TOPDB, 1215 out of 1544 are within the length limits (78.7%). Amongst the 17,141 multi-pass TMHs, we find 15,563 within our global length limits (from 2205 TOPDB records corresponding to 2281 UniProt entries). This removed 1578 very short TMHs and none of the long TMHs. Our cut-off selection is very similar to the one used by Baeza-Delgado et al. [13].
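
    A length filter of this kind is straightforward to express in code. The sketch below is an assumption-laden illustration: the function names are hypothetical, TMHs are represented as (start, end) pairs of 1-based inclusive positions, and both cut-offs are treated as inclusive.

```python
MIN_TMH_LEN = 16  # lower cut-off from the text
MAX_TMH_LEN = 38  # upper cut-off from the text


def within_length_limits(start, end):
    """start and end are 1-based, inclusive residue positions of a TMH."""
    length = end - start + 1
    return MIN_TMH_LEN <= length <= MAX_TMH_LEN


def filter_tmhs(tmhs):
    """Keep only the (start, end) pairs whose length is within the limits."""
    return [(s, e) for s, e in tmhs if within_length_limits(s, e)]


# A 10-residue segment is dropped; a 21-residue segment is kept.
print(filter_tmhs([(5, 14), (30, 50)]))

# Retained share of single-pass TMHs quoted in the text.
print(round(100 * 1215 / 1544, 1))  # 78.7
```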

    To get an idea of the taxonomical breakdown in the ExpAll dataset, the UniProt ID tags were extracted and mapped to UniProtKB. The combined dataset of multi-pass (single-pass) proteins was mapped to 1288 (1343) eukaryotic records, 404 (776) of which were human records, 926 (191) bacterial records, 46 (5) Archaea records and 14 (22) viral records.

    UniHuman

    This is a set of mostly human TMH-containing proteins or their close mammalian homologues. UniProtKB contains 5187 human protein records that are manually annotated with TRANSMEM regions (query = “annotation:(type:transmem) AND reviewed:yes AND organism:"Homo sapiens (Human) [9606]" AND proteome:up000005640”). To reduce sequence redundancy, these sequences were submitted to UniRef90 [82]. Of note, UniRef90 was chosen over UniRef50 to maintain a viable size of datasets for statistical analysis of occurrence of negatively charged residues, which are very rare in the vicinity of TMHs. There were 5015 UniRef90 clusters representing the 5187 sequences. A list of sequences representing those clusters was submitted back to UniProtKB, and 5014 representative entries were recovered. There is a small issue in that the list of representatives from UniRef includes non-canonical isoforms, whilst the batch retrieve query of UniProtKB only supports complete entries, i.e. canonical isoforms. This resulted in the loss of one record at this point due to two splice isoforms acting as representative identifiers. Of those 5014 records, 4714 were records from human entries, 197 were from mice, 94 from rats, 5 from bovines, 2 from chimps, 1 from Chinese hamsters, and 1 from pigs. Although the TMH length variations within the UniHuman dataset are much smaller than for ExpAll, we applied the same length cut-offs for the sake of comparability. Out of the 1709 single-pass cases, 1705 entered the final dataset. Of those, 1596 were from human records, 87 were from mouse, 19 were from rat, and 2 were from chimpanzee. The further loss of a record in the taxonomic query is again due to multiple splice isoform records being represented by a single UniProt record. Amongst the 12,390 multi-pass TMHs, 12,353 were included into UniHuman. The multi-pass record identifiers were mapped to 1789 UniProtKB entries. Of these, 1660 were human entries, 63 from rat, 61 from mouse, 4 from bovines and 1 from Chinese hamsters. This clustered human dataset was then queried for subcellular locations to make the UniER, UniGolgi and UniPM datasets (detailed below).

    UniER

    The clustered UniHuman dataset was queried using UniProtKB for endoplasmic reticulum subcellular location (locations:(location:"Endoplasmic reticulum [SL-0095]" evidence:manual)). This returned 487 protein entries, 457 of which belonged to human, 24 to mouse and 6 to rat. Of these records, 287 contained sufficient annotation for orientation determination. One hundred thirty-two were single-pass entries, of which 120 records were from humans, 11 from mouse, and 1 from rat. One hundred fifty-five were multi-pass entries containing 898 TMHs. One hundred forty-four were records from human, 8 were from mouse and 3 were from rat.

    UniGolgi

    The clustered human dataset was queried using UniProtKB for Golgi subcellular location (locations:(location:"Golgi apparatus [SL-0132]" evidence:manual)). This returned 323 protein entries, 301 of which belonged to human, 19 to mice, 2 to rat and 1 to pig. Of these records, 269 contained sufficient annotation for orientation determination. Two hundred six were single-pass entries, of which 195 records were from human, 9 from mouse, and 1 from rat. Sixty-one were multi-pass entries containing 383 transmembrane regions. Fifty-four were records from human, 6 were from mouse and 1 was from rat.

    UniPM

    The clustered human dataset was queried using UniProtKB for the cell membrane subcellular location (locations:(location:"Cell membrane [SL-0039]" evidence:manual)). This returned 1036 protein entries, 948 of which belonged to humans, 62 to mice, and 26 to rats. Of these records, 920 contained sufficient annotation for orientation determination. Four hundred ninety-three were single-pass entries, of which 451 records were from human, 37 from mouse, and 5 from rat. Four hundred twenty-seven were multi-pass entries containing 3079 transmembrane regions. Three hundred ninety-four were records from human, 17 were from mouse and 16 were from rat.

    UniCress

    For the mouse-ear cress, a representative proteome dataset was acquired with the query annotation:proteomes:(reference:yes) AND reviewed:yes AND organism:"Arabidopsis thaliana (Mouse-ear cress) [3702]" AND proteome:up000006548. This returned 3174 records in UniProtKB. UniRef90 identified 3111 clusters. Of the representative sequences, 3110 were mapped back to UniProtKB. Of those, 3090 were from Arabidopsis thaliana, 2 from Hornwort, 1 from cucumber, 1 from tall dodder, 1 from soybean (Glycine max), 2 from Indian wild rice, 2 from rice, 2 from garden pea, 1 from potato, 4 from spinach, 1 from Thermosynechococcus elongatus (thermophilic cyanobacterium), 1 from wheat, and 2 from maize. Of those there were 1146 with suitable TOPO_DOM annotation for topological orientation determination. Of those records, 632 were identified as single-pass, all of which were from Arabidopsis thaliana. Five hundred seven protein records were from multi-pass records, which contained 3823 TMHs. Five hundred six of those records were from Arabidopsis thaliana, whilst 1 was from Thermosynechococcus elongatus.

    UniFungi

    For the Fungi dataset, the query “annotation:(type:transmem) taxonomy:"Fungi [4751]" AND reviewed:yes” was used. This returned 5628 records that were submitted to UniRef90. UniRef90 identified 4934 representative records, all of which were successfully mapped back to UniProtKB. Of those, 2070 had suitable annotation for orientation. A total of 1990 records belonged to Ascomycota including 1243 Saccharomycetales. 73 were Basidiomycota, and 6 were Apansporoblastina. Seven hundred twenty-nine records contained a single TMH region, 702 of which belonged to Ascomycota, 26 to Basidiomycota and 1 to Encephalitozoon cuniculi, a Microsporidium parasite. There were 8698 helices contained in 1338 records of multi-pass proteins. Of these records, 1285 were Ascomycota, 47 were Basidiomycota, and 5 were Apansporoblastina. One TMH from UniFungi was discounted from P32897 due to an unknown position.

    UniEcoli

    This dataset was generated by querying UniProt with “reviewed:yes AND organism:"Escherichia coli (strain K12) [83333]"”, which returned 941 hits. The hits were submitted to UniRef90, which returned 935 clusters. The representative IDs were then resubmitted to UniProtKB, all of which returned successfully. Nine hundred thirty-four were from bacteria, whilst one was from lambdalike viruses. Of the bacterial records, 862 were from various Escherichia species, of which 565 were from E. coli strain K12, 28 were from Salmonella choleraesuis, 25 were from Shigella and the rest also fell under the Gammaproteobacteria class. This dataset contains 54 single-pass proteins and 3888 helices from 529 multi-pass proteins with sufficient annotation for topological determination.

    UniBacilli

    The Bacilli dataset was constructed by querying UniProt for “reviewed:yes AND taxonomy:"Bacilli"”. This returned 5044 records, which were submitted to UniRef90. There were 2591 clusters found in UniRef from these records. The representative IDs were successfully resubmitted to UniProtKB. Of these, 2031 were of the order Bacillales whilst 560 were of the order Lactobacillales. This dataset contains 124 single-pass proteins and 822 helices from 140 multi-pass proteins.

    UniArch

    The Archaea dataset was constructed by querying UniProt for “reviewed:yes AND taxonomy:"Archaea [2157]"”. This returned 1152 records, which were submitted to UniRef90. One thousand fifty-four clusters were found in UniRef from these records. The representative IDs were successfully resubmitted to UniProtKB. Nine hundred forty-six records belonged to the Euryarchaeota, 101 to Thermoprotei, 4 to Thaumarchaeota, and 3 to Korarchaeum cryptofilum. This dataset contains 48 single-pass proteins and 327 helices from 59 multi-pass proteins.

    We are aware that proteome datasets are “moving targets” that have dramatically changed over the years and probably will continue to do so to some extent in the future [83]. Yet, we think that currently available protein sequence sets are sufficiently good for our purposes, as we search for statistical properties in the TMH context only.

    On the determination of flanking regions for TMHs and the TMH alignment

    The determination of the boundary point at the sequence between the TMH in a membrane and the sequence immersed in the cytoplasm, extracellular space, vesicular lumen, etc. is not as trivial as it initially appears. TMH positioning is highly dynamic, and the actual boundary point will be represented by various residues at different time points. Whilst the TMH core region detection from a sequence is trivial with modern software, the exact determination of TMH boundaries remains difficult, since it is unclear exactly how far in or out of the membrane a given helix extends [84]. Previous studies have dealt with this issue in various ways [9, 13, 16, 85].

    In this work, we explore two boundary definitions. First, we assign TMH boundary locations as described in the respective databases. These flanks are the ones that are reported in our TMH data files that are available at http://mendel.bii.a-star.edu.sg/SEQUENCES/NNI/. We studied flank lengths of ±5, ±10, and ±20 residues preceding and following the inside and outside TMH boundaries. In these cases, the flanks are aligned relative to the residue closest to the TMH.

    In cases where the loops before and after the TMH are shorter than the predefined flank lengths, further precautions are necessary. In the multi-pass datasets particularly (Additional file 4: Figure S4, Additional file 3: Table S1), the flanks overlap with other membrane region flanks. We explore several variants. On the one hand, we work with data files where the flank residue stretches are equally truncated so that no overlap occurs. If the loop length was odd, the central residue was not included in any flank. We find, surprisingly, that a large number of TMHs have no or just a super-short flank, a circumstance that should disturb any statistical analysis due to the absence of objects. Therefore, we also work with alternative datasets: (1) with flanks overlapping between consecutive TMHs (e.g. in Table 3B, yet this leads to some residues being counted more than one time) as well as (2) with subsets of the data where the flanks at both sides have a defined minimal length (50% or 100% of the required flanks; unfortunately, some of them become too small for analysis).
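
    The equal-truncation variant can be sketched as below. This is a hypothetical helper written for illustration, assuming the loop between two consecutive TMHs is available as a string; it is not taken from the authors' pipeline.

```python
def loop_flanks(loop_seq, flank_len):
    """Share a loop between the two TMHs that border it.

    Returns (flank_after_first_tmh, flank_before_second_tmh). Each flank has
    at most `flank_len` residues. When the loop is too short for two full
    flanks, both are truncated equally so they never overlap; for an odd
    loop length the central residue ends up in neither flank.
    """
    n = min(flank_len, len(loop_seq) // 2)
    left = loop_seq[:n]
    right = loop_seq[len(loop_seq) - n:]
    return left, right


# A 7-residue loop with a requested flank of 5: residue "D" is left out.
print(loop_flanks("ABCDEFG", 5))
```

    Very short loops give very short (or empty) flanks under this rule, which is the sparsity problem the paragraph above describes.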

    The problem of flanks overlapping also affects some single-pass and multi-pass TMH proteins with INTRAMEM regions as described in some UniProt entries. We do not include INTRAMEM regions in the datasets as TMHs, but sometimes the flanking regions of TMHs were truncated to avoid overlap with INTRAMEM flanking regions (Additional file 5: Table S2). The identifiers affected for single-pass TMH proteins are Q01628, P13164, Q01629, Q5JRA8, A2ANU3 (UniHuman), P13164, Q01629, A2ANU3 (UniPM) and Q5JRA8 (UniER).

    The second form of boundary point definition for flank determination was achieved by gaplessly aligning all TMHs relative to their central residue, i.e. the residue at the position equal to half the length of the TMH from either end. Though there is some length variation amongst TMHs, most of them are centred around a length of 20–22 residues. In this case, flanks are the sequence extensions beyond the standardised-length 21-residue TMHs. We define the inside flanking segments as the positions –20 to –10 and the outside flanking regions to be +10 to +20 from the central TMH residue (with the label “0”). Instead of emphasising some artificially selected boundary residue, this definition allows the average TMH boundary transition to become apparent.
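
    A centre-anchored window of this kind might be extracted as in the sketch below. The function name, the 0-based inclusive indexing and the rounding rule for even-length TMHs are all assumptions made for this example; the sequence is assumed to be already oriented from inside to outside.

```python
def centred_window(sequence, tmh_start, tmh_end, span=20):
    """Return {relative_position: residue} around the TMH centre (label 0).

    tmh_start and tmh_end are 0-based inclusive indices into `sequence`.
    For even-length TMHs the centre is rounded down (an arbitrary choice).
    Positions that fall outside the sequence are simply absent.
    """
    centre = (tmh_start + tmh_end) // 2
    window = {}
    for rel in range(-span, span + 1):
        idx = centre + rel
        if 0 <= idx < len(sequence):
            window[rel] = sequence[idx]
    return window


seq = "".join(chr(65 + i % 26) for i in range(60))
w = centred_window(seq, 20, 40)
inside_flank = [w[r] for r in range(-20, -9)]   # positions -20 to -10
outside_flank = [w[r] for r in range(10, 21)]   # positions +10 to +20
print(len(inside_flank), len(outside_flank))
```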

    Separating simple and complex single-pass helices

    Single-pass helices from the ExpAll and UniHuman datasets were split into two groups, simple and complex, following a previously described classification [6, 7] to roughly distinguish simple hydrophobic anchors and TMHs with additional structural/functional roles. Simple and complex helices were determined using TMSOC [7]. The complexity class is determined by calculating the hydrophobicity and sequence entropy. The resulting coordinates cluster with anchors being more hydrophobic and less complex, whilst more complex and more polar TMHs are associated with non-anchorage functions. In UniHuman there were 889 simple helices and 570 complex TMHs. In ExpAll there were 769 simple helices and 570 complex helices.

    Distribution normalisation

    In this work, we have used normalisation techniques described in previous investigations as well as new approaches designed to more sensitively identify biases of rare residues. Baeza-Delgado and co-workers used LogOdds normalisation column-wise in TMH alignments. Critically, this is based on their definition of probability, which takes into account the total number of amino acids in the dataset as a denominator [13]. Since aliphatic residues such as leucine and other highly abundant slightly polar residues dominate the denominator, the distribution of the rare acidic residues will be easily lost in the “background noise” of those highly abundant residues. Pogozheva and co-workers used two approaches, (1) the total accessible surface area (ASA total) and (2) the total number of charged residues (N total), as a denominator in their distribution normalisation [16].

    In this work, two methods for measuring residue occurrence in the TMH and its flanks were used. As in previous work, we compute the occurrence a_{i,r} of an amino acid type i at a certain sequence position r in a set of aligned sequences of TMHs and their flanks. Following [9], the absolute relative occurrence p_{i,r} of this amino acid type at the sequence position r is then given by Eq. (1) as:
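
    A reconstruction from the definitions given in the surrounding text, assuming no additional constant scaling factor (such as a conversion to percent); the column index r' and the residue index j are notation introduced here:

```latex
p_{i,r} = \frac{a_{i,r}}{\max_{r'} \sum_{j} a_{j,r'}} \qquad (1)
```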

    Here, the denominator is the maximal number of all residues in any alignment column (i.e., the number of sequences in the alignment) and, to emphasise, this will make p_{i,r} mostly dependent on the most abundant residue types. This type of normalisation reveals the most preferred residue types at given sequence positions.

    Our second normalisation method is independent of the abundance of any amino acid types other than the studied one, and it answers the question: If there is a residue of type i in the TMH-containing segment, where would it most likely be? This relative occurrence q_{i,r} is calculated in Eq. (2) as:
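
    A reconstruction from the definitions given in the surrounding text, again assuming no additional constant scaling factor:

```latex
q_{i,r} = \frac{a_{i,r}}{a_i}, \qquad a_i = \sum_{r} a_{i,r} \qquad (2)
```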

    The value a_i is the total abundance of residues of just amino acid type i in a given alignment of TMH-containing segments (i.e., in the TMH together with its two adjoining flanks summed over all cases of TMHs in the given dataset). Peaks in q_{i,r} as a function of r reveal the preferred positions of residues of type i. The difference in p_{i,r} and q_{i,r} normalisation is visualised in Additional file 6: Figure S3.
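
    The two normalisations can be contrasted on a toy alignment. The sketch below follows the verbal definitions above and is not the authors' implementation: the function names are hypothetical, sequences are assumed to be equal-length strings with "-" marking an empty position, and no percentage scaling is applied.

```python
from collections import Counter


def occurrence_counts(aligned):
    """One Counter per alignment column r, giving a_{i,r} for each residue i."""
    n_cols = len(aligned[0])
    return [Counter(seq[r] for seq in aligned if seq[r] != "-")
            for r in range(n_cols)]


def absolute_relative_occurrence(aligned, residue):
    """p_{i,r}: count divided by the largest total of any alignment column."""
    counts = occurrence_counts(aligned)
    denom = max(sum(col.values()) for col in counts)
    return [col[residue] / denom for col in counts]


def relative_occurrence(aligned, residue):
    """q_{i,r}: count divided by the total abundance a_i of that residue type."""
    counts = occurrence_counts(aligned)
    total = sum(col[residue] for col in counts)
    return [col[residue] / total if total else 0.0 for col in counts]


aligned = ["KLLD", "RLLE", "KLAD", "-LLD"]
print(absolute_relative_occurrence(aligned, "D"))  # [0.0, 0.0, 0.0, 0.75]
print(relative_occurrence(aligned, "D"))           # [0.0, 0.0, 0.0, 1.0]
```

    In the toy data, aspartate is a minority residue overall, so its p value in the last column is pulled down by the column size, whereas its q value shows that every aspartate sits in that column.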

    Hydrophobicity calculations

    Hydrophobicity profiles were calculated using the Kyte and Doolittle hydrophobicity scale [52] and validated with the Eisenberg scale [54], the Hessa biological scale [36] and the White and Wimley whole residue scale [53] (Additional file 1: Figure S1). The hydrophobicity profile uses un-weighted windowing of the residue hydrophobicity scores from end to end of the TMD slice. Three residues were used as full window lengths, and partial windows were permitted.
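
    A windowed profile of this kind could be computed as in the sketch below. The scale values are the standard Kyte and Doolittle hydropathy indices; the function name and the choice of a window centred on each residue are assumptions for this example, with "partial windows" interpreted as shorter windows at the two ends of the slice.

```python
# Kyte and Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(seq, window=3):
    """Unweighted moving average of KD scores, one value per residue.

    The window is centred on each residue; at the ends of the sequence the
    window is truncated (a partial window) and averaged over fewer residues.
    """
    half = window // 2
    profile = []
    for i in range(len(seq)):
        lo = max(0, i - half)
        hi = min(len(seq), i + half + 1)
        scores = [KD[aa] for aa in seq[lo:hi]]
        profile.append(sum(scores) / len(scores))
    return profile


print([round(v, 2) for v in hydrophobicity_profile("KAILLD")])
```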

    Normalised net charge calculations

    Charge was calculated at each position by scanning through each position of the TMHs and flanking regions and subtracting one from the position if an acidic residue (D or E) was present, or adding one if a positively charged residue (K or R) was present. The accumulative net charge c_r was then divided by the total number N of TMHs that were used in calculating the accumulative net charge. Thus, the charge distribution is calculated by:
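
    Following the description above, the normalised value at each position is c_r / N. A minimal Python sketch of that calculation is given here; the function name is hypothetical and the input is assumed to be a list of equal-length aligned strings, with any non-charged character (including a gap marker) contributing zero.

```python
def normalised_net_charge(aligned):
    """Per-column net charge (K/R = +1, D/E = -1) divided by the number of TMHs."""
    n = len(aligned)
    charges = []
    for r in range(len(aligned[0])):
        c_r = 0  # accumulative net charge at position r
        for seq in aligned:
            if seq[r] in "KR":
                c_r += 1
            elif seq[r] in "DE":
                c_r -= 1
        charges.append(c_r / n)
    return charges


aligned = ["KLLD", "RLLE", "KLAD", "-LLD"]
print(normalised_net_charge(aligned))  # [0.75, 0.0, 0.0, -1.0]
```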

    Statistics

    The inside/outside bias of negative residues was quantified using the independent Kruskal-Wallis (KW) test and the two-sample t test from the Python scipy.stats package v0.15 (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kruskal.html, https://docs.scipy.org/doc/scipy-0.15.0/reference/generated/scipy.stats.ttest_ind.html). The t test answers the question of whether two means are actually different in the statistical sense. For the leucine residues, each TMH region was divided into two sections, representing the inner and outer leaflets (Table 4). For the hydrophobicity plot, three window values of hydrophobicity were taken for each TMH at each position. The statistical analyses were separately performed for single-pass and multi-pass transmembrane proteins. At each position, the two groups were compared using the KW test.

    The null hypothesis of homogeneity of two distributions was examined with the Kolmogorov-Smirnov (KS), the KW and the χ² statistical tests. The KS test scrutinises for significant maximal absolute differences between distribution curves, the KW test looks for skews between distributions and the χ² statistical test checks the average difference between distributions. As the statistical significance value (P value) is a strong function of N, the total amount of data used in the statistical test, we rely on the (absolute) Bahadur slope (B) as a measure of distance between two distributions [55,56,57]:
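
    Assuming the commonly used definition of the absolute Bahadur slope, with P the P value of the test and N the total amount of data as defined above, this would read:

```latex
B = \frac{\left| \ln P \right|}{N}
```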

    The larger the absolute Bahadur slope, the greater the difference between the two distributions.

