DNA sequence from the middle of a gene

Someone gives you a short DNA sequence that comes from the middle of a gene.


From this sequence, determine the following:

  1. Is the promoter located to the left or right as the sequence is written?

  2. Is the sense strand the top or bottom strand?

  3. What amino acids are encoded by this gene fragment?

The only thing I have been able to come up with is that, since this comes from the middle of a gene, I need to choose a reading frame with no stop codons.




AGA TTG ACT AAT CG <<< this is the ORF



If the sequence comes from the middle of a gene, we assume it lies within an open reading frame. For this sequence, only one of the six possible reading frames contains no stop codon (the frame marked above). So in standard format, with the promoter to the left, we can write the double-stranded sequence as:



Since this is homework I'll leave the rest to you.
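The frame check described in this answer can be sketched in a few lines of Python. This is only an illustration: the function names (`reverse_complement`, `codons`, `open_frames`) and the frame labels are my own choices, and the input is just the 14 bases shown above, since the full double-stranded sequence is not given here.

```python
# Scan all six reading frames of a DNA fragment for stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def codons(seq, offset):
    """Split seq into complete codons, starting at offset 0, 1 or 2."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def open_frames(seq):
    """Return labels of the frames that contain no stop codon.

    '+1'..'+3' are frames on the strand as written,
    '-1'..'-3' are frames on its reverse complement.
    """
    seq = seq.upper().replace(" ", "")
    result = []
    for sign, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            if not STOP_CODONS & set(codons(strand, offset)):
                result.append(f"{sign}{offset + 1}")
    return result


fragment = "AGA TTG ACT AAT CG"  # the 14 bases shown above
print(open_frames(fragment))
```

Each frame of a 14-base fragment holds only four complete codons, so a frame without a stop codon is suggestive rather than conclusive.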

From the sequence alone, you can answer none of these questions because:

  • from the sequence alone you don't know anything about the gene or the promoter.
  • the same is true for the orientation
  • and the codons, since you don't know whether the fragment is in frame. If even one base has been cut off the original sequence, the codons shift and no longer show the original coding. The sequence has 14 nucleotides, which is not a multiple of three, so it does not resolve cleanly into a short amino acid sequence.
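To see the frame-shift point concretely, here is a small sketch that reads the same 14 bases from three different starting positions. The helper `split_codons` is a made-up name for illustration, and the fragment is the one quoted in the other answer.

```python
def split_codons(seq, offset=0):
    """Return (complete codons, leftover bases) when reading from offset."""
    body = seq[offset:]
    usable = len(body) - len(body) % 3  # largest multiple of three that fits
    triplets = [body[i:i + 3] for i in range(0, usable, 3)]
    return triplets, body[usable:]


fragment = "AGATTGACTAATCG"  # 14 bases, so at least one reading leaves a remainder
for offset in range(3):
    triplets, leftover = split_codons(fragment, offset)
    print(offset, triplets, repr(leftover))
```

Dropping one or two bases from the start changes every codon downstream, which is why the frame has to be known before the fragment can be translated.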

You can probably identify the gene using BLAST, then see where the sequence is located and answer the questions. I tried to BLAST the sequence, but it is too short to give a definitive answer.

So the first thing you need to do is identify the sequence (the gene and the organism it comes from); then you can do the rest of the work.

Lecture Summaries

Lee et al. describe their discovery of a genetic interaction between a C. elegans protein coding gene and a non-coding RNA (a microRNA) called lin-4. Their experiments demonstrate that lin-4 represses expression of lin-14 protein. They characterize the small RNA product of lin-4 and, by analyzing its sequence, are able to infer that it binds lin-14 mRNA via an antisense RNA-RNA interaction: basepairing between the miRNA and complementary sequence in the mRNA (a miRNA binding site). This finding briefly brings us back to the Operon paper, because regulation at the RNA level by miRNAs fits into Scenario II hypothesized by Jacob and Monod.
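The idea of an antisense RNA-RNA interaction can be illustrated with a toy search for perfectly complementary sites. This is a minimal sketch, not the analysis used in the paper: both sequences below are invented for illustration, the function names are hypothetical, and only strict Watson-Crick pairing is considered (G:U wobble pairs and partial matches are ignored).

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_target(small_rna):
    """Sequence (5' to 3') an mRNA must contain to pair fully with small_rna."""
    return small_rna.translate(COMPLEMENT)[::-1]


def find_sites(mrna, small_rna):
    """Return 0-based start positions of perfectly complementary sites in mrna."""
    site = antisense_target(small_rna)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)  # allow overlapping sites
    return positions


# Invented sequences, for illustration only.
small_rna = "AUGGCUCA"
mrna = "GGAAUGAGCCAUCCUUUGAGCCAUAA"
print(antisense_target(small_rna))
print(find_sites(mrna, small_rna))
```

Real miRNA binding sites are usually only partially complementary, so a practical search would have to tolerate mismatches.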

From the first paper, we learn how Fire et al. systematically tested the requirements for RNAi in C. elegans, where it is potent and systemic (spreading through the animal). By carefully varying the experimental conditions and exploiting genetics, the authors were able to correctly infer the existence of catalytic activity and amplification in the RNAi pathway of the animal. Many years later, Jenal et al. used thousands of artificial small interfering RNAs (siRNAs) to identify human genes that affect alternative polyadenylation, a mechanism that creates mRNA isoforms with different 3' UTRs from the same gene. This approach demonstrates the utility of siRNA-based screens to identify the molecular components of a cellular pathway. We will discuss the potential consequences of alternative 3' ends for post-transcriptional gene regulation based on what we have learned about miRNAs.

Kristjánsdóttir et al. “disassembled” a long 3' UTR into short fragments and tested each for its effects on protein production using a fluorescent reporter assay. This approach allowed the identification of a large number of functional elements in the Hmga2 3' UTR and revealed that these elements act largely independently of one another. Next we learn about the discovery that 3' UTR choice can alter not only the amount but also the function of the protein. By combining carefully designed reporter constructs with microscopy, flow cytometry, and RNA interference, Berkovits and Mayr were able to illuminate a chain of molecular events that switch the localization and function of CD47 protein, depending on whether it is translated from an mRNA isoform harboring the short or the long 3' UTR (with identical coding sequences).

First we will look at piwi-interacting RNAs (piRNAs). Brennecke et al. sequenced small RNAs from Drosophila to study piRNAs, distant relatives of miRNAs and siRNAs. piRNAs are highly expressed in germ cells. The authors found that piRNAs originate from broken copies of transposons, inserted at specific sites within the genome. Using fly genetics, they demonstrated that a few of these sites are “master regulators” of transposon activity and elucidated the mechanism by which piRNAs form an adaptive immune system that can keep transposons in check.

We will briefly discuss the history of how (endogenous) circular RNAs (circRNAs) were discovered in other organisms, including mammals. While many circRNAs could be inconsequential by-products of splicing, a few are now known to have a biological function. Piwecka et al. made mice without Cdr1as, a circRNA with high expression in the brain. Interestingly, they find very specific alterations of miRNA function and behavior. In addition, we have seen that RNA molecules have the ability to silence expression at the level of RNA. From Chaumeil et al. we will discuss how the long non-coding RNA (lncRNA) Xist takes this repression to the level of DNA by repressing gene expression across almost an entire chromosome. By acting as a scaffold for transcriptional repressors and chromatin-modifying enzymes, Xist is able to efficiently shut off transcription of many genes at once.

Problem: If you compared the DNA sequence of a gene with the sequence of the mature mRNA that was transcribed from the gene, you would find:

  a. The mRNA is shorter because it does not contain exons
  b. Both are the same length
  c. The mRNA is shorter because it does not contain introns
  d. The mRNA is shorter because each codon of three bases encodes only one amino acid
  e. The mRNA is longer because each codon of one amino acid encodes three bases

What scientific concept do you need to know in order to solve this problem?

To solve this problem, you need to apply the concept of eukaryotic RNA processing and splicing.



So, what is all the non-coding DNA doing there? We know that even coding regions in our DNA are interrupted by non-coding sequences called introns. This is true of most eukaryotic genomes. An examination of genes in eukaryotes shows that non-coding intron sequences can be much longer than the coding sections of the gene, or exons. Most exons are relatively small and code for fewer than a hundred amino acids, while introns can vary in size from several hundred base pairs to many kilobase pairs (thousands of base pairs) in length. For many human genes, there is much more intron sequence than coding (a.k.a. exon) sequence. Intron sequences account for roughly a quarter of the genome in humans.
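As a rough illustration of why a spliced mRNA is shorter than the gene it came from, the sketch below joins the exons of a toy transcript and compares lengths. Everything here is hypothetical: the sequence, the exon coordinates, and the `splice` helper are invented, and the 5' cap and poly(A) tail that a real mature mRNA carries are ignored.

```python
def splice(pre_mrna, exons):
    """Join exon slices (0-based, end-exclusive) in order, dropping the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy transcript: exons in upper case, the intron in lower case.
pre_mrna = "AUGGCC" + "guaaguuuuucag" + "GAUUAA"
exons = [(0, 6), (19, 25)]

mature = splice(pre_mrna, exons)
intron_bases = len(pre_mrna) - len(mature)

print(mature)
print(f"pre-mRNA: {len(pre_mrna)} nt, spliced: {len(mature)} nt, "
      f"intron: {intron_bases} nt")
```

Even in this tiny example the intron is longer than both exons combined, which mirrors the pattern described for many human genes.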

Terms and Concepts

  • Genes
  • DNA
  • Mutation
  • Genetic disease
  • Nucleotides
  • RNA
  • Transcription
  • Translation
  • Amino acids
  • Codon
  • Hydrophilic
  • Hydrophobic
  • Allele


  • How does a gene become a protein?
  • In a given gene, what kind of DNA mutation would not change the protein that is made?
  • What makes some amino acids hydrophobic and others hydrophilic?
  • How common are mutations in the human genome? Is it very likely or very unlikely that your DNA carries any mutations?
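One way to explore the question about mutations that leave the protein unchanged is to compare codons before and after a change. The sketch below uses only a four-codon excerpt of the standard genetic code, and the `classify` helper is a made-up name for illustration.

```python
# Excerpt of the standard genetic code (DNA codons); not a complete table.
CODON_TABLE = {
    "GAA": "Glu",
    "GAG": "Glu",
    "GAT": "Asp",
    "GTG": "Val",
}


def classify(before, after):
    """Say whether changing one codon into another alters the amino acid."""
    aa_before = CODON_TABLE[before]
    aa_after = CODON_TABLE[after]
    if aa_before == aa_after:
        return f"silent: {aa_before} unchanged"
    return f"missense: {aa_before} -> {aa_after}"


print(classify("GAA", "GAG"))  # third base changes, same amino acid
print(classify("GAG", "GTG"))  # second base changes, different amino acid
```

Because several codons can encode the same amino acid, a change in the third base of a codon often leaves the protein sequence untouched.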

The P450 Superfamily: Update on New Sequences, Gene Mapping, Accession Numbers, Early Trivial Names of Enzymes, and Nomenclature

We provide here a list of 221 P450 genes and 12 putative pseudogenes that have been characterized as of December 14, 1992. These genes have been described in 31 eukaryotes (including 11 mammalian and 3 plant species) and 11 prokaryotes. Of 36 gene families so far described, 12 families exist in all mammals examined to date. These 12 families comprise 22 mammalian subfamilies, of which 17 and 15 have been mapped in the human and mouse genome, respectively. To date, each subfamily appears to represent a cluster of tightly linked genes. This revision supersedes the previous updates [Nebert et al., DNA 6, 1–11, 1987; Nebert et al., DNA 8, 1–13, 1989; Nebert et al., DNA Cell Biol. 10, 1–14, 1991] in which a nomenclature system, based on divergent evolution of the superfamily, has been described. For the gene and cDNA, we recommend that the italicized root symbol "CYP" for human ("Cyp" for mouse), representing "cytochrome P450," be followed by an Arabic number denoting the family, a letter designating the subfamily (when two or more exist), and an Arabic numeral representing the individual gene within the subfamily. A hyphen should precede the final number in mouse genes. "P" ("p" in mouse) after the gene number denotes a pseudogene. If a gene is the sole member of a family, the subfamily letter and gene number need not be included. We suggest that the human nomenclature system be used for all species other than mouse. The mRNA and enzyme in all species (including mouse) should include all capital letters, without italics or hyphens. This nomenclature system is identical to that proposed in our 1991 update.
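The naming rules above can be sketched as a small formatter. This is an illustration under stated assumptions, not an official tool: the function names are hypothetical, the lower-case subfamily letter for mouse is an assumption not spelled out in the text, and italics cannot be represented in a plain string.

```python
def p450_gene_symbol(family, subfamily=None, gene=None,
                     mouse=False, pseudogene=False):
    """Build a gene/cDNA symbol from family, subfamily and gene number."""
    symbol = ("Cyp" if mouse else "CYP") + str(family)
    if subfamily is not None:
        # Assumption: mouse symbols use a lower-case subfamily letter.
        symbol += subfamily.lower() if mouse else subfamily.upper()
    if gene is not None:
        # A hyphen precedes the final number in mouse genes only.
        symbol += f"-{gene}" if mouse else str(gene)
    if pseudogene:
        symbol += "p" if mouse else "P"
    return symbol


def p450_product_name(family, subfamily=None, gene=None):
    """mRNA/enzyme name: all capitals, no hyphen, for every species."""
    return p450_gene_symbol(family, subfamily, gene).upper()


print(p450_gene_symbol(1, "A", 1))                   # human gene
print(p450_gene_symbol(1, "A", 1, mouse=True))       # mouse gene
print(p450_gene_symbol(19))                          # sole member of a family
print(p450_gene_symbol(2, "D", 7, pseudogene=True))  # pseudogene
print(p450_product_name(1, "A", 1))                  # mRNA or enzyme
```

The argument values in the calls are examples only and are not taken from the gene list in the update.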

Also included in this update is a listing of available database accession numbers for P450 DNA and protein sequences. We also discuss the likelihood that this ancient gene superfamily has existed for more than 3.5 billion years, and that the rate of P450 gene evolution appears to be quite nonlinear. Finally, we describe P450 genes that have been detected by expressed sequence tags (ESTs), as well as the relationship between the P450 and the nitric oxide synthase gene superfamilies, as a likely example of convergent evolution.



New Technologies, Future Weapons: Gene Sequencing and Synthetic Biology

Since the completion of the human genome project in 2003, there has been a surge of investment and discovery in both the gene sequencing and synthetic biology sectors of biotechnology. While the information contained in genome databases is not inherently dangerous, it can be used for destructive purposes. With synthesis technology becoming less expensive, more accurate, and faster every year, it is foreseeable that by 2020 malefactors will have the ability to manipulate genomes in order to engineer new bioterrorism weapons.

With every technological advancement come new national security risks. Without a clear understanding of the actual risks associated with synthetic biology, the U.S. is in danger of responding to fears with overregulation. To create regulation that fits the technology, the U.S. should fund risk assessments on the impact of synthesis and sequencing—giving policymakers a better idea of where the highest likelihood of terrorism lies. Simultaneously, and to continue leading the biotechnology revolution, the U.S. also needs to provide federal funding for synthetic biology and gene sequencing research. These steps, coupled with a strong strategy for bioterrorism that confronts issues of prevention and response/surge capacity, would allow America to reap the rewards of these emerging technologies while avoiding many of their attendant perils.

Select Agent Classifications Are No Longer Effective

In the past, one way that government agencies combated bioterrorism was by restricting access to the pathogens themselves. For instance, the Centers for Disease Control and the Department of Agriculture have worked together to regulate the laboratory use of “select agents” (pathogens and biological agents that have the potential to pose a severe threat to public health, such as the Ebola virus). But with the advent of DNA synthesis technology, simply restricting access to the actual pathogen no longer provides the security that it once did. Since the gene sequence is a blueprint, once an organism has been sequenced it can be synthesized without using samples of existing cultures or stock DNA.

In today’s market it costs just a few thousand dollars to design a custom DNA sequence, order it from a manufacturer, and within a few weeks receive the DNA in the mail. Since select agents are currently not defined by their DNA sequences, terrorists can actually order subsets of select agent DNA and assemble them to create entire pathogens. The possibility for attack by a bioterrorism weapon containing a select agent will be greater in the future as synthesis technology continues to advance.

New Restrictions and Regulations?

Since terrorists would not be able to fabricate select agents without access to the requisite genomes, it seems at first glance that restricting access to genomic databases could ameliorate much of the problem. In actuality, the gene databases are a fundamental tool for researchers. Future advances in gene sequencing and synthesis would be severely hindered by government regulation of these databases. No other area of life science depends as much on online databases. In fact, the gene sequencing and DNA synthesis fields are so database-driven that most scientific journals require genome data to be deposited into these databases as a prerequisite for publication.

Moreover, the full genetic sequences of many select agents and other pathogens (smallpox, botulism, anthrax) are already in Internet-accessible databases that currently mandate free, unfettered, and anonymous access. Once a genome has been released onto the Web, it makes little sense to restrict future publication of that genome. (Posting to the Internet is easy; removing all copies of a post is a near-impossible feat.)

Regulation Tailored to the Risks

Overregulation has a negative effect on research, while under-regulation would undoubtedly expose the U.S. to national security risks. Federal agencies such as the NIH and the NSF may be best suited to conduct ongoing risk assessments for synthetic biology and gene sequencing technologies.

As the field develops, regulations should be updated so that they can be narrowly tailored to fit the actual risks—thereby impacting future research as little as possible. In addition, independent committees of industry leaders, agency officials, and academics should be appointed to create regulations based on these risk assessments.

As the world’s leader in biotechnology research, the U.S. is currently in an excellent position. However, other nations are beginning to catch up. Around the world, industry and universities alike are working to decode the genetic makeup of thousands of organisms to discover which genes are responsible for what diseases and to create technologies that perform gene sequencing and DNA synthesis faster and more accurately than ever before.

Domestic researchers need to have the funding to develop the next generation of countermeasures for genetically engineered pathogens. Without favorable legislation such as tax breaks, biotechnology companies may begin moving overseas. And without federal funding, top scientists would be unable to perform the fundamental research that will fuel the next stage of synthesis and sequencing technologies. If the U.S. is not far ahead of other nations in its research, it runs a higher risk of being susceptible to attack.

Detecting Synthetic Pathogens

As synthesis and sequencing technologies continue to advance, it will become easier and easier for rogue individuals or bioterrorists to leak man-made pathogens into water and food supplies. To mitigate this risk, the U.S. should promote research into the areas most prone to attack. Next steps should include:

  • Conducting risk assessments. Without adequate understanding of the risks involved in any technological field, the government may overregulate and stifle scientific and technological progress.
  • Investing in biotechnology. Other nations recognize the potential of synthetic biology. If the U.S. does not continue to invest in synthetic biology, it will technologically fall behind other countries and run a higher risk of being subjected to a bio-weapons attack.
  • Moving forward with WMD Commission recommendations. In 2008, Congress created the Commission on the Prevention of WMD Proliferation and Terrorism to study the “risk of WMD terrorism” and to recommend “steps that could be taken to prevent a successful attack on the United States.” As part of this work, the commission took a close look at the threat of bioterrorism and recommended key changes that should be made to both counter and respond to such an act of terrorism. The commission focused both on surge capacity for first responders in the event of an attack and on preventing would-be terrorists from gaining access to biological agents.

These recommendations would help ensure that the U.S. remains protected while working on the cutting edge of biotechnology.

Bio-Specific Strategy Needed

The WMD Commission has emphasized the need for a “bio-specific strategy” in terms of preventing acts of bioterrorism. The U.S., however, has significant work to do in terms of developing this strategy in a way that is representative of the risk of bioterrorism, respects legitimate uses of biological agents, and prepares the nation if such a disaster strikes.

Ethel Machi is an independent science and technology consultant and Jena Baker McNeill is Policy Analyst for Homeland Security in the Douglas and Sarah Allison Center for Foreign Policy Studies, a division of the Kathryn and Shelby Cullom Davis Institute for International Studies, at The Heritage Foundation.

Tool Time

If epigenetic work is to continue breaking new ground, many observers say technology will need to continue advancing. Jones and Martienssen note in their paper that there must be additional improvements in high-throughput technologies, analytical techniques, computational capability, mechanistic studies, and bioinformatic strategies. They also say there is a need for basics such as standardized reagents and a consistent supply of antibodies for testing.

Preston agrees with many of these ideas, and says there is also a need to develop a comprehensive tally of all proteins in the cell and to get better protein modification information. He says universities are recognizing the demand for the talents needed to solve epigenomics problems, and are increasing their efforts to cover these topics in various ways, especially at the graduate school level.

Other groups are doing their part by creating tools to further the field. All the imprinted genes identified so far are tracked in complementary efforts by Morison’s and Jirtle’s groups and the Mammalian Genetics Unit of the U.K. Medical Research Council. The European managers of the DNA Methylation Database have assembled a compendium of known DNA methylations that, although not comprehensive, still provides a useful tool for researchers investigating the roughly 22,000 human genes.

Kunio Shiota, a professor of cellular biochemistry at the University of Tokyo and one of the co-organizers of the November 2005 Tokyo conference, says epigenetic advances will rely in part on a range of processes that are slowly becoming familiar to more researchers—massively parallel signature sequencing (MPSS), chromatin immunoprecipitation microarray analysis (ChIP-chip), DNA adenine methyltransferase identification (Dam-ID), protein binding microarrays (PBM), DNA immunoprecipitation microarray analysis (DIP-chip), and more. Someday, he says, these terms could become fully as familiar as MRI and EKG.

The rapidly growing acceptance of epigenetics, a century after it first surfaced, is a huge step forward, in Jirtle’s opinion. “We’ve done virtually nothing so far,” he says. “I’m biased, but the tip of the iceberg is genomics and single-nucleotide polymorphisms. The bottom of the iceberg is epigenetics.”

Biology 171

Each somatic cell in the body generally contains the same DNA. A few exceptions include red blood cells, which contain no DNA in their mature state, and some immune system cells that rearrange their DNA while producing antibodies. In general, however, the genes that determine whether you have green eyes, brown hair, and how fast you metabolize food are the same in the cells in your eyes and your liver, even though these organs function quite differently. If each cell has the same DNA, how is it that cells or organs are different? Why do cells in the eye differ so dramatically from cells in the liver?

Whereas each cell shares the same genome and DNA sequence, each cell does not turn on, or express, the same set of genes. Each cell type needs a different set of proteins to perform its function. Therefore, only a small subset of proteins is expressed in a cell. For the proteins to be expressed, the DNA must be transcribed into RNA and the RNA must be translated into protein. In a given cell type, not all genes encoded in the DNA are transcribed into RNA or translated into protein because specific cells in our body have specific functions. Specialized proteins that make up the eye (iris, lens, and cornea) are only expressed in the eye, whereas the specialized proteins in the heart (pacemaker cells, heart muscle, and valves) are only expressed in the heart. At any given time, only a subset of all of the genes encoded by our DNA are expressed and translated into proteins. The expression of specific genes is a highly regulated process with many levels and stages of control. This complexity ensures the proper expression in the proper cell at the proper time.
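The two steps named above, transcription of DNA into RNA and translation of RNA into protein, can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool: it assumes the input is the coding (sense) strand written 5' to 3', that reading starts at the first base, and that the standard genetic code applies. The function names `transcribe` and `translate` are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon from position 0 until a stop codon.

    A trailing partial codon (one or two leftover bases) is ignored.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

A cell that never transcribes a given gene never reaches the `translate` step for it, which is the sense in which two cells with the same DNA can contain different proteins.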

Learning Objectives

By the end of this section, you will be able to do the following:

  • Discuss why every cell does not express all of its genes all of the time
  • Describe how prokaryotic gene regulation occurs at the transcriptional level
  • Discuss how eukaryotic gene regulation occurs at the epigenetic, transcriptional, post-transcriptional, translational, and post-translational levels

For a cell to function properly, necessary proteins must be synthesized at the proper time and place. All cells control or regulate the synthesis of proteins from information encoded in their DNA. The process of turning on a gene to produce RNA and protein is called gene expression. Whether in a simple unicellular organism or a complex multi-cellular organism, each cell controls when and how its genes are expressed. For this to occur, there must be internal chemical mechanisms that control when a gene is expressed to make RNA and protein, how much of the protein is made, and when it is time to stop making that protein because it is no longer needed.

The regulation of gene expression conserves energy and space. It would require a significant amount of energy for an organism to express every gene at all times, so it is more energy efficient to turn on the genes only when they are required. In addition, only expressing a subset of genes in each cell saves space because DNA must be unwound from its tightly coiled structure before it can be transcribed. Cells would have to be enormous if every protein were expressed in every cell all the time.

The control of gene expression is extremely complex. Malfunctions in this process are detrimental to the cell and can lead to the development of many diseases, including cancer.

Prokaryotic versus Eukaryotic Gene Expression

To understand how gene expression is regulated, we must first understand how a gene codes for a functional protein in a cell. The process occurs in both prokaryotic and eukaryotic cells, just in slightly different manners.

Prokaryotic organisms are single-celled organisms that lack a cell nucleus, and their DNA therefore floats freely in the cell cytoplasm. To synthesize a protein, the processes of transcription and translation occur almost simultaneously. When the resulting protein is no longer needed, transcription stops. As a result, the primary method to control what type of protein and how much of each protein is expressed in a prokaryotic cell is the regulation of DNA transcription. All of the subsequent steps occur automatically. When more protein is required, more transcription occurs. Therefore, in prokaryotic cells, the control of gene expression is mostly at the transcriptional level.

Eukaryotic cells, in contrast, have intracellular organelles that add to their complexity. In eukaryotic cells, the DNA is contained inside the cell’s nucleus, where it is transcribed into RNA. The newly synthesized RNA is then transported out of the nucleus into the cytoplasm, where ribosomes translate the RNA into protein. The processes of transcription and translation are physically separated by the nuclear membrane: transcription occurs only within the nucleus, and translation occurs only outside the nucleus in the cytoplasm. The regulation of gene expression can occur at all stages of the process (Figure). Regulation may occur when the DNA is uncoiled and loosened from nucleosomes to bind transcription factors (epigenetic level), when the RNA is transcribed (transcriptional level), when the RNA is processed and exported to the cytoplasm after it is transcribed (post-transcriptional level), when the RNA is translated into protein (translational level), or after the protein has been made (post-translational level).
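One way to picture these five control points is as a chain in which each level lets through some fraction of the maximum possible output. The sketch below is a toy model under that assumption only; real regulation is not simply multiplicative, and the function name `relative_output` and the example fractions are hypothetical.

```python
from math import prod

# The five control points, in the order the text lists them.
LEVELS = (
    "epigenetic",
    "transcriptional",
    "post-transcriptional",
    "translational",
    "post-translational",
)


def relative_output(factors):
    """Multiply the fraction passed at each level (1.0 = no restriction).

    `factors` maps a level name to a fraction between 0 and 1.
    Levels that are not mentioned are treated as unrestricted.
    """
    unknown = set(factors) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown regulation level(s): {sorted(unknown)}")
    return prod(factors.get(level, 1.0) for level in LEVELS)


# A gene whose chromatin is closed yields nothing, whatever happens later.
print(relative_output({"epigenetic": 0.0, "translational": 0.5}))
# Halving output at two separate levels leaves one quarter of the maximum.
print(relative_output({"transcriptional": 0.5, "translational": 0.5}))
```

In this picture a prokaryotic cell mostly adjusts the transcriptional factor, while a eukaryotic cell can adjust any of the five.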

The differences in the regulation of gene expression between prokaryotes and eukaryotes are summarized in the table below. The regulation of gene expression is discussed in detail in subsequent modules.

Differences in the Regulation of Gene Expression of Prokaryotic and Eukaryotic Organisms
Prokaryotic organisms | Eukaryotic organisms
Lack a membrane-bound nucleus | Contain a nucleus
DNA is found in the cytoplasm | DNA is confined to the nuclear compartment
RNA transcription and protein formation occur almost simultaneously | RNA transcription occurs prior to protein formation, and it takes place in the nucleus. Translation of RNA to protein occurs in the cytoplasm.
Gene expression is regulated primarily at the transcriptional level | Gene expression is regulated at many levels (epigenetic, transcriptional, nuclear shuttling, post-transcriptional, translational, and post-translational)

Prokaryotic cells regulate gene expression mainly by controlling the amount of transcription. As eukaryotic cells evolved, the complexity of the control of gene expression increased. For example, with the evolution of eukaryotic cells came compartmentalization of important cellular components and cellular processes. A nuclear region that contains the DNA was formed. Transcription and translation were physically separated into two different cellular compartments. It therefore became possible to control gene expression by regulating transcription in the nucleus, and also by controlling the RNA levels and protein translation outside the nucleus.

Most gene regulation is done to conserve cell resources. However, other regulatory processes may be defensive. Cellular processes such as gene silencing developed to protect the cell from viral or parasitic infections. If the cell could quickly shut off gene expression for a short period of time, it would be able to survive an infection when other organisms could not. Therefore, the organism evolved a new process that helped it survive, and it was able to pass this new development to offspring.

Section Summary

While all somatic cells within an organism contain the same DNA, not all cells within that organism express the same proteins. Prokaryotic organisms express most of their genes most of the time. However, some genes are expressed only when they are needed. Eukaryotic organisms, on the other hand, express only a subset of their genes in any given cell. To express a protein, the DNA is first transcribed into RNA, which is then translated into proteins, which are then targeted to specific cellular locations. In prokaryotic cells, transcription and translation occur almost simultaneously. In eukaryotic cells, transcription occurs in the nucleus and is separate from the translation that occurs in the cytoplasm. Gene expression in prokaryotes is mostly regulated at the transcriptional level (some epigenetic and post-translational regulation is also present), whereas in eukaryotic cells, gene expression is regulated at the epigenetic, transcriptional, post-transcriptional, translational, and post-translational levels.

Free Response

Name two differences between prokaryotic and eukaryotic cells and how these differences benefit multicellular organisms.

Eukaryotic cells have a nucleus, whereas prokaryotic cells do not. In eukaryotic cells, DNA is confined within the nuclear region. Because of this, transcription and translation are physically separated. This creates a more complex mechanism for the control of gene expression that benefits multicellular organisms because it compartmentalizes gene regulation.

Gene expression is regulated at many stages in eukaryotic cells, whereas in prokaryotic cells, control of gene expression occurs mostly at the transcriptional level. This allows for greater control of gene expression in eukaryotes and for more complex systems to develop. Because of this, different cell types can arise in an individual organism.

Describe how controlling gene expression will alter the overall protein levels in the cell.

The cell controls which proteins are expressed and to what level each protein is expressed in the cell. Prokaryotic cells alter the transcription rate to turn genes on or off. This method will increase or decrease protein levels in response to what is needed by the cell. Eukaryotic cells change the accessibility (epigenetic), transcription, or translation of a gene. This will alter the amount of RNA and the lifespan of the RNA to alter the amount of protein that exists. Eukaryotic cells also control protein translation to increase or decrease the overall levels. Eukaryotic organisms are much more complex and can manipulate protein levels by changing many stages in the process.
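The link between RNA amount, RNA lifespan, and protein amount can be made concrete with a simple two-stage rate model. This is a sketch under assumptions that are not in the text: mRNA is made at a constant rate and decays at a rate proportional to its amount, and protein is made in proportion to the mRNA present and also decays proportionally. Under those assumptions the steady-state levels are mRNA = transcription rate / mRNA decay rate and protein = translation rate × mRNA / protein decay rate. The function name and parameter values are hypothetical.

```python
def steady_state_protein(transcription_rate, translation_rate,
                         mrna_decay, protein_decay):
    """Steady-state protein level in a two-stage expression model.

    Assumed model (illustrative only):
        d(mRNA)/dt    = transcription_rate - mrna_decay * mRNA
        d(protein)/dt = translation_rate * mRNA - protein_decay * protein
    """
    if mrna_decay <= 0 or protein_decay <= 0:
        raise ValueError("decay rates must be positive")
    mrna = transcription_rate / mrna_decay
    return translation_rate * mrna / protein_decay


baseline = steady_state_protein(2.0, 10.0, 0.5, 0.1)
# Doubling transcription (the main prokaryotic lever) doubles the protein.
more_transcription = steady_state_protein(4.0, 10.0, 0.5, 0.1)
# Doubling the mRNA lifespan (halving its decay rate) has the same effect.
longer_lived_mrna = steady_state_protein(2.0, 10.0, 0.25, 0.1)
print(baseline, more_transcription, longer_lived_mrna)
```

In this model any of the four parameters changes the final protein level, which mirrors the point that eukaryotic cells can adjust protein amounts at several stages rather than at transcription alone.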

