# Making sense of enzyme Km comparisons

I have encountered comparisons of the Michaelis-Menten constant ($$K_m$$) a few times. Generally speaking, if the $$K_m$$ of an enzyme is higher, then its affinity for its substrate is lower. How does this make sense?

Maybe the maximum velocity ($$V_{\mathrm{max}}$$) of higher-$$K_m$$ enzymes is higher? Then of course $$K_m$$ can have a higher value, because $$K_m$$ is the substrate concentration at half of $$V_{\mathrm{max}}$$. But I think we cannot determine affinity from $$K_m$$.

Since the Michaelis-Menten constant Km is the substrate concentration at which the reaction rate is 0.5 Vmax, it serves as an inverse measure of apparent substrate affinity: a lower Km means that less substrate is needed to reach a given fraction of the maximum rate. Hence, a low Km indicates a high substrate affinity.
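To make the link between Km and affinity explicit, here is a short sketch under the assumption of the simplest one-substrate mechanism, $$E + S \rightleftharpoons ES \rightarrow E + P$$, at steady state. The rate constants $$k_1$$, $$k_{-1}$$, and $$k_{\mathrm{cat}}$$ and the dissociation constant $$K_d$$ are symbols introduced here for illustration; they do not appear in the question.

$$
v = \frac{V_{\mathrm{max}}\,[S]}{K_m + [S]}, \qquad
K_m = \frac{k_{-1} + k_{\mathrm{cat}}}{k_1}, \qquad
K_d = \frac{k_{-1}}{k_1}
\;\;\Longrightarrow\;\;
K_m = K_d + \frac{k_{\mathrm{cat}}}{k_1} \ge K_d
$$

Under these assumptions, Km approaches the true dissociation constant only when $$k_{\mathrm{cat}} \ll k_{-1}$$. That is why Km is best read as an apparent, not exact, inverse measure of affinity.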

Your suggestion that

"Maybe the maximum velocity (Vmax) of higher-Km enzymes is higher? Then of course Km can have a higher value."

is incorrect. Km characterizes how steeply the reaction rate rises with substrate concentration; it does not determine the maximum rate.
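The independence of Km and Vmax can be checked numerically. The sketch below assumes the standard Michaelis-Menten rate law; the two enzymes and their parameter values are hypothetical and chosen only for illustration.

```python
def mm_rate(s, vmax, km):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def fraction_of_vmax(s, km):
    """Fraction of Vmax reached at substrate concentration s (independent of Vmax)."""
    return s / (km + s)


# Hypothetical enzymes: identical Km, tenfold different Vmax (arbitrary units).
slow = {"vmax": 10.0, "km": 2.0}
fast = {"vmax": 100.0, "km": 2.0}

for name, enzyme in (("slow", slow), ("fast", fast)):
    # At [S] = Km, each enzyme runs at exactly half of its own Vmax.
    rate_at_km = mm_rate(enzyme["km"], enzyme["vmax"], enzyme["km"])
    print(name, rate_at_km / enzyme["vmax"])

# At the same substrate level, the low-Km enzyme is closer to saturation.
print(fraction_of_vmax(1.0, km=0.5), fraction_of_vmax(1.0, km=5.0))
```

Both enzymes reach half of their own maximum rate at the same substrate concentration, so a larger Vmax does not by itself imply a larger Km.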

Lastly, to address your title question, comparing affinities can make a lot of sense. For example, consider the case where enzymes catalyzing similar reactions in different species are compared. A very low Km indicates efficient use of low substrate levels, while a high Vmax indicates a high maximum reaction rate. This in turn may tell you something about optimal habitats and evolutionary pressure.

## Structural Comparison of MTA Phosphorylase and MTA/AdoHcy Nucleosidase Explains Substrate Preferences and Identifies Regions Exploitable for Inhibitor Design


## Abstract

Literature search is a routine practice for scientific studies as new discoveries build on knowledge from the past. Current tools (e.g. PubMed, PubMed Central), however, generally require significant effort in query formulation and optimization (especially in searching the full-length articles) and do not allow direct retrieval of specific statements, which is key for tasks such as comparing/validating new findings with previous knowledge and performing evidence attribution in biocuration. Thus, we introduce LitSense, which is the first web-based system that specializes in sentence retrieval for biomedical literature. LitSense provides unified access to PubMed and PMC content with over a half-billion sentences in total. Given a query, LitSense returns best-matching sentences using both a traditional term-weighting approach that up-weights sentences that contain more of the rare terms in the user query as well as a novel neural embedding approach that enables the retrieval of semantically relevant results without explicit keyword match. LitSense provides a user-friendly interface that assists its users to quickly browse the returned sentences in context and/or further filter search results by section or publication date. LitSense also employs PubTator to highlight biomedical entities (e.g. gene/proteins) in the sentences for better result visualization. LitSense is freely available at https://www.ncbi.nlm.nih.gov/research/litsense.
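The term-weighting idea described above (sentences score higher when they contain more of the query's rare terms) can be illustrated with a small inverse-document-frequency ranking. This is only a sketch under stated assumptions, not LitSense's actual scoring function: the tokenizer, the smoothed IDF formula, the function names, and the toy sentences are all hypothetical, and the neural embedding component is not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def idf_table(sentences):
    """Smoothed inverse document frequency: rarer terms receive larger weights."""
    n = len(sentences)
    doc_freq = Counter()
    for sentence in sentences:
        doc_freq.update(set(tokenize(sentence)))
    return {term: math.log((n + 1) / (count + 1)) + 1.0
            for term, count in doc_freq.items()}


def score(query, sentence, idf):
    """Sum the IDF weights of the query terms that occur in the sentence."""
    sentence_terms = set(tokenize(sentence))
    return sum(idf.get(term, 0.0)
               for term in set(tokenize(query)) if term in sentence_terms)


def rank(query, sentences):
    """Return the sentences ordered from best to worst match."""
    idf = idf_table(sentences)
    return sorted(sentences, key=lambda s: score(query, s, idf), reverse=True)


# Toy corpus (hypothetical sentences, for illustration only).
corpus = [
    "The enzyme binds its substrate with high affinity.",
    "The enzyme was purified from liver.",
    "Sulphate transporters respond to sulphur status.",
]
print(rank("enzyme affinity", corpus)[0])
```

In this toy corpus "affinity" occurs in fewer sentences than "enzyme", so it carries more weight and the first sentence ranks highest for the query.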

## Transcriptomics of Arabidopsis with respect to sulphur metabolism

Arabidopsis is well established as a valuable model plant system (Scholl et al., 2000). The release of its genome sequence (The Arabidopsis Genome Initiative, 2000) increased its value as a research tool and boosted approaches aimed at deciphering holistic rather than individual plant responses to particular conditions. Such systems approaches have been applied to study plant sulphate metabolism, with the aim of systematically unravelling the molecular responses. While early analyses focussed on gene and metabolite responses of sulphate uptake, reduction, and assimilation processes, systems approaches have sought to examine the connections and interplay within the system as a whole. This has been based on the inherent assumption that a response to (e.g.) sulphate starvation will not only affect sulphur metabolism per se but also other interconnected and downstream processes. It is obvious that in this context different parts of plants, such as roots, leaves, or seeds, will show both general and also specific responses. Furthermore, developmental aspects have to be taken into account when analysing plant responses to sulphate withdrawal or resupply over time. Transcriptomics studies as part of sulphur systems biology were pioneered by Hirai et al. (2003), Nikiforova et al. (2003), and Maruyama-Nakashita et al. (2003) using Arabidopsis. A transcriptome analysis of sulphur (S-)deprived Arabidopsis seedlings was performed for different durations of S starvation in order to address the development of the response over time (Nikiforova et al., 2003). In addition, Arabidopsis seedlings were treated with O-acetylserine (OAS), the immediate precursor of cysteine biosynthesis (Hirai et al., 2003). OAS accumulates upon S deprivation and early research considered it as an S-starvation signal (Saito, 2000; Hopkins et al., 2005), which was indeed subsequently demonstrated (Hubberten et al., 2012a).
Due to the technical limitations at the time, these pioneering studies were performed on macro-arrays, each comprising about 10 000 random cDNAs (Hirai et al., 2003; Nikiforova et al., 2003), or using Affymetrix 8K chips with probes for 8000 genes (Maruyama-Nakashita et al., 2003). The differentially expressed genes that were identified included some that were already known to be responsive to S status, such as sulphate transporters (Smith et al., 1997; Hawkesford, 2000), which confirmed the validity of the approach. More interestingly, information on novel genes was obtained. Thus, alongside known genes, these early studies provided a catalogue of genes that as yet had no assignment of their function in response to S-deficient growing conditions.

The number of transcriptome studies that have been conducted on Arabidopsis is still quite low with only 14 in total (Table 1). It is justifiable to include arrays of plants exposed to selenium (Van Hoewyk et al., 2008) as it acts as a competitor with sulphur, thus mimicking S deprivation. Among related Brassicaceae species a sulphate starvation study was performed on rapeseed (Buhtz et al., 2008, 2010). Despite the fact that rapeseed has a high requirement for sulphate (Girondé et al., 2014), there is a lack of transcriptomics studies on this subject. With respect to Arabidopsis, the tissues and conditions investigated in the early studies were already quite diverse (Table 1). This provided a wealth of information, but it made comparisons between studies difficult as each experiment was based on very specific conditions with respect to the sulphate levels applied and/or the tissues examined. For example, the tissues studied in response to S-deprivation included whole seedlings grown on agar plates (Nikiforova et al., 2003), seedlings separated into leaves and roots (Hirai et al., 2003; Maruyama-Nakashita et al., 2003, 2005, 2006), and developing seeds (Higashi et al., 2006). Subsequent studies have examined whole seedlings exposed to S deprivation in submerged seedling cultures followed by resupply in order to score for recovery processes (Bielecka et al., 2015), hydroponically grown root tissues exposed to S deprivation and separated into fractions of various cell types (Iyer-Pascuzzi et al., 2011), and studies where S deprivation has been one factor among other combined stresses (Barciszewska-Pacak et al., 2015; Forieri et al., 2017). Sulphate starvation has been used as a condition to investigate phloem-specific micro-RNAs in rapeseed (Buhtz et al., 2010).
Although only a subset of the phloem RNA fraction was analysed, results regarding the regulatory function of miRNA-395 were substantiated in further studies employing Arabidopsis (Kawashima et al., 2009, 2011). Sulphate metabolism in response to acid rain conditions has been investigated, with high inputs of S under low pH conditions (Liu et al., 2014). Acid rain is an ecological and a health problem in many countries due to combustion of fossil fuels releasing SO2. In North America and Europe, SO2 emissions have been successfully reduced over recent decades due to legislative measures that have regulated industrial and domestic use of fossil fuels. However, this has consequently reduced sulphur inputs into agro-ecological systems, which in turn has triggered research into its agricultural impact (Haneklaus et al., 2003; Menz and Seip, 2004). From the molecular perspective, several transcriptome datasets on wheat in relation to responses to S nutrition may be the primary resource for studying the effects of sulphur inputs (Table 1).

Table 1. Transcriptome analyses related to sulphur metabolism

| Experiment | Species | Tissue | Type | ID | References |
| --- | --- | --- | --- | --- | --- |
| –S | Arabidopsis | Seedling | Macroarray | | Nikiforova et al. (2003) |
| | | Leaf, Root | Macroarray | | Hirai et al. (2003) |
| | | Leaf, Root | Affymetrix 8K Chip | | Maruyama-Nakashita et al. (2003) |
| | | Leaf, Root | Agilent oligo microarray | E-MEXP-211 | Hirai et al. (2005) |
| | | Root | GeneChip ATH1 | GSE5688 | Maruyama-Nakashita et al. (2005) |
| | | Seed | GeneChip A-AFFY-2 | E-ATMX-1 | Higashi et al. (2006) |
| | | Root | GeneChip ATH1 | GSE4455 | Maruyama-Nakashita et al. (2006) |
| | | Root cell types | GeneChip ATH1 | GSE30100, GSE30099, GSE30098 | Iyer-Pascuzzi et al. (2011) |
| | | Root cell types | Agilent-custom promoter array | GSE30166 | Iyer-Pascuzzi et al. (2011) |
| | | Seedling | GeneChip ATH1 | GSE64972 | Bielecka et al. (2015) |
| | | Seedling, Leaf | Illumina HiSeq 2000 | GSE66599 | Barciszewska-Pacak et al. (2015) |
| | | Leaf, Root | GeneChip ATH1 | GSE81347 | Aarabi et al. (2016) |
| | | Leaf | GeneChip 1.0 ST | GSE93048 | Dong et al. (2017) |
| | | Root | GeneChip 1.1 ST | GSE77602 | Forieri et al. (2017) |
| | Oilseed rape | Leaf, Root, Phloem | LC Sciences dual colour | GSE20263 | Buhtz et al. (2010) |
| +Se | Arabidopsis | Leaf, Root | GeneChip ATH1 | GSE9311 | Van Hoewyk et al. (2008) |
| –O2 | Arabidopsis | Seedling | GeneChip ATH1 | | Branco-Price et al. (2008) |
| +acid rain S | Arabidopsis | Leaf | GeneChip ATH1 | GSE52487 | Liu et al. (2014) |
| –S | Triticum aestivum | Leaf | GeneChip Array | E-MEXP-1415 | Howarth et al. (2008) |
| | | Root | GeneChip Array | E-MEXP-1694 | Bo et al. (2014) |
| | | Root | GeneChip Array | GSE61679 | Gupta et al. (2017) |
| | | Grain | Illumina HiSeqTM PE125/PE1 | | Yu et al. (2018) |
| | | Grain | NimbleGen microarray | E-MTAB-1782 | Dai et al. (2015) |
| | | Grain | NimbleGen microarray | E-MTAB-1920 | Vincent et al. (2015) |

A common feature of all systems biology approaches is that they yield vast amounts of data (Kopriva et al., 2015). Hence, statistical methods have had to be developed or adapted to deal with this (Klipp et al., 2016; Xia, 2018). In the context of sulphur systems biology, such methods were already being applied to the early transcriptomics data sets. Especially when attempting to correlate transcriptomics and metabolomics data (Nikiforova et al., 2005b), it was inevitably necessary to apply bioinformatics approaches in order to allow data interpretation and the development of models (Hirai et al., 2004; Hirai and Saito, 2004; Nikiforova et al., 2004, 2005a). Results are often displayed as correlation networks (Nikiforova et al., 2005a). This kind of approach is aimed at filtering the data to remove the ‘noise’ of variability associated with gene expression and metabolite contents, and in doing so to highlight differences that are statistically significant (Massonnet et al., 2010).
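As a rough illustration of how a correlation network can be derived from expression and metabolite profiles, the sketch below links any two variables whose Pearson correlation exceeds a cut-off and then counts connections per node. It is a minimal example under stated assumptions, not the method used in the cited studies: the profile names, the values, and the 0.9 threshold are hypothetical, and a real analysis would also need significance testing and multiple-testing correction.

```python
import math
from collections import Counter
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_network(profiles, threshold=0.9):
    """Return edges (a, b, r) for every pair with |r| at or above the threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 3)))
    return edges


# Hypothetical profiles across four time points (illustrative values only).
profiles = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "gene_B": [2.1, 3.9, 6.2, 8.0],
    "metabolite_X": [4.0, 3.1, 2.0, 0.9],
    "gene_C": [1.0, 3.0, 1.0, 3.0],
}

edges = correlation_network(profiles)

# Node degree: the number of edges touching each node.
degree = Counter()
for a, b, _ in edges:
    degree[a] += 1
    degree[b] += 1

print(edges)
print(degree.most_common())
```

Weakly correlated profiles such as the hypothetical `gene_C` drop out of the network, while nodes with many edges stand out as highly connected.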

One constraint of systems approaches such as transcriptomics, proteomics, or metabolomics is the fact that even if concentration differences per se are determined, they may not represent changes in activities of relevant proteins or enzymes, or of metabolite fluxes. An example of such a situation where transcriptomics would not reveal an important gene is the transcription factor sulfur limitation1 (SLIM1, AT1G73730), which has been identified through genetic screening of Arabidopsis mutants (Maruyama-Nakashita et al., 2006) and has been shown to control a major part of the S-starvation response (Kawashima et al., 2011; Wawrzyńska and Sirko, 2014). As far as current data suggest, SLIM1 itself is not, or is only marginally, transcriptionally regulated upon S deprivation. EIN3 (AT3G20770), a major factor involved in ethylene signalling, has been shown to modulate SLIM1 binding activity to its target gene promoters (Wawrzyńska and Sirko, 2016). As the authors suggest, this probably interferes with the S deficiency-dependent induction of target genes by SLIM1. However, they do not exclude the possibility that further regulators might be involved in shaping the response to S deprivation. To unravel the complexity of the regulation of plant S metabolism, it is therefore clear that, despite the wealth of data provided by systems approaches, these need to be combined with targeted analyses in order to reveal the cellular and physiological responses to S deprivation (Fig. 1A).

Deposition of systems biology results in databases allows data to be revisited when new knowledge is available, such as improved gene annotation, and this can not only confirm initial assumptions but also provide novel information (Fig. 1A; Nikiforova et al., 2005a; Hoefgen and Watanabe, 2017). Recently, Henríquez-Valencia et al. (2018) have conducted a comparative meta study using existing data sets together with novel bioinformatics approaches. This led to the identification of transcription factor networks that provide new candidate genes for sulphate research that would not otherwise have been identifiable in individual experimental set-ups. This also highlights the need for further transcriptomics studies to be provided to the scientific community to advance our knowledge. With increasing depositions of data related to S metabolism, including data on species other than Arabidopsis (Table 1), such approaches will have a greater impact on the generation of hypotheses. New candidate genes and biochemical processes interconnected to plant S metabolism will be identified as a result of these systems-based and targeted approaches. It is a matter of ongoing debate, probably driven by individual research interests, as to whether only ‘robust’ processes that occur under a variety of conditions and in various plants are relevant or whether ‘specific’ responses that occur under only certain conditions are the most meaningful for improving our understanding of plant sulphur physiology.

While initial high-throughput analyses can lead to the generation of hypotheses (Nikiforova et al., 2003, 2005a, 2005b), these need to be tested experimentally for further validation (Fig. 1A). Cataloguing alone is insufficient to develop knowledge of processes and, eventually, to exploit them for plant breeding and crop production. Validation efforts necessarily need to employ all levels of molecular biology and bioinformatics-based approaches in an iterative manner (Hoefgen and Watanabe, 2017). An example is the investigation of predicted hub genes (Nikiforova et al., 2005a) through a mutational approach (Falkenberg et al., 2008). Three transcription factors, IAA13, IAA28, and ARF-2 (ARF1-Binding Protein), in a network responsive to S deprivation have been identified as being connected to multiple downstream and upstream interactors, and thus constitute hubs, making it likely that they represent important genes (Mähler et al., 2017). Falkenberg et al. (2008) subsequently showed that these transcription factors indeed play a role in controlling certain aspects of plant sulphate metabolism, thus validating the assumption that identification of correlative network hubs is a tool that can be used to identify relevant target genes, in this case linking S deprivation to auxin signalling. In fact, IAA28 may constitute the link between auxin signalling, S starvation, and alterations in root development (Rogg et al., 2001; Falkenberg et al., 2008; De Rybel et al., 2010), although this remains to be demonstrated functionally. A link to auxin had been postulated previously (Nikiforova et al., 2005a). A further example is the identification of the functional roles of sulfur deficiency induced 1 (SDI1) and SDI2 (Fig. 1B).
An AFLP study on wheat identified SDIs as being strongly responsive S-deprivation genes (Howarth et al., 2005) and they were also identified in early macroarray studies on S-deprived and OAS-treated Arabidopsis (Hirai et al., 2003; Nikiforova et al., 2003). However, the function of the SDI genes was not clear from these initial studies. A combination of a bioinformatics approach to OAS-related responses (Hubberten et al., 2012a) and a mutational approach coupled with transcriptomics and metabolomics analyses (Aarabi et al., 2016) revealed that SDI1 and SDI2 interact through protein–protein binding with a previously described transcription factor, MYB28. Upon S deprivation in Arabidopsis, this binding down-regulates MYB28 transcription and consequently reduces the biosynthesis of glucosinolates (Gigolashvili et al., 2007b; Sønderby et al., 2007). In functional terms, this may divert S resources from secondary to primary metabolism. Interestingly, Hubberten et al. (2012a) additionally revealed a group of OAS-responsive genes that are co-regulated under various conditions, termed OAS-cluster genes. Co-regulated expression hints at the existence of common upstream regulatory control mechanisms, which would be worth investigating.

## Pathways for Life

Among the archaea, only the Methanosarcineae can form multicellular structures, usually in response to environmental change.

Microbiologist J. Greg Ferry is surprisingly calm when he talks about the most exciting scientific experience he's ever had.

It was spring 2002, and Ferry was in Cambridge, Massachusetts, gathered with "a tight-knit group" of about two dozen researchers to discuss an obscure microbe. "The bug," as he calls it, also known as Methanosarcina acetivorans, is Ferry's baby. He discovered it 20 years ago living in a mass of kelp in an underwater trench off the coast of southern California. He even got to name it. "Acetivorans means voracious for acetate," Ferry explains. Among other things, M. acetivorans eats acetate—a salt derived from acetic acid—and expels methane. It eschews oxygen. It excels in harsh environments, like sludge pools, animal intestines, and bogs. And it's been around for billions of years—long enough to make it a key player in the evolution of life on Earth.

In 2001, as the Human Genome Project neared completion, Ferry had urged colleagues at MIT's Whitehead Institute—a well-funded locus of genetics research—to sequence M. acetivorans. Surely they would recognize "its importance in all of biology," thought Ferry. They did. The sequencing took less than a year.

Computers got first crack at making sense of the sequence—culling databases to match strings of code parsed into genes with known proteins and enzymes. Then the data were sent to a handful of devotees, who with Ferry had been studying M. acetivorans for years. "For the first time we were seeing the details of how the bug works, and it was like, 'Wow!'" Ferry says in a quiet voice. He pauses. "It was almost overwhelming, actually."

But the work was just beginning. The experts needed to verify the computer's translation of the code. "The computer could misidentify a protein," says Ferry. "The information in the databases could be wrong." The ultimate test is to run experiments in the lab to test the gene-protein pair, he explains. "But we're talking about a huge number of genes!" So the team selected certain proteins that have the most impact on the organism's functioning and split up the work, with each research group assigned different parts of the sequence to verify. Ferry's group was asked to test and comment on a gene that codes for a key enzyme in the process of converting acetate to methane—a pathway involving several proteins and enzymes that carry out chemical reactions in steps.

The collaborators communicated via email, sharing their observations, and marveling at what the genome was revealing to them. Then came the best part. They all came together at Whitehead for what amounted to an M. acetivorans summit. For two days they asked questions of each other, presented findings from their respective labs, and formulated plans for further research.

"The meeting was extremely fascinating, enormously fun," says Ferry. "Getting that many scientists together is a true testament to the uniqueness of M. acetivorans."

One surprise that Ferry and his colleagues uncovered was the sheer size of the M. acetivorans genome: With 4,500 genes, the amount of genetic information it contained seemed huge for a one-celled organism. (The human genome, by comparison, is only about 30,000 genes.) "The size of the genome is a manifestation of the organism's diversity," Ferry explains. "It can adapt to its environment better than any other organism in the archaea domain because it has the information to produce proteins and enzymes that allow it to respond to environmental changes."

Science diving: individual species of Methanosarcina have been found in freshwater and marine environments, such as kelp beds (above), and in decaying organic matter, among other places.

The archaea, one of the three main branches of life, have only recently been recognized as a distinct life form. For years, scientists thought these tiny organisms were merely an offshoot of bacteria, distant cousins, perhaps, of familiar disease-causing bugs like E. coli and Streptococcus. Then, in the late 1970s, Carl Woese, a biologist at the University of Illinois, discovered that these microbes had genes that were fundamentally different from those of bacteria. In fact, genetically speaking, the archaea are more closely related to the third branch of life, the eukarya (organisms including fungi, plants, and animals). Scientists now believe that the archaea and the bacteria evolved separately from a common ancestor, and that the eukarya branched off from archaea at a later point.

Still, archaea resemble bacteria in many ways: they are one-celled organisms with a coil of genetic material, and some of them exist alongside bacteria in seawater or soil. But most archaea are more exotic in their choice of habitat. One group prefers extremely cold environments like Antarctic ice. Another likes things hot: the boiling springs of Yellowstone. Yet another subset of archaea, called the methanogens, eke out a living in oxygen-poor environments and produce methane as a waste product. M. acetivorans is a methanogen. So toxic is oxygen to this class of organisms that microbiologists like Ferry must grow them in special sealed chambers accessible only by a pair of built-in rubber gloves.

M. acetivorans is the largest of the archaea, and its genome harbors many of the tricks and tools that archaea have developed over the millennia to survive. Unraveling its genome, the research team uncovered a slew of interesting properties. It is the only species in its domain, for example, that has three different ways of converting its food to methane. (The acetate-to-methane pathway is only one of its choices.) "These pathways are ancient, perhaps the first pathways to obtain energy for life," Ferry explains. Three to four billion years ago, at the time of the origin of life, acetate and carbon monoxide, another food source for M. acetivorans, were in abundance in the environment.

Another interesting feature: Although the organism is a strict anaerobe, meaning it lives without oxygen, it possesses a gene that codes for an enzyme that helps to break down oxygen. Its genetic code also contains instructions for flagella (those whiplike structures that help cells move) and for chemotaxis, the ability to maneuver toward or away from a specific chemical. Curiously, no one has ever observed such purposeful movement in M. acetivorans.

The organism also exhibits several examples of multiple genes coding for the same protein, "which may seem wasteful," Ferry explains, "but is actually an indication of the bug's amazing ability to adapt to changing environments." These may be only the beginning of M. acetivorans's important qualities, he adds. The functions of over 35 percent of the organism's genes remain a mystery.

"I would say that 75 percent of the research taking place in my lab now is based on questions that came out of the meeting at MIT," says Ferry. Indeed, in the summer of 2002, with a grant from the National Science Foundation, Ferry spearheaded the formation of the Consortium for Archaeal Genomics and Proteomics, a collaboration among several universities based at Penn State, using M. acetivorans as its model organism. By studying the genes and proteins of this single microbe, the members of the consortium hope to advance the understanding of the entire archaea domain.

M. acetivorans makes a good model because of its size, Ferry says, and also because researchers have developed a procedure for inserting random genetic information into Methanosarcina cells, something that can't easily be done with every type of cell. "Inserting nonsense information into the sequence for a given gene can inactivate or 'knock-out' the gene," Ferry explains. "Then you can look at how the organism has been crippled. Which pathway was knocked out? That can tell you which enzyme in which pathway a particular gene codes for."

Identifying these enzymes could lead to the discovery of novel proteins and enzymes useful for drug design, or to clean up chemical spills, Ferry says. For example, M. acetivorans can make an enzyme to break down kepone—a toxic compound once used as an insecticide—into hydrochloric acid and methane. M. acetivorans' ability to break down so many different food sources into methane is interesting to scientists because methane is a potential alternative energy source.

Ferry himself is particularly interested in finding out how M. acetivorans uses redundancy (multiple genes coding for the same proteins) to respond quickly to changes in its environment, such as the presence or absence of certain kinds of food.

In the lab, he says, if M. acetivorans is given acetate to eat and nothing else, it will "turn on" genes that code for the enzymes that process acetate. But it will also turn on one of three duplicate genes that code for the enzyme that processes methanol, another food source, as a just-in-case. That way, Ferry explains, if methanol suddenly became available, M. acetivorans could begin to metabolize it immediately. Where the situation is reversed, he adds, and it only has methanol to eat, all three genes for the processing of methanol are turned on, and just one of the acetate enzymes is turned on. "Now we have a reason why they may have duplicated genes—it's a new way of adapting to their environment," says Ferry. "Usually there's one gene and it gets turned on and off in response to an environmental condition.

"In nature, a slow-growing organism like M. acetivorans would want to switch pathways rapidly so that it could quickly take advantage of a new food source." If it can manage that without much lag time—the time required to turn on a new gene —"it would have a leg up on competing organisms," he says.

Ferry pauses. "They're thinking," he muses. "It's kind of scary, isn't it? They're enormously adaptive, extremely intelligent."

Ferry is also interested in how M. acetivorans responded to the rise of oxygen in Earth's early environment. "This is an interesting story," he says, smiling. "Did you know that oxygen is the worst pollutant ever produced in the history of life?" Although many forms of life have come to depend on oxygen for respiration, he explains, it's still toxic to most cells. Oxygen can form highly reactive free radicals that attack enzymes. To protect itself, the human body has evolved enzymes that protect us from those free radicals. Still, oxygen wears down cells, and plays a major role in aging.

Ferry paints the scene: For the first billion years of life on Earth, microbes thrived without oxygen. Then plants came along, and photosynthesis. The oxygen produced attacked many of life's molecules, combined with them, changed their structures. "It was a doozy, the biggest environmental change that life had encountered. Organisms had to invent new pathways—new proteins and enzymes—for dealing with toxic oxygen," he says. Fortunately, the rise of oxygen was slow enough that many organisms had time to adjust instead of perishing. To help their chances, some anaerobes hid themselves in places where no oxygen could reach, such as deep in the mud at the bottom of a swamp.

The kelp-filled trench where Ferry and a graduate student discovered M. acetivorans back in 1983 was just that sort of place. Fishermen call the areas above the trenches "bubble holes" because the organisms below release small bubbles of methane gas—their waste product—that expand as they rise and then burst when they reach the surface.

It was here, in this imperfect hiding place, that M. acetivorans might have developed its unique characteristics. "The organisms probably encounter trace amounts of oxygen in the kelp beds," Ferry explains. The water layer above the kelp bed contains dissolved oxygen, and during a storm, trace amounts of oxygen can be mixed into the kelp layer. "Oxygen is very poisonous to it so it must have developed mechanisms to cope," says Ferry, including, perhaps, a pathway to convert free radicals to less toxic compounds. Researchers have already discovered in M. acetivorans a gene that codes for oxidase, an enzyme that partially breaks down oxygen, he notes. A full pathway for metabolizing oxygen has yet to be uncovered, however.

"If we can understand how Methanosarcina deals with oxygen now," Ferry suggests, "we'll have a clue as to how life early on developed mechanisms to survive this incredible insult." That could lead to "a fundamental advance in our understanding of evolution, and of how humans deal with oxidative stress.

"Methanosarcina and other anaerobes are our ancestors," he says. "They laid down all the fundamental metabolism for life as we know it today."

## Making (anti)sense of non-coding sequence conservation

A substantial fraction of vertebrate mRNAs contain long conserved blocks in their untranslated regions as well as long blocks without silent changes in their protein coding regions. These conserved blocks are largely comprised of unique sequence within the genome, leaving us with an important puzzle regarding their function. A large body of experimental data shows that these regions are associated with regulation of mRNA stability. Combining this information with the rapidly accumulating data on endogenous antisense transcripts, we propose that the conserved sequences form long perfect duplexes with antisense transcripts. The formation of such duplexes may be essential for recognition by post-transcriptional regulatory systems. The conservation may then be explained by selection against the dominant negative effect of allelic divergence.

Since the early 1980s many studies on particular genes have noted sequence conservation in the 3′ untranslated regions (UTRs) of vertebrate mRNAs (1–3). Duret et al. (4) estimated that >30% of vertebrate mRNAs had conserved regions in their 3′ UTRs, defined as sharing at least 70% identity over >100 nucleotides between corresponding homologous genes (orthologs). They also noted the less frequent but still significant conservation in 5′ UTRs. We have recently observed long stretches of protein coding regions without silent changes in a substantial fraction of vertebrate mRNAs; most of these contain unusually conserved blocks both in the coding regions and in 5′ or 3′ UTRs (H. Sicotte and D. Lipman, unpublished data). A representative sample from a comparison of human and mouse orthologs is shown in Table 2. These conserved sequences are essentially unique in the genome and thus match only to corresponding regions of orthologous mRNAs in other species. The observed level of conservation is far greater than expected for non-coding regions or synonymous sites in coding regions on the basis of known evolutionary rates and divergence times (5).

What function constrains these regions? Sequence-specific recognition, e.g., by RNA binding proteins, is an unlikely explanation because of the length of the conserved sequences. Furthermore, because so many different mRNAs contain these conserved regions, which are unique for each set of orthologs, sequence-specific recognition would lead into an almost infinite regress. With >30% of the genes containing these unique conserved regions, then another 30% of the genes would be needed to code for these binding proteins, not to mention the proteins regulating these binding proteins, and so on. One might posit that many of these different sequences share common RNA secondary structure thus reducing the number of different binding proteins, but the sequence conservation would remain a mystery. It has been shown that short AU rich motifs promote mRNA degradation (6). Such motifs are often seen in the conserved portions of 3′ UTRs but these cannot explain the striking conservation between orthologs either. Another possibility would be that the conservation is due to the encoding of a protein on the complementary strand. Extensive database searches using translations of the complementary strand to these conserved regions did not reveal homologies to known proteins which could explain this conservation (results not shown).

A number of studies provide evidence that the conserved regions in 3′ UTRs are required for the regulation of mRNA stability (7). Typically, deletion of these regions renders the mRNA unresponsive to regulatory signals which normally lead to destabilization (8–10). Conversely, introduction of these regions into reporter mRNAs makes them responsive to regulated destabilization (11–13). Conserved regions in 5′ UTRs (14) and coding regions (15–17) have also been implicated in regulation of mRNA stability.

The large number of bases in conserved blocks suggests a base-pairing interaction between mRNA and another nucleic acid. Over the last several years there has been an increasing number of reports of antisense RNA transcripts encoded by the complementary strand of a gene (18–22). Although most reported examples do not show evidence of coding regions, in some cases these countertranscripts encode expressed proteins (23, 24). These countertranscripts are sometimes found in different tissues or developmental stages than their corresponding sense mRNA and thus a regulatory role for endogenous antisense has been proposed (25–28). Examples of regulation of gene expression by endogenous antisense have also been described for nematode (29), Dictyostelium (30) and prokaryotes (31).

Why would the antisense-based regulatory mechanism require sequence conservation? If cells have a destabilization/degradation system which specifically recognizes long, nearly perfect RNA duplexes, then mutations in a region corresponding to a duplex will be selected against because of their mismatch with the other allele (Fig. 1). Consider, for example, the developmental expression pattern for Hoxa 11 sense and antisense transcripts (27) where, when sense transcripts are at high levels, antisense transcripts are at low levels, and vice versa. When the Hoxa 11 antisense is abundant, most sense transcripts will be duplexed. Assuming the rate of transcription for the two alleles is roughly equal, a mutation in a region corresponding to a duplex would result in approximately half the sense transcripts forming mismatched duplexes. Let us further assume that the half life of a sense transcript is 12 h and the half life of a perfectly matching sense/antisense duplex is 12 min. When most of the sense transcripts are in perfect duplexes the drop in mRNA levels could therefore be an order of magnitude or more. However, a mutation leading to allelic divergence in a complementary region could lead to defective recognition of approximately half of the sense/antisense duplexes; thus, half the sense transcripts would have a half life of 12 min and half would have a half life approaching 12 h. The endogenous antisense mechanism would then only be able to reduce mRNA levels by a factor of two. Thus, the conserved regions in mRNAs will be maintained through selection against allelic divergence. In the three cases where the endogenous antisense has been sequenced and the corresponding orthologous mRNA sequences are also available, there is a strong correlation of complementary segments and sequence conservation. For example, in the BFGF gene, there is a single silent change between human and rat sequences in the 280 bases of the coding region which overlap the antisense transcript (unpublished observations).
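
The half-life arithmetic in this argument can be checked with a short sketch. It rests on two simplifying assumptions that the text does not state: transcription is constant, so the steady-state mRNA level is proportional to half-life, and every sense transcript is duplexed when antisense is abundant. The function and variable names are illustrative only.

```python
def fold_reduction(free_half_life_h, duplex_half_life_h, fraction_recognized):
    """Fold drop in steady-state sense mRNA when antisense is abundant.

    Assumes constant transcription, so the steady-state level is
    proportional to half-life, and that every sense transcript is duplexed.
    fraction_recognized is the share of duplexes that match perfectly and
    are degraded quickly; the rest decay at the free-transcript rate.
    """
    mean_half_life = (fraction_recognized * duplex_half_life_h
                      + (1.0 - fraction_recognized) * free_half_life_h)
    return free_half_life_h / mean_half_life


# Half-lives from the text: 12 h for free sense mRNA, 12 min for a perfect duplex.
matched = fold_reduction(12.0, 12.0 / 60.0, 1.0)   # both alleles identical
diverged = fold_reduction(12.0, 12.0 / 60.0, 0.5)  # one allele mutated
print(f"matched alleles:  {matched:.1f}-fold reduction")
print(f"diverged alleles: {diverged:.2f}-fold reduction")
```

Under these assumptions the matched case gives roughly a 60-fold reduction, while the diverged case gives just under twofold, in line with the "order of magnitude or more" and "factor of two" figures above.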

## Enzyme Kinetics: Catalysis & Control

Far more than a comprehensive treatise on initial-rate and fast-reaction kinetics, this one-of-a-kind desk reference places enzyme science in the fuller context of the organic, inorganic, and physical chemical processes occurring within enzyme active sites. Drawing on 2600 references, Enzyme Kinetics: Catalysis & Control develops all the kinetic tools needed to define enzyme catalysis, spanning the entire spectrum (from the basics of chemical kinetics and practical advice on rate measurement, to the very latest work on single-molecule kinetics and mechanoenzyme force generation), while also focusing on the persuasive power of kinetic isotope effects, the design of high-potency drugs, and the behavior of regulatory enzymes.

The turnover number or catalytic constant $k_\mathrm{cat}$ in the Michaelis-Menten model is the rate constant for the productive dissociation of intermediate $\ce{ES}$:

$$\ce{ES ->[k_\mathrm{cat}] E + P}$$

The constant $k_\mathrm{cat}$ says how much product forms from intermediate but does not say how much intermediate forms in the first place. It is assumed that there is a rapid equilibrium between enzyme, substrate and intermediate: $\ce{E + S <=> ES}$

that can be described by either an association or a dissociation constant for the equilibrium between intermediate, apo enzyme and substrate:

$$K_\mathrm{M} = \frac{[\ce{E}][\ce{S}]}{[\ce{ES}]}$$

At low substrate concentration $[\ce{E}] \approx [\ce{E}]_0$ and

$$v = k_\mathrm{cat}[\ce{ES}] \approx \frac{k_\mathrm{cat}}{K_\mathrm{M}}[\ce{E}]_0[\ce{S}]$$

In summary, the reason the ratio might look strange is that $K_\mathrm{M}$ is a dissociation constant, and $k_\mathrm{cat}/K_\mathrm{M}$ is the rate constant for low substrate concentration.

The most basic kinetic scheme for enzymes is represented as

$$\ce{E + S <=> ES ->[k_\mathrm{cat}] E + P}$$

As should be clear, $k_\mathrm{cat}$ is the rate constant for the reaction that occurs after substrate is bound to the enzyme. The resulting rate ($k_\mathrm{cat}[\mathrm{E}]_\mathrm{tot}$) is only achieved when every molecule of enzyme essentially always is in the act of converting substrate to product. That is, every time a product molecule is released, a substrate molecule immediately binds. Even if the substrate molecule dissociates before reacting, another immediately takes its place. We describe this situation as the enzyme being "saturated" with substrate.

The Km is a measure of how tightly the substrate binds to the enzyme, approximately equal to the equilibrium constant for the dissociation of the substrate from the enzyme. If the substrate is at a low concentration relative to this Km value, then many of the enzyme molecules will not have substrate molecules bound to them and will be unproductive as a result. The overall rate will be substantially below the maximum kcat[E]tot.

All of this is captured in the basic Michaelis-Menten kinetics equation:

$$v = \frac{k_\mathrm{cat}[\mathrm{E}]_\mathrm{tot}[\mathrm{S}]}{K_m + [\mathrm{S}]}$$

You can see that the impact of $K_m$ on the rate increases substantially as [S] decreases relative to $K_m$.

Since most substrates exist physiologically at concentrations below what is required for maximum rate, it is the combination of $k_\mathrm{cat}$ and $K_m$ relative to [S] that determines the in vivo rate of reaction (along with the amount of enzyme of course).
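
As a rough illustration of how the rate depends on [S] relative to Km, the sketch below evaluates the standard Michaelis-Menten rate law over a range of substrate concentrations. The parameter values are hypothetical and in arbitrary units; only the ratio [S]/Km matters for the printed fractions.

```python
def mm_rate(s, kcat, km, e_tot):
    """Michaelis-Menten rate: v = kcat * [E]tot * [S] / (Km + [S])."""
    return kcat * e_tot * s / (km + s)


# Hypothetical parameters in arbitrary units (not taken from any real enzyme).
KCAT, KM, E_TOT = 100.0, 1.0, 1.0
V_MAX = KCAT * E_TOT

for s in (0.01, 0.1, 1.0, 10.0, 100.0):
    v = mm_rate(s, KCAT, KM, E_TOT)
    print(f"[S]/Km = {s / KM:>6.2f}   v/Vmax = {v / V_MAX:.3f}")
```

Well below Km the rate grows almost linearly with [S], so Km strongly affects it. Well above Km the rate levels off near kcat[E]tot and Km hardly matters.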

[OP] The catalytic efficiency of an enzyme is given by $k_\mathrm{cat}/K_\mathrm{M}$ where $k_\mathrm{cat}$ is the turnover number, or the number of molecules that can be produced per second per active site of an enzyme.

The last part is not quite accurate. $k_\mathrm{cat}$ is the rate of the reaction under saturating conditions divided by the enzyme concentration. The dimensions are one divided by time (a first-order rate constant).

[OP] $K_\mathrm{M}$ is a measure of the affinity of the enzyme with the substrate, or the likelihood of binding.

The Michaelis-Menten constant $K_\mathrm{M}$ has a rigorous definition based on rate constants, and its dimensions are the same as those of a concentration. If you interpret $K_\mathrm{M}$ as the affinity of the enzyme for the substrate, you have to know that higher values of $K_\mathrm{M}$ correspond to a lower degree of binding. The likelihood of binding strongly depends on the concentration of substrate. For discussion of the catalytic efficiency, we are interested in substrate concentrations lower than $K_\mathrm{M}$.

Why bother dividing the $k_\mathrm{cat}$ by $K_\mathrm{M}$? Isn't the affinity of the enzyme already encoded into the quantity of $k_\mathrm{cat}$? How could you be an enzyme that has low affinity, but still have a huge turnover? To me this doesn't seem possible, and thus it is redundant to divide by $K_\mathrm{M}$. Likewise, could there be a situation where $k_\mathrm{cat}$ is low, but $K_\mathrm{M}$ is high?

Here are three examples showing rate vs. substrate concentration. Let's say the red curve shows kinetics of a given enzyme. If we compare the red enzyme to one (in green) that has the same $K_\mathrm{M}$ but a $k_\mathrm{cat}$ smaller by a factor of two, the rate is half the "red" rate at all concentrations. On the other hand, the blue enzyme has the same $k_\mathrm{cat}$ as the red enzyme, but twice the $K_\mathrm{M}$ (remember, this means it is harder to get the substrate to bind). At high substrate concentration, red and blue show the same maximal rate, but at low concentrations, the "red reaction" is twice as fast as the "blue reaction".

In fact, the green and the blue enzyme show identical behavior at low substrate concentrations because $\frac{k_\mathrm{cat}}{K_\mathrm{M}}$ for the two enzymes is the same. That is the idea of catalytic efficiency.
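
The red, green and blue comparison can be reproduced numerically. The sketch below assumes the standard Michaelis-Menten rate law and picks arbitrary units in which the red enzyme has kcat = 2 and Km = 1; those numbers are illustrative and do not come from the original figure.

```python
def mm_rate(s, kcat, km, e_tot=1.0):
    """Michaelis-Menten rate: v = kcat * [E]tot * [S] / (Km + [S])."""
    return kcat * e_tot * s / (km + s)


# (kcat, Km) in arbitrary units; the values are illustrative assumptions.
enzymes = {
    "red": (2.0, 1.0),
    "green": (1.0, 1.0),  # same Km as red, half the kcat
    "blue": (2.0, 2.0),   # same kcat as red, twice the Km
}

LOW_S, HIGH_S = 0.001, 1000.0  # far below and far above every Km
rates = {
    name: (mm_rate(LOW_S, kcat, km), mm_rate(HIGH_S, kcat, km))
    for name, (kcat, km) in enzymes.items()
}

for name, (v_low, v_high) in rates.items():
    kcat, km = enzymes[name]
    print(f"{name:>5}: kcat/Km = {kcat / km:.1f}  "
          f"v(low [S]) = {v_low:.6f}  v(high [S]) = {v_high:.3f}")
```

At low [S] the green and blue rates nearly coincide at about half the red rate, because both have the same kcat/Km. At high [S] red and blue converge on the same maximal rate, while green stays at half of it.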

## Abstract

We make sense of the world through our mental representations or models. They allow us to identify and categorize objects and ideas and shape our views of the world, determining what we consider relevant and valid. Mental models enable reasoning, including clinical reasoning in regard to diagnosis and therapy. Scientific advances in understanding of biologic processes in health and disease have begun to reveal their complexity. Systems biology has embraced this complexity and is recognized as complementary to the reductionist approach to science. The mental models educators impart in their students create the boundaries for what is deemed relevant scientifically and clinically. The successes emanating from the prevailing Western mental model of health and disease, focusing on the individual and the reductionist approach to scientific inquiry, are unquestioned. However, as our understanding of biologic processes has grown, the necessity of a new mental model that encompasses factors external to the individual is evident. The author proposes that a mental model, akin to an ecosystem, with the individual residing at the confluence of their genetic, behavioral, environmental, and microbiota factors be consciously developed in students. Embracing the complexity and interactions of biologic processes within and external to the individual is necessary to continue to advance science and medicine.

## 1 INTRODUCTION

Our understanding of human biology, its normal functioning in health and disruptions thereof resulting in disease, is continually evolving. From a historical perspective, as chronicled by Porter, 1 Western biomedicine has its roots in the ancient Greek approach of focusing on the human body and its workings in health and disease. This is in distinction to other ancient traditions, such as Chinese and Indian, that included associations with the physical and social environment in their understanding of health and disease.

In the ensuing millennia, paradigm-changing breakthroughs in the conceptualization of biomedical processes, often facilitated in the last two centuries by technologic advances, heralded periods of great progress and major advances in the understanding of normal human functioning and disease. For example, in the mid-1800s advances in microscope design and optics enabled the discovery of cells and the advancement of the cell theory, and the discovery of bacteria and the development of Koch's postulates. Indeed, the mid-1800s with its advances in technology, chemistry, and physics ushered in the era of “modern medicine” based on the scientific understanding of human biology. Probing ever deeper from organ-level physiology to molecular biology, scientific discovery has allowed us to explore and understand biologic functions at ever more granular levels. The resulting advances in knowledge emanating from this scientific approach to the study of biologic processes and their perturbations have transformed not only the depth of our biomedical understanding but also our clinical options for the diagnosis and treatment of disease.

While resulting in great advances, this reductionist approach to bioscience has its limitations. As our understanding has grown, we have continually been faced with the realization that biologic systems are far more complex than initially envisioned. As we entered the 21st century, systems biology, seen as the antithesis of reductionism, which embraces an integrative approach to comprehending the complexity of biologic systems, has been gaining recognition as a valued scientific research complement to the dominant reductionist approach. While a singular definition of systems biology remains elusive, the NIH defines it as “an approach in biomedical research to understanding the larger picture—be it at the level of the organism, tissue, or cell by putting its pieces together. It is in stark contrast to decades of reductionist biology, which involves taking the pieces apart.” 2

For the past century, the practice of clinical medicine has followed a similar “reductionist” approach to the treatment of disease. Frequently referred to as the “infectious disease” approach, a single putative causal “agent” is sought for a particular disease. While proving invaluable for problems such as pneumococcal pneumonia in an otherwise healthy individual, where identifying the offending pathogen and treating with appropriate antibiotics results in dramatic salutary effects and a return to health, many modern maladies have proved intractable to this approach. In parallel to the march of science and its dramatic increase in understanding at progressively more granular levels, there has been a proliferation of specialty and subspecialty physicians with deep expertise in increasingly narrow clinical domains. This leads to the all too frequent lament that “I have multiple physicians treating my different parts but no one is treating me!”

But, as in biomedical science, astute physicians observing the course of patients’ diseases have repeatedly voiced concerns about the limitations of the prevailing “scientific approach” to clinical medicine. As early as 1927 Francis Peabody, in a famous lecture at Harvard Medical School, opined, “What is spoken of as a “clinical picture” is not just a photograph of a man sick in bed; it is an impressionistic painting of the patient surrounded by his home, his work, his relations, his friends, his joys, sorrows, hopes, and fears. Now, all of this background of sickness which bears so strongly on the symptomatology is liable to be lost sight of in the hospital.” 3 Half a century later George Engel advocated for what he termed the biopsychosocial approach to medicine, which encompassed the biologic, psychologic, and social cultural aspects of the patient. 4 He argued that an individual's biologic functioning, including disease states, was inexorably linked with psychological and social concerns that must be considered by physicians when providing patient care. And most recently the social determinants of health have been shown to play an important role in the health of an individual as well as in the health of populations and are major contributors to observed health disparities. As defined in the CDC’s Healthy People 2020 report, “Social determinants of health are conditions in the environments in which people are born, live, learn, work, play, worship, and age that affect a wide range of health, functioning, and quality-of-life outcomes and risks. Conditions (e.g., social, economic, and physical) in these various environments and settings (e.g., school, church, workplace, and neighborhood) have been referred to as “place.” In addition to the more material attributes of “place,” the patterns of social engagement and sense of security and well-being are also affected by where people live. Resources that enhance quality of life can have a significant influence on population health outcomes. Examples of these resources include safe and affordable housing, access to education, public safety, availability of healthy foods, local emergency/health services, and environments free of life-threatening toxins.” 5 Despite a century of calls for attention to external factors recognized as influencing the course and development of disease, we are only beginning to understand the complexity of these relationships and decipher causal linkages.

It is now clear that factors often considered external to the individual, and therefore not relevant, are essential contributors to understanding biologic processes that play significant roles in health and disease. While recognizing the enormous value of the Western biomedical tradition focusing on the human body, we must consider the value of the ancient traditions that embraced the physical and social environment as playing important roles in health and disease. It is time for a paradigmatic change in our conceptualization of biologic processes.

If one accepts that there is value in a broadened consideration of factors worthy of study as relevant to human health and disease, one must consider a multitude of barriers. All too often there is almost no communication and collaboration between basic science researchers and researchers investigating the contributions of the array of “social determinants of health.” Generally, the focus of scientists investigating human biology is the individual or model organism in a controlled laboratory setting. Researchers studying the social determinants of health emphasize communities and populations in real-world settings. Different research methodologies, different approaches for determining what are considered significant findings, different professional organizations, and different journals for the dissemination of the research, all contribute to the seeming lack of progress.

Potentially the greatest barrier is our worldview or mental model of what is considered “true science” and the appropriate questions for biologic researchers to study. As individuals, we make sense of the world through our mental representations or models. These mental conceptualizations pervade our daily lives. They allow us to identify and categorize objects, ideas, and more. But these mental models also shape our views of the world and determine what we consider relevant and valid. This is true not only for daily functioning but also for our professional lives. Mental models enable reasoning, including clinical reasoning in regard to diagnosis and therapy. Therefore, their development has been the subject of theoretical and empirical work for decades. 6 Importantly, the mental constructs which form the foundation for reasoning are shared among members of a discipline. “We become acculturated into societies that provide us with a cognitive toolkit of knowledge and ways of using such knowledge. Professional education and training are primarily about socializing students into particular ways of knowing and thinking about the world of practice.” 7 The mental models we consciously or unconsciously impart to our students set boundaries as to what is “in scope” and what is not. For life science educators, laying the foundation for the development of robust mental models is an essential educational outcome, one that is unfortunately very rarely communicated clearly.

It is time for educators to explicitly convey an expanded model that encompasses the seemingly disparate factors that are external to the body, but pertain to human health and disease. While not minimizing the importance of in-depth study of isolated processes, we must inculcate in our students the centrality of understanding how these processes function in an organism and the complex web of interactions in which they exist. Our tendency is to simplify concepts to enhance understanding, but we are doing our students a disservice. The complexity of biologic systems must be embraced. Major advances in science, and subsequently in clinical medicine, will be made when the full panoply of inputs including the genome, proteome, and other -omes, the external physical, social, and psychological environments and behaviors are investigated and facilitated by the use of modern tools such as machine learning. We must provide our students a mental scaffold on which to build their understanding that embraces both the complexity of biological processes and the myriad behavioral factors and external relationships that either directly or indirectly impact biologic systems.

A potentially helpful analogy is that of an ecosystem. An ecosystem is the physical environment and the living species that inhabit it. Ecosystems such as a tidal pool can be small or expansive like the Great Lakes. As indicated in Figure 1, each of us, as human beings, can be thought of as an ecosystem existing at the intersection of our genomic, behavioral, environmental, and microbiota elements. 8 While recognizing that a single factor may be deterministic, such as a dominant genetic disease with 100% penetrance, generally these elements act in concert to influence health and disease.

What is the evidence that this is a timely consideration?

The dramatic increases in our understanding of basic pathophysiology and mechanisms of disease have raised new issues, one being that the body has only a limited number of responses to a multitude of insults. Even our current disease taxonomy needs revision as mechanisms and interactions are elucidated. Let us consider the example of myocardial infarctions (MI), a leading cause of death in the US and increasingly so in developing nations. Due to the morbidity and mortality associated with MIs they have been a focus of study for decades. However, investigations have shown that different underlying pathophysiologic mechanisms can result in MIs. Seeking a clinical definition consistent with the pathologic definition, the Task Force for the Universal Definition of Myocardial Infarction in 2000 first published a consensus statement providing definitions for different types of myocardial infarctions incorporating the pathologic mechanism. Subsequently, three revisions have been published, the most recent the Fourth Universal Definition of Myocardial Infarction (2018). 9 Modified with each revision, five different definitions for myocardial injury and infarction exist. They range from MI type 1, presenting with symptoms of myocardial ischemia, new ECG changes consistent with ischemia including the development of pathologic Q waves, and imaging evidence of new loss of viable myocardium or wall motion abnormality consistent with ischemia and an acute coronary atherothrombosis, to type 5, which is an MI after coronary artery bypass grafting.

Now let us focus only on MI type 1, which is due to an acute coronary atherothrombosis. While the proximate cause for the MI is the atherothrombosis, if progress is to be made in reducing the morbidity and mortality from type 1 MIs we need to move upstream to address the problem of atherosclerosis. For over a century, the lipid hypothesis of atherosclerosis emphasized the central role of cholesterol 10 and, based on clinical studies, led to the development of recommendations to lower cholesterol. 11 In addition to the focus on dietary cholesterol and pharmacologic manipulations (statins) to lower cholesterol levels, it was recognized that other factors also play a role in the development of coronary artery atherosclerosis. Familial hyperlipidemia has long been recognized as leading to premature atherosclerosis and MIs, and recent studies have expanded our understanding of dyslipidemia and the importance of apolipoprotein B100. 12 Similarly, a genetic predisposition to higher serum calcium levels has been associated with an increased risk for coronary artery disease and MI. 13

Epidemiologic studies have shown a correlation between the intake of red meat and the development of atherosclerosis even though the causal mechanisms remained elusive. Recent studies have shown that dietary choline and l-carnitine (found in red meats) are metabolized by intestinal bacteria to produce trimethylamine, which is absorbed into the bloodstream and oxidized in the liver by the enzyme flavin monooxygenase 3 to trimethylamine N-oxide (TMAO), which plays a causal role in the development of coronary artery disease. 14 Interestingly, a long-term study in initially healthy women showed that increases in TMAO, attributed to changing dietary patterns, led to an increased risk for coronary heart disease irrespective of the absolute level. 15

Environmental factors, such as air pollution, have also been implicated in epidemiologic studies of coronary artery disease. In a study of Chinese individuals, long-term exposures to fine particulate matter with aerodynamic diameter less than 2.5 µm and to nitrogen dioxide, due to living in proximity to major roads, were both independently associated with elevated coronary artery calcium scores (a measure of atherosclerosis). It is hypothesized that these pollutants, or others not yet measured, react with airway and lung cellular membranes and generate oxidative reaction products which in turn may have an atherogenic effect. 16

Intriguing, but as yet unexplained, are findings such as the inverse relationship between adult coronary artery calcium scores and favorable psychosocial scores in childhood when adjusted for other known risk factors. The childhood psychosocial factors included in the score are socioeconomic status, home emotional environment, health behaviors of parents, stressful events that might threaten a child's sense of stability and continuity, the child's self-regulatory behavior or self-control, and the child's general level of social adjustment. 17 A related finding is that subjective social status, as reported by an adult individual, is similarly inversely related to coronary artery disease. 18 Subjective social status is an individual's self-perception of their position in the social and socioeconomic hierarchy. It is related to, but has been shown to be independent of, traditional socioeconomic status determinations.

What does all this mean? Is there a common pathophysiologic mechanism, such as inflammation, that is responsible for the initiation and progression of atheroma and atherosclerotic coronary artery disease, or is there a multitude of mechanisms that must be considered? How important are interactions among an individual's environment, behavior, genome, and microbiota? We know that genetic and behavioral factors are independent but additive in their effect on the risk of developing coronary artery disease. 19 We also know that epigenetic patterns are modulated by environmental and behavioral factors and that epigenetics may play an important role in the development of coronary artery disease. 20 These are but a few examples of the complex interrelationships being explored. The questions are many, even in this exceedingly well-studied “disease.” Interventions based on scientific studies that focus on one or just a few factors may well have only a modest or even inconsequential effect on coronary artery disease when applied broadly to individuals.

The literature is replete with examples of effects observed under controlled experimental conditions that are not replicable in real-world settings. Lack of attention to, or inability to account for, behavioral, environmental, genetic, and even microbiota factors may be responsible for some of the irreproducibility. Perhaps such myopia is also responsible for the number of pharmacologic agents that showed great promise in experimental laboratory conditions but failed in human clinical trials, and even for drugs that were approved based upon controlled clinical trials but subsequently withdrawn due to untoward effects observed in post-release follow-up. With the benefit of hindsight, these failures are often explainable. An ecosystem model that consciously incorporates not only the intrinsic biologic factors but also the external factors that directly or indirectly impact the biology will enable investigators to better anticipate and account for important variables.

It is sometimes argued that a simple model is superior to an overly complex one. I would argue that advances in our understanding of the complexities of biologic processes, and of the factors that influence or determine them, necessitate embracing this complexity in our educational endeavors. However, how best to develop the desired mental models remains to be determined. One approach is to build on work depicting complex systems that characterize multiple components at multiple levels interacting with one another, as proposed by Singh et al. 21 Their model consists of a three-level hierarchical tree composed of organs, tissues, and cells, with their interactions within and across levels. The complexity of biologic systems and diseases becomes readily apparent with such a depiction. While it further increases the complexity, we need to add to the causal model the effects of interactions with the environment, one's behaviors, and one's microbiota. It is only through such a model, an ecosystem model, that the “in scope” boundaries will be broadened.

As scientific paradigms continue to evolve, so too must our educational paradigms. Given the rapidity of change in the life sciences and its implications for clinical medicine, the challenge for educators is great. For decades the focus has been on teaching our students “what to think.” Given the ubiquitous access to factual information, educators must now pivot to teaching our learners “how to think.” An important part of this transformation is inculcating appropriate mental models on which current and future knowledge may be built. To best enable our students and trainees to study and unravel the complexities of biologic systems and enhance our collective understanding of health and disease, we must instill an appreciation for systems biology and lay the foundations for a mental model that includes genomic, environmental, and behavioral factors as well as the microbiota. Similarly, our future clinicians must be trained to understand the central role played by these factors in the maintenance of health and the development of disease in their patients. While not diminishing the advances enabled by the ancient Greek tradition of focusing on the body, there is a great deal of wisdom in embracing the Chinese and Indian traditions that recognize the importance of the environment and behaviors.