Creating animal models

I was wondering how a scientist in a lab knows which type of animal model to use. I have been looking at the effect of a protein on a disease, and have thought about the ways I know this has been done in other research:

  • deletion of the gene which encodes the protein
  • insertion of GFP beside the gene for the protein, leaving the gene intact
  • insertion of GFP in place of the gene for the protein, deleting the gene

However, I cannot seem to understand how a scientist decides on a certain model.

If you look at the definition of a model organism as it can be found on the Nature homepage, quite a lot becomes clear:

An organism suitable for studying a specific trait, disease, or phenomenon, due to its short generation time, characterized genome, or similarity to humans; examples are a fly, fish, rodent or pig, whose biology is well known and accessible for laboratory studies.

So you want an organism that:

  • can be bred in large numbers
  • has a sufficiently short generation time, so you can analyze several generations of it
  • is well characterized
  • is sufficiently close to the organism you want to learn more about (mostly humans), so the findings can be transferred
  • can be mutated and has your gene/protein/etc. of interest
  • has been sequenced completely

Besides these general considerations, practical ones also play a role. If you don't have a fish facility on hand, you will most likely not start using zebrafish as your animal model.

There are certainly more requirements; see the references below.


  1. Requirements and selection of an animal model.
  2. Selecting appropriate animal models and strains: Making the best use of research, information and outreach
  3. Basic Principles in Selecting Animal Species for Research Projects

Animal Models

Masayuki Mizui and George C. Tsokos, in The Autoimmune Diseases (Fifth Edition), 2014


Animal models have greatly facilitated the study of systemic autoimmune diseases, notably systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA), and helped to develop rational new treatments. In addition, autoimmunity-prone mice have served as important tools in the study of genes involved in the expression of autoimmunity and related disease. Genes that facilitate or inhibit disease have been identified, and these have in turn facilitated the study of the immunogenetics and immunopathogenesis of human systemic autoimmune diseases. The etiology of both SLE ( Tsokos, 2011 ) and RA ( McInnes and Schett, 2011 ) is heterogeneous and complicated, but animal models bring a consistent understanding of disease pathogenesis. Mouse models of systemic autoimmune disease can be grouped into three types: spontaneous, gene-manipulation derived, and induced.

First step toward testing therapeutics


A patient is diagnosed with uveal melanoma. Thanks to therapeutic advances, enucleation is now a last resort. A multidisciplinary team of ophthalmologists and radiation oncologists develops a plan to treat the patient with targeted radiotherapy. They elect to use a radioactive plaque, a treatment known as brachytherapy, to deliver high-dose radiation to the tumor.

The tumor shrinks significantly. However, the patient develops the serious vision-impairing complication of radiation retinopathy.

Understanding the disease: Why we need an animal model

Radiation retinopathy is a broad term describing a spectrum of retinal changes following radiation exposure.

Classically defined by its vasculopathy, radiation retinopathy typically develops six months to three years following irradiation. It begins with preferential loss of endothelial cells, leading to vessel occlusion, leakage and retinal nonperfusion. As the disease progresses, retinal layers are compromised, further impairing vision. Late-stage radiation retinopathy is characterized by ischemia-induced ocular neovascularization.

Despite an established disease progression, the underlying pathophysiological mechanisms of radiation retinopathy remain unclear.

Treatments include risk-factor modification and other interventions, such as:

  • Limiting the total dose of radiation delivered to the tissue.
  • Intravitreal injection of anti-VEGF and/or corticosteroids.
  • Laser photocoagulation to limit neovascularization.

Modest success has been achieved with these therapies. However, they fail to address the cellular and molecular events leading to radiation retinopathy, and prevention strategies remain limited.

The lack of elucidated mechanisms and dedicated treatment options demonstrates a clear need for more robust research. Similar to other retinal vasculopathies such as diabetic retinopathy, radiation retinopathy research can benefit from the mindful use of an appropriate animal model — a useful tool in the quest to understand disease pathologies.

Making the model

Despite its near 95% effectiveness and use as a first-line treatment in many cancers, episcleral plaque brachytherapy is commonly associated with radiation retinopathy.

Because no brachytherapy-induced radiation retinopathy animal model exists, we sought to establish one. It is our goal to set the stage for more mechanistic studies and eventually test promising therapeutics in a model closest to clinical experience.

Several factors need to be considered when establishing a radiation retinopathy model:

  1. Mode of radiation administration. Though technically challenging, we decided on a radioactive episcleral plaque, as one had not yet been described. Most of the previous models used external beam radiation in the form of x-rays.
  2. Type of ionizing radiation. The emitter used determines the penetrating power and the energy delivered to the tumor and surrounding tissue.
  3. Dose of radiation. Higher doses, measured in gray (Gy), result in greater retinal damage sooner after treatment.
  4. Differences in ocular anatomy between species (Figure 1). Factors such as lens size and vascular architecture may impact the development of retinopathy in different models.

Figure 1. Differences in ocular anatomy among various species must be considered when creating a model of radiation retinopathy.
Image from Ramos MS, Echegaray JJ, Kuhn-Asif S, et al. Animal models of radiation retinopathy – From teletherapy to brachytherapy. Exp Eye Res. 2019 Apr;181:240-251. Copyright 2019, with permission from Elsevier.

To create our model, a 1 mm by 4 mm radioactive iodine-125 seed was surgically implanted posterior to the limbus of the left eye of Lewis rats (Figure 2). The initial dose of radiation treatment lasted six hours, after which the seed was removed. A total dose of 45 Gy at a distance of 1 mm from the seed was delivered. Escalating dosages of radiation will be administered to find the optimal range.

Figure 2. Placement of a 1 mm x 4 mm radioactive Iodine-125 seed surgically implanted posterior to the limbus of the left eye of a Lewis rat.
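As a sanity check on the dosimetry described above, the average dose rate at the prescription point follows directly from the figures given in the text (a simple sketch; the 45 Gy total and six-hour duration are the values stated above):

```python
# Average dose rate at 1 mm from the I-125 seed, using the treatment
# parameters described in the text.
total_dose_gy = 45.0   # total prescribed dose at 1 mm from the seed (Gy)
duration_h = 6.0       # duration of the initial implantation (hours)

dose_rate = total_dose_gy / duration_h
print(f"Average dose rate: {dose_rate:.1f} Gy/h")  # 7.5 Gy/h
```

Escalating the dose, as planned, would change `total_dose_gy` (or the duration) while this relationship stays the same.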

For 12 months, rats will be followed using optical coherence tomography (OCT) and wide-field fluorescein angiography (FA) to monitor the appearance of retinopathy (Figure 3). Based on previous reports using external beam radiation, most animals show signs of retinopathy (i.e., dot hemorrhages, cotton wool spots or retinal thinning) approximately six months after treatment.

Figure 3. An example of wide field fluorescein angiography imagery used to monitor the development of radiation-induced retinopathy in a Lewis rat.

Moving forward

This pilot study is the first attempt at episcleral plaque brachytherapy-induced radiation retinopathy in an animal model. In conjunction with the appropriate model, clinically relevant imaging modalities, such as OCT and wide-field FA, will allow for easier in vivo comparative anatomical assessment and classification of radiation-induced retinopathic changes.

Future investigations can dive deeper into potential pathophysiological mechanisms, such as those involving inflammatory pathways, leukocytes, microglia and apoptosis. These studies will allow for a better understanding of radiation retinopathy and the development of more effective therapies for this disease.

Mr. Ramos is a research technician at Cole Eye Institute. Dr. Yuan is a retina specialist. Dr. Singh is Director of the Department of Ophthalmic Oncology.

Feature image: A color fundus photo of the clinical presentation of a left eye affected by radiation retinopathy shows multiple retinal exudates (yellow material) surrounding the irradiated tumor.

Image from Ramos MS, Echegaray JJ, Kuhn-Asif S, et al. Animal models of radiation retinopathy – From teletherapy to brachytherapy. Exp Eye Res. 2019 Apr;181:240-251. Copyright 2019, with permission from Elsevier.

Mission: data democratization

Like many children around the world, Willian da Silveira dreamt of becoming an astronaut. As he grew up, he instead turned to pharmacy, which took him to biophysics and biochemistry and finally on to bioinformatics. While at the Medical University of South Carolina, however, that old dream from his youth in Brazil came knocking. NASA wasn’t looking for astronauts, but analysts. The space agency was amassing data from model organisms that had flown to space, and they were inviting omics experts and bioinformaticians to take a look.

Da Silveira got a small grant and some mouse liver transcriptomes. Things immediately looked a bit strange – to him, it was as if the mice were diabetic. Da Silveira, now at Queen's University Belfast, and a growing group of collaborators with diverse areas of expertise from institutions around the US dug into more omics data from more mice. It was almost overwhelming, da Silveira recalls, but one detail kept popping up over and over. "A lot of things we were seeing were related to mitochondrial metabolism," he says.

Mitochondria – the powerhouses of the cell, as the saying goes – provide us and all other eukaryotes with energy. Space, it seems, zaps it away by way of mitochondrial dysregulation, signs of which were evident when the teams analyzed and ran simulations based on the murine gene expression data. From the mice, they looked to the astronauts. Sure enough, when they started looking they found signs that mitochondrial function had gone awry in urine and blood samples from 59 astronauts and in the NASA Twins Study dataset, which compared astronaut Scott Kelly to his Earthbound twin Mark.

The results were published last November in Cell 3 (alongside 28 other space biology-related papers and commentaries across Cell Press publications) and suggest mitochondrial stress, which can contribute to insulin resistance, premature aging, and immune issues, is a persistent space phenotype that could be a valuable target to mitigate many different ailments that come from living in microgravity and with increased exposure to radiation.

To da Silveira, it was an impossible dream now come true, thanks to an open science project managed through NASA’s Ames Research Center in California called GeneLab. “I would not be able to be in the field if not for GeneLab,” da Silveira says. “They are making data accessible to anyone in the world.”

Since April 2015, NASA has hosted omics datasets and related data from model organism missions in its GeneLab database (non-omics data, meanwhile, has been archived at the Ames Life Sciences Data Archive (ALSDA), which is currently working to improve its integration with GeneLab 4 ). These include transcriptomes, proteomes, epigenomes, metagenomes, and metabolomes for model systems including plants, microbes, and animals. The data have been generated and shared by PIs as well as produced from in-house analyses of archived tissues from the NASA Space Biology Biospecimen Sharing Program and the NASA Biological Institutional Scientific Collection. As of April 2021, GeneLab contained 316 omics datasets that are entirely free for anyone in the world to download. It's data democratization in action, says Project Manager Sylvain Costes.

Both GeneLab and the ALSDA are FAIR (Findable, Accessible, Interoperable, and Reusable) compliant databases designed to be openly available. Such endeavors usually involve a bit of a culture shift, but PIs involved with space missions understand the value. "GeneLab has found that their PIs are eager to share their data, because they understand that after their original experiment was conducted, those datasets – because they are spaceflight relevant – are absolutely precious," says Ryan Scott, a scientist at Ames working for KBR.

In its early years, those omics data were a bit raw and required some bioinformatics expertise to process and analyze, which can be done in slightly different ways. Standards, however, are taking shape to help anyone – regardless of their background – re-use the omics data captured from space-flown models.

How to Make an Animal Cell for a Science Project

This article was co-authored by Bess Ruff, MA. Bess Ruff is a Geography PhD student at Florida State University. She received her MA in Environmental Science and Management from the University of California, Santa Barbara in 2016. She has conducted survey work for marine spatial planning projects in the Caribbean and provided research support as a graduate fellow for the Sustainable Fisheries Group.



Cells are one of the important building blocks of living organisms. If you're learning biology in school, your teacher might ask you to create your own model of an animal cell to help you understand how cells work. You might also wish to build a model of a cell as part of a science fair. With some simple materials, you can build your own animal cell to help reinforce your knowledge and teach others.

Lead Guest Editor Dr. Nelson S. Yee

Dr. Nelson S. Yee is an Assistant Professor of Medicine in Hematology-Oncology at Pennsylvania State University. He completed his MD and PhD at Cornell University and Memorial Sloan-Kettering Cancer Center, and he has previously worked at University of Pennsylvania and University of Iowa. He now works primarily on ion channels in cancer using zebrafish and mouse models as well as developing therapeutics and biomarkers in patients with malignant diseases. Dr. Yee is the author or co-author of 40 published papers and has presented at 30 conferences, and holds editorial appointments at Clinical Cancer Drugs, Molecular & Cellular Oncology, Annals of Hematology & Oncology, Biomarkers & Diagnosis, International Scholarly Research Notices, Cloning & Transgenesis, Advances in Biology, and Genetic Disorders & Gene Therapy.

Materials and Methods

OMIM Statistics

Statistics for free-text query of OMIM records were obtained on 2/6/2009 (Table 1). Statistics for the number of OMIM gene records with associated phenotypes were obtained by querying OMIM for any gene record (* or +) with a filter selecting records with allelic variant descriptions and/or clinical synopses. Statistics for the percentage of OMIM phenotype/disease records with a known molecular genetic basis were derived from the table of OMIM statistics by dividing the count of records with a “Phenotype description, molecular basis known” by the total number of phenotype records (statistics are as of 8/10/2009).

Selection of Genes/Records for Annotation

Human genes from OMIM were selected first by ranking by those with known and described mutant homologs in Danio rerio and Drosophila melanogaster, then by having the greatest number of detailed descriptions of alleles in OMIM. We selected the following 11 genes to be annotated from their OMIM record: ATP2A1 (108730), EPB41 (130500), EXT2 (608210), EYA1 (601653), FECH (177000), PAX2 (167409), SHH (600725), SOX9 (608160), SOX10 (602229), TNNT2 (191045), and TTN (188840). EYA1, PAX2, SOX9, SOX10, and TTN were selected for recording by three independent curators to test for annotation consistency (to be published elsewhere). Where an OMIM gene record referred to a disease record, the annotators would capture as much general phenotype information about that disease as possible.

Annotation Software and Storage

We write ontology terms prefixed with the name of the ontology; abbreviations are provided at the beginning of this paper. We use ZFA:gut in place of ZFA:0000112 for legibility purposes. The actual computationally parseable form would use the numeric IDs.

All OMIM annotations were created with Phenote [24] software, using the “human” configuration. This included the following ontologies: CL, CHEBI, FMA, GO, and EDHAA for entity selection, and PATO for quality selection. All annotations were recorded with provenance assigned to the PubMed identifier (PMID) for the original publication as listed in the OMIM record. Ontologies were updated daily during annotation, and any annotations to obsolete terms were reconciled prior to analysis. Annotations, together with reference ontologies, that were analyzed for this paper can be found at the stable URL:

Additional Annotation Sources

Additional phenotype annotations were retrieved for cross-species comparison from MGI [33], ZFIN [13], GAD [63], NCBI gene [64], and homologene [65] in September 2008. Ontologies used in the analysis were downloaded from the OBO Foundry repository [66] in August 2008: BP-XP-UBERON (December 2008), ChEBI, CL, DO, DO-XP-FMA, EDHAA, FMA, GO-BP, GO-CC, GO-MF, MA, MP-XP, PATO, SO, UBERON, ZFA, and ZFS. To link cross-species annotations made to species-specific anatomy ontologies (ssAOs), we created an “Uber-ontology,” UBERON, to fill the gap between the general Common Anatomy Reference Ontology (CARO) [67] and the ssAOs. The first version of UBERON was generated automatically by aligning existing ssAOs and anatomical reference ontologies, and then partially manually curated. Ontologies referenced include: FMA, MA, EHDAA, ZFA, TAO, NIF, GAID, CL, XAO, MAT, FBbt, AAO, BILA, WBbt, and CARO. Additional details can be found in [17] and [16]. All ontologies were loaded into OBD, together with the annotations from the sources listed in Table 4.


Reasoning was performed over the combined set of annotations, ontologies, and ontology mappings. We used the OBD RuleBasedReasoner to compute the closure of transitive relations and to compute inferred subsumption relationships between EQ descriptions [28].
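Conceptually, the transitive-closure step resembles the following minimal sketch (the terms and edges are hypothetical illustrations; the actual OBD RuleBasedReasoner also computes inferred subsumption between composed EQ descriptions):

```python
# Minimal sketch of computing the transitive closure of a parent hierarchy,
# so that annotations can be propagated to all subsuming terms.
edges = {                      # child -> set of direct parents (hypothetical)
    "ZFA:gut": {"ZFA:digestive system"},
    "ZFA:digestive system": {"ZFA:anatomical structure"},
}

def ancestors(term, edges):
    """Return every term reachable by following parent links transitively."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        for parent in edges.get(t, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Contains both 'ZFA:digestive system' and 'ZFA:anatomical structure'.
print(ancestors("ZFA:gut", edges))
```

An annotation to ZFA:gut then also counts as an annotation to each of its ancestors, which is what makes the similarity metrics below comparable across levels of granularity.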


The phenotype analysis was performed using the OBD System [28] that implements a number of similarity metrics, described as follows. All similarity metrics are based on the reasoned graph, and annotations are propagated up the subsumption hierarchy.

Most of these metrics use the IC (Equation 1) of a term or EQ phenotype (collectively called a description), which is the negative log of the probability of that description being used to annotate a gene, allele, or genotype (collectively called a feature), where the probability of a description is the number of features annotated with that description over the total number of features in the database (Equation 2):

IC(d) = -log p(d)   (Equation 1)

p(d) = annot(d) / total features   (Equation 2)

Here annot(d) denotes the number of features to which the description applies, after reasoning has been performed. This means that very general descriptions, such as “morphology of anatomical structure,” which subsume many more specific descriptions, are applicable to a greater number of features and thus have a low IC.
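To make the relationship between generality and IC concrete, here is a minimal sketch with invented counts (log base 2 is an assumption; the text specifies only a negative log):

```python
import math

# Hypothetical annotation counts: description -> number of features annotated
# with it (after propagating annotations up the subsumption hierarchy).
annot_counts = {
    "morphology of anatomical structure": 900,  # very general description
    "decreased size of eye": 12,                # much more specific
}
total_features = 1000

def information_content(description):
    # IC(d) = -log2(p(d)), where p(d) is the fraction of features annotated
    # with description d. (Base 2 is an assumption in this sketch.)
    p = annot_counts[description] / total_features
    return -math.log2(p)

# The general description applies to most features, so its IC is near zero;
# the specific one applies to few features, so its IC is high.
print(information_content("morphology of anatomical structure"))  # ~0.152
print(information_content("decreased size of eye"))               # ~6.38
```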


The maxIC is obtained by taking all descriptions shared by a pair of features and finding the description(s) with the highest IC. This may be an exact match, or it may be a subsuming description inferred by the reasoner. One characteristic of the maxIC score is that it can hide the contributions of annotations not in the maxIC set. This score is equivalent to the “maximum” variant of the Resnik similarity, as described in [18].
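A sketch of the maxIC computation, using invented probabilities for the descriptions shared by a hypothetical pair of features:

```python
import math

def ic(p):
    return -math.log2(p)  # negative log probability; base 2 assumed

# Hypothetical: descriptions shared by a pair of features (directly
# annotated or inferred by the reasoner), with annotation probabilities.
shared_descriptions = {
    "abnormal eye morphology": 0.05,        # specific, high IC
    "abnormal anatomical structure": 0.60,  # general, low IC
}

# maxIC keeps only the single most informative shared description...
max_ic = max(ic(p) for p in shared_descriptions.values())
best = max(shared_descriptions, key=lambda d: ic(shared_descriptions[d]))
# ...which is why it can hide the contribution of the more general match.
print(best, round(max_ic, 2))
```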

This metric attempts to match every description directly annotated in one feature with a directly annotated description in the other feature. Each directly annotated description d_i is compared against all the descriptions d'_1, d'_2, … in the other feature being compared. The most specific (highest-scoring) common subsuming description is found, and the unique set of these is called the common subsumers. The ICCS is the average IC of all the common subsumers in this unique set.

This measure is shown in Figure 4, where the center triptych shows the common subsumers. The ICCS metric is described in [28] and has not, to our knowledge, been described previously. It can be considered a composition of the average and maximum Resnik measures as described in [18].
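The ICCS can be sketched the same way (the descriptions, subsumers, and probabilities below are hypothetical):

```python
import math

def ic(p):
    return -math.log2(p)  # base 2 assumed

# For each directly annotated description in one feature, the most specific
# common subsumer found against the other feature, as (name, probability).
best_subsumer = {
    "d1": ("abnormal eye", 0.05),
    "d2": ("abnormal eye", 0.05),   # same subsumer -> counted only once
    "d3": ("abnormal head", 0.10),
}

# Take the UNIQUE set of common subsumers, then average their ICs.
unique = dict(best_subsumer.values())
iccs = sum(ic(p) for p in unique.values()) / len(unique)
print(round(iccs, 2))  # 3.82
```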


Given two phenotypic profiles, for example the phenotypic profiles of two genes, or two genotypes, or the two profiles generated by two curators annotating the same genotype, we can calculate the sum of the IC scores for (a) those phenotype EQ descriptions that are held in common (the intersection) and (b) the combined total set of phenotype EQ descriptions (the union). Looking at the ratio of these two sums (those that are shared versus the totality), we can obtain a measure of how similar the two phenotypic profiles are, with perfectly identical phenotypes having a score of 1. The simIC measure is given in Equation 3:

simIC(p, q) = Σ{d ∈ D_p ∩ D_q} IC(d) / Σ{d ∈ D_p ∪ D_q} IC(d)   (Equation 3)

Here D_p denotes the total set of descriptions that can be applied to p, including subsuming descriptions. As an example, given two genotypes, p and q, the simIC is obtained by dividing the sum of ICs for all descriptions in common by the sum of ICs for all descriptions in the union. Here, descriptions include the actual descriptions used in the profile, and all subsuming descriptions as determined by the reasoner. This metric penalizes profiles that have differing annotations.
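Under the same assumptions as before (invented probabilities, log base 2), the simIC ratio can be sketched as:

```python
import math

def ic(p):
    return -math.log2(p)  # base 2 assumed

prob = {"A": 0.5, "B": 0.1, "C": 0.05, "D": 0.2}  # hypothetical

# Description sets for two genotypes p and q, already expanded by the
# reasoner to include all subsuming descriptions.
d_p = {"A", "B", "C"}
d_q = {"A", "B", "D"}

shared = d_p & d_q   # descriptions held in common
union = d_p | d_q    # combined total set

sim_ic = sum(ic(prob[d]) for d in shared) / sum(ic(prob[d]) for d in union)
# Identical profiles would give 1.0; every non-shared description lowers it.
print(round(sim_ic, 3))
```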

We used one additional similarity metric, the simJ, which does not utilize the IC measures. The simJ between two profiles is the ratio between the number of descriptions in common and the total number of descriptions across both profiles. This is also called the “Jaccard index” or the “Jaccard similarity coefficient.” The number of descriptions in common is called simTO in [18]. The simJ (Equation 4) is a normalized variant of simTO:

simJ(p, q) = |D_p ∩ D_q| / |D_p ∪ D_q|   (Equation 4)
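Because simJ ignores IC, it reduces to a plain set computation (sketch with hypothetical description sets):

```python
# Hypothetical expanded description sets for two profiles.
d_p = {"A", "B", "C"}
d_q = {"A", "B", "D"}

# Jaccard index: shared descriptions over the union of both profiles.
sim_j = len(d_p & d_q) / len(d_p | d_q)
print(sim_j)  # 0.5 (2 shared descriptions out of 4 in the union)
```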

Gene Comparisons

Note that for comparisons between two genes, all annotations made to heterozygous and homozygous genotypes were first propagated to the single (or both, if known) alleles, and then propagated to their gene parent. The genotype annotations used in each query were excluded from the background set in calculating the overall score (Figure 5).

For the allele-to-allele comparisons, we calculated each metric for all pairwise combinations of alleles. Similarity scores between a pair of alleles were sorted into intra-gene (same gene) and inter-gene (different genes) sets, and the mean scores for each gene compared. The significance of the difference between the mean scores for each gene was calculated using a two-tailed Student's t-test.
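The intra- versus inter-gene comparison described above can be sketched as follows (the scores are invented; a real analysis would also take the two-tailed p-value from the t distribution, e.g. with scipy.stats.ttest_ind):

```python
import math
from statistics import mean, variance

# Hypothetical similarity scores for pairs of alleles.
intra_gene = [0.82, 0.75, 0.90, 0.78]   # same-gene allele pairs
inter_gene = [0.40, 0.35, 0.50, 0.42]   # different-gene allele pairs

def t_statistic(a, b):
    """Two-sample Student's t statistic, pooled (equal-variance) form."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

t = t_statistic(intra_gene, inter_gene)
# A large positive t indicates intra-gene pairs score higher on average;
# the two-tailed p-value comes from the t distribution with n1+n2-2 df.
print(round(t, 2))
```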

For the zebrafish shha query, we also compared this gene against all other zebrafish genes (2,908 genes in the total set). For the inter-species queries, we exhaustively compared each gene against all other genes using simJ and then computed all metrics on the top 250.

Why did human trials fail?


On October 26, 2006, at the opening day of the Joint World Congress for Stroke in Cape Town, South Africa, disappointing news spread quickly among the attendees: The second Phase III clinical trial for NXY-059 had failed. The drug, a free-radical spin trap agent for ischemic stroke, had been eagerly anticipated as a successful neuroprotective agent for stroke patients. As the drug developer, AstraZeneca, issued a press release reporting the news, e-mails circulated quickly within the stroke research community, many with the subject line, "Have you heard the bad news?"

"We were optimistic that this would be the new stroke drug," says Marc Fisher, director of the stroke program at the University of Massachusetts Medical Center, who was at the conference in Cape Town. "We were.

To the dismay of clinicians and researchers of acute stroke, the compound showed limited efficacy in neuroprotection versus the placebo. Instead, NXY-059 joined the family of more than a dozen failed neuroprotective agents, including glutamate antagonists, calcium channel blockers, anti-inflammatory agents, GABA agonists, opioid antagonists, growth factors, and drugs of other mechanisms. All had reached Phase III clinical trials and failed miserably at doing what their animal model tests had suggested they would: stop the cascade of necrosis in the event of stroke, and protect the remaining viable brain cells.

That NXY-059 had fallen victim to the same fate was particularly disheartening to a stroke roundtable group that had, in 1999, directly addressed the disconnection between animal models for stroke and their counterpart human trials. The group had devised a set of guidelines whose aim was to standardize the path to stroke therapeutics. During its development, NXY-059 had been its poster child. "This drug was being hailed as the first one to follow the standards," says Sean Savitz, assistant professor of neurology at Harvard Medical School. "But it didn't do that."

"It was very disappointing to all of us to have it fail, and it totally failed," says Sid Gilman, director of the Michigan Alzheimer's Disease Research Center in the Department of Neurology at the University of Michigan, and one of the consultants for AstraZeneca on the Phase III clinical design.

If the outcome of the Stroke Acute Ischemic NXY Treatment (SAINT) trials was an anomaly, investigators might have just shrugged it off. But it's not: Nearly half of all molecular entities that come into development fail, according to Janet Woodcock, deputy commissioner of the Food and Drug Administration. "There's no doubt about the absence of an effect [of NXY-059], and that called into question the many other studies in stroke, and how good are the animal models?" says Gilman. "So many agents appeared to be effective in the animal model and failed in human trials."

Because of these failures, hundreds of millions of dollars, and a potential approach to stroke treatment, have disappeared down the drain. The failure of NXY-059 may have stalled the quest for a neuroprotective agent, at least for some time. "This trial has poisoned stroke studies," says Gilman. "I'm doubtful that investors will want to invest in clinical stroke trial[s] for a while." The fault, it appears, may rest in the slipshod use of animal models.

In 1998, Fisher flew from Boston to Germany to help a drug company, along with academics specializing in animal modeling, to examine two sets of clinical trial results for a new stroke treatment that had failed. They wanted to uncover where they had gone wrong. (Fisher declines to disclose the company and trials that were involved.)

On his return flight, it occurred to Fisher that the chaos the stroke research field had been facing for years might benefit from the kind of meeting he had just attended: industry and academia collaborating to develop standardized practices. The next year, Fisher convened the first Stroke Therapy Academic Industry Roundtable (STAIR) group, which devised a set of recommendations for preclinical and clinical stroke drug development. On the preclinical side, some of the recommendations seemed obvious: the candidate drug should be evaluated in rodents and also in higher animal species; blind testing should be performed; tests should be done in both sexes and in varying ages of animals; and all data, both positive and negative, should be published.

Approximately 26 million animals are used for research each year in the United States and European Union, according to estimates by the Research Defense Society in the United Kingdom. However, the number of animal procedures has been reduced by half over the past 30 years, likely due to stricter controls, improvements in animal welfare, and scientific advances.

Still, unlike in human clinical trials, no best-practice standards exist for animal testing. STAIR is the stroke research community's attempt at standardization. NXY-059 was the first neuroprotective agent to be developed under the auspices of the STAIR guidelines, though the implementation of the guidelines may have been just lip service. In particular, as Savitz wrote in an article published online in Experimental Neurology in May, the preclinical testing had several holes, including a lack of statistical robustness and problems with the way the results were translated into clinical design. 1

The main problems, Savitz writes, were randomization and bias. In the initial evaluation of NXY-059 in rat models of focal ischemia, reports didn't say whether researchers had been blinded with regard to drug administration, behavioral testing, and histologic analysis. The results from the rodent study were mixed, showing a range of reduction of cerebral infarction size over a variety of intervals. However positive the results might have been, Savitz notes, the clear lack of statistical robustness calls any result into question. A subsequent report on the effects of NXY-059 in a rabbit embolic model showed a 35% reduction in infarction after 48 hours, but it did not indicate whether statistical analysis, blind testing, physiologic measurements, blood flow monitoring, or behavior assessments had been done.

AstraZeneca maintains that the preclinical animal tests and the clinical phase of SAINT adhered to the STAIR guidelines: "The design of the SAINT trials was sound and well considered in light of the strong evidence for neuroprotection that existed across the models and species tested at the time," according to a statement sent to The Scientist in response to Savitz's paper. Gilman, also editor-in-chief of Experimental Neurology, says he is not aware of any official response being drafted or submitted by AstraZeneca.

"So many agents appeared to be effective in the animal model and failed in human trials."
- Sid Gilman

The statistical troubles that mired some of the NXY-059 preclinical trials are common in animal models. Surveys of papers based on animal models find errors in about half, according to Michael Festing, a recently retired laboratory animal scientist at the UK Medical Research Council and board member of the National Center for Three Rs (NC3Rs: replacement, refinement, and reduction), an organization that advocates using fewer animals in research and streamlining current animal tests. "Whether those are serious enough that the conclusions are invalid is debatable," Festing says.

Even the innumerable successful cases of animal experimentation that led to effective treatments for high blood pressure, asthma, and transplant rejection, and to the polio, diphtheria, and whooping cough vaccines, were all carried out without standardized testing methods.

"People don't report if studies are randomized," says Ian Roberts, professor of epidemiology at the London School of Hygiene and Tropical Medicine. How animals are selected, or whether assessments were blind, are rarely included in the methods and thus create a potential for bias. "Imagine a cage of 20 rats, and you've got a treatment for some," explains Roberts. "So you stick your hand in a cage, and pull out a rat. The rats that are the most vigorous are hardest to catch, so when you pull out 10 rats, they're the sluggish ones, the tired ones, they're not the same as the ones still in the cage, and they're the control. Immediately there's a difference between the two groups."

The NC3Rs, in cooperation with the National Institutes of Health, is surveying a group of 300 papers, half from the United Kingdom, half from the United States, for their statistical quality in mouse, rat, and primate model studies. Researchers hope that by fall they will have a report describing how well (or not) the studies were randomized and whether they used the correct statistical methods. In an initial pilot study of 12 papers conducted in 2001 for the Medical Research Council, Festing reported: in six of the papers the number of animals used wasn't clear; only two of the papers reported randomization; and only six of the papers specified the sex of the animals tested. (For more on how gender can influence results, see "Why Sex Matters".)

Statistics aren't the only problem. Methodology is arbitrary, replication is lacking, and negative results are often omitted. A report in Academic Emergency Medicine by Vik Bebarta et al. in 2003 showed that animal experiments where randomization and blind testing are not reported are five times more likely to report positive results. 2 In a December 2006 paper in the British Medical Journal, Pablo Perel et al. showed that in six clinical trials for conditions including neonatal respiratory distress syndrome, hemorrhage, and osteoporosis, only three of the trials had corresponding animal studies that agreed with clinical results. 3 The authors attribute this discrepancy to poor methodology (i.e., bias in the animal models) and the failure of the models to mimic the human disease condition.

The difficulties associated with using animal models for human disease result from the metabolic, anatomic, and cellular differences between humans and other creatures, but the problems go even deeper than that.

When experimenting in animals researchers often use incorrect statistical methods, adopt an arbitrary methodology, and fail to publish negative results.

One of the major criticisms of the NXY-059 testing was the lack of correlation between how the effects of the drug were monitored in animals versus in humans. In the rodent model, researchers induced an ischemic event, administered the drug at various time intervals, and measured the size of the infarction. During the clinical trials, however, the drug's effect was evaluated in stroke patients using behavioral indicators such as the modified Rankin scale and NIH stroke severity (NIHSS) scale. In the primate tests the behavior assessments were based on a food-reward system, showing that NXY-059 did not improve left arm weakness in the aftermath of a stroke. "Even if we accept that NXY-059 does improve arm weakness," writes Savitz, "how would such a finding translate to human acute stroke studies that use the modified Rankin scale and NIHSS scores as primary outcome measures?" Indeed, some consider the two phases of testing, from animal to human, so mismatched that only by a statistical fluke was SAINT I, the first clinical trial, deemed a success.

Some say that animal research is best when targeted at specific mechanisms of action. "Animals are better used for understanding disease mechanism and potential new treatments, rather than predicting what will happen in humans," says Simon Festing, executive director of RDS (and son of Michael Festing). RDS is a UK organization that advocates the understanding of animal research in medicine. "The 2001 Nobel Prize in medicine involved sea urchins and yeast, organisms separated from humans by millions of years of evolution," says the younger Festing. "And yet, they are ideal models for studying cell division, research that is being used in cancer therapeutics in humans now."

For specific models of human disease, Simon Festing adds, the farther away from the human species the animal studies get, the less predictive the model will be. For example, researchers studying some conditions, including Parkinson disease, have established a clear animal model. The primate model displays symptoms similar to human symptoms, whereas a mouse model may not be able to show the distinct tremor in the limbs. While this difference ultimately comes down to fundamental anatomic variation among species, it illustrates why finding the best model is inherently difficult.

"The choice of animals is rather narrow," says Michael Festing. "There are 4,000 species of rodents, but we use only three or four of them. Then there's a shortage of anything that's not rodents, and in some cases we're restricted to dogs and cats (which are a problem from the ethical point of view) and primates, also a problem from the ethical point of view. So [choosing the right animal model is] sort of done by default: Eliminate the ones that are not suitable and choose from what's left."

Perhaps because of its abundance and short gestation, the mouse has become the flagship of animal testing, especially useful with genetic modifications, gene knockouts, and knockins. In 2003, NIH launched the Knockout Mouse Project (KOMP) and has awarded more than $50 million with the goal of creating a library of mouse embryonic stem cell lines, each with one gene knocked out.

Nonetheless, even genetically manipulated mice have their problems. The current knockout mouse model for amyotrophic lateral sclerosis (ALS) may be completely wrong, according to John Trojanowski at the University of Pennsylvania School of Medicine. He and colleagues recently showed that two versions of the disease, sporadic and hereditary, are biochemically distinct, and that a different mechanism controls the disease in each case. 4 In hereditary ALS the disease is associated with a mutation in the SOD-1 gene, whereas the sporadic cases are associated with the TDP-43 protein. Until now, research has focused primarily on SOD-1 knockout mice, with virtually no success in human trials. The new findings relating to the TDP-43 protein suggest that the SOD-1 knockout model for ALS could be wrong. "There was this nagging doubt" about the validity of the current models, Trojanowski says. "And there may be a whole new pathology characteristic, so we need models based on TDP-43."

A recent study at the Massachusetts Institute of Technology shows distinct differences between gene regulation in humans and mouse liver, particularly in how the master regulatory proteins function. 5 In a comparison of 4,000 genes in humans and mice, the researchers expected to see identical behavior, that is, the binding of transcription factors to the same sites in most pairs of homologous genes. However, they found that transcription factor binding sites differed between the species in 41% to 89% of the cases.

Many of the underlying limitations associated with mouse models involve the inherent nature of animal testing. The laboratory environment can have a significant effect on test results, as stress is a common factor in caged life. Jeffrey Mogil, a psychology researcher at McGill University in Quebec, demonstrated last year that laboratory mice feel "sympathy pains" for their fellow labmates. In other words, seeing another mouse in distress elevates the amount of distress the onlooker displays. The average researcher, when testing for toxicity effects in mice, for example, likely assumes that they are starting at a pain baseline, when in truth the surrounding environment is not benign and can significantly affect results, Mogil says.


In new research, Mogil's group is demonstrating that the very presence of a lab researcher can alter behavior in mice. "The surprising thing is that these effects are visual, not auditory or olfactory," he says. "It's a huge surprise. Most people think [mice] are mostly blind anyway. I'm being convinced that the visual world of the mouse is a lot richer than expected."

Although the failure of NXY-059 may be one insult too many for clinicians and patients eagerly awaiting a neuroprotective agent, some experts feel that this hurdle is far from being the final chapter. Whether they blame weak animal test standardization, poor clinical design, or inadequate statistical analysis, questions often return to NXY-059 itself as an indicator for the future of neuroprotection. "This drug is known to have antioxidant effects, but it was never shown what its mechanism was on the brain. Early studies were only hinting at possibilities," Savitz says.

In a field where much work concentrates on nitrone-based spin trap agents, NXY-059 became the parent compound. But it's clear that it wasn't the answer. "The drug probably isn't a good drug to begin with," says Myron Ginsberg, professor of neurology and clinician at the University of Miami School of Medicine. Despite NXY-059's disappointing failure, other neuroprotective options are still in the pipeline. Ginsberg is in the early stages of working on albumin as a neuroprotective therapeutic, and researchers are also considering hypothermia as a way of preserving brain cells after ischemic stroke. "The fact that this drug failed," Ginsberg says, "doesn't say anything about the potential for neuroprotection [in the future]."

For further information, please contact

Kasper Kjær-Sørensen, PhD
Department of Molecular Biology and Genetics
Aarhus University - +45 5144 6497

Claus Oxvig, Professor
Department of Molecular Biology and Genetics
Aarhus University - +45 3036 2460

Aage Kristian Olsen Alstrup, PhD, Veterinarian
Department of Clinical Medicine, the PET Centre
Aarhus University - +45 78464396

This article was published in Dyrlægen (The Veterinarian) on 27 February 2017, pp. 26-30, and is reproduced here by permission of the editor of Dyrlægen.


Knocking out the activity of a gene provides information about what that gene normally does. Humans share many genes with mice. Consequently, observing the characteristics of knockout mice gives researchers information that can be used to better understand how a similar gene may cause or contribute to disease in humans.

Examples of research in which knockout mice have been useful include studying and modeling different kinds of cancer, obesity, heart disease, diabetes, arthritis, substance abuse, anxiety, aging and Parkinson's disease. Knockout mice also offer a biological and scientific context in which drugs and other therapies can be developed and tested.

Millions of knockout mice are used in experiments each year. [3]

There are several thousand different strains of knockout mice. [3] Many mouse models are named after the gene that has been inactivated. For example, the p53 knockout mouse is named after the p53 gene, which codes for a protein that normally suppresses the growth of tumours by arresting cell division and/or inducing apoptosis. Humans born with mutations that deactivate the p53 gene suffer from Li-Fraumeni syndrome, a condition that dramatically increases the risk of developing bone cancers, breast cancer and blood cancers at an early age. Other mouse models are named according to their physical characteristics or behaviours.

There are several variations to the procedure of producing knockout mice; the following is a typical example.

  1. The gene to be knocked out is isolated from a mouse gene library. Then a new DNA sequence is engineered which is very similar to the original gene and its immediate neighbour sequence, except that it is changed sufficiently to make the gene inoperable. Usually, the new sequence is also given a marker gene, a gene that normal mice don't have and that confers resistance to a certain toxic agent (e.g., neomycin) or that produces an observable change (e.g. colour or fluorescence). In addition, a second gene, such as herpes tk+, is also included in the construct in order to accomplish a complete selection. Embryonic stem cells are then isolated from a mouse blastocyst (a very young embryo) and grown in vitro. For this example, we will take stem cells from a white mouse.
  2. The new sequence from step 1 is introduced into the stem cells by electroporation. By the natural process of homologous recombination some of the electroporated stem cells will incorporate the new sequence with the knocked-out gene into their chromosomes in place of the original gene. The chances of a successful recombination event are relatively low, so the majority of altered cells will have the new sequence in only one of the two relevant chromosomes – they are said to be heterozygous. Cells that were transformed with a vector containing the neomycin resistance gene and the herpes tk+ gene are grown in a solution containing neomycin and Ganciclovir in order to select for the transformations that occurred via homologous recombination. Cells whose DNA integrated by random insertion will die because they test positive for both the neomycin resistance gene and the herpes tk+ gene, whose gene product reacts with Ganciclovir to produce a deadly toxin. Moreover, cells that do not integrate any of the genetic material test negative for both genes and therefore die as a result of poisoning with neomycin.
  3. The embryonic stem cells that incorporated the knocked-out gene are isolated from the unaltered cells using the marker gene from step 1. For example, the unaltered cells can be killed using a toxic agent to which the altered cells are resistant.
  4. The knocked-out embryonic stem cells from step 3 are inserted into a mouse blastocyst. For this example, we use blastocysts from a grey mouse. The blastocysts now contain two types of stem cells: the original ones (from the grey mouse), and the knocked-out cells (from the white mouse). These blastocysts are then implanted into the uterus of female mice, where they develop. The newborn mice will therefore be chimeras: some parts of their bodies result from the original stem cells, other parts from the knocked-out stem cells. Their fur will show patches of white and grey, with white patches derived from the knocked-out stem cells and grey patches from the recipient blastocyst.
  5. Some of the newborn chimera mice will have gonads derived from knocked-out stem cells, and will therefore produce eggs or sperm containing the knocked-out gene. When these chimera mice are crossbred with others of the wild type, some of their offspring will have one copy of the knocked-out gene in all their cells. These mice do not retain any grey mouse DNA and are not chimeras; however, they are still heterozygous.
  6. When these heterozygous offspring are interbred, some of their offspring will inherit the knocked-out gene from both parents; they carry no functional copy of the original unaltered gene (i.e. they are homozygous for that allele).
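The breeding arithmetic in steps 5 and 6 follows simple Mendelian ratios: a cross of two heterozygotes yields offspring with both copies knocked out in roughly one quarter of cases. A small simulation sketch in Python (the '+'/'-' allele labels and sample sizes are illustrative, not from the procedure above):

```python
import random
from collections import Counter

def cross(parent1, parent2, n_offspring, seed=None):
    """Simulate a cross: each offspring inherits one randomly chosen
    allele from each parent ('+' = wild type, '-' = knocked out)."""
    rng = random.Random(seed)
    return Counter(
        "".join(sorted(rng.choice(parent1) + rng.choice(parent2)))
        for _ in range(n_offspring)
    )

# Step 5: heterozygote ('+-') x wild type ('++') never yields
# homozygous knockouts, only '++' and '+-' offspring.
f1 = cross("+-", "++", 10000, seed=1)
assert "--" not in f1

# Step 6: heterozygote x heterozygote yields homozygous
# knockouts ('--') in roughly a quarter of offspring.
f2 = cross("+-", "+-", 10000, seed=1)
print(f2["--"] / 10000)  # close to 0.25
```

This is why step 6 (interbreeding heterozygotes) is required at all: the step 5 cross alone can never produce a homozygous knockout.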

A detailed explanation of how knockout (KO) mice are created is located at the website of the Nobel Prize in Physiology or Medicine 2007. [4]
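The positive-negative selection described in step 2 can also be viewed as simple Boolean logic: in the neomycin-plus-Ganciclovir medium, only cells that carry the resistance marker but lack herpes tk+ survive. A conceptual sketch (the function and parameter names are illustrative, not from any laboratory protocol):

```python
def survives(has_neo_resistance, has_tk):
    """Positive-negative selection in neomycin + Ganciclovir medium:
    neomycin kills cells lacking the resistance marker (positive
    selection for integration); Ganciclovir kills cells expressing
    herpes tk+ (negative selection against random integration,
    which keeps the flanking tk+ gene)."""
    return has_neo_resistance and not has_tk

# Homologous recombination swaps in neo but drops the flanking tk+ gene.
assert survives(has_neo_resistance=True, has_tk=False)       # targeted cells live
assert not survives(has_neo_resistance=True, has_tk=True)    # random insertions die
assert not survives(has_neo_resistance=False, has_tk=False)  # untransformed cells die
```

The design choice is that each marker eliminates one failure mode, so only the desired homologous-recombination event passes both filters.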

The National Institutes of Health discusses some important limitations of this technique. [5]

While knockout mouse technology represents a valuable research tool, some important limitations exist. About 15 percent of gene knockouts are developmentally lethal, which means that the genetically altered embryos cannot grow into adult mice. This problem is often overcome through the use of conditional mutations. The lack of adult mice limits studies to embryonic development and often makes it more difficult to determine a gene's function in relation to human health. In some instances, the gene may serve a different function in adults than in developing embryos.

Knocking out a gene also may fail to produce an observable change in a mouse or may even produce different characteristics from those observed in humans in which the same gene is inactivated. For example, mutations in the p53 gene are associated with more than half of human cancers and often lead to tumours in a particular set of tissues. However, when the p53 gene is knocked out in mice, the animals develop tumours in a different array of tissues.

There is variability in the whole procedure depending largely on the strain from which the stem cells have been derived. Generally, cells derived from strain 129 are used. This specific strain is not suitable for many experiments (e.g., behavioural), so it is very common to backcross the offspring to other strains. Some genomic loci have proven very difficult to knock out. Reasons might be the presence of repetitive sequences, extensive DNA methylation, or heterochromatin. The confounding presence of neighbouring 129 genes on the knockout segment of genetic material has been dubbed the "flanking-gene effect". [6] Methods and guidelines to deal with this problem have been proposed. [7] [8]

Another limitation is that conventional (i.e. non-conditional) knockout mice develop in the absence of the gene being investigated. At times, loss of activity during development may mask the role of the gene in the adult state, especially if the gene is involved in numerous processes spanning development. Conditional/inducible mutation approaches are then required that first allow the mouse to develop and mature normally prior to ablation of the gene of interest.

Another serious limitation is that knockout models lack the evolutionary adaptations that can arise in wild-type animals after a natural mutation. For instance, erythrocyte-specific coexpression of GLUT1 with stomatin constitutes a compensatory mechanism in mammals that are unable to synthesize vitamin C. [9]