14 Science, mathematics, and computing
14.2 Biological nomenclature
14.2.1 Capitalization of English names
In general contexts the English names of animal or plant species should not be capitalized, unless one of the words is a proper name:
greater spotted woodpecker
In specialized contexts such as field guides and handbooks capitalization is more usual, especially in ornithology. A third style is to capitalize only the first word of the compound. In most contexts, however, the use of lower case is still preferable. This could give rise to ambiguities—a little gull could be a particular species or just ‘a small gull’—but these are avoidable by careful wording.
14.2.2 Structure of taxonomic groups
In descending order, the hierarchy of taxonomic groups is:
kingdom
phylum (in botany, division)
class
order
family
genus
species
All organisms are placed in categories of these ranks: the domesticated cat, for example, is described in full as Carnivora (order), Felidae (family), Felis (genus), Felis catus (species). In addition, intermediate ranks may be added using the prefixes super, sub, and infra, and sub-families may be further divided into tribes. All taxonomic names are Latin in form, though often Greek in origin, except for the individual names of cultivated varieties of plant.
Rules for naming taxonomic groups are published in the following guides: for animals, the International Code of Zoological Nomenclature (ICZN); for wild plants and fungi, the International Code of Botanical Nomenclature (ICBN); for cultivated plants, the International Code of Nomenclature for Cultivated Plants (ICNCP); for bacteria, the International Code of Nomenclature of Bacteria (ICNB). For viruses, the rules are issued by the International Committee on Taxonomy of Viruses (ICTV).
14.2.3 Groups above generic level
Names of groups from kingdom to family are plural and printed in roman with initial capitals (Bacillariophyceae, Carnivora, Curculionidae). The level of the taxon is usually indicated by the ending: the names of botanical or bacteriological families and orders, for example, end in -aceae and -ales, while zoological families and sub-families end in -idae and -inae respectively. (Ligatures such as æ and œ are not now used in printing biological nomenclature.)
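The endings listed above lend themselves to a simple lookup when checking a manuscript. The following is a minimal sketch only: the table contains just the four endings named in the text, and the function name and the keys "botanical" and "zoological" are hypothetical labels chosen for illustration, not terms taken from any code of nomenclature.

```python
from typing import Optional

# Only the endings mentioned in the text; the real codes define many more.
# "botanical" here also covers bacteriological names, as the text groups them.
SUFFIX_RANKS = {
    "botanical": [("aceae", "family"), ("ales", "order")],
    "zoological": [("idae", "family"), ("inae", "subfamily")],
}


def guess_rank(name: str, code: str) -> Optional[str]:
    """Return the rank suggested by a taxon name's ending, or None if unknown."""
    lowered = name.lower()
    for suffix, rank in SUFFIX_RANKS[code]:
        if lowered.endswith(suffix):
            return rank
    return None


print(guess_rank("Curculionidae", "zoological"))
print(guess_rank("Carnivora", "zoological"))
```

A result of None means only that the ending is not in this small table, as with Carnivora; it does not mean the name is malformed.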
14.2.4 The binomial system
Living organisms are classified by genus and species according to the system originally devised by Linnaeus. This two-part name—called the binomial or binomen—is printed in italic, and usually consists of the capitalized name of a genus followed by the lower-case specific name. Thus the forget-me-not is Myosotis alpestris, with Myosotis as its generic name and alpestris as its specific epithet; similarly the bottlenose dolphin is Tursiops (generic name) truncatus (specific name). Specific names are not capitalized even when derived from a person’s name: Clossiana freija nabokovi (Nabokov’s fritillary), Gazella thomsoni (Thomson’s gazelle).
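The capitalization rule for binomials can be applied mechanically to plain-text names. The sketch below is illustrative and rests on some assumptions: the function name is hypothetical, the input is a non-empty string holding only the Latin terms (no authority, rank abbreviation, or hybrid sign), and italics are outside the scope of plain text.

```python
def normalize_binomial(name: str) -> str:
    """Capitalize the generic name and lower-case every term that follows it."""
    genus, *epithets = name.split()
    # Specific and subspecific names stay lower case even when they
    # commemorate a person, so they are lower-cased unconditionally.
    return " ".join([genus.capitalize(), *(term.lower() for term in epithets)])


print(normalize_binomial("gazella Thomsoni"))
```

Because every later term is lower-cased, the function must not be given a name that includes an authority such as ‘Huds.’, which keeps its capital.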
A genus name is printed in italic with an initial capital when used alone to refer to the genus. If, however, it has also become a common term in English for the organism concerned it is printed in roman and lower case (rhododendron, dahlia, tradescantia, stegosaurus); thus ‘Rhododendron is a widespread genus’ but ‘the rhododendron is a common plant’. Specific epithets are never used in isolation except in the rare cases where they have become popular names (japonica), when they are printed in roman.
Latin binomials or generic names alone may be followed by the surname of the person who first classified the organism. These surnames and their standardized abbreviations are called ‘authorities’ and are printed in roman with an initial capital: ‘Primula vulgaris Huds.’ shows that this name for the primrose was first used by William Hudson; ‘Homo sapiens L.’ shows that Linnaeus was the first to use this specific name for human beings. If a species is transferred to a different genus, the authority will be printed in parentheses. For example, the greenfinch, Carduelis chloris (L.), was described by Linnaeus but placed by him in the genus Loxia.
After the first full mention of a species, later references may be shortened by abbreviating the generic name to its initial capital, followed by a full point: P. vulgaris, E. caballus. If two organisms with the same single-letter abbreviation occur in the same sentence or paragraph, a multiple-letter abbreviation is sometimes used (Staphylococcus and Streptococcus may become Staph. and Strep., for example). This is best avoided: there are very few ‘official’ multiple-letter abbreviations, so publications can be inconsistent with one another. The specific names are usually sufficient to avoid confusion; where they are not, give the generic name in full.
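The abbreviation rule, and the check for genera that share an initial, might be sketched as follows. Both function names are hypothetical, the input is assumed to be a non-empty plain-text name, and the two species names in the usage lines (aureus, pyogenes) are supplied only to make the example concrete.

```python
from typing import Dict, List, Set


def abbreviate_binomial(binomial: str) -> str:
    """Shorten 'Primula vulgaris' to 'P. vulgaris' for use after first mention."""
    genus, *rest = binomial.split()
    if not rest:
        # A generic name standing alone is left in full.
        return binomial
    return f"{genus[0]}. {' '.join(rest)}"


def clashing_genera(binomials: List[str]) -> Dict[str, Set[str]]:
    """Group generic names by initial and keep only initials shared by several genera."""
    by_initial: Dict[str, Set[str]] = {}
    for binomial in binomials:
        genus = binomial.split()[0]
        by_initial.setdefault(genus[0], set()).add(genus)
    return {
        initial: genera
        for initial, genera in by_initial.items()
        if len(genera) > 1
    }


print(abbreviate_binomial("Primula vulgaris"))
print(clashing_genera(["Staphylococcus aureus", "Streptococcus pyogenes"]))
```

Where clashing_genera reports a shared initial, the guidance above applies: rely on the specific names or give the generic names in full rather than inventing a longer abbreviation.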
14.2.6 Subspecies and hybrids
Names of animal subspecies have a third term added in italic to the binomial, for example Motacilla alba alba (white wagtail), M. alba yarrelli (pied wagtail). Plant categories below the species level may also have a third term added to their names, but only after an abbreviated form of a word indicating their rank, which is printed in roman:
subspecies (Latin subspecies, abbreviation subsp.)
variety (Latin varietas, abbreviation var.)
subvariety (Latin subvarietas, abbreviation subvar.)
form (Latin forma, abbreviation f.)
subform (Latin subforma, abbreviation subf.)
So ‘Salix repens var. fusca’ indicates a variety of the creeping willow, and ‘Myrtus communis subsp. tarentina’ a subspecies of the common myrtle.
Other abbreviations are occasionally printed in roman after Latin names, such as ‘agg.’ for an aggregate species, ‘sp.’ (plural ‘spp.’) after a genus name for an unidentified species, ‘gen. nov.’ or ‘sp. nov.’ indicating a newly described genus or species, and ‘auctt.’ indicating a name used by many authors but without authority.
The names attached to cultivated varieties of plants follow the binomial, printed in roman within single quotation marks (Rosa wichuraiana ‘Dorothy Perkins’). The cultivar name may be preceded by the abbreviation ‘cv.’, in which case the quotation marks are not used (Rosa wichuraiana cv. Dorothy Perkins). The names of cultivated varieties may also appear after variety or subspecies names, or after a genus name alone: for example, the ornamental maple Acer palmatum var. heptalobum ‘Rubrum’, and the rose Rosa ‘Queen Elizabeth’. Names of hybrid plants are indicated by a roman multiplication sign (×): Cytisus × kewensis is a hybrid species, × Odontonia is a hybrid genus. Horticultural graft hybrids are indicated by a plus sign (+Laburnocytisus adami).
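The two ways of attaching a cultivar name can be expressed as a small formatting helper. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the output is plain text (so the italic of the Latin name is not represented), and single curly quotation marks are used to match the examples above.

```python
def format_cultivar(taxon: str, cultivar: str, use_cv: bool = False) -> str:
    """Attach a cultivar name to a Latin name in one of the two accepted styles."""
    if use_cv:
        # With 'cv.' the quotation marks are dropped.
        return f"{taxon} cv. {cultivar}"
    # Otherwise the cultivar name goes in single quotation marks.
    return f"{taxon} \u2018{cultivar}\u2019"


print(format_cultivar("Rosa wichuraiana", "Dorothy Perkins"))
print(format_cultivar("Rosa wichuraiana", "Dorothy Perkins", use_cv=True))
```

The taxon argument may equally be a genus alone or a name that already carries a variety, as in the Acer and Rosa examples.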
14.2.7 Bacteria and viruses
Genera used adjectivally are lower case roman (salmonella poisoning, streptococcal infection); plural noun forms can add -s or take the Latin form (rickettsias, salmonellae, staphylococci). Bacterial strains are usually designated by capitals and numbers in roman (Escherichia coli K12) but consult the relevant authority as nomenclature changes.
The International Committee on Taxonomy of Viruses (ICTV) has developed a system of classifying and naming viruses. The ranks employed for animal, fungal, and bacterial viruses are order, family, subfamily, genus, and species. Wherever possible, Latinized names are used for the taxa; hence names of genera end in the suffix -virus, subfamilies end in -virinae, families end in -viridae, and orders end in -virales. Latinized specific epithets are not used, so binomial nomenclature does not obtain. The ICTV recommends italicizing and capitalizing the first word of all Latinized names when used in a taxonomic sense, but not in vernacular usage.
taxonomic: a newly acquired virulence of several bee viruses belonging to the family Dicistroviridae has been observed worldwide
vernacular: measuring the impact of herpes zoster and post-herpetic neuralgia on quality of life
However, some publishers (including OUP) do not follow this guidance and italicize only genera and species, so check which style is used. Those genera or higher groups that do not yet have approved Latinized names are referred to by their English vernacular names. Capitalization is used only for proper nouns:
We have focused our interest on acute bee paralysis virus, which shares some antigenic and sequence similarities with Kashmir bee virus and Israeli acute paralysis virus.
The ranks of genus and species are not used in the taxonomy of plant viruses, which are classified in groups—not families—with the approved group name ending in -virus. Existing names employ various combinations of Roman or Greek letters, Arabic or Roman numerals, and superscript and subscript characters. Many names are prefixed with a capital P or lower-case phi (PM2, ɸ6, ɸX, Pf1). It is important therefore to follow carefully the conventions employed in the original name.
14.2.8 Enzymes
Enzyme nomenclature has several forms, depending on context. Most enzyme trivial names are based on the name of the type of reaction they catalyse and the name of the substrate or product they are associated with. Most end in -ase, though some do not.
A systematic nomenclature has been devised by the International Union of Biochemistry and Molecular Biology. Each enzyme has a systematic name incorporating its type designation (i.e. group name), the name of its substrate(s), and a unique four-part numerical designation called the EC (Enzyme Commission) number. For example, the systematic name of glutamate dehydrogenase is L-glutamate:NAD+ oxidoreductase (deaminating), EC 1.4.1.2.
Because systematic names are so unwieldy, they tend to supplement or clarify trivial names rather than replace them. In most contexts trivial names alone suffice, though the EC number and full systematic name should follow at first occurrence in formal usage.
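An EC number consists of four numbers separated by full points, which makes it easy to validate in a manuscript. The following sketch assumes the simple all-numeric form with the prefix ‘EC’ and a single space; the pattern and function name are hypothetical, and provisional or incomplete designations are deliberately rejected. The usage line takes EC 1.4.1.2, the number assigned to NAD+-dependent glutamate dehydrogenase.

```python
import re
from typing import Tuple

# Assumed form: 'EC' + space + four dot-separated integers.
EC_PATTERN = re.compile(r"^EC (\d+)\.(\d+)\.(\d+)\.(\d+)$")


def parse_ec_number(text: str) -> Tuple[int, ...]:
    """Split an EC number into its four numeric parts, or raise ValueError."""
    match = EC_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a four-part EC number: {text!r}")
    return tuple(int(part) for part in match.groups())


print(parse_ec_number("EC 1.4.1.2"))
```

A ValueError from this check is a prompt to look the enzyme up in the IUBMB list, not proof that the designation is wrong.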
Although restriction endonucleases also have EC classifications, they have additional nomenclature that is based on the genus, species, and strain from which they are derived (such as BamHI, the first such enzyme isolated from Bacillus amyloliquefaciens strain H).
14.2.9 Genes and chromosomes
Conventions for naming genes and gene products (the proteins they code for) vary widely among species or type of organism, and also change rapidly. Authors and editors are advised to check the website of the relevant nomenclature committee or other authority for current guidelines—many organisms now have dedicated online databases containing a searchable catalogue of genes with annotated gene sequence data, plus descriptions of gene functions. The New Oxford Dictionary for Scientific Writers and Editors has a helpful summary of the basic rules of gene nomenclature for selected groups of organisms or species.
The following is general guidance only. There are many exceptions, so editors with limited knowledge of the subject should not be tempted to standardize capitalization, hyphenation, or italicization, for example, without checking the relevant authority. As an example, acute myeloid leukaemia genes in humans and mice are AML1 and Aml1 (italic), respectively, and their protein products are AML1 and Aml1 (roman).
• Gene names and their more commonly used symbols are usually italicized; most symbols are given three letters (such as CDH1 for cadherin 1), although older symbols may differ (w for the white gene in white-eyed fruit flies Drosophila for example).
• For dominant traits, the name and symbol generally begin with a capital letter, whereas for recessive traits the initial letter is lower case (in baker’s yeast Saccharomyces cerevisiae, for example, CUP1, arg2).
• Plus and minus signs are used in some organisms to indicate normal (wild type) or mutant forms, such as the cyc+ Cyclops wild-type gene in zebrafish; others use all capitals (AGAMOUS-LIKE, symbol AGL) for wild types and lower case for mutants (accelerated cell death, acd) in wall cress Arabidopsis thaliana.
• Protein products usually follow the corresponding gene symbol but are set roman (for example, the protein product of the trpA gene required for tryptophan synthesis in E. coli is TrpA).
• Human gene names are roman but the symbols are italic capitals and Arabic numerals (G-protein coupled receptor, GPR1). The Gene Nomenclature Committee of the Human Genome Organization (HUGO) is responsible for assigning gene symbols; its website, genenames.org, is a database of approved nomenclature and associated resources. For the major histocompatibility complex see 14.3.5.
Each community also has its own convention for designation of genetic markers and transgenic strains.
Chromosomes are designated by a roman letter or number (in humans 1 to 22, X, Y); different regions and bands on the short (p) or long (q) chromosome arms are specified with Arabic numbers and letters, in roman, for example 14q32.2. Duplicated, deleted, or translocated regions are designated dup, del, or t respectively; for example t(9;22)(q34;q11.2) refers to a reciprocal translocation between bands 34 and 11.2 of the long arms of chromosomes 9 and 22. Gain or loss of a chromosome is indicated by a plus or minus sign, so +21 indicates an extra chromosome 21. The length of DNA sequences is cited in base pairs (bp), kilobases (kb), or megabases (Mb); the centiMorgan (cM) is a measure of genetic distance and relates to recombination frequency.
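A band designation such as 14q32.2 has three parts: the chromosome, the arm, and the band. The sketch below splits one apart. It assumes human chromosomes only (1 to 22, X, Y) and a single band per string; the pattern and function name are hypothetical, and compound descriptions such as translocations are not handled.

```python
import re
from typing import Optional, Tuple

# Chromosome (1-22, X or Y), arm (p = short, q = long), then band number.
BAND_PATTERN = re.compile(
    r"^(?P<chromosome>[1-9]|1\d|2[0-2]|X|Y)"
    r"(?P<arm>[pq])"
    r"(?P<band>\d+(?:\.\d+)?)$"
)


def parse_band(text: str) -> Optional[Tuple[str, str, str]]:
    """Return (chromosome, arm, band) for a human band designation, else None."""
    match = BAND_PATTERN.match(text)
    if match is None:
        return None
    return match.group("chromosome"), match.group("arm"), match.group("band")


print(parse_band("14q32.2"))
```

The parts are returned as strings so that a band such as 32.2 is kept exactly as written and is not converted to a floating-point number.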