Latin nonsense may be used to label bacteria as scientists can’t keep up with naming new species

A bacteria sample - PASIEKA/Science Photo Library RM

Latin nonsense could be used to label tens of thousands of new bacteria species, as scientists cannot think of enough names to keep up with the rate of discovery.

The ancient language has been the foundation of scientific nomenclature since the time of Carl Linnaeus, who invented the binomial system more than 250 years ago.

Still used today, the system gives each organism a two-word name denoting its genus and species, with both words italicised and the first capitalised.

But scientists are now struggling to keep up with the many thousands of new species being discovered, especially of bacteria. The traditional method has left a backlog of 50,000 species, which are given automatic placeholder labels made of random numbers and letters until a proper name is found. At the current rate, the backlog would take 50 years to clear.

But while conventional Latin labels usually have a meaning relevant to the species - Homo sapiens means "wise man", for instance - the new examples include Dupisella tifacia for a bacterium from the sheep gut, Hopelia gocarosa for a bacterium living in Swedish groundwater and Saxicetta apufaria for one from a Russian salt lake.

Nonsense Latin

The Latin-sounding names have no actual meaning, and have been generated automatically by an AI-powered computer system built by researchers at the Quadram Institute in Norwich. The computer system, published in the International Journal of Systematic and Evolutionary Microbiology, was taught the grammatical rules and syntax of the classical language and tasked with putting letters together in faux-Latin.

Engineers enforce a handful of rules to ensure the algorithm produces no swear words or offensive terms in any language, but aside from this check the system is instantaneous.

"The front ends of the names are just strings of letters put together reflecting the phonetic rules of Latin, whereas the back ends of the words are suffixes used in Latin that mean the names can be defined as Latin nouns," study author Prof Mark Pallen, Group Leader at the Quadram Institute and Professor of Microbial Genomics at the University of East Anglia, told The Telegraph.

The team's Latin-faking system has been described by its creators as "radical", but they say it maintains the familiarity and gravitas of Latin, even if the names themselves lack any meaningful pedigree.
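
As a rough illustration of the approach Prof Pallen describes, the Python sketch below glues a random, pronounceable "front end" onto a fixed "back end". It is not the Quadram Institute's published system. The letter inventories, the endings (guessed from the example names quoted earlier in this article), the one-word blocklist, the uniqueness check and every function name are assumptions made purely for illustration.

```python
import random

# Hypothetical inventories, chosen for illustration only.
CONSONANTS = ["b", "c", "d", "f", "g", "l", "m", "n", "p", "r", "s", "t", "v"]
VOWELS = ["a", "e", "i", "o", "u"]

# Endings guessed from the article's examples (Dupisella, Hopelia, Saxicetta;
# tifacia, gocarosa, apufaria). The real system's suffix list is not given here.
GENUS_SUFFIXES = ["ella", "ia", "etta"]
SPECIES_SUFFIXES = ["ia", "osa", "aria"]

# Stand-in for a real multilingual list of offensive terms.
BLOCKLIST = {"bad"}


def make_stem(rng, syllables=2):
    """Build a stem of consonant-vowel syllables that ends in a consonant."""
    body = "".join(rng.choice(CONSONANTS) + rng.choice(VOWELS)
                   for _ in range(syllables))
    return body + rng.choice(CONSONANTS)


def is_clean(word):
    """Return True if no blocklisted term appears anywhere in the word."""
    return not any(term in word for term in BLOCKLIST)


def make_name(rng, used):
    """Return a new two-word name: capitalised genus, lower-case species."""
    while True:
        genus = make_stem(rng) + rng.choice(GENUS_SUFFIXES)
        species = make_stem(rng) + rng.choice(SPECIES_SUFFIXES)
        name = f"{genus.capitalize()} {species}"
        # Retry if the name was already issued or fails the blocklist check.
        if name not in used and is_clean(genus) and is_clean(species):
            used.add(name)
            return name


if __name__ == "__main__":
    rng = random.Random()
    issued = set()
    for _ in range(5):
        print(make_name(rng, issued))
```

Because each name is assembled from a few random choices rather than researched, a generator of this kind can turn out names in bulk, which is the property the researchers are relying on.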

'Just the first step'

"I was a member of the working group that gave us Greek letters for Covid variants, which were rapidly adopted by the scientific community," Prof Pallen said. "I hope that the names proposed here are also rapidly adopted and used widely. This is just the first step. The age of microbial discovery is far from over, but it will be easy to create future names en masse using the principles we have established here."

The system was built specifically to provide names for bacteria that have been discovered by scientists but not yet grown in a lab, but Prof Pallen says it could apply to any other organisms, including plants and mammals.

"The system could easily be scaled so that, for example, one could add arbitrary components to endings like 'saurus' for, say, reptiles," he said.

The concept of using fake Latin names with no meaning has puzzled some experts.

Stephen Hunt, a Latin language expert at the University of Cambridge who is also the Editor of The Journal of Classics Teaching (JCT), said the system is "a bit weird".

One way forward

"I guess if you’re a scientist you need to name them - although surely in ‘big data’ they use number-letter codes in reality?

"Clearly identifying each bacterium by characteristics for which you can then sort out a name is incredibly time consuming. So maybe Latin created by computer is the way forward - though I feel that the Latin becomes just as much a set of numbers and words as a code would be.

"I also wonder if assigning Latin names at random also goes against the idea of categorisation. So, weird. Unnecessary in my view - just give them a code! But then, maybe the Latin is the code."