INTRODUCTION
Taxonomy has been a focus of debate since the nineteenth century, and
the recognition of taxonomic research remains a subject of discussion
(Packer et al., 2018, Zeppelini et al., 2021). The global diversity
crisis exposes the urgent need for investment in taxonomy to reveal the
largely unknown species diversity. Taking Collembola as an example, only
about 20% of the estimated diversity is known (Hopkin 1997). With
between 100 and 120 new species described each year, it would take
taxonomists more than 400 years to uncover and describe the remaining
unknown species (Potapov et al., 2020). To understand the
diversification processes in Collembola, we need to speed up the rate of
species description. This is a matter of concern in every area of
entomology and, to some extent, in zoology as a whole.
Collembola Lubbock, 1870 are minute wingless arthropods, basal hexapods
found in every terrestrial habitat on the planet, including soil, leaf
litter, tree canopies and caves (Bellinger et al., 1996-2023, Hopkin
1997). There are about 9000 described species, and their diversity is
greatly underestimated and poorly known (Bellinger et al., 1996-2023,
Hopkin 1997). They play an important role in food webs and in global
metabolism (Bardgett & van der Putten 2014, Filser et al., 2016, Potapov
et al., 2023, Rusek 1998).
As in many other groups of meso- and microfauna, Collembola taxonomy is
largely based on morphological analysis, that is, on observing and
describing discrete variation in diagnostic characters. The most
abundant morphological source of information for species definition in
Collembola is the number, distribution, and shape of cuticular chaetae,
known as chaetotaxy. Current morphological approaches to inferring
homology, namely the chaetotaxic systems used to identify chaetae, leave
considerable room for subjectivity, depending on what is visible under
an optical microscope, and different chaetotaxy systems are often hardly
comparable (Betsch 1997, Betsch & Waller 1994, Bretfeld 1990, 1999,
Potapov et al., 2020). The challenges and perspectives for Collembola
taxonomy have been discussed in detail, and the need for an integrative
taxonomy and for international efforts to direct financial support and
recognition of expertise towards facing the global biodiversity crisis
has also been a focus of debate (Potapov et al., 2020, Zeppelini et al.,
2021).
Recent technologies such as high-resolution imaging, molecular
sequencing and machine learning are expected to contribute greatly to
taxonomic techniques that improve the recognition of new and known taxa
(Potapov et al., 2020). Integrative taxonomy, which combines
morphological and molecular data to define species limits, is likely to
become a trend for most taxonomic groups, not only Collembola.
There is, however, a particular aspect of Collembola (and of nearly
every taxon of the microfauna) that, in many if not most cases, affects
the viability of including molecular sequences in new species
descriptions. It is largely a logistical problem, but often there is no
alternative. Almost all new species are discovered under the light
microscope, which means that the specimen has been mounted on a slide
after being cleared by one of several chemical treatments, which destroy
the tissues and, consequently, the genetic material.
It is only after taxonomic identification that a species is recognized
as new to science or undescribed. More often than not, the material
analyzed is a limited set of specimens, and no material remains
available for molecular analysis after the taxonomic identification and
morphological study. Even where molecular analysis facilities are
available, the biological specimens needed for sequencing may often
become available only in the future, after the species has been
described. Even when Scanning Electron Microscopy (SEM) is possible,
depending on the structure it is hard to obtain images of all diagnostic
features, and light microscopy may be needed as well. Nevertheless,
high-resolution imaging and molecular data are powerful tools and may be
indispensable for accurate taxonomic research and species delimitation.
Therefore, morphological descriptions must be dynamic, open to easy
amendment and to the insertion of additional data. Furthermore, they
must be presented in an interchangeable language, to allow information
to flow across different disciplines.
Among all methods applied to the study of external morphology in
Collembola, chaetotaxy is certainly the most complex and extensively
detailed (Betsch & Waller 1994, Cassagnau 1974, Deharveng 1983,
Fjellberg 1999, Jordana & Baquero 2005, Nayrolles 1988, 1990a, 1990b,
Potapov 2001, Szeptycki 1972, 1979, Yosii 1960). Many chaetae and groups
of chaetae vary in position and shape in ways that allow a large number
of homology inferences. However, the most advanced approaches are also
very complex, which makes interpretation difficult and increases
ambiguity. These aspects restrict in-depth taxonomic research to small
groups of experts and hinder comparative studies, even among different
orders of Collembola. In addition, traditional descriptive texts with
morphological and chaetotaxic information are difficult to integrate
with machine learning and other computational tools, which could greatly
speed up phylogenetic analysis, big data comparison, biogeography, and
their various applications (Potapov et al., 2020).
Despite all advances in instruments and methods, taxonomic descriptions
are still written in essentially the same format as two centuries ago,
in a hermetic language that produces texts nearly incomprehensible to
non-experts. This is often a greater barrier to communication among
different areas of science than access to high-tech equipment and
analytical facilities.
A coded and illustrated description of new species, which can be easily
imported, transformed, amended, corrected, or expanded, is proposed here
as an alternative to the traditional descriptive taxonomic method.
The strength of the coded description is that new characters, whether
morphological, molecular, or ecological, can be easily added to the
list, improving the descriptive matrix as new information is produced.
These matrices can be uploaded to public libraries, kept up to date with
all available information about the species, and linked to databases
such as GBIF and ZooBank and to electronic taxonomic catalogs available
in different parts of the world, e.g.,
fauna.jbrj.gov.br/fauna/listaBrasil (Zeppelini et al., 2023) and
www.collembola.org (Bellinger et al., 1996-2023).
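To make the idea of an amendable coded description more concrete, the
following is a minimal sketch in Python, not the coding scheme proposed
in this work. The class name, the character codes (C001, M001), the
field names, and all values are hypothetical placeholders chosen only
for illustration. It shows how a list of coded characters might be
stored, amended when new data (for example, a molecular sequence)
become available, exchanged as plain text, and assembled into a
taxon-by-character matrix.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CodedDescription:
    """Hypothetical container for the coded characters of one taxon."""
    taxon: str
    # character code -> {"state": ..., "kind": ..., "source": ...}
    characters: dict = field(default_factory=dict)

    def set_character(self, code, state, kind, source):
        """Add a new character, or amend an existing one in place."""
        self.characters[code] = {"state": state, "kind": kind, "source": source}

    def to_json(self):
        """Serialize to plain text so that other tools can import it."""
        payload = {"taxon": self.taxon, "characters": self.characters}
        return json.dumps(payload, sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text):
        """Rebuild a description from its serialized form."""
        data = json.loads(text)
        return cls(taxon=data["taxon"], characters=data["characters"])


def to_matrix(descriptions):
    """Build a taxon-by-character matrix; '?' marks unscored characters."""
    codes = sorted({code for d in descriptions for code in d.characters})
    rows = {
        d.taxon: [
            str(d.characters[code]["state"]) if code in d.characters else "?"
            for code in codes
        ]
        for d in descriptions
    }
    return codes, rows


if __name__ == "__main__":
    # Placeholder taxa and values, for illustration only.
    sp_a = CodedDescription("Species A")
    sp_a.set_character("C001", 3, "morphological", "light microscopy")
    sp_b = CodedDescription("Species B")
    sp_b.set_character("C001", 2, "morphological", "light microscopy")

    # Later amendment: a molecular character is added once material exists.
    sp_a.set_character("M001", "present", "molecular", "sequencing")

    codes, rows = to_matrix([sp_a, sp_b])
    print(codes)
    for taxon, states in rows.items():
        print(taxon, states)
```

In this sketch, a character that has not yet been scored for a taxon
simply appears as "?" in the matrix, so a description published from
slide-mounted material alone can later be extended with molecular or
ecological characters without rewriting the existing entries.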