Species misidentification affects biodiversity metrics: Dealing with this issue using the new R package naturaList

Published in Ecological Informatics

By Rodrigues A. V., Nakamura G., Steggemeier V. G. and Duarte L.D.S. in Macroecology Bioinformatics

March 14, 2022

Abstract

Biodiversity databases are increasingly available and have fostered accelerated advances in many disciplines within ecology and evolution. However, the quality of the evidence generated depends critically on the quality of the input data, and species misidentifications are present in virtually any occurrence dataset. Yet, the lack of automatized tools makes the assessment of the quality of species identification in big datasets time-consuming, which often induces researchers to assume that all species are reliably identified. In this study, we address this issue by evaluating how species misidentification can impact our ability to capture ecological patterns, and by presenting an R package, called naturaList, designed to classify species occurrence data according to identification reliability. naturaList allows the classification of species occurrences up to six confidence levels, in which the highest level is assigned to records identified by specialists. We obtained a list of specialists by using the species occurrence dataset itself, based on the identifier names within it, and by entering an independent list, obtained by contacting experts. Further, we evaluate the effects of filtering out occurrence records not identified by specialists on the estimations of species niche and diversity patterns. We used the tribe Myrteae (Myrtaceae) as a study model, which is a species-rich group in Central and South America and with challenging taxonomy. We found a significant change in species niche in 13% of species when using only occurrences identified by specialists. We found changes in patterns of alpha diversity in four genera and changes in beta diversity in all genera analyzed. We show how the uncertainty in species identification in occurrence datasets affects conclusions on macroecological patterns by generating bias or noise in different aspects of macroecological patterns (niche, alpha, and beta diversity). Therefore, to guarantee reliability in species identification in big data sets we recommend the use of automated tools such as the naturaList package, especially when analyzing variation in species composition. This study also represents a step forward to increasing the quality of large-scale studies that rely on species occurrence data.

Read more ->

Posted on:
March 14, 2022
Length:
2 minute read, 338 words
Categories:
Macroecology Bioinformatics
See Also: