Post provided by FLORIAN D. SCHNEIDER
The past ten years brought a major game changer to ecological community and ecosystems research: functional trait data. This has shifted the focus of assessing and analysing ecosystems from just the quantities of species to those species’ qualities as well. Functional trait data can give us major insights into how communities are composed and how species assemblages evolve under certain environmental pressures. They also link community composition to ecosystem functioning and provide a framework for the assessment of communities across trophic levels and functional groups.
The Need for a Template
In spring 2017 Caterina, Nadja, Malte, Martin, Andreas and I were discussing trait data at the annual assembly of the Biodiversity Exploratories project. In this project, multidisciplinary researchers assess plant, animal and microbial diversity as well as ecosystem functions at the same grassland and forest sites across different land-use regimes. Many of the research groups involved in the project have adopted trait-based approaches. Also, synthesis research on the project had an increasing need to incorporate trait data from the different research groups. We were looking for a simple template to bring trait data from the different research fields together within the project’s database management system (BExIS).
We decided to conduct a survey among the project members about the types of trait data that researchers in the consortium had produced. The results showed us that the existing proposals for structuring trait data could not combine all of the different types of trait data into a single format. We also found that existing data standards for traits had been developed with a certain data structure in mind. This meant that they were difficult to implement for small research projects with limited resources on the data-management end.
So we formed a small committee of data providers, data curators and data synthesis researchers (Birgitta, Gaëtane, and Pete joined the team) to develop a vocabulary that could capture the necessary variables of trait data for BExIS. By adopting terms from the Darwin Core Standard, we made sure that those data were easy to feed into wider database contexts. As one of the goals for BExIS is to make data permanently available, this was a great improvement. We also had to define additional terms of our own, though. This could pose a threat to the long-term accessibility of data if the definitions of those terms were lost from the metadata. We realised that we had to create a stable reference for our new trait-data terminology.
This was also the point when we noticed the implications of the work. If published as a reference terminology, our template could be useful for anyone working on the assessment or compilation of trait data. If this was widely adopted for future trait-data publications, it would become much easier to re-use and re-analyse these data.
A Wider Scope
Luckily, in 2016, the German Federation for Biological Data (GFBio) was founded. GFBio is a national research infrastructure that aims to facilitate data management and long-term accessibility for biodiversity research projects in Germany. Part of this project is the GFBio Terminology Service. The service’s task is to provide the underlying semantics for the databases to make them human and machine readable.
Anton and David gave us great support for our idea of a publicly available Ecological Trait-data Standard vocabulary (ETS). David developed a script to transfer the ETS terms (which at this stage were in a simple csv spreadsheet and rendered into a human-readable website via Rmarkdown) into the machine-readable OWL language for computational ontologies and prepared the release of the ETS via the GFBio Webservice. For GFBio, this was an ideal case study for the development of their service portfolio.
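To give a feel for what such a conversion involves, here is a minimal sketch in Python. It is not David's script: the two-row term table, its column names (`term`, `definition`), the `example.org` namespace and the choice to model every term as an OWL datatype property are all assumptions made for illustration. It simply turns rows of a csv term list into Turtle, one of the text serialisations that OWL tools can read.

```python
import csv
import io

# Hypothetical two-term excerpt; the real ETS spreadsheet's columns may differ.
TERMS_CSV = """term,definition
traitName,A descriptive name for the measured trait.
traitValue,The measured value or factor level of the trait.
"""

BASE_IRI = "http://example.org/ets#"  # placeholder namespace, not the real ETS IRI


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def terms_to_turtle(csv_text, base_iri=BASE_IRI):
    """Render each CSV row as an OWL datatype property in Turtle syntax."""
    lines = [
        f"@prefix ets: <{base_iri}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for row in csv.DictReader(io.StringIO(csv_text)):
        term = row["term"].strip()
        definition = escape_literal(row["definition"].strip())
        lines.append(f"ets:{term} a owl:DatatypeProperty ;")
        lines.append(f'    rdfs:label "{term}" ;')
        lines.append(f'    rdfs:comment "{definition}" .')
        lines.append("")
    return "\n".join(lines)


print(terms_to_turtle(TERMS_CSV))
```

Keeping the human-edited spreadsheet as the single source and generating the machine-readable file from it means the definitions only ever need to be maintained in one place.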
Up to this point, no formal project consortium had been formed and no project funding had been acquired to develop the ETS. We were just a bunch of ecologists and data managers trying to solve a common issue, advancing further and further into the matter.
Social Media Sparking New Collaborations
In April 2018, Brian Maitner, Rachael Gallagher and Brian Enquist got in touch on Twitter about a workshop in the wake of the ESA Annual Meeting in New Orleans. Its aim was to form a global Open Traits initiative. One of the primary tasks of the workshop (besides the call for open data publications and efforts for developing tools for data management and analysis) was to create a global data standard and terminology for traits.
Of course, we immediately got on board. The Open Traits initiative was aiming for a global collaboration on data and research, and we very much agreed with this approach.
Within a few weeks, we released our manuscript describing the standard as a pre-print on biorxiv.org and published the first preliminary version of the ETS vocabulary as an open source project on Github and GFBio. The pre-print received quite a lot of attention on Twitter and prompted a valid critical comment from the committee of the TRY plant trait database. We were really grateful for this, as it highlighted the need to link our project more directly to existing initiatives working on trait-data standardisation. In August 2018, Pete and Caterina presented the ETS at the Open Traits Workshop and it was very well received. The next month, I went to Jena for the International Conference on Ecological Informatics and discussed the ETS in the context of other biodiversity-data-semantics initiatives.
The Open Traits Workshop and subsequent online discussions integrated our initiative into a community working towards the same goals. A recent pre-print on EcoEvoRxiv, led by Rachael Gallagher, describes the strategic goals of the newly-formed Open Traits Network (OTN). Some of the members of the network secured funding for a series of synthesis workshops starting from spring 2020 at the German Centre for Integrative Biodiversity Research, iDiv. The Open Traits Network is now gaining momentum.
Not a Normal Project
Thanks to some re-assigned budget from FUSION Lab at Uni Jena, I was finally able to complete the work on the ETS and on the article ‘Towards an ecological trait‐data standard’. We also released an R-package (‘traitdataform’, on CRAN) that assists users in applying the ETS to their own dataset before publication, to improve accessibility and facilitate data re-use in synthesis research. The package can also be used to harmonize heterogeneous data from multiple sources into compilations.
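As a rough illustration of what harmonising a dataset can mean in practice, the following Python sketch reshapes a species-by-trait table into one record per measurement. It is not the ‘traitdataform’ API (that package is written in R). The example table, the column mapping and the helper `wide_to_long` are hypothetical, and the output keys `scientificName`, `traitName`, `traitValue` and `traitUnit` are assumed here as standardised column names.

```python
import csv
import io

# Hypothetical wide table: one row per species, one column per trait.
WIDE_CSV = """species,body_length_mm,wing_length_mm
Carabus auratus,24.5,
Pterostichus melanarius,15.0,9.2
"""

# Assumed mapping from this dataset's own headers to a trait name and a unit.
TRAIT_COLUMNS = {
    "body_length_mm": ("body_length", "mm"),
    "wing_length_mm": ("wing_length", "mm"),
}


def wide_to_long(csv_text, taxon_column, trait_columns):
    """Return one record per non-empty trait measurement."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, (trait, unit) in trait_columns.items():
            value = row[column].strip()
            if not value:
                continue  # skip missing measurements instead of inventing values
            records.append({
                "scientificName": row[taxon_column],
                "traitName": trait,
                "traitValue": float(value),
                "traitUnit": unit,
            })
    return records


for record in wide_to_long(WIDE_CSV, "species", TRAIT_COLUMNS):
    print(record)
```

Once every source dataset is expressed with the same column names, compiling them is little more than concatenating the record lists.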
From start to end, no formal project collaboration has ever been established and no major funding was acquired. As such, the ETS vocabulary is not the product of classical research funding structures. It’s a fully democratic by-product of open collaboration, made possible by open-access preprints, open-source development, and social media – a striking idea filling a gap at the right time.
We’re now aiming to further consolidate the ETS by discussing the terms and technical implementation with the broader community of the Open Traits Network. We’re also looking for inspiration and help within the Biodiversity Information Standards (TDWG), the global body for the development and propagation of data standards in biodiversity informatics. We’d also like to form a community for future development of the terminology within the framework of the Open Traits Network. This is an open process and anyone with a stake in trait-based research is welcome to join these efforts. If you’re a data provider, data manager or data synthesis researcher, you can get in touch via Github or E-mail (firstname.lastname@example.org).
To find out more, read our Open Access Methods in Ecology and Evolution article ‘Towards an ecological trait‐data standard’.
Thanks to Caterina Penone, Malte Jochum and Nadja Simons for comments on a draft of this post.