biodiversity1: The Importance of Liberating Data

Content is prepared for deposit in BLR via two workflows. In the more common one, data mining processes extract content from PDF or HTML and identify and label relevant data elements: either named entities, such as DNA accession codes or geographic localities, or larger textual segments, such as material citations and entire treatments. This extraction can be automated by developing templates for each journal. During upload to BLR, each deposited object (article, treatment, or image) is assigned a DOI, which is cited by each related object. In the best-case scenario, this process is completely automated. The second, more advanced workflow starts from publications whose data are already structured according to standard vocabularies that machines can understand.
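To make the idea of identifying and labeling named entities more concrete, here is a minimal sketch in Python. It is not the extraction software used for BLR. The two regular expressions, the `PATTERNS` table, and the `label_entities` function are assumptions made for illustration only. The accession pattern covers just two common GenBank-style nucleotide formats, and the coordinate pattern covers only simple decimal degrees.

```python
import re

# Hypothetical patterns for illustration; a real pipeline would rely on
# journal-specific templates and far more robust recognition.
PATTERNS = {
    # GenBank-style nucleotide accession: 1 letter + 5 digits,
    # or 2 letters + 6 digits.
    "accession": re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6})\b"),
    # Decimal-degree coordinate pair such as "46.95°N, 7.45°E".
    "coordinates": re.compile(
        r"\d{1,2}(?:\.\d+)?°[NS],\s*\d{1,3}(?:\.\d+)?°[EW]"
    ),
}


def label_entities(text):
    """Return (label, matched_text, start, end) tuples sorted by position."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group(0), match.start(), match.end()))
    return sorted(found, key=lambda item: item[2])


sample = "Holotype: Switzerland, Bern, 46.95°N, 7.45°E; GenBank MK123456."
for entity in label_entities(sample):
    print(entity)
```

Each labeled span keeps its character offsets, so a later step could attach the label to the exact position in the source text. Larger segments such as material citations or whole treatments would need structural cues (headings, layout, templates) rather than a single pattern.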

After each article is deposited in BLR, GBIF is notified that a new data set derived from the content of the publication is available. GBIF then downloads the data and integrates them into its service.
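The deposit and notification steps described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `Repository` and `Aggregator` classes, their methods, and the DOI prefix `10.0000/example` are hypothetical placeholders and do not represent the actual BLR or GBIF interfaces.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class DepositedObject:
    kind: str   # "article", "treatment", or "image"
    title: str
    doi: str
    related_dois: list = field(default_factory=list)


class Repository:
    """In-memory stand-in for a repository such as BLR (hypothetical API)."""

    def __init__(self, prefix="10.0000/example"):  # placeholder DOI prefix
        self._prefix = prefix
        self._counter = itertools.count(1)
        self.objects = {}
        self.subscribers = []  # callbacks invoked after each article deposit

    def _mint_doi(self):
        return f"{self._prefix}.{next(self._counter)}"

    def deposit_article(self, title, parts):
        """Deposit an article and its parts; related objects cite each other."""
        article = DepositedObject("article", title, self._mint_doi())
        children = [
            DepositedObject(kind, part_title, self._mint_doi())
            for kind, part_title in parts
        ]
        for child in children:
            child.related_dois.append(article.doi)
            article.related_dois.append(child.doi)
        for obj in [article, *children]:
            self.objects[obj.doi] = obj
        # Notify downstream services that a new data set is available.
        for notify in self.subscribers:
            notify(article.doi)
        return article


class Aggregator:
    """Stand-in for a downstream aggregator such as GBIF (hypothetical)."""

    def __init__(self, repo):
        self.repo = repo
        self.datasets = {}

    def on_new_dataset(self, article_doi):
        # "Download" the related objects and integrate them as one data set.
        article = self.repo.objects[article_doi]
        self.datasets[article_doi] = [
            self.repo.objects[doi].title for doi in article.related_dois
        ]


repo = Repository()
gbif = Aggregator(repo)
repo.subscribers.append(gbif.on_new_dataset)
article = repo.deposit_article(
    "A new species of Aus",
    [("treatment", "Aus bus sp. nov."), ("image", "Fig. 1")],
)
print(article.doi, gbif.datasets[article.doi])
```

The callback list mirrors the notify-then-download order in the text: the repository only announces that a data set exists, and the aggregator decides how to fetch and integrate it.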

The entire process from PDF via BLR to GBIF can take just a couple of minutes.

Liberation also means dissemination. Collaboration with global research infrastructures such as GBIF promotes use of the data and encourages improvements in data structure and quality.

Last modified: Sunday, 19 November 2023, 10:50 PM