What Is the Plazi Workflow?
The Plazi workflow is the set of processes and tools that Plazi, a non-profit organization and digital biodiversity publisher, uses to create, manage, and disseminate taxonomic and biodiversity data. Its aim is to make taxonomic literature openly accessible and machine-readable.
The Plazi workflow typically includes the following key steps:
- Literature Ingestion: Plazi collects taxonomic literature, including descriptions, treatments, and associated data, from sources such as scientific journals and other publications. This literature is then digitized and processed.
- Text Mining and Data Extraction: Plazi uses text mining and natural language processing (NLP) tools to extract taxonomic and biodiversity data from the digitized literature. This includes taxon names, synonyms, references to type specimens, and other taxonomic information.
- Data Markup: The extracted data is marked up and enriched with standardized schemas and controlled vocabularies, such as the Taxonomic Concept Transfer Schema (TCS), so that the data can be semantically linked and is machine-readable.
- Data Publishing: The marked-up taxonomic and biodiversity data is published online in biodiversity databases and repositories, including the Global Biodiversity Information Facility (GBIF) and the Encyclopedia of Life (EOL). This ensures that the data is openly accessible to the scientific community and the public.
- Data Integration: The marked-up data can be integrated with other biodiversity data, making it possible to link taxonomic information to a broader context of biodiversity research.
- Data Accessibility: Plazi places a strong emphasis on open access and data liberation, ensuring that the taxonomic and biodiversity data is available to researchers, conservationists, and policymakers for applications such as species identification, conservation, and scientific research.
The Plazi Workflow plays a crucial role in making taxonomic and biodiversity data more accessible, discoverable, and usable for scientific research, conservation efforts, and other applications. It supports the broader goals of open science and open access to biodiversity information.
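The extraction and markup steps listed above can be illustrated with a minimal sketch. It is not Plazi's actual tooling: the regular expression is a naive stand-in for real text mining, and the `treatment` and `taxonName` element names, the `known_genera` filter, and the sample sentence are assumptions made purely for illustration. They are not taken from TCS or from any Plazi schema.

```python
import re
import xml.etree.ElementTree as ET

# Naive pattern for a Latin binomial: capitalized genus + lowercase epithet.
# Real taxon-name recognition is far more involved; this is illustrative only.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{3,})\b")


def extract_names(text, known_genera):
    """Return (genus, epithet) pairs whose genus appears in known_genera."""
    return [(g, e) for g, e in BINOMIAL.findall(text) if g in known_genera]


def mark_up(text, known_genera):
    """Wrap each extracted name in a hypothetical <taxonName> element."""
    root = ET.Element("treatment")
    for genus, epithet in extract_names(text, known_genera):
        name = ET.SubElement(root, "taxonName", genus=genus, species=epithet)
        name.text = f"{genus} {epithet}"
    return root


# Sample text invented for this sketch.
sample = "Specimens of Formica rufa and Formica polyctena were examined."
root = mark_up(sample, known_genera={"Formica"})
print(ET.tostring(root, encoding="unicode"))
```

Filtering candidate matches against a list of known genera keeps ordinary capitalized words (such as "Specimens of") out of the result, which is one simple way to show why controlled vocabularies matter at the markup stage.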
Impact on Biodiversity Research
The Plazi workflow has had a significant impact on biodiversity research by digitizing taxonomic literature and making taxonomic and biodiversity data more accessible and usable. Some of the key ways in which Plazi has influenced biodiversity research are:
- Data Accessibility: Plazi's workflow and data publishing efforts have made a large body of taxonomic and biodiversity data openly accessible to researchers, conservationists, and the broader scientific community. This includes historical taxonomic literature, which can be essential for biodiversity research and species identification.
- Data Integration: Plazi's data markup and publishing allow taxonomic data to be integrated with other biodiversity databases and repositories. This integration enables researchers to cross-reference taxonomic information with environmental, geographic, and ecological data, providing a more comprehensive view of biodiversity patterns and relationships.
- Semantic Markup: Plazi's use of semantic markup, including standards like the Taxonomic Concept Transfer Schema (TCS), enhances the interoperability and machine-readability of taxonomic data. This, in turn, supports data integration and automated data analysis.
- Improved Taxonomic Research: Plazi's work has made it easier for taxonomists and biologists to find, access, and compare taxonomic descriptions and treatments, aiding species identification and the creation of comprehensive taxonomic databases. This is particularly important for describing new species and revising existing ones.
- Conservation and Policy: Open access to biodiversity data through Plazi supports conservation efforts and helps inform policy decisions. Researchers and conservationists can use the data to assess the distribution and status of species, identify areas of high biodiversity, and prioritize conservation actions.
- Citizen Science: Plazi's efforts contribute to citizen science initiatives by making biodiversity data available to the public. This encourages citizen scientists to contribute to biodiversity research, for example through species observations and data collection.
- Promoting Collaboration: By digitizing and openly sharing taxonomic data, Plazi fosters collaboration among researchers and institutions. It helps bridge gaps in knowledge and encourages the global scientific community to work together in advancing biodiversity research.
- Preservation of Scientific Heritage: Plazi's digitization of taxonomic literature contributes to the preservation of scientific heritage. Historical taxonomic literature that might otherwise be at risk of deterioration or loss is safeguarded for future generations of scientists.
In summary, Plazi's impact on biodiversity research is characterized by increased data accessibility, better interoperability, and enhanced collaboration within the scientific community. By promoting open access to taxonomic and biodiversity data, Plazi plays a vital role in advancing our understanding of the natural world and supporting conservation efforts.
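To make the benefit of machine-readable markup concrete, the following sketch shows how a program might analyze marked-up text automatically. The XML snippet, its element and attribute names (`document`, `treatment`, `taxonName`, `status`), and the taxon names are all invented for illustration; they do not reproduce TCS or any format Plazi actually publishes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical marked-up fragment; names and structure are placeholders.
MARKED_UP = """
<document>
  <treatment>
    <taxonName>Exemplum primum</taxonName>
    <taxonName status="synonym">Exemplum vetus</taxonName>
  </treatment>
  <treatment>
    <taxonName>Exemplum secundum</taxonName>
  </treatment>
</document>
"""


def names_by_status(xml_text):
    """Count tagged taxon names, grouped by their (hypothetical) status."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for element in root.iter("taxonName"):
        # Names without an explicit status are treated as accepted here.
        counts[element.get("status", "accepted")] += 1
    return counts


print(dict(names_by_status(MARKED_UP)))
```

Because the names are tagged explicitly, the script never has to guess which words in the text are taxon names, which is the practical difference between marked-up and plain digitized literature.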
A Use Case for the Plazi Workflow
A prominent use case for the Plazi workflow is taxonomic research, where it makes taxonomic literature and data more accessible and usable. The following use case illustrates how the workflow can be applied:
Taxonomic Revision and Species Identification
Scenario: A taxonomist or biologist is conducting a taxonomic revision of a group of species (e.g., a genus or family of insects, plants, or any other taxonomic group). The goal is to update and clarify the taxonomy of this group, which involves identifying new species, reclassifying existing ones, and providing comprehensive descriptions and keys for species identification. The taxonomist needs to access and analyze a wide range of taxonomic literature, historical descriptions, and other relevant data sources.
How the Plazi Workflow is Used:
- Literature Collection: The taxonomist accesses the Plazi database, which contains a vast amount of digitized taxonomic literature, including taxonomic descriptions, treatments, and articles from scientific journals. The Plazi workflow involves systematically collecting and digitizing this literature from various sources.
- Data Extraction: Plazi's text mining and natural language processing tools are used to extract taxonomic data from the digitized literature. This includes taxon names, synonyms, type specimen references, diagnostic characters, and other relevant information.
- Data Markup: The extracted data is marked up and enriched using standardized vocabularies and controlled terminologies. For instance, Plazi employs the Taxonomic Concept Transfer Schema (TCS) to ensure that the data is structured and semantically linked.
- Data Accessibility: The taxonomist can access the marked-up taxonomic data through the Plazi database and search for relevant information related to the group of species they are studying. Plazi's open-access approach ensures that the data is freely available to the researcher.
- Species Descriptions: The taxonomist can use the data from Plazi to assist in writing and updating species descriptions, creating identification keys, and determining the taxonomic relationships within the group.
- Data Integration: The taxonomist can integrate the taxonomic data from Plazi with other relevant biodiversity databases and environmental data sources to provide a broader context for their research.
- Collaboration: Plazi facilitates collaboration among taxonomists and scientists working on similar or related taxonomic groups, as they can access and share data through the platform.
This use case demonstrates how the Plazi Workflow is instrumental in taxonomic research by simplifying the process of accessing, extracting, and using taxonomic literature and data. It streamlines the taxonomic revision process, supports species identification, and promotes collaboration among taxonomists and researchers in biodiversity science.
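The search and integration steps of this use case can be sketched in a few lines. This is an in-memory illustration only: it does not call any Plazi service, and the `Treatment` and `Occurrence` record types, the helper functions, the taxon names, and the country codes are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Treatment:
    """Hypothetical, simplified treatment record."""
    accepted_name: str
    synonyms: list = field(default_factory=list)


@dataclass
class Occurrence:
    """Hypothetical occurrence record from another data source."""
    name: str
    country: str


def treatments_for_genus(treatments, genus):
    """Select the treatments whose accepted name belongs to the genus."""
    return [t for t in treatments if t.accepted_name.split()[0] == genus]


def resolve(name, treatments):
    """Map an accepted name or synonym to its accepted name, else None."""
    for t in treatments:
        if name == t.accepted_name or name in t.synonyms:
            return t.accepted_name
    return None


def countries_by_taxon(treatments, occurrences):
    """Attach occurrence countries to each taxon, resolving synonyms."""
    result = {t.accepted_name: set() for t in treatments}
    for occ in occurrences:
        accepted = resolve(occ.name, treatments)
        if accepted is not None:
            result[accepted].add(occ.country)
    return result


# Placeholder data invented for this sketch.
treatments = [
    Treatment("Exemplum primum", ["Exemplum vetus"]),
    Treatment("Exemplum secundum"),
    Treatment("Aliud tertium"),
]
occurrences = [
    Occurrence("Exemplum vetus", "CH"),
    Occurrence("Exemplum primum", "DE"),
    Occurrence("Exemplum secundum", "CH"),
    Occurrence("Ignotum quartum", "FR"),
]

genus_treatments = treatments_for_genus(treatments, "Exemplum")
for taxon, countries in countries_by_taxon(genus_treatments, occurrences).items():
    print(taxon, sorted(countries))
```

Resolving synonyms before joining the two data sets is the key design choice: an occurrence recorded under an older name is still attributed to the currently accepted taxon, which is the kind of link a taxonomic revision depends on.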