Polar Data Discovery Enhancement Research (POLDER)


Making data discovery easier by developing aggregated search tools



Polder (Dutch verb): to work collaboratively to achieve a common goal

Federated metadata search for the polar regions will dramatically simplify data discovery for polar scientists. Instead of searching dozens of metadata catalogues individually, a user should be able to search them all from a single search page.

The Polar Federated Search Working Group (POLDER) is a collaboration between the Arctic Data Committee (ADC), the Standing Committee on Antarctic Data Management (SCADM), and the Southern Ocean Observing System (SOOS). It develops the tools and resources needed to support metadata aggregation and federated search, with the aim of improving the discoverability of polar science data.
During the Polar Data Forum III in Helsinki, November 2019, POLDER held two days of workshops to explore the feasibility of using schema.org and its associated technologies to support federated metadata search for polar-relevant metadata catalogues. 
During the meeting we agreed the following: 
  1. That data curators in our community should be improving the discoverability of their datasets by implementing schema.org, following the guidance of the science-on-schema.org community. This will make their datasets visible to existing and new metadata crawlers, such as Google’s Dataset Search.
  2. That our community should contribute to the existing conversations on schema.org extensions that are happening through science-on-schema.org, the Earth Science Information Partners schema.org cluster, bioschemas.org, and geoschemas.org.
  3. That the tools for developing a community-specific federated search are developing rapidly, so that once a significant number of our data centres have implemented schema.org, it is likely that there will be a clearer path to developing our own federated search.
  4. That we will continue to contribute to global conversations on schema.org in the earth and other natural sciences.

POLDER believes that the recent groundswell of interest in schema.org, driven by the development of Google’s Dataset Search, offers a rare opportunity to simplify and connect metadata discovery tools. Schema.org markup is structured header text that is attached to a dataset’s landing page and that can draw metadata elements from existing metadata standards. It is a lightweight way to share the load of aligning metadata standards, and it does not require a data centre to alter its systems and infrastructure for managing metadata.
However, the open and extensible nature of schema.org poses a danger: it would be easy for this community to repeat the errors of the past by implementing it in divergent ways. This would undermine the very reason for implementing schema.org, which is the need for a uniform way of sharing basic discovery metadata.
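As a rough illustration of what "structured header text" on a landing page can look like, the sketch below builds a minimal schema.org Dataset description and wraps it in the script tag that crawlers look for. The function names, the choice of fields, and the example values are assumptions made for illustration only; this is not the sample file produced at the Polar Data Forum, and the science-on-schema.org guidance remains the reference for which properties to use and how.

```python
import json


def build_dataset_jsonld(title, description, url, keywords, bbox):
    """Return a minimal schema.org Dataset description as a dict.

    bbox is (south, west, north, east) in decimal degrees. These values
    would normally be drawn from an existing metadata record (assumption).
    """
    south, west, north, east = bbox
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": title,
        "description": description,
        "url": url,
        "keywords": list(keywords),
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # Two corner points, written as "south west north east".
                "box": f"{south} {west} {north} {east}",
            },
        },
    }


def to_script_tag(record):
    """Wrap a JSON-LD record in the script element placed in the page head."""
    body = json.dumps(record, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    # Hypothetical dataset, for illustration only.
    record = build_dataset_jsonld(
        title="Example sea ice thickness transects",
        description="Hypothetical dataset used to illustrate the markup.",
        url="https://example.org/datasets/sea-ice-thickness",
        keywords=["sea ice", "Southern Ocean"],
        bbox=(-70.0, 60.0, -60.0, 90.0),
    )
    print(to_script_tag(record))
```

Because the markup is generated from fields a data centre already holds, it can be added to landing pages without changing how the underlying metadata is managed.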
POLDER encourages all publishers of metadata and data in polar regions to implement schema.org in a way that is interoperable with the approaches taken by the science-on-schema, Bioschemas, and Geoschemas communities. The resources listed below will help you ensure that your schema mark-up is interoperable with this broader community.
Resources for metadata providers:
Sample schema.org JSON-LD file developed at the Polar Data Forum III in Helsinki, November 2019
Science-on-schema GitHub repository and How-To guides
The Earth Science Information Partners have more resources and hold regular teleconferences to discuss issues. It's quick and easy to join, and participation is encouraged.
Resources for metadata aggregators:
Gleaner and its associated tools for harvesting metadata records
Geocodes are developing prototype federated search tools that you can explore. In particular, you may be interested in exploring their tools that can search on either text or spatial location (though they're not yet integrated). 
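To give aggregators a feel for the basic idea, the sketch below pulls schema.org JSON-LD out of landing-page HTML and runs a simple text search over the harvested records. It is a simplified illustration using only the Python standard library, not how Gleaner or Geocodes work internally; the class and function names are hypothetical, and a real harvester would also handle sitemaps, `@graph` structures, and spatial queries.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.records.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than abort the harvest


def extract_datasets(html):
    """Return the schema.org Dataset records found in one landing page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        r for r in parser.records
        if isinstance(r, dict) and r.get("@type") == "Dataset"
    ]


def search(index, term):
    """Case-insensitive text search over name, description, and keywords."""
    term = term.lower()
    hits = []
    for record in index:
        keywords = record.get("keywords", [])
        if isinstance(keywords, str):
            keywords = [keywords]
        text = " ".join(
            [str(record.get("name", "")), str(record.get("description", ""))]
            + [str(k) for k in keywords]
        )
        if term in text.lower():
            hits.append(record)
    return hits
```

An index built by concatenating the output of `extract_datasets` for pages from several catalogues can then be queried from one place with `search(index, "sea ice")`, which is the essence of an aggregated search.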
This is a rapidly developing field and all the groups listed here are interested in your feedback about ways schema.org and its related extensions should evolve to meet the needs of the entire community. We encourage you to post issues on GitHub repositories and to join the various teleconferences and workshops that these groups are organising, to ensure that polar voices are heard.
Photo: Esmee van Wijk
Objectives
  1. POLDER will investigate the needs of the polar research community and opportunities for developing metadata aggregation and federated search
  2. POLDER will advise ADC, SCADM, and SOOS on the best approaches to metadata aggregation and federated search
  3. POLDER will pursue funding and resource opportunities with other related groups to support metadata aggregation and federated search
  4. Once funding/resources are found, POLDER will act as a scientific advisory group for the developers
  5. POLDER will maintain contact with the broader data management community to ensure that polar metadata aggregation and federated search is linked with other global initiatives and minimises duplication of efforts
  6. POLDER will work in as transparent and open a manner as possible, including the open distribution of group materials in publicly accessible resources.
  7. We expect that members of POLDER and the broader data management and polar communities will treat this openly shared information with respect and with due diligence towards citation, attribution, and re-use of the materials (e.g., without “scooping” the community’s work to publish under their own name).
  8. POLDER will adhere to the Polar Information Commons