
News

WDS Blog

The DLI Service: An Open and Universal Data–Literature Interlinking Service

A Blog post by Paolo Manghi and Sandro La Bruzzo (OpenAIRE)

Sharing links between the published literature and datasets is crucial to achieve the full potential of research data publishing. This article presents the coordination and implementation efforts of the ICSU-WDS–RDA Data Publishing Services Working Group (DPS-WG) and the OpenAIRE infrastructure towards realizing and operating an open and universal data-literature interlinking service (DLI Service). The service is the result of an open collaboration between major stakeholders in the field of data publishing. It provides access to a graph of dataset–literature and dataset–dataset links collected from a variety of major data centres, publishers, and research organizations. On the basis of feedback from content providers and consumers, the service will also enable the incremental refinement of an interlinking data model and exchange format, towards shaping up a universal, cross-platform, cross-discipline solution for sharing dataset–literature links. 

Introduction and vision

Challenges to realizing the full potential of research data exist at different levels: from cultural aspects, such as proper rewards and incentives, to policy and funding, and to technology. The challenges are interconnected and affect a diversity of stakeholders in the research data landscape, including researchers, research organizations, funding bodies, data centres, and publishers. To make progress in overcoming barriers and building a stronger research data infrastructure, it is essential that the different stakeholders work together to address common issues and move forward on a common path. Alongside other organizations, the ICSU World Data System (ICSU-WDS), the Research Data Alliance (RDA), and OpenAIRE provide useful forums for such collaborations. In particular, they are today working in synergy on an initiative that brings together different parties in the research data landscape with the objective of creating the Data Literature Interlinking Service (DLI Service), namely, 'an open, freely accessible, web-based service that enables its users to identify datasets that are associated with a given article, and vice versa'. At the time of writing, members of the initiative include: the ICSU-WDS–RDA DPS-WG, OpenAIRE, RDA, ICSU-WDS, STM, CrossRef, DataCite, ORCID, the Australian National Data Service, and the RMap project. The vision is to move away from the many bilateral arrangements that characterize the research ecosystem today, towards establishing common standards and tools that sit in the middle and interact with all parties (see Figure). Such a transition would facilitate interoperability between platforms and systems operated by the different parties, reduce systemic inefficiencies in the ecosystem, and ultimately enable new tools and functionalities to the benefit of researchers.

The service

The DLI Service populates and provides access to a graph of 'authoritative' dataset–literature links collected and aggregated from a variety of major data centres, publishers, and research organizations. It is intended to offer facilities for the following classes of actors:

– End users: Searching and browsing the graph of links via the Prototype PORTAL
– Third-party service developers: Accessing publications and datasets in the graph via programmatic APIs
– Content providers: Feeding high-quality authoritative links between publications and datasets, or between datasets, to the service (complete list of content providers).

Note: Formal data acquisition policies, SLAs, and data provider registration procedures will be produced at a later stage; currently each 'application' is processed independently with bilateral agreements. On the basis of feedback from content providers and consumers, the DLI Service will refine its underlying interlinking data model and exchange format to make it a universal, cross-platform, cross-discipline solution for collecting and sharing dataset–literature links, balancing the information that can be shared across content providers against the information needed by its consumers.
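To make the idea of a link graph concrete, the following is a minimal sketch of how dataset–literature and dataset–dataset links could be indexed so that a consumer can look up the datasets associated with an article, and vice versa. It is an illustration only and not the DLI Service's actual data model or API: the class names, field names, and identifiers (`Link`, `LinkGraph`, `build_example_graph`, and the `doi:10.9999/...` placeholders) are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One link asserted by a content provider (hypothetical field names)."""
    source_pid: str  # persistent identifier of one object, e.g. a DOI
    target_pid: str  # persistent identifier of the linked object
    provider: str    # name of the provider that contributed the link


class LinkGraph:
    """Undirected graph of links between publications and datasets."""

    def __init__(self):
        self._neighbours = defaultdict(set)  # pid -> set of linked pids
        self._kinds = {}                     # pid -> "publication" or "dataset"

    def add_object(self, pid, kind):
        self._kinds[pid] = kind

    def add_link(self, link):
        # Store both directions so lookups work from either end of the link.
        self._neighbours[link.source_pid].add(link.target_pid)
        self._neighbours[link.target_pid].add(link.source_pid)

    def related(self, pid, kind):
        """Return the sorted identifiers of objects of `kind` linked to `pid`."""
        linked = self._neighbours.get(pid, ())
        return sorted(p for p in linked if self._kinds.get(p) == kind)


def build_example_graph():
    """Build a tiny graph from placeholder identifiers (not real DOIs)."""
    graph = LinkGraph()
    graph.add_object("doi:10.9999/example-article", "publication")
    graph.add_object("doi:10.9999/example-dataset-a", "dataset")
    graph.add_object("doi:10.9999/example-dataset-b", "dataset")
    graph.add_link(Link("doi:10.9999/example-article",
                        "doi:10.9999/example-dataset-a", "example-publisher"))
    graph.add_link(Link("doi:10.9999/example-article",
                        "doi:10.9999/example-dataset-b", "example-data-centre"))
    graph.add_link(Link("doi:10.9999/example-dataset-a",
                        "doi:10.9999/example-dataset-b", "example-data-centre"))
    return graph


if __name__ == "__main__":
    graph = build_example_graph()
    # Datasets associated with a given article.
    print(graph.related("doi:10.9999/example-article", "dataset"))
    # Articles associated with a given dataset (the "vice versa" direction).
    print(graph.related("doi:10.9999/example-dataset-a", "publication"))
```

Storing each link in both directions keeps the two lookups symmetric, which mirrors the stated goal of identifying datasets from an article and articles from a dataset. A real exchange format would presumably carry more information per link than this sketch does.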

Next steps

In the forthcoming months, further work will be carried out towards the delivery of a production service that is fully reliable in terms of QoS and quality of content. The following actions will be undertaken:

  • Definition of a content acquisition policy: minimal quality requirements to be respected by content providers in order for their publications, datasets, and the relationships between them to be aggregated by the system;
  • Definition of SLAs for content providers: make sure content providers are aware of, and agree on, how their content (metadata) will be made openly accessible via the service;
  • Technical enhancements: data harmonization (e.g. cross-PID deduplication), data programmatic access (e.g. high-throughput resolver), data scalability (e.g. moving away from open source databases);
  • Deployment as an OpenAIRE infrastructure operational service: deploying the service on the OpenAIRE hardware infrastructure.
     

Relevant Links

WDS Data Publishing Services Working Group page

NASA, USGS, and NSF Respond to White House OSTP Open Access Memo

On March 18, the National Science Foundation (NSF) announced the publication of its plan, Today's Data, Tomorrow's Discoveries, to promote and expand public access to the results of NSF-sponsored research. This announcement follows those given by the National Aeronautics and Space Administration (NASA) and the United States Geological Survey (USGS) in response to the White House OSTP open access memo published in February 2013.

NSF states that accepted manuscripts or versions of record must be publicly available in an approved repository within 12 months of publication. Availability signifies that any user can download, read, and analyze the data free of charge. This will apply to new awards resulting from proposals submitted, or due, on or after the effective date of the Proposal & Award Policies & Procedures Guide that will be issued in January 2016.

The responses of the three organizations can be accessed below:

 – NASA Plan: Increasing Access to the Results of Scientific Research
 – Data Management Policy References: What the U.S. Geological Survey Manual Says…
 – NSF’s Public Access Plan: Today’s Data, Tomorrow’s Discoveries

LANCE AMSR2 Near Real-time Data Made Available

The NASA Land Atmosphere Near real-time Capability for EOS (LANCE) AMSR2 Processing Center at the Global Hydrology Resource Center (WDS Regular Member) in Huntsville, Alabama would like to announce the availability of its first AMSR2 near real-time dataset, NRT AMSR2 L2B Global Swath GSFC Profiling Algorithm 2010: Surface Precipitation, Wind Speed over Ocean, Water Vapor over Ocean and Cloud Liquid Water over Ocean. These LANCE AMSR2 near real-time products, with noted limitations, are generated and available to registered users via HTTPS with an average latency of less than 3 hours. More information about LANCE AMSR2 near real-time data is available here.

COAR Roadmap: Future Directions for Repository Interoperability

The Confederation of Open Access Repositories (COAR) has announced the publication of the COAR Roadmap: Future Directions for Repository Interoperability. This document is the culmination of over a year’s work to identify priority issues for repository interoperability, and identifies important trends and their associated action points for the repository community.

Scholarly communication is undergoing fundamental changes, with new requirements for open access to research outputs, new forms of peer-review, and alternative methods for measuring impact. In parallel, technical developments, especially in communication and interface technologies, facilitate bi-directional data exchange across related applications and systems. The success of repository services in the future will thus depend on the seamless alignment of the diverse stakeholders at the local, national, and international level.

WGMS Publishes Latest Glacier Mass Budget Results

The World Glacier Monitoring Service (WGMS; WDS Regular Member) has published the latest glacier mass budget results on its website. These results were compiled from the 2014 call-for-data, which covered the observation period of the hydrological year 2012/13. In addition, WGMS has introduced near-time reporting from its 'reference' glaciers (those having more than 30 years of continued observations) to provide a preliminary estimate for the glacier budgets for 2014.