Call for EoIs to Host WDS-IPO: New Deadline of 31 July!

Owing to the continued disruptions in many countries from the COVID-19 pandemic, it has been agreed with the International Science Council to set a new deadline of Friday, 31 July 2020 for Expressions of Interest. Please note that this will be the final deadline for submissions. The WDS International Programme Office (WDS-IPO) was created in March 2011. Since then, it has been hosted by ...

International Symposium: Global Collaboration on Data Beyond Disciplines (Call for Abstracts & Registration)

The Call for Abstracts and registration are now open for this online symposium commemorating the 10 years that the WDS-IPO has been hosted by the Japanese National Institute of Information and Communications Technology (NICT). The World Data System – International Programme Office (WDS-IPO) was first established by NICT in April 2011 and was formally inaugurated in May 2012. The ...

Webinar – COVID-19: The Role of Open Science

The COVID-19 pandemic presents a major test for our science system and for our research and data infrastructures. These infrastructures, such as open science clouds and data commons, must serve the needs of science, policy, and humanity not only in ‘normal times’, but also in times of crisis by providing controlled access to quality data in real time and at scale for a range of scientific- and ...

UNESCO Global Consultations on Open Science for Western Europe and North America

An online multi-stakeholder UNESCO Regional Consultation on Open Science for Western Europe and North America will be held on 23 July 2020 at 15:00–18:00 (CEST). Registration by 17 July 2020 is required to participate, and a selection may be made depending on the final number of requests. This online meeting is part of a series of regional consultations aimed at ...

WDS International Technology Office Signs MoU with Canada's New Digital Research Infrastructure Organization

Blog post by Karen Payne (WDS-ITO Associate Director)


You spoke. We listened. 

The WDS International Technology Office (WDS-ITO) was created to support Member Organizations of WDS as they develop their data repositories in the areas of data and metadata management, infrastructure, and interoperability. To respond most effectively to Member needs, last year the WDS-ITO, with the support of the WDS International Programme Office, conducted a survey to evaluate your areas of interest and determine what types of projects you would like WDS to support. Our key finding was a list of potential WDS-ITO projects, ranked according to interest. You can read the report of the survey here. We discovered that the top two areas of interest were adding (1) semantic markup to metadata and (2) harvestable metadata services. In response, the WDS-ITO has secured funds from Canada’s national New Digital Research Infrastructure Organization (NDRIO) to hire two full-time staff members to work on these projects. The funding provides dedicated resources to develop collaborative partnerships among the WDS-ITO, its members, and relevant international and Canadian interest groups to increase the availability and interoperability of metadata assets globally.

Over the next year, the WDS-ITO will be working with the Research Data Alliance Research Metadata Schemas Working Group (WG) to help provide repositories with guidance and tools to add Schema.org markup to metadata. As a first step, the WDS-ITO has prototyped an online visualization tool based on a survey of current practices in using schemas to describe research datasets. The tool shows how some communities have crosswalked common metadata terms to Schema.org properties, and can be useful to repositories that are interested in knowing how other repositories are utilizing Schema.org terms. It can also be used as consensus building for communities of practice that have not yet created a crosswalk between their metadata format of choice and Schema.org properties. We will continue to build on that tool, and provide other guidance to WDS Members to help make their metadata more ‘web friendly’ in the coming months.

Figure 1: A screenshot from the WDS-ITO prototype visualization tool showing crosswalks between Schema.org and common metadata standards.
Try it yourself at https://rd-alliance.github.io/Research-Metadata-Schemas-WG/
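
To make the idea of a crosswalk concrete, here is a minimal Python sketch of how a repository might translate its own metadata terms into Schema.org markup. It assumes a hypothetical record with Dublin Core-style field names; the mapping table, the function name, and the example record are all invented for illustration and are not the crosswalk agreed by any community or shown in the tool. The Schema.org property names themselves (name, description, creator, datePublished, keywords, identifier) are real properties of the Dataset type.

```python
import json

# Hypothetical crosswalk from Dublin Core-style terms to Schema.org properties.
# Illustrative only; a community's agreed crosswalk may differ.
CROSSWALK = {
    "title": "name",
    "description": "description",
    "creator": "creator",
    "date": "datePublished",
    "subject": "keywords",
    "identifier": "identifier",
}


def to_schema_org(record):
    """Translate a flat metadata record into a Schema.org Dataset (JSON-LD)."""
    doc = {"@context": "https://schema.org/", "@type": "Dataset"}
    for source_term, value in record.items():
        target = CROSSWALK.get(source_term)
        if target is None:
            continue  # no mapping: the term is skipped rather than guessed
        doc[target] = value
    return doc


# Made-up example record, not taken from any real repository.
example_record = {
    "title": "Example geomagnetic index series",
    "creator": "Example Data Centre",
    "date": "2020-01-01",
    "subject": ["geomagnetism", "indices"],
    "format": "text/csv",  # has no entry in CROSSWALK, so it is dropped
}

jsonld = to_schema_org(example_record)
print(json.dumps(jsonld, indent=2))
```

The resulting JSON-LD can be embedded in a dataset landing page inside a script element of type application/ld+json, which is the usual way of exposing Schema.org markup to web crawlers.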

As part of our support for those groups interested in harvestable metadata, the WDS-ITO has created a WG of WDS Members who are interested in standing up harvestable metadata services. This WDS Harvestable Metadata Services (HMetS) WG is co-chaired by two members of the WDS Scientific Committee: Aude Chambodut, Director of the International Service of Geomagnetic Indices in Strasbourg (WDS Regular Member) and Juanle Wang, Director of the WDC for Renewable Resources and Environment in Beijing (WDS Regular Member). The HMetS WG is coordinated by Alicia Urquidi Diaz, the WDS-ITO’s first employee! To date, eight WDS Member Representatives have expressed interest in participating in the WG, and we welcome any other Members who would like to join.

This project is designed around three objectives:

  1. Documenting use cases: the current challenges faced by WDS Members who wish to create harvestable services, including their current infrastructure.
  2. Helping develop implementation plans, written by Members, that define a pathway to creating harvestable metadata services.
  3. Producing a paper identifying lessons learned and guidance materials that can be used by the wider Research Data Management community.

The HMetS WG will convene regular online meetings, and bring in presenters who can speak to some of the pathways and long-term benefits of creating harvestable metadata services.

Both of the above work packages will draw on the expertise of and synchronize with ongoing research data management activities in Canada, with the ultimate goal of opening up more metadata records to the international scientific community.

You can read the NDRIO funding announcement here in English and French.

Springboard Blog Post on the TRUST Principles

We would like to point you to the following article, published on 8 June 2020 on the Springboard blog of the Springer Nature Group, which we believe is of direct interest to the WDS community:

• Future-proofing research data – it’s a question of TRUST

In this blog post, Varsha Khodiyar (Data Curation Manager, Research Data and New Product Development) describes why Springer Nature has endorsed the TRUST Principles and their importance to data management within the research community.

For more information on the TRUST Principles and how your organization can endorse them, please see our news article.

Knowledge Service for Disaster Risk Reduction: A Practice Using Big Data Technology

Blog post by Juanle Wang (2019 WDS Scientific Committee Member)

Under the dual influences of global climate change and human activities, the frequency and intensity of natural disasters have been growing in recent years, resulting in increasingly serious disaster losses. Disaster Risk Reduction (DRR) is thus a common and urgent global challenge. Driven by the United Nations Educational, Scientific and Cultural Organization’s (UNESCO’s) DRR mission, the DRR Knowledge Service (DRRKS) System was founded under the UNESCO International Knowledge Centre for Engineering Sciences and Technology. The remit of the System is to formulate global disaster metadata standards; build a global disaster metadata database; integrate global or regional disaster data; establish disaster knowledge services; carry out disaster prevention education, training, and technology promotion; and form comprehensive technology and service capabilities [1].

The DRRKS System has established 16 online knowledge applications, as shown on their homepage, to mine, analyze, and visualize disaster information based on Big Data resources. In this blog post, I would like to briefly introduce two cases that are supported by Big Data technologies in remote sensing and social media mining.

Case 1: Land Degradation and Restoration Monitoring in Mongolia Using Remote Sensing [2]

Land degradation is an important environmental problem facing the world. ‘Land Degradation Neutrality’ is one of the core indicators of Goal 15 (Life on Land) of the United Nations Sustainable Development Goals. Mongolia is one of the areas of the world that is most affected by desertification. It is therefore of great importance to accurately comprehend the state of desertification in Mongolia to (1) prevent its further advance, (2) control desertification risks, and (3) guarantee ecological security and sustainable social development. To this end, fine-resolution (30-m) land cover datasets of Mongolia were obtained by using an object-oriented method, and the land degradation and restoration patterns during 1990–2010 and 2010–2015 were analyzed (Fig. 1). For the past 25 years, the trend of land change in Mongolia has been dominated by land degradation. However, this land degradation was accompanied by ongoing restoration of some land areas in Mongolia, and the capacity for land restoration is gradually improving. The northwestern and northeastern parts of Mongolia have shown the most significant land restoration, namely the areas with relatively sufficient water resources.

Figure 1: Typical regions of land degradation and land restoration in Mongolia between 1990 and 2010.
(a) 1990–2010 (land degradation), (b) 1990–2010 (land restoration)
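
The comparison of two land cover maps described in Case 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in [2]: the class names, their ranking from least to most degraded, and the tiny grids below are all hypothetical.

```python
# Hypothetical land cover classes ranked from least (0) to most (3) degraded.
# The ranking and class names are assumptions for illustration only.
RANK = {"forest": 0, "grassland": 1, "sparse_vegetation": 2, "desert": 3}


def classify_change(before, after):
    """Label one pixel as 'degradation', 'restoration', or 'stable'."""
    if RANK[after] > RANK[before]:
        return "degradation"
    if RANK[after] < RANK[before]:
        return "restoration"
    return "stable"


def summarize(grid_before, grid_after):
    """Count change labels over two equally sized land cover grids."""
    counts = {"degradation": 0, "restoration": 0, "stable": 0}
    for row_b, row_a in zip(grid_before, grid_after):
        for b, a in zip(row_b, row_a):
            counts[classify_change(b, a)] += 1
    return counts


# Made-up 2 x 3 grids standing in for two 30-m land cover maps.
grid_1990 = [
    ["grassland", "grassland", "forest"],
    ["sparse_vegetation", "desert", "grassland"],
]
grid_2010 = [
    ["sparse_vegetation", "grassland", "forest"],
    ["desert", "sparse_vegetation", "desert"],
]

change_counts = summarize(grid_1990, grid_2010)
print(change_counts)
```

A real analysis would work on georeferenced rasters and a transition table agreed by domain experts; the point here is only that degradation and restoration are both read off the same pixel-by-pixel comparison.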

Case 2: Public Sentiment Analysis of COVID-19 Events in China Using Social Media

Similar to Twitter, SINA Microblog is a social media channel on which Chinese people regularly post their opinions. Social media of this type rapidly and frequently reflect the public’s changing thoughts and emotions during an epidemic (now pandemic) such as that of the Novel Coronavirus Disease (COVID-19). The DRRKS team analyzed the temporal and spatial changes in microblogs referencing the (then) epidemic, and gathered the main topics being discussed by the public according to data from SINA Microblog. Through the permitted data Application Programming Interface of SINA Microblog, original messages containing the keywords “coronavirus” and “pneumonia” have been collected since 00:00 on 9 January 2020. The following information has been extracted: timestamp (i.e., the time when the message was posted), text (the message posted by a user), and location information. The DRRKS team then analyzed the Microblog messages related to the coronavirus outbreak in terms of space and time. Temporal changes over one-hour and one-day intervals, and spatial distribution at provincial levels, have been investigated through a kernel density estimation using ArcGIS to identify hotspots of public opinion. The resulting spatial and temporal distribution of public opinion in China during the early stages of the epidemic is available in a DRRKS online application. For example, Figure 2 shows the distribution of help and donation hot spots from 9 January to 10 February.

Figure 2: Distribution of help and donation hot spots according to microblogs in China
(9 January to 10 February 2020)
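
The processing steps in Case 2 (keyword filtering, counting per hour and per day, and kernel density estimation) can be sketched with the Python standard library alone. Everything specific here is an assumption for illustration: the messages are made up, matching either keyword is one possible reading of the filter, and a plain Gaussian kernel with an arbitrary bandwidth stands in for the ArcGIS tool, whose kernel and parameters are not given in this post.

```python
import math
from collections import Counter
from datetime import datetime

# Made-up messages: (timestamp, text, longitude, latitude). Not real data.
messages = [
    ("2020-01-09 00:15", "pneumonia cases reported", 114.3, 30.6),
    ("2020-01-09 00:40", "coronavirus news", 114.4, 30.5),
    ("2020-01-09 01:05", "weather today", 116.4, 39.9),
    ("2020-01-10 09:30", "coronavirus donation appeal", 114.2, 30.7),
]
KEYWORDS = ("coronavirus", "pneumonia")

# Keep only messages that contain at least one keyword (an assumption).
relevant = [m for m in messages if any(k in m[1] for k in KEYWORDS)]


def count_by(fmt):
    """Count relevant messages per time bucket defined by a strftime format."""
    return Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M").strftime(fmt)
        for ts, _, _, _ in relevant
    )


hourly = count_by("%Y-%m-%d %H:00")  # one-hour intervals
daily = count_by("%Y-%m-%d")         # one-day intervals


def kernel_density(x, y, points, bandwidth):
    """Gaussian kernel density estimate at (x, y) from a list of 2-D points."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    return total / (len(points) * 2 * math.pi * bandwidth ** 2)


points = [(lon, lat) for _, _, lon, lat in relevant]
near_cluster = kernel_density(114.3, 30.6, points, bandwidth=0.5)
far_away = kernel_density(120.0, 35.0, points, bandwidth=0.5)
print(hourly, daily, near_cluster > far_away)
```

Evaluating the density on a regular grid and mapping the highest values is what turns individual posts into hot spots; the bandwidth controls how far each message's influence spreads.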

References

[1] Juanle Wang, Kun Bu, Fei Yang, Yuelei Yuan, Yujie Wang, Xuehua Han, Haishuo Wei. Disaster Risk Reduction Knowledge Service: A Paradigm Shift from Disaster Data Towards Knowledge Services. Pure and Applied Geophysics, 2020, 177:135–148.
[2] Juanle Wang, Haishuo Wei, Kai Cheng, Altansukh Ochir, Davaadorj Davaasuren, Pengfei Li, Faith Ka Shun Chan, Elbegjargal Nasanbat. Spatio-Temporal Pattern of Land Degradation from 1990 to 2015 in Mongolia. Environmental Development, 2020.

WDS–ECR Data Curation and Management Workshop

Blog post by Agneta Ghose (2019 WDS–ECR Training Workshop Participant)

If ‘Data is the new gold’ then it certainly must be managed. Science has always valued data. Scientific data are not only an output of research but also an input to new hypotheses, enabling scientific insights and driving innovation. Therefore, accountability, transparency, and verifiability of science make data preservation and sharing part of scientific integrity.

The World Data System (WDS) recently organized a training workshop for early career researchers (ECRs) on data curation and management. The workshop was held at the Institut de Physique du Globe de Paris on 6–8 November 2019. The objective of this workshop was to familiarize ECRs with the methods and jargon used in research data management, in addition to introducing future challenges and technological solutions to data management. In this blog post, I would like to share the key messages from the informative presentations at this workshop.

Members of the WDS Scientific Committee (WDS-SC), Programme and Technology Offices, and ECR Network presented and discussed methods to ensure that research data remain findable, accessible, interoperable, and reusable (i.e., FAIR). This was reinforced by an interactive exercise, in which the attendees spoke about their personal challenges in accessing, storing, and managing data. This discussion showed us how similar the challenges are across scientific disciplines.

Plenary discussions on challenges of data management

The workshop began with an introduction to what ‘data’ are and what their attributes are. Aude Chambodut and Alice Frémand explained the characteristics of research data such as their origin, type, size, and format. These characteristics influence data management; specifically, how to access, process, store, and reuse data. We were introduced to the data lifecycle, which refers to the sequence of stages that data go through from their initial generation to their eventual archival and/or deletion at the end of their useful life. The longevity of research data can be increased by implementing a Data Management Plan (DMP), which provides guidelines on how data are to be handled throughout their lifecycle (i.e., during and after a research project). In principle, a DMP is a prerequisite when applying for major EU funding, but Isabelle Gärtner Roer and Alice Frémand explained to us the practicalities of developing and implementing these plans in relation to our research. A robust DMP increases research efficiency, reinforces scientific integrity, and most importantly promotes innovation by improving the accessibility of data. Most universities and research institutions have platforms that provide advice and support on research data services. For an ECR, it is worth reaching out to these services to understand the recommended DMP in their research domain.

Research Data Life Cycle. Sourced from Massey University: https://www.massey.ac.nz/massey/research/library/library-services/research-services/manage-data/manage-data_home.cfm

We were introduced to the resources available for rigorous data management that will ensure our research data remain FAIR. Sandy Harrison explained the value of Open Data in scientific research. She elaborated on how datasets produced from scientific work are increasingly deposited into data repositories, which is a better alternative to including them only as supplementary materials to a journal paper. Repositories provide long-term data archiving and ensure high technical standards, with the possibility of updates. Moreover, publishing research data on Open Access platforms adds to their discoverability. It is important to note that not all research data need to be openly available. Data can be kept private, but the fact that the data exist, and the preconditions for accessing them, must be shared. Ensuring data accessibility must not take away credit from those who produce data. To prevent or discourage unauthorized use or commercial exploitation, it is important to disclose knowledge (data) safely. Ioana Popescu discussed the importance of copyright and licensing. Different conditions and types of Creative Commons licenses are available to ensure data providers receive due credit, to determine whether the data are available for commercial use, and so on.

Sourced from OpenAire: https://www.openaire.eu/how-to-make-your-data-fair

Image: https://book.fosteropenscience.eu/

On data interoperability, Elaine Faustman introduced ontologies and knowledge graphs, which define the concepts and relationships between data. Ontologies are useful to turn data into machine-readable formats, and thus connect them to the semantic web: an extension of the World Wide Web that contains machine-readable data. Embedding semantics is advantageous, especially when working with heterogeneous data sources. Karen Payne discussed how data have increased in volume, velocity, and variety over the years (i.e., Big Data). In 2018, the International Data Corporation estimated the global data sphere had reached 33 zettabytes (1 zettabyte = 1 × 10¹² gigabytes). The volume and variety of data influence their management. To address issues with Big Data and complex computing, cloud computing resources have been developed that are delivered over the Internet. Cloud computing refers to virtual resources (such as infrastructure resources, services, and applications) orchestrated by management and automation software so they can be accessed by users on demand through self-service portals. Automatic scaling and resource allocation support these portals.

Rorie Edmunds discussing the value of certified data repositories
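
As a rough illustration of how an ontology makes relationships between concepts machine-readable, the Python sketch below stores a few subject, predicate, object triples and follows them to answer a query. The vocabulary (the `isA` and `measuredBy` predicates and all entity names) is invented for this example and does not come from any real ontology or from the workshop material.

```python
# A toy knowledge graph as (subject, predicate, object) triples.
# All names are hypothetical and chosen only for illustration.
TRIPLES = [
    ("SeaSurfaceTemperature", "isA", "OceanVariable"),
    ("Salinity", "isA", "OceanVariable"),
    ("OceanVariable", "isA", "EnvironmentalVariable"),
    ("SeaSurfaceTemperature", "measuredBy", "Buoy"),
]


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def ancestors(term):
    """Follow 'isA' links transitively to collect all broader concepts."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for parent in objects(current, "isA"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


sst_ancestors = ancestors("SeaSurfaceTemperature")
print(sst_ancestors)
```

Because the broader concepts are derived from the graph rather than typed into each record, a search for environmental variables can find sea surface temperature data even when sources label it differently, which is the kind of benefit embedded semantics offers for heterogeneous data.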

Technical barriers to data sharing include incomplete datasets or unguaranteed services, such as datasets that do not contain what they claim to! Moreover, certification standards play an important role in establishing trust, and hence sustaining the opportunities for long-term data sharing. Rorie Edmunds presented the certification procedures and framework available for data repositories. Certification standards such as the CoreTrustSeal look at technical, organizational, and financial infrastructure, as well as legal aspects, workflows, and risk management. Depositing data into certified repositories ensures the longevity and discoverability of one’s data, in addition to providing access to recognized expertise to address technicalities. On the other hand, those using data from certified repositories have the ability to verify results, know the provenance, and even give feedback to the data producer.

With an overview of the various resources available for data management, participants were asked to revisit both the DMPs they had started to create for their respective research projects and the challenges identified at the beginning of the workshop. The workshop definitely helped clarify most of the concerns the attendees had expressed. Personally, it was a great learning experience, and I am grateful to have been selected for this workshop. In the few months since the workshop took place, I have become much more aware of data management within the realm of my project, and have had discussions on this with my colleagues. I know that this workshop was the first WDS training event for ECRs. I am glad to have been a part of it, and would definitely recommend it to my peers. Finally, I acknowledge the work of everyone involved in the organization of the workshop. I hope that there are many more such workshops in the future, especially ones aimed at ECRs.

Isabelle Gärtner Roer and Aude Chambodut ask whether the Workshop addressed the participants’ RDM Challenges
