News Archive

Second LAC Scientific Data Management Workshop: Call for Abstracts

This Workshop is convened by WDS, in collaboration with the Brazilian Academy of Sciences, the São Paulo Research Foundation, and the Research Data Alliance (RDA). It builds on the success of the First Latin America and Caribbean Workshop in 2018 that explored the data landscape in the region to understand the opportunities and challenges, and discussed how data initiatives could ...

21st Meeting of the WDS Scientific Committee

The 21st Meeting of the WDS Scientific Committee (WDS-SC) took place on 04–05 November in Paris, France. We are very grateful to our parent organization, the International Science Council (ISC), for kindly hosting the WDS-SC during its second biannual meeting of 2019. The 21st Meeting began with an update on the latest developments in ISC by Dr Heide Hackmann, ISC Chief Executive ...

WDS PICO Session at EGU 2020: Call for Abstracts

The European Geosciences Union (EGU) General Assembly 2020 will be held on 3–5 May 2020 in Vienna, Austria. The World Data System of the International Science Council is leading the following session, and we would like to encourage your abstract submission by the deadline of Wednesday, 15 January 2020, 13:00 CET. Session ID: ESSI3.7 Session Title: Inspiring the Next Generation of ...

WDS Blog

Metadata Stewardship in Genetic Research: Enabling a Research Community Toward Best-practice

A Blog post by Libby Liggins (2019 WDS Data Stewardship Award Winner)

For over four decades, scientists have been collecting DNA sequence data for thousands of the world’s species. In the biodiversity and eco-evolutionary sciences, these data are generated to describe new species, define their evolutionary relationships, determine the levels of dispersal among populations, and assess levels of genetic diversity across a species’ range. The rate at which we accrue these DNA sequences has increased over time as the use of genetic data has diversified, and as the sequencing technologies used to decode the DNA sequences of organisms have become faster, cheaper, and much higher throughput. As this trend continues, it is anticipated that we may soon have more DNA sequences in digital form than exist in the natural world.

This massive and growing data resource could now be consolidated for multiple species and populations and reused to better understand the world’s biodiversity at the genetic level. Genes are recognized as a fundamental component of the biodiversity hierarchy, but have received less attention than species- and ecosystem-level measures of biodiversity. In part, this may be because synthetic analyses of genetic data are challenging and sometimes impossible, as there has been no concerted effort towards the curation and stewardship of this valuable data resource. While funding agencies and publishers advocate deposition of DNA sequence data in open-access repositories (such as those of the National Center for Biotechnology Information and the European Bioinformatics Institute), they do not require the deposition of standardized metadata such as the sampling location, date, and habitat of the sampling event (Pope et al. 2015). This ‘metadata gap’ means that information essential for multispecies analyses of biodiversity and evolutionary patterns across our globe has not been readily available.

The Genomic Observatories MetaDatabase (GEOME; Deck et al. 2017) has recently provided a solution to this metadata gap. GEOME links ecologically and evolutionarily relevant metadata with DNA sequences uploaded to open-access repositories. The metadatabase incorporates the latest international standards for biodiversity and genomic data, and helps researchers store and access genetic data relevant to studies of large-scale biodiversity and conservation problems. In conjunction with the open-access DNA sequence repositories, GEOME ensures that researchers and projects generating genetic data can adhere to the FAIR Principles (Findable, Accessible, Interoperable, Reusable; Wilkinson et al. 2016), promoting research community best-practice.


The Ira Moana Project logo. The Māori phrase Ira Moana could be interpreted as meaning ‘ocean genes’ or ‘dot in the ocean’. Both seem appropriate when thinking about the scale of DNA in the vastness of the ocean. The use of te reo Māori (Māori language) resonates with the project objectives that are uniquely New Zealand, as is the Māori language. Yet, moana is used to describe the ocean by many Pacific nations, reminding us of the connections that New Zealand’s biodiversity has with the wider Pacific region.

The Ira Moana Project has partnered with GEOME both to enable a collaborative network of researchers to adhere to these standards in community best-practice, and to deliver a searchable metadatabase for the genetic data of Aotearoa New Zealand’s marine organisms. The Project aims to build and maintain the most comprehensive national database of marine genetic data in the world, ensuring kaitiakitanga (guardianship and stewardship) and creating opportunities for data synthesis to inform New Zealand’s future research directions and conservation decisions. The Ira Moana Project builds on the success of the Diversity of the Indo-Pacific Network (DIPnet), which, through the use of GEOME and multinational collaboration, has created the largest population genetic database in the world. DIPnet consolidated over 200 genetic datasets for Indo-Pacific marine organisms, and is now delivering novel biodiversity insights for the Indo-Pacific Ocean (e.g., Crandall et al. 2019), which is the largest and one of the most threatened biogeographic regions on our globe.

The Ira Moana Project is similarly founded in concern for the marine environment. New Zealand is a marine nation—we have one of the largest exclusive maritime economic zones in the world, which sustains our marine and tourism industries, and provides significant recreational and social benefits for New Zealanders. Nationally, and as global citizens, we are under pressure to make informed decisions regarding commercial and recreational activities, and how they can be balanced with the protection of our marine ecosystems. Such decisions of environmental, economic, and societal impact need to be transparent and based on robust information, as well as including knowledge about biodiversity that stretches from ecosystems to genes. The Ira Moana Project has established that there are over 430 genetic datasets for New Zealand marine organisms, and is now working to consolidate these data for the benefit of future researchers and generations of New Zealanders.


The data lifecycle in genetic research. DNA sequence data are routinely deposited into open-access genetic data repositories (under OUTPUTS). Despite metadata being accrued at every step of research (*), starting with COLLECTION, the practice of depositing metadata into repositories such as the Genomic Observatories MetaDatabase (GEOME) is very recent. The Ira Moana Project is one of the projects using the infrastructure provided by GEOME. Stewardship of metadata alongside DNA sequence data ensures that genetic research in the biodiversity, ecological, and evolutionary sciences can be reproducible, that the genetic data can be re-used, and that the provenance of the genetic data and the rights of the local communities involved in the research are maintained.

As the first national project to make use of the GEOME infrastructure, the Ira Moana Project has worked with GEOME to extend the capability of the metadatabase to additionally acknowledge Indigenous rights. It has become apparent that what is considered fair and equitable research practice within the research community may not be fair and equitable within broader society. Through collaboration with Local Contexts and Te Mana Raraunga (the Māori Data Sovereignty Network), the Ira Moana Project and GEOME are now beta-testing the capacity for researchers to add Notices (such as the Traditional Knowledge Notice; TK Notice) and new Biocultural Labels as metadata for DNA sequence data. Notices signal that there are accompanying Indigenous rights needing further attention for any responsible and equitable future use of the data. Biocultural Labels further allow the addition of provenance information and community expectations for future use based on Indigenous Data Sovereignty principles, including the CARE Principles (Collective Benefit, Authority to Control, Responsibility, Ethics) launched by the Global Indigenous Data Alliance. They thereby enable Indigenous stewardship and persistent recognition of Indigenous rights within an international framework (complying with the Nagoya Protocol to the Convention on Biological Diversity). The implementation of Notices and Biocultural Labels using GEOME infrastructure is a first for a biological resource and for genetic data, establishing new ethical standards in this research community.

Workshops and datathons for New Zealand researchers have encouraged uptake and use of the metadata infrastructure provided through the Ira Moana Project and GEOME. More than 85 researchers have now joined the Ira Moana Project Network; being part of the network means being ‘on-board’ both with the things that the Ira Moana Project is trying to achieve for New Zealand, and with the metadata standards that GEOME is accommodating for researchers worldwide. As there is a global community of researchers who generate genetic data, it will be some time before there is universal uptake of these newly recognized standards of best-practice. Nonetheless, we should be encouraged by the fact that, as a community, we have made similar transformations in our practice in the past; since the introduction of the Joint Data Archiving Policy, it has been considered standard practice to deposit genetic data into open-access repositories. As such, we anticipate that the Ira Moana Project metadatabase will continue to grow and serve New Zealanders, and that there will be increasing uptake of the services that GEOME provides to the research and wider community.

Literature cited
 – Crandall ED, Riginos C, Bird CE, Liggins L, Treml E, Beger M, Barber PH, Connolly SR, Cowman PF, DiBattista JD, et al. 2019. The molecular biogeography of the Indo-Pacific: Testing hypotheses with multispecies genetic patterns. Global Ecology and Biogeography. 58(5):403–418.
 – Deck J, Gaither MR, Ewing R, Bird CE, Davies N, Meyer C, Riginos C, Toonen RJ, Crandall ED. 2017. The Genomic Observatories Metadatabase (GEOME): A new repository for field and sampling event metadata associated with genetic samples. PLoS Biology. 15(8):e2002925.
 – Pope LC, Liggins L, Keyse J, Carvalho SB, Riginos C. 2015. Not the time or the place: the missing spatio‐temporal link in publicly available genetic data. Molecular Ecology. 24(15):3802–3809.
 – Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten JW, da Silva Santos LB, Bourne PE, Bouwman J. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data. 3

Use of Data from Citizen Observatories to Complement GEOSS Repositories – Experiences from EU Funded Projects

A Blog post by Ioana Popescu (WDS Scientific Committee Member)

Citizen science is getting more and more attention worldwide; in particular, there is a growing interest in involving citizens in data collection due to its capability to complement the acquisition of data classically accomplished through existing complex instrumentation networks. Scientists have experimented with multiple forms of citizen science projects, which have been successfully implemented in many fields. The value of using citizen contributions has been proven—or at least explored—in almost all scientific domains, and its potential is currently also being investigated in the processes of decision- and policy-making.

There are many definitions of citizen science. The definition most often used is that of Buytaert et al. (2014): the participation of the general public (i.e. non-scientists) in the generation of new knowledge. In this blog post, I focus on citizen science from the perspective of data collected by citizens and the use of these data, but there is also much research looking into how to involve citizens and, consequently, how they participate in the collection of data. From the latter viewpoint, a wide range of terminology can now be found in the literature; for example, citizen observatory (CO), citizen sensing, trained volunteers, crowdsourcing, community-based monitoring, volunteered geographic information, eyewitnesses, and so on.

As mentioned in the title, I would like to spend the remainder of this blog post briefly introducing four Horizon 2020 funded projects that have used innovative technologies for collecting data with the help of citizen scientists. The projects ran from the second half of 2016 until mid-2019, and were clustered under WeObserve, which examines the challenges faced by COs in terms of awareness, acceptability, and sustainability. They shared the specific goal that their final (analyzed and processed) data products would not only complement existing data elements within the Global Earth Observation System of Systems (GEOSS), but also become new GEOSS contributions.

SCENT (Smart Toolbox for Engaging Citizens into a People-Centric Observation Web)

Citizens were engaged in environmental monitoring of land-cover/use changes using their smartphones and tablets, enabling them to become the ‘eyes’ of the policymakers. In particular, the project looked at two pilots—the urban case of the Kifisos river in Attica, Greece and the rural case of the Danube Delta in Romania—where the citizen-collected data were used to assess flood models and flooding patterns. You can read more about this project here.

LANDSENSE (Connecting citizens with satellite imagery to transform environmental decision making)

The focus of this project was on the potential of Earth observations taken by citizen scientists to augment and improve the way we see, map, and understand the world. Three main areas of application were selected as demonstrators: urban landscape dynamics, agricultural land use, and forest and habitat modelling. Read more about LANDSENSE here.

Data collection cycle for citizen science campaigns in water management. The study focus is highlighted in yellow. (Taken from IEEE article: Citizens’ Campaigns for Environmental Water Monitoring: Lessons From Field Experiments.)

Groundtruth2.0 (How to impact decision making with citizen observatories)

This project investigated the interaction between people and technology in setting up a successful system for land and natural resources management. It combined the social dimensions of COs with enabling technologies so that the implementation of each observatory was tailored to its envisaged societal and economic impacts, with a specific emphasis on flora and fauna, as well as water availability and quality. Find out more about the project here.

GROW (Grow Observatory)

In this project, citizen scientists collected information on land, soil, and water resources to answer a long-standing challenge for space science; namely, the validation of soil moisture detection from satellites. Read more here.

Buytaert, W., et al: Citizen science in hydrology and water resources: opportunities for knowledge generation, ecosystem service management, and sustainable development, Front. Earth Sci., 2, 26, doi: 10.3389/feart.2014.00026, 2014.

Health Data Challenges Regarding ‘Scientific Medical Processing Challenges’

A Blog post by Marc Nyssen (WDS Scientific Committee Member)

Recently, the biomedical and clinical engineers who are associated with the International Federation for Medical and Biological Engineering (IFMBE), and who also belong to the International Union for Physical and Engineering Sciences in Medicine (IUPESM; the umbrella organization linking the engineers at IFMBE and the medical physics experts at the International Organization for Medical Physics), took the initiative to include competitions called ‘scientific challenges’ as a part of their conferences. The purpose of these challenges is to encourage young researchers to develop their skills by showing how they can extract information from biomedical datasets and report on their results.

A ‘challenge call’ is made public a few months before the conference alongside a deadline for the result papers, which are then evaluated by a jury. Introduced by Prof Paulo Carvalho from Coimbra University in Portugal and Prof Ratko Magjarevic from Zagreb University in Croatia, the challenges have proved quite successful, with the participation of 20–30 groups of young researchers responding to the first call.

A major problem for the organizers, however, has been to find adequate datasets containing well-documented biomedical data, such as respiratory measurements, electroencephalography recordings, electro-cardiac recordings, and the like. While many state that Big Data is widely accessible and available, well-documented and consistent biomedical datasets are difficult to find. This has resulted in the IFMBE having to actually sponsor teams to collect appropriate datasets of biomedical measurements specifically for the ‘scientific challenge’ competitions!

To address such issues, programmes are now being started that encourage universities and research groups in the Biomedical Sciences to make their datasets public while taking adequate precautions to protect the privacy of patients when such datasets are linked to physical persons. IFMBE is currently exploring practical ways to constitute collections of well-documented biomedical datasets that comply with the FAIR principles and that are made publicly available to researchers via a repository. Moreover, it is encouraging member societies at large to take up similar schemes either themselves or via universities.

To be continued...


New Journal & Call for Paper: Patterns From Cell Press

We are happy to announce that a new journal, Patterns from Cell Press, will be launching soon. Register here to receive the first issue. Patterns is a premium open access journal from Cell Press, publishing ground-breaking original research across the full breadth of data science. Data are the foundation of all research, and all data are in scope, regardless of original domain. ...

Data Engineer Job Opening at Jet Propulsion Lab for PO.DAAC

The Physical Oceanography DAAC (PO.DAAC; WDS Regular Member) is seeking a Data Engineer whose technical skills and interests are focused on data management, documentation, and data curation approaches applied to remote sensing and Earth science data systems. You will apply your experience in data management, particularly in evaluation and integration of heterogeneous oceanographic data and ...

ISRIC: Sneak Preview of New Edition of SoilGrids250m & Request for Feedback

ISRIC – World Soil Information (WDS Regular Member) has been updating its soil property maps for the world (SoilGrids250m). Numerous improvements were implemented since publication of the '2017 version', making this a completely new product. A sneak preview is available here; the full version with supporting web viewer will be released around March 2020. So far, the layers have been ...


Call for Contributions for iPRES 2020

iPRES is the premier and longest-running conference series on digital preservation. Since 2004, we have had annual iPRES conferences in rotation around the globe on four continents so far. Our conference brings together 300-400 scientists, students, researchers, archivists, librarians, providers, and other experts to share recent developments, innovative projects and to collaboratively solve ...

Data Distribution Centre Support for the IPCC Sixth Assessment

Stockhause et al. in Data Science Journal (Volume 18, Number 20). Abstract: The information provided in the Intergovernmental Panel on Climate Change (IPCC) Assessment Reports (ARs) inform climate change policy development. Within the IPCC the scientific coordination of the ARs is conducted by three Working Groups (WGs) comprising of the ...

Towards Trusted Data Services: World Data System & Certification

WDS Scientific Committee in SCOSTEP VarSITI Newsletter (Volume 18). Over the last 20 years, the exchange and availability of research data has undergone a major upheaval with the widespread use of the Internet. Researchers and research organizations, such as those involved in SCOSTEP activities, had obviously not waited for this electronic era to exchange observations, data, and ...
