This workshop convened by WDS aims to discuss scientific data management best practices developed by research institutions in Latin America and the Caribbean. It will map initiatives that are either under way or in the process of being developed, their strengths and limitations, and new opportunities for collaboration. In addition, future trends and perspectives for scientific data systems, as ...
Congratulations to Dr Linhuan Wu, who has been chosen by the WDS Scientific Committee (WDS-SC) as the 2017 winner of the WDS Data Stewardship Award. Dr Wu will be presented with the 2017 Award and a prize at International Data Week 2018. On being informed of her win, Dr Wu said, 'I fully understand that the WDS Data Stewardship Award has an excellent reputation and always has high selection ...
We are pleased to inform you of the launch of the WDS Training Resources Guide, a dynamic list of training resources offered by WDS Members and beyond that are acknowledged to be useful in the mobilization of capacity in the area of data stewardship. In particular, an indication is given as to which of the Core Trustworthy Data Repositories Requirements a training resource may be relevant ...
World’s leading bodies of Social and Natural Sciences agree to merge in 2018, becoming the International Science Council to serve as the global voice for science. At a historic joint meeting in Taipei, members of the world's two leading international science councils voted today to merge, launching a process that will see the formation of a single global entity, the International Science ...
Thoughts on Future Trust
A Blog post by Wim Hugo (WDS Scientific Committee member)
The ICSU World Data System (ICSU-WDS) and the Data Seal of Approval have recently collaborated on the alignment of their respective sets of criteria for certification as a Trusted Digital Repository, and are in the process of establishing a joint certification authority, the CoreTrustSeal, to manage the associated certification process. This activity contributes to a significant future focus on the trust that can be placed in elements of a distributed global research infrastructure, and on the increased automation of its verification. However, it is only the tip of the iceberg.
The WDS Knowledge Network defines many of the components of research activity for which some form of trusted service or infrastructure component is required: ranging from the obvious need to reliably refer to research outputs, researchers, institutions, artefacts, projects, and the like, through the more complex aspects of trusted repositories, registries, vocabulary, and ontology services, to the assigning of levels of maturity, sustainability, or quality to these.
The trust that is required for research infrastructure to function properly is somewhat different to the trust that can be placed in the content that is curated by the research infrastructure—although one has to recognize that the two aspects are interrelated and, in some instances, inseparable. Furthermore, the trust that can be placed in content should ideally also distinguish between the significance and usability of that content, and its quality. These facets are not necessarily the same, but again are conflated to some extent in discussions about fitness-for-use, quality metrics, and the like.
Let’s work through these distinctions with the help of some examples.
The main aim of a scholarly publication is to assert a claim in respect of a novel finding, and to expose that claim to peer review for the purpose of correction, as required.¹ One needs to distinguish the rules (criteria for trust) associated with the process of science from the value of the content. The latter is largely judged by significance, and measured, with varying degrees of usefulness, through citation indices and impact factors.
There are arguments that this stream of self-correcting progress is broken, especially in some disciplines, and this is strongly related to the criteria for trust. Such criteria are largely stated informally and implemented with varying degrees of diligence in research institutions, and are mostly delegated to peer review to determine if the result is trustworthy. Peer review purports to determine originality (not easily automated, and essentially linked to end-user value), quality (certainly possible to automate) and validity (can be partly automated).
One could (and in my view, should) argue that processes can be verified objectively and preferably automatically, and that our aim should be to certify their veracity using measurable criteria. Such validity and quality criteria could be extended to feasibility of reproduction, access to supporting datasets, and the like. References to widely used protocols and methods, standards, samples, and research patterns, all increasingly linked to persistent identifiers, also increase the verifiable level of trust in the process.
Vocabulary (name) services play an increasingly important role in research infrastructures for a variety of reasons. Firstly, vocabularies and name services are critical to the realization of the semantic web and Linked Open Data: in essence, reducing ambiguity by referring precisely to a concept, entity, relationship, and/or characteristic of either. Secondly, these services are used to enhance the experiences of users and the value of knowledge by navigating the relationships that exist among those concepts and entities, which is conceptually captured in the WDS Knowledge Network and is increasingly implemented, for example, in projects such as Scholix. Again, one should not confuse the acceptability of the vocabulary or service content (e.g., whether all taxonomists in the world agree that a taxon is correct) with the quality of the service provided by the infrastructure component. For the former, there may never be agreement (especially with taxonomists!); but, for the latter, it is a relatively simple matter to determine what constitutes a well-defined, standardized vocabulary or name service, and community efforts are underway to document and define these criteria. In addition to such operational requirements, one should include the need for sustainability and continued access into the reasonable future.
In general, one can distinguish—for all of the elements of the WDS Knowledge Network—a clear separation between judgements about value (significance, originality, inclusiveness, consensus, etc.) and the quality of the process (sustainability, standards compliance, reproducibility, and similar concerns). And, extrapolating this into the future, I suspect that we need to get ready for the following:
- Significant broadening of services and infrastructure that cover all aspects of the WDS Knowledge Network, as well as a parallel rise in the need for certification of these services and infrastructure. Already, there is a perceived need for the certification of repositories of open source code and of vocabulary services, to name but two.
- Increased automation of the certification of processes that is in tune with an expected, rapid upturn in artificial intelligence and machine learning. This will be needed because I have no doubt that the scientific method will be increasingly automated within the next decade or so. We are already overwhelmed by volumes of data and numbers of publications, and science cannot scale any further as it is limited by human capacity.
On the basis of the above, and with science increasingly reliant on trust in a wider context, ICSU-WDS should start focussing on defining trust criteria beyond data repositories and services, and on how to automate its assessment: this being the only really scalable solution to a problem of rapidly growing scope.
1 There is a parallel focus on review and consolidation or synthesis based on existing knowledge.
From People to Pixels: Integrating Data Across the NASA DAACs
A Blog post by Lindsey M. Harriman (SGT, Inc. Contractor to USGS EROS Center/LP DAAC) and Alex de Sherbinin (WDS Scientific Committee member)
Socioeconomic and Earth Sciences researchers in search of pertinent data can now reap the benefits of a recent collaboration between two Regular Members of the ICSU World Data System.
Today, our planet supports about 7.6 billion people, with a projected increase to nearly 10 billion by 2050, and more than 11 billion by 2100. These 7.6 billion people are using land and water resources to meet their basic needs. As the population increases, their use of, and their impact on, Earth’s resources is going to change. Researchers who study the dynamics between such human–land interactions and their changes over time will look at a range of variables, such as surface temperature, vegetation health, forest cover extent, and change in land cover and habitat, as well as impacts of natural disasters, and climate trends and extremes.
Research questions about such dynamics often include:
- What is the proximity between populated areas and fire occurrences over time?
- What is the correlation between the increase of population and land surface temperature in urban areas?
- How has population affected land-cover change and vegetation growth over time in urban sprawl areas?
- How will land-cover changes affect flood and drought risk around rural and urban settlements?
To answer these types of questions, researchers need to integrate census data with Earth observation data, including data collected by NASA’s Earth Science Division Operating Missions. Recently, two NASA Distributed Active Archive Centers (DAACs), the Land Processes DAAC (LP DAAC; WDS Regular Member) and the Socioeconomic Data and Applications Center (SEDAC; WDS Regular Member), collaborated to make that integration much easier. LP DAAC and SEDAC worked together to provide access to georeferenced population data alongside land remote sensing data in the Application for Extracting and Exploring Analysis Ready Samples (AppEEARS). SEDAC’s Gridded Population of the World version 4 (GPWv4) aggregates census data from around the world into a globally consistent grid with 30 arc-second resolution (approximately 1 kilometer at the equator) for population density and counts. Soon researchers will also have access to age and sex distribution grids. LP DAAC disseminates land remote sensing data collected by several NASA missions, including data from the popular Moderate Resolution Imaging Spectroradiometer (MODIS) sensor onboard Terra and Aqua, and provides access to a selection of these datasets through AppEEARS.
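As a quick check on the grid resolution quoted above, the ground width of a 30 arc-second cell can be estimated from the Earth's equatorial radius. The sketch below is illustrative only: it assumes a spherical approximation with the WGS84 equatorial radius, and the function name `cell_width_km` is a hypothetical helper, not part of GPWv4 or AppEEARS.

```python
import math

# WGS84 equatorial radius in kilometres (a standard geodetic constant).
EQUATORIAL_RADIUS_KM = 6378.137


def cell_width_km(arc_seconds: float, latitude_deg: float = 0.0) -> float:
    """Approximate east-west width of a grid cell of the given angular size.

    Assumption: a spherical Earth, so the length of one degree of longitude
    shrinks with the cosine of the latitude.
    """
    degrees = arc_seconds / 3600.0
    km_per_degree_at_equator = 2.0 * math.pi * EQUATORIAL_RADIUS_KM / 360.0
    return degrees * km_per_degree_at_equator * math.cos(math.radians(latitude_deg))


if __name__ == "__main__":
    # Width of a 30 arc-second cell at the equator and at 35 degrees north.
    print(f"{cell_width_km(30):.3f} km at the equator")
    print(f"{cell_width_km(30, 35.0):.3f} km at 35 degrees N")
```

Under these assumptions a cell is a little under 1 km wide at the equator and narrower in the east-west direction at higher latitudes, which is why the resolution is usually quoted "at the equator".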
Figure 1. Daily land surface temperature in Kelvin (K) and population trend, 2010–2017 for rural and urban points in North Carolina (based on MODIS MOD11A1 daily 1-km data and GPWv4, UN-Adjusted)
(a) Farm northwest of Nashville, North Carolina, USA. The red pin represents the location 36°N, 78°W. Image: Google Maps. Time series plots: output from AppEEARS.
(b) Suburban area of Charlotte, North Carolina, USA, experiencing rapid population growth. The red pin represents the approximate location 35°N, 81°W. Image: Google Maps. Time series plots: output from AppEEARS.
Figure 1 provides examples of time series plots of population growth and daily land surface temperature using the Point Sample function in AppEEARS. Users can interact with these visualizations within the application and also download the data values in comma separated value format.
Additionally, LP DAAC has collaborated with a third DAAC, the National Snow and Ice Data Center DAAC (NSIDC DAAC; WDS Regular Member), to provide MODIS snow-cover data from its archive for access through AppEEARS as an additional variable describing land dimension. SEDAC, LP DAAC, and NSIDC DAAC are all part of NASA’s Earth Observing System Data and Information System, and through their collaborations, AppEEARS now provides access to more than 100 data products from the three data centers in a single place, at no cost to the user. Many possible combinations of data can be extracted from AppEEARS for use in analyses of the dynamics between populations and ecosystems over time.
AppEEARS also provides benefits during the data preparation process. When performing a sample request, users drastically reduce the amount of data they ultimately need to download to perform their analysis. AppEEARS enables users to subset data based on geographic and temporal parameters, as well as by specific data layer. Since users can reformat the data and reproject within the application, the amount of post-processing required is reduced. Furthermore, AppEEARS not only provides data values, but also quality data values and their descriptions, when applicable. Lastly, users can visualize plots of the data values (point sample) or summary statistics (area samples) from the sample request within the application.
The collaboration around AppEEARS represents an initial step away from the idea that users need to download large amounts of data for local filtering, processing, integration, and analysis, and moves towards a model where analysis-ready data can be more immediately accessed. Coordinated tools and application development on the substantial holdings of all 12 DAACs is an important strategic direction for NASA’s Earth Science Data and Information System Project (WDS Network Member).
So, what’s your use case for AppEEARS?
Have questions about AppEEARS? Email: firstname.lastname@example.org.
Finding Paleoclimate Data via the World Data Service for Paleoclimatology Just Got Easier
A Blog post by Wendy S. Gross and Eugene R. Wahl (World Data Service for Paleoclimatology)
The World Data Service for Paleoclimatology (WDS Regular Member; https://www.ncdc.noaa.gov/paleo), housed at the National Oceanic and Atmospheric Administration's National Centers for Environmental Information, provides data and information to understand natural climate variability and future climate change.
Paleoclimatology is the study of ancient climates, prior to the widespread availability of instrumental records. Paleoclimatologists study several different types of environmental proxy evidence to understand what the Earth’s past climate was like and why.
Paleoclimate proxies and reconstructions used to understand the Earth’s past climate.
Finding the paleoclimate data you need among the more than thirteen thousand studies, covering the globe and freely available online, just got easier. With our new web service, you can search for data across a wide range of proxy types and climate reconstructions. The new service integrates all of the capabilities of our previous search mechanisms, allowing them to be used together in new and powerful ways, and in conjunction with logical operators.
Geographic coverage of World Data Service for Paleoclimatology data.
There are multiple ways to search for relevant data: input a search term into the general search text box, select a data type from the menu, narrow your selections in the advanced search feature, or use all these capabilities together. Based on your search criteria, the search automatically builds an application programming interface (API) request that you can reuse in the future. After you input your search criteria, the results are populated with all relevant studies, along with an overview of the metadata that links to any additional data and information.
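To picture how search criteria can be captured in a reusable request, here is a minimal sketch that encodes a search term and one or more data types as URL query parameters. It is not the service's actual API: the base URL and the parameter names `searchText` and `dataType` are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration; not the service's real URL.
BASE_URL = "https://example.org/paleo/search"


def build_search_url(search_text=None, data_types=()):
    """Encode search criteria as a query string (parameter names are assumed)."""
    params = []
    if search_text:
        params.append(("searchText", search_text))
    # A repeated parameter expresses "any of these data types".
    for data_type in data_types:
        params.append(("dataType", data_type))
    if not params:
        return BASE_URL
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # The same URL can be saved and requested again later to repeat the search.
    print(build_search_url("drought", ["tree ring", "coral"]))
```

Storing the criteria in the URL itself is what makes such a search repeatable: anyone holding the link can rerun exactly the same query.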
A new feature is a section of the site that hosts predefined searches that paleoclimatology scientists have found most useful in the past. You'll be able to select one or multiple data types, such as ice cores or corals, from the list assembled by scientists, and the search will produce the most relevant and noteworthy studies related to that topic. In addition, the predefined searches page enables you to jointly query by location and data type. You’ll also be able to search through every study related to a specific data type, with user-friendly columns that allow you to easily sort through the studies.
Using the new web service can help you discover information on topics such as:
• Common years of great drought or wetness across specific regions
• Coral records related to El Niño occurrences
• Air temperature reconstructions
The World Data Service for Paleoclimatology archives and distributes data contributed by thousands of scientists around the world. We greatly appreciate their long-standing data submissions and our collaborations with them. To contact the World Data Service for Paleoclimatology, please email: email@example.com.
Essential Climate Variables – Global Glacier Change Data Indicate Continued Strong Ice Losses in 2015 and 2016
A Blog post by Isabelle Gärtner-Roer (WDS Scientific Committee member)
Changes in glaciers provide some of the clearest evidence of climate change, and as such they constitute key indicators and unique demonstration objects of ongoing climate change. Beside this scientific aspect, glacier changes have an impact on local hazard situations, regional water cycles, and global sea level.
The Global Terrestrial Network for Glaciers (GTN-G) is the framework for the internationally coordinated monitoring of glaciers in support of the United Nations Framework Convention on Climate Change. Within GTN-G, the World Glacier Monitoring Service (WGMS; WDS Regular Member), based at the University of Zurich, Switzerland, is responsible for the collection and documentation of glacier fluctuations such as annual mass balances and length changes. WGMS celebrated its 30th anniversary last year.
Figure 1. Mean annual mass balance of reference glaciers.
The latest mass balance data for the hydrological period 2014/15 and preliminary estimates for 2015/16 indicate continued strong ice losses. In fact, after 2002/03, 2014/15 is the second most negative year since the beginning of the monitoring programme at WGMS (as shown in Fig. 1 for glaciers with long, continuous measurement programmes; the so-called 'reference glaciers'). This value is negative despite most of the glaciers in Norway and Iceland, as well as the few that are monitored in New Zealand and Antarctica, showing positive balances in the corresponding year (see Table 3 on this page). Since 1999/00, WGMS has already documented four years with a global mean ice thickness loss of more than 1000 millimetres water equivalent (mm w.e.). These new data show a continuation in the global trend of strong ice losses over the past few decades, and bring the cumulative average thickness loss since 1980 of the reference glaciers to almost 20,000 mm w.e.
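For readers less familiar with these units, a cumulative figure is simply the running sum of the annual mass balances, and 1000 mm w.e. equals 1 m w.e. The short sketch below shows that arithmetic. The function names are hypothetical helpers and do not represent WGMS software, and no actual WGMS values are included.

```python
from itertools import accumulate

MM_PER_M = 1000.0


def cumulative_balance_mm(annual_balances_mm):
    """Running total of annual mass balances in mm w.e. (losses are negative)."""
    return list(accumulate(annual_balances_mm))


def mm_we_to_m_we(value_mm):
    """Convert millimetres water equivalent to metres water equivalent."""
    return value_mm / MM_PER_M


if __name__ == "__main__":
    # The rounded cumulative loss quoted in the text, expressed in metres.
    print(mm_we_to_m_we(-20000), "m w.e.")
```

Expressed this way, the cumulative loss of almost 20,000 mm w.e. quoted above corresponds to almost 20 m w.e. averaged over the reference glaciers.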
As a Regular Member of the ICSU World Data System, WGMS publishes glacier data in a standardized format and makes them freely available to scientists, policy makers, and the wider public. Access is provided online through the 'Fluctuations of Glaciers Browser' and the 'Glacier App', as well as being consolidated in the 'Global Glacier Change Bulletin'.
Figure 2. Training course on glacier mass balance in La Paz, Bolivia (Photo: M. Zemp)
Upcoming challenges in glacier monitoring are very much related to the disintegration and vanishing of glaciers. Some of the glaciers under monitoring programmes disintegrate into several parts, while others, such as the Lewis Glacier on Mount Kenya, completely disappear. These issues demand continuous adaptation of monitoring strategies on both a local and global level. This is one reason why WGMS organizes training courses for Principal Investigators who perform glacier measurements and deliver their glacier data to WGMS. The last training course was held in 2016; its participants came from Latin America (Mexico, Colombia, Ecuador, Peru, Bolivia, Chile, and Argentina) and are involved in ongoing mass balance programmes in their region (see Fig. 2). These participants were trained in both fieldwork and data analysis by an international team of experts in glacier monitoring and capacity building.
Our work relies on the cooperation and help of many scientists and observers throughout the world. We highly appreciate their long-lasting contributions in collaboration with our National Correspondents coordinating the collection of data in their country for submission to WGMS.