Martha Maiden, Program Executive for Earth Science Data Systems, NASA
NASA is at the forefront of providing Earth Observation data to researchers worldwide. Martha Maiden, Program Executive for Earth Science Data Systems, outlines how the unit’s activities still keep her interested after 20 years of service to the organisation
Firstly, could you give an overview of your role as Program Executive for Earth Science Data Systems? What is the primary aim of this programme in the field of Earth system science?
As Program Executive for Earth Science Data Systems, I am privileged to manage support for these vibrant resources, to represent them to managers and stakeholders, and to articulate the vision and strategy for their operation and evolution. The primary aim of this Program is to make NASA’s Earth Observation (EO) data and supporting materials easily available and as simple as possible for researchers to use. Satellite datasets can be very large, and they need a good deal of supporting information to be useful. Documentation is sent with the data, and information (often called ‘metadata’) such as the time and place of observation, accuracy and validation details is tagged in the digital file.
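As a rough illustration of the kind of metadata described above, the following Python sketch builds a small record that could travel with a data file. It is only a sketch under stated assumptions: the class name, field names and sample values are hypothetical and are not taken from any NASA format or system.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class ObservationMetadata:
    """Hypothetical metadata record tagged alongside a data file."""
    observed_at: str       # time of observation, ISO 8601 in UTC
    latitude_deg: float    # place of observation
    longitude_deg: float
    accuracy: float        # estimated accuracy, in the measurement's own units
    validation_note: str   # short statement of how the data were validated

    def to_json(self) -> str:
        """Serialise the record so it can be stored with the data."""
        return json.dumps(asdict(self), sort_keys=True)


def make_example() -> ObservationMetadata:
    """Build a record from made-up sample values (assumptions, not real data)."""
    when = datetime(2010, 7, 1, 12, 0, tzinfo=timezone.utc)
    return ObservationMetadata(
        observed_at=when.isoformat(),
        latitude_deg=38.99,
        longitude_deg=-76.85,
        accuracy=0.5,
        validation_note="compared against ground stations (hypothetical)",
    )


if __name__ == "__main__":
    print(make_example().to_json())
```

Keeping the accuracy and validation fields in the same record as the time and place means a user who receives the file also receives what is needed to judge it.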
The Earth Observing System Data and Information System (EOSDIS) is part of NASA’s Earth Science Data Systems Program. What is its goal and how is this benefiting users around the world? What Earth Science disciplines does it cover?
The EOSDIS is one of the largest Earth Science data systems in the world. It is a distributed system of discipline-orientated processing systems and archives, most of which are called Distributed Active Archive Centers (DAACs). This ‘System of Systems’, as such architectures are now known, appears to users as a single unified system: they can find data through a single search by connecting to the Earth Observing System (EOS) Clearinghouse, or ECHO, through our NASA entry portal, called Reverb. It is also possible to connect to ECHO through a variety of standard interfaces (called Application Program Interfaces, or APIs) from an Internet portal of the user’s choice or creation.
Access to data is a primary concern. How is NASA trying to ensure that the data is easily available? Are there certain restrictions?
While planning the EOS programme in the late 1980s, NASA recognised that data availability was as important to mission success as the deployment of its satellites. The Data System Program was initiated in 1991 and is a vital part of our Flight Program. NASA fully funds our science data systems end to end: downlink; pre-processing into raw instrument data; science processing of those data into geophysical parameters by peer-reviewed science teams; automated ingest into our DAACs with automated metadata production; and availability with complete documentation. In complementary programmes, NASA’s Earth Science Division supports mission operations, data analysis, and climate modelling.
Most importantly, all NASA datasets are available under a free and unrestricted data policy. U.S. government policy is that all Federal information is available under a full and open and unrestricted data policy at not more than the marginal cost of distribution, and U.S. government datasets are in the public domain by statute. NASA decided to make the data available at no charge, that is, without any marginal cost of distribution, so that these EO data would be fully utilised. NASA did not wish to inhibit in any way anyone who wanted to obtain and use these datasets from ordering them. NASA’s EO data are created primarily for research use. Here I use a broad definition of data that includes observation data, metadata, products (including the ancillary data used to generate them), information, algorithms (including scientific source code), documentation, models, images, and research results. However, broader uses in applied science, education and decision support, along with other public and private uses, are also common.
The Earth Science Data Program has kept metrics on the number of distinct users and on the number and volume of products distributed since 1990. Summary tables of these metrics all show exponential year-on-year increases. During U.S. government fiscal year 2010 (October 2009 to September 2010), about 500,000 distinct users obtained well over 400 million products, comprising over 3,600 Terabytes of data. In fiscal year 2011, through July 2011, just under 400 million products have already been distributed to more than 360,000 distinct users, with over 35 million products distributed in July alone. Usage of Earth system science data has surpassed the expectations of several decades ago by orders of magnitude. Some datasets held in EOSDIS were produced by NASA’s international partners, for example foreign space agencies, and we make them available through those partnerships. In those cases, any restrictions on usage or redistribution are solely at the request of our partners.
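To put the fiscal year 2010 figures quoted in this answer in perspective, the short Python sketch below works out the implied number of products per user and the implied average product size. It treats the rounded figures (“about”, “over”) as exact and assumes decimal units (1 Terabyte = 1,000,000 Megabytes), so the results are order-of-magnitude indications only; the function and variable names are illustrative.

```python
# Assumption: decimal units, 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000

# Rounded fiscal year 2010 figures from the interview, treated as exact here.
FY2010 = {
    "users": 500_000,          # "about 500,000 distinct users"
    "products": 400_000_000,   # "well over 400 million products"
    "volume_tb": 3_600,        # "over 3,600 Terabytes"
}


def products_per_user(products: float, users: float) -> float:
    """Average number of products obtained by each distinct user."""
    return products / users


def mean_product_size_mb(volume_tb: float, products: float) -> float:
    """Average size of one distributed product, in Megabytes."""
    return volume_tb * MB_PER_TB / products


if __name__ == "__main__":
    per_user = products_per_user(FY2010["products"], FY2010["users"])
    size_mb = mean_product_size_mb(FY2010["volume_tb"], FY2010["products"])
    print(f"Products per user: roughly {per_user:.0f}")
    print(f"Mean product size: roughly {size_mb:.0f} MB")
```

On these rounded inputs the averages come to roughly 800 products per user and roughly 9 MB per product. Such averages hide a wide spread, since a single user may order anything from one image to a multi-decade record.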
To what extent do you work with research communities?
I work as closely as I can with the research communities that use our data. To that end, I manage a research programme of data product development and improvement that goes above and beyond the mission and measurement-orientated science teams. Its purpose is to create what NASA calls ‘Earth System Data Records’, which are essentially what readers in the international EO arena may know as Essential Climate Variables (ECVs). These records are assembled from multiple instruments and/or instruments flying on multiple spacecraft to create consistent data records, and the work requires careful sensor intercalibration and analysis.
I also work closely with computer science and information technology researchers who are interested in Earth science data system development. This is quite a healthy area of research and development, producing Internet tools that improve data discovery, evaluation, acquisition and use.
To understand Earth system science better, what still needs to be done?
Understanding the Earth and how it is changing is a scientific challenge and also so important for us Earthlings! NASA participates in the U.S. Global Change Research Program (USGCRP), which coordinates and integrates federal research on changes in the global environment and their implications for society. The USGCRP began as a presidential initiative in 1989 and was mandated by Congress in the Global Change Research Act of 1990, which called for “a comprehensive and integrated United States research program which will assist the Nation and the world to understand, assess, predict, and respond to human-induced and natural processes of global change”. The USGCRP has 13 participating agencies.
All over the world and in international arenas, different aspects of Earth system science are maturing. All of our NASA Earth Science discipline communities – closely reflected in our disciplinary DAACs and Data Centers – participate fully in international scientific organisations working on the cutting edge of Earth system science. We have made a great deal of progress, but much more has to be done, for it seems answering one question – understanding one aspect – often leads to a new question concerning this very complex system.
Another area that you work in is environmental data policy. Can you tell us more about NASA’s view of this, especially considering a lot of NASA’s work is cutting edge?
As explained, NASA’s overall policy is the U.S. policy of full and open and unrestricted use. Within Earth Science, NASA further specifies that data availability must be timely: there is no period of exclusive use for our science team members. In the international Group on Earth Observations (GEO), requiring attribution for use is not considered a restriction. Data created with NASA public funds, and for that matter with any U.S. Government funds, do not require attribution, as a consequence of their nationally legislated public domain status. However, NASA does request attribution, and best practice for researchers is to attribute data and to document fully what transformations have been applied to the original observations, so that other researchers can judge the accuracy and validity of the data for analysis.
Earth Science is part of a larger Science Mission Directorate that also includes the space science fields of astrophysics, planetary science, and heliophysics. Datasets in these fields are likewise available at no charge, but some astrophysics data may have periods of exclusive use of up to one year.
The Program operates a number of data systems that archive and distribute different data, obtained and derived mostly from NASA’s EO satellite instruments. What initiative are you most excited about personally? Where does your interest lie?
Having trained as a physicist, I was taught rigorous error estimation methods and always to include accuracy estimates and attributes with results. Providing in-depth error analysis of long-term datasets created primarily from satellite instrument records is very tricky. Throughout my career with the NASA community scientists I have emphasised the assessment and reporting of data and parameter accuracy, but in 2010 I was able to solicit Earth System Data Records uncertainty analysis projects. These projects are to examine the properties of these long-term datasets in depth, with a focus on detecting systematic error, better quantifying error, and properly attributing sources of uncertainty. Now, with so many different users of Earth Science data, I see a real need for quantitative accuracy assessments that are carried with the data (not just discussed in scientific journals), so that applied users such as environmental managers and decision makers know the risks, the ‘odds’ so to speak, of making decisions based on the data they obtain. The projects have just begun this year, and I’m very excited to see how they progress.
Could you explain the challenges that face Earth Science Data Systems?
As discussed, answering a question about Earth’s complex system often raises another question and a new hypothesis, so scientific needs for data collections change. Other users, for applications such as resource management, may well see their needs change over time as well. It is a real challenge to make decades of large-volume data covering the entire globe available in flexible or multiple ways, so that various user needs are easily met. Can a user order data over a region for the last 20 years as easily as a global map from the past year? What about users who need parameters from different instruments that are coincident in space over time? What about users who want to look at particular features? For example, upper tropospheric water vapour, and how it is changing, became of high interest some time ago because of a significant debate in the Earth system science community, and so it was listed as an ECV.
At NASA we work very hard to make our collections easily available to our science users and to coordinate our data and user services across our Data Centers so that users are well served. The Program supports the deployment of data and information capabilities that enable the freer movement of data and information within our distributed environment of providers and users. This often requires tools and services that deliver measurable improvements in Earth Science data access and usability.
EO data spanning many disparate disciplines, timeframes and resolutions are held all over the world, in distributed and heterogeneous systems and services, by governments, scientific researchers at educational institutions, and non-profit and for-profit private sector institutions. I have often heard it said that the ultimate users’ dream is an EO capability as ubiquitous and easy to use as Google is for general information. The U.S. National Academy’s National Research Council once termed that dream a global ‘environmental infrastructure’ available through the World Wide Web.
Those of us in these kinds of posts around the world have long been working to approach that goal. And we’ve made tremendous progress in the last 20 years, as rapid technology advances have helped us. The field of data systems for EO is rapidly growing as more inspired interdisciplinary scientists from the information technology and Earth Science fields embrace the challenge. The scientific method works because of data, so making data accessible and usable through good validation and documentation practices is key to understanding our Earth System.