1. Introduction
The Open Geospatial Consortium (OGC) is releasing this Call for Participation (CFP) to solicit proposals for OGC Testbed-19. The Testbed-19 initiative will explore six tasks: Agile Reference Architecture; Analysis Ready Data; Geodatacubes; Geospatial in Space; High-Performance Computing; and Machine Learning: Transfer Learning for Geospatial Applications.
1.1. Background
OGC Testbeds are annual research and development initiatives that explore geospatial technology from various angles. They take the OGC Baseline into account and at the same time explore selected aspects with broad teams from industry, government, and academia to advance Findable, Accessible, Interoperable, and Reusable (FAIR) principles and OGC's open standards capabilities. Testbeds integrate requirements and ideas from a group of sponsors, which allows leveraging symbiotic effects and makes the overall initiative more attractive to both participants and sponsoring organizations.
The Open Geospatial Consortium (OGC) is a collective problem-solving community of more than 550 experts representing industry, government, research, and academia, collaborating to make geospatial (location) information and services FAIR: Findable, Accessible, Interoperable, and Reusable. The global OGC Community engages in a mix of activities related to location-based technologies: developing consensus-based open standards and best practices; collaborating on problem solving in agile innovation initiatives; participating in member meetings, events, and workshops; and more. OGC's unique standards development process moves at the pace of innovation, with constant input from technology forecasting, practical prototyping, real-world testing, and community engagement.
OGC's member-driven consensus process creates royalty-free, publicly available, open geospatial standards. Existing at the cutting edge, OGC actively analyzes and anticipates emerging tech trends and runs an agile, collaborative Research and Development (R&D) lab, the OGC Collaborative Solutions and Innovation (COSI) Program, that builds and tests innovative prototype solutions to members' use cases.
1.2. OGC COSI Program Initiative
This initiative is being conducted under the OGC Collaborative Solutions and Innovation (COSI) Program. The OGC COSI Program aims to solve the biggest challenges in location. Together with OGC members, the COSI Team is exploring the future of climate, disasters, defense and intelligence, and more.
The OGC COSI Program is a forum for OGC members to solve the latest and hardest geospatial challenges via a collaborative and agile process. OGC members (sponsors and technology implementers) come together to solve problems, produce prototypes, develop demonstrations, provide best practices, and advance the future of standards. Since 1999, more than 100 funded initiatives have been executed, ranging from small interoperability experiments run by an OGC working group to multi-million-dollar testbeds with more than three hundred OGC member participants.
OGC COSI initiatives promote rapid prototyping, testing, and validation of technologies, such as location standards or architectures. Within an initiative, OGC Members test and validate draft specifications to address geospatial interoperability requirements in real-world scenarios, business cases, and applied research topics. This approach not only encourages rapid technology development, but also determines the technology maturity of potential solutions and increases the technology adoption in the marketplace.
1.3. Benefits of Participation
This initiative provides an outstanding opportunity to engage with the latest research on geospatial system design, concept development, and rapid prototyping with government organizations (Sponsors) across the globe. The initiative provides a business opportunity for stakeholders to mutually define, refine, and evolve service interfaces and protocols in the context of hands-on experience and feedback. The outcomes are expected to shape the future of geospatial software development and data publication. The Sponsors are supporting this vision with cost-sharing funds to partially offset the costs associated with development, engineering, and demonstration of these outcomes. This offers selected Participants a unique opportunity to recoup a portion of their initiative expenses. OGC COSI Program Participants benefit from:
- Access to funded research & development
- Reduced development costs, risks, and lead-time of new products or solutions
- Close relationships with potential customers
- First-to-market competitive advantage on the latest geospatial innovations
- Influence on the development of global standards
- Partnership opportunities within our community of experts
- Broader market reach via the recognition that OGC standards bring
1.4. Master Schedule
The following table details the major Initiative milestones and events. Dates are subject to change.
| Milestone Date | Event |
| --- | --- |
| 24 February 2023 | Release of CFP. |
| 6 March 2023 | Questions from CFP Bidders for the Q&A Webinar due. (Submit here using the Additional Message textbox.) |
| 8 March 2023 | Bidders Q&A Webinar, 10:00-11:00 EST. (Recording Link) |
| 10 April 2023 | CFP Proposal Submission Deadline (11:59pm EST). Note: Extended to 11 April 2023 (11:59pm EST). |
| 24 April 2023 | All Testbed Participation Agreements signed. |
| 10-11 May 2023 | Kickoff Workshop (in person at Cesium HQ, Philadelphia, PA). |
| 5-9 June 2023 | OGC Member Meeting, Huntsville, AL (optional). |
| 22 June 2023 | Initial Engineering Reports (IERs) due. |
| 29 September 2023 | Technology Integration Experiments (TIE) component implementations completed and tested; preliminary Draft Engineering Reports (DERs) completed and ready for internal reviews. |
| 31 October 2023 | Ad hoc TIE demonstrations and Demo Assets posted to Portal; near-final DERs ready for review; WG review requested. |
| 16 November 2023 | Final DERs (incorporating internal and WG feedback) posted to Pending to meet the three-week rule before the Technical Committee (TC) electronic vote for publication. |
| 6 December 2023 | Last deadline for the final DER presentation in the relevant WG for the publication electronic vote. |
| 7 December 2023 | Last deadline for the TC electronic vote on publishing the final DER. |
| 29 December 2023 | Participants' final summary reports due. |
| January 2024 | Outreach presentations at an online demonstration event. |
2. Technical Architecture
This section provides the technical architecture and identifies all requirements and corresponding work items. It references the OGC Standards Baseline, i.e., the complete set of member-approved Abstract Specifications, Standards (including Profiles and Extensions), and Community Practices where necessary.
Please note that some documents referenced below may not have been released to the public yet. These reports require a login to the OGC portal. If you don’t have a login, please contact OGC using the Additional Message textbox in the OGC COSI Program Contact Form.
The Testbed deliverables are organized in a number of tasks:
- Task 1: Geospatial in Space
- Task 2: Machine Learning: Transfer Learning for Geospatial Applications
- Task 3: Geodatacubes
- Task 4: Analysis Ready Data
- Task 5: Agile Reference Architecture
- Task 6: High Performance Computing
The above tasks will be grouped into common use cases or stories once use cases and data are finalized.
2.1. Geospatial in Space
Currently, most OGC Standards focus on data that is observed on the surface or directly above planet Earth. There has been less focus on extra-terrestrial space and the exact location of the remote sensors.
Testbed-18 evaluated current standards with respect to the exact positioning of sensors at any location within the solar system and their corresponding data streams. The next step is to evaluate Implementation Specifications. Use cases have been identified, and this task seeks additional sponsor participation as well as sample data, which should be realistic but does not have to be authentic.
2.1.1. Problem Statements and Research Questions
The Geospatial in Space task brings together the results of two Testbed-18 work items: 3D+, and Moving Features and Sensor Integration. The OGC Moving Features architectures developed through Testbed-18 have achieved a fairly high degree of maturity. The Connected Systems Standards Working Group (SWG) has been formed to take this work to the next level: formal standardization. Before taking that step, it is important to make sure that all potential uses of this technology are addressed.
Testbed 18 also explored the extension of existing OGC standards and technologies to support extra-terrestrial applications (3D+). This includes spatial-temporal services and data for both non-Terrestrial planetary and open space applications. Features in this environment are almost always Moving Features. It would be premature to advance new Moving Features standards without also addressing the 3D+ requirements. The Moving Features in 3D+ Dimensions work item addresses that issue.
To achieve this objective, the following research topics should be explored:
1. Extend the architecture and draft standards for Moving Feature content developed through Testbed-18 to support 3D (six degrees of freedom) and 4D (spacetime) geometries.
2. Develop ISO 19111-conformant definitions for non-Earth planetary Coordinate Reference Systems and register those definitions in a Coordinate Reference System registry. At a minimum, this should include a CRS for the Moon and one for Mars (an illustrative sketch follows this list).
3. Develop a Spatial Reference System definition for Minkowski spacetime based on ISO 19111 and ISO 19108. Identify any required modifications to those standards. Register that definition in a CRS registry.
4. Explore the ability of existing Moving Features standards and software to work with the non-Terrestrial CRSs (#2) and the spacetime CRS (#3). Identify shortfalls and propose solutions.
5. Explore the ability of existing Moving Features standards and software to work with Moving Features traversing open space.
6. Develop and prototype standards for implementing a graph of coordinate transformations as described in the Testbed-18: Reference Frame Transformation Engineering Report.
7. Research and prototype an effective approach to accommodate Lorentz space and time contractions within coordinate system transformations.
8. Develop one or more versions of GeoTIFF for extraterrestrial use, including the case where the corner coordinates are at infinity.
9. Develop and submit change requests for existing standards as needed.
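As a concrete illustration of Topic #2, the following sketch defines a body-fixed lunar CRS in WKT2 and parses it with pyproj. This is a minimal, hypothetical definition for illustration only: the datum name and CRS name are invented, and the spherical radius of 1,737,400 m (the IAU mean lunar radius) stands in for whatever an authoritative registry would ultimately prescribe.

```python
from pyproj import CRS

# Hypothetical WKT2 definition of a body-fixed lunar geographic CRS.
# Names and parameters are illustrative, not registered definitions.
MOON_WKT = """
GEODCRS["Moon (illustrative)",
    DATUM["Lunar body-fixed reference frame (illustrative)",
        ELLIPSOID["Moon mean sphere", 1737400, 0,
            LENGTHUNIT["metre", 1]]],
    CS[ellipsoidal, 2],
        AXIS["latitude", north,
            ANGLEUNIT["degree", 0.0174532925199433]],
        AXIS["longitude", east,
            ANGLEUNIT["degree", 0.0174532925199433]]]
"""

moon_crs = CRS.from_wkt(MOON_WKT)
print(moon_crs.name)           # Moon (illustrative)
print(moon_crs.is_geographic)  # True, even though the body is not Earth
```

The same WKT2 pattern, with different datum and ellipsoid parameters, would apply to Mars or any other body; registration of such definitions is precisely what Topic #2 calls for.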
2.1.2. Aim
The aim is to free OGC standards and technologies from Terrestrial constraints; to allow geospatial analytic tools and techniques to be used on other astronomical bodies as well as in deep space; and to fully integrate the terrestrial and extraterrestrial analytic toolsets and processes.
2.1.3. Previous Work
Definitions for non-Earth Planetary Coordinate Reference Systems and Coordinate Transformations
In the 3D+ Standards Framework Engineering Report (22-036), Testbed-18 analyzed current standards from ISO, such as ISO 19111: Geographic information – Spatial referencing by coordinates, and the OGC GeoPose Standard. Neither is adequate for dealing with non-Earth geospatial data. Alternative approaches by geodetic and astronautic organizations, such as the Consultative Committee for Space Data Systems (CCSDS) Navigation Data — Definitions and Conventions, the International Earth Rotation and Reference Systems Service (IERS) Conventions, and the NASA NAIF SPICE Toolkit, are also discussed in the Testbed-18 report. The resulting practices are often similar to each other but cannot be understood as defining a standard framework or approach.
The OGC Testbed-18 3D+ Data Space Object Engineering Report (23-011) begins with the application of ISO 19111: Geographic information – Spatial referencing by coordinates to the reference frames of objects in space, such as celestial bodies or spacecraft in orbit. That Engineering Report sits between the 3D+ Standards Framework Engineering Report (OGC 22-036), which presents the theoretical foundations, and the OGC Testbed-18 Reference Frame Transformation Engineering Report (22-038), which applies ISO 19111 to coordinate operations between the above frameworks for objects in orbit around any celestial body or in free flight in our solar system. Leaving dynamic reference frames aside, ISO 19111:2019 distinguishes two types of coordinate operations: conversions and transformations. A conversion can include translation, rotation, change of units, etc., but yields a result that is still associated with the same reference frame, for example the same spacecraft. By contrast, a transformation involves a change of reference frame, for example from one spacecraft to another.
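The distinction can be illustrated with familiar terrestrial examples in pyproj; the same pattern would apply to spacecraft reference frames once non-Earth CRS definitions exist. A minimal sketch using standard EPSG codes purely for illustration:

```python
from pyproj import Transformer

# Conversion: WGS 84 geographic -> WGS 84 / UTM zone 33N.
# The reference frame stays the same; only the coordinate system changes.
conversion = Transformer.from_crs("EPSG:4326", "EPSG:32633", always_xy=True)
print(conversion.transform(15.0, 52.0))

# Transformation: ED50 -> WGS 84. The reference frame changes, so a
# datum shift (with its own accuracy) is involved.
transformation = Transformer.from_crs("EPSG:4230", "EPSG:4326", always_xy=True)
print(transformation.transform(15.0, 52.0))
```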
OGC 22-038 discusses OGC GeoPose in addition to ISO 19111. GeoPose can be used for describing a relationship between, for example, a spacecraft and a ground station. Most concepts defined in GeoPose can also be expressed using existing ISO 19111 constructs and shared as an OGC Geography Markup Language (GML) encoding.
Additional information is available in the Compatibility Study between ISO 18026 CD, Spatial Reference Model (SRM) and ISO 19111, Geographic information – Spatial referencing by coordinates. The study provides an assessment of the compatibility of the concepts and data elements described in ISO 18026 and 19111.
The International Federation of Surveyors, as part of its Volunteer Surveyor program, hosts a two-week hackathon-style event in which individuals work on the development of a (hypothetical) Land Tenure Reform System for Mars. Though planned as a fun exercise, interesting CRS-related concepts may come out of this effort.
GeoTIFF
Testbed-17 researched Cloud Optimized GeoTIFF (COG), aiming to develop a specification that could be directly considered by the GeoTIFF SWG to be put forward as an OGC Standard. It also compared COG with other solutions for multi-dimensional data in the cloud context, with a focus on Zarr. This Testbed-17 task produced the OGC Testbed-17: Cloud Optimized GeoTIFF Specification Engineering Report (21-025). COG enables efficient access to GeoTIFF data on the cloud.
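COG achieves this efficiency through internal tiling and overviews, so a client can fetch only the byte ranges it needs via HTTP range requests. A minimal sketch with rasterio, using a placeholder URL:

```python
import rasterio
from rasterio.windows import Window

# Placeholder URL; any HTTP-accessible COG behaves the same way.
URL = "https://example.org/data/scene.tif"

with rasterio.open(URL) as src:
    # Only the tiles covering this 512x512 window (plus the file
    # header) are fetched, not the whole file.
    window = Window(col_off=0, row_off=0, width=512, height=512)
    block = src.read(1, window=window)
    print(block.shape, src.crs)
```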
2.1.4. Work Items & Deliverables
The following diagram outlines all activities, work items, and deliverables in this task.
The following list identifies all deliverables that are part of this task. Detailed requirements are stated above. All participants are required to participate in all technical discussions and support the development of the Engineering Report(s) with their contributions.
- D100-101 Moving Features Components – Components that implement Topics #1, #4, and #5 as defined in the Problem Statements and Research Questions section above.
- D102-103 CRS & Transformation Components – Components that implement Topics #2, #3, #6, and #7 as defined in the Problem Statements and Research Questions section above.
- D104 GeoTIFF Component – Component that implements Topics #8 and #9 as defined in the Problem Statements and Research Questions section above.
- D001 Non-Terrestrial Geospatial Engineering Report – An Engineering Report which documents the approach, methodology, and conclusions for Topics #1 through #7 above. The editor shall submit Change Requests to existing standards as needed.
- D002 Extraterrestrial GeoTIFF – An Engineering Report which documents the approach, methodology, and conclusions for Topics #8 and #9 above. The editor shall submit Change Requests to existing standards as needed.
2.2. Machine Learning: Transfer Learning For Geospatial Applications
New and revolutionary Artificial Intelligence (AI) and Machine Learning (ML) algorithms developed over the past 10 years have great potential to advance the processing and analysis of Earth Observation (EO) data. While comprehensive standards for this technology have yet to emerge, OGC has investigated opportunities in ML standards for EO in its COSI Program (the Machine Learning threads in Testbeds 14, 15, and 16; see Engineering Reports available here), has developed a proposal for an ML Training Data standard through its TrainingDML-AI Standards Working Group, and has provided analyses and recommendations on the proposed standard and its next steps in the Machine Learning thread of Testbed-18 (see Testbed-18 Engineering Report). This work continues these beginnings.
Among the most productive methods in the application of ML to new domains has been the re-use of existing ML solutions for new problems. This is achieved through Transfer Learning, where a subset of the Domain Model produced by ML in a related domain is taken as the starting point for the new problem. With Transfer Learning, the investment in previous ML tasks, which can be enormous both in terms of the Training Data required to support a good model and in terms of the computing power required to refine the model to achieve good performance in the learning process, can be made to pay off repeatedly. A major goal of this effort is to ascertain the degree to which Transfer Learning may be brought into an OGC standards regime.
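A minimal sketch of this re-use pattern in PyTorch follows. The pretrained weights (ImageNet here) and the 10-class land-cover head are illustrative assumptions; in an EO setting the backbone would ideally come from a model pretrained on satellite imagery.

```python
import torch
import torchvision.models as models

# Start from a backbone pretrained on a large generic corpus (ImageNet
# stands in here for an EO-specific pretrained model).
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze the transferred layers: only the new head will be trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for a hypothetical 10-class
# land-cover problem.
model.fc = torch.nn.Linear(model.fc.in_features, 10)

# Only the new head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```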
Re-use depends on the new ML application being able to incorporate the results of previous ML applications. This means that the computational model, i.e., the ML architecture, of the earlier model has somehow to be aligned with that of the later ML application. Part of the work in this Testbed thread is to determine the data and information elements needed to succeed with Transfer Learning in the EO domain, e.g., how much information about the provenance of the ML model’s training data needs to be available? Is it important to have a representation of what is in-distribution vs what is out-of-distribution for the ML model? Do quality measures need to be conveyed for Transfer Learning to be effectively encouraged in the community? Are other elements required to support a standard regimen for building out and entering new Transfer Learning-based capabilities into the marketplace?
The strong need for alignment has also meant that, in practice, Transfer Learning has almost always been applied only within a single ML architecture, such as between earlier and later instances of TensorFlow. It would be of great benefit for cross-architecture Transfer Learning to be available such as, for instance, between instances of TensorFlow and PyTorch. This topic explores that possibility for the special case of geospatial applications.
The goal of Testbed-18 was to develop the foundation for future standardization of training data sets (TDS) for Earth Observation applications. The goal of this Testbed-19 task is to develop the foundation for future standardization of ML Models for Transfer Learning within geospatial, and especially Earth Observation, applications. The task shall evaluate the status quo of Transfer Learning, metadata implications for geo-ML applications of Transfer Learning, and general questions of sharing and re-use. Several initiatives, such as Microsoft's ONNX effort, have developed implementations that could be used for future standardization.
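As a sketch of how ONNX supports cross-architecture exchange, the following exports a PyTorch model (the hypothetical land-cover classifier from the earlier sketch) and re-loads it in the framework-neutral ONNX Runtime:

```python
import torch
import torchvision.models as models
import onnxruntime as ort

# Rebuild the hypothetical fine-tuned model from the earlier sketch.
model = models.resnet50(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 10)
model.eval()

# Export to ONNX so a different framework or runtime can consume it.
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "landcover_resnet50.onnx",
                  input_names=["image"], output_names=["logits"])

# Re-load the exported model in a framework-neutral runtime.
session = ort.InferenceSession("landcover_resnet50.onnx")
logits = session.run(["logits"], {"image": dummy.numpy()})[0]
print(logits.shape)  # (1, 10) for the hypothetical 10-class head
```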
2.2.1. Problem Statement and Research Questions
Reusable ML models are crucial for ML and AI applications, and re-use through Transfer Learning has been central to the rapid deployment of AI over the past decade. However, the Transfer Learning process remains somewhat informal, hindering clear understanding of system capabilities and opportunities for synergy. Moreover, Transfer Learning is in general bound to individually developed architectures and is therefore becoming a significant bottleneck in the more widespread and systematic application of AI/ML. The problems to address include:
- General lack and inaccessibility of appropriate high-quality ML Models;
- Absence of standards, resulting in inconsistent and heterogeneous ML Models (source formats, architecture layout, quality control, metadata, repositories, licenses, etc.);
- Limited discoverability and interoperability of ML Models; and
- Lack of best practices and guidelines for generating, structuring, describing, and curating ML Models.
This task shall tackle the following research questions:
- How is an ML Model to be described to enable Findability of the model for Transfer Learning applications?
- How is an ML Model to be managed to enable Access to the model for Transfer Learning applications?
- Are there significant opportunities for interoperability among ML Models even if they derive from different architectures, e.g., by mapping to a canonical representation such as the ONNX standard?
- How is an ML Model to be described to enable efficient re-use through Transfer Learning applications?
- What are the main characteristics of the ML Model itself that make it suitable for Transfer Learning, and what additional information needs to be provided to sufficiently understand the nature and usability of the model? E.g., is provenance of the model's domain required? Are quality measures required (and could these be automatically generated)?
- Does the relationship between an ML Model and the training data set from which it was ultimately derived need to be represented (and if so, could that relationship be automatically generated)?
- Could/should an ML Model's metadata describe a "performance envelope" based on the distribution of characteristics in the training data from which it was ultimately derived (and could such information be automatically derived)?
2.2.2. Aim
The objective of this task is to document current approaches and possible alternatives in order to lay out a path for future standardization of interoperable/transferable Machine Learning Models for Earth Observation applications.
2.2.3. Previous Work
Machine Learning has been the subject of the last four OGC Testbed activities. The Testbed-15 to Testbed-17 reports are available here; the Testbed-18 report is currently in its final revision and directly available only to OGC members, but it can be made available to interested parties on request.
2.2.4. Work Items & Deliverables
The following diagram outlines all activities, work items, and deliverables in this task.
The following list identifies all deliverables that are part of this task. Detailed requirements are stated above. All participants are required to participate in all technical discussions and support the development of the Engineering Report(s) with their contributions.
- D003 Machine Learning Models Engineering Report – This Engineering Report captures all results of this task and can serve as a baseline for future standardization.
- D106-107 Machine Learning Components – Demonstrations of Transfer Learning. Ideally, the transfer happens across software environments, e.g., from PyTorch to TensorFlow or similar. Earth Observation use cases, such as satellite image classification or object detection tasks, are preferred.
2.3. Geodatacubes
Over the past decade, Geospatial Data Cubes (GDCs) have been developed independently, resulting in a lack of interoperability between them. By improving interoperability, the vendor community will be able to proceed with specific GDC variants, while at the same time the consumer community will be able to interact much more effectively with different instances. With the increase in available data products served as GDCs, it is becoming increasingly important to understand exactly what a GDC entails and how it was created.
In recent years, the international community has invested significant resources in GDCs: infrastructure that supports the storage and use of multidimensional geospatial data in a structured way. While advances have been made to develop GDCs that support specific needs, a clear understanding is needed of whether existing GDCs are interoperable enough to allow organizations to address their specific needs. There is also a need for reference implementations that make GDCs interoperable and exploitable within an Earth Observation (EO) exploitation environment. This project aims to develop solution prototypes based on existing EO GDCs. Project findings will contribute to the further development of GDCs if necessary.
The OGC Geodatacube Standards Working Group (GDC SWG) is currently forming and will become active within the next few weeks. The charter (available online here) identifies the following work items as priorities:
- Define an Application Programming Interface (API) that serves the core functionalities of GDCs. GDC users will be able to handle different GDCs according to the same principles, as interoperability between GDCs will be achieved.
- Define a metadata model, including provenance and data lineage information, to describe all details about a GDC.
- Identify formats to be used for data exchange. If existing formats do not meet the requirements, the SWG will extend its work to the development of a GDC exchange format.
This Testbed-19 task supports the work of the Standards Working Group, with which there will be close collaboration throughout the Testbed term. Testbed Participants will help define the GDC API and develop prototypes in an agile way to experiment with the API. The SWG charter has defined the following steps as in scope:
- Identification of real-life use cases with industrial relevance
- Definition of the GDC API (which may be a profile of (an) existing OGC API(s) or a new development)
- Definition of exchange format recommendations, profiles, or new developments
- Definition of the GDC metadata model (in particular, information about how the GDC was built, similar to ARD concepts and vision around data provenance and lineage)
- The GDC API shall support accessing and processing at minimum
- Analysis of the usability of existing standards
2.3.1. Problem Statement & Research Questions
This task shall define the GDC API and metadata model; test the new API against a set of use cases, partly defined in this document and partly to be defined together with the sponsors during Testbed execution; develop implementations that allow further experimentation (at least two implementations shall be available as open source software under the Apache 2.0 License, with the OGC name used in lieu of the Apache Software Foundation name, or a compatible permissive open source license approved by the Open Source Initiative); and develop a number of client applications for both data access and advanced visualization. The following three use case scenarios have already been defined. It is possible that, as a result of the first ad-hoc meeting of the Geodatacubes SWG at the 125th OGC Member Meeting (20-24 February 2023), additional use cases will be added or existing use cases revised. Please check the Corrigenda & Clarifications for any modifications before submitting your proposal.
Use Case A
Develop a solution prototype to enable access to and exploitation of data and processors within distributed GDCs of different types.
- Investigate and choose an existing open-source solution for the implementation of the prototype
- Integrate at least one EO dataset and one other form of geospatial data (e.g., vector data, Earth system prediction data)
- Deploy and demonstrate the prototype in a cloud environment
- Advances to open-source approaches developed through the project shall be incorporated into the appropriate open-source initiatives by the participant
- Document the additional development effort required for the prototype to be implemented in an operational setting
Use Case B
Use Case C
2.3.2. Aim
This task aims to:
- Understand the usability of existing solutions to serve and process GDC data. The task shall provide a comparative analysis of similar standards to help identify gaps and guidance on which standard(s) to use for given GDC use cases.
- Test specifications of the draft standard against operational requirements to determine the degree to which they meet real-world needs.
- Develop the GDC API draft standard.
- Develop the GDC metadata model.
- Test both the API and the metadata model in real-world scenarios, ideally by GDC operators/users.
- Develop open-source libraries that facilitate experimentation with GDCs.
- Demonstrate advanced visualization concepts for GDC data.
- Demonstrate how GDC and Analysis Ready Data can align.
- Describe what needs to be done next.
2.3.3. Previous Work
The OGC Testbed-17: Geo Data Cube API Engineering Report (21-027) is, like all COSI Program Engineering Reports, available from the OGC public engineering reports webpage. The OGC Testbed-17 Engineering Report (ER) documents the results and recommendations of the Geo Data Cube API task in 2021. The Testbed-17 Call for Participation provides additional insights into requirements and use cases. The ER defines a draft specification for an interoperable GDC API leveraging OGC API building blocks, details implementation of the draft API, and explores various aspects including data retrieval and discovery, cloud computing and Machine Learning. Implementations of the draft GDC API have been demonstrated with use cases including the integration of terrestrial and marine elevation data and forestry information for Canadian wetlands.
OGC Testbed-16: Data Access and Processing Engineering Report (20-016) summarizes the results of the 2020 Testbed-16 Data Access and Processing task. The task had the primary goal to develop methods and apparatus to simplify access to, processing of, and exchange of environmental and Earth Observation (EO) data from an end-user perspective.
The openEO initiative develops an open API to connect R, Python, JavaScript, and other clients to big Earth Observation cloud back-ends in a simple and unified way.
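As an illustration of the kind of unified GDC access the openEO API provides, the following sketch uses the openEO Python client against a hypothetical back-end; the URL and collection identifier are placeholders, and real deployments would also require authentication:

```python
import openeo

# Placeholder back-end URL; several public openEO instances exist.
connection = openeo.connect("https://openeo.example.org")

cube = connection.load_collection(
    "SENTINEL2_L2A",  # placeholder collection identifier
    spatial_extent={"west": 11.0, "south": 46.0, "east": 11.2, "north": 46.2},
    temporal_extent=["2023-05-01", "2023-05-31"],
    bands=["B04", "B08"],
)

# Server-side NDVI computation followed by a temporal mean; only the
# small result is downloaded, not the source imagery.
ndvi = cube.ndvi().reduce_dimension(dimension="t", reducer="mean")
ndvi.download("ndvi_mean.tiff")
```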
2.3.4. Work Items & Deliverables
The following diagram outlines all activities, work items, and deliverables in this task.
The following list identifies all deliverables that are part of this task. Detailed requirements are stated above. All participants are required to participate in all technical discussions and support the development of the Engineering Report(s) with their contributions.
- D111/D112 OGC API-GDC instance: Instances of the future GDC API, open-source implementations.
- D171/D172 OGC API-GDC instance: Instances of the future GDC API, ECMWF use cases.
- D173/D174 Viz Client: Client instances with support for advanced visualization of GDC data, at least supporting the ECMWF use cases.
- D113/D114 Data Client: Client instances optimized for GDC interaction, with support for the GDC metadata model and all use cases.
- D175/D176 Usability Tests: All prototypes and specifications shall be tested for ease of use and technical usability. These tests are ideally executed by Geodatacube providers or users. In any case, Participants for these work items need to be different from the other Participants in this task.
- D11 GDC Task Engineering Report: OGC Testbed-19 Geodatacubes Task Engineering Report, capturing all use cases, existing technology assessments, implementation descriptions, and recommendations for future activities, together with all experiences and lessons learned during the execution of the task.
- D71 OGC API-GDC draft standard: Draft standard, jointly developed with and under the aegis of the OGC GDC-SWG.
2.4. Analysis Ready Data
As we work towards the FAIR principles of finding, accessing, interoperating, and reusing physical, social, and applied science data easily, Analysis Ready Data is a key component of this capability. It has been estimated that data analysts spend up to 80% of their time identifying, selecting, and preparing datasets in order to analyze and integrate them. Analysis readiness aims to reverse this proportion by preparing data in advance for reusability across a range of analytical tasks. We want to select data with only the necessary characteristics for our needs, select specific areas and columns of interest, select applicable layers, and rely on defined, documented quality and provenance.
Most of all, we want to be able to combine, concatenate, and intersect multiple datasets based on their compatible states of spatiotemporal referencing and phenomenon calibration. Additionally, we need visualization capabilities, both as map layers and immersive, especially for improved climate and disaster understanding, mitigation, and response. Many innovative tools and techniques are being developed to address these needs. Analysis readiness is a way of normalizing and standardizing the use of these innovations.
The Analysis Ready Data (ARD) SWG is currently being chartered to develop both a core ARD framework standard and multi-part standards for analysis readiness of specific geospatial data product families and products. Born from work undertaken and challenges identified in the OGC Disaster Pilot 2021 and OGC Testbed-16, the ARD SWG will develop a multi-part Standard for geospatial Analysis Ready Data in partnership with ISO/TC 211. The concept of ARD was initially developed by the Committee on Earth Observation Satellites (CEOS). CEOS defines ARD as "satellite data that have been processed to a minimum set of requirements and organized into a form that allows immediate analysis with a minimum of additional user effort and interoperability both through time and with other datasets." CEOS has developed several ARD Product Family Specifications for optical and radar imagery. Through consultations with the community, CEOS has concluded that formal standardization of the CEOS-ARD concepts and specifications is needed to achieve broad uptake, particularly by the commercial sector. International ARD standardization will also help promote the broader concept and help avoid diverging interpretations of the ARD concept. The OGC ARD SWG will build on the CEOS work to develop standards that both formalize that work and extend it to all geospatial data.
The goal of ARD is for data to be processed once and then easily used, and especially reused, for a variety of analyses. This task will support the work of the newly formed ARD SWG in pursuing the goal of reuse by creating and exploring a number of scenarios for generating, integrating, and applying ARD. Scenarios should examine the reuse of satellite-derived ARD and also consider the nature of analysis readiness for other geospatial datasets. Because climate change and disaster resilience are major challenges of our time, and given the ongoing digital transformation, scenarios in these domains are preferred. The scenarios should demonstrate how one can discover, access, and use reliable ARD, as well as how ARD systems can best incorporate trustworthy and high-quality workflows, interoperability and scalability, and mapping/visualization of analytical outcomes.
There is a strong link between the standardization of dimensional axes in ARD and the resource organization of Geodatacubes. Proposals that include both the ARD topic and the Geodatacube topic are encouraged.
2.4.1. Problem Statement and Research Questions
OGC is working to define foundational elements which allow for the mixing and matching of different standards and to advance its mission of implementing Findable, Accessible, Interoperable, and Reusable (FAIR) principles for scalable and repeatable use of data. The concept and implementation of analysis readiness can significantly address both climate and disaster resilience needs for information agility by improving access to interdisciplinary sciences, such as natural, social, and applied sciences, as well as engineering (civil, mechanical, etc.), public health, public administration, and other domains of analysis and application.
In this task, we are looking for participants to provide scenarios that take advantage of current CEOS-ARD specifications and products while advancing ARD capabilities towards ARD workflows that can be utilized by end users to significantly reduce processing time, repetitive user tasks, and use of computational resources, enabling faster, easier, and more imaginative analysis of data. OGC invites participants to propose scenarios that target as many of the following outcomes as possible:
- Improve characterization of data: metadata and catalogs
- Improve ease of integration for satellite and non-satellite data and services, especially through the medium of datacubes
- Improve ARD interoperability of data, services, and tools through open standards usage/development
- Reduce uncertainty and risk levels of data usage
- Improve data provenance, and therefore trust, by including lineage, source data, processes, validity, and data quality
- Improve autonomous workflows and use of datacubes as workflow resources
- Improve geolocation accuracy
- Improve reproducible workflow characterizations
- Refine requirements for ARD file formats, tiling schemes, pixel alignment, or anything related to improvement of repeatable ARD distribution, access, and integration
OGC prefers that user scenarios be developed with a climate and/or disaster focus; however, any relevant scenario is acceptable.
2.4.2. ARD Exemplar Use Cases
Oona develops an analytical workflow that processes different optical imagery bands to categorize land cover. She wishes to target input datasets conforming to an ARD specification so as to reduce the processing the workflow must perform and increase its reusability across different satellite data products.
Ogden is investigating trends in surface temperature observations and uses datasets conforming to an ARD specification so as to maximize comparability of temperature estimates for a given location across multiple collections.
Olson develops a workflow that integrates satellite collections in 3-day intervals to create cloudcover-minimized images. Use of collections conforming to an appropriate ARD specification will maximize the availability of cloud-free pixels for integration and maximize the positional accuracy of pixels in the resulting composite images.
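As a sketch of how a user like Oona might discover candidate ARD inputs, the following queries a STAC catalog with pystac-client. The endpoint, collection identifier, and the premise that the collection advertises ARD conformance are all illustrative assumptions:

```python
from pystac_client import Client

# Placeholder STAC API endpoint advertising ARD collections.
catalog = Client.open("https://stac.example.org")

search = catalog.search(
    collections=["surface-reflectance-ard"],  # hypothetical ARD collection
    bbox=[11.0, 46.0, 11.2, 46.2],
    datetime="2023-06-01/2023-06-30",
    query={"eo:cloud_cover": {"lt": 20}},  # needs the STAC query extension
)

for item in search.items():
    print(item.id, item.properties.get("eo:cloud_cover"))
```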
2.4.3. Aim
Create, develop, identify, and implement where possible Analysis Ready Data (ARD) definitions and capabilities to advance provision of information to the right person in the right place at the right time. Additionally, increase ease of use of ARD through improved backend standardization and implementation of varied ARD application scenarios. This work should inform ARD implementers and users on standards and workflows to maximize ARD capabilities and operations. It should further support the ARD SWG in their standardization work. A close cooperation with the ARD SWG and associated activities is envisioned for this task.
2.4.4. Previous Work
This task is based on:
- CEOS Analysis Ready Data Governance Framework. https://ceos.org/ard/files/CEOS_ARD_Governance_Framework_18-October-2021.pdf
- CEOS Analysis Ready Data for Land, Product Family Specification, Surface Reflectance. https://ceos.org/ard/files/PFS/SR/v5.0/CARD4L_Product_Family_Specification_Surface_Reflectance-v5.0.pdf
- CEOS Analysis Ready Data for Land, Product Family Specification, Surface Temperature. https://ceos.org/ard/files/PFS/ST/v5.0/CARD4L_Product_Family_Specification_Surface_Temperature-v5.0.pdf
- CEOS Analysis Ready Data for Land, Product Family Specification, Normalised Radar Backscatter. https://ceos.org/ard/files/PFS/NRB/v5.5/CARD4L-PFS_NRB_v5.5.pdf And accompanying metadata specification: https://ceos.org/ard/files/PFS/NRB/v5.5/CARD4L_METADATA-spec_NRB-v5.5.xlsx
- CEOS Analysis Ready Data for Land, Product Family Specification, Polarimetric Radar. https://ceos.org/ard/files/PFS/POL/v3.5/CARD4L-PFS_Polarimetric_Radar-v3.5.pdf And accompanying metadata specification: https://ceos.org/ard/files/PFS/POL/v3.5/CARD4L_METADATA-spec_POL-v3.5.xlsx
- CEOS Analysis Ready Data for Land, Product Family Specification, Aquatic Reflectance. https://ceos.org/ard/files/PFS/AR/v1.0/CARD4L_Product_Family_Specification_Aquatic_Reflectance-v1.0.pdf
- CEOS Analysis Ready Data for Land, Product Family Specification, Ocean Radar Backscatter. https://ceos.org/ard/files/PFS/ORB/v1.0/CARD4L_Product_Family_Specification_Ocean_Radar_Backscatter-v1.0.pdf And accompanying metadata specification: https://ceos.org/ard/files/PFS/ORB/v1.0/CARD_METADATA-spec_ORB-v1.0.xlsx
- CEOS Analysis Ready Data for Land, Product Family Specification, Nighttime Lights Surface Radiance. https://ceos.org/ard/files/PFS/NLSR/v1.0/CARD4L_Product_Family_Specification_Nighttime_Light_Radiance-v1.0.pdf
- OGC Testbed-16: Analysis Ready Data Engineering Report (OGC 20-041). http://docs.opengeospatial.org/per/20-041.html
- OGC Testbed-16: Data Access and Processing API Engineering Report (OGC 20-025). http://docs.opengeospatial.org/per/20-025r1.html
- OGC Testbed-16: Data Access and Processing Engineering Report (OGC 20-016). http://docs.opengeospatial.org/per/20-016.html
2.4.5. Work Items & Deliverables
The following diagram outlines all activities, work items, and deliverables in this task.
The following list identifies all deliverables that are part of this task. Detailed requirements are stated above. All participants are required to participate in all technical discussions and support the development of the Engineering Report(s) with their contributions.
D051: ARD Task Engineering Report: An engineering report describing all the elements for an Analysis Ready Data system and its building block components. It will describe ARD requirements; identify the initial use case objective(s), the components and elements needed to reach those objectives, and how the Testbed reached its objectives; and identify technology gaps or elements of future work, if applicable. This ER should also describe how this work could be scaled to other domains or applied more broadly within the same domain. The report will be jointly developed by all task participants.
D151-153 ARD Demonstration Scenarios - Software implementations and demonstrations showing ARD in action. The demos shall be delivered as screen recordings and clearly describe the ARD scenario in detail, as well as how and to what extent the scenario implementation meets target outcomes of the task. Ideally, each scenario implementation is delivered as a repository and Docker container to allow future experimentation.
2.5. Agile Reference Architecture
The world has changed significantly since the Web 2.0 era began around 2010: the importance of location-referenced information now lies increasingly in the utilization of data (and hence spatial-temporal streaming and query) rather than in map presentation. The emerging technological landscape is dominated by scalable cloud computing infrastructures, Internet of Things, and Edge computing technologies. Distributed physical system deployments enable AR/VR and Digital Twins to bridge physical and virtual environments via data integration and analysis capabilities drawing from multiple sources. The generation after this one is likely to be dominated by semantic interoperability challenges as we seek to unleash the opportunities of increasing computational power and data volumes.
There is an accelerating rate of change in a technological landscape that is increasingly reliant on APIs that communicate through JSON-encoded messages rather than Web Services that communicate through XML-encoded messages. It is likely that YAML, which simplifies but also extends JSON syntax, will become more prevalent. In general, flexibility of encoding and data views will need to be supported through better documentation of underlying semantics. This trend will serve to realize some of the core FAIR principles not well supported by current data exchange capabilities.
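To illustrate the JSON/YAML relationship, the following sketch round-trips a hypothetical OGC API collection description from YAML to JSON; the structure, not the encoding, carries the semantics:

```python
import json
import yaml  # PyYAML

# Hypothetical OGC API collection description expressed in YAML.
doc = """
id: buildings
title: Buildings
links:
  - href: https://example.org/collections/buildings/items
    rel: items
    type: application/geo+json
"""

obj = yaml.safe_load(doc)
print(json.dumps(obj, indent=2))  # same structure, JSON encoding
```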
Increasingly, information and data services will need to be created, developed, and built so that they support and enable secure, resilient data services that provide the flow of end-to-end (E2E) systems and/or networks. Interoperability of data over time spans greater than the configurations of particular technical components will support both large-scale cross-domain use of data and more dynamic environments where network availability and trust are short-term issues.
2.5.1. Problem Statement and Research Questions
We are seeking to understand how, over the next two to three years, as we gain experience with new API technologies, information and data can operate across communications and networks in a robust, FAIR manner. How can the flow of information and data be achieved so that client/server and/or federated/mesh networks are able to implement resilient data services for E2E systems? How can increasingly heterogeneous sources of data about the same real-world phenomena be integrated as sensors and AI provide greater potential insights? How can we scale up techniques for models of the world to interact, to support or validate our understanding, as both modeling and data sources proliferate?
In the future, information and data services are likely to be secure and digital by design, having adopted a Zero Trust Architecture approach. This will be supported by data being secure, standardized, machine-readable, and exploitable. It will be necessary to develop and understand how generation-after-next resilient data services will operate. It is important to understand the necessary architecture and building block components that need to be implemented as foundations now.
Generation-after-next refers to an approach that does not yet exist and/or a contributing technology that is not fully understood. Concepts will be 'leap ahead' ideas that challenge the boundaries of current and emerging understanding. [TRL 1-3]
Resilient data services are important because they incorporate data-centric decision making: relevant data, assured and delivered in the right way, to enable the right decisions to be made at the right place.
This work should consider the following aspects:
- Addressing the challenges of resilience, integration, and interoperability
- Universal access, for discovery and assurance
- Transformation of heterogeneous data sources into locally useful data forms, including reformatting and dimensional up- and down-scaling
- Continuous integration and testing (CI/CT)
- Network characteristics, incorporating intelligent monitoring to ensure Quality of Service
2.5.2. Aim
This task seeks to create, develop, and identify the architecture elements for agile reference architectures and to understand how these can be used for defining different use cases which allow different implementations of API Building Blocks. This work seeks to inform how generation-after-next Resilient Data Services will operate.
2.5.3. Previous Work
- Secure Registry and Metadata
- Data Centric Security: authenticity, integrity, and security of the data
- Information architectures
- Features and Geometries JSON (JSON-FG), an extension of GeoJSON
- Analysis Ready Datasets
- Machine Readable
- RDF knowledge graphs
- Flexible SPARQL query
- HTML and machine-readable formats, with links to implementation resources to support discovery of options
2.5.4. Work Items & Deliverables
OGC is defining the foundational elements which will allow for the mixing and matching of different standards and is embracing the Findable, Accessible, Interoperable, and Reusable (FAIR) concept for scalable and repeatable use of data. Agile reference architectures can be used for defining different use cases which allow different implementations of Building Blocks. This work seeks to inform the foundations for this generation-after-next (GEN) approach.
Figure 01 – This illustrates processes (Collect, Process, Manage and Store, Disseminate, Query and Consume) which are important functions of a Reference Architecture.
The proposed Testbed task will focus on the design of an Agile Reference Architecture that allows and informs the generation-after-next implementation of Resilient Data Services over secure and degraded networks (Denied, Degraded, Intermittent, and Limited (DDIL)). For instance, how might ML be used to ensure and maintain quality of service if the flow of information and data services for critical E2E systems is degraded? How can autonomous systems enable the flow of information and data, enabling and ensuring data-centric decision making? It is possible that some elements have not yet been conceived or designed, and this work is the first step to enable this. Deliverables from the proposed task are listed below:
D021 ARA Task Engineering Report: An engineering report describing all the architecture elements for Agile Reference Architectures and the role of API Building Block components. It will identify the components and elements needed for the target Use Case and the responsibilities of each. Options for realizing these responsibilities with existing specifications will be considered, along with elements of future work where necessary.
D022 ARA Task Engineering Report DDIL Use Case: An engineering report describing the enterprise, engineering, information, technological, and computational viewpoints for an implementation of Resilient Data Services based on a DDIL use case.
D121: An integrated knowledge base linking machine-readable specifications required to implement the target DDIL Use Case.
D122: The Agile Reference Architecture represented in RDF/Turtle format (i.e., a description of how the components and specifications are related, in the form of a reusable pattern that can be adapted to new circumstances).
D123: An instance of OGC API Processes that:
- Implements the OGC API Processes Part 3: Workflows and Chaining candidate Standard to take a specification of a reference architecture as input and create an application package for deployment in a containerization or cloud computing environment.
- Implements OGC API Processes Part 1: Core to conduct decision support operations that are supported by Machine Learning.
- Supports sufficient semantic annotation to identify necessary contextual information to support reuse.
D124: An instance of OGC API Features serving OpenStreetMap data represented according to the NSG Topographic Data Store schema.
D125: An instance of OGC API Features (and/or STA and/or EDR) supporting real-time observations of phenomena relevant to the Use Case, which:
- Supports sufficient semantic annotation to identify necessary contextual information to support reuse.
D126: An instance of OGC API Processes supporting transformation of observational data into the reusable form required by the main analytical workflow, which:
- Provides CI/CT support for specification of transformations based on the machine-readable knowledge graph (deliverable D121).
D127: An instance of OGC API Tiles serving OS Open Zoomstack data.
D128: An instance of OGC API Records that provides metadata about all of the software components available in the system.
2.6. High Performance Computing
Notice: This topic has not yet been funded; all work proposed to the HPC topic should therefore be scoped as in-kind contributions. Some resource support will be available through OGC's participation in the NSF-funded I-GUIDE project.
2.6.1. Problem Statement and Research Questions
Large-scale geospatial analytical computation is becoming a critical need for tackling a wide range of sustainability problems such as climate change, disaster management, and food and water security. Geospatial researchers and practitioners interested in and utilizing geospatial analytics come from a wide range of disciplines, including geography, hydrology, public health, and the social sciences. While advanced cyberinfrastructure and computer science expertise is what enables large-scale computational problem solving, experts in the various geospatial-related domains cannot all be expected to have the in-depth technical expertise necessary to interact directly with high-performance computing (HPC) resources on advanced cyberinfrastructure, or to optimize the use of those resources for their particular computational challenges.
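One way to bridge this gap, sketched below under illustrative assumptions, is to hide batch-scheduler details behind a higher-level library such as Dask's dask-jobqueue, so a domain expert writes ordinary array code while the library negotiates with the SLURM scheduler:

```python
import dask.array as da
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# Illustrative SLURM resource requests; an HPC operator would tune
# these per site and per workload.
cluster = SLURMCluster(cores=16, memory="64GB", walltime="01:00:00")
cluster.scale(jobs=4)  # request four batch jobs as Dask workers
client = Client(cluster)

# The geospatial analyst now works with ordinary Dask collections; the
# scheduler and job-script details stay hidden.
x = da.random.random((20000, 20000), chunks=(2000, 2000))
print(x.mean().compute())
```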