Technologies for the Management of LOng EO Data Time Series (LOOSE)
The continuously growing volume of long-term and historic data held in EO facilities, as online datasets and archives, makes it necessary to address technologies for the long-term management of these data sets, including their consolidation, preservation, and continuation across multiple missions. Managing long EO data time series from continuing or historic missions, with more than 20 years of data already available today, requires technical solutions and technologies that differ considerably from those used by existing systems. Novel technical and organizational solutions are needed for the durable and efficient handling, storage, access and processing of long EO data time series.
The objective of this activity is to investigate and develop solutions for access- and exploitation-optimised organisation of data repositories, in order to overcome the limitations of today's storage-optimised implementations. To satisfy these new data access and exploitation scenarios, the archiving structures and data models will have to facilitate user-friendly bulk data retrieval, data processing and visualization, as well as the display and download of data layers served by standardized spatial data services. New technologies and advanced approaches have to be studied and demonstrated before they are integrated into the existing ground segment infrastructure. At a more detailed level, the following technology areas will be analysed, prototyped with open source implementations, aligned with standardization and evaluated under real conditions in this activity:
- Bulk Data Placement: large amounts of data (e.g. upcoming and historic) need to be ingested from backend archives and registered in online data access services within a short time, in order to make them available for online exploitation;
- Data Discovery: consolidated EO data time series may be composed of original data from different missions and collections. They need some additional, standardised descriptive metadata to allow their interoperable discovery in EO data access portals;
- Integration of Access and Processing Services: both need to be located near the storage facilities to improve the performance when working on large EO time series; standardised procedures and interfaces are needed to reference EO products from discovery and access services as input to processing;
- Server-side Sub-setting, Re-projection and Formatting: it is essential to reduce the amount of input data as early as possible, i.e. before transferring the data to the processing cache, either within a platform or to the user;
- Data Formats for Raster Data Time Series: existing streaming, sequence and video data formats need to be analysed and their use for EO data time series should be evaluated and standardized, in order to enhance the time-based extraction of data subsets;
- Online Visualization: current online EO data visualization solutions do not adequately take the time dimension into account; exploring long EO data time series requires seamless switching between views of spatial and temporal extents, multi-dimensional filtering, and visualization of specific content of interest to the user (e.g. thresholds on measured values, applying masks or combining different value layers);
- Scalability and Security: special focus must be put on the scalability of online data storage structures and data access services, so that they can react to dynamic demand; data security must also be preserved when transferring very large amounts of data.
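
To illustrate the server-side sub-setting idea listed above, the following minimal Python sketch crops a toy raster time series in both time and space before anything would be transferred. It is an illustration only, not part of the activity: the `Scene` structure, the `subset_series` and `cell_count` helpers, and the toy data are all hypothetical, and a real service would work on georeferenced products rather than in-memory lists.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Scene:
    """One raster acquisition in a time series (hypothetical structure)."""
    acquired: date
    pixels: list  # list of rows, each row a list of pixel values


def subset_series(series, start, end, row_range, col_range):
    """Keep only scenes acquired within [start, end], cropped to a pixel window.

    row_range and col_range are half-open (first, last_exclusive) index pairs.
    """
    r0, r1 = row_range
    c0, c1 = col_range
    result = []
    for scene in series:
        if start <= scene.acquired <= end:
            window = [row[c0:c1] for row in scene.pixels[r0:r1]]
            result.append(Scene(scene.acquired, window))
    return result


def cell_count(series):
    """Count pixel values, as a rough proxy for the volume to transfer."""
    return sum(len(row) for scene in series for row in scene.pixels)


# Toy series (assumed data): one 4x4 scene per year, 1995 to 2014.
series = [
    Scene(date(year, 7, 1), [[float(year)] * 4 for _ in range(4)])
    for year in range(1995, 2015)
]

# Request five years of data over a 2x2 pixel window.
subset = subset_series(
    series, date(2000, 1, 1), date(2004, 12, 31), (1, 3), (0, 2)
)
print(cell_count(series), "->", cell_count(subset))  # prints "320 -> 20"
```

In this toy case the combined temporal and spatial subset reduces the number of values to move from 320 to 20, which is the effect the sub-setting technology area aims for at archive scale.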