What Is the DATAMITE Project About?
During the last decade, companies have been storing massive amounts of data, and more often than not, they are doing so without a clear strategy or well-defined policy. The results are dramatic. Around 70% of data is never used, mostly because potential data consumers within the organisations are unaware of what data they have due to the lack of proper data catalogues and data governance strategies. Even when data is used, in many cases consumers find that the data is low quality due to the lack of quality evaluation tools. This results in incomplete or inconsistent data, outliers, problems during data ingestion, and other issues.
Now that data sharing is becoming increasingly important, local deficiencies in data management prevent companies from sharing data because they lack the tools and maturity to do so properly. DATAMITE aims to put an end to this situation by helping companies with tools, training material and a community that can assist them in improving how they manage their data.
How Does DATAMITE Contribute to the Data Economy?
The data economy depends heavily on the quality of the data being shared, how enriched it is, and on aspects such as semantic interoperability. These properties help make data findable, accessible, interoperable and reusable, in line with the well-known FAIR principles. The DATAMITE framework assists companies from the inside, providing tools for data governance, quality, security, discovery and ingestion, so that they can improve their data management and their capacity to monetise data internally. Once companies have the tools to improve the usage and quality of data internally, they will be in a position to go outside, share their data and make a significant contribution to the data economy.
DATAMITE will also assist them there, providing tools to build their data products, define their own data policies and share their datasets in dataspaces (aligned with Gaia-X, IDSA or other reference architectures) or in other portals or marketplaces like EOSC or the AI On-Demand platform.
How Is DATAMITE Related to the Eclipse Dataspace Working Group (EDWG) and Eclipse Projects?
A number of DATAMITE partners such as IDSA, Tecnalia and Fraunhofer are also members of the Eclipse Dataspace Working Group and are deeply involved with the dataspace specifications and developments.
Furthermore, the DATAMITE project relies on the results of several Eclipse open source projects: Eclipse Dataspace Components (EDC), the Eclipse Dataspace Protocol specification project and the Eclipse Dataspace Protocol TCK project, all of which are under the umbrella of the Eclipse Dataspace Working Group.
On top of that, a number of DATAMITE partners involved in the Eclipse Dataspace Working Group are also preparing a new set of dataspace specification projects for Decentralised Claims and Conformity Assessment, which aim to provide standardised identity and trust features on top of the Dataspace Protocol.
DATAMITE is currently developing extensions for the Eclipse Dataspace Components (EDC) connector, using the Dataspace Protocol for interoperability. The possibility of contributing the DATAMITE developments back to these projects is also being considered.
How Is DATAMITE Related to the Eclipse Dataspace Protocol?
The Dataspace Protocol specification provides the technical framework for interoperable data exchange and it is on its way to becoming an international standard. Interoperability is an important aspect for DATAMITE, so the project is keen to use this specification to make sure its developments are future-proof.
The Dataspace Protocol specification will be maintained as part of the Eclipse Dataspace Protocol (EDP) project. DATAMITE relies on the EDC, which follows the Dataspace Protocol specification.
This way, DATAMITE can also make sure it is aligned with the potential ISO standard coming out of the Dataspace Protocol specification and also with the EU Data Act, which is taken as a fundamental baseline for the definition of the protocol. This alignment is paramount for an EU-funded project.
Which Dataspace-Related Features Is the Project Working On?
DATAMITE's objective is not to deploy a dataspace, but to assist the companies that are deploying dataspaces by providing tools to build their data products, define their own data policies and share their datasets in dataspaces. DATAMITE’s contribution to dataspaces is focused on three main topics: the definition of the so-called Rulebook, the logging functionality and the usage control enforcement.
The Rulebook provides the governance framework for dataspace implementation, including templates and checklists to define the business, legal and technical foundations for implementing dataspaces. It also includes the contractual models that are needed to legally set up a dataspace. The Rulebook also acts as a basis for the technically implementable policies and rules, and has been endorsed by IDSA and the Data Spaces Support Centre.
Monitoring and auditing are two main dataspace functionalities. The logging tool serves as an intermediary component in the data sharing process within an IDSA dataspace. It plays a vital role in enhancing the transparency and integrity of data sharing processes within a dataspace connected with the Eclipse connector. Essentially, it archives actions taken during the creation of assets on the Eclipse connector, facilitating seamless sharing under usage policies. Additionally, it diligently monitors access requests and ensures compliance with established data agreements. This functionality is achieved by extending event capabilities on the Eclipse connector, seamlessly integrating it with the logging tool.
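The logging flow described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the DATAMITE logging tool or the EDC event API: the names ConnectorEvent, EventBus and AuditLog, the event kinds and the agreement lookup are all hypothetical, chosen only to show how a component subscribed to connector events could archive asset creation and check access requests against existing agreements.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class ConnectorEvent:
    """Hypothetical event emitted by a connector (not an EDC type)."""
    kind: str      # e.g. "asset.created" or "access.requested"
    subject: str   # identifier of the asset concerned
    actor: str     # participant that triggered the event
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class EventBus:
    """Tiny publish/subscribe hub standing in for the connector's events."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[ConnectorEvent], None]] = []

    def subscribe(self, handler: Callable[[ConnectorEvent], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: ConnectorEvent) -> None:
        for handler in self._subscribers:
            handler(event)


class AuditLog:
    """Archives every event and flags access requests without an agreement."""

    def __init__(self, agreements: Dict[str, Set[str]]) -> None:
        # Assumed shape: asset id -> participants holding a valid agreement.
        self._agreements = agreements
        self.records: List[dict] = []

    def handle(self, event: ConnectorEvent) -> None:
        record = {
            "kind": event.kind,
            "subject": event.subject,
            "actor": event.actor,
            "at": event.timestamp.isoformat(),
        }
        if event.kind == "access.requested":
            allowed = self._agreements.get(event.subject, set())
            record["compliant"] = event.actor in allowed
        self.records.append(record)


# Usage: the log subscribes once, then sees every event the connector emits.
bus = EventBus()
log = AuditLog({"asset-1": {"consumer-a"}})
bus.subscribe(log.handle)
bus.publish(ConnectorEvent("asset.created", "asset-1", "provider"))
bus.publish(ConnectorEvent("access.requested", "asset-1", "consumer-a"))
bus.publish(ConnectorEvent("access.requested", "asset-1", "consumer-b"))
```

Keeping the log as a plain subscriber means the connector itself does not need to know about auditing; it only has to emit events, which matches the idea of extending the connector's event capabilities rather than changing its core behaviour.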
Regarding the data usage policy engine, DATAMITE aims to contribute to the development of two Eclipse Dataspace Components (EDC) features: the policy engine and the policy monitor. The policy engine framework allows new constraint functions to be created as extensions and registered with the policy engine. The policy monitor is a new module that continuously enforces policies on running transfer processes, so that they do not keep transferring data once the contract policy is no longer valid.
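To make the two ideas concrete, here is a minimal sketch in Python. It does not use the EDC API (EDC itself is written in Java), and every name in it is an assumption made for illustration: PolicyEngine, PolicyMonitor, Constraint, TransferProcess, the "valid_until" operand and the state strings. It only shows the general pattern of registering a constraint function and then re-evaluating the policy of running transfers.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List

# A constraint function receives the right operand and an evaluation context.
ConstraintFn = Callable[[str, dict], bool]


@dataclass
class Constraint:
    left_operand: str   # name the constraint function is registered under
    right_operand: str  # value the function compares against


@dataclass
class TransferProcess:
    id: str
    constraints: List[Constraint]
    state: str = "STARTED"


class PolicyEngine:
    """Evaluates constraints using functions registered by extensions."""

    def __init__(self) -> None:
        self._functions: Dict[str, ConstraintFn] = {}

    def register(self, left_operand: str, fn: ConstraintFn) -> None:
        self._functions[left_operand] = fn

    def evaluate(self, constraints: List[Constraint], context: dict) -> bool:
        # Fail closed: a constraint with no registered function is not met.
        return all(
            c.left_operand in self._functions
            and self._functions[c.left_operand](c.right_operand, context)
            for c in constraints
        )


class PolicyMonitor:
    """Re-checks running transfers and terminates those no longer allowed."""

    def __init__(self, engine: PolicyEngine) -> None:
        self._engine = engine

    def check(self, transfers: List[TransferProcess], context: dict) -> List[str]:
        terminated = []
        for transfer in transfers:
            if transfer.state != "STARTED":
                continue
            if not self._engine.evaluate(transfer.constraints, context):
                transfer.state = "TERMINATED"
                terminated.append(transfer.id)
        return terminated


# Usage: register a hypothetical time-limit constraint, then monitor a transfer.
engine = PolicyEngine()
engine.register(
    "valid_until",
    lambda right, ctx: ctx["now"] <= datetime.fromisoformat(right),
)
monitor = PolicyMonitor(engine)
transfer = TransferProcess(
    "tp-1", [Constraint("valid_until", "2024-06-30T00:00:00+00:00")]
)

before = monitor.check([transfer], {"now": datetime(2024, 6, 1, tzinfo=timezone.utc)})
after = monitor.check([transfer], {"now": datetime(2024, 7, 1, tzinfo=timezone.utc)})
```

In this sketch the first check leaves the transfer running because the policy is still valid, while the second check, run after the validity date, terminates it. A real monitor would run such checks periodically rather than on demand.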
How Can People Know More About DATAMITE and Contribute to the DATAMITE Project?
DATAMITE is a research project, funded by the European Commission, and its code, which is still under development, is hosted in the Eclipse Research Labs.
If you are interested in the DATAMITE project and want to know more, do not hesitate to visit the project website, subscribe to its newsletter or follow it on LinkedIn or X (formerly Twitter).
Started in January 2023, DATAMITE is an initiative funded by the European Union’s Horizon Europe Research and Innovation programme under grant agreement N° 101092989, with a strong focus on data monetising, interoperability, trading and exchange.