Friday, June 28, 2024 - 07:00

What Is the DATAMITE Project About? 

During the last decade, companies have been storing massive amounts of data, more often than not without a clear strategy or well-defined policy. The results are dramatic. Around 70% of data is never used, mostly because potential data consumers within organisations are unaware of what data they have, due to the lack of proper data catalogues and data governance strategies. Even when data is used, consumers often find that it is of low quality because quality evaluation tools are missing, which leads to incomplete or inconsistent data, outliers, problems during data ingestion, and other issues.

Now that data sharing is becoming increasingly important, these internal deficiencies in data management prevent companies from sharing data, because they lack the tools and maturity to do so properly. DATAMITE aims to put an end to this situation, helping companies with tools, training material and a community that can assist them in improving how they manage their data.

How Does DATAMITE Contribute to the Data Economy? 

The data economy depends heavily on the quality of the data being shared, on how enriched it is, and on aspects such as semantic interoperability. These ensure that the data is findable, accessible, interoperable and reusable, the well-known FAIR principles. The DATAMITE framework assists companies from the inside, providing data governance, quality, security, and data discovery and ingestion tools, allowing them to improve their data management and their capacity to monetise their data internally. Once companies have the tools to improve the usage and quality of data internally, they will be in a position to go outside, share their data and make a significant contribution to the data economy.

DATAMITE will also assist them there, providing tools to build their data products, define their own data policies and share their datasets in dataspaces (aligned with Gaia-X, IDSA or other reference architectures) or in other portals and marketplaces such as EOSC or the AI On-Demand platform.

How Is DATAMITE Related to the Eclipse Dataspace Working Group (EDWG) and Eclipse Projects?

A number of DATAMITE partners, such as IDSA, Tecnalia and Fraunhofer, are also members of the Eclipse Dataspace Working Group and are deeply involved in the dataspace specifications and related developments.

Furthermore, the DATAMITE project relies on the results of several Eclipse open source projects: the Eclipse Dataspace Components (EDC), the Eclipse Dataspace Protocol specification project and the Eclipse Dataspace Protocol TCK project, all under the umbrella of the Eclipse Dataspace Working Group.

On top of that, a number of DATAMITE partners involved in the Eclipse Dataspace Working Group are also preparing a new set of Dataspace specification projects for Decentralised Claims and Conformity Assessment, which aim to provide standardised identity and trust features on top of the Dataspace Protocol.

DATAMITE is currently developing extensions for the Eclipse Dataspace Components (EDC) connector, using the Dataspace Protocol for interoperability. The possibility of contributing the DATAMITE developments back to these projects is also being considered.
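To give a flavour of what such an extension looks like, below is a minimal sketch of an EDC service extension skeleton. It assumes the ServiceExtension SPI used by recent EDC releases; the package and class names are hypothetical and do not reflect DATAMITE's actual code.

```java
// Minimal EDC extension skeleton (illustrative sketch only). The package name and
// extension class are hypothetical; the ServiceExtension SPI location follows
// recent EDC releases and may differ between versions.
package org.example.datamite;

import org.eclipse.edc.spi.system.ServiceExtension;
import org.eclipse.edc.spi.system.ServiceExtensionContext;

public class DatamiteSampleExtension implements ServiceExtension {

    @Override
    public String name() {
        return "DATAMITE Sample Extension";
    }

    @Override
    public void initialize(ServiceExtensionContext context) {
        // Initialisation hook: this is where services, event subscribers or policy
        // functions would be registered. Here we only log that the extension loaded.
        context.getMonitor().info("DATAMITE sample extension initialised");
    }
}
```

The EDC runtime discovers such extensions through a service provider entry, i.e. a META-INF/services/org.eclipse.edc.spi.system.ServiceExtension file listing the fully qualified class name.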

How is DATAMITE Related to the Eclipse Dataspace Protocol? 

The Dataspace Protocol specification provides the technical framework for interoperable data exchange, and it is on its way to becoming an international standard. Interoperability is an important aspect for DATAMITE, so the project is keen to use this specification to make sure its developments are future-proof.

The Dataspace Protocol specification will be maintained as part of the Eclipse Dataspace Protocol (EDP) project. DATAMITE relies on the EDC, which follows the Dataspace Protocol specification.

This way, DATAMITE can also make sure it is aligned with the potential ISO standard coming out of the Dataspace Protocol specification, as well as with the EU Data Act, which is taken as a fundamental baseline for the definition of the protocol. This alignment is paramount for an EU-funded project.

Which Dataspace-Related Features Is the Project Working On? 

DATAMITE's objective is not to deploy a dataspace, but to assist the companies that are deploying dataspaces by providing tools to build their data products, define their own data policies and share their datasets in dataspaces. DATAMITE's contribution to dataspaces is focused on three main topics: the definition of the so-called Rulebook, the logging functionality and usage control enforcement.

The Rulebook provides the governance framework for dataspace implementation, including templates and checklists to define the business, legal and technical foundations for implementing dataspaces, as well as the contractual models needed to legally set up a dataspace. The Rulebook also acts as a basis for the technically implementable policies and rules, and has been endorsed by the IDSA and the Data Spaces Support Centre.

Monitoring and auditing are two key dataspace functionalities. The logging tool serves as an intermediary component in the data sharing process within an IDSA dataspace, enhancing the transparency and integrity of data sharing in a dataspace built on the EDC connector. It records the actions taken when assets are created on the connector and shared under usage policies, monitors access requests, and checks compliance with the established data agreements. This is achieved by extending the event capabilities of the EDC connector and integrating them with the logging tool.
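As a hedged illustration of this kind of event integration, the sketch below registers an event subscriber with the EDC EventRouter and forwards each event envelope to an audit log. The class and package names are assumptions based on recent EDC releases, and the Monitor call stands in for the actual DATAMITE logging tool.

```java
// Sketch of forwarding EDC events to an audit log by registering an EventSubscriber.
// Class and package names follow recent EDC releases and are assumptions; the call
// to the Monitor stands in for the actual DATAMITE logging tool.
package org.example.datamite.logging;

import org.eclipse.edc.runtime.metamodel.annotation.Inject;
import org.eclipse.edc.spi.event.Event;
import org.eclipse.edc.spi.event.EventEnvelope;
import org.eclipse.edc.spi.event.EventRouter;
import org.eclipse.edc.spi.event.EventSubscriber;
import org.eclipse.edc.spi.monitor.Monitor;
import org.eclipse.edc.spi.system.ServiceExtension;
import org.eclipse.edc.spi.system.ServiceExtensionContext;

public class AssetAuditExtension implements ServiceExtension {

    @Inject
    private EventRouter eventRouter;

    @Override
    public void initialize(ServiceExtensionContext context) {
        // Register a subscriber for the base Event type; a real extension would
        // typically narrow this to asset- and contract-related event classes.
        eventRouter.register(Event.class, new AuditSubscriber(context.getMonitor()));
    }

    private record AuditSubscriber(Monitor monitor) implements EventSubscriber {
        @Override
        public <E extends Event> void on(EventEnvelope<E> envelope) {
            // Placeholder for a call to the DATAMITE logging tool.
            monitor.info("Audit entry: " + envelope.getPayload().getClass().getSimpleName()
                    + " at " + envelope.getAt());
        }
    }
}
```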

Regarding the data usage policy engine, DATAMITE aims to contribute to the development of two Eclipse Dataspace Components (EDC) features: the policy engine and the policy monitor. The policy engine framework allows new constraint functions, defined as extensions, to be created and registered with the policy engine. The policy monitor is a new module that continuously enforces policies on running transfer processes, ensuring that they stop transferring data once the contract policy is no longer valid.
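To illustrate the first of these features, the sketch below registers a custom constraint function with the EDC policy engine from an extension. The evaluation scope, the ALLOWED_REGION constraint key and the hard-coded claim value are illustrative assumptions rather than DATAMITE's actual policies, and the SPI signatures may differ between EDC versions.

```java
// Sketch of registering a custom constraint function with the EDC policy engine.
// The evaluation scope, the ALLOWED_REGION key and the hard-coded claim value are
// illustrative assumptions; SPI signatures may differ between EDC versions.
package org.example.datamite.policy;

import org.eclipse.edc.policy.engine.spi.AtomicConstraintFunction;
import org.eclipse.edc.policy.engine.spi.PolicyContext;
import org.eclipse.edc.policy.engine.spi.PolicyEngine;
import org.eclipse.edc.policy.engine.spi.RuleBindingRegistry;
import org.eclipse.edc.policy.model.Operator;
import org.eclipse.edc.policy.model.Permission;
import org.eclipse.edc.runtime.metamodel.annotation.Inject;
import org.eclipse.edc.spi.system.ServiceExtension;
import org.eclipse.edc.spi.system.ServiceExtensionContext;

public class RegionPolicyExtension implements ServiceExtension {

    private static final String KEY = "ALLOWED_REGION";         // hypothetical constraint key
    private static final String SCOPE = "contract.negotiation"; // assumed evaluation scope

    @Inject
    private PolicyEngine policyEngine;

    @Inject
    private RuleBindingRegistry bindingRegistry;

    @Override
    public void initialize(ServiceExtensionContext context) {
        // Bind the constraint key to the scope so the engine evaluates it there,
        // then register the function for Permission rules carrying that key.
        bindingRegistry.bind(KEY, SCOPE);
        policyEngine.registerFunction(SCOPE, Permission.class, KEY, new RegionConstraintFunction());
    }

    private static class RegionConstraintFunction implements AtomicConstraintFunction<Permission> {
        @Override
        public boolean evaluate(Operator operator, Object rightValue, Permission rule, PolicyContext context) {
            // In a real extension the region would come from the participant's verified claims.
            var participantRegion = "EU";
            return operator == Operator.EQ && participantRegion.equals(rightValue);
        }
    }
}
```

With this in place, a contract policy containing such a constraint would be evaluated by this function whenever the engine runs in the bound scope.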

How Can People Know More About DATAMITE and Contribute to the DATAMITE Project? 

DATAMITE is a research project funded by the European Commission, and its code, which is still under development, is hosted at Eclipse Research Labs.

If you are interested in the DATAMITE project and want to know more, do not hesitate to visit the project website, subscribe to its newsletter, or find it on LinkedIn or X (formerly Twitter).

Started in January 2023, DATAMITE is an initiative funded by the European Union’s Horizon Europe Research and Innovation programme under grant agreement N° 101092989, with a strong focus on data monetisation, interoperability, trading and exchange.


About the Author

Jordi Arjona Aroca

Jordi Arjona is the Distributed Systems Research group coordinator at ITI. He received his PhD from UC3M and IMDEA Networks Institute. He has worked at Bell Labs and Fundación ValenciaPort, and has enjoyed research stays at IBM Research Labs in Bangalore and at the Institute of Computing Technology of the Chinese Academy of Sciences. He has published several articles in international conferences and journals, and has participated in multiple European projects. He is also the Technical Coordinator of the DATAMITE project.

Ilknur Chulani

Ilknur Chulani studied computer engineering at Ege University in Turkey. She is a Senior Program Manager at the International Data Spaces Association, helping research projects with dataspace concepts. She previously worked at Atos, coordinating activities in research and innovation projects within European framework programmes on software, cloud infrastructures, IoT, edge and FIWARE topics, and contributing to the Atos Scientific Community. Before that, she spent many years with the IBM Software labs in the U.S. and Canada, building IBM's Java class libraries and Eclipse-based Java IDEs for embedded systems.

Valentín Sánchez

Valentín Sánchez holds a BSc in Physics and has more than 30 years of experience in Information Technologies. He is currently a researcher in the DIGITAL Operational Unit at TECNALIA RESEARCH & INNOVATION, participating as project director in data management and governance projects aimed at developing and deploying analytics models based on data shared between different companies. His research interests include the management and commercialisation of personal data while respecting data sovereignty.

Vasileios Siopidis

Vasileios Siopidis is a Research Assistant at the Centre for Research & Technology Hellas/Information Technologies Institute (CERTH/ITI). He holds a Master's degree in Advanced Computer and Communication Systems from the Aristotle University of Thessaloniki, Greece. With a focus on data sharing, governance, and sovereignty, Vasileios has contributed to numerous European research projects. His research interests encompass blockchain technology, identity management, access control, and data storage and management.

Marko Turpeinen

Prof. Marko Turpeinen has over 25 years of experience as a business visionary and an international education and innovation leader in the digital transformation of industries and the network society. He is the Founder and CEO of 1001 Lakes, a company that specialises in enabling fair and trusted data ecosystems. He is also an Adjunct Professor at Aalto University, focusing on topics such as digital ethics and the power of algorithms.

Konstantinos Votis

Dr. Konstantinos Votis is a computer engineer and a senior researcher (Researcher Grade B’) at the Information Technologies Institute/Centre for Research and Technology Hellas (CERTH/ITI) and Director of the Visual Analytics Laboratory of CERTH/ITI. He has also been a visiting professor at the University of Nicosia's Institute of the Future, covering blockchain and AI technologies, since October 2019. He received an MSc and a PhD in computer science and service-oriented architectures from the Computer Engineering and Informatics department, University of Patras, Greece, and also holds an MBA from the Business School of the University of Patras. His research interests include artificial intelligence, information visualisation and management of big data, human-computer interaction (HCI) and interactive technologies, knowledge engineering and decision support systems, the Internet of Things, cybersecurity and pervasive computing, with major application areas such as mHealth, eHealth and personalised healthcare. His PhD was in the domain of service-oriented architectures and information management systems.