The last thirty years have witnessed a revolution in digital technology. The rate and volume at which research data are created, and the potential to make outputs readily available for analysis and reuse, have increased exponentially. Despite the new opportunities that technological advances afford, significant challenges remain. It has long been recognised that it is not sufficient simply to post data and other research-related materials on the web and hope that the motivation and skill of potential users will be enough to enable reuse. To discover and reuse relevant data, whether for machine analysis at scale or for techniques such as artificial intelligence, we need well-described, accessible data that conform to community standards. To enable data reuse, the European Commission has adopted a set of attributes and guidelines for researchers working with data, known as the Findable, Accessible, Interoperable, Reusable (FAIR) principles. The FAIR principles articulate the attributes data need to have to enable and enhance reuse by humans and machines. In particular, data must be accompanied by contextual and supporting information (metadata) so that they can be discovered, understood and reused by both humans and machines.
To support the FAIR principles, CERN launched Zenodo, a catch-all repository for EC-funded research and beyond. Through the Zenodo API, it is possible to access open data and metadata. More information about the Zenodo repository and its API can be found in [1][2].
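As an illustration of what such access might look like, the following Python sketch queries the public records endpoint and keeps only the metadata of records marked as open access. It is a minimal sketch under stated assumptions, not a reference client: the endpoint path, the `q`, `size` and `page` parameters, and the response fields `hits.hits[].metadata.access_right` are assumed here and should be checked against the API documentation in [2]. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed search endpoint; verify against the API documentation [2].
ZENODO_RECORDS_URL = "https://zenodo.org/api/records"


def build_search_url(query, size=10, page=1):
    """Build a search URL for the records endpoint (parameter names assumed)."""
    params = {"q": query, "size": size, "page": page}
    return f"{ZENODO_RECORDS_URL}?{urlencode(params)}"


def extract_open_metadata(response):
    """Return the metadata dicts of records flagged as open access.

    Assumes a response shaped like {"hits": {"hits": [{"metadata": {...}}]}}
    with an "access_right" field inside each metadata dict.
    """
    hits = response.get("hits", {}).get("hits", [])
    return [
        hit.get("metadata", {})
        for hit in hits
        if hit.get("metadata", {}).get("access_right") == "open"
    ]


def fetch_open_metadata(query, size=10, timeout=30):
    """Call the API over HTTPS and return open-access metadata only."""
    request = Request(
        build_search_url(query, size=size),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as resp:
        return extract_open_metadata(json.load(resp))


if __name__ == "__main__":
    # Network call; the query term is only an example.
    for metadata in fetch_open_metadata("climate", size=5):
        print(metadata.get("title"))
```

Keeping the URL construction and the filtering separate from the network call makes both easy to test on sample responses, without contacting the service.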
However, complying with the FAIR principles and implementing them is not straightforward. First, the FAIR principles are not a new standard, but rather a guide to the choices to be made across many aspects of data and tool generation, (re)use and long-term stewardship. Second, proper data and metadata management normally requires stewardship skills.
FAIRability is a tool that guides data providers through evaluating the FAIRness of their data and suggests improvements to better comply with the FAIR principles.
The outcome of the challenge must be a user-friendly procedure that guides data providers in improving their metadata towards the FAIR principles. The procedure must include:
1. A call to the Zenodo API to extract metadata related to open access data.
2. A model to assess the FAIR compliance of the metadata. The model has to provide several indices, one ranking compliance with each FAIR principle, and a global index for an overall evaluation.
3. The model must expose the variables considered in evaluating each principle and suggest specific improvements.
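Requirements 2 and 3 above can be sketched as a simple rule-based scorer. The example below is only one possible shape for such a model, not a prescribed design: the checks, the metadata field names (`doi`, `keywords`, `access_right`, `related_identifiers`, `license`, `description`), the mapping of each variable to a principle, and the equal weighting are all illustrative assumptions.

```python
# Hypothetical checks: (principle, variable, test, suggested improvement).
# Field names and the variable-to-principle mapping are assumptions.
CHECKS = [
    ("F", "doi", lambda m: bool(m.get("doi")),
     "Assign a persistent identifier such as a DOI."),
    ("F", "keywords", lambda m: bool(m.get("keywords")),
     "Add keywords so the record is easier to discover."),
    ("A", "access_right", lambda m: m.get("access_right") == "open",
     "State the access conditions, ideally open access."),
    ("I", "related_identifiers", lambda m: bool(m.get("related_identifiers")),
     "Link related publications, datasets or software."),
    ("R", "license", lambda m: bool(m.get("license")),
     "Attach a clear usage license."),
    ("R", "description", lambda m: bool(m.get("description")),
     "Describe the content and provenance of the data."),
]


def assess(metadata):
    """Score one metadata dict against the checks above.

    Returns the per-principle indices (share of passed checks, 0 to 1),
    a global index (unweighted mean of the four indices), the variables
    evaluated for each principle, and suggestions for failed checks.
    """
    variables = {principle: {} for principle in "FAIR"}
    suggestions = []
    for principle, variable, test, advice in CHECKS:
        passed = test(metadata)
        variables[principle][variable] = passed
        if not passed:
            suggestions.append(f"[{principle}] {variable}: {advice}")
    indices = {
        principle: sum(results.values()) / len(results)
        for principle, results in variables.items()
    }
    return {
        "indices": indices,
        "global_index": sum(indices.values()) / len(indices),
        "variables": variables,
        "suggestions": suggestions,
    }


if __name__ == "__main__":
    # Placeholder record, not a real deposit.
    report = assess({"doi": "10.5281/zenodo.0000000", "access_right": "open"})
    print(report["indices"], report["global_index"])
    for line in report["suggestions"]:
        print(line)
```

Listing every check as data keeps the evaluated variables visible to the data provider and lets the weighting or the set of checks change without touching the scoring logic.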
Information for students: backgrounds in data management, data stewardship, data science and web development are recommended.
Additional information: The project is linked with the CERN Science for Open Data (CS4OD) project, which might integrate the results into its digital platform.
[1] https://zenodo.org/
[2] https://developers.zenodo.org/