At our last community call on January 26th, we had Matteo Fortini from the Italian National Department of Digital Transformation, who led a discussion about DCAT and Frictionless Data.
Open data is key to ensure transparency and accountability, understand the world, and have an economy of data. The open data publishing chain in Europe starts with distribution of datasets that go into a national catalogue, which is then harvested by an EU catalogue - all this enabled by metadata.
In practice, Matteo and his colleagues would publish the data (e.g. on the Next Generation EU funds, or on the National Population Registry) as Frictionless Data with DCAT metadata, a format that is mandatory to get into the EU catalogue.
The data is gathered on GitHub (a CKAN instance is sadly not available yet) through scripts that are run every day. The data is published in both CSV and JSON formats, with foreign keys to other tabular data (e.g. geographical data for municipalities) and Frictionless metadata to provide a standard way to document all the different attributes of the data, to enforce constraints, and to ensure data quality in general. On top of that there is the Italian DCAT_AP, with its mandatory metadata attributes.
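To make the role of the Frictionless metadata more concrete, here is a minimal sketch of a Tabular Data Resource descriptor with field constraints and a foreign key, plus a small hand-written check against it. The resource name, file path, field names and the `check_row` helper are all hypothetical and are not taken from Matteo's repositories. In practice the Frictionless tooling would do this validation; the helper only illustrates what the descriptor expresses.

```python
import re

# Hypothetical descriptor: names, paths and the code pattern are illustrative only.
resource = {
    "name": "residents-by-municipality",
    "path": "data/residents.csv",
    "profile": "tabular-data-resource",
    "schema": {
        "fields": [
            {
                "name": "municipality_code",
                "type": "string",
                "constraints": {"required": True, "pattern": "[0-9]{6}"},
            },
            {
                "name": "residents",
                "type": "integer",
                "constraints": {"minimum": 0},
            },
        ],
        # Each municipality_code must exist in another (hypothetical) resource.
        "foreignKeys": [
            {
                "fields": "municipality_code",
                "reference": {"resource": "municipalities", "fields": "code"},
            }
        ],
    },
}


def check_row(row, schema, known_codes):
    """Return a list of constraint violations for one already-typed row."""
    errors = []
    for field in schema["fields"]:
        name = field["name"]
        value = row.get(name)
        constraints = field.get("constraints", {})
        if value is None or value == "":
            if constraints.get("required"):
                errors.append(f"{name}: value is required")
            continue
        pattern = constraints.get("pattern")
        # Table Schema patterns must match the whole value, hence fullmatch.
        if pattern is not None and not re.fullmatch(pattern, str(value)):
            errors.append(f"{name}: does not match {pattern}")
        minimum = constraints.get("minimum")
        if minimum is not None and value < minimum:
            errors.append(f"{name}: below minimum {minimum}")
    for key in schema.get("foreignKeys", []):
        if row.get(key["fields"]) not in known_codes:
            target = key["reference"]["resource"]
            errors.append(f"{key['fields']}: no match in {target}")
    return errors


# Codes that would come from the referenced "municipalities" table.
municipality_codes = {"058091", "015146"}
good = check_row({"municipality_code": "058091", "residents": 10},
                 resource["schema"], municipality_codes)
bad = check_row({"municipality_code": "58091", "residents": -1},
                resource["schema"], municipality_codes)
```

Here `good` is an empty list, while `bad` reports a pattern violation, a value below the minimum, and a foreign key with no match.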
While DCAT is very useful to understand the content, the themes, and the licences, Frictionless Data goes down to attribute descriptions, data types and constraints. So what Matteo would like to have in the future is one type of metadata that would cover both the data description and attributes, and the catalogue information.
Some efforts were already made in the past by community members Augusto Herrman and Ayrton Bourne to map data packages to DCAT (as documented in this issue). Now Matteo and his colleagues are actively looking for other people who would be interested in creating a working group about this, to try to get to some kind of shared standard.
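As a rough illustration of what such a mapping involves, the sketch below turns a Data Package descriptor into a DCAT dataset expressed as a JSON-LD-style dictionary. This is not the mapping discussed in the issue: the choice of properties, the `datapackage_to_dcat` function and the `base_url` parameter are assumptions made for the example, and it assumes every resource has a single string `path`.

```python
from urllib.parse import urljoin


def datapackage_to_dcat(package, base_url):
    """Map a Data Package descriptor to a DCAT dataset (illustrative only)."""
    # Data Package licences are a list of objects; keep those that have a URL.
    licenses = [lic["path"] for lic in package.get("licenses", []) if "path" in lic]

    distributions = []
    for res in package.get("resources", []):
        dist = {
            "@type": "dcat:Distribution",
            "dct:title": res.get("title", res["name"]),
            # Resource paths are often relative, DCAT needs an absolute URL.
            "dcat:downloadURL": {"@id": urljoin(base_url, res["path"])},
        }
        if "mediatype" in res:
            # Simplification: DCAT expects an IANA media type resource here.
            dist["dcat:mediaType"] = res["mediatype"]
        if licenses:
            dist["dct:license"] = {"@id": licenses[0]}
        distributions.append(dist)

    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:identifier": package["name"],
        "dct:title": package.get("title", package["name"]),
        "dct:description": package.get("description", ""),
        "dcat:keyword": package.get("keywords", []),
        "dcat:distribution": distributions,
    }


# Hypothetical package used only to exercise the function.
package = {
    "name": "residents",
    "title": "Residents by municipality",
    "keywords": ["population"],
    "licenses": [{"name": "CC-BY-4.0",
                  "path": "https://creativecommons.org/licenses/by/4.0/"}],
    "resources": [{"name": "residents", "path": "data/residents.csv",
                   "mediatype": "text/csv"}],
}
dcat = datapackage_to_dcat(package, "https://example.org/repo/")
```

Note what gets lost: the field names, types and constraints in the Table Schema have no place in the DCAT output, which is exactly the gap Matteo described.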
Other community members present at the call shared their own experience with Frictionless and DCAT:
The German state of Schleswig-Holstein shared a very interesting example from their portal. As they did not find a good way to attach the Frictionless Specification to the DCAT Distribution, they created a separate distribution for the Frictionless Tabular Data Resource. Switzerland took the same approach, linking the Frictionless Specification as a separate distribution, as you can see in this example. They are unsure about this approach though, as it seems to be a misuse of the DCAT Class.
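The "separate distribution" pattern can be pictured roughly as follows. This sketch is not copied from either portal: the URLs are placeholders, and using `dct:conformsTo` to flag the descriptor distribution is an assumption made here so that a consumer has some way to tell the two distributions apart.

```python
# Public URL of the Tabular Data Resource specification.
TABULAR_RESOURCE_SPEC = "https://specs.frictionlessdata.io/tabular-data-resource/"

# Hypothetical dataset: one distribution holds the data, one the descriptor.
dataset = {
    "@type": "dcat:Dataset",
    "dct:title": "Example dataset",
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dct:title": "Data (CSV)",
            "dcat:downloadURL": {"@id": "https://example.org/data.csv"},
        },
        {
            "@type": "dcat:Distribution",
            "dct:title": "Frictionless Tabular Data Resource",
            "dcat:downloadURL": {"@id": "https://example.org/data.resource.json"},
            # Assumed convention, see the note above.
            "dct:conformsTo": {"@id": TABULAR_RESOURCE_SPEC},
        },
    ],
}


def split_distributions(dataset):
    """Separate data distributions from Frictionless descriptor distributions."""
    data, descriptors = [], []
    for dist in dataset["dcat:distribution"]:
        conforms = dist.get("dct:conformsTo", {}).get("@id")
        if conforms == TABULAR_RESOURCE_SPEC:
            descriptors.append(dist)
        else:
            data.append(dist)
    return data, descriptors


data, descriptors = split_distributions(dataset)
```

The extra filtering step hints at why the approach feels like a misuse: a descriptor is metadata about the data rather than another form of the data itself, yet it sits in the same list as the actual files.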
To make Frictionless Data more interoperable with other semantic web standards, Dan Feder suggested creating an RDF or JSON-LD specification, something that had already been discussed in the past, as documented in this issue.
If you want to know more about Matteo’s presentation, here’s the recording:
# Join us next month!
The next community call is on February 23rd, and we are going to hear about the database curation software for the World Glacier Monitoring Service (WGMS) from Ethan Welty.
You can already sign up for the call here. Do you want to share something with the community? Let us know when you sign up.
And if you have a cool project that you would like to show to the community, please let us know! You can just fill out this form, or come and tell us on our community chat.
# Call Recording
On a final note, here is the recording of the full call: