Frictionless Data

GoodTables

A simple yet powerful tool to ensure the quality of tabular data, in Python and on the command line.

GoodTables is a managed service for validating tabular data. It checks both the structure of your data (e.g. every row has the same number of columns) and its contents (e.g. every date is valid). Internally, it relies on the Data Quality Spec, which defines common tabular data errors. GoodTables also supports data described by a Data Package or a Table Schema.
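To make the difference between a structure check and a content check concrete, here is a minimal sketch using only the Python standard library. It is not how GoodTables is implemented; the function names `rows_have_equal_width` and `invalid_dates`, the `joined` column, and the sample rows are all hypothetical and exist only for illustration.

```python
import csv
import io
from datetime import date


def rows_have_equal_width(text):
    """Structure check: every row has as many cells as the header row."""
    rows = list(csv.reader(io.StringIO(text)))
    return all(len(row) == len(rows[0]) for row in rows)


def invalid_dates(text, column):
    """Content check: list (row_number, value) for non-ISO dates in `column`."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        try:
            date.fromisoformat(row[column])
        except (TypeError, ValueError):
            # TypeError covers a missing cell, ValueError an impossible date.
            problems.append((row_number, row[column]))
    return problems


# Hypothetical sample: structurally fine, but 30 February does not exist.
sample = "name,joined\nJill,2020-01-31\nJack,2020-02-30\n"
print(rows_have_equal_width(sample))    # True
print(invalid_dates(sample, "joined"))  # [(3, '2020-02-30')]
```

The sample passes the structure check yet fails the content check, which is why a validator needs both kinds of checks.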

Let’s visit the GoodTables website and log in with GitHub to start validating our data.

[Image: goodtables dashboard]

Add a data source in the dashboard using GitHub (Amazon S3 is also supported, but we’re only covering GitHub here):

INFO: We need to create a GitHub repository to store our helloworld.csv file. Make sure you use the valid CSV from our example above.

[Image: adding source to goodtables]

Because the data in our helloworld.csv is valid and well structured, the results will come back as valid, as seen in the image below.

[Image: valid data]

Now, let’s change the file to contain invalid tabular data and see what the checks return:

Name,Email,,Age
Jill,jill@example.com
Jack,jack@example.com,33
23,Jane,jane@example.com, 22, 33

[Image: invalid data]

As expected, this build fails because GoodTables detects several structural errors (“Blank Header”, “Missing Value”, and “Extra Value”).
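To see where each of these three errors comes from, we can reproduce them locally with a rough sketch built on Python's standard `csv` module. This is an illustration under stated assumptions, not GoodTables's own code: the `structural_errors` function is hypothetical, the email addresses are placeholders, and the sketch reports at most one error per row, whereas a real validator may report one per affected cell.

```python
import csv
import io


def structural_errors(text):
    """Return (row_number, error_name) pairs for three structural problems."""
    rows = list(csv.reader(io.StringIO(text)))
    header, errors = rows[0], []
    # A header cell with no name gives the column nothing to be called by.
    for cell in header:
        if not cell.strip():
            errors.append((1, "Blank Header"))
    # Compare each data row's width against the header's width.
    for row_number, row in enumerate(rows[1:], start=2):
        if len(row) < len(header):
            errors.append((row_number, "Missing Value"))
        elif len(row) > len(header):
            errors.append((row_number, "Extra Value"))
    return errors


# Placeholder addresses; only the shape of each row matters here.
invalid = (
    "Name,Email,,Age\n"
    "Jill,jill@example.com\n"
    "Jack,jack@example.com,33\n"
    "23,Jane,jane@example.com, 22, 33\n"
)
for row_number, name in structural_errors(invalid):
    print(row_number, name)
```

The header declares four columns, one of them unnamed, so rows 2 and 3 come up short and row 4 has one cell too many.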

Additionally, here’s a video walkthrough of the content outlined above.