  • Publish Your Data Package Online
    • It's Only Files Online
    • Github, Bitbucket etc
    • S3, Google Storage etc
    • Google Drive
    • Dropbox
    • Key Tips

              Publish Your Data Package Online

              August 29, 2016 by Frictionless Data
              Price icons created by Pixel perfect - Flaticon

              This tutorial is about how to publish your Data Package online for others to find and use.

              It assumes you have already finished packaging up your data as a Data Package (if not, check out the instructions here).

              # It’s Only Files Online

              Publishing your Data Package is incredibly simple: you just need to post it online somewhere that others can access.

              Note: if you just want to share your Data Package with a few others, you can send it directly, for example via email. Since a Data Package is just some files, there are as many ways to publish it as there are ways to put files online. Here we provide some general tips and illustrate some of the most popular publishing options.

              Advertise it

              Once you have published your Data Package, you may want to advertise it to others. One way to do this is to add it to the catalog-list file in the registry repo; it will then automagically appear as a community dataset on the data.okfn.org site.

              # Github, Bitbucket etc

              One nice option for more experienced users is to manage your Data Package in a Git or Mercurial repo and push it to GitHub, Gitorious, Bitbucket or similar.

              # S3, Google Storage etc

              Cloud storage services like S3 and Google Storage are well suited to storing your Data Packages.

              # Google Drive

              The directory structure of a Data Package shared on Google Drive must be flat; that is, the Data Package must not contain any folders.

              OK

              shared-folder
              |-- datapackage.json
              |-- README.md
              |-- data.csv
              

              Not OK

              shared-folder
              |-- datapackage.json
              |-- README.md
              |-- data
                  |-- data.csv
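
Before uploading, the flat-layout requirement can be checked locally. The following is a minimal sketch, assuming the package lives in an ordinary local folder; the helper name `is_flat` is hypothetical and is not part of any Frictionless Data tool.

```python
import tempfile
from pathlib import Path


def is_flat(package_dir):
    """Return True if package_dir contains no subfolders (hypothetical helper)."""
    return not any(entry.is_dir() for entry in Path(package_dir).iterdir())


# Demo: recreate the "OK" and "Not OK" layouts in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    ok = Path(tmp) / "ok"
    ok.mkdir()
    for name in ("datapackage.json", "README.md", "data.csv"):
        (ok / name).touch()

    not_ok = Path(tmp) / "not-ok"
    (not_ok / "data").mkdir(parents=True)
    (not_ok / "datapackage.json").touch()
    (not_ok / "README.md").touch()
    (not_ok / "data" / "data.csv").touch()

    print(is_flat(ok))      # True
    print(is_flat(not_ok))  # False
```

The check only looks at the top level of the folder, which is enough here because any subfolder at all breaks the flat layout.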
              
              1. Upload your Data Package folder (help)

              2. Change your folder’s share setting to Public on the web - Anyone on the Internet can find and view (help)

              3. Get a shareable link for your folder (help)

              4. Find your folder’s ID in the link

              • Example Link:
                • https://drive.google.com/open?id=0B-f6D5RM8awSfkdtRWpiTlpxdmhPblJRd2NhdHpHMFZPOFZKcWhpT2NkQlZCUlNWUnFwaHM&authuser=0
              • Example ID:
                • 0B-f6D5RM8awSfkdtRWpiTlpxdmhPblJRd2NhdHpHMFZPOFZKcWhpT2NkQlZCUlNWUnFwaHM
              5. Your datapackage.json link is https://googledrive.com/host/{ID}/datapackage.json; for example, using the Example ID from the previous step, the datapackage.json link is:
              • https://googledrive.com/host/0B-f6D5RM8awSfkdtRWpiTlpxdmhPblJRd2NhdHpHMFZPOFZKcWhpT2NkQlZCUlNWUnFwaHM/datapackage.json
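
Steps 4 and 5 above are plain string manipulation, so they can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, it assumes the share link has the `open?id=...` shape of the Example Link, and it builds the URL from the pattern given in step 5 without checking that the resulting address is reachable.

```python
from urllib.parse import parse_qs, urlparse


def drive_folder_id(share_link):
    """Extract the `id` query parameter from a share link (hypothetical helper)."""
    ids = parse_qs(urlparse(share_link).query).get("id")
    if not ids:
        raise ValueError("no 'id' parameter found in the share link")
    return ids[0]


def datapackage_url(folder_id):
    """Fill the folder ID into the URL pattern from step 5."""
    return f"https://googledrive.com/host/{folder_id}/datapackage.json"


# The Example Link from step 4.
link = (
    "https://drive.google.com/open"
    "?id=0B-f6D5RM8awSfkdtRWpiTlpxdmhPblJRd2NhdHpHMFZPOFZKcWhpT2NkQlZCUlNWUnFwaHM"
    "&authuser=0"
)
print(datapackage_url(drive_folder_id(link)))
```

Links in any other shape are not handled; the function raises `ValueError` rather than guessing.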

              # Dropbox

              Just upload your files to Dropbox.

              You do need to be a bit careful as Dropbox does not always replicate your local file layout in its online URLs. Therefore, make sure you read the Key Tips section below.

              # Key Tips

              However you publish your Data Package, there are a few key points to keep in mind:

              • All the files in the Data Package should be accessible online

              • The structure of your Data Package should be preserved. Specifically, the relative paths between your datapackage.json and the data files must be preserved. For example, if your Data Package directory looked like this on disk:

                datapackage.json
                data.csv
                somedir/other-data.csv
                

                then online it should look like:

                http://your.website.com/mydatapackage/datapackage.json
                http://your.website.com/mydatapackage/data.csv
                http://your.website.com/mydatapackage/somedir/other-data.csv
                

                This can be a problem with services such as Google Drive, where files in a given folder don’t have a web address that relates to that folder. The reason we need to preserve relative paths is that, when using the Data Package, client software will compute the full path of each file from the location of the datapackage.json itself plus the relative path for the file given in the resources section of the datapackage.json.
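
To illustrate why relative paths matter, here is a minimal sketch of how a client might compute each file's full URL from the location of the datapackage.json. It uses the standard library's `urljoin`, not the actual Frictionless Data client code; the function name `resource_urls` and the inline descriptor are assumptions made up for this example.

```python
from urllib.parse import urljoin


def resource_urls(descriptor_url, descriptor):
    """Resolve every resource path against the descriptor's own URL."""
    urls = []
    for resource in descriptor.get("resources", []):
        paths = resource.get("path", [])
        if isinstance(paths, str):  # allow a single path or a list of paths
            paths = [paths]
        urls.extend(urljoin(descriptor_url, path) for path in paths)
    return urls


# Hypothetical descriptor matching the on-disk layout shown above.
descriptor = {
    "name": "mydatapackage",
    "resources": [
        {"path": "data.csv"},
        {"path": "somedir/other-data.csv"},
    ],
}

base = "http://your.website.com/mydatapackage/datapackage.json"
for url in resource_urls(base, descriptor):
    print(url)
# http://your.website.com/mydatapackage/data.csv
# http://your.website.com/mydatapackage/somedir/other-data.csv
```

If the hosting service does not serve the files at these computed addresses, the client cannot find the data even though every file is online.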

              Recommended reading: Find out how to use Frictionless Data software to improve your data publishing workflow in our new and comprehensive Frictionless Data Field Guide.
