Adding Data Package Specifications to InterMine’s im-tables
This grantee profile features Nikhil Vats as part of our series of Frictionless Data Tool Fund posts, written to shine a light on Frictionless Data’s Tool Fund grantees and their work, and to let our technical community know how they can get involved.
# Meet Nikhil Vats
I am an undergraduate student pursuing a BE in Computer Science and an MSc in Economics at BITS Pilani, India. My open-source journey started as a Google Summer of Code student with the Open Bioinformatics Foundation in 2019, and I am currently a mentor at InterMine for Outreachy. I’ve been working part-time as a full-stack web developer for the last two years. The latest project that I worked on was DaanCorona (daan is a Hindi word which means donation) - a non-profit initiative to help small businesses affected by Coronavirus in India. Through the Frictionless Data Tool Fund, I would like to give back to the open-source community by adding data package specifications to InterMine’s im-tables. Also, I love animals, music and cinema!
# How did you first hear about Frictionless Data?
I first heard about Frictionless Data from my mentor Yo Yehudi. She had sent an email to the InterMine community explaining the Frictionless Data initiative. The introductory video of Frictionless Data by Rufus Pollock inspired me deeply. I researched the Frictionless Data Specifications, Data Packages, and other tools and was amazed by how useful they can be when working with data. I wanted to contribute to Frictionless Data because I loved its design philosophy and the plethora of potential tools that can go a long way in changing how we produce, consume, and reuse data in research.
# What specific issues are you looking to address with the Tool Fund?
InterMine is an open-source biological data warehouse. Over thirty different InterMine instances exist and can be viewed using InterMine’s web interface im-tables, a JavaScript-based query results table data displayer. The export functionality of im-tables supports common formats like CSV, TSV, and JSON. Whilst this is standardized across different instances of InterMine, the exported data doesn’t conform to any specific standard, which creates friction, especially when integrating with other tools. Adding data package specifications and integrating with the Frictionless Data specifications will ensure seamless integration, reusability, and sharing of data among individuals and apps, and will affect a large number of InterMines based in research institutes around the world. In the long run, I would also like to develop and add a specification for InterMine’s data to the Frictionless Data registry.
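As a rough illustration of what a Data Package export could look like, here is a minimal sketch in Python that builds a `datapackage.json` descriptor for one exported CSV file. It is not the im-tables implementation (im-tables is written in JavaScript). The function name `build_descriptor`, the package and resource names, the file name `results.csv`, and the columns `Gene.symbol` and `Gene.length` are all hypothetical and chosen only for this example.

```python
import json


def build_descriptor(package_name, csv_path, columns):
    """Return a minimal Tabular Data Package descriptor as a dict.

    columns is a list of (field_name, field_type) pairs describing
    the columns of the exported CSV file.
    """
    fields = [{"name": name, "type": ftype} for name, ftype in columns]
    return {
        "name": package_name,
        "profile": "tabular-data-package",
        "resources": [
            {
                # Hypothetical resource name for one exported results table.
                "name": "query-results",
                "path": csv_path,
                "profile": "tabular-data-resource",
                "format": "csv",
                # The Table Schema tells consumers how to read each column.
                "schema": {"fields": fields},
            }
        ],
    }


# Hypothetical export: two columns from an imagined gene query.
descriptor = build_descriptor(
    "example-intermine-export",
    "results.csv",
    [("Gene.symbol", "string"), ("Gene.length", "integer")],
)

# This JSON text is what would be saved as datapackage.json next to the CSV.
print(json.dumps(descriptor, indent=2))
```

Shipping a descriptor like this alongside the exported file lets other tools learn the column names and types from the package itself instead of guessing them from the raw CSV.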
# How can the open data, open source, or open science communities engage with the work you are doing?
I will be working on the im-tables and intermine GitHub repositories and writing a blog post every month to share my progress. I also plan to write documentation, tutorials, and contributing guidelines to help new contributors get started easily. I want to encourage and welcome anyone who wants to contribute or get started with open source to work on this project. I’ll be happy to help you get familiar with InterMine and this project. You can get in touch here or here. Lastly, I welcome everyone to try out and use the features added during this project to make data frictionless, usable, and open!