Tool Fund Grantee: Oleg Lavrovsky
This grantee profile features Oleg Lavrovsky for our series of Frictionless Data Tool Fund posts, written to shine a light on Frictionless Data’s Tool Fund grantees, their work and to let our technical community know how they can get involved.
Over the years, I have tried other languages like Clojure and Pascal, Groovy and Go, Erlang and Haskell, Scala and R, even ARM C/C++ and x86 assembly. Some have stuck in my dev chain, others have not. As far as possible, I hope to keep a beginner’s mind open to new paradigms, a solid craft of working on code and data with care, and the wisdom to avoid jumping at every tempting new thing on the horizon.
I first came across tendrils of Open Knowledge ten years ago while living in Oxford, a vibrant community of thinkers and civic reformers. After we started a hackspace, I got more involved in extracurricular open source activities, joined barcamps and hackathons, and started contributing to projects. I came to see so-called ‘big IT’ or ‘enterprise software’ challenges as, on many levels, problems of incompatible or intractable data standards. It was in the U.K. that I also discovered civic tech and open data activism.
Helping to start a Swiss Open Knowledge chapter presented me with the opportunity to be involved in an ambitious and exciting techno-political movement, and to learn from some of the most deeply ethical and forward-thinking people in Information Technology. Running the School of Data working group and supporting many projects in the Swiss Opendata.ch association and international network is today no longer just a weekend activity: it is my
I first heard the term frictionless from a philosopher who warned of a world where IT removes friction to the point where we live anywhere, and do anything, at the cost of social alienation - and, along with it, grave consequences to our well-being. There are parallels here to “closed datasets”, which may well be padlocked for a reason. Throwing them into the wind may deprive them of the nurturing care of the original owners. The open data community offers them a softer landing.
Some of the conversations that led to Frictionless Data took place at OKCon 2013 in Geneva, where I was busy mining the Law. Max Ogden mentioned related ideas in his talk there on Dat Project. It later became a regular topic in the Open Knowledge Labs hangouts and elsewhere. My first impression was mixed: I liked the idea in principle, but found it hard to foresee what the standardization process could accomplish. It took me a couple of years to catch up, gain experience in putting the Open Definition to use, struggle with some of the fundamental issues myself - just to wholly accept the idea of an open data ecosystem.
Working with more unwieldy data, an interest in Data Science, and the great vibe of a growing community all led me to test the waters with the Julia language. I quickly became a fan, and started looking for ways to include it in my workflow. Thanks to the collaboration enabled by the Frictionless Data Tool Fund, I will now be able to focus on this goal and start connecting the dots more quickly. More bridges need to be built to help open data users use Julia’s computing environment, and Julia users could use sturdier access to open data.
There are two high-level use cases which I think are particularly interesting when it comes to Frictionless Data. The first is strongly typed, easy-to-validate dataset schemas, leading to a “light” version of semantic interoperability that helps data analysts, developers, and even automated agents see at a glance how compatible datasets might be. Take a look at dataship, open power system data and other case studies at Frictionlessdata.io for examples. The other is the pipelines approach which, as a feature of Unix and other operating systems, is the basis for an incredibly powerful system-building tool, now laying the foundation of a rich and reliable world of shared data.
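To make the first use case a little more concrete, here is a minimal Python sketch of what "seeing at a glance how compatible datasets might be" could look like. It uses only the standard library and plain dictionaries shaped like Table Schema descriptors (a `fields` list with `name` and `type` entries). The helper names `field_types` and `compare_schemas`, and the two example schemas, are hypothetical and chosen purely for illustration. This is not the API of any Frictionless Data library.

```python
def field_types(schema):
    """Map each field name to its declared type.

    Assumption: a field without an explicit type is treated as "string",
    which is how the Table Schema specification describes the default.
    """
    return {field["name"]: field.get("type", "string") for field in schema["fields"]}


def compare_schemas(schema_a, schema_b):
    """Return (compatible, conflicting) lists of field names shared by both schemas."""
    types_a = field_types(schema_a)
    types_b = field_types(schema_b)
    shared = sorted(set(types_a) & set(types_b))
    compatible = [name for name in shared if types_a[name] == types_b[name]]
    conflicting = [name for name in shared if types_a[name] != types_b[name]]
    return compatible, conflicting


# Hypothetical example schemas, invented for this illustration.
stations = {
    "fields": [
        {"name": "station_id", "type": "integer"},
        {"name": "name", "type": "string"},
        {"name": "opened", "type": "date"},
    ]
}
readings = {
    "fields": [
        {"name": "station_id", "type": "integer"},
        {"name": "opened", "type": "string"},
        {"name": "value", "type": "number"},
    ]
}

compatible, conflicting = compare_schemas(stations, readings)
print("Joinable on:", compatible)
print("Type conflicts:", conflicting)
```

A real validator does far more than compare type names (formats, constraints, missing values), but even this small check shows why declared types make it cheap to judge whether two datasets can be combined.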
At a more practical level, I have been using Data Packages to publish data for hackathons, School of Data workshops and other activities in my Open Knowledge chapter, and regularly explaining the concepts and training people to use Frictionless Data tools in the Open Data module I teach at the Bern University of Applied Sciences. I have built support for them into Dribdat, a tool we use for connecting the dots between people, code and data.
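For readers who have not yet seen one, the following Python sketch shows roughly what publishing a dataset as a Data Package involves: a `datapackage.json` descriptor that names the package and lists its resources, each with a path and a schema. The function `build_descriptor` and the dataset, file and field names are hypothetical placeholders. Only the descriptor keys (`name`, `resources`, `path`, `schema`, `fields`) are taken from the Data Package and Table Schema specifications.

```python
import json


def build_descriptor(package_name, resource_name, csv_path, fields):
    """Build a minimal Data Package descriptor for a single CSV resource."""
    return {
        "name": package_name,
        "resources": [
            {
                "name": resource_name,
                "path": csv_path,
                "schema": {"fields": fields},
            }
        ],
    }


# Hypothetical hackathon dataset, invented for this illustration.
descriptor = build_descriptor(
    package_name="hackathon-projects",
    resource_name="projects",
    csv_path="data/projects.csv",
    fields=[
        {"name": "title", "type": "string"},
        {"name": "team_size", "type": "integer"},
        {"name": "submitted", "type": "date"},
    ],
)

# The text below is what would be saved as datapackage.json next to the data.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

Saving that text as `datapackage.json` alongside the CSV file is, in the simplest case, all that is needed for other tools to discover the file and its column types.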
Over the years, I have made small contributions to OKI’s codebases on projects like CKAN. Contributing to the Frictionless Data project clears the way to the frontlines of development: putting better tools in users’ hands, committing directly to the needs of the community, setting an elevated expectation of responsibility and quality. That said, I am a novice in Julia. But my initial ambition is modest: make a working set of tools, produce a stable v1.0 specification release. Run tests, get reviewed, interact with the community, and iterate. This project will be a learning process, and my intention is to widen the goalposts as much as I can for others to follow.
The Julia language also needs to be better known, so I will start threads on the OKI forums, at the School of Data, in technical and academic circles. I am likewise really looking forward to representing Frictionless Data in the diverse and wide-ranging Julia community, sharing whatever questions and needs arise both ways. The specifications, libraries and tools will help to preserve key information on widely used datasets, foster a more in-depth technical discussion between everyone involved in data sharing, and open the door to more critical feedback loops between creators, publishers and users of open data.
I will be developing the datapackage-jl and tableschema-jl libraries on GitHub, and you can follow me on GitHub to see how this develops and read stories about putting Frictionless Data libraries to use. Please feel free to write me a note, send in your use case, respond to anything I’m working on or writing about, share a tricky dataset or any other kind of challenge - and let’s chat!