Wednesday, 22 May 2013

It didn't have to be like this

It was only a question of time before the tension between the paucity of really useful or economically valuable things being done with the "open data deluge" and the continuing noise about its necessity resulted in an apparent shift in the language of responsibility.

Gone is "raw data now", forgotten the criticism of Spikes Cavell, overlooked the hackneyed (yet poetic and often so) "the best thing that can be done with your data will be done by somebody else for some other reason". Instead we have a plea for open data publishers to engage with re-users. Completely understandable but in part at least risible nonetheless.

As ODI's piece Engaging With Reusers recognises at the outset, "Publishing open data requires time and effort" (though why it's not on the ODI website is unclear). Data custodians and their teams across the public sector in many cases have to do just that. One can and should argue with the utility of many of the releases, with the opaqueness of data currency and maintenance, with the absence of context by which to assess in any realistic way the relevance of the data, with the fact that the collecting methodology effectively cooks the data at the point of capture and that there is little genuinely raw data, with the formats of release and so on. Even so, there is significant endeavour in meeting the letter, if not the entire spirit, of the open data initiative and the underpinning TBL 5 stars of linkedness. Now they are being exhorted to maximise reuse. That is surely to miss the point, or is it a figleaf/buck-pass for the open data project?

As someone who has mostly believed that the best outcomes of this 'grand projet' will be significant efficiencies within and between government agencies, I find this no great surprise. Many of the numbers posited for the economic benefits may be fulfilled via productivity gains in the public sector itself rather than from the often ad-dependent tax returns of start-ups (the ad platforms, not being domiciled for tax purposes, no longer featuring in the calculations!). From the vantage point of a 13-year-old start-up steeped in the mysteries and myths of data licensing, re-use, utility, value and so on, one that assists different markets in finding or realising value, it seems strange that hard-pressed, internally focused data personnel within the public sector are now being asked to go into bat for the data whose release appeared to be their end game. Who knows, next they'll be wanting to develop and license their own applications to recover all that time and effort! There are start-ups that are both advocates and practical showcases for what can be done with open data - Mastodon C and Placr come to mind - which is excellent. They will tell you that they have had to do battle with data custodians and data alike to progress, and I am sure the feedback they provided will have helped the custodians. That surely is the point - if you build it they will come, as the old saying goes - and the energies of the market will bring their own benefits as well as a sense of relative value, though whether that is the same as the value sought for the PSIHs or lies along a different trajectory is less certain. If, on the other hand, you reach out or focus or engage or identify possible reuse cases/groups, not only will you be diverting hard-pressed resources but, with the best will in the world, you won't be doing what you're best at.

Having said all that, I think it is unfortunate that the piece is structured in the way that it is. There are useful advisories on documentation, metadata and the downstream user community that serve as flags to PSI custodians, so that when data is published it already has optimal utility and demands minimal further input, promotion or re-user engagement from the publisher (other than maintenance!). However, to me at least, these are lost in the opening salvo, and they are actually what demands all our attention. Metadata and API documentation may be the dull end of the project, but there is no question that they are central to value being added! This document is a draft - we can expect changes - so do comment on it!