Monday, 29 June 2009

CPS - Our Data, Big IT - another missed opportunity

The entry of the Centre for Policy Studies (CPS) into the "free"/"open" debate is not surprising. It feels, though, like another missed opportunity to move the debate on, hampered as it is by political point scoring.

Varney has it that the government needs to hold “...a ‘deep truth’ about the citizen based on their behaviour, experiences, beliefs, needs and rights”; the CPS report argues for something they call "Government Relationship Management" at whose heart lies choice in the location of your personal data and access to it based on standards (and, not mentioned, rights). Whether or not data "belongs" to the individual, that data should be exchanged using open standards - the web services and metadata chestnut. Hence my interest.

The report focuses on cost, ownership and security, and argues that the solutions to these lie in a change to the model. From the perspective of opening up access the report actually offers little in the way of new ideas above and beyond the central thesis that who holds our data should be our choice.

Of course many private sector entities hold data about each of us. Are they any more trusted (fallible humans account for most mistakes) or cost effective (consider the long run case for PPP/PFI and that government will have to run the same infrastructure anyway before answering!)?

Innovation with new tools and technologies is generally far more rapid in the private sector and there are good lessons and cost savings assuredly for government IT regarding all manner of these from the cloud, to SOA, APIs, usability etc.

The report bemoans data replication (as it is easy to do from an armchair) but storage costs continue to tumble and techniques such as automated de-duping provide further savings. It also bemoans data sharing, which is in part the flip side of the same coin. This is not where major cost savings lie either, and suggestions of 50% are “vague” anyway, as the report itself admits.

But cost savings there can be for sure; mechanisms that drive value to the citizen and ease service provision should be at the core of the debate not point scoring phraseology and impossible to substantiate claims. So where might that value derive?

The CPS report rightly endorses (increasing) adoption of SOA and the cloud to spur efficiencies and the big providers are on this already. We are beginning to see ‘vertical market’ and ‘localised’ interfaces for a range of ‘use cases’ from citizen through service provider to analyst. Private sector experience is that these approaches deliver significant downstream IT savings whilst embracing outsourcing, providing greater flexibility and speed to market for service providers. At their heart lies adoption of authentication and authorisation standards and technologies that deliver rights based access to data and functionality depending on the user and other factors. All power to that elbow as it deals with both the replication and sharing arguments when implemented correctly.

Government would point out it needs access to a consistent data set so that government's (presumably outsourced) analysts could compare apples with apples - without a level playing field what hope for the postcode lottery? A fertile imagination will see risks in an online privatised ID!

Like other recent reports (including POIT, UK Location Strategy) those close to Whitehall (which policy wonks, analysts, civil servants etc inevitably are) sometimes appear reluctant to recognise what distributed architectures bring to this debate - it doesn't actually matter where the data is held as long as it can be discovered, accessed, exchanged and so on.

There is of course a debate to be had about what data should be held, who should collect it, who should have access to it, where it might be stored, what rights management apply and in what circumstances and so on.

However, what has got lost in the wash is some definition or acknowledgement that what is in essence under discussion is ‘data for the “public good”’, be it through aggregation or for the individual citizen.

Under this world view ‘public’ would be defined as discoverable or searchable and to be those things you need openness, interoperability, web services and above all a mechanism that integrates authentication and authorisation into the solution via the construct of metadata and rights management. Mandating metadata capture and discoverability (publishing) would provide much of the enabling framework and dissipate the faux concern over whose data it is.

It is easy and correct to point the finger at ineffective and poor value government IT projects (I'll give you some less familiar - £50m for RPA's SPS so far, £7.2m for planningportal architecture alone over 3 years). But to intimate that a vague and unpalatable solution offers some panacea for these failings is an incoherent leap based on a narrow philosophical outlook and narrow technical thinking. The promise of the distributed discoverable semantic web fits far better with the sought after vision but has been mostly missed.

ps Attracting advertising spend requires the advertising portals (sorry, search engines) to harvest ever more granular data about their users in order to 'segment' and then 'target' the adverts accordingly to garner the greatest revenues. They seek a ‘deep truth’ about the citizen based on their behaviour, experiences, beliefs and needs and how to get click-throughs for advertisers.

Sounds suspiciously familiar, no? The only difference is the absence of rights - a consent easily given and hard to wrest back – the government has been notable for its ‘light touch’ regulatory environment with weak regulation, governance and compliance - you would likely be astonished at the permissions you have given the businesses to whom you have in effect licensed yourself. ‘Minority Report’ was an exemplar to Dubya, not the savage warning that Mr Dick intended.

Geo (Digital) Rights Management - love and hate, love to hate?

I would have liked to have gone to last week's OGC event, about which Adena and Ed have recently reported/blogged. Didn't go to Glastonbury or Hyde Park either but Springsteen honouring Strummer worth the nod of the title I think.

As you can imagine, as a geoportal, understanding licensing, sub-licensing and licence management, providing licence advice and so on are key competitive advantages for emapsite. However, we all know that getting to grips with licences is something of an on-going challenge, be they public sector or commercial licences. We like to think we're on top of it and have worked very hard to provide appropriate components to the control module of our 'emapsite inside' web services platform.

Talking over the years with Graham Vowles, who heads up OGC's GeoRM activity, is always interesting if often conceptual, and I do think that the work being done and published to date offers something of a road map for GeoRM implementation, be it by geoweb specialists as part of their own services, by integrators, by technically literate mashers or even as stand alone on demand services in their own right.

Licences are essentially a set of rules - whether one likes them or not, if the licences are drawn up unambiguously (and they often aren't) they will (or should) be robust as far as the licence holder is concerned. The degree of ambiguity determines whether or not the legal 'encoding' can be formally computer 'encoded'. Thus, and as long as all the other licence dependencies (rules) are also captured in the licence then, as these details are effectively metadata, in a perfect world it should not be very difficult to deploy tools/technologies in a licence 'engine'. Such an 'engine' can for example ensure that users understand the implications of agreeing to obtain (and comply with) the rights they seek; even better a licence engine can advise users as to the 'correct' licence for a given scenario.
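To make the 'licences as metadata' point a little more concrete, here is a minimal Python sketch of the advisory side of such an engine. Everything in it is an assumption for illustration: the licence names, the fields (permitted uses, user limit, commercial flag) and the `advise` function are hypothetical and are not taken from any real emapsite, OGC or Ordnance Survey licence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Licence:
    """A licence reduced to a handful of unambiguous rules (hypothetical fields)."""
    name: str
    permitted_uses: frozenset
    max_users: int
    commercial: bool


# Hypothetical catalogue, ordered from least to most permissive.
LICENCES = [
    Licence("single-user internal", frozenset({"view", "print"}), 1, False),
    Licence("multi-user internal", frozenset({"view", "print"}), 25, False),
    Licence("commercial publishing", frozenset({"view", "print", "publish"}), 25, True),
]


def advise(use, users, commercial):
    """Return the first (least permissive) licence that covers the scenario, or None."""
    for licence in LICENCES:
        covers_use = use in licence.permitted_uses
        covers_users = users <= licence.max_users
        covers_commercial = licence.commercial or not commercial
        if covers_use and covers_users and covers_commercial:
            return licence
    return None
```

Once the rules are captured this plainly, advising on the 'correct' licence is just a lookup; the hard part, as ever, is getting the licence drafted unambiguously enough to be encoded this way in the first place.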

With web services (WMS or WFS) then implementation of the agreed licensing can be seen as a metadata conformance tracking (provenance is key for both licensor and licensee) component and can be reported equally unambiguously (enforcement always a dirty word in this arena).

In an other than perfect world, it is where licences are ambiguous or disingenuously linked to other licences that life becomes more difficult, legally and formally.

(Automated) rights management is essentially about ensuring that the mechanism by which a resource is processed into a deliverable (input-action-output) is 'permissible'. If any part of the request is not, then there is no deliverable: by geography, by time, by user, by use or usage, by platform or by some other measure, the request has failed one of a series of comparisons relating to rights (ISO 21000 defines 5 broad sets of rights that such tests would cover) within the agreed licence.
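The series of comparisons described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GeoRM model or its rights expression language: the `Rights` and `Request` structures, their field names and the simple bounding-box test for geography are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Rights:
    """Rights granted under an agreed licence (hypothetical structure)."""
    bbox: tuple  # (min_x, min_y, max_x, max_y)
    valid_from: date
    valid_to: date
    users: frozenset
    uses: frozenset
    platforms: frozenset


@dataclass(frozen=True)
class Request:
    """A single request for a deliverable (hypothetical structure)."""
    x: float
    y: float
    on: date
    user: str
    use: str
    platform: str


def failed_tests(rights, req):
    """Return the names of the comparisons that the request does not pass."""
    min_x, min_y, max_x, max_y = rights.bbox
    checks = {
        "geography": min_x <= req.x <= max_x and min_y <= req.y <= max_y,
        "time": rights.valid_from <= req.on <= rights.valid_to,
        "user": req.user in rights.users,
        "use": req.use in rights.uses,
        "platform": req.platform in rights.platforms,
    }
    return [name for name, ok in checks.items() if not ok]


def permissible(rights, req):
    """A deliverable is produced only if every comparison passes."""
    return not failed_tests(rights, req)
```

Returning the list of failed tests, rather than a bare yes/no, is a deliberate choice in this sketch: it gives licensor and licensee the same unambiguous record of why a request was refused.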

Sounds so simple! Especially as much of this can be built into user profiles associated with the authentication and authorisation integral to enterprise solutions. And GeoRM has a rights expression language (REL) to do this already.

This document can seem intimidating but bears sticking with because, as Adena says, it's not easy but it is not beyond the comprehension of anyone who is truly interested, and I would argue it is essential to anybody who is. Not understanding licensing and the options available for implementation is no excuse for abuse.

The licence holders set the rules; few yet see the licence/rules as metadata, so there is ambiguity in public and private sector licences and it can be difficult to offer any kind of dynamic licensing mechanism. This is a challenge for mid/small scale data provision but as this is the kind where most of the pressure for 'simpler' licensing falls, unambiguous licensing should eventually prevail and would allow licensees to implement along the lines of the GeoRM model.

I think Ed's cart/horse reference must relate to ambiguity in geodata licences undermining the GeoRM model. My own take is that the vital need to formally encode geodata licences that a GeoRM model demands means that those drafting the licences need to ensure incorporation of unambiguous rights within the licence. The same might be said of all those 'catch-all' terms of use licences though! Paid for or otherwise users need to understand their rights - licence engines (in and beyond geo) based on formal encoding and clear language have the capacity to offer much needed clarification of a relationship between licensor and licensee that is oft mired in rhetoric.

Sunday, 28 June 2009

Compromise in the air?

Started off rebuffing belief that shapefile is proprietary but then went a bit deeper into the article and re-read Mr C's speech - then had to filter to end up with something that linked (my) data (awful pun, apologies to Sir TBL); so worth cross-referencing here -

Tuesday, 16 June 2009

"better meta data now" in Digital Britain

Think we might be seeing some joined up thinking in government? Consider the OEP, the still unpublished Trading Fund Review, the OS Revised Strategy, the appointment of web-founder and Linked Data evangelist Sir Tim (as well as Martha LF in another parallel advisory capacity) and the Digital Britain report - consistent references are made (and this is my shorthand interpretation admittedly) to the value of digital content, both in its creation and its distribution and consumption, to UK plc.

As the economic profile of the creative industries rises and advertising revenues and the financial services sector suffer, there is creeping recognition that just because you can (copy and distribute digital content for next to nothing) don't make it right. File sharing and DRM are inevitably at the forefront of this debate for the consumer but, in business, enterprises value their integrity to the point of making such copying or use a dismissable offence (I've seen the noticeboards, believe me, compliance is a competitive advantage).

This has a serious edge because the search engines and ISPs are using tools and technologies that harness both private and corporate information, with and sometimes without your direct consent, combining it with third party data to aid, amongst other things, their advertising services where narrowly targeted highly granular adverts are of highest value. This is creating an opportunity for advert-less solutions based on open source tools. Often start-ups, bedroom coders and the third sector (that includes some so-called not for profits alongside genuine charities) are the key innovators in this area in the search for maximum bang for buck.

Which leads us to public sector information holders, seeing as they collect (sometimes in much the same way as search engines) reams of data about people, places and business that may (or may not) have utility and value to citizens, consumers, corporations and charities. Local authorities, government departments, trading funds, executive agencies, regional assemblies and their agencies, PCTs and SHAs and more, there are 1000s of possible sources of PSI. And for most we know little about what they collect, can discover less about what they publish and can't thus value or use.

Those that we do know about have attracted a great deal of attention over the last 3 years and following the OEP we now have a slightly less muddy idea of where they might be heading - down the principle of the user should pay (and Digital Britain has followed this with what is in effect a £6 levy on fixed lines to bring broadband to as many people as possible over the next decade). These developments are not to everyone's liking and the devil as ever will be in the detail; the reality is that, as the Digital Britain report acknowledges but which so many others have overlooked, developments in technology have already led, and will continue to lead, to falling unit costs for access to, and for content carried over, digital communications networks.

Taking one area as perhaps the most visible and contentious example, technology investment has resulted in a near 40% fall in real terms of the price of the most detailed digital mapping product in the world, called OS MasterMap, over the last 8 years. Further investments will likely see this real terms unit price fall further, in part to defer to (very vocal) critics, in part as government negotiates a better deal for access to such data, in part due to new competition (re UKMap). It is entirely conceivable that the creator of this data (the much pilloried but changing Ordnance Survey) could find itself in a position where it could "give" data to its colleagues across government and charge everyone else and still make a return on capital employed, eliminating the charge of "subsidy" and providing the much needed consistent location database on which UK plc can build (and charge for and pay taxes for) its services.

Already, users from consumers paying £15 or so for a map based planning application for a £40,000 extension to consulting engineers paying £000s for multi-million pound developments see both the necessity of such location information and the incidental nature of the cost of it to the activity in question. The acknowledged thorny issue of 'derived data' is set aside for now!

The current government's message seems to understand this and to promote the idea that digital data and content does have value, that those that collect it should be compensated and that those that use should pay for it. New business models are the order of the day - consider the "all you can eat" subscription deal from Virgin announced today.

The latter (alongside recent changes in PRS terms, for example) aims, amongst other things, to ensure that our creative industries are nourished and sustained, while the former aims to ensure that the digital content backbone upon which UK plc depends is itself guaranteed.

The missing part of the jigsaw as I have commented on other blogs stems not from an appreciation that standards should be used or that data shouldn't be available but rather what that accessibility might mean to its creators and consumers alike. Political parties of all persuasions are attracted by the term "with rights come responsibilities" (or similar) and the same applies to digital content. Recognition and in some instances reward are expected.

But, and here's the rub, unless and until digital content creators either choose (commercially) or are mandated (by executive order) to publish information about what they collect and can distribute and on what terms then we are all, as consumers, blind. Metadata, data about data including the rights and responsibilities associated with its use, is central to the thesis that we are an information economy and it is sorely lacking (as any screen scraping hacker will affirm) from all these reports and from most commentaries on the subject - please no more mention of central repositories, it ain't necessary and will cost a fortune.

So, "better meta data now" has a handy alliterative quality to it and "I commend this phrase as an enabler for a digital Britain", as Ben Bradshaw didn't, sadly, say!

Wednesday, 3 June 2009

To be or not to be a cadastre

Thanks to Bob Barr for his tweet alerting me to Ms Spelman's question and the DEFRA written response.

I had the pleasure of working on a number of land information systems development initiatives in emerging economies back in the 1990s where the establishment of rights in land (and property) was (and remains in some countries we are involved in as I speak) a key plank in the transition from centrally planned to "western" economy. The focus tended to be on the creation of what mainland European counterparts would recognise as a cadastre - land demarcation and subsequent registration - recording, protecting and securing rights in land and providing a stepping stone for investment, entrepreneurial activity, sustainable land use practices, improved yields and farm-gate incomes and a market in land and property.

To this day I find it ironic that it was often British organisations with British personnel who were widely regarded as having world leading expertise in land registration in particular but also in LIS and boundary dispute resolution, despite the fact that the United Kingdom does not have its own cadastre.

The 2007 eurographics survey report kind of ducks this fact - to be fair I think their approach to the 5 baseline parameters mostly works - with England, Wales and Scotland being notable for their differences from other Member States in key 'areas' - pardon the pun (if you read the report you'll get it!).

In this context Ms Spelman's question looks a tad better conceived than at first sight - after all HMLR does use OS data to underpin its own transition to e-business. And the answer from DEFRA is technically correct - until these definitions are in place (and one can conjecture as to the lobbying going on in Brussels) any affirmation as to what data will be covered by the transposition and thus the organisations responsible would be less than judicious.

One final thought - to what extent are land parcels a necessary element of a European wide spatial data infrastructure whose initial motivation was for improved cooperation in environmental issues?

Monday, 1 June 2009

Malaria anyone? #2

Prescient or what - malaria is the coming story after all - swine flu "pandemic" dissipates in the heat of election expenses while the malarial parasite is revealed as becoming immune to one of the major prophylactics. And, one can only assume that because this was broken on BBC, ITN News for whatever reason chose to ignore it (this was last week in France and this was what was on!) - more brucisation of news, only from the opposition - very disappointing.

The time has (finally) come

It always happens - the gestation period, the testing, the iteration, the collateral...and finally, the release - it all takes longer than you think!

emapsite has been a 'dynamic' site, adding new content and functionality and altering usability to reflect new norms, embrace emerging standards, assimilate feedback and so on. As such we haven't majored on new "releases" of the website.

However, on this occasion I do think that something more is merited as the site update includes a re-branding of the business, enhanced usability and a little lifting of the lid of our services platform - at this juncture in marketing and communications terms only but if you are interested to learn more you can....

So, if you are interested in or use digital mapping or if you are a user of broader digital geographic content or if you wish to embed location content within your business there are even more reasons to visit

Absent friend

Just back from a fantastic 9 days away in La Vendee. Some who know me will know that good friend, co-boat owner, man who got things done and enjoyer of life Marcus passed away very suddenly in January leaving a vast hole in the lives of his family. The same friends with whom we have just had our holiday in the sun, sand and swimming pool (well the sea was a little cold still!) - I am not alone in missing a friend but I really really don't know how they do it - much loved, sorely missed.