Open strategy

December 3, 2009

So, the Conservative Party have leaked a leaked copy of the draft Government IT Strategy! I’d been privy to an early draft through the Local CIO Council and hadn’t really thought anything in it was worth shouting about. In fact, I’m not really sure that another government would do anything much different, apart from branding and terminology. Whilst I am a strong believer in “co-production”, and my dissertation relates to it, I’m not a believer in crowdsourcing per se, which is apparently the Conservative Party rationale for the leakage; it’s a bit like mob rule or, even worse, minority rule or oligarchy. I had wondered if it had been a deliberate leak on John Suffolk’s part, but I gather from the Cabinet Office that this was not the case; however, they do insist it was an early draft and that the feedback will be very useful!

This leakage is the latest version of a Conservative ICT non-strategy, and is no little way from their earlier rallying cries around “open source”, “open gov” and “open data”. On the W3C group on e-government, someone recently posted a list of alternative “definitions” for such data, and here they are, with credit to Winchel “Todd” Vincent III of <xmlLegal> (http://www.xmllegal.org/). This may be developed as part of the group’s work, but the original is his.

“Unavailable: You simply cannot get the data.  Data is cost prohibitive to publish. There may be security or privacy reasons not to publish.  Or, simply, no one ever thought to publish the data.

Not Translated: Data is available, but exists in a different language than the end user’s language.

Paper: Data is available, but it is only available on paper.

Free: Data is available at no cost and without restrictions.

Fee Based: Data is available, but only for a fee.
— Public: Fee Based: Government provides data for a fee.
— Private: Fee Based: Private company provides data for a fee.

Copyright: Data is available (in some way) but there are copyright restrictions on republication or reuse.

Copyright with License: Data is available (in some way), there is a copyright, but also a license that allows some use (other than all rights reserved).

Public Domain: Data is available (in some way) and is in the public domain, so there are no restrictions on use of the data.

Electronic: Data is available electronically.

Electronic: Web Browser or Paper-Like Electronic Document Format: Data is available but only via a web browser or an electronic document format and not in an easily parsed format (where Images/Graphics, HTML, XHTML, PDF, Word, and Word Perfect do not count as easily parsed formats).

Electronic: Structured Format: Data is available electronically and in a structured format.  A structured format would include delimited text, spreadsheet, XML, and the like.

Electronic: Structured Format: Schema: Data is available electronically and in a structured format.  Additionally, there is a schema available that defines the structured format.

— Government Schema: A government promulgates the schema. The schema may or may not be in the public domain.

— Standards Body Schema: A recognized standards body promulgates the schema.  Schema is licensed under a “copyleft” (perpetual, free, but with restrictions not to modify) or similar license (typical of W3C, OASIS, but not all “recognized” standards bodies).

— Private Schema:  A private company promulgates the schema.  The schema may or may not have licensing restrictions associated with it.

Electronic: Browser/Viewer: Electronic data, whether structured or not, is available only via a web browser or other viewer for viewing.

Electronic: Download: Electronic data, whether structured or not, is available to download.  Here, download means a “manual” download. Some manual user input must be done to download the data (e.g., downloading a spreadsheet or structured text file via an HTTP link or FTP) to the user’s local machine.

Electronic: Web Service: Electronic data, typically structured, is available via a web service (meant in a generic way, not specific to a technology) for machine consumption.  There is some standard, specification, or documented publication rules, such that machines can reliably access the data on an ongoing basis.  The point here is not the format of the data, but the reliability and availability of the connection to the data, so that machines can get to the data feed without human intervention.

Each of these qualities makes the data more or less “open” or “accessible” as a practical matter.  There are  many combinations of these that one could put together.”
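To make the point about combinations concrete, here is a rough sketch of how a subset of those qualities might be recorded for a single dataset. The class names, field names and the example dataset are my own assumptions for illustration, not part of Todd Vincent's list or of any W3C work, and the sketch leaves out the “Unavailable” and “Not Translated” cases.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


# Hypothetical groupings of the quoted qualities; the names are illustrative only.
class Cost(Enum):
    FREE = "free"
    PUBLIC_FEE = "fee based (public)"
    PRIVATE_FEE = "fee based (private)"


class Rights(Enum):
    COPYRIGHT = "copyright"
    COPYRIGHT_WITH_LICENSE = "copyright with license"
    PUBLIC_DOMAIN = "public domain"


class Format(Enum):
    PAPER = "paper"
    PAPER_LIKE_ELECTRONIC = "web browser or paper-like electronic document"
    STRUCTURED = "structured format"
    STRUCTURED_WITH_SCHEMA = "structured format with schema"


class Access(Enum):
    VIEWER_ONLY = "browser/viewer"
    DOWNLOAD = "manual download"
    WEB_SERVICE = "web service"


@dataclass(frozen=True)
class DatasetProfile:
    """One dataset described along four of the quoted dimensions."""
    name: str
    cost: Cost
    rights: Rights
    fmt: Format
    access: Access

    def easily_parsed(self) -> bool:
        # The quoted list treats only structured formats as easily parsed.
        return self.fmt in (Format.STRUCTURED, Format.STRUCTURED_WITH_SCHEMA)


# A made-up dataset: structured and downloadable, but fee based and copyrighted.
example = DatasetProfile(
    name="example-dataset",
    cost=Cost.PUBLIC_FEE,
    rights=Rights.COPYRIGHT,
    fmt=Format.STRUCTURED,
    access=Access.DOWNLOAD,
)

# Naive upper bound on combinations; some (e.g. paper via web service) make no sense.
naive_combinations = len(list(product(Cost, Rights, Format, Access)))
```

Even with only four dimensions the naive count runs to over a hundred, so a single label such as “open” says very little about how usable a given dataset actually is.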

If anybody in the UK wants to remember the recent history of the National Land and Property Gazetteer (NLPG), they’ll remember the local property data, expensively gathered, with great effort spent cleansing it. The authorities who have spent large sums of money are now likely to find this being given away. There is a current effort to match this data with that from the Electoral Register; this is the Coordinated Online Register of Electors (CORE) project. One of the issues around property data in recent times has been resistance from the Royal Mail, which produces the Postcode Address File (PAF), to allowing any fee-free use of it. So local authorities are expected to give away data that has been expensively cleansed in order that private organizations may profit, if that is the Conservative plan: to see it given away like North Sea Oil, public transport, British Gas, etc. The comment is that public money paid for it, so the public should have it, but what if they have to pay twice?
