Tonight I have an immense desire to blog. I have not blogged for months, but today some information that reached me via my Yahoo Pipes and my Twitter contacts pushed me to write. In fact, what I want to share now is something I have had in mind for quite some time. I was almost done when I lost my notes in Angkor Wat at the end of September. I want to blog about the Open Data movement and its implications for a knowledge economy.
Defining the Open Data Movement
What is it?
Open Data is a movement whereby organisations release data that they were collecting and processing for their own operational purposes, and make it publicly and freely available as datasets for access, re-use and re-mix. To do so, organisations have to locate, identify and format the data and build the online storage facilities that let people access it. As usual, the web is a powerful platform. Open Data sits on top of two related concepts: semantic computing and linked data. Semantic computing, to put it simply, is a set of technologies and standards that makes information understood by humans understandable by computers. Linked Data is a term used to describe a recommended best practice for exposing, sharing and connecting pieces of data, information and knowledge on the Semantic Web using URIs and RDF.
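To make the "URIs and RDF" part a little more concrete, here is a minimal sketch in plain Python of what a linked data statement looks like: a subject, a predicate and an object, each named by a URI (or, for the object, a plain text value), written out in the N-Triples text format. The `http://example.org/dataset/` namespace, the `Triple` class and the `to_ntriples` helper are my own assumptions for illustration; a real publisher would use its own URIs and most likely a dedicated RDF library.

```python
from typing import NamedTuple

# Hypothetical namespace for a publisher's own resources (an assumption).
EX = "http://example.org/dataset/"
# Two widely used vocabulary terms from the W3C RDFS and OWL vocabularies.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str
    obj_is_uri: bool = True  # False means the object is a plain text literal


def to_ntriples(triples):
    """Serialise the triples as N-Triples text, one statement per line."""
    lines = []
    for t in triples:
        if t.obj_is_uri:
            obj = f"<{t.obj}>"
        else:
            # Escape backslashes and double quotes inside the literal.
            escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines)


triples = [
    # A human-readable name for the resource.
    Triple(EX + "city/vancouver", RDFS_LABEL, "Vancouver", obj_is_uri=False),
    # The "link" in linked data: our resource points at someone else's URI.
    Triple(EX + "city/vancouver", OWL_SAME_AS,
           "http://dbpedia.org/resource/Vancouver"),
]

print(to_ntriples(triples))
```

The second statement is the interesting one: by pointing at a URI maintained by someone else, a dataset becomes connected to everything already said about that resource elsewhere on the web.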
Property One: A field under construction
For most people, Open Data is related to Linked Data and Sir Tim Berners-Lee’s advocacy work (see his paper here and his TED talk there). That is the visible tip of the iceberg. Many people have been working on related matters that make Open Data possible today. This is particularly true of metadata specialists and groups, who defined standards and best practices. All in all, they progressively built a full layered stack. What we now see emerging are initiatives that make datasets available along with some services derived from those datasets. This is the moment when services blossom and make the whole infrastructure work visible and relevant.
Property Two: One step further than Creative Commons
Creative Commons approaches copyright from the opposite angle to traditionally accepted copyright policies: instead of granting the author the maximum rights by default, it grants the minimum. Open Data goes one step further, as authors or owners entirely and unilaterally release their rights on the data.
Property Three: A State / Public Service Leadership
For ages, the Public Service has been criticised or mocked by everyone while private organisations have been praised for their efficiency, to the point that the private sector became a model for the public one. One reason for that is the Chicago Boys, who influenced the Reagan Administration and progressively the rest of the world … until the collapse of the model and its implementation in 2008. Add to this the long-standing and massive dissatisfaction of the workforce (which requires no PhD in Management to grasp, just some sensitivity) and you get some reasons why Corporate America isn’t trendy, and neither are its “sister Corps”.
But beneath the ideologically influenced perception and judgement call, the evolution of the wild wild web illustrates pretty well how “Corporate” has become irrelevant over the last 15 years. The rise and shine of Web 2.0 / Social Media and its correlated Enterprise 2.0 / Enterprise Social Computing is a sterling example. First, corporations are struggling with copyright issues, as a lot of them try to make money out of copyright. Second, organisations are struggling with command and control issues. Here, Government and other public bodies are no better off than private organisations. Yet, because they have only an indirect commercial relation (i.e. taxes) with the general public, as opposed to many private organisations (except those fully financed through advertising), they have developed more social computing services. The current rise of Open Data, made possible by the remarkable work of some people on standards and a fair deal of advocacy, follows the same path. Don’t expect corporations that already fought against social computing to understand and adopt Open Data: there is DNA incompatibility there!
Property Four: A Western Approach
The Open Data movement is Western, even if it looks principally Anglo-Saxon, as the Guardian initiative demonstrates. This is no surprise. The Germans are a leading nation in Open Source. The French are a leading nation in Free Speech (via blogs and now social networks). The rest are less visible, as there are some language barriers and because international media networks are Anglo-Saxon.
Westerners are initiating the whole thing because they have a culture of accountability, particularly the Anglo-Saxons and the Nordics. People in command are legitimate as long as they demonstrate that they deliver value, and people demand proof. Other cultures may follow, but they are not initiating the movement. This is mostly cultural. The Web is and remains profoundly rooted in European humanistic culture. This means that, explicitly or not, Westerners are pushing their model a bit further and by doing so have an opportunity to maintain cultural, economic and strategic leadership. This is precisely what is behind the Google war between the US and China (see a very ideologically loaded view here).
Yet, don’t expect all Westerners to embrace Open Data at the same time. As the recent Google vs. BNF case over digitisation demonstrates, French people have a particular relation to data and information. The French are probably better equipped intellectually to understand the notion of Digital Commons (see below) that lies behind the Open Data movement, as there is a lasting tradition of centralism and interventionism. However, we French tend to preserve our heritage for the future rather than take full advantage of it now. France is principally a nation of peasants and the United Kingdom one of merchants, and this has a massive impact on how we see and interact with the world, including its digital portion. The European Commission is helping address the topic with Directive 2003/98/EC.
Property Five: A sensible definition of “sensitive”
Open Data definitely raises issues and concerns over privacy and security. In the current international environment, where terrorist organisations happen to have as much information technology and systems skill as any other organisation, and where China finally speaks on the diplomatic scene with a strength that matches its importance, the release of data can be a risk.
The Open Data movement is definitely composed of people who have a very liberal, if not libertarian, idea of data management. As usual, those people are activists, easily identifiable … and a minority. This is where the UK example is a true case study: unlike the US and other initiatives, the UK has released military-related information. This means that they have come to a balanced and sensible definition of what is “secret”, most probably by distinguishing what truly jeopardises national security from what does not. It is precisely this definition of “sensitivity” that helps them take a clear lead.
Property Six: Riding the Crowdsourcing wave
As some social computing initiatives (Wikipedia and InnoCentive, to mention the most visible) as well as books (one example here) have illustrated, it is possible to engage people to do things that one considers too low in added value to allocate resources to, or that require skills one is short of. In the case of open data, one important trigger is that information is exploding to the point where it is too expensive to store, process, make sense of … and justify to taxpayers.
Making sense of the Open Data Movement
Opportunity One: Creating true Digital Commons
First, private organisations invested massively in telecom and information technologies to create the infrastructure, to the point that investors got scared and stopped investing … leading to the Internet crash of the early 2000s and to cheap broadband. Second, users started generating content via cheap tools such as blogs and, later, social networking platforms. Third, Governments, i.e. entities that own data and define laws, but also several media entities such as the BBC, the Guardian and the New York Times, which massively produce content they own, are making datasets publicly available. Data helps build information, which helps people create knowledge. This means that this third moment of the Internet allows the emergence of true digital commons. For the sake of definition, a commons is a pool of resources that is public and can be used by anyone to derive by-products. The notion of commons recently gained new momentum with Elinor Ostrom’s Nobel Prize.
Opportunity Two: Developing new grounds for learning, innovating and wealth creation
By making datasets publicly and freely available, Open Data makes innovation possible, but also learning opportunities and therefore wealth creation. The general public, and students in particular, have a playground to test ideas, put into practice what they learned at university, or benefit from the work of others to discover, access and make sense of information. This helps build a better-skilled society that can compete in a global competitive economy, which is a traditional role of Governments via fiscal incentives, infrastructure and education policies. One just needs to look at what is already available out there (one example here with Freebase) to anticipate a dramatic surge, in both quantity and quality, of information and therefore knowledge. Examples of private platforms such as Salesforce, the iPhone, Android and now Kindle also illustrate what one can anticipate. In fact, the United Kingdom Report on the Re-use of Public Sector Information 2009 (OPSI) shows on page 54 that 34% of Click-Use Licence re-users are already private individuals.
Now, from the angle of helping developing countries move to a different stage of development, this is also a meaningful policy. It requires no particular investment, yet people in developing countries can take advantage of available resources to learn and to build services too. For instance, resources made available via Open Data / Linked Data / Semantic Web technologies by the BBC, the Guardian, Wikipedia (DBpedia) or the Library of Congress can help a local library offer its public a whole lot of resources at a fraction of the cost. What would be required in this situation is a clear decision by managers and a group of developers who are familiar with the standards and the available resources.
Opportunity Three: Tapping into the worldwide dispersed brains to serve the local community
By adopting Open Data, Governments are facilitating and supporting initiatives that serve the local community. Doing so acknowledges that civil servants are not the only ones able to deliver services that are meaningful to the public, and that they don’t necessarily have all the skills to do so. It also acknowledges that contributions are not geographically limited and cannot be secured through selective immigration policies alone, as not all contributors can or want to relocate. Let’s be honest: even if outsiders can tap into and consume available data, most of the consumption will be local. Take the example of Vancouver. The data released primarily makes sense to people who live in or visit Vancouver. Other people can use the datasets to learn and, by doing so, develop new services, but the only thing they would gain is technical knowledge and concrete examples of their ability. They will not take direct advantage of the service from a consumer perspective.
Opportunity Four: Inventing new interactions to strengthen Democracy
As previously mentioned, the Open Data movement is grounded in a very specific Western ideological / philosophical environment: accountability of people in command. The datasets that are made publicly and freely available are often the data used for policy making by representatives and civil servants. This means that anyone, from Joe Six-Pack to lobbies, can access the data, review it, remix it, make sense of it and come up with questions and alternatives. Overall, this ability to question decision makers on an equal footing is considered beneficial for Democracy. A relevant conversation can happen and lead to a more astute and better decision, in the interest of all. The same applies in election periods, when citizens can be in a better position to evaluate what has concretely been done and how relevant the proposals are, beyond what political communication and marketing offer. The fundamental democratic logic of checks and balances is possibly reinforced by Open Data.
I have tentatively demonstrated what’s behind Open Data and why it matters. Obviously, no one will escape the necessity to address the digital divide, to build robust and widespread information literacy and to foster the ability to make sense of and critically evaluate information. Nothing new here, it’s a perpetual work-in-progress, just with more urgency! Nor will it allow Governments to escape working towards balanced finances, eco-friendly development and social fairness policies. Similarly, close attention must be paid to standards as, even if they have crystallised over the last 18 months, nothing is set. Some big Internet players may wish to develop their own and kill or cannibalise the work of a whole community. Future evolutions of the Internet may jeopardise certain bricks of the semantic stack. As a related example, the rise of e-readers is putting a lot of e-resources at risk because these devices do not support certain standards. However, the trend is very positive and a true contribution to the reinforcement of a knowledge economy and of knowledge-management related strategies and initiatives.