The unaddressed portion of open data: social

After Open Source, Open Data is trendy. Governments are revamping their public sector information policies, robust and widely accepted data standards are emerging, and results are blossoming.
As I said in a previous post, the public sector, not the private sector, is at the forefront. The two major reasons are financial – it is an opportunity to derive more revenue from PSI – and ideological – governments in democracies come from and work for the people, so they should be as transparent as they can.

A review of the initiatives, particularly the UK's, which is the most visible and probably the most systematic, shows that the work so far has focused on:

  • Re-evaluating charging policies
  • Re-viewing copyrights
  • Favoring the standardization of data via a single format, at least per type of content (text, image, video …)

Let’s review them in detail.

The re-evaluation of charging policies

It is the result of balancing 1) incentivising clients to consume more data and 2) respecting the cost of production as well as the competitive landscape. Between free of charge, marginal cost, cost recovery and full cost, the UK opted for the marginal-cost approach, following a recommendation by Cambridge experts. However, not everyone is on the same page despite strong incentives from the European Commission. For instance, some voices in France advocate cost recovery or full cost, while the UK introduced earlier this year a process that opens the door to exceptions.


The review of copyright
The review is meant to make copyright simpler to understand and data easier to access. One consequence, for instance, is that *by default* public sector information is public and not restricted. As restriction becomes the exception, people are encouraged to look at available data, since they have a higher chance of effectively using them. The UK recently released the OS OpenData licence.


The standardization of data formats
The adoption and promotion of certain standards, particularly RDF, facilitates the use and circulation of data. The required skills and tools are well identified, and there is no hurdle to translating between formats, so exchange is faster and of higher quality. Most importantly, these formats help Internet services, particularly search engines, to reference, display, re-use and remix the data and combine it with existing data. The World Wide Web is confirmed in its central position as a digital commons.
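Once data is expressed in a standard triple model, translating between serialisations is indeed mechanical. The sketch below illustrates this with plain Python and purely illustrative URIs; a real project would of course use a dedicated library such as rdflib.

```python
import json

# A tiny dataset expressed as RDF-style (subject, predicate, object) triples.
# The URIs are illustrative, not real government identifiers.
triples = [
    ("http://example.org/dataset/42",
     "http://purl.org/dc/terms/title", "Road traffic counts"),
    ("http://example.org/dataset/42",
     "http://purl.org/dc/terms/publisher", "Department for Transport"),
]

def to_ntriples(triples):
    """Serialise the triples to N-Triples, one statement per line."""
    return "\n".join('<%s> <%s> "%s" .' % t for t in triples)

def to_json(triples):
    """Serialise the same triples as a subject-keyed JSON document."""
    doc = {}
    for s, p, o in triples:
        doc.setdefault(s, {})[p] = o
    return json.dumps(doc, indent=2)

print(to_ntriples(triples))
print(to_json(triples))
```

The point is that the two serialisers share one data model: adding a third output format touches nothing upstream.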

The construction of a Market of Data

Those three concurrent initiatives are forming a new market. What is happening behind open data and the reform of public sector information is the creation of a market of data, with a global reach and centralised venues, the websites. Market is to be understood here in the pure classical sense of political economy.
- The re-evaluation of charging policies is the economic side of market creation. The purpose is to provide pricing transparency so that transactions are fairer and transaction costs are reduced.
- The review of copyright is the legal side of market creation. The purpose is to set a common framework within which transactions can happen securely.
- The adoption and promotion of data standards is the technical side of market creation, like weights and measures. The purpose is to set a common framework so that people can better compare items.

What we see forming now is similar to what the Dutch and the British (and later the rest of Europe) witnessed four centuries ago for physical goods. This systematic approach is very positive; the construction of the European single market is a good illustration of what it can achieve. In fact, the US experience is even more telling: several US agencies were already releasing large amounts of data freely, under user-friendly copyright terms, more than ten years ago. However, the absence of political momentum and of a coordinated, systematic approach kept those initiatives invisible.

With the current setting, what the UK OPSI report shows is effective growth in the volumes and revenues of the Click-Use Licence market, as well as the arrival of individuals. The growth in volumes is the result of increased delivery on the supply side, while the growth in revenues is the result of new players on the demand side. These new players are mostly individuals, and they account for one third of the consumer base. From there, one can assume that the work done concurrently at the economic, legal and technical levels is effectively lowering the barriers to entry. Rather than creating a market, these initiatives are re-creating a market by lowering its barriers.

License re-users by categories

source: The United Kingdom Report on the Re-use of Public Sector Information 2009, page 54, Crown copyright.

Beyond building the market, structuring it

Now, that being said, the emergence of such Lilliputian actors is changing the deal in that market. Before, the market was fairly oligopolistic, given the limited transparency at the economic and technical levels and the greater complexity at the legal level. Only big organisations could afford to be on that market: they have the structures, skills, processes and budgets to deal with it routinely. All the transaction costs are internalised (Coase). These new players, by contrast, have no such back-office organisation.

So, how do we deal with that? How do we secure their presence over the long term, so that the market does not revert to its previous oligopolistic state and the expected benefits fail to show up? As far as I can see, nobody is looking at the matter right now. Everyone is busy creating the market of data, and no one really pays attention to structuring that market to make sure it is fair, fluid and conducive to wealth creation over the long term.

That is where knowledge management can play a role. By knowledge management, I do not refer to the popular and limited understanding that confuses managing knowledge with managing documents. Documents are containers of knowledge, not knowledge. I refer to managing insights and experiences, which is relational and people-oriented.

The web has demonstrated that it was possible to build powerful, yet user-friendly, knowledge sharing platforms. The mechanisms of knowledge sharing can be used to structure the market of data. This would help individuals connect to data automatically, saving them the hassle of looking for the data. This would help individuals find similar or complementary profiles so that they can combine forces and grow to form real organizations. Implementing a platform like this would help individuals, by reducing quite a number of transaction costs. It is a way to structure the market to contribute to fairness, fluidity and wealth creation.
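As a rough illustration of the kind of matching such a platform could perform, here is a minimal sketch. It assumes re-user profiles are described by self-declared interest tags and uses simple set overlap to suggest complementary profiles; all names, tags and the threshold are hypothetical.

```python
def jaccard(a, b):
    """Tag-set overlap: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def suggest_matches(tags, profiles, threshold=0.25):
    """Names of profiles whose tags overlap enough with the given tags."""
    return [name for name, other in profiles.items()
            if jaccard(tags, other) >= threshold]

# Hypothetical re-user profiles, each described by self-declared interest tags.
profiles = {
    "alice": {"transport", "mapping", "rdf"},
    "bob":   {"health", "statistics"},
    "carol": {"mapping", "visualisation", "transport"},
}

print(suggest_matches({"transport", "rdf", "visualisation"}, profiles))
# → ['alice', 'carol']
```

A production platform would go well beyond tag overlap, but even this naive step removes a transaction cost: individuals no longer have to find each other by chance.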

As we can see, there is still some work to be done. The data portal, for instance, has to go far beyond the simplistic, non-connected Drupal profiles it offers now. How the market of data is structured is the key to its sustainability.

Posted in Misc

Government 2.0 and Enterprise 2.0: different environments, very same fundamental issue (culture)

Australia recently surfaced in a big way on the Government 2.0 scene – the public-sector equivalent of Enterprise 2.0. They have produced an exhaustive, state-of-the-art report that is worth reading closely: Engage – Getting on with Government 2.0 – Report of the Government 2.0 Taskforce.


In this document, which I have not quite finished reading, I have noted two consecutive points (page 16):

  • “Web 2.0 provides public servants with unprecedented opportunities to open up government decision making and implementation to contributions from the community. In a professional and respectful manner, public servants should engage in robust policy conversations.
  • Equally, as citizens, APS employees should also embrace the opportunity to add to the mix of opinions contributing to sound, sustainable policies and service delivery approaches.”

These two paragraphs underline a fundamental problem. One can change licensing and charging policies and encourage the deployment of tools, but the cultural aspect remains fundamental. The culture in question is bureaucratic, and bureaucracy is built upon the assumption that people are not reliable. Lack of trust is the root of the over-focus on, and over-confidence in, processes, to the point where people merely facilitate the functioning of a machinery. What needs to be achieved is to put people back in command – literally a revolution – as in a Government 2.0 environment, people would have to:


1 – Engage with “customers” not only to perform their job/task.
In organisations, this means moving the back office to the front office, and it is distinct from quality-management theories, which advocate customer-driven production (type, quantity and quality of outputs). This calls for different soft skills. How will people cope? There are probably lessons to learn from the retail banking industry, which has been through this process recently.


2 – Explain and discuss the frame in which actions and decisions are being taken.
Here again, this calls for different skills. Not only do people have to apply the policy/framework, they also need to understand it, whereas it is very common to meet people doing things just because they have been told to do so. What will surely happen is a move from single-loop learning to double-loop learning (Argyris & Schön). People will finally understand that a rule or process is only a situational arrangement that is efficient for some time. The longer it lasts, the better the policy maker or process designer, no more, no less. We should therefore finally see a culture emerging where people are more responsible, neither bluntly enforcing outdated rules nor bypassing them, but contributing to adapting them. The single and double loops of Argyris & Schön provide a useful conceptual framework to make sense of this. Moisdon et al. (1997) demonstrated that this is already happening in organisations anyway, thanks so far to a few pragmatists or idealists, and that this is precisely how the system survives and eventually thrives.


And for both to happen, there surely is work to be done at the pure management level. How do people take responsibility for their work, in their context and at their level? How is the incentive system being reformed to encourage people to contribute, including senior management (who are both an example and a driver)? How do you get rid of a culture where the boss is right in any case, not because s/he is relevant, but simply because s/he is the boss? How do you get rid of people who make no positive contribution?
My point is that beyond the technically-related changes (licensing, charging policy, data formatting, delivery models and platforms), what needs to be addressed and changed is “culture”.


Don’t expect techniques to facilitate the cultural change; this simply does not happen … even if some 2.0 features help surface relevance as an alternative to authority and provide room for “public” recognition (a simple, efficient and key incentive). That is something everybody involved in Enterprise 2.0 knows, as s/he experiences it daily.

Posted in PostDoc

Open data: money, money, money

Another element worth noting in the “Models of Public Sector Information Provision via Trading Funds” (2008) report is that one of the reasons for open data is public administration reform.

As the report notes, ‘trading funds’ were created in the UK to be financially sound and self-sustainable: they did not have to make money, but they did have to cover their costs. This was a notable mindshift for organisations that had been relying on taxes, with no particular pressure to break even. The very nature of public central administration is processing information. The attention to the ‘knowledge economy’, culminating on political agendas with the Lisbon Summit (2000), created a new set of incentives. Altogether, opening up the information collected and processed by public administration in the course of its duties appeared as an opportunity for trading funds to reach financial sustainability, by creating a whole new set of (by)products to monetise.

Of course, there are many other rationales for data release, such as steering innovation and strengthening democracy. But given the state of public debts, and with a pinch of mauvaise foi, innovation and democracy almost look like good excuses, and the great “Linked Data” vision of Sir Tim single-handedly shaping the internet and the digital future of civilisation can be seen as simply solving a problem, right on time*. The European Commission took responsibility for standardising initiatives throughout the Union, as part of its mission of removing internal barriers. This eventually led to concerted initiatives, pushing many countries to look into the issue and thereby creating the basis for a large “market”.

* just kidding here as the contribution is both massive and inspirational, driving contributions from others.

Posted in Misc, PostDoc

Open data: raw and value-added information

Recently the city of Rennes announced some open data initiatives. While browsing around, I discovered some useful resources, particularly the substantial “Models of Public Sector Information Provision via Trading Funds” (2008) report by Prof. Bently, Prof. Newbery and R. Pollock (all from Cambridge University).

This report happens to be extremely useful for making sense of open government and open data initiatives. Precisely like the “knowledge economy”, “open data” is a multi-faceted emerging reality that requires different perspectives to be grasped. The Cambridge report is spot on, providing a rationale for monetisation and economic optimisation.
As I am reading it, I might share various insights on my blog, starting with that particular post.

Citing the “Charges for Information: When and How” (Treasury, 2001) report, the authors distinguish between raw and refined information.
Raw information is “information collected, created, or commissioned within Government which is central to Government’s core responsibilities. The supply of selected components of a raw data package, exactly as in the package is raw data supply, but the supply with further analysis, summarisation etc, or of data at a different level of aggregation to that used by Government, is not raw data for the purposes of this report but is value-added information”.

Value Added information is “information where value is added to raw data enhancing and facilitating its use and effectiveness for the user, for example through further manipulation, compilation and summarisation into a more convenient form for the end-user, editing and/or further analysis and interpretation, or commentary beyond that required for policy formulation by the relevant government department with policy responsibility. It also includes supplying retrieval software, or where work on material is included as part of the compilation of related data, and where there is not necessarily a statutory or operational requirement for Government to produce the material.”

In the light of those definitions, most of the information made available within the frame of open data initiatives is refined information. A more literal reading might lead to a different opinion, since what is released is data (and not information), in formats made to be understood by computers, not humans. In fact, it is precisely this last property that qualifies the content as “refined”.
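The Treasury distinction can be illustrated with a toy example, using made-up traffic counts: supplying the records exactly as collected is raw data supply, while aggregating them to a level other than the one Government uses is a value-added step.

```python
from collections import defaultdict

# Raw data: records exactly as collected (the values are made up).
raw = [
    {"station": "A", "month": "2009-01", "vehicles": 1200},
    {"station": "A", "month": "2009-02", "vehicles": 1350},
    {"station": "B", "month": "2009-01", "vehicles": 800},
]

def summarise_by_station(records):
    """Value-added step: aggregate counts at a different level of aggregation."""
    totals = defaultdict(int)
    for r in records:
        totals[r["station"]] += r["vehicles"]
    return dict(totals)

print(summarise_by_station(raw))  # → {'A': 2550, 'B': 800}
```

Releasing `raw` as-is would be raw data supply under the report's definition; releasing only the per-station totals would already be value-added information.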

From a Digital Library angle (my current occupation), I can share that most of the work that happens inside is about refining information.

Posted in Misc, PostDoc

Making Sense of the Open Data Movement for a Knowledge Society

Tonight I have an immense desire to blog. It has been months since I last blogged, but today some information that reached me via my Yahoo Pipes and Twitter folks pushed me to write. In fact, what I want to share now is something I have had in mind for quite some time already; I was almost done when I lost my notes at Angkor Wat at the end of September. I want to blog about the Open Data movement and its implications for a knowledge economy.


Defining the Open Data Movement


What is it?

Open Data is a movement whereby organisations take data that they were collecting and processing for their own operational purposes and make them publicly and freely available as datasets, for access, re-use and re-mixing. To do so, organisations have to locate, identify and format the data and build online storage facilities for people to access them. As usual, the web is a powerful platform. Open Data sits on top of two related concepts: semantic computing and linked data. Semantic computing, to put it simply, is a set of technologies and standards that makes information intended for humans understandable by computers. Linked Data is a term used to describe a recommended best practice for exposing, sharing and connecting pieces of data, information and knowledge on the Semantic Web using URIs and RDF.
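The linking idea can be shown in a few lines: when the object of one statement is itself a URI that appears as the subject of another, following URIs walks from resource to resource. This is a minimal sketch with purely illustrative URIs and properties.

```python
# A minimal linked-data graph as (subject, predicate, object) triples.
# All URIs here are illustrative, not real vocabulary terms.
graph = [
    ("http://example.org/city/rennes",
     "http://example.org/prop/inCountry", "http://example.org/country/france"),
    ("http://example.org/country/france",
     "http://example.org/prop/capital", "http://example.org/city/paris"),
]

def follow(graph, subject, predicate):
    """Return the object linked from `subject` via `predicate`, if any."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None

# Walk from one resource to another by dereferencing the intermediate URI.
country = follow(graph, "http://example.org/city/rennes",
                 "http://example.org/prop/inCountry")
print(follow(graph, country, "http://example.org/prop/capital"))
# → http://example.org/city/paris
```

On the real web, the two statements would typically live on different servers; the URIs are what stitch the datasets together.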

Property One: A field under construction

For most people, Open Data is related to Linked Data and Sir Tim Berners-Lee’s advocacy work (see his paper here and his TED talk there). That is the visible tip of the iceberg. Many people have been working on related matters whose results make Open Data possible today. This is particularly the case of metadata specialists and groups, who defined standards and best practices, progressively assembling a full layered stack. What we now see emerging are initiatives that make datasets available along with services derived from those datasets. This is the moment where services blossom and make the whole infrastructure work visible and relevant.


Property Two: One step further Creative Commons

Creative Commons takes the copyright issue from the angle opposite to traditionally accepted copyright policies: instead of granting the maximum rights to the author by default, it grants the minimum rights. Open Data goes one step further, as authors or owners entirely and unilaterally release their rights on the data.


Property Three: A State / Public Service Leadership

Public service has been criticised or mocked for ages while private organisations are praised for efficiency, to the point that the private sector became a model for the public sector. One reason for this is the Chicago Boys, who influenced the Reagan Administration and progressively the rest of the world … until the collapse of the model and its implementation in 2008. Add to this the massive dissatisfaction of the workforce – which requires no PhD in Management to capture, just some sensibility – for quite some time, and you have some reasons to understand why Corporate America isn’t trendy, nor are its “sister Corps”. But beneath the ideologically influenced perceptions and judgement calls, the evolution of the wild wild web illustrates pretty well how “Corporate” has become irrelevant over the last 15 years. The rise and shine of Web 2.0 / Social Media and its corporate counterpart, Enterprise 2.0 / Enterprise Social Computing, is a sterling example. First, corporations struggle with copyright issues, as many of them try to make money out of content. Second, organisations struggle with command-and-control issues. Government bodies and other public bodies are no better off here. Yet, because they have only an indirect commercial relation (i.e. taxes) with the general public – as opposed to many private organisations (except those fully financed through advertising) – they have developed more social computing services. The current rise of Open Data, made possible by the remarkable work of some people on standards and a fair deal of advocacy, follows the same path. Don’t expect corporations that already fought against social computing to understand and adopt Open Data: there is DNA incompatibility there!


Property Four: A Western Approach

The Open Data movement is western, even if it looks principally Anglo-Saxon, as the Guardian initiative demonstrates. This is no surprise. Germans are a leading nation of Open Source; the French are a leading nation of free speech (via blogs and now social networks). The rest are less visible, partly because of language barriers and because the international media networks are Anglo-Saxon.

Westerners are initiating the whole thing because they have a culture of accountability, particularly the Anglo-Saxons and Nordics. People in command are legitimate as long as they demonstrate that they deliver value, and people demand proof. Other cultures may follow, but they are not initiating the movement. This is mostly cultural: the Web is and remains profoundly rooted in European humanistic culture. This means that, explicitly or not, westerners are pushing their model a bit further and, by doing so, have an opportunity to maintain cultural, economic and strategic leadership. This is precisely what lies behind the Google war between the US and China (see a very ideologically loaded view here).

Yet, don’t expect all westerners to embrace Open Data at the same time. As the recent Google vs. BNF case over digitisation demonstrates, the French have a particular relation to data and information. The French are probably better intellectually equipped to understand the notion of Digital Commons (see below) that lies behind the Open Data movement, as there is a lasting tradition of centralism and interventionism. However, we French tend to preserve our patrimony for the future rather than take full advantage of it now. France is principally a nation of peasants, the United Kingdom of merchants, and this has a massive impact on how we see and interact with the world, including its digital portion. The European Commission is helping address the topic with Directive 2003/98/EC.


Property Five: A sensible definition of “sensitive”

Open Data definitely raises issues and concerns over privacy and security. In the current international environment, where terrorist organisations turn out to have as many information technology and systems skills as any other organisation, and where China finally speaks on the diplomatic scene with a strength that matches its importance, the release of data can be a risk.

The Open Data movement is definitely composed of people who have a very liberal, if not libertarian, idea of data management. As usual, those people are activists, easily identifiable … and a minority. This is where the UK example is a true case study: compared with the US and other initiatives, the UK has released military-related information. This means that they have come to a balanced and sensible definition of what is “secret”, most probably aligning with what truly jeopardises national security and what does not. And it is precisely this definition of “sensitivity” that helps them take clear leadership.


Property Six: Riding the Crowdsourcing wave

As some social computing initiatives – say Wikipedia and InnoCentive, to mention the most visible – (as well as books, one example here) have illustrated, it is possible to engage people to do things that one feels are too low value-added to allocate resources to, or that require skills one is short of. In the case of open data, one important trigger is the fact that information is exploding to the point where it is too expensive to store, process, make sense of … and justify to taxpayers.


Making sense of the Open Data Movement


Opportunity One: Creating true Digital Commons

First, private organisations massively invested in telecom and information technologies to create the infrastructure, to the point that investors got afraid and stopped investing … leading to the Internet crash of the early two thousands and to cheap broadband. Second, users started generating content via cheap tools such as blogs and, later, social networking platforms. Third, Governments – i.e. entities that own data and define laws – but also several media entities, such as the BBC, the Guardian and the New York Times, which massively produce content they own, are making datasets publicly available. Data help build information, which helps people create knowledge. This means that this third moment of the Internet allows the emergence of true digital commons. For definition’s sake, a commons is a pool of resources that is public and can be used by anyone to derive by-products. The notion of commons recently got new momentum with Elinor Ostrom’s Nobel Prize.


Opportunity Two: Developing new grounds for learning, innovating and wealth creation

By making datasets publicly and freely available, Open Data makes innovation possible, but also creates learning opportunities and therefore wealth. The general public, and particularly students, have a playground to test ideas, put into practice what they learned at university, or benefit from the work of others to discover, access and make sense of information. This helps build a better-skilled society that can compete in a global economy, which is a traditional role of Governments via fiscal incentives, infrastructure and education policies. One just needs to look at what is already available out there (one example here with Freebase) to anticipate a dramatic surge – both in quantity and quality – of information and therefore knowledge. Examples of private platforms such as Salesforce, the iPhone, Android and now the Kindle also illustrate what one can anticipate. In fact, the United Kingdom Report on the Re-use of Public Sector Information 2009 (OPSI) shows, page 54, that 34% of Click-Use Licence re-users are already private individuals.

From the angle of supporting developing countries to move to a different stage of development, this also is a meaningful policy. There is no particular investment required, and people in developing countries can take advantage of available resources to learn and build services too. For instance, resources made available via Open Data / Linked Data / Semantic Web technologies by the BBC, the Guardian, Wikipedia (DBpedia) or the Library of Congress can help a local library propose to its public a whole lot of resources, at a fraction of the cost. What would be required in this situation is a clear decision by managers and a group of developers who are familiar with the standards and the available resources.
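As a sketch of what such re-use could look like, the snippet below builds (without sending) a request for DBpedia's public SPARQL endpoint. The endpoint URL and the abstract property are real DBpedia identifiers; the choice of resource is just an example, and a real service would add error handling and caching.

```python
from urllib.parse import urlencode

# DBpedia's public SPARQL endpoint (real); the request is built but not sent.
ENDPOINT = "http://dbpedia.org/sparql"

def abstract_query(resource, lang="en"):
    """SPARQL query for the abstract of a DBpedia resource, in one language."""
    q = ("SELECT ?abstract WHERE { "
         "<http://dbpedia.org/resource/%s> "
         "<http://dbpedia.org/ontology/abstract> ?abstract . "
         'FILTER (lang(?abstract) = "%s") }')
    return q % (resource, lang)

# Example: ask for the English abstract of the Rennes resource.
url = ENDPOINT + "?" + urlencode({"query": abstract_query("Rennes"),
                                  "format": "json"})
print(url)
```

Fetching that URL returns JSON a small library's website could display directly, which is the point of the argument: the expensive curation has already been done elsewhere.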


Opportunity Three: Tapping into the worldwide dispersed brains to serve the local community

By adopting Open Data, Governments are facilitating and supporting initiatives that serve the local community. This acknowledges that civil servants are not the only contributors and do not necessarily have all the skills to deliver services that are meaningful to the public. It also acknowledges that contributions are not geographically limited and cannot be handled by selective immigration policies alone, as not all contributors can or want to relocate. Let’s be honest: even if outsiders can tap into and consume available data, most of the consumption will happen locally. Take the example of Vancouver: the data released primarily makes sense to people who live in or visit Vancouver. Others can use the datasets to learn and thereby develop new services, but the only things they would gain are technical knowledge and concrete proof of their ability; they will not take direct advantage of the service from a consumer perspective.


Opportunity Four: Inventing new interactions to strengthen Democracy

As previously mentioned, the Open Data movement is grounded in a very specific western ideological / philosophical environment: the accountability of people in command. The datasets that are publicly and freely made available are often data used for policy making by representatives and civil servants. This means that anyone – from Joe Six-Pack to lobbies – can access the data, review it, remix it, make sense of it and come up with questions and alternatives. Overall, this ability to question decision makers on an equal footing is reputed to benefit democracy. A relevant conversation can happen and facilitate a more astute and better decision, in the interest of all. The same applies in election periods, when citizens can be in a better position to evaluate what has concretely been done and how relevant the proposals are, beyond what political communication and marketing offer. The fundamental democratic logic of checks and balances is possibly reinforced by Open Data.


I have tentatively demonstrated what is behind Open Data and why it matters. Obviously, no one will escape the necessity to sort out the digital divide, to reinforce robust and widespread information literacy, and to facilitate the ability to make sense of and critically evaluate information. Nothing new here; it is a perpetual work in progress, just more urgent now! Nor will it allow Governments to escape working on balanced finances, eco-friendly development and socially fair policies. Similarly, close attention must be paid to standards, as – even if they have crystallised over the last 18 months – nothing is set. Some big Internet players may wish to develop their own and kill or cannibalise the work of a whole community. Future evolutions of the Internet may jeopardise certain bricks of the semantic stack. As a related example, the rise of e-readers is putting a lot of e-resources at risk because these devices do not support certain standards. However, the trend is very positive and a true contribution to the reinforcement of a knowledge economy and of knowledge-management-related strategies and initiatives.

Posted in Misc, PostDoc