Updates

Aid Transparency Barcamp Nepal on August 4th

Aid Transparency Barcamp Nepal, run jointly by YoungInnovation Pvt. Ltd and aidinfo, is a conference to raise awareness of the foreign aid scenario in Nepal. It intends to create a platform to initiate conversations and connections on the effective use of ICTs to support aid transparency and effectiveness. There will be a chance for organisations and individuals to showcase innovative ideas and tools that promote effective accessibility and visualization of aid data. Further, it is hoped that it will create a platform where the best technology products around aid data can be collaborated on, supported, sponsored and promoted. It is also an opportunity to raise awareness of the International Aid Transparency Initiative’s (IATI) standard for publishing aid data.

Featured speakers will include Bibek Raj Kandel, Simon Parrish, Anjesh Tuladhar, Aman Shakya, Bibhusan Bista, Hemanta Sapkota, and Prabhas Pokharel. Sessions will include:

  • Linked data and semantic web technologies for aid transparency
  • Civic engagement: creating a feedback loop on aid effectiveness
  • Community led development projects: Information processes and pitfalls
  • What is IATI format and how it enhances aid effectiveness?
  • Social media for aid awareness
  • Taking aid transparency local: radio and SMS-based transparency opportunities
  • Crowd-sourcing for better data: geo-coding and traceability

This event is targeted at the tech community (programmers, app developers, FOSS enthusiasts, mobile developers), INGOs, aid donor communities, government officials, the media and aid transparency professionals and practitioners.

For more information on this event, please visit the website: http://nepalaid.yipl.com.np/

“Everything I need to know about open data, I learned from open source”

../../../assets/posts/de22a71eac0dc493960f5cd090f09975_MD5.jpg

BoF “Open data in development” at OKCon (via Tobias Eigen)

But what did we learn from open source? Two days of Open Knowledge Conference gave lots of food for thought. And lots of inspiration as well: plenty of projects doing interesting work, and experiences to share. And as the cherry on the cake, we had a great “open lunch for development” with several people active in development aid. My (delayed) take-aways for Open for Change.

From data.gov.* to data.your.org

Nigel Shadbolt and Andrew Stott shared their lessons from setting up data.gov.uk, and Tom Lee talked about data.gov and the recent threat of cuts to its budget.

It’s crucial to have top-down support, bottom-up activists, and middle-tier connectors, to bring everything together.

  • We need to continue nurturing a network of people active in open data for development, to make sure we have the tools, ideas, reports, cases, and standards we need, and to support the early adopters within organisations.
  • We have done some work on an “open data briefing”, based on the OKFN open data manual, and we need to continue work on that: in four pages, explain the why, how, what and who of opening up your organisation.
  • It’s important to understand the decision-making and budgeting processes to make the case at the right time and the right place. We could review the best material on “how to convince your boss to use open source” as a starting point.

There are many reasons to embrace open data: transparency and accountability; economic value, growth and innovation; efficiency and cost-effectiveness; improving (public) services; (public) engagement; and civil society and social capital. Don’t rely on a single one to make your case.

Close the feedback loop: the “build it and they will come” approach won’t work here either. Try to publish data that matters to people, but also consider that “data probably has a long tail”.

  • Releasing data early and incrementally creates a steady news flow, and also enables you to work with feedback and to champion people to create peer pressure on the “refuseniks”.
  • Friedrich Lindenberg tries to turn the lack of opportunity to debate what you see in “Where Does My Money Go” into an opportunity for “Open Spending”. In essence: how do we provide support to create a data cycle instead of a data pipeline?
  • There is a challenge to create or appoint authoritative sources and URIs. In fact, this was a key topic in the online workshop organised by David Pidsley during the Open Data for Development Camp.
  • In a broader sense, the emerging standards and the work done on open data in development cooperation need to settle down in one or more ontologies. This will also pave the way to build more focused data enrichment services that can tap into the existing wealth of documents we have, to help professionals navigate those.

Statutory requirements matter. For governments, this mainly has to do with legal frameworks and obligations. But every organisation could (and should) enshrine crucial elements of open data in their policies: how to ensure “open” stays open, and how to prioritise.

  • The UK government re-used lots of existing policies, such as the National Statistician’s guidelines about reasonable measures to prevent identification of individuals. The open data manual could be a good portal to such policies.
  • Principles and policies should not be set in stone (at least: not early on), to prevent weasel words like “to the extent feasible” from creeping in. The essential lesson: in the first phase, compromise on the data that will be published, but not on the license applied to it. (Again: the paradigm shift in thinking and acting is what matters first.)

Why we want to be open: the fallacy of community collaboration

In open data, a lot can be learned from “open source”. In terms of tools and practices, I think we are learning. But in terms of the stories we tell ourselves and others about why we do it, Benjamin Mako Hill gave some interesting insights into the promise that open source creates better software because more people are able to see, comment on, and improve the code.

In reality, this hardly happens. He showed graphs of projects on SourceForge, the first major hub of open source software. The median number of developers working on a project is one. If you only look at “mature projects” (multiple releases, longer history), the median is still one. If you look at the most popular projects (the 10% most downloaded), the median is two.

In other words: there are very, very few projects where mass collaboration did happen.

And we actually don’t really understand why some of them succeed.

It doesn’t mean we should not do open source, but we should not promote it with the story that it leads to peer review and better code. There are plenty of other reasons, though, and we should make sure we capture those in our Open for Change Manifesto as well:

  • It gives the users freedom: autonomy, control and empowerment. The technology constrains how and what we can communicate as people. Openness allows you to remove barriers.
  • It is resistant to “anti-features” (limitations built in to charge for removal). With an open license, anyone can process a data set to make it more useful for themselves or others.
  • It makes failure cheap: since the investment to be open is low, there already is reward in just making your own solution available.
  • It also makes success cheap: some products fail to find a big enough market to sustain a company producing them, but a community of users can produce and maintain them.
  • It is not dependent on persons or organisations: even if the original producer(s) stop working on it, others can continue and keep it available.
  • It sometimes does lead to mass collaboration. And it then can produce something that would be impossible to organise through traditional means.

Why we want to be open: a stronger vision

In our beta Manifesto, we tried to capture the essence of why we want to be open, and OKCon was a chance to reflect on it.

I liked a definition given by Jose Alonso of the Web Foundation: the web is humanity connected through technology. And as Brewster Kahle of archive.org said: the last generation put a man on the moon. Pretty cool, but our generation can make all knowledge available to all people on earth, for always and for free. That’s a powerful ambition too.

It is crucial to also translate the promise of open data, open access and open knowledge to “effective use”: how do we make sure we create autonomy, control and empowerment, but even more so: security for the ones who want to realise their “right to access”? “Open” is part of a struggle for human rights.

Hopefully, a joint “Slash Open” campaign can unite the efforts of many organisations working for humanity in shaping the technology we need and put it to effective use.

"World Bank Institute: We’re also the data bank"

At the Activate Conference, Aleem Walji of the World Bank Institute gave a brief overview of their first experiences with open data (their data catalogue website gets more visitors than their home page now, and Google translated the top indicators they saw people were searching for into 39 languages), and how they hope to connect their aggregate data with the detailed service delivery data we can now collect and make available, to build a “Yelp for Development”.

Open for discussion: the "Open for Change" Manifesto

Ever since we started networking as “development 2.0 pioneers”, we wanted to express our core values, so we can grow our network into a movement.

In January we adopted the name “Open for Change”, and worked to organise the world’s first “Open Data for Development Camp” (ODDC) in May, to bring together the people in that movement.

Based on the conversations we had with many of you (before, during and after ODDC), and inspired by other manifestos, we have now created a beta version of the Manifesto for Open for Change, and would love to hear your feedback.

The Manifesto is the foundation for more plans and activities to grow our global movement. Schematically, it could look like this:

../../../assets/posts/ecc185b1d82babb476abb813afdd9442_MD5.png

We see “Open for Change” as an open source brand and as an international ecosystem, based on the Manifesto and with a light-weight organisational and technical structure to connect various projects and activities:

Most of these are in their conception phase; some are already starting to take shape, with people working on them.

Our focus now is to get feedback on the Manifesto and to hear your thoughts on the “open source brand” and the organisational form it should take.

Click here to go to the Manifesto and give your feedback! Leave a comment, and forward this message to people you think should be involved! Thanks!

Getting my GSM modem working under Ubuntu

../../../assets/posts/ef4b092c09313cac69cca10fc5ce9eae_MD5.png

Another “hack post”, to capture how I got mobile broadband working on my Sony laptop. Sony makes laptops with cutting-edge features (small, solid-state disk, full HD screen) and a stylish look, but doesn’t like to help you take full advantage of them unless you’re on Windows. Undocumented tweaks to the hardware, hard-to-find technical information, and so on.

I bought a Sony VPCZ1 (to be more precise: VPCZ13C5E) with a WWAN module installed, with the idea that I could be online anywhere, without any dongles sticking out, or having to connect by tethering it to my phone over Bluetooth or USB. I’ll pay the extra fee… provided it works.

I have installed Ubuntu 10.10 (which was an effort in itself), and went on an excursion to get mobile broadband running.

First, let’s find out what the hardware is: it’s not listed on Sony’s specs page, but according to the specs of a similar model, it’s a Qualcomm Gobi 2000 (PDF link). It apparently needs to be loaded with firmware before it will operate. I had to install the Ubuntu package gobi-loader as a starting point.

sudo apt-get install gobi-loader

However, that package does not provide the actual firmware. It expects the firmware in /lib/firmware/gobi, but that directory doesn’t even exist.

The discussion about a Qualcomm problem on Launchpad made me look at a way to get the firmware from the Microsoft-based Qualcomm Gobi2000 (WWAN) Driver 1.1.80 that Sony provides. After unzipping that file, we have a directory with a file GobiInstaller.msi… the firmware is somewhere in there.

After installing either p7zip-full or cabextract, we can extract the contents of the .msi file, to end up with a long list of cryptically named files.

Thanks to a post by Madox, and the discussion after it, I saw what to look for:

/lib/firmware/gobi$ ls -l
total 13888
-rwxr-xr-x 1 root root 11096116 2009-12-11 21:10 amss.mbn
-rwxr-xr-x 1 root root 3104812 2009-12-11 21:10 apps.mbn
-rwxr-xr-x 1 root root 9284 2009-12-11 21:10 UQCN.mbn

As it happens, there is just one file with length 11096116, just one with length 3104812, and 18 with length 9284 bytes. The first two are easy; the UQCN.mbn file contains the specific setup for a region or provider.

When using a hex editor (like hte in the package ht) to inspect the various variants of 9284 byte files, there is a string at the end revealing a bit what it is intended for: umts_gen, umts_orange_nogps, umts_tmo_noxtra and so on. See below for a table of strings, file names, and md5 checksums in my setup.
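The same inspection can be scripted instead of done by hand in a hex editor. What follows is only a rough sketch: it assumes the extracted files sit together in one directory, and that the descriptive string is simply the last run of printable ASCII characters in each 9284-byte file (the post does not document the file layout, so that is a guess). The function names are made up for this illustration.

```python
import hashlib
import re
from pathlib import Path

UQCN_SIZE = 9284  # size in bytes of the UQCN.mbn candidates, taken from the listing above

# Assumption: the descriptive tag is a run of at least four printable ASCII bytes.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def last_ascii_string(data):
    """Return the last run of printable ASCII characters in data, or '' if there is none."""
    runs = PRINTABLE_RUN.findall(data)
    return runs[-1].decode("ascii") if runs else ""


def describe_candidates(directory, size=UQCN_SIZE):
    """Yield (string, file name, md5sum) for every file of exactly `size` bytes."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.stat().st_size == size:
            data = path.read_bytes()
            yield last_ascii_string(data), path.name, hashlib.md5(data).hexdigest()
```

Looping over describe_candidates() for the directory with the extracted files and printing each tuple should give rows in the same three-column shape as the table further down, which makes it easy to compare checksums with another setup.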

../../../assets/posts/7fa7d93755494aad8e6c412ba8926dde_MD5.png

After copying the appropriate three files with the right name into /lib/firmware/gobi, I flipped the “wireless” switch off, waited some 10 seconds, and switched on again. And was greeted with a pop-up to enter the PIN code for my SIM card: indicating that the modem had been detected, the firmware had been loaded, and my SIM card was working.

Under the Networking menu, I could add a new mobile broadband network, and connect after a few simple steps of selecting my country and provider.

Notes: various posts for earlier versions of Ubuntu mention hacking in qcserial, or compiling your own kernel modules. I didn’t have to do any of that.

String               File name in .msi                  md5sum
umts_gen             _61F1C9E9670341009A49DFBF7ED9308B  e601a7bf3c55104badcdf21bcbb0bfa9
umts_gen_nogps       _82F1F7B633254DD8943C3C66695180D4  633bed88c29244683635c261849d0e88
umts_gen_noxtra      _B1755EF712704F6EA05AD29399FDAAD2  f1911bcefc4bd5bf8d8fd401082c1a5a
umts_orange          _D086B5600A824F88A7B5B4DA9AEC7393  0044ef086b828c30689b899a3570dd56
umts_orange_nogps    _EAA5B766B8FC42258CF0903EB29B3866  1dbb1ce26cb59f9d7b551e54c9f71c80
umts_orange_noxtra   _97CDF82428924F739C17CEC7DB7642C5  668d1e8903f362b4fc5ec66145ab9b36
umts_telital         _FF884A864DC04D3B9BE3B80CA0D6365D  6f575f681ffad81bf3159c7b2d7122a9
umts_telital_nogps   _F5A8481D6D0141DDB5AC04E02C5D6B77  bf6b02a2e4ac42c40b028519ed5db487
umts_telital_noxtra  _CE104EC699FA4012AF9BF053838EFEEA  6a1b2b342a9e3548dc02f882f156ec21
umts_tellfon         _F85609A0A9B64C399F670AE6D78A9EDD  b0edb9f5ee92204f9d0e455ff860ca84
umts_tellfon_nogps   _AB685C9BC9BF406691ED3CC70C0EA2F8  345c4671242f94d31e3161ead89227db
umts_tellfon_noxtra  _D9B76BE055B04137AB32632049501DF1  f064a0c0c7806d30dacd33b4672661cc
umts_tmo             _FC981235AEDB429BA1F601941B97E11C  1061d15ca89d0d8f66919c99cb67cc45
umts_tmo_nogps       _A6C51028090341929FAC167B2938F19C  6d7b94fed93f47ceafc9ba0c7889fc1f
umts_tmo_noxtra      _9358B6316845471886EF8B592B400046  4132ebbea25e4014043d902d7e272f71
umts_vod             _A51C11D307D344229DD775AD527BA6DA  4d1b58cb79817dbe111194dfc286e57a
umts_vod_nogps       _890ED25310C543B483CA0E67C40B9C54  d06886a62c5c42e2076e0d2a055d1675
umts_vod_noxtra      _A66130F57F1E4EFCAA571D5BFBF84CE4  39f0b2663f682b5c9d97cdaddaa72813

Participate in ODDC: in Amsterdam… or online!

../../../assets/posts/1d368a515c79805ca8656760e0691a21_MD5.jpg

The list of people signing up for the Open Data for Development Camp is growing. We’ll have people involved in defining the International Aid Transparency Initiative standards, people involved in making development data open in governments and organisations, people working on Open Access, on the Apps for Development contest, and more!

We hope even more of you will make it to Amsterdam on May 12 and 13, but now there is a chance to participate online as well: David Pidsley is coordinating a two-day online workshop to collaborate on Linking Development Data.

Join us, online or offline!

Open Data for Campaigning

../../../assets/posts/0a1cf00a389f6473b84741af7bcc4b84_MD5.png

Two weeks ago was the Ecampaigning Forum (ECF) organised by Fairsay in Oxford, and directly after that, the Open Data for Campaigning Camp (ODCC) put together by Tim Davies, Javier Ruiz and myself. One direct result of our efforts to promote the use of open data in campaigning organisations is Greenpeace’s experiments to make their measurements of radiation levels near Fukushima available as raw data: http://www.greenpeace.org/fukushima-data (way to go, Andrew! That’s two-star open data). Good to remember that the teams have to deal with lots of logistics and radioactive decontamination, so publishing spreadsheets isn’t at the top of their priorities.

By the way, next week, I’ll be at re:publica and re:campaign in Berlin, again talking with NGOs and campaigners about open data.

Accessing aid information

../../../assets/posts/iati-logo-name.png

International Aid Transparency Initiative

The international development aid sector is starting to get on board the “open data” train. Preparing for the Open Data for Campaigning Day in Oxford in three weeks, and our own Open Data for Development Camp in Amsterdam in May, I had a look at what’s possible already.

International Aid Transparency Initiative (IATI): IATI aims to make information about aid spending easier to access, use and understand

IATI has been working on a standard to exchange information about development aid projects. In February, a milestone was reached:

Agreement on aid standards confirmed: We’re happy to report that last week’s meeting of the International Aid Transparency Initiative’s (IATI) signatories and Steering Committee members resulted in agreement on the remaining items to be included in the IATI standard.

The IATI Standard: This site contains all reference materials for both publishers and users of the International Aid Transparency Initiative’s standards.

In parallel, the Open Knowledge Foundation has been building out their CKAN software to build open data registries. A registry basically is a central place to find information about sources of data around the web.

CKAN – the Data Hub

One of these registries is the IATI registry, currently containing pointers to data from the UK (DFID).

IATI Registry: The IATI Registry is a hub for international development data published by agencies, community organisations and partner countries in a standard, open format.

So I tried to access the data from a desktop reporting tool:

In a reply to that tweet, Tariq Khokhar from AidInfo pointed me to their Labs site with examples of tools and techniques to use (IATI and other) information in international development aid.

aidinfo labs | Innovation in aid information

The first examples look interesting, and I hope that we can explore more opportunities as part of the Open Data for Campaigning Day on 24th March in Oxford.

Come join some 40 other people for a day of hacking and plotting on open data for campaigners: using and producing data yourself!
http://opendatacampaigningcamp.eventbrite.com

Hacking away on my Zim notebook

Working with open source is fun because it lets me explore software and change little things. If you’re not really into code and config files, you might want to skip this post 🙂

../../../assets/posts/bbf2d03c0084735555b1df6f5de661b9_MD5.png

Zim task list (default)

For some time now I have been using Zim, a desktop wiki, to take notes in meetings and at conferences. It shares a few good things with Tomboy, the default notebook on Ubuntu (linking pages, immediately saving what you type so you don’t lose anything on sudden power loss) but has a few things I prefer:

  • Notes are stored as plain text: easier to use other text editors, version control, and scripts, and if for any reason Zim fails, I can still access my data quite easily.
  • The main interface makes organising and navigating notes easier.
  • The Task List plugin extracts lines with possible tasks, so it’s easier to keep track of follow-ups or actions in notes by simply typing a line like:
    [ ] Document my hacks in a blog post
    

The Task List itself is a separate window you can pull up, to see all open to do’s. But I wanted to fix a few things to make it more powerful in daily use.
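Because the notes are plain text, the same kind of extraction can also be done outside Zim, for instance from a script. The sketch below is not how the Task List plugin itself works; it is a minimal illustration that assumes pages are stored as .txt files under one notebook directory and that an open task is a line starting with "[ ]", as in the example above. The function names are invented for this example.

```python
import re
from pathlib import Path

# An open checkbox at the start of a line, e.g. "[ ] Document my hacks in a blog post".
OPEN_TASK = re.compile(r"^\s*\[ \]\s+(.*\S)\s*$")


def open_tasks(text):
    """Return the descriptions of all open checkbox lines in a note's text."""
    tasks = []
    for line in text.splitlines():
        match = OPEN_TASK.match(line)
        if match:
            tasks.append(match.group(1))
    return tasks


def scan_notebook(root):
    """Map each page (path relative to root) to its open tasks, skipping pages without any."""
    found = {}
    for page in sorted(Path(root).rglob("*.txt")):
        tasks = open_tasks(page.read_text(encoding="utf-8"))
        if tasks:
            found[str(page.relative_to(root))] = tasks
    return found
```

Calling scan_notebook() on the notebook directory gives a dictionary of page names and their open to do’s, which is handy for a quick overview from the command line or a cron job.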

The suggestions below are based on Ubuntu 10.10, and with Zim 0.49 installed as Ubuntu package.

../../../assets/posts/132b9064332c92edf3b2f785d333ada2_MD5.png

Zim notes with tasks

Quicker access to Zim

The first step is to have Zim available at my fingertips, so that taking notes is instant. I want to use the key combination Win-Z to bring up the notebook. You need the wmctrl package to be able to raise the window if it is already open but buried under other windows.

sudo apt-get install wmctrl

Next, I went to the menu System > Preferences > Keyboard Shortcuts and added a custom shortcut. It starts Zim to make sure it is running (make sure your notebook is set as the default notebook to open in Zim’s preferences), then raises the window to the top:

bash -c "zim && wmctrl -R \"Notes - Zim\" "
../../../assets/posts/12cbe898bd36a09ff55e6c2c818aab2e_MD5.png

Zim custom shortcut

Quicker access to the task list within Zim

I’d like to have the task list available via a keyboard shortcut. I prefer the combination Ctrl-T, which is already assigned to Format > Verbatim (monospaced). So I edited ~/.config/zim/accelmap and uncommented the two relevant lines: the first assigns Ctrl-T to showing the task list, the second clears the shortcut for the Verbatim formatting.

(gtk_accel_path "<Actions>/TaskListPlugin/show_task_list" "<Control>t")

(gtk_accel_path "<Actions>/PageView/apply_format_code" "")

Bonus key combination (while we’re here anyway): I have enabled the Inline Calculator plugin in Zim, so that I can do quick math within my notebook. If I type 1500*1.19 and select the menu item Tools > Evaluate Math, Zim calculates the outcome and changes the line to: 1500*1.19= 1785.0. To make that easier, I’ll assign Ctrl-= to that function:

(gtk_accel_path "<Actions>/InlineCalculatorPlugin/eval_math" "<Control>equal")

Shorter task descriptions column

The column with task descriptions is made wide enough to contain complete task descriptions. Which means that you usually need to scroll to see the other columns.

../../../assets/posts/8ebe99130683b636b20e87d82744dd10_MD5.png

With many pages, seeing in which page a task resides helps a lot to understand its context. The descriptions should be limited to a narrower column.

../../../assets/posts/89814ab2317c357b0b9b8692460c9f48_MD5.png

Task list more optimal

Diving into the source code, I found out that the developers already thought about that as well. They implemented a fixed-width column for the Maemo platform, with the remark

# FIXME probably should also limit the size of this
# column on other platforms ...

That made it really easy to change my own copy of Zim to do just that. It results in a simple patch:

--- tasklist.py.or 2011-01-15 16:36:42.624406264 +0100
+++ tasklist.py 2011-01-15 16:12:01.555792557 +0100
@@ -564,6 +564,8 @@
             column.set_sort_column_id(i)
             if i == self.TASK_COL:
                 column.set_expand(True)
+                column.set_sizing(gtk.TREE_VIEW_COLUMN_FIXED)
+                column.set_fixed_width(500)
                 if ui_environment['platform'] == 'maemo':
                     column.set_sizing(gtk.TREE_VIEW_COLUMN_FIXED)
                     column.set_fixed_width(250)

It’s a good idea to keep the patch around: upgrading to a newer version of Zim would remove the hack, so I just made a script to re-apply the patch:

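#!/bin/bash
# Re-apply the patch that gives the Zim task list a fixed-width description column.
# Run with sudo: the plugin directory is owned by root.
set -e   # stop right away if the directory is missing or the patch fails
cd /usr/share/pyshared/zim/plugins
# --forward: do not try to reverse the patch if it has already been applied
patch --forward < /home/rolf/bin/fix-zim-tasklist.patch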

Better positioned task list window

The Task List window often pops up over the notebook itself, and stays on top. Clicking on a task brings up the note in which the task resides, and focuses it on that task. But that’s not so useful if the note is hidden under the task window.

I already use Devil’s Pie, self-described as “A totally crack-ridden program for freaks and weirdos who want precise control over what windows do when they appear.” Install gdevilspie if you like a GUI to set up rules for windows.

I have plenty of screen (1920×1080) so I decided to let the notes and the task list live next to each other, with these two rules in ~/.devilspie/zim.ds:

( if ( begin ( is ( window_name ) "Notes - Zim" )) ( begin ( geometry "926x1032+0+24" )))
( if ( begin ( is ( window_name ) "Task List - Zim" )) ( begin ( geometry "990x1032+930+24" )))
../../../assets/posts/3bc8c158860f824d1ad00def1003efa4_MD5.png

Zim notes and tasks side by side
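The geometry strings in those rules follow the usual X11 pattern of width x height + x offset + y offset. For a different screen size, the two values can be worked out with a few lines of Python. This is just a sketch of the arithmetic: the 24-pixel top and bottom margins (room for the panels) and the 4-pixel gap between the windows are read off from my own numbers above, and the function name is made up.

```python
def side_by_side(screen_w, screen_h, left_w, gap=4, top=24, bottom=24):
    """Return geometry strings (left, right) for two windows filling the screen width.

    top and bottom are margins assumed to be reserved for panels; gap is the
    horizontal space left between the two windows.
    """
    height = screen_h - top - bottom
    right_x = left_w + gap
    right_w = screen_w - right_x
    left = f"{left_w}x{height}+0+{top}"
    right = f"{right_w}x{height}+{right_x}+{top}"
    return left, right


# The values used in the rules above, for a 1920x1080 screen:
notes_geometry, tasks_geometry = side_by_side(1920, 1080, 926)
print(notes_geometry)  # 926x1032+0+24
print(tasks_geometry)  # 990x1032+930+24
```

Window decorations and panel sizes differ per setup, so expect to tweak the margins a bit before the windows line up nicely.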

Done

My laptop has become a better notebook! At any moment, Win-Z brings up my notes, to quickly jot down something, and when in my notebook, Ctrl-T brings up the task list to let me easily navigate the to do’s and follow-ups in my notes. Staying on top of things has become a little bit easier.

Let’s build a “Debian for Development Data”

I just returned from an intense week in the UK: an IKM Emergent workshop in Oxford, and the Open Government Data Camp in London had me almost drowning in “open data” examples and conversations, with a particular angle on aid data and the perspectives of international development.

As a result, I think we’re ready for a “Debian for Development Data”: a collection of data sets, applications and documentation to service community development, curated by a network of people and organisations who share crucial values on democratisation of information and empowerment of people.

“Open data” is mainstream newspaper content now

Mid 2009, after the 1%EVENT, a couple of innovative Dutch platforms came together to explore the opportunities of opening up our platforms: wouldn’t it be great if someone in an underdeveloped community had access to our combined set of services and information?

We had a hard time escaping new jargon (federated social networks, data portability, privacy commons, linked open data, the semantic web) and sketching what it would look like in five years. But then again, suppose it was five years earlier: in mid 2004, no-one could have predicted what YouTube, Facebook and Twitter would look like today, even though many of us already felt the ground shaking.

  • The technical web was embracing the social web, of human connections.
  • The social web pushed “literacy”: people wanted to participate and they learned how to do that.

A year and a half later, “open data” is catching up with us, and going through a similar evolution. Governments and institutions have started to release data sets (the Dutch government will too, the UK released data on all spending over £25,000 on Friday). So when will the social dimension be embraced in open data?

A week of open data for development

At an IKM Emergent workshop in Oxford, on Monday and Tuesday, around 25 people came together to talk about the impact of open data on international development cooperation. We discussed when we would consider “linked open data” a success for development. One key aspect was: getting more stakeholders involved.

Then at Open Government Data Camp (#OGDCamp) in London, on Thursday and Friday, around 250 people worked in sessions on all kinds of aspects of open data. Several speakers called for a stronger social component: both in the community of open data evangelists and in reaching out to those for whom we think open data will provide new opportunities for development.

At IKM, Pete Cranston described how his perception of access to information changed when a person approached him in a telecentre to ask how the price of silk changed on the international market: he was a union representative, negotiating with a company who wanted to cut worker salaries because of a decline in the market price. Without access to internet or the skills to use it, you don’t have the same confidence we have that such a question can be answered at all.

Then at OGDCamp, David Eaves reminded us that libraries were (partly) built before the majority of the population knew how to read, as an essential part of the infrastructure to promote literacy and culture [1].

Telecentres fulfil a role in underdeveloped communities as modern-day libraries, providing both access to information and communication tools via the internet and the skills to use them.

But we don’t have “open data libraries” or an infrastructure to promote “open data literacy” yet.

How open source software did it

It shouldn’t be necessary for people to become data managers just to benefit from open data sets. Intermediaries can develop applications and services to answer the needs of specific target groups based on linked open data, much as librarians help make information findable and accessible.

There are also parallels with open source software. Not every user needs to become a developer in order to use it. Although it is still easy to think otherwise sometimes, the open source movement has managed to provide easier interfaces to work with the collective work of developers.

The open data movement can identify a few next steps by looking at how the open source movement evolved.

Open source: Software packages (operating systems, word processors, graphics editors, and so on) are developed independently. Each software package can choose the programming language, development tools, the standards and best practices they use.
Open data: Data sets (budget overviews, maps, incident reports) are produced independently as well. The data formats and delivery methods can be chosen freely, and there are various emerging standards and best practices.

Open source: Communities around software packages usually set up mailing lists, chat channels and bug trackers for developers and users to inform each other about new releases, problems, and the roadmap for new versions. The mantra is “many eyes make all bugs shallow”: let more people study the behaviour or the code of software, and errors and mistakes will be found and repaired more easily.
Open data: Data sets mainly are published. As Tim Davies noted in one of the conversations, there don’t seem to be mailing lists or release notes around data sets yet. To deliver the promise of a “wisdom of the crowds”, users of data sets should have more and better ways to provide feedback and report errors.

Open source: Open source software is mostly used via distributions like Debian, Red Hat, Ubuntu, separating producers and integrators. A distribution is a set of software packages, compiled and integrated in a way that makes them work well together, thereby lowering the barrier of entry to use the software. Distributions each have a different focus (free software, enterprise support, user-friendliness) and thus make different choices on quality, completeness, and interfaces.
Open data: Perhaps the current data sets released by governments could be considered “distributions”, although the producer (a department) and the integrator (the portal manager) usually work for the same institution. CKAN.net could be considered a distribution as well, although it does not (yet?) make clear choices on the type and the quality of data sets it accepts.

Software distributions make it possible to pool resources to make software interoperable, set up large-scale infrastructure, and streamline collaboration between “upstream” and “downstream”. The open character stimulates an ecosystem where volunteers and businesses can work together, essential to create new business models.

Towards a “Debian for Development Data”

To sum up several concerns around open data for development:

  • Open data is currently mainly advocated for by developers and policy makers, without a strong involvement of other stakeholders (most noteworthy: those we would like to benefit in underdeveloped communities). It tends to be driven mostly by web technology and is mostly focused on transparency of spending. It does not take into account (political) choices on why activities were chosen, and also lacks a lot in recording the results.
  • Data sets and ontologies are hard to find and not very well linked, with few generic applications working across data sets and few examples of good use of multiple data sets. Once you want to make data sets available, it is hard to promote the use of your data, provide feedback loops for improvements, administer dependencies, and keep track of what was changed along the way and why.
  • There are hardly any structural social components around current open data sets, repositories and registries.

So why don’t we start a “Debian for Development Data”?

  • A Social Contract and Open Data Guidelines like those for Debian can capture essential norms and values shared by community members, and inform decisions to be made. The contract can for instance value “actionable opportunities” over financial accountability. The Agile Manifesto is another example to draw from.
  • The community should set up basic communication facilities such as a mailing list, website, and issue tracker, to ease participation. Decision-making is essentially based on meritocracy: active participants choose who has the final say or how to reach consensus.
  • The data sets should be accompanied by software and documentation, to take away the problem of integration for most end users. Each data set and tool should have at least one “maintainer”, who keeps an eye on updates and quality, and is the liaison for “upstream” data set publishers, offering a feedback loop from end-users to producers.
  • The CKAN software (powering the CKAN.net website mentioned before) draws on the lessons from distributions like Debian for its mechanisms to keep track of dependencies between data sets, and has version control, providing some support to track changes.
  • Debian and Ubuntu divide packages into categories like “main”, “non-free” and “restricted” to deal with license issues, and to express commitment of the community towards maintaining quality.

We stimulate the social component by providing more stakeholders a point of entry to get involved through socio-technical systems. We stimulate literacy by offering the stakeholders ways to get open data, publish their own, experiment with applications, and learn from each other. And we circumvent the tendency towards over-standardisation by doing this in parallel with other initiatives with sometimes overlapping goals and often different agendas.

[1] A quick check on Wikipedia indicates this seems to have mainly been the case in North America, though.