Rolf Kleef

28 November 2011
in Fundstücke
1 min read

Around the web in week 47, 2011

Fundstücke published this week:

Paul Kilelu — Fighting for a Noble Cause 21 Nov 2011, pelle By Halima Tahirkheli Nabuur is pleased to introduce you to Paul Kilelu, the local representative of …

9 November 2011
in News
1 min read

Who is implementing the aid transparency agreement?

Owen Barder published an overview by the International Aid Transparency Initiative (IATI) of how far countries and donors have progressed on their road map to publish their aid spending data before the High Level Meeting in Busan, at the end of this month.

If you’re curious about how the International Aid Transparency Initiative (IATI) can help aid effectiveness, have a look at their video with some stakeholders:

The International Aid Transparency Initiative (IATI) from Development Initiatives on Vimeo.

26 October 2011
in Conferences, Research
5 min read

Describing organisational relations

One of the side-events of the Open Government Data Camp, last week, was an Organisational Identifiers Workshop put together by Tim Davies and Chris Taggart. The meeting discussed the various challenges in linking information about organisations held in separate data sets. Although participants were careful to avoid the word “ontology”, one of the break-out groups did look at describing relations between organisations. Since I graduated on research into “part-of” relations in an ontology, and what you can infer from them, I joined that discussion. Here’s what we came up with.

Outset

The workshop was a good chance to catch up with where things are right now, with several organisations at the table and participating online that have to deal with information about organisations:

The IATI standard needs organisational identifiers to refer to individual donors and recipients of grant money and payments. IATI does not want to provide this standard itself, but to rely on an external one. They will need some way to represent organisations down to the level of government departments as part of an upcoming pilot project, to capture intended donor flows in a meaningful way.
The Open Corporates website, and its companion the Open Charities website, capture information about organisations, but also lack a common identifier scheme, as well as ways to describe relations between organisational entities (especially the complicated relations between companies).
Within the open government data movement, and the Open Knowledge Foundation, there is a need to represent organisational units such as departments, and be able to deal with renaming and reorganisation of such units over time.
The Sunlight Foundation is dealing with, for instance, DUNS numbers, which are often too detailed for the purpose of identifying a larger organisation (every outlet of a supermarket chain will have its own number).
GlobalGiving, OpenSpending and IATI are looking into decentralised registrars, but each registrar basically expresses a different type of relation between a legal or organisational entity and a purpose, such as tax registration or legal entity.
Everyone faced a difficulty of dealing with entities which cannot register as such (e.g. informal associations), and so are not in any registrar’s database.
To end this list, many people will talk about a known brand as if it is a company, and would expect to access information that way, but even these have no single register.

How to create identifiers for organisations across the world, which might not be registered anywhere, and which relate to each other and to more generic concepts, in such a way that we can capture all the meaningful relations and data we want to capture?

How to make sure it works with the schemas already in use in big organisations? And that it works with data stores that are not open? Without introducing another naming authority?

You should be able to determine an ID without requesting it from anyone.
You should be able to resolve it to commonly known registrars.
You should know where to find the list of those registrars.
You should be able to represent the granularity (aggregating detailed levels of information, and allowing individual entities to be split up into smaller ones).
Who decides what is a good registrar?
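The first three requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `registrar:local-id` format, the registrar codes and the URL templates are all hypothetical, not something the workshop agreed on.

```python
from dataclasses import dataclass

# Hypothetical list of known registrars. The codes and URL templates are
# illustrative assumptions, not an agreed standard.
KNOWN_REGISTRARS = {
    "example-chamber": "https://registrar.example/chamber/{local_id}",
    "example-charity": "https://registrar.example/charity/{local_id}",
}


@dataclass(frozen=True)
class OrgId:
    """An identifier built from a registrar code and that registrar's own ID."""
    registrar: str
    local_id: str

    def __str__(self) -> str:
        return f"{self.registrar}:{self.local_id}"


def make_org_id(registrar: str, local_id: str) -> OrgId:
    """Determine an ID locally, without requesting it from anyone."""
    if registrar not in KNOWN_REGISTRARS:
        raise ValueError(f"unknown registrar: {registrar!r}")
    return OrgId(registrar, local_id.strip())


def parse_org_id(text: str) -> OrgId:
    """Split 'registrar:local-id' back into its two parts."""
    registrar, sep, local_id = text.partition(":")
    if not sep or not local_id:
        raise ValueError(f"not a registrar:local-id pair: {text!r}")
    return make_org_id(registrar, local_id)


def resolve(org_id: OrgId) -> str:
    """Resolve an ID to the (hypothetical) page of its registrar."""
    template = KNOWN_REGISTRARS[org_id.registrar]
    return template.format(local_id=org_id.local_id)
```

The sketch deliberately leaves the last two points open: it says nothing about granularity, and the content of `KNOWN_REGISTRARS` is exactly the "who decides" question.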

We split up in a couple of groups, one looking at identifying public bodies, another at the technical architecture that might be needed, and a third at common terms to describe relations between organisational entities. I joined that third group.

Inputs

We spent some time discussing various types of relations, and I also looked around to find possible candidate schemes, but without much luck. I couldn’t find an obvious example, like the FOAF standard for personal relations. A few standards, like OrgPedia, or the Organizational Ontology, seem likely candidates, but don’t cover this area (yet?).

We looked at some use cases:

A company wants to show their supply chain, to demonstrate that their suppliers are ok, or perhaps to “crowd-source” the question whether they are: “these are our suppliers, if you think they’re not ok, let us know”.
A campaigning organisation wants to express what they know about organisational ties, to support their arguments on why the ties should be broken.
A reporting entity wants to express their donation relations, for instance to a government department, and be able to deal with changes due to reorganisation.
A watch-dog organisation wants to express that a certain company has changed names or merged or split operations, but continues to pursue the same activities.
A consumer wants to find out what a certain company has done, but basically only knows that company through a name or brand, without knowing the exact structure behind it.

We acknowledged additional cases, like finding influential relations between corporate or organisational entities based on board membership or roles of individuals, but decided not to take that on in this discussion.

Output

We came up with a first-version typology of relations. The naming and exact semantics will need further review.

Persistent relations

These are relations between entities that have a “permanent” and “structural” character. Of course, all these relations are bound in time, but the beginning and end points may not be known.

We distinguished two sub categories.

Organisational relations express membership, ownership, or hierarchy.
- “is member of” (an association, group, cabal); “is affiliated to”; “is organisational unit of” (department, location); “is shareholder of”; “is owner of”
Contractual relations express transactions between entities. For instance, a relation “donates to” would express a sizable or structural donation from one entity to another. In the IATI standard, this would mean there should be (one, but probably more) “activities” records or “transactions” records.
- “has contract with” with e.g. subcategories “owes money to” (long-term debts, mortgages), “is supplier to”, and “licenses to”; “in legal conflict with”; “donates to”

(This typology still fails to capture something like a brand as abstract entity.)
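As a rough illustration, the persistent relations could be written down as a small data model. The class and field names below are my own assumptions; only the relation names and the two sub categories come from the typology above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class RelationKind(Enum):
    ORGANISATIONAL = "organisational"
    CONTRACTUAL = "contractual"


# Relation names as listed in the typology, mapped to their sub category.
RELATION_TYPES = {
    "is member of": RelationKind.ORGANISATIONAL,
    "is affiliated to": RelationKind.ORGANISATIONAL,
    "is organisational unit of": RelationKind.ORGANISATIONAL,
    "is shareholder of": RelationKind.ORGANISATIONAL,
    "is owner of": RelationKind.ORGANISATIONAL,
    "has contract with": RelationKind.CONTRACTUAL,
    "owes money to": RelationKind.CONTRACTUAL,
    "is supplier to": RelationKind.CONTRACTUAL,
    "licenses to": RelationKind.CONTRACTUAL,
    "in legal conflict with": RelationKind.CONTRACTUAL,
    "donates to": RelationKind.CONTRACTUAL,
}

# Subcategories of "has contract with".
SUBTYPE_OF = {
    "owes money to": "has contract with",
    "is supplier to": "has contract with",
    "licenses to": "has contract with",
}


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    obj: str
    # Relations are bound in time, but the end points may not be known.
    start: Optional[date] = None
    end: Optional[date] = None

    def __post_init__(self) -> None:
        if self.predicate not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.predicate!r}")

    @property
    def kind(self) -> RelationKind:
        return RELATION_TYPES[self.predicate]

    def matches(self, predicate: str) -> bool:
        """True if this relation is the given type or a subcategory of it."""
        return (self.predicate == predicate
                or SUBTYPE_OF.get(self.predicate) == predicate)
```

With this, a query for “has contract with” also finds suppliers, debtors and licensors, which is the kind of simple inference a shared vocabulary makes possible.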

Temporal relations

These are relations that express a change in the structure or responsibilities of some entities, often the beginning or the end of particular entities. We identified four basic types:

Split into: A splits into B, C, … A ceases to exist, B, C, … come into existence.
Spin-off: A creates B as a separate entity. A continues to exist.
Merger: A, B, … merge into C. A, B, … cease to exist, C comes into existence.
Acquisition: A acquires B and moves B’s assets into A. B ceases to exist.
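The four types differ mainly in which entities cease to exist and which come into existence. A minimal sketch of that bookkeeping follows, assuming entities are plain names and that an acquisition lists the acquirer first; the `Change` class and `apply_change` function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Change:
    kind: str                  # "split", "spin-off", "merger" or "acquisition"
    sources: Tuple[str, ...]   # entities that exist before the change
    results: Tuple[str, ...] = ()  # entities that come into existence


def apply_change(existing: FrozenSet[str], change: Change) -> FrozenSet[str]:
    """Return the set of entities that exist after the change."""
    missing = set(change.sources) - existing
    if missing:
        raise ValueError(f"unknown entities: {sorted(missing)}")

    if change.kind in ("split", "merger"):
        # Split: A ceases to exist; B, C, ... come into existence.
        # Merger: A, B, ... cease to exist; C comes into existence.
        ceased, created = set(change.sources), set(change.results)
    elif change.kind == "spin-off":
        # A creates B as a separate entity; A itself stays.
        ceased, created = set(), set(change.results)
    elif change.kind == "acquisition":
        # Assumption: sources are (acquirer, acquired); the acquired ceases.
        _acquirer, acquired = change.sources
        ceased, created = {acquired}, set()
    else:
        raise ValueError(f"unknown change type: {change.kind!r}")

    return frozenset((existing - ceased) | created)
```

For example, applying `Change("merger", ("A", "B"), ("C",))` to the entities A and B leaves only C. A real model would also record dates and keep the ceased entities around for historical queries.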

Further work

More work is needed to mold this into a useful standard (relations are currently described from the perspective of one end, there is still plenty of room for interpretation, things have not been tested on real-life examples described as use cases, and so on).

And, of course, we’d need those organisational identifiers to refer to other entities, and find ways to delegate resolving identifiers to services that can provide additional information on those identities. See the whole report of the workshop on the OGDCamp wiki for the results in the other discussions as well.

But thinking about and discussing relations between those entities brought back memories of all the fun in making machines infer and report unknown relations 🙂

17 October 2011
in Syndicated
2 min read

It’s the linking, stupid!

The current discussion around open data often boils down to releasing data sets, and seeing nice visualisations and apps. But let’s not forget that the full phrase is linked open data. The real power comes from linking the data. This week, the Open Government Data Camp in Warsaw lets us explore this more.

Just as web pages today link to other pages for further information, the data sets of tomorrow will link to other data sets, for more data. Your browser will help you navigate the data space.

The BBC is ahead in this game, and working on a “Digital Public Space” project, linking together many sources of cultural data. Jake Berger writes on the BBC blog:

Early versions of this data model indicate that – as hoped – there will be many, varied and often unexpected journeys that can be made through these catalogues and the material they describe. For example, a user starting out by watching a film of a production of Macbeth from the Royal Opera House might then look at a scan of a rare musical manuscript from The National Archives, then browse similar manuscript scans held at the British Library, watch a clip from a BBC documentary about how paper was produced in Shakespeare’s era, before ending up learning about the plants used to make the paper using information from The Royal Botanic Gardens At Kew. In a [Digital Public Space], all of this could happen in the same online space.

That may still sound a bit like the current web of pages. Except: the publishers only provide standardised links for “Shakespeare”, “paper”, etcetera, and your browser makes the connections to offer you ways to move forward:

Mo blogged about the development of a web browser-based user interface, which navigates through these catalogues using the concepts of “people”, “places”, “events”, “things” and “collections”.

In international development aid, the IATI standard is an effort to work towards a similar “digital public space” in which you can navigate through “organisations”, “activities” and “results”.

An important part of establishing that space is to work towards joint standardised identifiers. At our ODDC conference in May, David Pidsley’s Virtual Workshop on Linking Development Data focused on that, and next weekend, Tim Davies is organising an Organisation Identifiers Workshop as a fringe event for the Open Government Data Camp in Warsaw, to continue working on this. And we’ll have a more general Open Data for Development: Open Space session on Saturday morning.

10 October 2011
in Fundstücke
1 min read

Around the web in week 40, 2011

Fundstücke published this week:

ALL CLEAR: FeedMedic Alert for rolfkleef 8 Oct 2011 Your Source Feed, http://www.drostan.org/rss.xml, is now working fine. Carry on! We will let you kno…

10 October 2011
in Ideas, News
1 min read

Peter Eigen: Overcoming the fear of transparency

We've participated in two interesting events in the last two weeks: the Open Aid Data Conference in Berlin, and the IATI in the Visegrad countries conference in Prague. A proper post is still due, but here's already a video of the closing talk from Transparency International's founder Peter Eigen, "Overcoming the fear of transparency":

[openaiddata.de] Peter Eigen – Overcoming the fear of transparency from Open Knowledge Foundation on Vimeo.

13 September 2011
in News
2 min read

Dutch development data and results online

Today, Ben Knapen, the Dutch State Secretary for development cooperation, presented the “Resultatenrapportage”, the “reporting of results” on Dutch efforts in development aid in the period 2009-2010. He used the occasion to also present the first release of Dutch government data in the IATI standard format, making The Netherlands the fifth signatory to deliver on its commitment for phase 1 of the IATI agreement.

The “Resultatenrapportage” report itself is the result of the collaboration of some two hundred writers from government and NGOs, in a process led by the Ministry of Foreign Affairs and Partos, the Dutch platform of organisations in development cooperation.

To make the report a stepping stone towards sharing more data online, it is no longer being printed, but instead made available as PDF document, with links to offer easy access to background documents and data.

At the same time, the State Secretary announced the opening of http://www.minbuza.nl/opendata with the first set of spending data released, up to date to the end of the second quarter of 2011.

Behind the scenes, the last bits are being put in place to automatically register the data at the IATI registry. The data will be updated quarterly, so the next release is expected as early as October.

It will be interesting to see how the regular publication of the data leads to continuous improvement of the data: less jargon, fewer abbreviations, clearer descriptions.

In addition to the IATI data, there is also a set of documents from embassies, reporting on the progress of particular projects in countries around the world. An open challenge now is to find ways to link the information in these documents to the data available in IATI.

Knapen expects civil society to also publish their data in IATI, so that it becomes easier to look directly into the operations and processes happening at the core of international development cooperation.

The Guardian has made a website to slice and dice the data of the World Bank, DFID, etc. In collaboration with Akvo, the Ministry is doing a pilot project to show their water project portfolio, but Knapen hopes that parties in The Netherlands will also explore the expanding universe of data to answer their own questions and develop their own perspective on what is happening.

After being offered the report on a (symbolic) USB stick, Nebahat Albayrak of the Parliamentary committee for foreign affairs said she hopes the open data will make the political debates more informed and perhaps even a bit more objective, but also stressed the importance of reflecting on actual impacts as done in the “Resultatenrapportage”.

Albayrak also hopes to evaluate the “resultatenrapportage” and the “open data” initiatives and standards from the perspective of effectiveness for their work in the “kamercommissie”.

4 August 2011
in Work in progress
2 min read

What are the effects of open development?

At “Open Data for Development Camp” in Amsterdam, Marijn Rijken of the Dutch research institute TNO presented on “open data opportunities in development”. Together, we’re now drafting a research proposal to gather answers on pertinent questions around open data in development: “What are the social, organizational, technological, financial and legal effects of open development?”. It’s part of our efforts to build a network as the basis for a Dutch knowledge platform.

TNO has done similar sector-wide research around open data in other sectors, and would like to take the existing research in this area a step forward. AidInfo published a cost-benefit analysis on open data, and a framework for this. The Transparency and Accountability Initiative recently published a report on introducing open data in middle income and developing countries. We would also like to include “effective use” and impacts on e.g. privacy and security, and a possible “data divide”.

We plan to look at existing literature and research, sketch a vision on what “open development in 2020” could look like, and provide a framework for a social cost-benefit analysis, and the ground work for a road map to help organisations embrace open data and “do it right” (e.g. critical success factors, activating crucial stakeholders and infomediaries).

Of course, we would like to hear of other research projects with recent publications or currently underway.

2 August 2011
in Conferences
1 min read

Aid Transparency Barcamp Nepal on August 4th

Aid Transparency Barcamp Nepal, run jointly by YoungInnovations Pvt. Ltd and aidinfo, is a conference to raise awareness of the foreign aid scenario in Nepal. It intends to create a platform to initiate conversations and connections on the effective use of ICTs to support aid transparency and effectiveness. There will be a chance for organisations and individuals to showcase innovative ideas and tools that promote effective accessibility and visualization of aid data. Further, it is hoped that it will create a platform where the best technology products around aid data can be collaborated on, supported, sponsored and promoted. It is also an opportunity to raise awareness of the International Aid Transparency Initiative’s (IATI) standard for publishing aid data.

Featured speakers will include Bibek Raj Kandel, Simon Parrish, Anjesh Tuladhar, Aman Shakya, Bibhusan Bista, Hemanta Sapkota, and Prabhas Pokharel. Sessions will include:

Linked data and semantic web technologies for aid transparency
Civic engagement: creating a feedback loop on aid effectiveness
Community led development projects: Information processes and pitfalls
What is IATI format and how it enhances aid effectiveness?
Social media for aid awareness
Taking aid transparency local: radio and SMS-based transparency opportunities
Crowd-sourcing for better data: geo-coding and traceability

This event is targeted at the tech community (programmers, app developers, FOSS enthusiasts, mobile developers), INGOs, aid donor communities, government officials, the media and aid transparency professionals and practitioners.

For more information on this event, please visit the website: http://nepalaid.yipl.com.np/

15 July 2011
in Conferences, Ideas
6 min read

“Everything I need to know about open data, I learned from open source”

../../../assets/posts/de22a71eac0dc493960f5cd090f09975_MD5.jpg

BoF “Open data in development” at OKCon (via Tobias Eigen)

But what did we learn from open source? Two days of Open Knowledge Conference gave lots of food for thought. And lots of inspiration as well: plenty of projects doing interesting work, and experiences to share. And to add a cherry to the cake, we had a great “open lunch for development” with several people active in development aid. My (delayed) take-aways for Open for Change.

From data.gov.* to data.your.org

Nigel Shadbolt and Andrew Stott shared their lessons from setting up data.gov.uk, and Tom Lee talked about data.gov and the recent threat of cuts to its budget.

It’s crucial to have top-down support, bottom-up activists, and middle-tier connectors, to bring everything together.

We need to continue nurturing a network of people active in open data for development, to make sure we have the tools, ideas, reports, cases, and standards we need, and to support the early adopters within organisations.
We have done some work on an “open data briefing”, based on the OKFN open data manual, and we need to continue work on that: in four pages, explain the why, how, what and who of opening up your organisation.
It’s important to understand the decision-making and budgeting processes to make the case at the right time and the right place. We could review the best material on “how to convince your boss to use open source” as a starting point.

There are many reasons to embrace open data, so don’t rely on a single one to make your case: transparency and accountability; economic value, growth and innovation; efficiency and cost-effectiveness; improving (public) services; (public) engagement; and civil society and social capital.

We should have good stories illustrating each of these reasons, and not rely on just one. Aid transparency is hot right now, but many actors want to go open for one or more of the other reasons (and may even be scared off by too much transparency). The paradigm shift is what matters to us.
Andrew had a great slide with excuses to justify “data hugging disorder”. We could turn it into a data hugging bingo game for workshops.
People tell stories about people. Numbers can be good (in the US case, showing that the IT dashboard project resulted in $3 billion savings on IT spending makes a strong case).
Some stories may be too scary for some organisations. The UK, for instance, now provides the organograms of all departments, including salary ranges at each level, and creates a web address for each job post to include it in the web of linked data.

Close the feedback loop: the “build it and they will come” approach won’t work here either. Try to publish data that matters to people, but also consider that “data probably has a long tail”.

Releasing data early and incrementally creates a steady news flow, and also enables you to work with feedback and to champion people to create peer pressure on the “refuseniks”.
Friedrich Lindenberg tries to turn the lack of opportunity to debate what you see in “Where does my money go” into an opportunity for “Open Spending”. In essence: how do we provide support to create a data cycle instead of a data pipeline?
There is a challenge to create or appoint authoritative sources and URIs. In fact, this was a key topic in the online workshop organised by David Pidsley during the Open Data for Development Camp.
In a broader sense, the emerging standards and work done on open data in development cooperation need to settle down in one or more ontologies. This will also pave the way to build more focused data enrichment services that can tap into the existing wealth of documents we have, to help professionals navigate those.

Statutory requirements matter. For governments, this mainly has to do with legal frameworks and obligations. But every organisation could (and should) enshrine crucial elements of open data in their policies: how to ensure “open” stays open, and how to prioritise.

The UK government re-used lots of existing policies, such as the National Statistician’s guidelines about reasonable measures to prevent identification of individuals. The open data manual could be a good portal to such policies.
Principles and policies should not be set in stone (at least: not early on), to prevent weasel words like “to the extent feasible” from creeping in. The essential lesson: in the first phase, compromise on the data that will be published, but not on the license applied to it. (Again: the paradigm shift in thinking and acting is what matters first.)

Why we want to be open: the fallacy of community collaboration

In open data, a lot can be learned from “open source”. In terms of tools and practices, I think we are learning. But in terms of the stories we tell ourselves and others about why we do it, Benjamin Mako Hill gave some interesting insights on the promise of open source to create better software because more people will be able to see, comment on, and improve the code.

In reality, this hardly happens. He showed graphs of projects on Sourceforge, the first major hub of open source software. The median number of developers working on a project is one. If you only look at “mature projects” (multiple releases, longer history), the median is still one. If you look at the most popular projects (10% of most downloaded), the median is two.

In other words: there are very, very few projects where mass collaboration did happen.

And we actually don’t really understand why some of them succeed.

It doesn’t mean we should not do open source, but we should not promote it with the story that it leads to peer review and better code. There are plenty of other reasons, though, and we should make sure we capture those in our Open for Change Manifesto as well:

It gives the users freedom: autonomy, control and empowerment. The technology constrains how and what we can communicate as people. Openness allows you to remove barriers.
It is resistant to “anti-features” (limitations built in to charge for removal). With an open license, anyone can process a data set to make it more useful for themselves or others.
It makes failure cheap: since the investment to be open is low, there already is reward in just making your own solution available.
It also makes success cheap: some products failed to have a big enough market to sustain a company producing it, but a community of users can produce and maintain it.
It is not dependent on persons or organisations: even if the original producer(s) stop working on it, others can continue and keep it available.
It sometimes does lead to mass collaboration. And it then can produce something that would be impossible to organise through traditional means.

Why we want to be open: a stronger vision

In our beta Manifesto, we tried to capture the essence of why we want to be open, and OKCon was a chance to reflect on it.

I liked a definition given by Jose Alonso of the Web Foundation: the web is humanity connected through technology. And as Brewster Kahle of archive.org said: the last generation put a man on the moon. Pretty cool, but our generation can make all knowledge available to all people on earth, for always and for free. That’s a powerful ambition too.

It is crucial to also translate the promise of open data, open access and open knowledge to “effective use”: how do we make sure we create autonomy, control and empowerment, but even more so: security for the ones who want to realise their “right to access”? “Open” is part of a struggle for human rights.

Hopefully, a joint “Slash Open” campaign can unite the efforts of many organisations working for humanity in shaping the technology we need and put it to effective use.