Helsingin Sanomat provides guidelines to journalists on use of open data
By Esa Mäkinen
Helsingin Sanomat has issued its journalists with guidelines on the use of open data in day-to-day editorial work.
The objective of the instructions is to bring open data more prominently into the work of the various desks on the paper.
Open data in this context can be, for instance, a database or dataset published by a public body or authority and made available on the Internet for anyone to use, rather than merely to read.
Datasets already published in this way include the National Audit Office's list of election funding received by elected Members of Parliament and their deputies, disclosed under the rules established in 2009, and the public transport timetables and journey planners published by Helsinki Region Transport (HSL).
What makes open data "open" is the fact that the information is in machine-readable form: it can easily be handled using, for instance, spreadsheet software such as Microsoft's Excel or its open-source alternatives.
Open data is released under a non-exclusive licence, so it is possible to build new applications and services on the original body of material.
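As a rough illustration of what "machine-readable" means in practice, the short Python sketch below reads a small table and adds up the sums per person. The column names (candidate, donor, amount_eur) and all the figures are invented for this example and do not come from the National Audit Office data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the spirit of an election-funding disclosure list.
# Column names and figures are invented for illustration only.
SAMPLE_CSV = """candidate,donor,amount_eur
Candidate A,Donor X,1500
Candidate A,Donor Y,2500
Candidate B,Donor X,3000
"""


def total_funding_by_candidate(csv_text):
    """Sum the donation amounts per candidate from CSV text."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["candidate"]] += int(row["amount_eur"])
    return dict(totals)


totals = total_funding_by_candidate(SAMPLE_CSV)
for candidate, total in sorted(totals.items()):
    print(f"{candidate}: EUR {total}")
```

The same few lines would be far harder to write against a scanned PDF or an image of a table, which is the practical difference between data that can be used and data that can merely be read.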
The basic principle in the HS guidelines is that opening up data for reworking should increase its value to society and to the end user.
In future the newspaper will also seek where possible to publish background material pertaining to its own articles as open data, thereby providing users with the opportunity to re-use the content and inspect its veracity.
Helsingin Sanomat will also be looking to make downstream use of information and content generated by others. One example of this already in existence is the easily-readable display of election funding sums paid to individual MPs, to be found at www.vaaliraha.com [in Finnish], which is based on the original National Audit Office data.
Another is the popular iPhone route-planner application ReittiGPS (available in Finnish, Swedish, and English), built on the back of the HSL timetable and routing data.
In the United States and Great Britain, public sector authorities and the media have been applying the principles of open data for some time.
Governments in these countries have also published their databases in machine-readable form.
A case in point is the United States government's Data.gov site, which has released thousands of datasets from various authorities in this fashion.
In Finland, too, the government has woken up to the idea of activating information from various authorities as open data.
Earlier this year, the government approved a decision in principle that will further the efficient and free distribution of public information as open data in the future.
A practical example is the National Board of Patents and Registration (PRH), which currently gathers a great deal of information on companies in Finland and sells it on to third parties.
If the government decision is ratified, by 2013 PRH would be obliged to pass this information on free of charge.
As it stands now, the state collects around EUR 40 million a year by selling information.
There is nevertheless a wish to make this information accessible to all without charge, because it is believed that freely available data could spur the development of commercial applications generating as much as hundreds of millions of euros in revenue.
Guidelines for the use of open data within Helsingin Sanomat
Helsingin Sanomat uses open data created by other parties in its journalism and publishes background data on its own studies as openly as possible.
Herewith some guidelines for the use of open data in the newspaper:
1. Our basic principle is that the opening up of data for reworking should increase its value.
2. The term "open data" in this context refers to machine-readable information published by state or local officials, companies, or associations and the like, which is distributed via the Net and which is subject to a non-exclusive public licence (e.g. Creative Commons) for its subsequent use.
3. Helsingin Sanomat will also publish as open data background material to its own articles and data relating to services, wherever this is possible and necessary.
The intention is to provide parties outside of HS with an opportunity to re-use and rework the information and to inspect the veracity of data.
4. External producers of data will be requested to provide background material on research and surveys as open data in order to ascertain that interpretations drawn from the studies are accurate and valid.
If this information is not made available, it may be necessary to state as much to the readers (e.g. "HS did not receive the background data [and methodology] pertaining to the study in spite of requests to this effect").
5. Surveys published by and in the newspaper will where possible be worked up as visualisations and interactive graphic reports.
The intention is to provide users with an opportunity to shape and rework the content and in so doing to collect users' opinions, analysis, or comments.
6. In developing new services, there will be a conscious effort to create APIs (application programming interfaces) and back-end interfaces for administrators that will facilitate the opening up of gathered data for further use.
7. Where possible, data will be published in a section of HS given over to open data (at present, the HS Next blog pages).
8. Data will be published as and where possible in open formats and with suitable licences for re-use. The formats will favour open interfaces and CSV files, in which numbers and text are stored in plain-text form that can be easily written and read in a text editor.
9. In publishing any data it is necessary to be particularly careful not to violate the terms of the Personal Data Act or copyright legislation, or to infringe on personal privacy.
Particular caution should be exercised in the publication of such matters as precise dates of birth, religious persuasion, criminal records, incomes, personal wealth, sexual orientation, and ethnicity.
10. Registers comprising personal data from private individuals may not be published as such, but rather the personal data must be rendered anonymous prior to release.
If personal data are to be published, there must be a strong journalistic justification for this.
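Guidelines 8 and 10 can be pictured together with a minimal Python sketch: a register is stripped of names, dates of birth are reduced to the year, and the result is written out as plain-text CSV. The records, field names, and the choice of which fields to drop are assumptions made for this example, not a description of the newspaper's own tools, and removing names alone may not be enough to make a small dataset truly anonymous.

```python
import csv
import io

# Hypothetical register rows; names, fields and values are invented.
RECORDS = [
    {"name": "Person One", "date_of_birth": "1975-03-14",
     "municipality": "Helsinki", "response": "yes"},
    {"name": "Person Two", "date_of_birth": "1988-11-02",
     "municipality": "Espoo", "response": "no"},
]

FIELDS = ["birth_year", "municipality", "response"]


def anonymise(record):
    """Drop the name and keep only the year of birth."""
    return {
        "birth_year": record["date_of_birth"][:4],
        "municipality": record["municipality"],
        "response": record["response"],
    }


def to_csv(rows):
    """Serialise rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv([anonymise(record) for record in RECORDS])
print(csv_text)
```

The output opens in any text editor or spreadsheet program, which is the reason the guidelines favour CSV.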
Helsingin Sanomat / First published online 10.10.2011