In association with heise online

Free scientific data?

Research results and raw data present an entirely different problem. Since many research institutions are now expected to raise part of their own funding through third-party sponsorship, patents or licensing of their research results, many results of publicly funded research go straight to commercial enterprises. In the US, for example, there is an official obligation to make government-funded research accessible to the public. In reality, however, this means that "expensive" research in fields like medicine and pharmaceuticals is simply transferred to the private sector.

In principle, raw data can't be protected by copyright - at least, that's the theory. In practice, data collections are far from easy to download from the internet, and some of them are even protected by patents.

The most important point of access for scientific open source material is therefore the Creative Commons offshoot Science Commons, which currently focuses on the biosciences and neuroscience. The idea behind Science Commons is to provide free access to research data and databases. Agreements are put in place so that research institutions can quickly exchange cell cultures, DNA and other biological materials without months of license negotiations.

The Australian BiOS initiative, which is part of the Cambia project, has set out to apply open source principles and licenses to molecular biology. Cambia is partly funded by the Norwegian Ministry of Foreign Affairs and maintains both a large body of information material and the BioForge website – the initiative's download site whose name was deliberately chosen to resemble

Another project providing free bio data is the BioBrick Foundation, founded mainly by Drew Endy and Thomas Knight, who want to apply the advantages of free software development to biotechnology. The BioBrick Foundation offers its DNA samples - boldly called "Open Wetware" - to bioengineers under a free license. The underlying idea is to stimulate leaps of innovation in biotechnology by promoting "genetic garage engineers", much like the proverbial IT startups in parents' garages.

BioBrick Foundation: biotechnology in open source style

Free data from public authorities

The social sciences also deal with raw data: unemployment statistics, population change and environmental statistics are just a few examples of areas where researchers need access to raw data collected by public authorities. This data is available in vast quantities, since practically every authority in every country has the official task of collecting data and producing statistics.

In the UK the main organisation is the Office for National Statistics (ONS), which offers information about productivity, retail sales, population, employment and much more, including links to other sites holding statistics. The terms of use for National Statistics material require a "customer" (sic) to apply for a Click-Use Licence, the details of which are given on the web site. The key point, however, is that the source should always be clearly stated when the data, which are considered Crown copyright material, are used. Eurostat, the EU office of statistics, takes a slightly more relaxed approach and uses a highly non-specific "license" for its online data: "Reproduction is authorised, provided the source is acknowledged, save where otherwise stated." To be on the safe side, interested amateurs or those looking to mash up unemployment statistics are therefore advised to check with the respective office.
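For anyone who does want to mash up figures from several offices, one practical way to honour the "source acknowledged" condition is to keep the acknowledgement attached to the data itself. The following is only a minimal sketch of that idea in Python: the class name, the function name, the licence wording and all figures are made up for illustration and are not taken from ONS or Eurostat material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcedSeries:
    """A small data series that always carries its source acknowledgement."""
    title: str
    values: dict        # period -> figure; all figures below are placeholders
    source: str         # name of the publishing office
    licence_note: str   # illustrative wording, not an official licence text

    def attribution(self) -> str:
        """Return the acknowledgement line to print next to the data."""
        return f"Source: {self.source}. {self.licence_note}"


def merge_series(a: SourcedSeries, b: SourcedSeries):
    """Join two series on their common periods and keep both credits."""
    common = sorted(set(a.values) & set(b.values))
    rows = [(period, a.values[period], b.values[period]) for period in common]
    credits = [a.attribution(), b.attribution()]
    return rows, credits


if __name__ == "__main__":
    # Placeholder numbers only; real figures must come from the offices.
    uk = SourcedSeries(
        title="Unemployment rate, UK (placeholder)",
        values={"2008": 1.0, "2009": 2.0},
        source="Office for National Statistics",
        licence_note="Crown copyright material.",
    )
    eu = SourcedSeries(
        title="Unemployment rate, EU (placeholder)",
        values={"2009": 3.0, "2010": 4.0},
        source="Eurostat",
        licence_note="Reproduction authorised with acknowledgement.",
    )
    rows, credits = merge_series(uk, eu)
    for row in rows:
        print(row)
    for credit in credits:
        print(credit)
```

Because the acknowledgement travels with each series, a combined table cannot be produced without also producing the list of credits. Whether a particular dataset may be reused at all still has to be checked with the office concerned.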

For historical reasons and due to the data protection issues arising in connection with the collection of data, however, we cannot simply fly the "All data on the internet" banner: in many countries, data collected by public authorities is regulated by statistics acts and social legislation – the European social system has collected data since the 19th century and not just since the invention of the internet. French police and Prussian civil servants were already collecting data centuries before anyone ever dreamed up the concept of open source.

Citizens not involved in research can still download many free collections of raw and interpreted data. In general, laws such as the UK's Freedom of Information Act 2000 now also give everyone better access to data and information. However, finding most of the politically and socially relevant data means researching the respective institutions thoroughly, since licenses and access conditions are often hard to find. Central databases? User-friendly APIs for tables? These will probably remain daydreams for another few decades.
