If you are publishing open data, you are usually doing so because you want people to reuse it. This guide discusses how you can maximise that reuse by engaging with the people you want to reuse your data.

Before Publication

Publishing open data requires time and effort. Before you invest those resources, you should have a clear idea of why you are publishing the data and of the different groups who may reuse it. This will help you to identify which of those groups will reuse the data in ways that support your objectives, so that you can target your engagement activities.

The open data you publish may be reused within existing applications and services. So, before publishing the data, try to identify existing applications and services that could make use of your data, and engage with their developers.

Try also to identify the researchers, community groups, or interested individuals who might get value from your data. And don't forget the people within your own organisation who might benefit from easier access to your data.

In this preparatory stage, you should be asking:

  • How could the reusers make use of the data?
  • What formats and mechanisms are easiest for the different reusers to work with?
  • What guarantees do the reusers need to start working with the data?
  • Are there any other barriers, such as poor data quality, that could prevent reuse?
  • What support will different reusers need to understand and make use of the data?
  • What capacity do you have to handle enquiries generated by reusers of the data?
  • What opportunities are there for collaborating with reusers of your data?

Having a conversation with existing and potential reusers of your data will help you to identify the best ways of publishing your data. However, you should make sure that the way you publish data does not exclude potential new reusers. For example, even if an existing reuser would like you to publish in a particular proprietary format, you should still aim to publish in an open standard format to meet the needs of future reusers.
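As an illustration of publishing in open standard formats, the following minimal sketch writes the same records as both CSV and JSON using only the Python standard library. The records and field names (station_id, date, rainfall_mm) are hypothetical and stand in for whatever your data contains.

```python
import csv
import io
import json

# Hypothetical records; the field names are illustrative only.
records = [
    {"station_id": "A1", "date": "2013-01-01", "rainfall_mm": 4.2},
    {"station_id": "B7", "date": "2013-01-01", "rainfall_mm": 0.0},
]


def to_csv(rows):
    """Serialise a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same rows as JSON text."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(records)
json_text = to_json(records)
```

Offering more than one open format from the same source is one way to serve an existing reuser's preference without excluding future reusers.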

Part of this engagement might include events that bring together potential reusers with those who understand the context, content and utility of the data. At this stage, these events should be targeted at addressing the questions above rather than the creation of applications or services.

Getting data curators and data reusers to talk to each other regularly can support ongoing improvements of data, help those in your organisation who are supporting access to the data to get the appreciation they deserve, and maximise the value gained from opening up your data both for your organisation and more widely.

Documentation

The reusers of open data need to understand the data you are publishing in order to work with it. To minimise both the demands on your time and the barriers to reuse, you need good documentation that allows reusers to make sense of your data without direct support. There are several important forms of documentation:

Metadata

Metadata is information about the data that you are publishing. This should describe enough of the context of the publication to help reusers to understand what it does and doesn't contain, how to interpret it correctly, and whether it's appropriate for them to trust it.

The metadata that you provide should include:

  • descriptive metadata such as a title, description, keywords or tags and so on, to help reusers understand what the data contains
  • legal information including the copyright statement and indications of the licence(s) under which the data is being made available, including required attribution and any other terms or conditions of use (which should be minimal to encourage reuse)
  • coverage information including its geographical coverage (which areas of the world it relates to), temporal coverage (which time frames the data relates to) and its completeness (which items are included or excluded from the data)
  • timeliness information including information about when the data was published, when it was last modified, how frequently updates will appear and the lag between the collection of the data and its publication
  • reliability information, including a statement of your intention to continue providing the data over time
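The five kinds of metadata listed above could be captured in a single machine-readable record. The sketch below is one possible layout, not a standard: every field name and value is a hypothetical example, and the missing_sections helper is introduced here only to show how a publisher might check a record before release.

```python
import json

# The five kinds of metadata listed above. The section and field
# names are hypothetical and do not follow any particular standard.
REQUIRED_SECTIONS = (
    "descriptive", "legal", "coverage", "timeliness", "reliability",
)

metadata = {
    "descriptive": {
        "title": "Example rainfall measurements",
        "description": "Daily rainfall totals from example weather stations.",
        "keywords": ["rainfall", "weather"],
    },
    "legal": {
        "copyright": "Example Organisation",
        "licence": "Name or URL of the open licence you have chosen",
        "attribution": "Contains data from Example Organisation",
    },
    "coverage": {
        "geographical": "Example region",
        "temporal": {"start": "2012-01-01", "end": "2012-12-31"},
        "completeness": "Excludes stations commissioned after 2012-06-30",
    },
    "timeliness": {
        "published": "2013-01-15",
        "modified": "2013-02-01",
        "update_frequency": "monthly",
        "publication_lag": "two weeks after collection",
    },
    "reliability": {
        "commitment": "Published until at least the end of 2015",
    },
}


def missing_sections(record):
    """Return the required sections that are absent or empty."""
    return [name for name in REQUIRED_SECTIONS if not record.get(name)]


# Serialise the record so that it can be published alongside the data.
metadata_json = json.dumps(metadata, indent=2)
```

Calling missing_sections(metadata) returns an empty list for the record above; a record containing only a descriptive section would return the other four section names.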

If there are existing manuals or other detailed documents that explain how the data was collected, and that offer guidance for using it, then you should consider publishing these.

Equipped with this information, reusers should be able to tell whether the data is interesting and useful for them.

Technical Documentation

Data usually contains a specific set of fields or columns and codes, which reusers need to understand to work effectively with the data. If it's served through an API or service, reusers will also need to understand how the API works.

The technical documentation that you provide should include:

  • format documentation about the data formats that you are providing, possibly including schemas for any vocabularies that you use
  • code lists that provide more details about each of the codes that are used within your data; one way to provide this information is to have a URL that provides documentation about each code and to link to that URL within the data
  • service documentation that describes the way any API that you provide works; this might include links to machine-readable service descriptions if applicable
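The suggestion above of giving each code a documentation URL, and linking to that URL within the data, might look like the following sketch. The base URL, the codes and the column names are all hypothetical, chosen only to illustrate the pattern.

```python
# Hypothetical base URL under which each code has a documentation page.
CODE_LIST_BASE = "https://data.example.org/def/station-status/"

# Hypothetical code list: each code maps to a human-readable definition.
CODE_LIST = {
    "OP": "Operational: the station is reporting measurements",
    "MA": "Maintenance: the station is temporarily offline",
    "CL": "Closed: the station has been decommissioned",
}


def code_url(code):
    """Return the documentation URL for a code in the code list."""
    if code not in CODE_LIST:
        raise KeyError(f"Unknown code: {code}")
    return CODE_LIST_BASE + code


def link_codes(rows, field):
    """Return copies of the rows with a '<field>_url' column added."""
    return [{**row, field + "_url": code_url(row[field])} for row in rows]


rows = [
    {"station_id": "A1", "status": "OP"},
    {"station_id": "B7", "status": "CL"},
]
linked = link_codes(rows, "status")
```

Rejecting unknown codes at publication time, as code_url does here, is a design choice that keeps the published data and the code list in step.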

Equipped with this information, reusers should be able to understand the data that you're publishing and how to create applications that use it.

Support

Engagement with reusers is a two-way street, and you should provide methods for reusers to get back in touch with you. You should also try to enable reusers to help each other.

The support you provide should include:

  • a data contact for questions reusers might have about the data or issues that they're having using it, including bugs that arise in your provision
  • a forum where you recommend reusers pose questions to other reusers, so that they can help each other; this doesn't have to be a site that you manage: one option is to point reusers to Get the Data as the forum of choice
  • a contact route for error reports to enable reusers to provide corrections for problems they spot within the data; the information you provide should set expectations about how those reports will be dealt with, and you should indicate how existing reusers are informed of these corrections
  • a contact point for any confidentiality concerns that reusers have, if your data contains or is derived from personal information

Services

If your data is important and complex, it will help reusers if you can provide tools, such as plugins, visualisations, software libraries and services, that enable reusers to build on others' work with the data. You might find that these tools get built by third parties.

You should have a page that lists these tools, and indicate how additional tools created by third parties can be added to the list within that page.

Active Outreach

Understanding and tracking reuse of your data can help you to plan improvements and to make the case for further data releases. You should consider:

  • maintaining an active presence on social media
  • using search engine alerts to find out when your data is mentioned
  • blogging to showcase good examples of use of your data
  • arranging hackdays or competitions to encourage the use of your data

Explicit community building, engagement and outreach work is not worthwhile for every dataset, but in some cases it can help to maximise the value that your data brings to your organisation and to others.

Further Reading