This report sets out the findings of research to define responsible data stewardship undertaken by the Open Data Institute (ODI) between June 2022 and March 2023, supported by the Patrick J. McGovern Foundation.
The right kind of access to data is vital in tackling the big challenges we face in society – from the earlier detection and treatment of disease to reducing pollution in urban spaces. However, data and related technologies can also cause harm, including through automating decisions that need a human touch, or embedding existing biases and inequities.
In response to rising awareness of the harms and negative impacts of data, different concepts have emerged that represent new ways of thinking about how it should be used, including ‘data ethics’ and ‘data justice’. One of the concepts we’ve used at the ODI is data stewardship, generally describing it simply as ‘the collection, maintenance and sharing of data’. In our work on data institutions in recent years, we’ve expanded on this by suggesting that these organisations are stewarding data ‘on behalf of others towards public, educational or charitable aims’.
However, at the outset of this research, our understanding of what makes for good data stewardship was largely anecdotal and based mainly on our experiences with specific use cases. We set out to develop and articulate an interpretation of responsible data stewardship that adds a normative element to our description of data stewardship. This definition provides a more critical lens that we can use in our work to help others design and practise data collection, use and sharing. We understand responsible data stewardship to be:
an iterative, systemic process of ensuring that data is collected, used and shared for public benefit, mitigating the ways that data can produce harm, and addressing how it can redress structural inequalities.
- Iterative: To us, responsible data stewardship is a negotiated and reflective process. Because contexts vary and change over time, mitigations and approaches to collecting, maintaining and sharing data need to constantly evolve.
- Systemic: The impacts of data collection and use are rarely fully within the control of any one organisation, so organisations need to develop a systemic view of their data practices that links the choices they make around data to their impacts outside the organisation.
- Public benefit: Stewarding data responsibly involves ensuring it’s used and shared for the benefit of others, rather than only for the benefit of the organisation that holds it.
- Harm: Alongside seeking positive impact from the use of data, responsible data stewardship involves identifying and reducing harmful impacts to individuals and communities, often going above and beyond legal requirements around privacy, security and transparency.
- Redress structural inequalities: Data stewardship always occurs within a wider system of relationships, value exchanges and power imbalances, which have real-world consequences for data. Responsible data stewardship may involve meaningful new communication with data subjects and other stakeholders, or adopting alternative forms of governance.
Key findings from the research
In this report we discuss some of the prominent concepts we came across in our work, including ‘data stewardship’, ‘data ethics’, ‘data justice’, ‘data for good’ and more. We discuss how these narratives have helped people to think about how data should be collected, used and shared.
We explore the concept of ‘responsible data’, and how it’s used by civil society organisations, organisations stewarding data, governments, international NGOs and large technology organisations. We conclude that it’s used inconsistently and with varying depth of thought, and that its meaning differs significantly based on geographical and cultural context.
But what does being responsible with data mean in practice? We investigate different ways that the concept has shaped behaviours, policies and processes in the real world. We found examples of efforts designed to protect privacy, address biases and enable participation in data practices. We also document interventions that have been developed to support and compel organisations to use data responsibly – including frameworks, technologies, training and legislative action.
We recognise that there is an appetite for more practical guidance and support in this area. Our research has highlighted that, in isolation, high-level principles can be difficult to apply and therefore may have limited impact. In our next phase of work on responsible data stewardship, we will explore the interventions that we could make to support organisations and operationalise the learnings of this research. We intend to put particular emphasis on concepts such as ‘public benefit’, ‘reducing harm’ and ‘redressing structural inequalities’, exploring what’s needed to practise them in different organisational, geographic and sectoral contexts.
With this research, we want to bring people together around how to think critically and responsibly about data and digital technologies when addressing ecological, democratic and societal concerns, so that data and technology are used to create a world where data works for everyone.
Finally, we are keen for feedback on this work. We’re interested in any emerging normative concepts and language we may have missed in our research, efforts to practise responsible data stewardship in the real world and views on our interpretation of responsible data stewardship.
If you have any thoughts or feedback on this work, or would like to collaborate on the topic of responsibility, please get in touch via email at firstname.lastname@example.org.