As part of the Data Decade, at the Open Data Institute (ODI), we are exploring how data surrounds and shapes our world through 10 stories from different data perspectives. The eighth, State of the Data Nation, asks ODI co-founders Sir Tim Berners-Lee and Sir Nigel Shadbolt about developments in tech and AI in the past 10 years.
Listen to the podcast now
This is Data Decade, a podcast by the ODI.
Emma Thwaites: Hello, I’m Emma Thwaites, and welcome back to Data Decade.
Across the series we’ve been looking at the last 10 years of data, the next decade ahead, and the transformational possibilities for data in future.
We’ve explored how data is shaping our cities, how we can trust the data that’s reported in the media, and the impact of data use in health research and clinical care. But in this very special episode, recorded at the ODI Summit 2022, you’ll hear from our keynote speakers, the founders of the ODI, Sir Tim Berners-Lee and Sir Nigel Shadbolt. With host Navdip Dhariwal, they’ll discuss the last 10 years since the formation of the ODI, how data use has changed during that time, the role that data plays in the current global political and social climate, and their hopes for how data will shape our lives in the next 10 years.
Recorded at the ODI Summit 2022, this is Data Decade.
Navdip Dhariwal: I’m delighted that the ODI’s founders Sir Nigel Shadbolt and Sir Tim Berners-Lee are here with me. So first, Sir Tim, 10 years on from the founding of the ODI, and you’ve just won a peace prize, the Seoul Peace Prize. Congratulations on that recognition of your work in data sovereignty. You pick that prize up on Friday.
Tim Berners-Lee: Yes, it’s exciting, exciting. It’s a prize for, specifically for data sovereignty, specifically for looking at the empowerment of individuals in this world of data.
Navdip Dhariwal: Great, brilliant news. It’s the Data Decade, we’re obviously marking that this year. What do you think have been the most exciting developments that we should be celebrating over the last 10 years and anything in this last 12 months that we should be pointing out?
Nigel Shadbolt: Well, the last 10 years. I mean, I think the ODI can be very proud of its involvement in trying to understand what building a trusted data ecosystem actually means.
And we started off that journey very much around the idea of trying to think about government data, data that was collected in our names that was often about public services. It was often about data that didn’t identify individuals, it was about when the trains run, it was about the hospital rates of infection, of hospital acquired infections. It was about our postcodes, it was about how transport works.
All that data has been released through time now and is a great basis for innovation and use. But we recognised in the ODI that data – we use this terminology – is a spectrum, from open through shared through to closed data, which we’re very much more solicitous about, careful about.
And I think this balance between the ODI’s agenda and understanding how personal data plays into the mix of public data, how public and private data from different organisations play into one another, and how we need to think about the technology and the institutions we need to steward that.
That’s been a very rich process of discovery and innovation, not just at the ODI, but I think throughout society across the world.
Navdip Dhariwal: Tim, your view on that, you know, how would you sum up the last 10 years and any developments in the last year?
Tim Berners-Lee: Well, I suppose the way the Data Spectrum has been filled out towards the private end. People used to worry about their personal data, their medical data or something, being stolen or abused or made public, and worry about privacy.
But in fact, the power to use your data – the ability to have discussions sharing your medical data with your family and your doctor in the same space, for example, where you can talk about diagnosis and things. That power to use data is, I think, one of the things people are starting to really realise.
On the technology side, the web protocols were originally just about hypertext pages, and then websites where your data is stored in the backend by the company, in a data silo where you don’t have control – Web 2.0, as we called it. We’re moving on now to something rather more 3.0 with the Solid protocol. You can have a Solid Pod, with Solid Pods rolling out in Flanders, for example, by the end of this year.
The BBC are using it currently, in live use on BBC Taster, so Solid Pods are a thing. And that technology is like the third layer of the web, which we were missing before. It allows you to have a Pod where you are in control – so different from the history where the data was stored in the backends of different social networks but you couldn’t really use it. Now, it’s very empowering, and it completes the Data Spectrum.
Navdip Dhariwal: You talked about empowerment there. Would you say that was the key development over the last 10 years?
Nigel Shadbolt: I think we’ve always seen data as fundamentally about empowering. The question is who and what. And I think the issue that we’ve had is trying to return some self-determination, some agency, to individuals as consumers or citizens, and to think about collectives, groups who might have an interest.
Now the platforms and the corporates, the large organisations who have become receptacles of huge amounts of data, have of course provided a whole range of services that we have come to love, and sometimes be somewhat concerned about. And I think working out that space – how we regulate it, how we think about the benefits that arise, what technical architectures as well as institutional architectures we will need – is how we kind of embrace the potential.
And of course, the other thing that’s changed in the last 10 years. I mean, AI – artificial intelligence – has been around for decades. But the particular power of some of the modern methods in machine learning which consume huge amounts of data have become, I think, a really powerful and critical feature of our data ecosystem. And we’ve got to work and think about what data is being used to train those systems? Where does all of that data sit? How do we spread the benefits of these very large models that are being built now?
And so there are, there are huge opportunities, and I’m sure we’ll talk about them in the context of some of the challenges that are facing us in the 21st century. But there are real issues that we’ve got to be concerned about. And the Open Data Institute is about convening those conversations.
Navdip Dhariwal: And this summit is taking place, obviously, as COP27 convenes and world leaders meet to tackle probably our greatest challenge, climate change. How do you see the role of data now, at this critical point in our lives, when it’s become quite a scary phenomenon that we’re facing?
Tim Berners-Lee: Well, for good or bad, the flip side of it is, as things get worse, people do tend to end up becoming, I suspect, more data literate. With the pandemic, for example, it was all about flattening the curve. And people wanted to know, what is the infection rate in this country or in this town?
And so suddenly, people are talking about using numbers a lot more, they’re getting used to graphs, getting used to comparing the graph of infections in Britain with the one in France and so on. So, to a certain extent, the crisis, understanding the crisis and dealing with it, involves looking at understanding data, governments looking at and understanding data.
And of course, climate change, absolutely – it’s about much more, in a way that is much more complicated, or much more simple: can we get the rise down to 1.5°? So obviously, the science behind the climate change predictions, behind the models, is really complicated. Obviously, the open data around there is really, really essential.
Nigel Shadbolt: Yeah, I think climate, of the many challenges we’re facing, it does give us this great use case where there’s going to be pretty general agreement, one would hope, about the importance of getting high quality data at planetary scale.
And it needs to be quite real time. There was a panel just before us worrying about real-time data, because things are changing so fast. You know, the historical data is important, and it’s provided a hugely powerful platform on which to understand just how fast things are changing. But there will be an increasing need to instrument the environment to understand our impact. And that’s crucially a question about building a trusted data ecosystem.
It’s about the free flow of information in the scientific and engineering world and the policy world, to understand what is happening and what we might do about it. So I think it’s crucial. People are mobilised, so we have this opportunity to also include people at scale, citizens at scale – the whole citizen science movement, where you’ve got people who can take direct measurements of what’s happening in their environment and add that to other forms of scientifically acquired data. Crucial.
Navdip Dhariwal: And Nigel, what role does AI then play if we’re looking at this bigger picture around data? And crucially around solutions?
Nigel Shadbolt: Well, clearly, at the very large scale, with the planetary models we’ve got, there’s a huge amount of serious research about building models that are fine-grained enough, but also inclusive enough, to see those unintended consequences. I mean, the climate environment is a deeply coupled system, and changes in one part can have consequences in all sorts of unexpected areas.
And we might be surprised, in many respects, how much of the planetary ecosystem is still work in progress – you know, the effect of clouds, the effect of important, massive parts of our environment, and the data we have on that to enter into these models. Colleagues in Oxford, and all around the world, are working on improving these models, and they’re also in need of serious amounts of compute power. What we do know about these systems is that they are demanding of very large amounts of data and serious compute power. And I think we have to imagine how we make common cause on that. As we saw in the pandemic – which is another great use case – thinking about the data you will need for future emergencies is also crucial.
Navdip Dhariwal: Yeah, I mean, that’s something I did want to talk to you about – you both mentioned the Covid pandemic, and what we’ve learned in the way we’ve learned to use data publicly. Do you think we’re prepared, or have we taken enough lessons from that, for future pandemics, which are now being, you know, predicted?
Tim Berners-Lee: I don’t know. I think what we have learned during the pandemic is that you need the internet, to be connected, to be functional, to work at home. And so to a certain extent, being connected and having computers is maybe now something which either you have or you don’t, and you can work at home or you can’t. So hopefully, will those people have learned enough to be able to apply that? I hope so. I hope so.
Navdip Dhariwal: What about governments? Will governments have learned, taken tools away from this whole experience for us to make better decisions in the future?
Tim Berners-Lee: I have no idea.
Nigel Shadbolt: You’d like to think there were. What is the after action review? That’s the kind of terminology you often hear in business. So what have we learned through the experience? And let’s think about it right now, so that we don’t forget, or we don’t just move on to the next thing.
And that is something that’s important. Again, you know, working with colleagues, we’ve been thinking about this whole notion of pandemic preparedness, you know, what might we need to know? And I think we have adjusted some of our assumptions. It was fascinating to see how the best modellers in that area realised that there were whole bunches of factors, many of them to do with our behavioural dispositions, how we acted in crowds. Tons and tons of stuff simply wasn’t known to inform the models about how the virus might spread, how people will get together.
Tons and tons of data where people say, “Hey, it’d be really useful if we had some notion of what the base statistics were in these areas.” So I think that’s been a learning lesson. But more broadly, in areas like scientific discovery and drug discovery, which have been helped and empowered by modern AI methods, and lots and lots of analytics looking at the effectiveness of vaccines, and at how to redeploy existing drugs in a new context – all of that is needing us to link data at scale in a different way.
And I think one thing that’s changed is our willingness to understand that in an emergency, our presumption that ‘everything is private to me’ may have to be challenged – that there is a public good, as well as a private interest in how your information is shared. And I think that we’ve understood that through the pandemic.
Tim Berners-Lee: So in practice, of course, for those working on the Solid ecosystem, those who had Solid Pods, when the pandemic came along the question was immediately how you could track whether or not – the government would have loved it if they could roll out a Solid app. I wrote a Solid app to track my own vaccine level, temperature and symptoms in sort of one afternoon.
And so the good news is that in places where they’ve got rolled out – so for example, in Flanders, the Flanders government will be able to roll out things which allow people to track their own status and publish their own status. I think with governments in Europe, and the UK to a certain extent, there’s more trust than in the States. And so if the government says, “Look, I need 1% of the population to give us your daily temperature reading, and your cough symptoms, and a couple of other things, because we need to very rapidly track this,” I think some people will then volunteer, and you’ll be able to put together a dataset which will be live and very fast, and roll that out – so that in a way makes us very much more flexible and powerful.
Nigel Shadbolt: And we’ve got great use cases from countries who sought to do this integration. We’ll be hearing from the minister in Taiwan, where various sorts of data are integrated in the face of the pandemic.
We see it in other ways. It’s not necessarily about having all the data in a local repository. Sometimes it’s about taking your large scale NHS data and linking it – and doing that in a trusted and safe environment. One of the friends and supporters in the past, Ben Goldacre, has developed a system in the UK, OpenSAFELY, which, with a whole bunch of researchers, allows us to keep the data in trusted, secure environments, enclaves, and send the models, the analysis, into the data and get the results back.
And, again, that was a very powerful I think revelation about what data linked at scale in a way that people could trust and be assured of, what that could lead to. So there’s a whole range, I think, of different architectures that we’re beginning to see emerge and innovate around that, allow us to balance out these interests in data.
Navdip Dhariwal: One of the other stories, obviously, that’s dominated the headlines over the last year or so is the war in Ukraine. There has been lots of innovation, and there were lots of threats at the start of the war which have not necessarily come to fruition. What do you think the key learnings have been from what we’ve seen unfold in the data world?
Tim Berners-Lee: Good question. To a certain extent, unlike the pandemic, which is something where data is all over the map, with the Ukrainian war there’s very, very little. You could spend hours burrowing into the bits of news, and you can find a video clip from somebody taken on the streets of Kyiv, or something. So basically, for normal people, there’s very, very little data. And I’m sure in the military, there’s a lot of data.
Nigel Shadbolt: It’s interesting to see how open source intelligence has played a role in that, though. Where, certainly on the Ukrainian side, you see people collecting and geolocating data about damage to hardware. And that at scale gives you a sort of situational awareness. Now, of course, the militaries will have their own capabilities in these areas. But it’s very, very interesting to see how these open sources of collection, certainly on the Ukrainian side, have been widely shared, and have been a cause for some hope and optimism, in some cases.
I think you can also see that in humanitarian contexts as well, with relief. And of course, the other side of that – the kind of yin and yang of that – is, of course, there’s been an extraordinary battle of disinformation as well. So data being used in this whole range of ways is hugely, hugely problematic.
Navdip Dhariwal: Just going back to that idea about the Ukraine war and Russia and the void of information. There were also many predictions at the start of the war of cyber attacks and so on, which haven’t taken place. Do you see that as a positive?
Tim Berners-Lee: I don’t know that we know, sitting here, that they haven’t taken place. I certainly would have expected them to have taken place. It’s just the sort of thing that we don’t get told about.
Nigel Shadbolt: I mean, people say it’s the dog that didn’t bark, and da-da-da. But I mean, you can’t know the amount of preparation that’s gone in to harden and make systems more resilient against that kind of process. And I think that in civilian infrastructures, there has been attention really paid to that.
And, as we know, sadly, at times of conflict, huge amounts of innovation happened in these spaces. So if anything, what we can hope to see is some of those lessons being taken over to think about improving the resilience of everything from the software that runs our water supply, to our power systems. And I think that the people who are on the inside of this and are tracking that and understanding it are busy, there’s no question about that.
Tim Berners-Lee: And of course, in the States, there are the midterm elections. So the question is, to what extent have there been cyber attacks which will influence the course of those elections?
Navdip Dhariwal: Another thing to influence the course of their actions is obviously our freedom of speech. And we’ve seen, obviously, in the last couple of weeks, Elon Musk taking over Twitter, and many people raising concerns that we have all this data, all this information about us, which has been used within Twitter, but is also now in the hands of possibly one individual who is determining how that path continues.
Tim Berners-Lee: The people that run Twitter have access to that information, but they also have access to the algorithms – “What colour is the like button? What is the workflow? How does the app actually work when you’re retweeting?” and so on.
And things like that – when you look at the misinformation out there, a lot of it is conspiracy theory, which spreads like wildfire through the brush. But the people who run the brush can put water on it: build systems like Twitter so that in fact people are slightly less inclined to retweet stuff which is nasty.
People have talked about all kinds of different ways. So in fact, a very powerful thing that one could do, if one was running Twitter, would be to try to make it more constructive – to emphasise constructive rather than nasty interaction.
Navdip Dhariwal: Well, that’s a major challenge, isn’t it for the social media platforms?
Nigel Shadbolt: It is. And the question then is this kind of balance between automated bots, essentially, and real people; what the accountability ought to be on those individuals; what counts as appropriate behaviour. And in this balance between free speech and guidelines for civil discourse, there is some real sense that some of the civics of our discourse has collapsed.
Now, people will lay the blame in different places for that. But certainly the way we engineer our systems, the way we align incentives, is – and we thought about this in many different contexts – part of a really important understanding. When we talked about the field of web science, it was to understand what these emergent properties are. When I’m engaged in a piece of persuasive design – and that’s a term of art, you know, to try and have the user stay on the page – what are the limits around that? And we’re very rapidly into general discussions around data ethics and the ethics of appropriate use of AI.
And I think we’re right to be concerned about that. The question there will be: what’s the role for regulation or governance in that context, as opposed to an absolute owner’s privilege to change the rules?
Navdip Dhariwal: And understand them.
Nigel Shadbolt: Indeed.
Navdip Dhariwal: Okay, so just some questions here from our audience. The notion of data, as a public good is so much needed and being explored in academia. And yet, people trust large platforms, going back to the discussion we’ve just had, more than governments to hold and process their personal data. Do we know why?
Tim Berners-Lee: Well, I suppose I feel I know both the US and the UK – I was brought up in one, I brought up two kids in the other. And they’re very different from that point of view.
So in America, in kindergarten, you are taught that the Constitution was created so as to allow people to control the government. It’s all about keeping the government in its place. That’s why you have the right to bear arms, in case you might have to create a militia. The spirit of the way the Constitution is put together is that there are checks and balances on the government. As a kid, you’re brought up to distrust the government. And that, in a way, is healthy. But it means that if you’re going to set up systems, as in the pandemic, people will trust the large companies.
Whereas come back to Europe, and the Germans looking at, well, should they trust the big American pharmaceutical company – no way! They’ll say “No, we prefer to trust our government because our government we elected, whether they’re good or bad, we don’t want-” So the sort of attitude in Europe is sort of “Government is a chore- being a government person is a chore. Somebody has to do it. I’m glad I’m not doing it. So I better trust them. And I will trust them more than- Why should I trust a large American company?” So to a certain extent, the culture of whether you trust the government is-
Navdip Dhariwal: Well we’re also in a much more global world, especially when it comes to platforms and sharing information and so on. Do you think that that sort of, you know, cultural gap that there is between Europe and the US exists in that virtual world?
Nigel Shadbolt: Yeah, to some extent. I mean, I think, just to be clear, there is evidence as well, when we say that, “Oh, the Brits don’t really trust the government either with their data and much rather leave it on a platform.” I mean the ODI has done research on this in the past. And it depends what categories of data, and it depends in what context. So broadly speaking, there is quite a high level of trust in certain classes of information held within a public sector body.
I think that you might find different reactions in different geographies and different jurisdictions, of course, and for the kinds of very cultural reasons that Tim was saying. And then of course, that’s just getting outside of the US/UK context into different European contexts, into Asia, you will see different models of response, different permissioning of what the government believes it should or shouldn’t do in an individual’s life.
So it’s pretty crucial that we have a constant way of understanding reasonably reliably what our perceptions are and what our trust levels are. Because, you know, trust, once lost is hard to win back very often. And we still live with some of the fallout, I think, of previous experiments where the government said, “we know this is good for you, we will do this to your data.” And I think that it’s much more about building – as we did, to some extent, I think, in the pandemic – a coalition of the willing. That this is actually a contract between citizens and their governments, between individual consumers and the companies they contract their services from.
Navdip Dhariwal: So just going back now, continuing with the role of big tech, someone’s asked: does the dominance of big cloud providers pose a threat to how data is used and guarded? What is the role big tech has to play in governance? And what do we know about data and digital rights when there’s so much power imbalance?
Tim Berners-Lee: That was a bunch of questions rolled into one.
The cloud – I think it’s very easy when you build a system to just go and grab a bit of cloud from one of the big providers, where there are quite strong monopolies. It’s very easy to do that and not think about where the physical computers are. But from the point of view of regulations like GDPR, then, depending on what sort of data it is, should you be worrying about whether this bit of cloud is sitting in a computer in some country where, if the police tried to subpoena that data, the people who own the building where the computer and the discs happen to be might not resist? Will they have the ability and the spirit to resist the police? Because those things are very different between jurisdictions. So just sort of talking about the cloud as some big thing, as though it doesn’t have a place-
Navdip Dhariwal: People think of the cloud as being somewhere else, don’t they? They don’t really recognise it as being part-
Tim Berners-Lee: It is somewhere else, but it’s somewhere.
Nigel Shadbolt: And people are talking about data sovereignty now, in terms of cloud provision, so that actually, it may be in some sense, diffused out there. But it’ll be diffused out there in a jurisdiction X, and how will you guarantee it won’t leave those borders? Those will become- I mean, in this kind of whole discussion around the connected global world, are we seeing a disaggregation of those assumptions? Are we seeing more kinds of barriers being erected? I think those are the kinds of things that we should be concerned about. Have those conversations.
I think, on the other hand, some of the very large platforms are now in possession of such extraordinary compute power, such large amounts of data, and prodigious machine learning algorithms. If we think of the developments in companies like Google, the so-called large language models or foundation models really do exist in a relatively small number of companies. And the issue of equitable access, of being able to interrogate the basis of those models, their strengths and weaknesses – that’s going to be a concern, and a conversation we’ll have to have on quite a broad basis. And there’s the issue of how to think about effective governance. And I mean, these are questions that concern those companies, too. It’s not as if they’re completely blind to it in all cases.
Navdip Dhariwal: Yeah. And then one of the questions is: is data ethics mainstream? What role does the ODI have in helping continue the spread of data ethics?
Tim Berners-Lee: Yes.
Nigel Shadbolt: Yes.
Tim Berners-Lee: I think data ethics- I mean, you’re talking to the head of the Oxford AI and data ethics thing. I think that basically, when we started talking about web science, we said that you should always talk about the society part of it and the technology part of it. We had the web science cycle, and that was 10 years ago or so. And I think we tried to say, “no, it’s not just tech, it’s always technologies in society.” Now there are people whose job is technologies in society in large companies.
And so the fact that you should do the ethics, and you should think about the social implications, you should think about social systems, as well. So it’s, partly it’s the ethics, there’s a philosophy of the ethics if you’d like, but also, it’s the understanding how people work when they’re connected together in large systems.
So the ethics might tell you what you don’t want, what you should try to avoid happening. But then actually, how to implement that in a real system is something else – it’s network science, for example.
Nigel Shadbolt: And I think the work that the ODI’s been doing on promoting data ethics is really crucial – that general sense of awareness that it’s about values, and it’s not just a bunch of preferences I might have to play nicely. Sometimes ethics is about difficult trade-offs. I mean, if we take the analogy from the development of medical ethics, they had to invent whole new categories of object and concept to deal with the challenges that increasing technology, applied to our own health, threw up.
The futility of care. The concept of brain death. Concepts like quality-adjusted life years. These things were all put into the mix so we could have conversations around the trade-offs. And I think one of the things that’s going to be happening in data and AI ethics is: what are the key concepts we need to get hold of? People talk about, you know, fairness. Fairness is a very complex concept, to be played out in a whole variety of different ways. It’s not just about some simple utility preference to do the most good for the most people. Often these trade-offs are much more complex.
And we’ve heard today, you know, different cultural representations, different group interests. Those trade-offs will sometimes be quite tricky to make. And what we’re trying to do in data ethics is just get that awareness into the decision-making process. It can’t be agnostic – the application of our technology can’t be agnostic. It’s about the values we prefer.
And right back to the issue of persuasive design and what’s presented to us when we’re fed content. That’s ultimately about the values and ethical trade-offs we support.
Navdip Dhariwal: Well, the ODI’s become- I mean, you’re in such a positive position, having done that now for the past 10 years. Sorry, Tim, I just interrupted you there – but I was going to ask: what are your hopes for the next 10 years? If we’ve done a decade up to now, what are your hopes for the next 10?
Tim Berners-Lee: I think my hopes for the next 10 years are about the integration of the tech and social aspects. I think we’re looking at a huge amount of interdisciplinary shift across the world. It was really hard, when we talked about web science back in the day, to say: when you look at the web, you don’t need just computer science – you also need economics, and you also need psychology. Now, though, when kids go to university, they take one course imagining they’re going to take a completely different course after that.
Because the world is changing, it’s so complicated, what’s happening out there. You know, they’re looking at maybe they can help somewhere in the world between a social science and DNA. So they’ll take a course in each, and they’ll find that a course in economics really is the tool they need. So to a certain extent, I hope that people going into higher education are in fact becoming much more interdisciplinary and making connections – not just being interested in physics, but connecting up much more broadly: lots of different institutions, lots of different groups, lots of different projects, trying to figure out what’s going on and to fix it.
And in a way, climate change is another thing where it’s not just about meteorology. You realise that, actually, to push back you need lots of economics, clearly. Every time you turn around, there’s another field which has suddenly crept into it. And you need to know somebody who understands that.
Navdip Dhariwal: Sir Nigel?
Nigel Shadbolt: Yeah no, I think, reasons to be optimistic, it is that actually I think, when I look at my own students, there’s a real appetite. They’re computer scientists, but there’s a huge appetite to understand what it is to be human in the 21st century, the challenges they’re facing. That’s about values. It’s not enough just to design a cool piece of software. What’s it in support of? What will it empower?
So, working out how we set those courses up and broadly disseminate them, that’s a challenge. There’s a great phrase, that, you know, curricula evolve one tombstone, gravestone at a time. But I think we have to be more agile than that. And we’re seeing that in the way in which different disciplines are now coming together. Because none of them have the entire set of insights. The fact that network theory now can be applied to social networks or gene regulation, you know, these methods, computational methods, data-driven methods, are everywhere. And that produces a rich dialogue because you’ll find yourself talking to somebody in history, or economics, anthropology, as well as your own field, who has got part of the potential answer to wicked problems.
Navdip Dhariwal: And with the role of the ODI, how do you see that developing? What’s the future now for the ODI?
Nigel Shadbolt: Into the second decade!
Navdip Dhariwal: With optimism.
Tim Berners-Lee: I think it is exciting. The world of data is very much richer than it was when we started. It’s partly because of things that the ODI brought – things like: it’s not just tech, it’s ethics; it’s not just the public data, it’s the whole Data Spectrum. So to a certain extent, with that Data Spectrum, and with new protocols like Solid empowering people, people can build new systems very much more quickly to solve problems.
So the world is exciting, it’s very powerful. It really, really needs all the ethical oversight and the values behind it, as Nigel says. And so, exciting!
Nigel Shadbolt: And it’s unfinished business, you know. I mean I think it’s also important to say that when we started out 10 years ago, we were still about trying to release as much public data as possible, and the non-personal stuff. There’s still a bunch of stuff to go on that. Yeah, we definitely do.
And there are one or two sectors that are going to improve by having access to much more high-quality data in the UK. We know there’s a lot of concern around water quality, for example. That same set of issues beset us 10 years ago in air pollution. You know, we’ve got to use data and make it our friend to improve things.
So the ODI will continue to be about convening those conversations, about trying to span technology to institutional change, governance to ethics, to real policy impact.
Navdip Dhariwal: Wonderful. Well we look forward to seeing all of that unfold. Sir Tim and Sir Nigel, thank you both very, very much for a really enlightening talk.
Emma Thwaites: So that’s all from this episode of Data Decade, and it’s been a really special one, recorded at our 2022 summit. A huge thanks to Sir Tim Berners-Lee and Sir Nigel Shadbolt, our keynote speakers.
And if you want to find out more about anything that you’ve heard in this episode, head over to theodi.org where we continue the conversation around the last 10 years of data and the next decade ahead.
And if you’ve enjoyed the podcast, please do subscribe for updates. In the next episode, we’ll be looking at data and public policy.
I’m Emma Thwaites, and this has been Data Decade from the ODI.
For the State of the Data Nation address at the ODI Summit 2022, Navdip Dhariwal interviewed ODI co-founders, Sir Tim Berners-Lee, inventor of the World Wide Web, and Sir Nigel Shadbolt, one of the world’s foremost experts in AI. They shared their thoughts about the past 10 years of data, and what lies ahead in an increasingly data-driven world.
Data sovereignty and empowerment
Individuals’ empowerment and agency over data was a key theme, which was recently marked by the presentation of the Seoul Peace Prize 2022 to Sir Tim for his contributions to world peace through science and technology, chiefly for his work promoting data sovereignty and leading the movement to decentralise the web. As Sir Tim notes: ‘It is specifically for looking at the empowerment of individuals in this world of data’.
Sir Nigel builds on this topic of sovereignty and empowerment of individuals as a key movement in the data world: ‘I think we’ve always seen data as fundamentally about empowering […]. And I think the issue that we’ve had is trying to return some self determination, some agency, to individuals’.
AI and machine learning
Artificial intelligence (AI) and machine learning also featured heavily in the discussion, with Sir Nigel commenting: ‘The particular power of some of the modern methods in machine learning, which consume huge amounts of data, have become a really powerful and critical feature of our data ecosystem. And we’ve got to think about what data is being used to train those systems.’
He noted the importance of spreading the benefits of the current at-scale AI and machine learning models, adding: ‘There are huge opportunities, and I’m sure we’ll talk about them in the context of some of the challenges that are facing us in the 21st century. But there are real issues that we’ve got to be concerned about. And the Open Data Institute is about convening those conversations.’
The role of data in climate change and emergency response
With COP27 taking place, the panel also discussed the vital role of data in climate change, with Sir Tim highlighting the essential role of climate and carbon open data: ‘Obviously, the open data out there is really, really essential’. He added that when data is about critical issues, such as climate, it drives people to become more data literate.
The consensus around the need for climate action provides an essential use case in data management, argued Sir Nigel: ‘It’s a use case where there’s going to be pretty general agreement, one would hope, about the importance of getting high quality data at planetary scale, and that it needs to be real time.’
Sir Nigel also noted that how the data flows is critical.
It’s a question about building a trusted data ecosystem. It’s about the free flow of information in the scientific, engineering and policy worlds to understand what is happening and what we might do about that.
– Sir Nigel Shadbolt, Executive Chair and Co-founder, ODI
The citizen science movement was also highlighted: ‘It’s crucial – people are mobilised […]. The whole citizen science movement – where you’ve got people who can take direct measurements of what’s happening in their environment, and add that to other forms of scientifically acquired data – is crucial’.
The Covid pandemic also provided a critical use case in the role of data – with data not only about the spread of the disease itself, but also behavioural elements such as how people act in crowds, vaccine uptake and response to national guidelines.
Sir Nigel noted that in areas like drug discovery, modern AI methods and large scale analytics can analyse, for example, the effectiveness of vaccines, or how to redeploy existing drugs in a new context. ‘All of that is needing us to link data, at scale, in a different way.’
I think one thing that’s changed is our willingness to understand that in an emergency, our presumption that everything is private to me, may have to be challenged. That there is a public good, as well as a private interest in how your information is shared. And I think that we’ve understood that through the pandemic.
– Sir Nigel Shadbolt, Executive Chair and Co-founder, ODI
Sir Nigel also noted the importance of trust between governments, business and citizens in times of crisis: ‘I think, in the pandemic, [we saw] a coalition of the willing – that this is actually a contract between citizens and their governments, between individual consumers and the companies they contract their services from.’
Big tech and governance
Dhariwal asked about the dominance of big cloud providers, and how data is used and regulated.
Sir Tim noted the importance of considering the physical location of cloud storage. ‘It’s very easy when you build a system to grab a bit of cloud from one of the major providers – there are quite strong monopolies out there – and not think about where the physical computers are.’ He pointed to the associated risks of different governance methods across the globe, and how these can affect the relative safety of the physical data storage.
Sir Nigel notes that the idea of data sovereignty can be applied to cloud provision, and that where data is stored is critical. ‘[Data can be] diffused out there in a jurisdiction X – how will you guarantee it won’t leave those borders? In this discussion around the connected global world – are we seeing a disaggregation of those assumptions? Are we seeing more barriers being erected? Those are the things we should be concerned about’.
Effective governance of the big platforms was also raised, with the question of monopolies and the huge power wielded by the major players – including vast computing power, and the huge volumes of data and machine learning algorithms they control. ‘The issue about equitable access, about being able to interrogate the basis of those models, their strengths and weaknesses – that’s going to be a concern, and a conversation we’ll have to have on quite a broad basis, and the issue of how to think about effective governance,’ said Sir Nigel.
The next 10 years
Sir Tim said that his hope for the tech and social aspects over the next 10 years is an interdisciplinary shift, so that the sciences and social sciences work more closely together.
When you look at the web, you don’t just need computer science, you also need economics, and you also need psychology… I hope that people going into higher education are becoming more interdisciplinary
– Sir Tim Berners-Lee, Co-founder, ODI
Sir Nigel echoed this, noting that although his own students are computer scientists, ‘there’s a huge appetite to understand what it is to be human in the 21st century, the challenges they’re facing. That’s about values. It’s not enough just to design a cool piece of software. What’s it in support of? What will it empower?’
And for the ODI? Sir Tim explained that the world of data is now ‘very much richer than it was when we started’ and that accessing the whole Data Spectrum allows new systems to develop at pace. Sir Nigel added that when the ODI started, the aim was to release as much public non-personal data as possible, citing air quality and water data initiatives. ‘We’ve got to use data, and make it our friend to improve things […] The ODI will continue to be about convening those conversations, about trying to span technology to institutional change; governance to ethics; to real policy impact’.