Thomson Reuters’ data services go open

Thomson Reuters took an innovative linked data approach that allowed it to join up its specialised data assets without having to migrate them to a single database or data warehouse

Thomson Reuters is a multinational organisation that provides news and information to industry professionals. The company offers a wide range of services – primarily around information and data – in areas including finance and risk, law, tax and accounting, intellectual property and science. Thomson Reuters reported $12.2bn in revenue in 2015.

Thomson Reuters was formed through Thomson Corporation’s merger with Reuters Group in 2008 and has since continued to grow, in part through acquisition. But as with any company that has evolved in this way, there can be challenges in integrating legacy data resources and systems.

To tackle this challenge, Thomson Reuters took an innovative linked data approach that allowed it to join up its specialised data assets without having to migrate them to a single database or data warehouse. It began by establishing central data ‘authorities’ to ensure that key entity data (such as organisations and people) was only mastered once. Each of these entities was assigned a Permanent Identifier (PermID). Each PermID provided a unique reference point for a single entity – for example, Thomson Reuters Corp itself has a PermID of 4295861160. The concept of permanence meant the PermID, and its relationship to a specific entity, would not change over time. Existing databases were ‘re-mastered’ by adding PermIDs alongside legacy identifiers, with a view to ultimately replacing the legacy identifiers altogether. PermIDs were also used to uniquely identify other information objects such as news articles, deals and corporate actions.
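The re-mastering step described above can be pictured with a small sketch. It is illustrative only and is not Thomson Reuters’ implementation: the system names, legacy identifiers, record fields and helper functions (`AUTHORITY`, `remaster`, `join_on_permid`) are all invented for the example. The only values taken from this article are the PermID 4295861160 and the 2015 revenue figure. The point it shows is that once each system carries the shared identifier next to its own legacy key, records can be joined where they already live.

```python
# Hypothetical illustration: system names, legacy IDs and record fields are invented.
# The PermID 4295861160 (Thomson Reuters Corp) and the revenue figure come from the article.

# A central "authority" masters each entity once and maps every legacy ID to its PermID.
AUTHORITY = {
    ("finance_db", "ORG-0042"): "4295861160",
    ("legal_db", "TR_CORP"): "4295861160",
}


def remaster(system, records, legacy_key):
    """Return copies of the records with a 'permid' field added alongside the legacy ID."""
    remastered = []
    for record in records:
        # Entities the authority has not mastered yet get None rather than a guess.
        permid = AUTHORITY.get((system, record[legacy_key]))
        remastered.append({**record, "permid": permid})
    return remastered


def join_on_permid(left, right):
    """Pair records from two systems that refer to the same entity."""
    by_permid = {}
    for record in right:
        if record["permid"] is not None:
            by_permid.setdefault(record["permid"], []).append(record)
    return [
        (l, r)
        for l in left
        if l["permid"] is not None
        for r in by_permid.get(l["permid"], [])
    ]


# Each system keeps its own storage and legacy key; only the shared PermID is added.
finance = remaster(
    "finance_db", [{"org_id": "ORG-0042", "revenue_2015_usd_bn": 12.2}], "org_id"
)
legal = remaster(
    "legal_db", [{"party_code": "TR_CORP", "note": "hypothetical legal record"}], "party_code"
)
linked = join_on_permid(finance, legal)
```

Keeping the legacy key in each record mirrors the article’s description of PermIDs being added alongside existing identifiers before any eventual replacement.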

Demand for the PermID approach

Thomson Reuters is no stranger to helping its customers with their data management challenges, having managed the RIC system for financial instrument identification for many years. The team therefore recognised that the work they were doing internally with PermID had broader applications. Not only were new internal products and services created, but Thomson Reuters was approached directly by clients who used its products and wanted to integrate them with their own data.

Like Thomson Reuters, these clients wanted to join up their own data holdings for internal use and provide consistent and interoperable data products and services to their clients. As Wilbur Swan, Head of Enterprise Metadata Services, puts it, data siloing issues were not unique to Thomson Reuters, but “a problem shared by all our clients”.

Thomson Reuters “knew [it] had in PermID a solution to wider industry problems”, says Tim Baker, Global Head of Content Strategy and Innovation. And as Dan Bennett, Head of Enterprise Data Services, explains:

We believe that this is the right thing for our customers because our customers are asking for it. And, in a very broad sense, when you do what’s good for your customers it’s generally good for you.

While providing access to the PermID system can help customers with their own internal data silo issues, Tim Baker sees the benefit for Thomson Reuters in “making [its] data much easier to use, which in turn makes it more accessible, more cost effective, and hopefully more widely used by [its] customers as a result”. By providing the tools to integrate its data products and internal client data, Thomson Reuters saw a clear business opportunity.

The case for opening up PermID

Thomson Reuters then had to decide how to expose PermID to its customers. Past experience has taught the industry that releasing proprietary identifiers with restrictive licensing conditions can create significant problems. One key issue is reusers’ inability to expose these proprietary identifiers to their own clients, and even to other departments within their business, in some cases. This has made people approach new proprietary identifiers with caution.

Thomson Reuters realised that the only way customers would embrace and be able to recognise the full potential of PermID was if it took a radically different approach – an open approach. Thomson Reuters and the Open Data Institute (ODI) had already explored the importance of identifiers in such an approach, in an earlier white paper which laid some of the groundwork for PermID. “All we’re trying to do is make it easier for our customers to work with us,” explains Dan Bennett. “Only through making it open data was it actually going to be a reasonable experience for our customers.” Dan Meisner, Head of Capability for Open Data, agrees: “For our customers, it’s that commitment to openness of the identifier and of the information model that’s important.”

Making a commercial case for open

While it was clear that allowing anyone to access, use or share the data was the only way to fully recognise the potential benefits of PermID, being a commercial organisation meant that Thomson Reuters had to justify this decision internally. This can be challenging, because “customers see an awful lot of value in this but commercially it’s not easy to put a value on,” notes Dan Bennett. “The benefits from this to us are mainly tangential,” he adds. The issue is, as Dan Meisner puts it, that these indirect benefits and network effects “don’t really fit very well into an Excel model for calculating your internal rate of return”.

Ultimately the financial case was made, as Dan Bennett explains:

We don’t exist to make money out of issuing those identifiers. We create those identifiers because it’s important to our internal data model. The reality is that we have this data and are managing it anyway, so the incremental cost for us to expose it externally is not that great in the grand scheme of things.

As such, Thomson Reuters decided to publish a subset of its data, including associated PermIDs, under an open Creative Commons licence (CC-BY 4.0). An extended set of fields offering further descriptions of the entities in question has been released under a Creative Commons non-commercial licence (CC-NC 3.0). The company launched this service as Open PermID in 2015, achieving an ODI Open Data Certificate in the process of release.

Unlocking benefits to PermID itself

The benefits of providing this data openly are not limited to customers. In addition to recognising the indirect financial benefits of increased data use, Thomson Reuters expects open licensing will benefit the data itself. Because the data is open, Thomson Reuters can gain valuable external feedback not only from clients but also from others who choose to use the data because they all have a vested interest in its accuracy. “It’s a bit like open source software,” says Dan Bennett, “… which is generally thought to be more secure because you have more eyes on it. In the same way, having more eyes on our data will make it gradually stronger and richer.”

Not only does the company expect opening the data will improve its accuracy but, because linked data is used, when others link their own external open data sources to the PermID data, it enriches and increases the value of the Thomson Reuters data without additional effort.
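A rough sketch of how such linking adds value: if an external publisher states that its own record refers to the same entity as a PermID record, anyone combining the two sources sees facts from both. Everything in the example is an assumption made for illustration. The URIs are placeholders rather than real PermID URIs, the external attribute is invented, and triples are held as plain Python tuples instead of in an RDF library. `owl:sameAs` is the standard OWL property for this kind of identity link.

```python
# Hypothetical sketch: URIs and the external attribute are placeholders, not real data.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # standard OWL identity property

PERMID_ENTITY = "https://example.org/permid/4295861160"  # placeholder URI pattern
EXTERNAL_ENTITY = "https://example.org/registry/company/abc"  # invented external record

# Triples as (subject, predicate, object) tuples.
publisher_triples = {
    (PERMID_ENTITY, "name", "Thomson Reuters Corp"),
}

# A third party publishes its own data and links it to the PermID record.
external_triples = {
    (EXTERNAL_ENTITY, SAME_AS, PERMID_ENTITY),
    (EXTERNAL_ENTITY, "example_attribute", "example value"),  # invented
}


def describe(entity, triples):
    """Collect facts about an entity, following owl:sameAs links in both directions."""
    aliases = {entity}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            # Extend the alias set when exactly one end of a sameAs link is already known.
            if p == SAME_AS and (s in aliases) != (o in aliases):
                aliases |= {s, o}
                changed = True
    return {(p, o) for s, p, o in triples if s in aliases and p != SAME_AS}


# Merging the two sources yields facts from both, with no change to the publisher's data.
facts = describe(PERMID_ENTITY, publisher_triples | external_triples)
```

In this sketch the publisher’s own triples are untouched; the enrichment comes entirely from the link the external party chose to publish.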

Unlocking PermID also opens up new solutions for Thomson Reuters to sell. These include Thomson Reuters Intelligent Tagging, which uses PermID when tagging unstructured data, helping organisations to enhance the value of their content sets.

Securing a future through an open approach

Regardless of any related commercial opportunities, in making the decision to release PermID under an open licence, Thomson Reuters is looking well beyond immediate costs and benefits. The company fundamentally sees the release as part of a long-term strategy responding to emerging trends and challenges in the industry. One of the main trends concerning Thomson Reuters is that “in many ways, we’re in a post-scarcity world for data,” explains Dan Meisner. Thomson Reuters is keenly aware that the proliferation of data has a significant effect. Tim Baker says “as a company, we’ve realised that customers want to use data from more and more sources. There’s no value to them in a ‘one-size-fits-all’ approach to the data they use.” Dan Meisner agrees:

Customers are grabbing data from different parts of their own businesses [and] they’re looking at open data from government sources and elsewhere. The high-quality professional reference data that is our bread and butter is not becoming less important, but it’s becoming less of an overall component of our clients’ enterprise data inventory.

In the face of these changes, which Dan Meisner describes as “moving towards a networked data economy”, Thomson Reuters sees its role as an actor that can “help the industry take this wealth of data and make it actionable information”. The company believes it is well positioned to do this because of its existing expertise in data integration. As Tim Baker puts it:

In the past, there was a lot of value in the way data companies acquired, organised and presented information for their clients. And to an extent there always will be, but the real value now lies in the underlying data models, the naming mechanisms and the tools used to extract meaning from and link data. This is where our customers are looking for us to help them, and we’ve been reshaping our business to meet this need.

Thomson Reuters believes PermID forms a key part of this value proposition and that the timing of its release is also key. “I think there’s a sense that if we don’t do it, somebody else might,” explains Dan Meisner. Having witnessed the rise of platform firms across the tech sector, the company is determined to experiment and emulate the model. Dan Meisner explains:

Increasing amounts of open data are being published, and we’re making the investment to be a foundational part of this future ecosystem. The basic idea is that just as this has helped us connect data from around our own organisation, it should then help our customers, our partners, and maybe even our competitors do the same with their data, and plug that into our organisation, our platform.

By positioning itself in this way, Thomson Reuters believes it can cement its position in the information industry for the foreseeable future. “The environment our customers operate in has fundamentally changed, and we have evolved our business model to suit,” says Tim Baker. “We’re becoming an open platform for our customers, a core piece of their operating infrastructure.”

Open for business

Thomson Reuters has taken this approach not only to help its clients integrate their data but also to help them, and others, to better manage internal data. The company believes making it easier for customers to integrate Thomson Reuters data will increase its value and usage. In addition, by combining open data with stable identifiers, it gives clients freedom to experiment, link their own open or proprietary data and provide feedback on the identifier system, all of which makes PermID a stronger offering – both internally and externally. Thomson Reuters is also well positioned to create a platform based on its data and information model, becoming a central component in a future ecosystem. By embracing linked open data, Thomson Reuters is creating a competitive advantage in paving the way to build new products and generate new business models – both now and in the future.