Yesterday we summarized some of the main developments in the Linked Data world over the past year. Linked Data is a W3C-backed movement that is all about connecting data sets across the Web. It can be viewed as a subset of the wider Semantic Web movement, which is about adding meaning to the Web. However, there is some confusion in the Semantic Web community about where the two overlap. To add to the confusion, the term 'Open Data' is being bandied about too. This commonly describes data that has been uploaded to the Web and is accessible to all, but isn't necessarily "linked" to other data sets.
So what's the beef with all of these terms? In this post we seek clarity!
The Difference Between Open Data and Linked Data
In the discussion over yesterday's post, a few people tweeted that the U.K. government's public data website Data.gov.uk is mostly populated with 'Open Data' and not 'Linked Data.' But what does that mean? It means that much of the data on the site is available to the public, but it doesn't link to other data sources on the Web. It could be data that has been uploaded in CSV format (i.e. spreadsheet data), which Sir Tim Berners-Lee said in an interview with me last year is a common occurrence with government departments. Or it could be data in another non-Web format.
Screen from a Tim Berners-Lee presentation on Linked Data, circa 2008
Titti Cimmino put it nicely: Open Data is simply 'data on the web,' whereas Linked Data is a 'web of data.'
However, the long-term aim is to turn Open Data into Linked Data. As John S. Erickson pointed out, the first priority of Data.gov.uk (and its U.S. counterpart) is to publish lots of Open Data. The next step is to work towards linking it all up. This is already starting to happen. Answering a question I posed on Twitter, Kingsley Idehen confirmed that Data.gov.uk is currently a combination of Open Data and Linked Data.
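To make the step from 'data on the web' to a 'web of data' a little more concrete, here is a minimal Python sketch. It is only an illustration, not how Data.gov.uk actually does it: the CSV rows and their numbers are invented, the `example.org` namespace and the `csv_to_ntriples` function are hypothetical, and the DBpedia addresses simply stand in for "some other data set on the Web." Each spreadsheet row becomes a resource with its own URI, plus an `owl:sameAs` statement pointing at an external resource.

```python
import csv
import io

# Invented sample rows standing in for a CSV file published as Open Data.
# The numbers are made up for illustration only.
CSV_TEXT = """city,libraries
Manchester,10
Leeds,20
"""

BASE = "http://example.org/"                # hypothetical namespace for our own data
DBPEDIA = "http://dbpedia.org/resource/"    # external data set we link out to
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def csv_to_ntriples(text):
    """Turn each CSV row into two N-Triples-style statements."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        city = row["city"]
        count = row["libraries"]
        subject = "<" + BASE + "city/" + city + ">"
        # Statement 1: the value that was already in the spreadsheet.
        triples.append(subject + " <" + BASE + "vocab/libraries> \"" + count + "\" .")
        # Statement 2: the link that makes the data "linked" rather than just "open".
        triples.append(subject + " <" + OWL_SAME_AS + "> <" + DBPEDIA + city + "> .")
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(CSV_TEXT):
        print(line)
```

The first statement per row carries nothing the CSV didn't already say. It is the second statement, the outbound link, that moves the row from Open Data towards Linked Data.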
Linked Data and The Semantic Web
So may we then suggest that the idea of Linked Data is to turn it into a Semantic Web? Or are they the same thing already?
Lorna Campbell from the University of Strathclyde in Scotland tackled those and other questions in an excellent post earlier this month. She started by warning of the potential for another "holy war" about terminology. I won't delve into that in this post, but this excerpt from Campbell's post gives you a flavor of the terminology angst:
"Some argue that RDF is integral to Linked Data, others suggest that while it may be desirable, use of RDF is optional rather than mandatory. Some reserve the capitalized term Linked Data for data that is based on RDF and SPARQL, preferring lower case "linked data", or "linkable data", for data that uses other technologies."
Even Wikipedia can't define Semantic Web...
Campbell quotes from a number of other articles, in trying to come to a conclusion about how Linked Data and the Semantic Web relate. Perhaps the best definition she found was this one by Paul Walk:
- "data can be open, while not being linked
- data can be linked, while not being open
- data which is both open and linked is increasingly viable
- the Semantic Web can only function with data which is both open and linked"
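Walk's four statements treat "open" and "linked" as two independent properties of a data set. A rough way to picture that is the Python sketch below. The `Dataset` class, the two helper functions and the example data sets are all hypothetical, invented here purely to illustrate the logic of the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """A data set described only by the two properties in Walk's definition."""
    name: str
    is_open: bool     # accessible to anyone on the Web
    is_linked: bool   # connected to other data sets


def describe(ds):
    """Place a data set in one of the four possible combinations."""
    if ds.is_open and ds.is_linked:
        return "open and linked"
    if ds.is_open:
        return "open, not linked"
    if ds.is_linked:
        return "linked, not open"
    return "neither open nor linked"


def usable_by_semantic_web(ds):
    """Walk's last point: the Semantic Web needs data that is both open and linked."""
    return ds.is_open and ds.is_linked


if __name__ == "__main__":
    # Hypothetical examples, one per combination mentioned in the quote.
    examples = [
        Dataset("public CSV upload", is_open=True, is_linked=False),
        Dataset("internal company graph", is_open=False, is_linked=True),
        Dataset("public linked catalogue", is_open=True, is_linked=True),
    ]
    for ds in examples:
        print(ds.name, "->", describe(ds), "| Semantic Web ready:", usable_by_semantic_web(ds))
```

Only the third example passes the final check, which is the point of Walk's last statement.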
Why This Matters
So there you have it: Linked Data is NOT the same as the Semantic Web. It's also not necessarily open; in other words, it isn't always accessible to everyone, developers included.
Whatever the definitions, the key points about Open Data, Linked Data and the Semantic Web are:
- data is being uploaded to the Web that wasn't online before (e.g. much of the data on Data.gov.uk).
- structure is being added to the data using Linked Data and/or Semantic Web technologies.
The bottom line is that the more data we have on the Web that is linked and has defined meaning, the smarter our web applications will be. This is why these activities are so exciting, despite the terminology confusion!