Metadata or Hyperdata, Link or Thread, What is a Web of Data? - Blog - Semantic Focus - The Semantic Web, Semantic Web technology and computational semantics


After the last post about a "web of agents," I received a few questions about the "web of data." A few readers took my argument to be opposed to a web of data. Don't get me wrong: I have never been opposed to presenting the Semantic Web as a "web of data." I only emphasize that the web-of-data presentation falls short of describing the human-web relationship in the Semantic Web. To engage more ordinary people in the grand vision of the Semantic Web, we need a more user-oriented presentation, i.e. a web of agents.

To balance my own arguments, however, in this post I lay out my understanding of a web of data. The Semantic Web will certainly be a web of data once it is realized. But how will this web of data be connected, and what will its structure be? The answers to these questions vary from person to person. I am glad to share mine with the readers of Semantic Focus.

Metadata, Hyperdata, Web of Data

When we think of a web of data, two terms immediately emerge. One is metadata, the other is hyperdata.

Metadata is data about data. It explains the semantics of concrete data items within their local context, and it abstracts those data items at the conceptual level.

In a recent post, Nova Spivack stated that hyperdata is to data what hypertext is to text. Hypertexts are texts with hyperlinks that link to other documents. Similarly, hyperdata are data with hyperlinks that link to other data.

Both metadata and hyperdata express semantics, but with different emphases. Metadata expresses local facts about data, covering either its conceptual essence (e.g. Web 2.0-style tags) or the semantics of its display (e.g. XHTML tags). These facts are locally consistent, and they do not change when the external links connected to the data change. Hyperdata, on the other hand, expresses how a local data item is related to other data. In other words, hyperdata shows where a local data item sits within a broad network of data items.

A web of data is a network of data whose local characteristics are specified by metadata and whose global characteristics are specified by hyperdata. Through metadata and hyperdata, machines can automatically understand and query facts over a web of data. Thus, a web of data is a semantic web. The web-of-data presentation illustrates the back end of the Semantic Web.
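To make the split between local and global characteristics concrete, here is a minimal sketch in Python. The `DataItem` class, its field names, the relation name, and the example.org URIs are all hypothetical and chosen only for illustration; they do not come from any Semantic Web standard.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """One node in a web of data (hypothetical model for illustration)."""
    uri: str
    # Metadata: local facts about the item, independent of any links.
    metadata: dict = field(default_factory=dict)
    # Hyperdata: (relation, target URI) pairs that place the item in the network.
    hyperdata: list = field(default_factory=list)

    def relate(self, relation: str, target_uri: str) -> None:
        """Record how this item relates to another data item."""
        self.hyperdata.append((relation, target_uri))


post = DataItem(
    uri="http://example.org/posts/web-of-data",
    metadata={"tags": ["web of data", "hyperdata"], "format": "XHTML"},
)
local_facts_before = dict(post.metadata)

post.relate("follows-up", "http://example.org/posts/web-of-agents")

# Adding an external relation leaves the local facts untouched.
metadata_unchanged = post.metadata == local_facts_before
```

The point of the sketch is only that the two kinds of semantics live in separate places: relating the item to other data extends its hyperdata and never rewrites its metadata.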

Web Link, Web Thread, Weaving a Web of Data

This metadata-plus-hyperdata picture looks perfect, except for one small question: what is this web of data woven by? Wait a minute, is this question dumb? Hyperdata are data with hyperlinks, aren't they?

There is another pair of terms we need to understand about a web of data. One is web link, the other is web thread.

A web link, or hyperlink, is a reference from one Web location to another. Web links have several essential characteristics. A web link connects two and only two objects: the outbound object at the source and the inbound object at the target. A web link is unidirectional, and its semantics describe how the remote inbound object supplements the local outbound object. The assignment of a web link is usually subjective: the owner of the outbound object decides whether a remote inbound object supplements it. If more than one remote object qualifies, the owner also subjectively decides which one supplements it best, because a web link allows only one reference.

Web links are a straightforward, easy-to-understand, and convenient-to-adopt concept. But they have a natural limit, especially within the picture of a web of data. Usually more than two web objects share the same semantics. For example, there are certainly more than two posts on the Web that explain the concept "web of data," and all these posts semantically supplement each other. But web links only allow binary connections among them. Connecting a large set of items solely by binary links is not only tedious and expensive, but also inefficient for web search. Instead, we would prefer a single thread that connects all of them together. This observation leads to the concept of the web thread.
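The cost argument can be made concrete with a little arithmetic. The sketch below assumes that "connecting" a set of items means every item points to every other item with its own unidirectional link; that reading, and both function names, are my assumptions for illustration.

```python
def links_needed(n: int) -> int:
    """Unidirectional links required so each of n items points to every other one."""
    return n * (n - 1)


def thread_attachments_needed(n: int) -> int:
    """Attachments required when all n items join one shared thread."""
    return n


for n in (2, 10, 100):
    print(n, links_needed(n), thread_attachments_needed(n))
```

Under this assumption, 100 posts on the same concept need 9,900 links to be fully connected pairwise, but only 100 attachments to a single thread.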

Web Link	Web Thread
Connects two objects	Connects arbitrary number of objects
Has a specified outbound object	Has a set of outbound objects
Has a specified inbound object	Has one fixed inbound object
Anonymous in general	Named
Assigned subjectively in general	Assigned objectively
Syntactically unidirectional	Syntactically omnidirectional
Semantically supplements the inbound end to the outbound end	Semantically supplements each other on the same thread

The concept "web thread" was first introduced in "Weaving the thread-driven Semantic Web." Later on, I explained that Web 2.0 tagging is a typical implementation of weaving the Web with threads. The evolution of the Web from the current Web (a web of documents) to the Semantic Web (a web of data) is equivalent to the evolution from a node-driven web to a thread-driven web.

A web thread is a reference to a named web location. Unlike a web link, a web thread connects an arbitrary number of objects at the same time. Rather than being unidirectional, a web thread is omnidirectional: data in a thread is automatically connected to all other data in the same thread. All the data connected by the same web thread mutually supplement each other semantically.
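As a rough illustration of omnidirectional connection, the following Python sketch keeps one set of members per thread name. The `ThreadRegistry` class and the item identifiers are hypothetical; this is one simple way to model the idea, not a prescribed implementation.

```python
from collections import defaultdict


class ThreadRegistry:
    """Maps each thread name to the set of item URIs attached to it."""

    def __init__(self) -> None:
        self._members = defaultdict(set)

    def attach(self, thread_name: str, item_uri: str) -> None:
        """Attach an item to a named thread; no target item is specified."""
        self._members[thread_name].add(item_uri)

    def connected_to(self, thread_name: str, item_uri: str) -> set:
        """Return all other items on the same thread as item_uri."""
        members = self._members.get(thread_name, set())
        if item_uri not in members:
            return set()
        return members - {item_uri}


registry = ThreadRegistry()
for uri in ("post-a", "post-b", "post-c"):
    registry.attach("web of data", uri)

neighbours_of_a = registry.connected_to("web of data", "post-a")
```

Each item is attached once, to the thread's name, yet every item can reach every other one. No item had to name another item as a target.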

A web thread has only one fixed inbound object, which is the name of the thread. Any outbound object attached to this thread is assumed to agree with the semantics of this inbound object. Hence the connection from any outbound object to the inbound object becomes objective. We do not, however, need to explicitly prohibit subjective assignments of outbound objects by humans. Machines may automatically decide whether a subjective assignment by a human is also objectively correct by checking the machine-processable criteria associated with the inbound object. The most basic checking criterion is simple name matching, as we currently experience with Web 2.0 tags.
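The simplest checking criterion, name matching, might look like the sketch below. The normalization step (lowercasing and collapsing whitespace) is my own assumption about what "matching" should tolerate, and both function names are hypothetical.

```python
def normalize(name: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(name.lower().split())


def is_objectively_on_thread(thread_name: str, item_tags: list) -> bool:
    """Check a human's subjective assignment against the thread's name."""
    wanted = normalize(thread_name)
    return wanted in {normalize(tag) for tag in item_tags}


accepted = is_objectively_on_thread("Web of Data", ["semantic web", "web  of data"])
rejected = is_objectively_on_thread("web of data", ["web of agents"])
```

A richer thread could attach stricter machine-processable criteria to its inbound object; name matching is only the baseline that tagging already provides.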

A web of data must be a thread-driven web. Nobody can exhaustively describe even a single object. Instead, humans are used to compromising with each other by agreeing on the key identifiers of an object. Different people often have different views of the same object. But as long as they agree on the key identifiers, they reach an agreement about the object, and every individual view becomes a supplement to the explanation of this object. This is the philosophy underneath web threads, and this understanding is also a philosophical foundation of a web of data.

With web threads, do we still need web links in a web of data? The answer is yes. Web threads cannot completely replace web links. Web links have their irreplaceable semantics.

Most importantly, web links show the originality and creativity of humans. Web threads are objective, and because they are objective, they are less original and creative. A web of data would be a boring world if it contained only threads. Web links are subjective, so they may be incorrectly assigned with respect to their semantics. But isn't "incorrectness" a synonym for creativity? If we want to engage collective intelligence in a web of data, allowing and encouraging the subjective (and biased) assignment of web links is fundamental to exploring human creativity.

Bridge "Web of Data" and "Web of Agents"

Finally, I want to address why the two presentations, a web of data and a web of agents, are in fact interchangeable when they describe the Semantic Web.

As we have explained, a web of data is primarily woven by threads. The maintenance of these threads, however, depends on machines rather than humans. Machines can take over this maintenance because web threads are objective. Basically, any single thread can be operated by a single machine agent. These machine agents thus automatically convert data objects into hyperdata attached to web threads. This agent operation is essential to both a web of data and a web of agents.
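One way to picture a single agent operating a single thread is the sketch below. The `ThreadAgent` class, the dictionary layout of a data object (`uri`, `tags`, `threads`), and the use of tag matching as the membership test are all assumptions made for illustration.

```python
class ThreadAgent:
    """Hypothetical machine agent that maintains exactly one web thread."""

    def __init__(self, thread_name: str) -> None:
        self.thread_name = thread_name
        self.members = set()  # URIs of items currently on this thread

    def process(self, item: dict) -> dict:
        """Return the item, with a thread reference added if it belongs."""
        if self.thread_name not in item.get("tags", []):
            return item  # not on this thread; leave the object as plain data
        self.members.add(item["uri"])
        threads = list(item.get("threads", []))
        if self.thread_name not in threads:
            threads.append(self.thread_name)
        # The copy now carries hyperdata: a reference to the named thread.
        return {**item, "threads": threads}


agent = ThreadAgent("web of data")
raw = {"uri": "post-a", "tags": ["web of data", "metadata"]}
unrelated = {"uri": "post-b", "tags": ["web of agents"]}

converted = agent.process(raw)
skipped = agent.process(unrelated)
```

The agent never asks a human which items belong together; it applies the thread's criterion to each data object it encounters and records the result.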

On the other hand, when an individual human sets up a new machine agent, this agent starts to operate on the Web with human-specified knowledge. It thus represents both connections to existing web threads and the establishment of new web threads on the Web. Therefore, the experiences of machine agents on the Web are fundamentally the exploration and further weaving of a web of data.

In summary, a web of agents is what ordinary users see of the Semantic Web at the front end, while a web of data is what professional developers understand to be the essence of the Semantic Web at the back end. These two presentations tell a common story from two different sides.

About the author

I'm currently a Ph.D. candidate at Brigham Young University, working with Prof. David W. Embley in the Computer Science Department, Prof. Deryle W. Lonsdale in the Linguistics Department, and Prof. Stephen W. Liddle in the Marriott School of Management.