Blog entries by James Simmons

Today on the Linking Open Data mailing list, Kingsley Idehen of OpenLink Software announced that he is preparing to load the entire LOD cloud into Virtuoso 6.0 Cluster Edition. The datasets are being added to a table on the ESW wiki, making it convenient for anyone doing Semantic Web research to get hold of them. Once all the datasets are added, we should have a better idea of how much linked data there really is out there. This may also raise the bar for other triple stores and force them to develop methods for storing several billion triples.

Continue reading Calling All RDF Dumps


Freebase stores millions of entities and assertions about nearly every topic one can ponder (thanks are owed to their seed dataset – Wikipedia – and their amazing community). The amount of information that Freebase stores is incredible, and is a testament to what can be accomplished with the help of a dedicated community and a little (or a lot) of clever software engineering.

Continue reading Can Graphd Scale to Meet Semantic Web Demands?


I just stumbled upon a useful resource from Sindice (the Semantic Web search engine) called the Map of Data. The Map of Data lists sites that export their information via Microformats and embedded RDF (as well as which format(s) the sites are using). Each site has been categorized and conveniently placed into lists. The categories include books, people, places, products and listings, social news, events, politics, and more. According to Sindice, over 10 billion pieces of reusable information can already be found across 100 million pages.


The Seesaw Effect of Algorithms vs. Data

Over the years I've noticed that the importance of algorithms and data tends to shift back and forth, depending on which is hardest to duplicate at the time (often from a business perspective). This effect seems to be caused by the availability of, or demand for, one side increasing or decreasing, shifting the balance of importance to the other. At one point the world of software was dominated by the proprietary. The organization with the best software (backend, algorithms, etc.) was the dominant entity, and data (from, say, a Web 2.0 perspective) was generally not the focus. This may have partly been the result of a mindset formed during an era with very little storage space and before mass user activity on the Web.

Continue reading Algorithms vs. Data: The Seesaw Effect


Cross-Pollinating DBpedia and Freebase

Now that Freebase is available as Linked Data, a big question that comes to mind is whether these two major projects will move to assimilate one another. DBpedia and Freebase – two endeavors primarily focused on curating unstructured and semi-structured data about everything and releasing it back into the wild (with structure) – get the bulk of their information from Wikipedia, so the amount of topical overlap is assumed to be extremely high. DBpedia gains new information when it extracts data from the latest Wikipedia dump, whereas Freebase, in addition to Wikipedia extractions, gains new information through its userbase of editors.

Continue reading Cross-Pollinating DBpedia and Freebase


At ISWC 2008 Freebase released its new RDF service for generating RDF representations of Freebase topics, allowing Freebase to be used as Linked Data! To obtain the RDF data for a topic, send a GET request to http://rdf.freebase.com/rdf/some.topic.id, where "some.topic.id" is replaced by the desired topic identifier (slashes in the identifier must be replaced by dots). Topic data can be represented as N3, RDF/XML or Turtle, depending on the preferences expressed in your client's HTTP Accept header. Try it out with the Freebase topic Semantic Web.
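As a rough illustration of the request described above, here is a minimal Python sketch that builds (but does not send) such a GET request. The function names, the example topic id "/en/semantic_web", the choice to drop the leading slash rather than turn it into a dot, and the MIME type strings are all my own assumptions, not something taken from the Freebase documentation.

```python
from urllib.request import Request

# Base URL as quoted in the post.
RDF_BASE = "http://rdf.freebase.com/rdf/"


def topic_to_rdf_url(topic_id: str) -> str:
    """Turn a topic id such as '/en/semantic_web' into an RDF service URL."""
    # Assumption: leading/trailing slashes are dropped, inner slashes become dots.
    dotted = topic_id.strip("/").replace("/", ".")
    return RDF_BASE + dotted


def build_rdf_request(topic_id: str, accept: str = "application/rdf+xml") -> Request:
    """Build a GET request whose Accept header states the preferred format.

    The default MIME type is an assumption; the post only says that N3,
    RDF/XML and Turtle are selected through the Accept header.
    """
    return Request(topic_to_rdf_url(topic_id), headers={"Accept": accept})


if __name__ == "__main__":
    req = build_rdf_request("/en/semantic_web", accept="text/turtle")
    print(req.get_method(), req.full_url)
    print("Accept:", req.get_header("Accept"))
```

Passing the resulting object to urllib.request.urlopen would actually send the request; the sketch stops short of that so it can be run without the service being reachable.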

Continue reading Freebase Officially Linked Data with Release of RDF Service


ReadWriteWeb just posted an interesting article about investor opportunities and pitfalls in the Semantic Web space. The questions were put to a panel of industry insiders at the SemTech 2008 conference. Panelists included Amanda Reed (Palomar Ventures), Eghosa Omoigui (Intel), and Stephen Hall (Vulcan Capital). This information can be very useful if you're looking to start a business within the Semantic Web industry.


QDOS, measurer of digital presence, has built an interface that lets you search for a FOAF profile. You can search for an individual by their email address or by the URL of their blog or homepage. QDOS's goal is to index and make visible the entire FOAF social graph. If I'm not mistaken, they're also helping to extend the social graph by republishing data that users provide through their primary service (very cool indeed).


I got an email from Dolors Reig about his Semantic Web planet-type site, Planeta Web Semántica, an aggregator of Semantic Web news in Spanish. The site indexes feeds in both Spanish and English to make up for the shortage of Spanish-language Semantic Web activity in the blogosphere. I doubt that shortage will last long, as Semantic Web concepts continue to gain traction with people around the world. The site sports a clean layout, and I like that you're given the ability to comment on each news item. This is an excellent resource for those whose primary language is Spanish.


Deadlines are fast approaching for those submitting papers, Doctoral Consortium applications, and tutorial proposals for ISWC 2008! More information can be found here.

Upcoming deadlines:

Research papers: 9/16 May
Semantic Web in Use papers: 16 May
Tutorial proposals: 16 May
Doctoral Consortium applications: 16 May
Posters & Demo proposals: 25 July
Workshop papers (13 workshops): Varies
Semantic Web & Billion Triples challenge: 1 October


Calling All RDF Dumps

Can Graphd Scale to Meet Semantic Web Demands?

The Map of Data: Over 10 Billion Pieces of Reusable Information

Algorithms vs. Data: The Seesaw Effect

Cross-Pollinating DBpedia and Freebase

Freebase Officially Linked Data with Release of RDF Service

Investor Opportunities and Pitfalls for the Semantic Web

QDOS Allows Users to Search the FOAF Social Graph

Planeta Web Semántica: Spanish Semantic Web News Aggregator

ISWC 2008 Deadlines Approaching