Semantic and Content exploration

I’ve been building out a set of semantic widgets for a while. Here I describe bringing all these efforts together in a single semantic and content search page.

Here’s a picture of my latest efforts:

triple-provenance

The page comprises four sections. The horizontal bar along the top helps you interactively describe a SPARQL query. The remaining three side-by-side columns give you, in order, a list of entities matching your query, a list of facts about the selected entity, and a list of the content from which the graphs holding those facts were derived.

sparqlbar
Interactive SPARQL query builder. You select the type of entity you wish to find (Person), then specify its properties, or relations to other entities, to any level of nesting. In the example I show a search for People who know other People who are members of an Organisation called Sandford Council, ensuring that the People returned are also themselves members of an organisation. The widget generates the SPARQL and sends it to the REST API’s /v1/graphs/sparql endpoint as a POST request.
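To make the example search concrete, here is a rough Python sketch of the kind of SPARQL the bar might generate and of the POST request that would carry it. This is not the widget’s actual code. The `org:memberOf` predicate, the use of `foaf:name` for the organisation’s name, the `http://localhost:8000` base URL and the two helper function names are all assumptions made for illustration, and authentication is left out. The request is only built, never sent.

```python
from urllib.request import Request

# Vocabulary choices are assumptions: the post does not say which predicates are used.
PREFIXES = (
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
    "PREFIX org: <http://www.w3.org/ns/org#>\n"
)


def build_query(org_name: str) -> str:
    """People who know a member of the named organisation and who are
    themselves members of some organisation (hypothetical helper)."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = org_name.replace("\\", "\\\\").replace('"', '\\"')
    return PREFIXES + (
        "SELECT DISTINCT ?person WHERE {\n"
        "  ?person a foaf:Person ;\n"
        "          foaf:knows ?friend ;\n"
        "          org:memberOf ?anyOrg .\n"
        "  ?friend a foaf:Person ;\n"
        "          org:memberOf ?org .\n"
        "  ?org foaf:name \"" + escaped + "\" .\n"
        "}"
    )


def build_request(base_url: str, query: str) -> Request:
    """Build (but do not send) the POST to /v1/graphs/sparql."""
    return Request(
        base_url.rstrip("/") + "/v1/graphs/sparql",
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


query = build_query("Sandford Council")
request = build_request("http://localhost:8000", query)  # assumed host and port
print(query)
```

Each extra level of nesting in the bar would add another variable and another group of triple patterns in the same way.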

sparqlresults
Lists the entities returned by the SPARQL search. For each result it takes the IRI and uses it to look up the entity’s RDF type. Based on that type, it checks the application configuration for the ‘common name’ field. In FOAF this is ‘name’. It then uses SPARQL again for each result to fetch that value, in order to show a result like Abraham Troublemaker (Person). You can provide an optional action function to invoke when one of these is clicked.
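The two-step lookup described above could be sketched roughly as follows. The shape of the configuration table and the function names are hypothetical, not taken from the widget code. Only the FOAF and RDF IRIs are real.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical application configuration:
# rdf:type IRI -> (display label, predicate holding the 'common name').
TYPE_CONFIG = {
    FOAF + "Person": ("Person", FOAF + "name"),
    FOAF + "Organization": ("Organisation", FOAF + "name"),
}


def type_query(entity_iri: str) -> str:
    """First query: find the RDF type of a result IRI."""
    return f"SELECT ?type WHERE {{ <{entity_iri}> <{RDF_TYPE}> ?type }}"


def name_query(entity_iri: str, type_iri: str) -> str:
    """Second query: fetch the common name, using the configured predicate."""
    _, name_predicate = TYPE_CONFIG[type_iri]
    return f"SELECT ?name WHERE {{ <{entity_iri}> <{name_predicate}> ?name }} LIMIT 1"


def display_label(name: str, type_iri: str) -> str:
    """Format a result line such as 'Abraham Troublemaker (Person)'."""
    type_label, _ = TYPE_CONFIG[type_iri]
    return f"{name} ({type_label})"


# Example IRI is made up for illustration.
entity = "http://example.org/people/abraham"
print(type_query(entity))
print(name_query(entity, FOAF + "Person"))
print(display_label("Abraham Troublemaker", FOAF + "Person"))
```

Keeping the name predicate in configuration rather than in the query code is what lets the same results list work for ontologies other than FOAF.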

entityfacts
When a result is clicked, uses SPARQL to load all triples where this entity is the subject. Where the object is an IRI rather than a property value, it uses further SPARQL queries to load information about that related entity. Property values are displayed too. Clicking a related item can load that entity into the same widget, to allow graph exploration. The widget can be given an optional searchresults widget to populate with related content when the ‘find source content’ button is clicked. (I’ll also enhance this widget with a ‘reverse’ button to show all facts where the entity is instead the object of the triple.)
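A minimal sketch of the queries behind this, plus one way to separate related entities from plain property values, assuming the endpoint returns the standard SPARQL 1.1 JSON results format. The function names and the sample data are made up for illustration.

```python
def facts_query(entity_iri: str) -> str:
    """All triples where the entity is the subject."""
    return f"SELECT ?p ?o WHERE {{ <{entity_iri}> ?p ?o }}"


def reverse_facts_query(entity_iri: str) -> str:
    """All triples where the entity is the object (the planned 'reverse' view)."""
    return f"SELECT ?s ?p WHERE {{ ?s ?p <{entity_iri}> }}"


def split_facts(results: dict):
    """Split SPARQL JSON bindings into related entities (IRI objects)
    and property values (anything else, such as literals)."""
    related, values = [], []
    for binding in results["results"]["bindings"]:
        predicate = binding["p"]["value"]
        obj = binding["o"]
        if obj["type"] == "uri":
            related.append((predicate, obj["value"]))
        else:
            values.append((predicate, obj["value"]))
    return related, values


# Hand-written sample in SPARQL 1.1 JSON results shape (not real output).
sample = {
    "head": {"vars": ["p", "o"]},
    "results": {"bindings": [
        {"p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/name"},
         "o": {"type": "literal", "value": "Abraham Troublemaker"}},
        {"p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/knows"},
         "o": {"type": "uri", "value": "http://example.org/people/other"}},
    ]},
}

related, values = split_facts(sample)
print(related)
print(values)
```

Each entry in `related` is a candidate for a follow-up lookup and for a clickable link that reloads the widget with that entity.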

searchresults
My standard content search results from previous work. The entityfacts widget uses a relationship class I’ve defined called, in short, ‘derived_from’, which links a graph (where some facts are stored) with the originating document they were derived from. (The facts were also checked over manually by a user before being added to the triple store.)
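One way the ‘find source content’ lookup could be expressed in SPARQL is sketched below. The full IRI for ‘derived_from’ is a placeholder, since only the short name is given here, and the query assumes the derived_from fact is stored inside the same named graph as the facts it describes.

```python
# Placeholder IRI: only the short name 'derived_from' is known.
DERIVED_FROM = "http://example.org/ontology/derived_from"


def source_content_query(entity_iri: str, derived_from: str = DERIVED_FROM) -> str:
    """Find documents behind every graph that holds a fact about the entity."""
    return (
        "SELECT DISTINCT ?doc WHERE {\n"
        "  GRAPH ?graph {\n"
        f"    <{entity_iri}> ?p ?o .\n"
        f"    ?graph <{derived_from}> ?doc .\n"
        "  }\n"
        "}"
    )


sparql = source_content_query("http://example.org/people/abraham")  # made-up IRI
print(sparql)
```

The document URIs that come back could then be handed to the searchresults widget to display as ordinary content results.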

The whole demo allows you to upload a Word document. This is converted to XHTML, and entity enrichment takes place for a list of known places, people and organisations. The document is then loaded into a widget that can find where these things occur in the same HTML paragraph, and present a list of suggested likely facts to the user for confirmation. When the user hits ‘save’, a graph is created per paragraph containing these facts, plus an additional fact about the graph itself (which document it was derived from).
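The suggestion step can be pictured with a small Python sketch. It assumes enrichment has already produced, for each paragraph, a list of (name, type) mentions. The pairing rules, the predicate names, the graph naming scheme and the document URI are all invented for illustration and are not the widget’s real logic.

```python
from itertools import combinations

# Hypothetical rules: (subject type, object type) -> suggested predicate.
RULES = {
    ("Person", "Person"): "knows",
    ("Person", "Organisation"): "member_of",
    ("Person", "Place"): "based_near",
}


def suggest_facts(mentions):
    """Suggest candidate facts for entities mentioned in the same paragraph."""
    suggestions = []
    for (a, a_type), (b, b_type) in combinations(mentions, 2):
        if (a_type, b_type) in RULES:
            suggestions.append((a, RULES[(a_type, b_type)], b))
        elif (b_type, a_type) in RULES:
            # Same rule, but the pair appeared in the opposite order in the text.
            suggestions.append((b, RULES[(b_type, a_type)], a))
    return suggestions


def paragraph_graph(doc_uri, index, confirmed_facts):
    """One graph per paragraph: the confirmed facts plus a derived_from fact."""
    graph_iri = f"{doc_uri}#paragraph-{index}"  # assumed naming scheme
    facts = list(confirmed_facts) + [(graph_iri, "derived_from", doc_uri)]
    return {"graph": graph_iri, "facts": facts}


mentions = [("Sandford Council", "Organisation"), ("Abraham Troublemaker", "Person")]
suggested = suggest_facts(mentions)
saved = paragraph_graph("/docs/report.xhtml", 3, suggested)  # made-up document URI
print(suggested)
print(saved)
```

In the demo the user confirms or rejects each suggestion before anything is saved, so only the confirmed subset would be passed to the save step.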

This widget code is available in my GitHub repo now, but I’m going to do more work to neaten it up too.
https://github.com/adamfowleruk/mldb

NB I’m not using PROV-O yet. I’m not sure whether it covers everything we may want for fact/content provenance. I’m open to suggestions.
