Querying my semantic personal web history knowledge base

The beauty of the DATAMI application is that the combination of the datami-proxy and the datami text processing component builds a complete knowledge base of what the user (me) encounters on the Web. The architecture of the application shows an additional component to score and rank entities from this knowledge base. In reality, the interface (as in the demo video) only needs to send the right queries to the knowledge base, which is materialized in a SPARQL 1.1 compliant triple store, to obtain scored and ranked entities.

So let’s look at what sort of queries we need to get, say, the 25 most popular places in my web history.

The datami-proxy records, through requests, which websites have been accessed. So, if I want to know which websites I have accessed, I need to query for the ‘?ws’ such that
        ?r <http://weblifelog.com/ontology/toSite> ?ws
where ?r represents the request.

The annotations produced by the Stanbol Enhancer Service are then connected to the requests through a “related to” relation. So, if I want to know about all the entities ?x I encountered in websites, I need to add:
        ?r <http://datami.co.uk/ontology/relatedTo> ?ea.
        ?ea <http://fise.iks-project.eu/ontology/entity-reference> ?x
using the structure of the entity annotations returned by the enhancer service (i.e. the “entity reference” relation). Adding the constraint that the type of the entity should be “place”:
        ?ea <http://fise.iks-project.eu/ontology/entity-type> <http://dbpedia.org/ontology/Place>
I get (as ?x) all the places I encountered through my online activities.

Now what we want is the 25 “most popular” ones. Starting simple, we can use the number of websites mentioning the entity as the popularity score, so the query that obtains the 25 most mentioned ones (in SPARQL 1.1) is:
select distinct ?x (count(distinct ?ws) as ?nws)
where {
   ?r <http://datami.co.uk/ontology/relatedTo> ?ea. 
   ?ea <http://fise.iks-project.eu/ontology/entity-reference> ?x.
   ?r <http://weblifelog.com/ontology/toSite> ?ws.
   ?ea <http://fise.iks-project.eu/ontology/entity-type> <http://dbpedia.org/ontology/Place>
} group by ?x order by desc(?nws) limit 25
Easy!

Adding a tiny bit of complexity, we can then also retrieve the label of the entity and use the confidence returned by the enhancer service as part of the score. The query then becomes:
select distinct ?x ?l (count(distinct ?ws) * avg(?conf) as ?score)
where {
   ?r <http://datami.co.uk/ontology/relatedTo> ?ea.
   ?ea <http://fise.iks-project.eu/ontology/entity-reference> ?x.
   ?r <http://weblifelog.com/ontology/toSite> ?ws.
   ?ea <http://fise.iks-project.eu/ontology/confidence> ?conf.
   ?ea <http://fise.iks-project.eu/ontology/entity-label> ?l.
   ?ea <http://fise.iks-project.eu/ontology/entity-type> <http://dbpedia.org/ontology/Place>
} group by ?x ?l order by desc(?score) limit 25
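
To make the arithmetic behind this score concrete, here is a minimal Python sketch that mimics the aggregation on a handful of made-up solution rows. The rows, the entity names and the function name `score_entities` are all hypothetical; only the formula (number of distinct websites times average confidence, grouped by entity and label) comes from the query above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical solution rows, one per match of the graph pattern:
# (entity ?x, label ?l, website ?ws, confidence ?conf)
rows = [
    ("dbpedia:London", "London", "bbc.co.uk", 0.9),
    ("dbpedia:London", "London", "bbc.co.uk", 0.7),
    ("dbpedia:London", "London", "guardian.co.uk", 0.8),
    ("dbpedia:Paris", "Paris", "bbc.co.uk", 1.0),
]


def score_entities(rows, limit=25):
    """Group rows by (?x, ?l) and score each group as
    count(distinct ?ws) * avg(?conf), highest score first."""
    groups = defaultdict(lambda: (set(), []))
    for x, label, ws, conf in rows:
        sites, confs = groups[(x, label)]
        sites.add(ws)        # distinct websites mentioning the entity
        confs.append(conf)   # every confidence value counts towards the average
    scored = [
        (x, label, len(sites) * mean(confs))
        for (x, label), (sites, confs) in groups.items()
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:limit]


ranked = score_entities(rows)
for x, label, score in ranked:
    print(x, label, round(score, 2))
```

Note that the average is taken over every matching row, not over distinct websites, so an entity annotated many times with low confidence on one site pulls its own score down.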

And that’s done! All that is left to get the DATAMI interface is to add filters for selected entities, websites and times, and the possibility to change the type of entities considered.
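
As a rough illustration of how an interface could vary the entity type and restrict the websites, here is a small Python sketch that assembles the query text. The function `build_query` and its parameters are hypothetical, not part of DATAMI, and the website filter assumes that ?ws values are IRIs. A time filter is left out because the property holding the request time is not shown in this post.

```python
PLACE = "http://dbpedia.org/ontology/Place"


def build_query(entity_type=PLACE, limit=25, sites=None):
    """Return the scored-entities query as a string.

    entity_type: IRI of the entity type to rank (default: places).
    sites: optional list of website IRIs to restrict ?ws to (assumption:
           websites are stored as IRIs). Inputs are not validated here.
    """
    lines = [
        "select ?x ?l (count(distinct ?ws) * avg(?conf) as ?score)",
        "where {",
        "   ?r <http://datami.co.uk/ontology/relatedTo> ?ea.",
        "   ?ea <http://fise.iks-project.eu/ontology/entity-reference> ?x.",
        "   ?r <http://weblifelog.com/ontology/toSite> ?ws.",
        "   ?ea <http://fise.iks-project.eu/ontology/confidence> ?conf.",
        "   ?ea <http://fise.iks-project.eu/ontology/entity-label> ?l.",
        f"   ?ea <http://fise.iks-project.eu/ontology/entity-type> <{entity_type}>.",
    ]
    if sites:
        # SPARQL 1.1 VALUES block limiting ?ws to the selected websites
        values = " ".join(f"<{site}>" for site in sites)
        lines.append(f"   values ?ws {{ {values} }}")
    lines.append(f"}} group by ?x ?l order by desc(?score) limit {int(limit)}")
    return "\n".join(lines)


print(build_query("http://dbpedia.org/ontology/Person", limit=10))
```

The resulting string can be sent to any SPARQL 1.1 endpoint; a real interface should validate or escape the IRIs before inserting them into the query.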
