Practical Semantic Web and Linked Data Applications by Mark Watson - HTML preview


Argument   Argument type   Format    Default value
query      required        string
type       optional        string    /location/citytown
limit      optional        integer   20
start      optional        integer   0

The best way to access Freebase is to use the Metaweb Query Language (MQL). However, Freebase now has an RDF interface [1] and we will use this interface in this chapter. Please note that the Java, Clojure, JRuby, and Scala edition of this book wraps the Java Freebase MQL client library for full access to Freebase. You can refer to the other edition of this book for more detailed information concerning Freebase.

http://rdf.freebase.com/

The RDF interface can fetch all RDF triples for a given Freebase RDF resource identifier. As an example, here is the identifier for the Freebase topic about me:

http://rdf.freebase.com/ns/en.mark_louis_watson

The returned triples are (most not shown for brevity): [2]

<http://rdf.freebase.com/ns/en.mark_louis_watson>
  <http://rdf.freebase.com/ns/people.person.date_of_birth>
  "1951" .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
  <http://rdf.freebase.com/ns/common.topic.alias>
  "Mark Watson"@en .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
  <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  <http://rdf.freebase.com/ns/computer.software_developer> .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
  <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  <http://rdf.freebase.com/ns/book.author> .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
  <http://creativecommons.org/ns#attributionName>
  "Source: Freebase - The World’s database" .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
  <http://rdf.freebase.com/ns/people.person.education>
  <http://rdf.freebase.com/ns/m.0b6_ggq> .
<http://rdf.freebase.com/ns/m.0b6_ggq>
  <http://rdf.freebase.com/ns/education.education.institution>
  <http://rdf.freebase.com/ns/en.university_of_california_santa_barbara> .

[1] http://blog.freebase.com/2008/10/30/introducing_the_rdf_service/

[2] Output edited to fit page width.

You can use the Freebase RDF browser3 to find Freebase RDF resource identifiers

using free text search.

13.2. Accessing Freebase from Common Lisp

I created a short file freebase_client/test.lisp that shows you how to create an MQL query, encode it as JSON, and make a web service call to Freebase. For this example I want to create JSON data that looks like:

[{
  "name": "Mark Louis Watson",
  "type": []
}]

When the request is sent, test.lisp stores this query under the key "query" in an outer JSON object.

The file test.lisp creates this JSON query request and makes a web service call:

(require :aserve)
(in-package :net.aserve.client)

;; Load the YASON JSON library (package nickname JSON):
(push "../utils/yason/" asdf:*central-registry*)
(asdf:operate 'asdf:load-op 'yason)

(defvar mql-url
  "http://api.freebase.com/api/service/mqlread?query=")

;; *h2* is the query object; *h* is the outer object holding it:
(defvar *h* (make-hash-table :test #'equal))
(defvar *h2* (make-hash-table :test #'equal))
(setf (gethash "name" *h2*) "Mark Louis Watson")
(setf (gethash "type" *h2*) (make-array 0))
(setf (gethash "query" *h*) (list *h2*))

;; Encode the outer object as a JSON string:
(defvar *hs*
  (with-output-to-string (sstrm)
    (json:encode *h* sstrm)))

;; Append the URI-encoded JSON to the service URL:
(defvar *s*
  (concatenate 'string
               mql-url
               (net.aserve.client::uriencode-string *hs*)))

;; Make the web service call and print the raw response:
(defvar *str-results* (do-http-request *s*))
(format t "Results:~%~%~A~%~%" *str-results*)

[3] http://rdf.freebase.com/

The output is a string containing encoded JSON data and looks like:

{
  "code": "/api/status/ok",
  "result": [
    {
      "name": "Mark Louis Watson",
      "type": [
        "/common/topic",
        "/people/person",
        "/book/author",
        "/computer/software_developer"
      ]
    }
  ],
  "status": "200 OK",
  "transaction_id": "cache;cache04.p01;2010-10-23T22"
}

This code snippet gets the result as Lisp data:

(defvar *results* (json:parse *str-results*))

;; Print each key and value of the first element of the "result" list:
(maphash
 #'(lambda (key val)
     (format t "key: ~A value: ~A~%" key val))
 (car (gethash "result" *results*)))

The output looks like:

key: name value: Mark Louis Watson
key: type value: (/common/topic /people/person
                  /book/author
                  /computer/software_developer)
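The request-building steps in test.lisp (build the query, encode it as JSON, URI-encode it, append it to the service URL) can also be sketched in portable Common Lisp with no external libraries. This is only an illustration: encode-json, percent-encode, and mql-read-url are hypothetical helper names, not functions from YASON or AllegroServe, and the encoder handles just strings, vectors, and association lists with string keys.

```lisp
;; Sketch only: hypothetical helpers, not part of YASON or AllegroServe.

(defun encode-json (value)
  "Encode strings, vectors (JSON arrays), and alists (JSON objects) as JSON text.
Strings are not escaped, so this is only suitable for simple values."
  (cond ((stringp value) (format nil "\"~A\"" value))
        ((vectorp value)
         (format nil "[~{~A~^,~}]" (map 'list #'encode-json value)))
        ((listp value)
         (format nil "{~{~A~^,~}}"
                 (mapcar (lambda (pair)
                           (format nil "~A:~A"
                                   (encode-json (car pair))
                                   (encode-json (cdr pair))))
                         value)))
        (t (error "Unsupported value: ~S" value))))

(defun percent-encode (string)
  "Percent-encode every character outside the URI unreserved set.
Assumes ASCII input; other characters would need UTF-8 encoding first."
  (with-output-to-string (out)
    (loop for ch across string
          do (if (or (alphanumericp ch) (find ch "-_.~"))
                 (write-char ch out)
                 (format out "%~2,'0X" (char-code ch))))))

(defun mql-read-url (name)
  "Build the mqlread URL for a query asking for the types of the topic NAME."
  (let* ((query (list (cons "name" name)
                      (cons "type" (vector))))
         (envelope (list (cons "query" (vector query)))))
    (concatenate 'string
                 "http://api.freebase.com/api/service/mqlread?query="
                 (percent-encode (encode-json envelope)))))
```

Calling (mql-read-url "Mark Louis Watson") returns the service URL followed by the percent-encoded form of {"query":[{"name":"Mark Louis Watson","type":[]}]}. Keeping the three steps in separate functions makes it easy to paste in a query that you first tested interactively.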

13.3. Freebase Wrapup

Freebase is a useful source of semantic data and this chapter introduced you to accessing Freebase in general and from Lisp client code. One issue with Freebase is that it contains sparse data: some topics are well covered and others are not. If you use Freebase in your Lisp applications, start with the interactive query editor [4] to explore the available data and get valid MQL queries for the information you want. Once you have valid MQL queries, use the Lisp code example from the last section to convert your MQL queries to JSON data and call the Freebase web services.

[4] http://www.freebase.com/app/queryeditor

14. Common Lisp Client Library for DBpedia

This chapter covers the development of a general purpose SPARQL client library and the use of this library to access the DBpedia SPARQL endpoint.

DBpedia is a mostly automatic extraction of RDF data from Wikipedia using the metadata in Wikipedia articles. You have two alternatives for using DBpedia in your own applications: using the public DBpedia SPARQL endpoint web service or downloading all or part of the DBpedia RDF data and loading it into your own RDF data store (e.g., AllegroGraph or Sesame).

The public DBpedia SPARQL endpoint URI is http://dbpedia.org/sparql. For the examples in this book we will simply use the public SPARQL endpoint, but for serious applications I suggest that you run your own endpoint using the subset of DBpedia data that you need.

The public DBpedia SPARQL endpoint is run using the Virtuoso Universal Server (http://www.openlinksw.com/). If you want to run your own DBpedia SPARQL endpoint, you can download the RDF data files from http://wiki.dbpedia.org and use the open source version of Virtuoso, Sesame, AllegroGraph, or any other RDF data store that supports SPARQL queries.

14.1. Interactively Querying DBpedia Using the Snorql Web Interface

When you start using DBpedia, a good starting point is the interactive web application

that accepts SPARQL queries and returns results. The URL of this service is:

http://dbpedia.org/snorql

Figure 14.1 shows the DBpedia Snorql web interface showing the results of one of

the sample SPARQL queries used in this section.

Figure 14.1: DBpedia Snorql Web Interface

A good way to become familiar with the DBpedia ontologies used in these examples

is to click the links for property names and resources returned as SPARQL query

results, as seen in Figure 14.1. Here are three different sample queries that you can

try:

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?s ?p WHERE {
  ?s ?p <http://dbpedia.org/resource/Berlin> .
}
ORDER BY ?s

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?s ?p WHERE {
  ?s dbo:state ?p .
}
LIMIT 25

PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?location ?name ?state_name WHERE {
  ?location dbo:state ?state_name .
  ?location dbpedia2:name ?name .
  FILTER (LANG(?name) = 'en') .
}
LIMIT 25

The http://dbpedia.org/snorql SPARQL endpoint web application is a great resource

for interactively exploring the DBpedia RDF datastore. We will look at an alternative

browser in the next section.

14.2. Interactively Finding Useful DBpedia Resources Using the gFacet Browser

The gFacet browser allows you to find RDF resources in DBpedia using a search engine. After finding matching resources you can then dig down by clicking on individual search results.

You can access the gFacet browser using this URL:

http://www.gfacet.org/dbpedia/

Figures 14.2 and 14.3 show a search example where I started by searching for "Arizona parks," found five matching resources, clicked the first match "Parks in Arizona," and then selected "Dead Horse State Park." [1]

Figure 14.2: DBpedia Graph Facet Viewer

Figure 14.3: DBpedia Graph Facet Viewer after selecting a resource

[1] This is a park near my home where I go kayaking and fishing.

14.3. The lookup.dbpedia.org Web Service

We will use Georgi Kobilarov’s DBpedia lookup web service to perform free text searches of DBpedia. If you have a good idea of what you are searching for and know the commonly used DBpedia RDF properties, then using the SPARQL endpoint is convenient. However, it is often simpler to just perform a keyword search, and this is what we will use the lookup web service for. We will later see the implementation of a client library in Section ??. You can find documentation on the REST API at http://lookup.dbpedia.org/api/search.asmx?op=KeywordSearch.

Here is an example URL for a REST query:

http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=Flagstaff&QueryClass=XML&MaxHits=10

As you will see in Section ??, the search client needs to filter results returned from

the lookup web service since the lookup service returns results with partial matches

of search terms. I prefer to get only results that contain all search terms.
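The filtering step described above can be sketched in a few lines of portable Common Lisp. This is not the client code from the later section; split-on-spaces, contains-all-terms-p, and filter-full-matches are hypothetical names, and the sketch assumes that each lookup result has already been reduced to a plain description string.

```lisp
;; Sketch only: hypothetical helpers that keep results containing every search term.

(defun split-on-spaces (string)
  "Split STRING into a list of non-empty words separated by spaces."
  (loop with start = 0
        for end = (position #\Space string :start start)
        for word = (subseq string start end)
        unless (string= word "") collect word
        while end
        do (setf start (1+ end))))

(defun contains-all-terms-p (text query)
  "True when every space-separated term of QUERY occurs in TEXT, ignoring case."
  (every (lambda (term) (search term text :test #'char-equal))
         (split-on-spaces query)))

(defun filter-full-matches (query descriptions)
  "Return only the DESCRIPTIONS that contain all terms of QUERY."
  (remove-if-not (lambda (description)
                   (contains-all-terms-p description query))
                 descriptions))
```

For example, filtering the two made-up descriptions "Flagstaff is a city in Arizona" and "Flagstaff Hill, South Australia" with the query "Flagstaff Arizona" keeps only the first. The match is a simple substring test, so a term also matches inside a longer word.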

The following sections contain implementations of a SPARQL client and a free text

search lookup client.


14.4. Using the AllegroGraph SPARQL Client Library to access DBpedia

The AllegroGraph SPARQL Client library makes it very simple to use the public DBpedia web service. I have an example in dbpedia/test.lisp that shows how to run a sample query: [2]

markws-macbook:lisp_practical_semantic_web markw$ cd dbpedia/
markws-macbook:dbpedia markw$ lisp
CL-USER(1): :ld test
CL-USER(2): (sparql.client::run-sparql-remote
             "http://dbpedia.org/sparql" "
PREFIX dbpedia: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?person {
  ?person
    dbpedia:birthPlace
      <http://dbpedia.org/resource/Boston> .
}
LIMIT 6" :results-format :alists)
(((|?person| .
   !<http://dbpedia.org/resource/Benjamin_Franklin>))
 ((|?person| .
   !<http://dbpedia.org/resource/Cotton_Mather>))
 ((|?person| .
   !<http://dbpedia.org/resource/James_Spader>))
 ((|?person| .
   !<http://dbpedia.org/resource/Christa_McAuliffe>))
 ((|?person| .
   !<http://dbpedia.org/resource/Gordon_K_MacLeod>))
 ((|?person| .
   !<http://dbpedia.org/resource/Edgar_Allan_Poe>)))
:SELECT
#(|?person|)

[2] I had to edit this output to fit the page width.
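The :alists results format returns one association list per result row, keyed by the query variable. The following sketch shows one way to pull the bound values out of such rows. It makes two assumptions that you should check against your own session: the resource values are replaced here by plain URI strings (the real client returns AllegroGraph resource objects printed as !<...>), and the variable symbols are read in the current package. The names binding-values and resource-local-name are hypothetical.

```lisp
;; Sketch only: sample rows use plain strings in place of AllegroGraph resources.

(defparameter *sample-rows*
  '(((|?person| . "http://dbpedia.org/resource/Benjamin_Franklin"))
    ((|?person| . "http://dbpedia.org/resource/Cotton_Mather"))))

(defun binding-values (variable rows)
  "Return the value bound to VARIABLE in each result row of ROWS."
  (mapcar (lambda (row) (cdr (assoc variable row)))
          rows))

(defun resource-local-name (uri)
  "Return the part of the URI string after its last slash."
  (subseq uri (1+ (position #\/ uri :from-end t))))

;; Example use: collect the local names of all ?person bindings.
(defun person-names (rows)
  (mapcar #'resource-local-name
          (binding-values '|?person| rows)))
```

With the sample rows above, (person-names *sample-rows*) gives the two local names Benjamin_Franklin and Cotton_Mather as strings.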

14.5. DBpedia Wrapup

DBpedia is a great information resource and the AllegroGraph SPARQL client library makes it easy to make queries and use the results in Lisp applications. I will not use DBpedia in any further examples in this book but I wanted to show you how to use DBpedia in your Lisp applications.

15. Library for GeoNames

GeoNames (http://www.geonames.org/) is a geographic information database. The raw data is available under a Creative Commons Attribution license. There is a free web service and a commercial web service. For production environments you will want to use the commercial service, but for development purposes and for the examples in this book I use the free service. [1]

15.1. Using the cl-geonames Library

We will use the Common Lisp GeoNames client by Nicolas Lamirault [2] in this chapter. I installed cl-geonames and all dependencies [3] in the utils directory for the examples in this book.

The file geonames/test.lisp contains an example of loading and running a few cl-geonames examples:

mark:lisp_practical_semantic_web markw$ cd geonames/
mark:geonames markw$ alisp
CL-USER(1): :ld test
CL-USER(2): (cl-geonames:geo-country-info :country '("US" "FR"))
(:|geonames|
 (:|country| (:|countryCode| "FR") (:|countryName| "France")
  (:|isoNumeric| "250") (:|isoAlpha3| "FRA") (:|fipsCode| "FR")
  (:|continent| "EU") (:|capital| "Paris") (:|areaInSqKm| "547030.0")
  (:|population| "64768389") ...)
 (:|country| (:|countryCode| "US") (:|countryName| "United States")
  (:|isoNumeric| "840") (:|isoAlpha3| "USA") (:|fipsCode| "US")
  (:|continent| "NA") (:|capital| "Washington")
  (:|areaInSqKm| "9629091.0") (:|population| "310232863") ...))
CL-USER(3): (cl-geonames::geo-country-code "42.21" "-71.5")
(:|geonames|
 (:|country| (:|countryCode| "US") (:|countryName| "United States")
  (:|distance| "0.0")))
CL-USER(4): (cl-geonames::geo-elevation-srtm3 "42.21" "-71.5")
"122"
CL-USER(5): (cl-geonames::geo-country-subdivision "42.21" "-71.5")
(:|geonames|
 (:|countrySubdivision| (:|countryCode| "US")
  (:|countryName| "United States") (:|adminCode1| "MA")
  (:|adminName1| "Massachusetts") ((:|code| :|type| "FIPS10-4") "25")
  ((:|code| :|type| "ISO3166-2") "MA") (:|distance| "0.0")))
CL-USER(6): (cl-geonames::geo-find-nearby-place-name "42.21" "-71.5" :radius 5)
(:|geonames|
 (:|geoname| (:|toponymName| "Hayden Row") (:|name| "Hayden Row")
  (:|lat| "42.20426") (:|lng| "-71.51062") (:|geonameId| "4939154")
  (:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
  (:|fcode| "PPL") ...)
 (:|geoname| (:|toponymName| "Hopkinton") (:|name| "Hopkinton")
  (:|lat| "42.22358") (:|lng| "-71.52282") (:|geonameId| "7257691")
  (:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
  (:|fcode| "PPL") ...)
 (:|geoname| (:|toponymName| "Hopkinton") (:|name| "Hopkinton")
  (:|lat| "42.22871") (:|lng| "-71.52256") (:|geonameId| "4939881")
  (:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
  (:|fcode| "PPL") ...)
 (:|geoname| (:|toponymName| "Camp Bob White")
  (:|name| "Camp Bob White") (:|lat| "42.22648") (:|lng| "-71.4684")
  (:|geonameId| "4932024") (:|countryCode| "US")
  (:|countryName| "United States") (:|fcl| "P") (:|fcode| "PPL") ...)
 (:|geoname| (:|toponymName| "North Milford") (:|name| "North Milford")
  (:|lat| "42.18343") (:|lng| "-71.53784") (:|geonameId| "4945678")
  (:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
  (:|fcode| "PPL") ...))
CL-USER(7): (cl-geonames::geo-search "Sedona" "Sedona" "Sedona" :country '("US") :max-rows 2)
((:|geonames| :|style| "MEDIUM") (:|totalResultsCount| "1")
 (:|geoname| (:|toponymName| "Sedona") (:|name| "Sedona")
  (:|lat| "34.86974") (:|lng| "-111.76099") (:|geonameId| "5313667")
  (:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
  (:|fcode| "PPL")))

[1] The geonames.org web service is limited to 2000 queries per hour from any single IP address. Commercial support is available, or, with some effort, you can run GeoNames on your own server. There are, for example, a few open source Ruby on Rails projects that use the GeoNames data files and provide a web service interface.

[2] http://code.google.com/p/cl-geonames/

[3] Drakma, s-xml, cl-json, chunga, cl-base64, flexi-streams, puri, split-sequence, trivial-gray-streams, usocket
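The results above are nested lists in which each element starts with a keyword tag followed by its children. A small sketch of how you might walk such a result follows. The sample data is a shortened copy of the geo-country-info output shown earlier, and child-elements, child-text, and country-capitals are hypothetical helper names, not part of cl-geonames.

```lisp
;; Sketch only: helpers for the keyword-tagged nested lists shown above.

(defparameter *country-info*
  '(:|geonames|
    (:|country| (:|countryCode| "FR") (:|countryName| "France")
     (:|capital| "Paris"))
    (:|country| (:|countryCode| "US") (:|countryName| "United States")
     (:|capital| "Washington"))))

(defun child-elements (tag element)
  "Return the children of ELEMENT whose first item is the keyword TAG.
Children whose tag carries attributes (a list in first position) are skipped."
  (remove-if-not (lambda (child)
                   (and (consp child) (eq (first child) tag)))
                 (rest element)))

(defun child-text (tag element)
  "Return the text of the first child of ELEMENT tagged TAG, or NIL."
  (second (first (child-elements tag element))))

(defun country-capitals (geonames)
  "Return a list of (country-name . capital) pairs."
  (mapcar (lambda (country)
            (cons (child-text :|countryName| country)
                  (child-text :|capital| country)))
          (child-elements :|country| geonames)))
```

Here (country-capitals *country-info*) pairs "France" with "Paris" and "United States" with "Washington". All values stay strings, as in the web service output, so numeric fields such as population need to be parsed separately.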


15.2. Geonames Wrapup

We will not use GeoNames in any further examples in this book but I wanted to show you how to load and use cl-geonames since it is a great resource when you are dealing with data for countries, cities, etc. The Freebase database also has geographic data.

Part IV. Example Semantic Web Application

16. Semantic Web Portal Back End Services

The web portal application developed in this chapter and in Chapter 17 is meant to show you several useful techniques: how to use an RDF data store instead of a relational database, how to organize complex information as RDF data, and how to write a high performance web application using Common Lisp and the open source Portable AllegroServe library. [1]

In this chapter I will first list all of the "back end" functionality that the web UI developed in Chapter 17 will need. Then we will implement this functionality in a single file, web_app/backend.lisp. The required functionality is:

1. Read initial RDF data from a file init.nt that contains a few login accounts and the port number that the web application will use.

2. Check for valid user login.

3. Utilities for entering new "documents" into the system: perform NLP semantic analysis, then save semantic tags, entities, and input text in AllegroGraph.

4. Search wrapper: given search terms, return matching document IDs.

5. Given a document ID, return all information about the document.

I am going to use my own KnowledgeBooks NLP library in this chapter but a good exercise for you would be to make an alternative version that uses the client library for Open Calais that I wrote for Chapter 11.

Once backend.lisp is written, it will be fairly easy to write the web application UI in Chapter 17.

[1] The part of this example using AllegroGraph is specific to Franz Lisp and AllegroGraph but what you will learn in Chapter 17 will work well with other Common Lisp implementations like SBCL.

16.1. Implementing the Back End APIs

This pedagogical example web application uses AllegroGraph instead of a relational database for all data storage requirements. A real application would probably use a relational database to store user information and AllegroGraph to store semantic data.

The file web_app/backend.lisp contains the implementation of the back end APIs and you should open this file in a text editor while you read through this section because I will only show you a few code snippets in the book text. I start by loading my NLP library and the AllegroGraph library and performing some AllegroGraph initialization as seen in earlier book examples:

(push "../knowledgebooks_nlp/" asdf:*central-registry*)
(asdf:operate 'asdf:load-op :kbnlp)

(eval-when (compile load eval)
  (require :aserve)
  (require :agraph))

Here I loaded my NLP library, the open source portable AllegroServe library, and the embedded AllegroGraph library. The following code snippet performs the same AllegroGraph setup that we have already seen and loads the application parameters from an N-Triple RDF file into our local RDF data store using the file path /tmp/webportal_rdf:

(defpackage :user (:use :net.aserve.client :kbnlp))
(in-package :user)

(db.agraph.user::enable-!-reader)
(db.agraph.user::create-triple-store
 "/tmp/webportal_rdf")
(db.agraph.user::register-namespace
 "kb" "http://knowledgebooks.com/rdfs#")
(db.agraph.user::register-freetext-predicate
 !kb:docTitle)
(db.agraph.user::register-freetext-predicate
 !kb:docText)
(db.agraph.user::load-ntriples #p"init.nt")

The most interesting code in backend.lisp is the function for adding a new document:

(defun add-document (doc-uri doc-title doc-text)
  (let* ((txt-obj
          (kbnlp:make-text-object doc-text
                                  :title doc-title
                                  :url doc-uri))
         (resource (db.agraph.user::resource doc-uri)))
    (db.agraph.user::add-triple resource
                                !rdf:type !kb:document)
    (db.agraph.user::add-triple resource
                                !kb:docTitle
                                (db.agraph.user::literal
                                 doc-title))
    (db.agraph.user::add-triple resource
                                !kb:docText
                                (db.agraph.user::literal
                                 doc-text))
    (dolist (human-name (kbnlp::text-human-names txt-obj))
      (pprint human-name)
      (db.agraph.user::add-triple resource
                                  !kb:docPersonEntity
                                  (db.agraph.user::literal
                                   human-name)))
    (dolist (place-name (kbnlp::text-place-names txt-obj))
      (pprint place-name)
      (db.agraph.user::add-triple resource
                                  !kb:docPlaceEntity
                                  (db.agraph.user::literal
                                   place-name)))
    (dolist (tag (kbnlp::text-category-tags txt-obj))
      (pprint tag)
      (db.agraph.user::add-triple
       resource !kb:docTag
       (db.agraph.user::literal
        (format nil "~A/~A" (car tag) (cadr tag)))))))

Here I used my NLP library but as an exercise, you could rewrite this using the Open Calais client library I provided in Chapter 11. It can be useful having a local entity extraction library so applications do not need access to the Internet. [2]
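To see the shape of the data that add-document stores, here is a sketch that builds the same set of triples as plain Lisp lists, without AllegroGraph or the NLP library. The function document-triples is hypothetical, predicates are written as prefixed strings rather than AllegroGraph resources, and the entity names and tags are passed in directly. Each tag is assumed to be a two-element list because add-document combines (car tag) and (cadr tag); what the two elements mean in the NLP library is not shown here.

```lisp
;; Sketch only: triples as (subject predicate object) lists of strings.

(defun document-triples (doc-uri title text people places tags)
  "Return the triples for one document as a list of three-element lists.
TAGS is a list of two-element lists, joined with a slash as in add-document."
  (append
   (list (list doc-uri "rdf:type" "kb:document")
         (list doc-uri "kb:docTitle" title)
         (list doc-uri "kb:docText" text))
   (mapcar (lambda (name) (list doc-uri "kb:docPersonEntity" name))
           people)
   (mapcar (lambda (name) (list doc-uri "kb:docPlaceEntity" name))
           places)
   (mapcar (lambda (tag)
             (list doc-uri "kb:docTag"
                   (format nil "~A/~A" (car tag) (cadr tag))))
           tags)))
```

Every document contributes three fixed triples (type, title, text) plus one triple per extracted person, place, and tag, all sharing the document URI as subject.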

The function doc-search is a simple wrapper for using the AllegroGraph text search

APIs:

[2] As I write this chapter in October 2010, I am on a ship in the Pacific Ocean with a poor Internet connection.

(defun doc-search (search-term-string)
  "return a list of matching doc IDs"
  (db.agraph.user::freetext-get-ids search-term-string))

Search results are returned as a list of document IDs and the