Argument   Argument type   Format    Default value
query      required        string
type       optional        string    /location/citytown
limit      optional        integer   20
start      optional        integer   0
The best way to access Freebase is to use MQL, the Metaweb Query Language. However,
Freebase now has an RDF interface [1], and we will use this interface in this chapter.
Please note that the Java, Clojure, JRuby, and Scala edition of this book wraps the
Java Freebase MQL client library for full access to Freebase. You can refer to that
edition for more detailed information concerning Freebase.
http://rdf.freebase.com/
The RDF interface can fetch all RDF triples for a given Freebase RDF resource
identifier. As an example, here is the identifier for the Freebase topic about me:
http://rdf.freebase.com/ns/en.mark_louis_watson
The returned triples are (most not shown for brevity): [2]
<http://rdf.freebase.com/ns/en.mark_louis_watson>
<http://rdf.freebase.com/ns/people.person.date_of_birth>
"1951" .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
<http://rdf.freebase.com/ns/common.topic.alias>
"Mark Watson"@en .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://rdf.freebase.com/ns/computer.software_developer> .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://rdf.freebase.com/ns/book.author> .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
<http://creativecommons.org/ns#attributionName>
"Source: Freebase - The World’s database" .
<http://rdf.freebase.com/ns/en.mark_louis_watson>
<http://rdf.freebase.com/ns/people.person.education>
<http://rdf.freebase.com/ns/m.0b6_ggq> .
<http://rdf.freebase.com/ns/m.0b6_ggq>
<http://rdf.freebase.com/ns/education.education.institution>
<http://rdf.freebase.com/ns/en.university_of_california_santa_barbara> .

You can use the Freebase RDF browser [3] to find Freebase RDF resource identifiers
using free text search.

[1] http://blog.freebase.com/2008/10/30/introducing_the_rdf_service/
[2] Output edited to fit page width.
13.2. Accessing Freebase from Common Lisp
I created a short file, freebase_client/test.lisp, that shows you how to create an MQL
query, encode it as JSON, and make a web service call to Freebase. For this example
I want to create JSON data that looks like:

[{
  "name": "Mark Louis Watson",
  "type": []
}]
The file test.lisp creates this JSON query request and makes a web service call:
(require :aserve)
(in-package :net.aserve.client)

;; Load the YASON JSON library from the local utils directory.
(push "../utils/yason/" asdf:*central-registry*)
(asdf:operate 'asdf:load-op 'yason)

(defvar mql-url
  "http://api.freebase.com/api/service/mqlread?query=")

;; Build the query as nested hash tables; the inner table is the MQL query.
(defvar *h* (make-hash-table :test #'equal))
(defvar *h2* (make-hash-table :test #'equal))
(setf (gethash "name" *h2*) "Mark Louis Watson")
(setf (gethash "type" *h2*) (make-array 0))
(setf (gethash "query" *h*) (list *h2*))

;; Encode the query as a JSON string.
(defvar *hs*
  (with-output-to-string (sstrm)
    (json:encode *h* sstrm)))

;; URL-encode the JSON and append it to the service URL.
(defvar *s*
  (concatenate 'string
               mql-url
               (net.aserve.client::uriencode-string *hs*)))

(defvar *str-results* (do-http-request *s*))
(format t "Results:~%~%~A~%~%" *str-results*)

[3] http://rdf.freebase.com/
The output is a string containing encoded JSON data and looks like:
{
  "code": "/api/status/ok",
  "result": [
    {
      "name": "Mark Louis Watson",
      "type": [
        "/common/topic",
        "/people/person",
        "/book/author",
        "/computer/software_developer"
      ]
    }
  ],
  "status": "200 OK",
  "transaction_id": "cache;cache04.p01;2010-10-23T22"
}
This code snippet gets the result as Lisp data:
(defvar *results* (json:parse *str-results*))
(maphash
 #'(lambda (key val)
     (format t "key: ~A value: ~A~%" key val))
 (car (gethash "result" *results*)))

The output looks like:

key: name value: Mark Louis Watson
key: type value: (/common/topic /people/person
                  /book/author
                  /computer/software_developer)
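
As a further illustration, here is a minimal sketch (it is not part of test.lisp) of pulling the type list out of a parsed response and rewriting each type as an RDF identifier. The names result-types, freebase-id-to-rdf-uri, and *sample-response* are hypothetical, and the slash-to-dot mapping is an assumption inferred from the examples in this chapter (for instance, /book/author appears in the RDF output as book.author). The sample hash table is a hand-built stand-in for the value returned by json:parse, so the sketch runs without a network connection.

```lisp
;; Hypothetical helper: return the "type" list of the first entry under "result".
(defun result-types (parsed-response)
  (let ((first-result (first (gethash "result" parsed-response))))
    (when first-result
      (gethash "type" first-result))))

;; Hypothetical helper; the mapping is an assumption inferred from this chapter:
;; "/book/author" -> "http://rdf.freebase.com/ns/book.author"
(defun freebase-id-to-rdf-uri (id)
  (concatenate 'string
               "http://rdf.freebase.com/ns/"
               (substitute #\. #\/ (string-left-trim "/" id))))

;; Hand-built stand-in for (json:parse *str-results*).
(defparameter *sample-response*
  (let ((response (make-hash-table :test #'equal))
        (entry (make-hash-table :test #'equal)))
    (setf (gethash "name" entry) "Mark Louis Watson"
          (gethash "type" entry) (list "/common/topic"
                                       "/people/person"
                                       "/book/author"
                                       "/computer/software_developer")
          (gethash "result" response) (list entry))
    response))

;; Print one RDF identifier per type.
(dolist (uri (mapcar #'freebase-id-to-rdf-uri
                     (result-types *sample-response*)))
  (format t "~A~%" uri))
```

Each printed identifier could then be fetched through the RDF interface described at the start of this chapter.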
13.3. Freebase Wrapup
Freebase is a useful source of semantic data, and this chapter introduced you to
accessing Freebase in general and from Lisp client code. One issue with Freebase is
that its data is sparse: some topics are well covered and others are not. If you use
Freebase in your Lisp applications, start with the interactive query editor [4] to
explore the available data and to get valid MQL queries for the information you want.
Once you have valid MQL queries, use the Lisp code example from the last section to
convert them to JSON data and call the Freebase web services.

[4] http://www.freebase.com/app/queryeditor
14. Common Lisp Client Library for DBpedia
This chapter covers the development of a general purpose SPARQL client library
and the use of this library to access the DBpedia SPARQL endpoint.
DBpedia is a mostly automatic extraction of RDF data from Wikipedia using the
metadata in Wikipedia articles. You have two alternatives for using DBpedia in your
own applications: using the public DBpedia SPARQL endpoint web service, or
downloading all or part of the DBpedia RDF data and loading it into your own RDF data
store (e.g., AllegroGraph or Sesame).
The public DBpedia SPARQL endpoint URI is http://dbpedia.org/sparql. For the
examples in this book we will simply use the public SPARQL endpoint, but for serious
applications I suggest that you run your own endpoint using the subset of DBpedia
data that you need.
The public DBpedia SPARQL endpoint is run using the Virtuoso Universal Server
(http://www.openlinksw.com/). If you want to run your own DBpedia SPARQL endpoint,
you can download the RDF data files from http://wiki.dbpedia.org and use the open
source version of Virtuoso, Sesame, AllegroGraph, or any other RDF data store that
supports SPARQL queries.
14.1. Interactively Querying DBpedia Using the Snorql Web Interface
When you start using DBpedia, a good starting point is the interactive web application
that accepts SPARQL queries and returns results. The URL of this service is:
http://dbpedia.org/snorql
Figure 14.1 shows the DBpedia Snorql web interface displaying the results of one of
the sample SPARQL queries used in this section.
Figure 14.1: DBpedia Snorql Web Interface
A good way to become familiar with the DBpedia ontologies used in these examples
is to click the links for property names and resources returned as SPARQL query
results, as seen in Figure 14.1. Here are three different sample queries that you can
try:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?s ?p WHERE {
  ?s ?p <http://dbpedia.org/resource/Berlin> .
}
ORDER BY ?s

PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?s ?p WHERE {
  ?s dbo:state ?p .
}
limit 25

PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?location ?name ?state_name WHERE {
  ?location dbo:state ?state_name .
  ?location dbpedia2:name ?name .
  FILTER (LANG(?name) = 'en') .
}
limit 25
The http://dbpedia.org/snorql SPARQL endpoint web application is a great resource
for interactively exploring the DBpedia RDF datastore. We will look at an alternative
browser in the next section.
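
The same queries can also be sent to the endpoint from a program. The SPARQL protocol lets a client pass the query text in a URL-encoded query parameter of an HTTP GET request, so building the request URL can be sketched as follows. The names url-encode and sparql-query-url are hypothetical helpers introduced only for this illustration, and the encoder assumes ASCII-only query text.

```lisp
;; Hypothetical helper: percent-encode STRING for a URL query component.
;; Assumes ASCII-only input; letters, digits, and - . _ ~ pass through.
(defun url-encode (string)
  (with-output-to-string (out)
    (loop for ch across string
          do (if (and (< (char-code ch) 128)
                      (or (alphanumericp ch) (find ch "-._~")))
                 (write-char ch out)
                 (format out "%~2,'0X" (char-code ch))))))

;; Hypothetical helper: build a GET URL that carries QUERY in the
;; "query" parameter defined by the SPARQL protocol.
(defun sparql-query-url (endpoint query)
  (concatenate 'string endpoint "?query=" (url-encode query)))

;; Example: print the request URL for a small query.
(format t "~A~%"
        (sparql-query-url "http://dbpedia.org/sparql"
                          "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"))
```

The resulting string could be passed to an HTTP client function such as the do-http-request call used in the Freebase example.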
14.2. Interactively Finding Useful DBpedia Resources Using the gFacet Browser
The gFacet browser allows you to find RDF resources in DBpedia using a search
engine. After finding matching resources you can then dig down by clicking on
individual search results.
You can access the gFacet browser using this URL:
http://www.gfacet.org/dbpedia/
Figures 14.2 and 14.3 show a search example where I started by searching for
"Arizona parks," found five matching resources, clicked the first match "Parks in
Arizona," and then selected "Dead Horse State Park." [1]
Figure 14.2: DBpedia Graph Facet Viewer
Figure 14.3: DBpedia Graph Facet Viewer after selecting a resource

[1] This is a park near my home where I go kayaking and fishing.

14.3. The lookup.dbpedia.org Web Service
We will use Georgi Kobilarov's DBpedia lookup web service to find data in DBpedia
using free text search. If you have a good idea of what you are searching for and
know the commonly used DBpedia RDF properties, then using the SPARQL endpoint is
convenient. However, it is often simpler to just perform a keyword search, and this
is what we will use the lookup web service for. We will later see the implementation
of a client library in Section ??. You can find documentation on the REST API at
http://lookup.dbpedia.org/api/search.asmx?op=KeywordSearch.
Here is an example URL for a REST query:
http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=Flagstaff&QueryClass=XML&MaxHits=10
As you will see in Section ??, the search client needs to filter results returned from
the lookup web service since the lookup service returns results with partial matches
of search terms. I prefer to get only results that contain all search terms.
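
The filtering step described above can be sketched independently of the web service call. This is only an illustration: the names contains-all-terms-p and filter-by-terms are hypothetical, and the sketch assumes that each lookup result has already been reduced to a plain string such as a label or description.

```lisp
;; Hypothetical helper: true when every string in TERMS occurs somewhere
;; in TEXT, ignoring case.
(defun contains-all-terms-p (text terms)
  (every #'(lambda (term)
             (search term text :test #'char-equal))
         terms))

;; Hypothetical helper: keep only the strings in RESULTS that contain
;; all of the search TERMS.
(defun filter-by-terms (results terms)
  (remove-if-not #'(lambda (result)
                     (contains-all-terms-p result terms))
                 results))

;; Only the first string contains both search terms.
(print (filter-by-terms '("Flagstaff, Arizona"
                          "Flagstaff Lake (Maine)"
                          "Arizona")
                        '("flagstaff" "arizona")))
```

Matching on substrings is the simplest choice; a stricter version could compare whole words instead.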
The following sections contain implementations of a SPARQL client and a free text
search lookup client.
14.4. Using the AllegroGraph SPARQL Client Library to access DBpedia
The AllegroGraph SPARQL client library makes it very simple to use the public
DBpedia web service. I have an example in dbpedia/test.lisp that shows how to run a
sample query: [2]
markws-macbook:lisp_practical_semantic_web markw$ cd dbpedia/
markws-macbook:dbpedia markw$ lisp
CL-USER(1): :ld test
CL-USER(2): (sparql.client::run-sparql-remote
              "http://dbpedia.org/sparql" "
PREFIX dbpedia: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?person {
  ?person
    dbpedia:birthPlace
      <http://dbpedia.org/resource/Boston> .
}
LIMIT 6" :results-format :alists)
(((|?person| .
   !<http://dbpedia.org/resource/Benjamin_Franklin>))
 ((|?person| .
   !<http://dbpedia.org/resource/Cotton_Mather>))
 ((|?person| .
   !<http://dbpedia.org/resource/James_Spader>))
 ((|?person| .
   !<http://dbpedia.org/resource/Christa_McAuliffe>))
 ((|?person| .
   !<http://dbpedia.org/resource/Gordon_K_MacLeod>))
 ((|?person| .
   !<http://dbpedia.org/resource/Edgar_Allan_Poe>)))
:SELECT
#(|?person|)

[2] I had to edit this output to fit the page width.
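
With :results-format :alists, each result row is an association list keyed by the query variable, so collecting the values for one variable is a matter of walking the rows. The sketch below is illustrative only: binding-values is a hypothetical helper, and plain strings stand in for the resource values that AllegroGraph prints as !<...>.

```lisp
;; Hypothetical helper: collect the value bound to VARIABLE in each row.
;; Rows that do not bind VARIABLE are skipped.
(defun binding-values (variable rows)
  (loop for row in rows
        for binding = (assoc variable row)
        when binding
          collect (cdr binding)))

;; Stand-in data shaped like the :alists output above; the strings are an
;; assumption replacing AllegroGraph's resource objects.
(defparameter *sample-rows*
  '(((|?person| . "http://dbpedia.org/resource/Benjamin_Franklin"))
    ((|?person| . "http://dbpedia.org/resource/Cotton_Mather"))))

(print (binding-values '|?person| *sample-rows*))
```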
14.5. DBpedia Wrapup
DBpedia is a great information resource, and the AllegroGraph SPARQL client library
makes it easy to make queries and use the results in Lisp applications. I will not use
DBpedia in any further examples in this book, but I wanted to show you how to use
it in your Lisp applications.
15. Library for GeoNames
GeoNames (http://www.geonames.org/) is a geographic information database. The
raw data is available under a Creative Commons Attribution license. There is a free
web service and a commercial web service. For production environments you will
want to use the commercial service, but for development purposes and for the
examples in this book I use the free service [1].
15.1. Using the cl-geonames Library
We will use the Common Lisp GeoNames client by Nicolas Lamirault [2] in this chapter.
I installed cl-geonames and all dependencies [3] in the utils directory for the
examples in this book.
The file geonames/test.lisp contains an example of loading and running a few
cl-geonames examples:
mark:lisp_practical_semantic_web markw$ cd geonames/
mark:geonames markw$ alisp
CL-USER(1): :ld test
CL-USER(2): (cl-geonames:geo-country-info :country '("US" "FR"))
(:|geonames|
(:|country| (:|countryCode| "FR") (:|countryName| "France")
(:|isoNumeric| "250") (:|isoAlpha3| "FRA") (:|fipsCode| "FR")
(:|continent| "EU") (:|capital| "Paris") (:|areaInSqKm| "547030.0")
(:|population| "64768389") ...)
(:|country| (:|countryCode| "US") (:|countryName| "United States")
(:|isoNumeric| "840") (:|isoAlpha3| "USA") (:|fipsCode| "US")
(:|continent| "NA") (:|capital| "Washington")
(:|areaInSqKm| "9629091.0") (:|population| "310232863") ...))

[1] The geonames.org web service is limited to 2000 queries per hour from any single IP address. Commercial support is available or, with some effort, you can run GeoNames on your own server. There are, for example, a few open source Ruby on Rails projects that use the GeoNames data files and provide a web service interface.
[2] http://code.google.com/p/cl-geonames/
[3] Drakma, s-xml, cl-json, chunga, cl-base64, flexi-streams, puri, split-sequence, trivial-gray-streams, usocket

CL-USER(3): (cl-geonames::geo-country-code "42.21" "-71.5")
(:|geonames|
(:|country| (:|countryCode| "US") (:|countryName| "United States")
(:|distance| "0.0")))
CL-USER(4): (cl-geonames::geo-elevation-srtm3 "42.21" "-71.5")
"122"
CL-USER(5): (cl-geonames::geo-country-subdivision "42.21" "-71.5")
(:|geonames|
(:|countrySubdivision| (:|countryCode| "US")
(:|countryName| "United States") (:|adminCode1| "MA")
(:|adminName1| "Massachusetts") ((:|code| :|type| "FIPS10-4") "25")
((:|code| :|type| "ISO3166-2") "MA") (:|distance| "0.0")))
CL-USER(6): (cl-geonames::geo-find-nearby-place-name "42.21" "-71.5" :radius 5)
(:|geonames|
(:|geoname| (:|toponymName| "Hayden Row") (:|name| "Hayden Row")
(:|lat| "42.20426") (:|lng| "-71.51062") (:|geonameId| "4939154")
(:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
(:|fcode| "PPL") ...)
(:|geoname| (:|toponymName| "Hopkinton") (:|name| "Hopkinton")
(:|lat| "42.22358") (:|lng| "-71.52282") (:|geonameId| "7257691")
(:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
(:|fcode| "PPL") ...)
(:|geoname| (:|toponymName| "Hopkinton") (:|name| "Hopkinton")
(:|lat| "42.22871") (:|lng| "-71.52256") (:|geonameId| "4939881")
(:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
(:|fcode| "PPL") ...)
(:|geoname| (:|toponymName| "Camp Bob White")
(:|name| "Camp Bob White") (:|lat| "42.22648") (:|lng| "-71.4684")
(:|geonameId| "4932024") (:|countryCode| "US")
(:|countryName| "United States") (:|fcl| "P") (:|fcode| "PPL") ...)
(:|geoname| (:|toponymName| "North Milford") (:|name| "North Milford")
(:|lat| "42.18343") (:|lng| "-71.53784") (:|geonameId| "4945678")
(:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
(:|fcode| "PPL") ...))
CL-USER(7): (cl-geonames::geo-search "Sedona" "Sedona" "Sedona" :country '("US") :max-rows 2)
((:|geonames| :|style| "MEDIUM") (:|totalResultsCount| "1")
(:|geoname| (:|toponymName| "Sedona") (:|name| "Sedona")
(:|lat| "34.86974") (:|lng| "-111.76099") (:|geonameId| "5313667")
(:|countryCode| "US") (:|countryName| "United States") (:|fcl| "P")
(:|fcode| "PPL")))
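
As the output above shows, the results arrive as nested lists in which each element is a list whose first item is its tag or, when the element has attributes, a list that starts with the tag. A minimal sketch of pulling individual fields out of such a result follows. The helper names element-tag, child-elements, and field-value are hypothetical and are not part of cl-geonames, and the sample data is abbreviated from the geo-country-info output shown earlier.

```lisp
;; Hypothetical helper: return the tag of ELEMENT. An element with
;; attributes has a list in first position; its car is the tag.
(defun element-tag (element)
  (let ((head (car element)))
    (if (consp head) (car head) head)))

;; Hypothetical helper: return the children of ELEMENT whose tag is TAG.
(defun child-elements (element tag)
  (remove-if-not #'(lambda (child)
                     (and (consp child)
                          (eq (element-tag child) tag)))
                 (cdr element)))

;; Hypothetical helper: return the text of the first child named TAG, or NIL.
(defun field-value (element tag)
  (let ((child (first (child-elements element tag))))
    (when child
      (second child))))

;; Sample data abbreviated from the geo-country-info output.
(defparameter *sample-countries*
  '(:|geonames|
    (:|country| (:|countryCode| "FR") (:|countryName| "France")
     (:|capital| "Paris"))
    (:|country| (:|countryCode| "US") (:|countryName| "United States")
     (:|capital| "Washington"))))

;; Print each country name with its capital.
(dolist (country (child-elements *sample-countries* :|country|))
  (format t "~A: ~A~%"
          (field-value country :|countryName|)
          (field-value country :|capital|)))
```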
15.2. GeoNames Wrapup
We will not use GeoNames in any further examples in this book, but I wanted to show
you how to load and use cl-geonames since it is a great resource when you are dealing
with data for countries, cities, etc. The Freebase database also has geographic data.
Part IV. Example Semantic Web Application
16. Semantic Web Portal Back End Services
The web portal application developed in this chapter and in Chapter 17 is meant
to show you several useful techniques: how to use an RDF data store instead of a
relational database, how to organize complex information as RDF data, and how to
write a high performance web application using Common Lisp and the open source
Portable AllegroServe library [1].
In this chapter I will first list all of the "back end" functionality that the web UI
developed in Chapter 17 will need. Then we will implement this functionality in a
single file, web_app/backend.lisp. The required functionality is:
1. Read initial RDF data from a file init.nt that contains a few login accounts and
   the port number that the web application will use.
2. Check for valid user login.
3. Utilities for entering new "documents" into the system: perform NLP semantic
   analysis, then save semantic tags, entities, and input text in AllegroGraph.
4. Search wrapper: given search terms, return matching document IDs.
5. Given a document ID, return all information about the document.
I am going to use my own KnowledgeBooks NLP library in this chapter, but a good
exercise for you would be to make an alternative version that uses the client library
for Open Calais that I wrote for Chapter 11.
Once backend.lisp is written, it will be fairly easy to write the web application
UI in Chapter 17.

[1] The part of this example using AllegroGraph is specific to Franz Lisp and AllegroGraph, but what you will learn in Chapter 17 will work well with other Common Lisp implementations like SBCL.
16.1. Implementing the Back End APIs
This pedagogical example web application uses AllegroGraph instead of a relational
database for all data storage requirements. A real application would probably use a
relational database to store user information and AllegroGraph to store semantic
data.
The file web_app/backend.lisp contains the implementation of the back end APIs, and
you should open this file in a text editor while you read through this section because I
will only show a few code snippets in the book text. I start by loading my NLP
library and the AllegroGraph library and performing some AllegroGraph
initialization as seen in earlier book examples:
(push "../knowledgebooks_nlp/" asdf:*central-registry*)
(asdf:operate 'asdf:load-op :kbnlp)
(eval-when (compile load eval)
  (require :aserve)
  (require :agraph))
Here I loaded my NLP library, the open source Portable AllegroServe library, and
the embedded AllegroGraph library. The following code snippet performs the same
AllegroGraph setup that we have already seen and loads the application parameters
from an N-Triples RDF file into our local RDF data store, which uses the file path
/tmp/webportal_rdf:
(defpackage :user (:use :net.aserve.client :kbnlp))
(in-package :user)
(db.agraph.user::enable-!-reader)
(db.agraph.user::create-triple-store
 "/tmp/webportal_rdf")
(db.agraph.user::register-namespace
 "kb" "http://knowledgebooks.com/rdfs#")
(db.agraph.user::register-freetext-predicate
 !kb:docTitle)
(db.agraph.user::register-freetext-predicate
 !kb:docText)
(db.agraph.user::load-ntriples #p"init.nt")
The most interesting code in backend.lisp is the function for adding a new document:
(defun add-document (doc-uri doc-title doc-text)
  (let* ((txt-obj
          (kbnlp:make-text-object doc-text
                                  :title doc-title
                                  :url doc-uri))
         (resource (db.agraph.user::resource doc-uri)))
    ;; Store the document type, title, and text.
    (db.agraph.user::add-triple resource
                                !rdf:type !kb:document)
    (db.agraph.user::add-triple resource
                                !kb:docTitle
                                (db.agraph.user::literal doc-title))
    (db.agraph.user::add-triple resource
                                !kb:docText
                                (db.agraph.user::literal doc-text))
    ;; Store each person name found by the NLP library.
    (dolist (human-name (kbnlp::text-human-names txt-obj))
      (pprint human-name)
      (db.agraph.user::add-triple resource
                                  !kb:docPersonEntity
                                  (db.agraph.user::literal human-name)))
    ;; Store each place name found by the NLP library.
    (dolist (place-name (kbnlp::text-place-names txt-obj))
      (pprint place-name)
      (db.agraph.user::add-triple resource
                                  !kb:docPlaceEntity
                                  (db.agraph.user::literal place-name)))
    ;; Store each category tag as a string built from its first two elements.
    (dolist (tag (kbnlp::text-category-tags txt-obj))
      (pprint tag)
      (db.agraph.user::add-triple
       resource !kb:docTag
       (db.agraph.user::literal
        (format nil "~A/~A" (car tag) (cadr tag)))))))
Here I used my NLP library, but as an exercise you could rewrite this using the Open
Calais client library I provided in Chapter 11. It can be useful to have a local entity
extraction library so that applications do not need access to the Internet. [2]

[2] As I write this chapter in October 2010, I am on a ship in the Pacific Ocean with a poor Internet connection.

The function doc-search is a simple wrapper for using the AllegroGraph text search
APIs:
(defun doc-search (search-term-string)
  "return a list of matching doc IDs"
  (db.agraph.user::freetext-get-ids search-term-string))
Search results are returned as a list of document IDs and the