Last update: June 15th 2005
sparql2sql is a query engine for SPARQL over Jena triple stores. It rewrites SPARQL queries into SQL, which offloads most of the query execution work to the database and should improve performance.
This is an experimental implementation. It cannot deal with all SPARQL queries and is not fully tested. See the Limitations and known issues sections for some details.
Please direct feedback and bug reports to the Jena mailing list, jena-dev@groups.yahoo.com.
Author: Richard Cyganiak (richard@cyganiak.de)
Currently sparql2sql is only available as Java source code from CVS.
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/jena login
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/jena co sparql2sql
When asked for a password, just press Enter.
All required jar files (the Jena 2.2 jars, the MySQL JDBC connector, and a CVS build of ARQ) are in the lib directory.
There's a runnable example, sparql2sql/Test.java, and a unit test suite in the tests-src directory. Both require a live MySQL 4.1 database. The connection is configured in etc/db_connection.properties.
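As a rough illustration of how such a connection file could be read, here is a minimal sketch using only java.util.Properties. The key names (url, user, password, engine) and the sample values are assumptions made for this example; check the etc/db_connection.properties file shipped with the source for the keys it actually uses.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of sparql2sql: holds the four values
// that the DBConnection constructor in the examples expects.
class DbConnectionConfig {
    final String url;
    final String user;
    final String password;
    final String engine;

    DbConnectionConfig(Properties p) {
        // The key names are assumptions made for this sketch.
        this.url = require(p, "url");
        this.user = require(p, "user");
        this.password = p.getProperty("password", ""); // an empty password is allowed
        this.engine = require(p, "engine");
    }

    // Returns the trimmed value for key, or fails if it is missing or blank.
    static String require(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    // Parses properties-file syntax from any Reader (a FileReader in practice).
    static DbConnectionConfig load(Reader in) {
        Properties p = new Properties();
        try {
            p.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new DbConnectionConfig(p);
    }

    public static void main(String[] args) {
        // Sample content standing in for the real file.
        String sample = "url=jdbc:mysql://localhost/jena_test\n"
                      + "user=jena\n"
                      + "password=secret\n"
                      + "engine=MySQL\n";
        DbConnectionConfig cfg = load(new StringReader(sample));
        System.out.println(cfg.url + " as " + cfg.user + " (" + cfg.engine + ")");
    }
}
```

The four fields line up with the url, user, password and engine arguments passed to DBConnection in the examples that follow.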
sparql2sql can be used to query database-persisted Jena models (ModelRDB). The example creates a ModelRDB, reads an RDF file into the model, then re-opens the model as an RDBDataSource and executes a SPARQL query on that.
// register the sparql2sql query engine
// (must be done once at startup time)
RDBQueryEngineFactory.registerSelf();
// Open a DB connection and DB model
IDBConnection conn = new DBConnection(url, user, password, engine);
ModelMaker maker = ModelFactory.createModelRDBMaker(conn);
Model persistentModel = maker.createModel("myModelName");
// ... do interesting stuff with the model ...
persistentModel.read("http://xmlns.com/foaf/0.1/index.rdf");
// Open the same model as an ARQ DataSet
DataSet ds = RDBDataSource.open(conn, "myModelName");
// Execute a SPARQL query
String sparql =
"PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
"SELECT ?class ?label " +
"WHERE { ?class rdf:type rdfs:Class . " +
" ?class rdfs:label ?label }";
ResultSet results = QueryExecutionFactory.create(
QueryFactory.create(sparql), ds).execSelect();
// Pretty-print results to System.out
new ResultSetFormatter(results).printAll(System.out);
SPARQL's Dataset is a collection consisting of a default graph and any number of named graphs, which are named by URIs.
sparql2sql's implementation of this concept is the RDBDataSource.
The example sets up an RDBDataSource, reads some RDF files into the default graph and some named graphs, and executes a SPARQL query over the Dataset.
// set up datasource
RDBDataSource ds = RDBDataSource.open(
new DBConnection(url, user, password, engine),
"my_dataset");
// clear the dataset if it still contains data from a previous run
ds.clear();
// randomly read some RDF into the default and some named graphs
ds.getDefaultModel().read("http://www.w3.org/1999/02/22-rdf-syntax-ns");
// we have to generate the named graphs first -- clunky!
ds.addNamedModel("urn:my:graph1", ModelFactory.createDefaultModel());
ds.addNamedModel("urn:my:graph2", ModelFactory.createDefaultModel());
ds.addNamedModel("urn:my:graph3", ModelFactory.createDefaultModel());
// now read some stuff
ds.getNamedModel("urn:my:graph1").read("http://www.w3.org/2000/01/rdf-schema");
ds.getNamedModel("urn:my:graph2").read("http://purl.org/dc/elements/1.1/");
ds.getNamedModel("urn:my:graph3").read("http://xmlns.com/foaf/0.1/index.rdf");
// register the SPARQL2SQL query engine -- must be done once at
// startup time
RDBQueryEngineFactory.registerSelf();
// Set log level to debug
// This causes the engine to log executed SELECT statements
Logger.getLogger(RDBDataSource.class).setLevel(Level.DEBUG);
// do a SPARQL query
String sparql =
"PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
"SELECT ?source ?uri ?superclass " +
"WHERE { GRAPH ?source { " +
"{ ?uri rdf:type rdfs:Class } UNION { ?uri rdf:type rdf:Property } " +
"OPTIONAL { ?uri rdfs:subClassOf ?superclass } } }";
Query q = QueryFactory.create(sparql);
ResultSet results = QueryExecutionFactory.create(q, ds).execSelect();
// print results using an ARQ utility class
ResultSetFormatter.out(System.out, results, q);
// close the dataset
ds.close();
This is experimental software in a very early stage of development. No extensive testing has been performed.
WHERE { ?x :a :b OPTIONAL { ?x :c ?y } OPTIONAL { ?x :c ?y } }
(The results depend on which ?y is bound “first”)
WHERE { GRAPH ?g {} }
WHERE { GRAPH ?g { OPTIONAL { ?s ?p ?o } } }
sparql2sql uses the Jena ModelRDB database schema.
This allows SPARQL queries over existing ModelRDB stores, but comes at a performance and complexity cost since the Jena DB schema was not designed with RDF Datasets in mind.
ModelRDB is able to store multiple models in a single statement table. This feature is used by sparql2sql to simulate RDF Datasets.
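To make the idea concrete, the following toy sketch shows one way a basic graph pattern could be rewritten into a SQL self-join over a single statement table, with a graph ID column selecting the model. It is not the sparql2sql implementation: the table name stmt, the columns subj, prop, obj and graph_id, and the plain-string encoding of URIs are all assumptions made for illustration, and the real ModelRDB schema and the SQL that sparql2sql generates are likely to differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy translator: one table alias per triple pattern, equality conditions
// for constants, and join conditions for variables that occur more than once.
class BgpToSqlSketch {
    // Hypothetical table and column names, chosen for illustration only.
    static final String TABLE = "stmt";
    static final String[] COLUMNS = {"subj", "prop", "obj"};

    // Each pattern is {subject, predicate, object}; terms starting with '?'
    // are variables, everything else is treated as a constant.
    static String translate(List<String[]> patterns, int graphId) {
        if (patterns.isEmpty()) {
            throw new IllegalArgumentException("at least one triple pattern is required");
        }
        Map<String, String> firstUse = new LinkedHashMap<>(); // variable -> first column
        List<String> from = new ArrayList<>();
        List<String> where = new ArrayList<>();
        for (int i = 0; i < patterns.size(); i++) {
            String alias = "t" + i;
            from.add(TABLE + " " + alias);
            // Restrict every alias to the rows of one model in the shared table.
            where.add(alias + ".graph_id = " + graphId);
            String[] terms = patterns.get(i);
            for (int c = 0; c < 3; c++) {
                String column = alias + "." + COLUMNS[c];
                String term = terms[c];
                if (term.startsWith("?")) {
                    String seen = firstUse.putIfAbsent(term, column);
                    if (seen != null) {
                        where.add(column + " = " + seen); // repeated variable: join
                    }
                } else {
                    where.add(column + " = '" + term.replace("'", "''") + "'");
                }
            }
        }
        List<String> select = new ArrayList<>();
        for (Map.Entry<String, String> e : firstUse.entrySet()) {
            select.add(e.getValue() + " AS " + e.getKey().substring(1));
        }
        if (select.isEmpty()) {
            select.add("1"); // no variables: only existence matters
        }
        return "SELECT " + String.join(", ", select)
             + " FROM " + String.join(", ", from)
             + " WHERE " + String.join(" AND ", where);
    }

    public static void main(String[] args) {
        // The pattern from the first usage example, against a made-up graph ID 1.
        List<String[]> bgp = Arrays.asList(
            new String[] {"?class", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
                          "http://www.w3.org/2000/01/rdf-schema#Class"},
            new String[] {"?class", "http://www.w3.org/2000/01/rdf-schema#label", "?label"});
        System.out.println(translate(bgp, 1));
    }
}
```

In this sketch, a query over a named graph would differ only in the graph ID it filters on, which illustrates how a single statement table holding several models can stand in for an RDF Dataset.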
Generated SQL statements can be logged by lowering the log level:
Logger.getLogger(RDBDataSource.class).setLevel(Level.DEBUG);