PRA3006-SPARQL

Generic queries

The following two queries are generic and can be run on any SPARQL endpoint. They provide a simple first way to explore the content of a SPARQL endpoint. One of the advantages of RDF is that it can be self-explanatory.

Listing all classes

While there are exceptions, many databases use rdf:type and rdfs:subClassOf to organize their content. Exploring these can be informative and gives at least an initial idea of the data model used by the database (in SPARQL, the keyword a is shorthand for rdf:type):

SELECT DISTINCT ?type WHERE {
  [] a ?type .
}

When we run this on Wikidata, we get:

type
http://schema.org/Dataset
http://wikiba.se/ontology#GeoAutoPrecision
http://wikiba.se/ontology#Property
http://www.w3.org/ns/lemon/ontolex#LexicalSense
http://wikiba.se/ontology#BestRank
http://schema.org/Article
http://www.w3.org/2002/07/owl#Class
http://www.w3.org/2002/07/owl#DatatypeProperty
This table is truncated. See the full table at sparql/rdfType-1.rq

And when we run this on WikiPathways, we get:

type
http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/schemas/virtrdf#QuadStorage
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapFormat
http://www.openlinksw.com/schemas/virtrdf#QuadMap
http://www.openlinksw.com/schemas/virtrdf#QuadMapValue
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapColumn
http://www.openlinksw.com/schemas/virtrdf#QuadMapColumn
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapATable
This table is truncated. See the full table at sparql/rdfType-2.rq

In both cases the output is not easy to work with. For Wikidata, this is because it uses a different property than rdf:type to assign classes. For WikiPathways, it is because the first classes returned come from the SPARQL endpoint software and not from the WikiPathways RDF. You may want to browse the full lists and see what interesting things you can find in them. You can also check the respective chapters elsewhere in this book for more specific queries.
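One way to make the WikiPathways output easier to read is to skip the classes of the endpoint software and to count how often each remaining class is used. The query below is a minimal sketch of that idea. It assumes that every unwanted class lives in the http://www.openlinksw.com/schemas/virtrdf# namespace seen in the table above, and that the endpoint supports SPARQL 1.1 aggregates. The LIMIT of 25 is an arbitrary choice, and on large endpoints the count may still time out.

```sparql
# Count the resources per class, most used classes first,
# ignoring classes from the endpoint software's own namespace.
SELECT ?type (COUNT(?s) AS ?count) WHERE {
  ?s a ?type .
  # Assumption: all endpoint-internal classes share this namespace.
  FILTER(!STRSTARTS(STR(?type), "http://www.openlinksw.com/schemas/virtrdf#"))
}
GROUP BY ?type
ORDER BY DESC(?count)
LIMIT 25
```

Other namespaces can be excluded in the same way by adding further FILTER lines.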

Listing all items of some class

To list example things of a certain type in the database, we can take a class from the output of the queries above and ask for anything of that type. Here I replace rdf:type with the Wikidata property wdt:P31 (instance of) and use protein, which has the identifier Q8054 in Wikidata, as the example type. The query is limited to 10 results:

SPARQL sparql/wikidataProteins.rq

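PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?o wdt:P31 wd:Q8054 .      # ?o is an instance of protein
  ?o rdfs:label ?l .
  FILTER(LANG(?l) = 'en')    # keep only the English labels
} LIMIT 10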

which gives:

o l
http://www.wikidata.org/entity/Q409065 Uroporphyrinogen decarboxylase
http://www.wikidata.org/entity/Q409106 marker of proliferation Ki-67
http://www.wikidata.org/entity/Q409114 Sex determining region Y
http://www.wikidata.org/entity/Q409166 Coagulation factor II, thrombin
http://www.wikidata.org/entity/Q24190 Neurotrophin 3
http://www.wikidata.org/entity/Q30530 Histidine ammonia-lyase
http://www.wikidata.org/entity/Q58321 protein kinase
http://www.wikidata.org/entity/Q63398 Chromogranin B
http://www.wikidata.org/entity/Q74314 Titin
http://www.wikidata.org/entity/Q74581 Growth differentiation factor 15
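The query above only finds items that are directly typed as protein. As a hedged sketch, and assuming that Wikidata's wdt:P279 (subclass of) plays the role that rdfs:subClassOf has in other databases, a SPARQL 1.1 property path can also include instances of subclasses of protein. The variable names are kept from the query above. Property paths can be expensive, so this variant may time out more easily.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?o ?l WHERE {
  # instance of protein, or of any direct or indirect subclass of protein
  ?o wdt:P31/wdt:P279* wd:Q8054 .
  ?o rdfs:label ?l .
  FILTER(LANG(?l) = 'en')
} LIMIT 10
```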

Listing all properties of some class

Second, once you have identified a class of interest, you will want to see which properties are used for that class. For proteins, you can do:

SPARQL sparql/wikidataProteinProperties.rq

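PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?p WHERE {
  ?o wdt:P31 wd:Q8054 .   # ?o is an instance of protein
  ?o ?p [] .              # any predicate used with ?o as subject
} LIMIT 100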

which gives:

p
http://www.wikidata.org/prop/direct/P352
http://www.wikidata.org/prop/direct-normalized/P352
http://www.wikidata.org/prop/direct/P361
http://www.wikidata.org/prop/direct/P486
http://www.wikidata.org/prop/direct-normalized/P486
http://www.wikidata.org/prop/direct/P527
http://www.wikidata.org/prop/direct/P638
http://www.wikidata.org/prop/direct-normalized/P637
http://www.wikidata.org/prop/direct/P637
http://www.wikidata.org/prop/direct/P646
http://www.wikidata.org/prop/direct/P682
http://www.wikidata.org/prop/direct/P680
http://www.wikidata.org/prop/direct/P681
http://www.wikidata.org/prop/direct/P692
http://www.wikidata.org/prop/direct/P702
http://www.wikidata.org/prop/direct/P703
http://www.wikidata.org/prop/direct/P705
http://www.wikidata.org/prop/direct/P2892
http://www.wikidata.org/prop/direct/P2888
http://www.wikidata.org/prop/direct/P6366
http://www.wikidata.org/prop/direct/P10283
http://www.wikidata.org/prop/P31
http://www.wikidata.org/prop/direct/P31
This table is truncated. See the full table at sparql/wikidataProteinProperties.rq

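Note that this query is limited to 100 results because it times out otherwise. The Wikidata data model is complicated: as the table shows, the same property (for example P31) appears under several predicate namespaces, such as prop/direct/ and prop/.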
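The property IRIs in this table are hard to read without their names. The following sketch looks up an English label for each predicate found. It relies on wikibase:directClaim, which in the Wikidata RDF links a property entity to its prop/direct/ predicate, so predicates from the other namespaces drop out of the result. The variable names ?property and ?propertyLabel are chosen here for illustration, and the inner LIMIT is kept to avoid the timeout.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?p ?propertyLabel WHERE {
  {
    # Same query as above: predicates used on proteins.
    SELECT DISTINCT ?p WHERE {
      ?o wdt:P31 wd:Q8054 .
      ?o ?p [] .
    } LIMIT 100
  }
  # Map the prop/direct/ predicate back to its property entity.
  ?property wikibase:directClaim ?p .
  ?property rdfs:label ?propertyLabel .
  FILTER(LANG(?propertyLabel) = 'en')
}
```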