The following two queries are generic and can be run on any SPARQL endpoint. They provide a simple first way to explore the content of a SPARQL endpoint. One of the advantages of RDF is that it can be self-explanatory.
While there are exceptions, many databases use rdf:type and rdfs:subClassOf to organize the content. Exploring these can be informative and gives at least an initial idea of the data model used by the database (a is shorthand for rdf:type in the SPARQL language):
SELECT DISTINCT ?type WHERE {
[] a ?type .
}
When we run this on Wikidata, we get:
type
http://schema.org/Dataset
http://wikiba.se/ontology#GeoAutoPrecision
http://wikiba.se/ontology#Property
http://www.w3.org/ns/lemon/ontolex#LexicalSense
http://wikiba.se/ontology#BestRank
http://schema.org/Article
http://www.w3.org/2002/07/owl#Class
http://www.w3.org/2002/07/owl#DatatypeProperty
This table is truncated. See the full table at sparql/rdfType-1.rq
And when we run this on WikiPathways, we get:
type
http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
http://www.openlinksw.com/schemas/virtrdf#QuadStorage
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapFormat
http://www.openlinksw.com/schemas/virtrdf#QuadMap
http://www.openlinksw.com/schemas/virtrdf#QuadMapValue
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapColumn
http://www.openlinksw.com/schemas/virtrdf#QuadMapColumn
http://www.openlinksw.com/schemas/virtrdf#array-of-QuadMapATable
This table is truncated. See the full table at sparql/rdfType-2.rq
In both cases the output is not easy to work with. For Wikidata, this is because it actually uses a different property than rdf:type to type its items (wdt:P31), and for WikiPathways because the endpoint first returns classes that come from the SPARQL endpoint software, and not from the WikiPathways RDF.
You may want to browse the full lists and see what interesting things you can find in them.
You can also check the respective chapters elsewhere in this book for more specific queries.
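One way to make the WikiPathways list easier to read is to hide the classes that belong to the endpoint software. The following is a minimal sketch, assuming that all of these classes live in the http://www.openlinksw.com/schemas/virtrdf# namespace seen in the table above. Other system namespaces may show up further down the list and would need their own filter.

```sparql
# List the classes in use, skipping those from the endpoint software.
# Assumption: the software classes all share the virtrdf# namespace.
SELECT DISTINCT ?type WHERE {
  [] a ?type .
  FILTER(!STRSTARTS(STR(?type), "http://www.openlinksw.com/schemas/virtrdf#"))
}
```

The STR() call turns the class IRI into a plain string so that STRSTARTS() can compare it with the namespace prefix.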
To list example things of a certain type in the database, we can take the output from the above examples and ask for anything of a specific type. Here I replace rdf:type with the Wikidata property wdt:P31 and use protein, which has the identifier Q8054 in Wikidata, as the example type (limited to 10 results):
SPARQL sparql/wikidataProteins.rq
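PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?o wdt:P31 wd:Q8054 .      # instances of protein (Q8054)
  ?o rdfs:label ?l .
  FILTER(LANG(?l) = 'en')    # English labels only
} LIMIT 10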
which gives:
o | l
http://www.wikidata.org/entity/Q409065 | Uroporphyrinogen decarboxylase
http://www.wikidata.org/entity/Q409106 | marker of proliferation Ki-67
http://www.wikidata.org/entity/Q409114 | Sex determining region Y
http://www.wikidata.org/entity/Q409166 | Coagulation factor II, thrombin
http://www.wikidata.org/entity/Q24190 | Neurotrophin 3
http://www.wikidata.org/entity/Q30530 | Histidine ammonia-lyase
http://www.wikidata.org/entity/Q58321 | protein kinase
http://www.wikidata.org/entity/Q63398 | Chromogranin B
http://www.wikidata.org/entity/Q74314 | Titin
http://www.wikidata.org/entity/Q74581 | Growth differentiation factor 15
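Just as Wikidata uses wdt:P31 where other databases use rdf:type, it uses wdt:P279 ("subclass of") where other databases use rdfs:subClassOf. As a sketch under that assumption, the following query lists a few direct subclasses of protein. The variable names are my own choice, and the results will change as Wikidata is edited.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label WHERE {
  ?class wdt:P279 wd:Q8054 .       # direct subclasses of protein (Q8054)
  ?class rdfs:label ?label .
  FILTER(LANG(?label) = 'en')      # English labels only
} LIMIT 10
```

On an endpoint that follows the more common convention, the same pattern should work with rdfs:subClassOf in place of wdt:P279.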
Second, once you have identified a class of interest, you will want to see which properties are used for that class. For proteins, you can do:
SPARQL sparql/wikidataProteinProperties.rq
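PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?p WHERE {
  ?o wdt:P31 wd:Q8054 .    # instances of protein (Q8054)
  ?o ?p [] .               # any property used on them
} LIMIT 100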
which gives:
p
http://www.wikidata.org/prop/direct/P352
http://www.wikidata.org/prop/direct-normalized/P352
http://www.wikidata.org/prop/direct/P361
http://www.wikidata.org/prop/direct/P486
http://www.wikidata.org/prop/direct-normalized/P486
http://www.wikidata.org/prop/direct/P527
http://www.wikidata.org/prop/direct/P638
http://www.wikidata.org/prop/direct-normalized/P637
http://www.wikidata.org/prop/direct/P637
http://www.wikidata.org/prop/direct/P646
http://www.wikidata.org/prop/direct/P682
http://www.wikidata.org/prop/direct/P680
http://www.wikidata.org/prop/direct/P681
http://www.wikidata.org/prop/direct/P692
http://www.wikidata.org/prop/direct/P702
http://www.wikidata.org/prop/direct/P703
http://www.wikidata.org/prop/direct/P705
http://www.wikidata.org/prop/direct/P2892
http://www.wikidata.org/prop/direct/P2888
http://www.wikidata.org/prop/direct/P6366
http://www.wikidata.org/prop/direct/P10283
http://www.wikidata.org/prop/P31
http://www.wikidata.org/prop/direct/P31
This table is truncated. See the full table at sparql/wikidataProteinProperties.rq
Note that this query is limited to 100 results, as it times out otherwise. The Wikidata model is complicated.
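The output above mixes several views of the same property, such as prop/direct/P352 next to prop/direct-normalized/P352. If you only care about the direct statements, you can filter on the namespace of the property. This is a sketch rather than a tuned query: I keep the LIMIT because I have no reason to assume the filter prevents the timeout.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT DISTINCT ?p WHERE {
  ?o wdt:P31 wd:Q8054 .    # instances of protein (Q8054)
  ?o ?p [] .               # any property used on them
  # keep only properties in the prop/direct/ namespace;
  # the trailing slash excludes prop/direct-normalized/
  FILTER(STRSTARTS(STR(?p), "http://www.wikidata.org/prop/direct/"))
} LIMIT 100
```

Because the LIMIT applies after the filter, the 100 rows are now all direct properties, although the list may still be incomplete.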