PRA3006-SPARQL

Wikidata

License: CCZero

Wikidata is not a life sciences database but a general-purpose database related to Wikipedia [1]. That said, various research groups have started using Wikidata for the life sciences [2,3]. For example, CAS registry numbers in Wikidata and Wikipedia have been validated against the Common Chemistry database [4], and the LOTUS project uses Wikidata to make available which chemicals are found in which taxa [5].

Entities

The RDF contains all pathways, their datanodes (genes, proteins, metabolites, etc.), author information, molecular descriptors, and more. The main classes are:

Data model

Example queries

Proteins

We can list proteins with the following query:

SPARQL sparql/wikidataProteins.rq

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?o wdt:P31 wd:Q8054.
  ?o rdfs:label ?l.
  FILTER(LANG(?l)='en')
} LIMIT 10

which gives:

o l
http://www.wikidata.org/entity/Q409065 Uroporphyrinogen decarboxylase
http://www.wikidata.org/entity/Q409106 marker of proliferation Ki-67
http://www.wikidata.org/entity/Q409114 Sex determining region Y
http://www.wikidata.org/entity/Q409166 Coagulation factor II, thrombin
http://www.wikidata.org/entity/Q24190 Neurotrophin 3
http://www.wikidata.org/entity/Q30530 Histidine ammonia-lyase
http://www.wikidata.org/entity/Q58321 protein kinase
http://www.wikidata.org/entity/Q63398 Chromogranin B
http://www.wikidata.org/entity/Q74314 Titin
http://www.wikidata.org/entity/Q74581 Growth differentiation factor 15
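
The same query can also be sent to the endpoint from a script. The following is a minimal sketch, not the method used to build this page: it assumes the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and the standard SPARQL JSON results format. The function names (build_url, rows_from_results, run_query) and the User-Agent string are hypothetical and chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

# The protein query from this section, with all prefixes spelled out.
QUERY = """
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?o wdt:P31 wd:Q8054.
  ?o rdfs:label ?l.
  FILTER(LANG(?l)='en')
} LIMIT 10
"""


def build_url(query, endpoint=ENDPOINT):
    """Encode the query as a GET request URL for the endpoint."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def rows_from_results(doc):
    """Flatten a SPARQL JSON results document into one dict per row."""
    variables = doc["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in doc["results"]["bindings"]
    ]


def run_query(query, user_agent="sparql-book-example/0.1 (hypothetical)"):
    """Send the query and return the parsed rows (needs network access)."""
    request = urllib.request.Request(
        build_url(query),
        headers={
            # Ask for the SPARQL JSON results format.
            "Accept": "application/sparql-results+json",
            # Hypothetical identifier; replace it with your own contact details.
            "User-Agent": user_agent,
        },
    )
    with urllib.request.urlopen(request) as response:
        return rows_from_results(json.load(response))
```

Calling run_query(QUERY) should return a list of dictionaries with the keys o and l, matching the columns of the table above. The exact rows may differ, because the query has no ORDER BY clause and Wikidata changes over time.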

Chemicals

We can also list chemicals, with this query:

SPARQL sparql/wikidataChemicals.rq

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  ?o wdt:P31 wd:Q113145171 .
  ?o rdfs:label ?l.
  FILTER(LANG(?l)='en')
} LIMIT 50

which gives:

o l
http://www.wikidata.org/entity/Q153 ethanol
http://www.wikidata.org/entity/Q50703 cesium iodide
http://www.wikidata.org/entity/Q50980 xanthine
http://www.wikidata.org/entity/Q52353 benzyl alcohol
http://www.wikidata.org/entity/Q150681 octane
http://www.wikidata.org/entity/Q150694 nonane
http://www.wikidata.org/entity/Q150717 decane
http://www.wikidata.org/entity/Q150731 undecane
This table is truncated. See the full table at sparql/wikidataChemicals.rq
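
The protein and chemical queries differ only in the class (wd:Q8054 versus wd:Q113145171) and in the LIMIT value. As a rough sketch of that shared pattern, the query text can be generated for any class identifier. The function name instances_query and the template are hypothetical helpers for illustration; they are not part of this book or of Wikidata.

```python
import re

# Template for "list instances of a class with their English labels".
# Doubled braces are literal braces in str.format().
QUERY_TEMPLATE = """\
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {{
  ?o wdt:P31 wd:{qid} .
  ?o rdfs:label ?l .
  FILTER(LANG(?l)='en')
}} LIMIT {limit}
"""


def instances_query(class_qid, limit=10):
    """Return the SPARQL text listing instances of the given Wikidata class."""
    # Accept only identifiers of the form Q<number>, so that arbitrary
    # text cannot end up inside the query.
    if not re.fullmatch(r"Q[1-9][0-9]*", class_qid):
        raise ValueError(f"not a Wikidata item identifier: {class_qid!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return QUERY_TEMPLATE.format(qid=class_qid, limit=limit)


# The two queries of this page, regenerated from the template.
protein_query = instances_query("Q8054", limit=10)
chemical_query = instances_query("Q113145171", limit=50)
```

Validating the identifier before formatting keeps the generated query limited to the intended triple pattern, which matters as soon as the class comes from user input.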

References

  1. Vrandečić D, Pintscher L, Krötzsch M. Wikidata: The Making Of. WWW ’23 Companion: Companion Proceedings of the ACM Web Conference 2023 [Internet]. 2023 Apr 30; Available from: https://dl.acm.org/doi/10.1145/3543873.3585579 doi:10.1145/3543873.3585579
  2. Waagmeester A, Stupp G, Burgstaller-Muehlbacher S, Good BM, Griffith M, Griffith O, et al. Wikidata as a knowledge graph for the life sciences. eLife [Internet]. 2020 Mar 17;9. Available from: https://elifesciences.org/articles/52614 doi:10.7554/ELIFE.52614
  3. Waagmeester A, Willighagen EL, Su AI, Kutmon M, Gayo JEL, Fernández-Álvarez D, et al. A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses. BMC Biol [Internet]. 2021 Jan 22;19(1):12. Available from: https://bmcbiol.biomedcentral.com/track/pdf/10.1186/s12915-020-00940-y.pdf doi:10.1186/S12915-020-00940-Y
  4. Jacobs A, Williams D, Hickey K, Patrick N, Williams AJ, Chalk S, et al. CAS Common Chemistry in 2021: Expanding Access to Trusted Chemical Information for the Scientific Community. JCIM. 2022 May 13; doi:10.1021/ACS.JCIM.2C00268
  5. Rutz A, Sorokina M, Galgonek J, Mietchen D, Willighagen E, Gaudry A, et al. The LOTUS initiative for open knowledge management in natural products research. eLife. 2022 May 26;11. doi:10.7554/ELIFE.70780