Wikidata Recommenders

At this week’s OpenSym conference, we will present our evaluation of property recommender systems for Wikidata and, more generally, collaborative knowledge bases. The (admittedly way too long) title of our paper is: “An Empirical Evaluation of Property Recommender Systems for Wikidata and Collaborative Knowledge Bases”.

Wikidata is a popular example of a collaboratively filled and maintained knowledge base. Such knowledge bases mostly rely on a community of committed people who edit and add data, and these users are often supported by recommender systems while entering and editing data. In principle, data is entered in triple form: subject-property-object, where property-object pairs (so-called “statements”) describe a subject. Wikidata supports its editors with a so-called “property suggester”, which recommends further suitable properties for a given subject.
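To make the triple form and the idea of suggesting properties concrete, here is a minimal sketch in Python. It is not the implementation of any of the evaluated systems: the toy triples, the readable labels used in place of Wikidata identifiers, and the function names `properties_by_subject` and `suggest_properties` are all assumptions made for illustration. Candidates are scored with a generic association-rule confidence, conf(p -> q) = count(p and q) / count(p).

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical toy statements as (subject, property, object) triples.
# Labels stand in for real Wikidata item and property identifiers.
TRIPLES = [
    ("Q_alice", "instance of", "human"),
    ("Q_alice", "date of birth", "1970-01-01"),
    ("Q_alice", "occupation", "writer"),
    ("Q_bob", "instance of", "human"),
    ("Q_bob", "date of birth", "1980-05-05"),
    ("Q_vienna", "instance of", "city"),
    ("Q_vienna", "country", "Austria"),
]


def properties_by_subject(triples):
    """Group the properties used on each subject, ignoring the objects."""
    props = defaultdict(set)
    for subject, prop, _obj in triples:
        props[subject].add(prop)
    return dict(props)


def suggest_properties(triples, existing, k=3):
    """Rank properties missing from `existing` by summed rule confidence."""
    single = Counter()  # number of subjects using property p
    pair = Counter()    # number of subjects using both p and q
    for props in properties_by_subject(triples).values():
        single.update(props)
        pair.update(permutations(sorted(props), 2))

    scores = defaultdict(float)
    for (p, q), count in pair.items():
        if p in existing and q not in existing:
            scores[q] += count / single[p]  # conf(p -> q)

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [q for q, _score in ranked[:k]]


if __name__ == "__main__":
    print(suggest_properties(TRIPLES, {"instance of"}))
```

For a subject that so far only has "instance of", the sketch ranks "date of birth" first, because two of the three subjects with "instance of" also carry it.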

In this work, we evaluate different recommendation algorithms serving this purpose. Specifically, we compare an approach by Abedjan and Naumann, the current Wikidata recommender, and the Snoopy approach we developed a couple of years ago.

[Figure: Recommender Evaluation: Recall@k]

We identify three important factors that influence the quality of recommendations: (i) incorporating classifying properties into the rule creation process, as done in the Wikidata approach (WD), (ii) ranking according to confidence values, as performed by Abedjan and Naumann (AN), and (iii) incorporating contextual information into the ranking process, as proposed by the Snoopy approach (SN_context). We find that the current implementation of the Wikidata Entity Suggester works better than the other presented approaches. In the course of our analyses, we identify two key aspects that are essential for the quality of recommendations: incorporating classifying properties and making use of contextual information for ranking the property recommendation candidates. Combining the current Wikidata Entity Suggester approach with Snoopy’s ranking strategy, which makes use of contextual information, significantly increases the performance of the current Wikidata recommender, as can be seen in the Recall@k evaluation (WD_context).
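For readers unfamiliar with the metrics, the following sketch shows the common definitions of recall@k and precision@k for a single ranked recommendation list. It illustrates the metrics only; the evaluation protocol used in the paper (how properties are withheld and how results are averaged over items) is not described in this post, and the function names and example data are assumptions.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant properties found in the top-k recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = sum(1 for prop in recommended[:k] if prop in relevant)
    return hits / len(relevant)


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k slots that hold a relevant property."""
    if k <= 0:
        return 0.0
    relevant = set(relevant)
    hits = sum(1 for prop in recommended[:k] if prop in relevant)
    return hits / k


if __name__ == "__main__":
    # Hypothetical ranked output of a recommender (assumed free of duplicates)
    # and the properties that were withheld from the item for testing.
    recommended = ["date of birth", "country", "occupation", "place of birth"]
    relevant = {"date of birth", "occupation", "spouse"}
    for k in (1, 2, 3):
        print(k, recall_at_k(recommended, relevant, k),
              precision_at_k(recommended, relevant, k))
```

Recall@k can only grow as k increases, while precision@k is bounded by the number of relevant properties divided by k, which is one reason a recommender can show high recall and comparatively low precision at the same cutoff.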

You can find our open source implementation of the underlying evaluation framework and the evaluated algorithms here: https://github.com/dbisibk/PropertyRecommenderEvaluator/

 

For more details, please check out our OpenSym paper:

  • Eva Zangerle, Wolfgang Gassler, Stefan Steinhauser, and Günther Specht. An Empirical Evaluation of Property Recommender Systems for Wikidata and Collaborative Knowledge Bases. In Proceedings of the 12th International Symposium on Open Collaboration, OpenSym ’16, Berlin, Germany, 2016. ACM. doi:10.1145/2957792.2957804

    The Wikidata platform is a crowdsourced, structured knowledgebase aiming to provide integrated, free and language-agnostic facts which are, amongst others, used by Wikipedias. Users who actively enter, review and revise data on Wikidata are assisted by a property suggesting system which provides users with properties that might also be applicable to a given item. We argue that evaluating and subsequently improving this recommendation mechanism and hence, assisting users, can directly contribute to an even more integrated, consistent and extensive knowledge base serving a huge variety of applications. However, the quality and usefulness of such recommendations has not been evaluated yet. In this work, we provide the first evaluation of different approaches aiming to provide users with property recommendations in the process of curating information on Wikidata. We compare the approach currently facilitated on Wikidata with two state-of-the-art recommendation approaches stemming from the field of RDF recommender systems and collaborative information systems. Further, we also evaluate hybrid recommender systems combining these approaches. Our evaluations show that the current recommendation algorithm works well in regards to recall and precision, reaching a recall@7 of 79.71% and a precision@7 of 27.97%. We also find that generally, incorporating contextual as well as classifying information into the computation of property recommendations can further improve its performance significantly.

    @inproceedings{opensym16,
    title = {An Empirical Evaluation of Property Recommender Systems for Wikidata and Collaborative Knowledge Bases},
    author = {Eva Zangerle and Wolfgang Gassler and Stefan Steinhauser and G\"{u}nther Specht},
    url = {https://www.evazangerle.at/wp-content/uploads/2017/06/opensym16.pdf},
    doi = {10.1145/2957792.2957804},
    year = {2016},
    date = {2016-01-01},
    booktitle = {Proceedings of the 12th International Symposium on Open Collaboration},
    publisher = {ACM},
    address = {Berlin, Germany},
    series = {OpenSym '16},
    abstract = {The Wikidata platform is a crowdsourced, structured knowledgebase aiming to provide integrated, free and language-agnostic facts which are---amongst others---used by Wikipedias. Users who actively enter, review and revise data on Wikidata are assisted by a property suggesting system which provides users with properties that might also be applicable to a given item. We argue that evaluating and subsequently improving this recommendation mechanism and hence, assisting users, can directly contribute to an even more integrated, consistent and extensive knowledge base serving a huge variety of applications. However, the quality and usefulness of such recommendations has not been evaluated yet. In this work, we provide the first evaluation of different approaches aiming to provide users with property recommendations in the process of curating information on Wikidata. We compare the approach currently facilitated on Wikidata with two state-of-the-art recommendation approaches stemming from the field of RDF recommender systems and collaborative information systems. Further, we also evaluate hybrid recommender systems combining these approaches. Our evaluations show that the current recommendation algorithm works well in regards to recall and precision, reaching a recall@7 of 79.71\% and a precision@7 of 27.97\%. We also find that generally, incorporating contextual as well as classifying information into the computation of property recommendations can further improve its performance significantly.},
    keywords = {},
    pubstate = {published},
    tppubtype = {inproceedings}
    }
