Studio Ousia Inc. and Nara Institute of Science and Technology (NAIST) jointly took second place in the triple scoring task of the 2017 WSDM Cup, hosted by Cambridge University, with a new system based on deep learning. Studio Ousia has been engaged in collaborative research with NAIST since 2016.
The objective of the triple scoring task is to develop a model that uses a knowledge base such as Wikipedia to judge how relevant each of a person’s properties is from the user’s perspective. For example, according to such a knowledge base, Barack Obama has a number of properties, such as politician, author, lawyer, and professor. However, it is natural to assume that most users would expect a system to identify him primarily as a politician. The task required highly accurate predictions of the validity of a given person’s attributes by analyzing a publicly available knowledge base (i.e., Wikipedia) together with a small number of annotations gathered via crowdsourcing.
The method uses deep learning to analyze information from Wikipedia. The model works in two steps. In the first step, several kinds of deep-learning-based classification models are applied to the task. In the second step, a further model based on gradient boosting takes the outputs of the first-step models as its input and combines them, yielding a highly accurate system.
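The two-step design described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the system entered in the competition: the three columns of `X` are made-up scores standing in for the outputs of the first-step neural classifiers, the targets `y` are made-up relevance labels, and the second step is a small hand-written gradient boosting regressor over decision stumps with squared loss. All function names are hypothetical.

```python
# Sketch of two-step stacking: first-step model scores feed a gradient-boosted
# ensemble. Data and names are hypothetical stand-ins, not the actual system.

def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            err = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lmean, rmean)
    if best is None:
        # Every feature is constant: fall back to predicting the mean residual.
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, row):
    j, t, lmean, rmean = stump
    return lmean if row[j] <= t else rmean


def fit_gradient_boosting(X, y, n_rounds=100, learning_rate=0.1):
    """Second step: boost stumps on the residuals of the current ensemble."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# First step (assumed): each row holds the scores that three hypothetical base
# classifiers assigned to one (person, profession) pair.
X = [
    [0.90, 0.80, 0.95],
    [0.20, 0.30, 0.10],
    [0.60, 0.70, 0.50],
    [0.10, 0.20, 0.15],
    [0.80, 0.90, 0.85],
    [0.40, 0.30, 0.50],
]
# Made-up crowdsourced relevance labels for the same pairs.
y = [7, 1, 5, 0, 7, 3]

model = fit_gradient_boosting(X, y)
```

Calling `predict(model, [0.85, 0.9, 0.9])` then returns the combined relevance estimate for a new pair whose first-step scores are all high. Training the second step on the first-step outputs, rather than averaging them, lets the ensemble learn how much weight each base model deserves.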
Twenty-one teams from all over the world participated in the competition, and our model took second place. First prize went to the Chinese Academy of Sciences, a Chinese national research institution, and third prize to the University of Illinois at Urbana-Champaign (UIUC), a US university known for its information science research.
Our team and the competition organizer