Studio Ousia wins the NEEL Challenge at WWW2015 by a wide margin with a machine-learning-based language processing engine
Semantic Kernel, a new language processing engine which analyzes large amounts of text data with high speed and accuracy, brings in the big win
TOKYO, May 26, 2015 – Studio Ousia's proposed system won the "entity linking" competition, the Named Entity rEcognition and Linking (NEEL) Challenge, by a wide margin over the second-place team and all others. The competition was conducted at the world's largest academic conference on web research, the International World Wide Web Conference (WWW2015), held in Florence, Italy, from May 18th to 22nd, 2015.
Entity linking is a natural language processing technology that links keywords (named entities) in a text to entries in a knowledge base such as Wikipedia. This makes it possible to analyze the text by directly utilizing high-quality information from the knowledge base. For example, after extracting the words "John F. Kennedy", it can identify whether they refer to the president or the airport. Also, by calculating the proximity of related words and quantifying the strength of their associations, it becomes possible to link, for instance, the title of a film directly to its performers or director. Using the proximity of related keywords also allows language processing to be performed more intuitively.
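To illustrate the idea, here is a minimal sketch in Python of how a mention such as "John F. Kennedy" might be resolved to one of several knowledge-base entries. It is not Studio Ousia's system: the tiny `KNOWLEDGE_BASE`, its keyword sets, and the `link_entity` function are hypothetical, and simple keyword overlap stands in for the machine-learning model a real engine would use.

```python
import re

# Hypothetical toy knowledge base (an assumption for illustration only):
# each mention maps to candidate entities, and each candidate carries a
# small set of context keywords that tend to appear near it.
KNOWLEDGE_BASE = {
    "john f. kennedy": {
        "John F. Kennedy (president)": {
            "president", "elected", "senator", "assassinated",
        },
        "John F. Kennedy International Airport": {
            "airport", "flight", "landed", "terminal",
        },
    },
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def link_entity(mention, text):
    """Pick the candidate whose keywords overlap most with the text.

    Returns None when the mention is not in the toy knowledge base.
    Ties go to the first candidate listed.
    """
    candidates = KNOWLEDGE_BASE.get(mention.lower())
    if not candidates:
        return None
    context = tokenize(text)
    return max(candidates, key=lambda name: len(candidates[name] & context))


print(link_entity("John F. Kennedy",
                  "Our flight landed at John F. Kennedy after a long delay."))
print(link_entity("John F. Kennedy",
                  "John F. Kennedy was elected president in 1960."))
```

In this sketch the first sentence resolves to the airport and the second to the president, because the surrounding words match different keyword sets. A production system would draw its candidates and context signals from a full knowledge base such as Wikipedia rather than from hand-written keyword lists.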
Compared to conventional language processing methods, this resolves problems caused by the ambiguity of words. Because it enables high-quality language processing with less noise, a variety of tasks, such as document classification and tagging, sentiment analysis, and semantic analysis, can be performed with high precision.
In recent years, this technology has attracted considerable attention worldwide, with workshops on entity linking held by the National Institute of Standards and Technology and Microsoft Research.
The NEEL Challenge has been held every year since 2013 by world-renowned researchers in the field of entity linking, and was won by Microsoft Research in 2014. This year, 21 teams from around the world, including companies and universities, participated. Our company's proposed system achieved an analysis accuracy score of 80.67. Compared with the second-place score of 47.57, this is a winning margin of 33.1 points (*1). The scores express numerically the ability to detect entities in a text.
(*1) For the score table, see the image below.
Our company has developed this technology in order to offer it as a commercial product. The engine is scheduled to be released in the summer of 2015 under the product name "Semantic Kernel".
- WWW2015: http://www.www2015.it/
- NEEL: http://www.scc.lancs.ac.uk/microposts2015/challenge/index.html
- Ikuya Yamada, Hideaki Takeda, Yoshiyasu Takefuji: An End-to-End Entity Linking Approach for Tweets, WWW 2015 Workshop on Making Sense of Microposts (Florence, Italy), 2015