NeedleSeek is a project for web semantic mining or ontology mining. It aims at automatically extracting and aggregating semantic knowledge from tera-scale data, with the goal of boosting web search and various NLP applications.
In this online research prototype, a web interface is provided for end users to search and browse the semantic knowledge-base we built.
NeedleSeek V1.0 focuses on semantic class construction and search. It receives a term or a phrase as a query and returns the semantic class(es) the term or phrase belongs to. Our system is able to distinguish different meanings of the same word or phrase and return multiple semantic classes for one term. For example, for the query "apple", our system returns at least two semantic classes: fruits (pear, orange, etc.) and companies (Microsoft, Sun, etc.).
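As a rough illustration of the query behavior described above, the following Python sketch maps a term to every semantic class that contains it. It is not NeedleSeek's implementation: the class data and the names `SEMANTIC_CLASSES`, `build_term_index`, and `lookup` are hypothetical and invented for this example, and the real system mines its classes from web-scale data rather than from a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical toy data: each semantic class is a label plus its member terms.
# These entries are illustrative only and are not taken from NeedleSeek's knowledge base.
SEMANTIC_CLASSES = {
    "fruits": ["apple", "pear", "orange"],
    "companies": ["apple", "microsoft", "sun"],
}


def build_term_index(classes):
    """Invert {class label: members} into {term: [class labels]}."""
    index = defaultdict(list)
    for label, members in classes.items():
        for term in members:
            index[term.lower()].append(label)
    return index


def lookup(classes, index, query):
    """Return {class label: members} for every class the query term belongs to."""
    labels = index.get(query.strip().lower(), [])
    return {label: classes[label] for label in labels}


TERM_INDEX = build_term_index(SEMANTIC_CLASSES)

if __name__ == "__main__":
    # An ambiguous term such as "apple" belongs to more than one class.
    for label, members in lookup(SEMANTIC_CLASSES, TERM_INDEX, "apple").items():
        print(label, members)
```

The inverted index is the design choice that gives multi-membership: a term is stored once per class it appears in, so an ambiguous query returns several classes while an unknown term returns an empty result.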
In the current version (V2.0), we support the following semantic relations:
1. Corpus-based Semantic Class Mining: Distributional vs. Pattern-Based Approaches
By Shuming Shi, Huibin Zhang, Xiaojie Yuan, and Ji-Rong Wen
To appear in COLING'10
2. Employing Topic Models for Pattern-based Semantic Class Discovery
By Huibin Zhang, Mingjie Zhu, Shuming Shi, and Ji-Rong Wen
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL'09), Singapore, August 2009.
3. Pattern-based Semantic Class Discovery with Multi-Membership Support
By Shuming Shi, Xiaokang Liu, and Ji-Rong Wen
In ACM 17th Conference on Information and Knowledge Management (CIKM'08), Napa Valley, California, USA, 2008 (Poster).