Education and Research Organization for Genome Information Science (ゲノム情報科学研究教育機構)  Abstract
Date June 30, 2008
Speaker Dr. Vo Ngoc Anh, The University of Melbourne, Australia
Title Impact-Based Document Ranking
Abstract Given a large collection of text documents and a natural language query q, the principal task of document ranking is to identify the documents that answer the query by ordering them by decreasing (likelihood of) relevance to q. Ranking mechanisms are used in many practical systems, such as Web search and search over large repositories of scientific articles.

A number of document ranking models have been developed. Amongst them, the vector space model offers an efficient and effective way to perform the task, although in the last few years other mechanisms such as BM25 and language modelling appear to achieve better retrieval effectiveness.

In this talk we describe impact-based retrieval, an approach to document ranking that combines a simple document-centric view of text with fast evaluation strategies developed in connection with the vector space model. The new method defines the importance of a term within a document qualitatively rather than quantitatively, and in doing so eliminates the need for tuning parameters. In addition, the method supports very fast query processing: most of the computation is carried out on small integers, and dynamic pruning is an effective option. Experiments on a wide range of TREC data show that the new method is highly competitive in terms of both retrieval effectiveness and efficiency.
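The abstract gives no implementation details, so the following Python fragment is only an illustrative sketch of the general idea, not the speaker's actual method. It assumes that a term's importance in a document is derived from its frequency rank within that document and mapped to a small integer "impact" (here 1 to 8, in equal-width buckets), that a document's score is the sum of the impacts of the query terms it contains, and that pruning simply stops after a fixed budget of highest-impact postings. The names `K`, `document_impacts`, `build_index`, and `impact_rank` are hypothetical.

```python
from collections import Counter, defaultdict

K = 8  # number of integer impact levels (assumed value, for illustration only)


def document_impacts(tokens, k=K):
    """Map each distinct term of one document to an integer impact in 1..k.

    Terms are ranked by within-document frequency, and only the rank (not the
    raw count) decides the impact. Equal-width buckets are an assumption.
    """
    counts = Counter(tokens)
    # Most frequent first; ties broken alphabetically so results are deterministic.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    n = len(ranked)
    return {term: k - (rank * k) // n for rank, term in enumerate(ranked)}


def build_index(docs):
    """Build an inverted index: term -> list of (impact, doc_id), highest impact first."""
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for term, impact in document_impacts(tokens).items():
            index[term].append((impact, doc_id))
    for postings in index.values():
        postings.sort(key=lambda p: (-p[0], p[1]))
    return index


def impact_rank(index, query_terms, max_postings=None):
    """Score documents by summing integer impacts over the query terms.

    If max_postings is given, only that many highest-impact postings are
    processed, a crude stand-in for dynamic pruning.
    """
    candidates = [p for t in set(query_terms) for p in index.get(t, [])]
    candidates.sort(key=lambda p: (-p[0], p[1]))
    if max_postings is not None:
        candidates = candidates[:max_postings]
    accumulators = defaultdict(int)  # integer-only arithmetic
    for impact, doc_id in candidates:
        accumulators[doc_id] += impact
    return sorted(accumulators.items(), key=lambda x: (-x[1], x[0]))


docs = {
    "d1": "impact impact ranking".split(),
    "d2": "ranking ranking ranking query".split(),
    "d3": "vector space model".split(),
}
index = build_index(docs)
print(impact_rank(index, ["ranking", "impact"]))
```

In this sketch all scoring is integer addition, and because postings are processed in decreasing impact order, cutting processing short drops only the lowest contributions.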