A fast algorithm for the generalized k-keyword proximity problem given keyword offsets

Sung-Ryul Kim; Inbok Lee; Kunsoo Park

doi:10.1016/j.ipl.2004.03.017

Source

Information Processing Letters, 2004, vol. 91, no. 3, pp. 115-120

Abstract

When searching for information on the Web, it is often necessary to use one of the available search engines. Because the number of results is quite large for most queries, we need some measure of relevance with respect to the query. One of the most important relevance factors is the proximity score, i.e., how close together the keywords appear in a given document. A basic proximity score is given by the size of the smallest range containing all the keywords in the query. We generalize the proximity score to include many practically important cases and present an O(n log k)-time algorithm for the generalized problem, where k is the number of keywords and n is the number of occurrences of the keywords in a document.
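
To make the basic proximity score concrete, the following is a minimal Python sketch of the unweighted case only: finding the smallest range that contains at least one occurrence of every keyword, given one sorted offset list per keyword. It is not the authors' generalized algorithm; it uses the standard heap-based merge, which happens to run in O(n log k) time as well. The function name `smallest_range` and the input layout (a list of k sorted lists) are assumptions made for illustration.

```python
import heapq


def smallest_range(offsets):
    """Return (start, end) of the smallest range covering every keyword.

    `offsets` is a hypothetical input layout: a list of k lists, where
    offsets[i] holds the sorted positions of keyword i in the document.
    Returns None if there are no keywords or some keyword never occurs.
    """
    if not offsets or any(not lst for lst in offsets):
        return None

    # Heap entries: (position, keyword index, index into that keyword's list).
    heap = [(lst[0], i, 0) for i, lst in enumerate(offsets)]
    heapq.heapify(heap)
    current_max = max(lst[0] for lst in offsets)
    best = (heap[0][0], current_max)

    while True:
        # The heap minimum and current_max bound a range with all k keywords.
        pos, i, j = heapq.heappop(heap)
        if current_max - pos < best[1] - best[0]:
            best = (pos, current_max)
        # Once a keyword's list is exhausted, no later range can cover it.
        if j + 1 == len(offsets[i]):
            return best
        nxt = offsets[i][j + 1]
        current_max = max(current_max, nxt)
        heapq.heappush(heap, (nxt, i, j + 1))


if __name__ == "__main__":
    print(smallest_range([[1, 10, 20], [5, 12], [11, 30]]))
```

Each of the n occurrences is pushed onto and popped from a heap of size k at most once, which gives the O(n log k) bound for this basic case. The generalized scores described in the abstract would need additional machinery that this sketch does not attempt.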