The Web 2.0 era is characterized by the emergence of a very large amount of live content. A real-time, fine-grained content filtering approach can keep users precisely up to date on the information they are interested in. The key to the approach is a scalable matching algorithm. One might treat content matching as a special kind of content search and resort to the classic algorithm [5]....
Processing short texts is becoming a trend in information retrieval. Because short texts rarely carry external information, they are more challenging to process than full documents. In this paper, keyword clustering is studied for automatic categorization. To obtain the semantic similarity of the keywords, a broad-coverage lexical resource, WordNet
XML keyword search is a popular research topic, and the Smallest Lowest Common Ancestor (SLCA) concept is fundamental to XML keyword search algorithms. With the rapid growth of XML data on the internet, we are confronted with big data issues; it is becoming a new research direction for managing massive XML data
The content of a text is mainly defined by the keywords and named entities occurring in it. For news articles in particular, named entities are usually important in defining their semantics. However, named entities have ontological features, namely, their aliases, types, and identifiers, which are hidden from their textual
In document categorization methods that use similarity measures based on word vectors, it is important to determine the keywords that characterize each document. However, conventional methods select keywords based on their frequency and/or a particular importance index such as tf-idf. In this paper, we propose a method to characterize each document using temporal clusters of technical term usages...
retrieval are challenges, considering that the aggregated medical data is so large that it has to be stored on a cloud server. This paper proposes a similarity search tree structure to enhance the hit rate of multi-keyword ranking search. We also propose the dynamic interval clustering algorithm DIK-MEDOIDS under
edges will be built among the blogs that belong to the same result set obtained from a Google blog search on one keyword. The recommendation problem is then translated into the clustering of a hypergraph. Our multilevel clustering algorithm is then used to perform the segmentation. We also set a new optimization index
It is well known that the working condition of a pipeline, leaks included, can be identified by pressure signal analysis. Because of high-frequency data collection and always-online pipeline leak detection, the pressure signal generates massive amounts of data. A methodology for pipeline leak detection using data mining technology and working condition identification is presented here. Sixteen groups of raw...
In many peer-to-peer content sharing systems, peers search for content using information about it, such as keywords. In many peer-to-peer content search architectures, queries are forwarded to peers that belong to clusters related to the keywords. Since clusters are basically constructed regardless of physical
records. At the same time, a nearest-neighbor sort algorithm based on a cluster index is presented for the detection of approximately duplicate records. In this algorithm, we adopt a combination of keywords, the cluster index, a sliding window that changes in real time, field weights, and similarity thresholds, which has increased
retrieval system. Given a document, a keyquery is a set of a few keywords for which the document achieves a high relevance score. Keyqueries can hence be viewed as a general and concise description of the returned retrieval results. The keyquery framework addresses important problems of static classification systems: overlarge