Automatic multi-document summarization may help news readers retrieve information from digital news media efficiently. The summarizer creates a concise summary containing important information from a collection of articles, enabling readers to read only one text to gain information from multiple text sources. Building on previous research, we propose an automatic summarization system using sentence...
People generate online content on social networks every day, at every hour. Social networks are a medium in which people can give their opinions on different topics and obtain new information. The content people create can help researchers understand human behavior in cities such as Quito. In this work, we describe how the city of Quito is portrayed on the travel network TripAdvisor...
Component-based software engineering is an evolving branch of software engineering. The evolution of data mining and information retrieval techniques forms the basis of the approaches to component retrieval. This has paved the way for new techniques for efficient storage, retrieval and management of component repositories and storage systems. Such information retrieval caters to the needs...
Activity-listing services like TripAdvisor, Foursquare and Facebook events make use of community opinions/reviews to help users identify ‘points of interest’ from a large search space, and typically rely on collaborative filtering algorithms. These algorithms analyze reviews from a group of people to find correlations between users and/or items in order to suggest top items to a querying user. The...
Over the years, the volume of information available through the World Wide Web has been increasing continuously, and never has so much information been readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over the network have made it difficult for users to sift through and find relevant information. The information retrievals...
Clustering of Web documents has become a vital task, due to the tremendous amount of information that is available on the web today. Finding suitable information in less time has become a big challenge in information retrieval. It is therefore necessary to adopt a method that can be used to organize the information well. This is possible only when good document groups are formed, which in...
The Internet is a major source of online news content. Current efforts to evaluate online news content, including text, story line and sources, are limited by the use of small-scale manual techniques that are time consuming and dependent on human judgments. This article explores the use of machine learning algorithms and mathematical techniques for Internet-scale data mining and semantic discovery of...
We consider the problem of accurately and efficiently querying a remote server to retrieve information about images captured by a mobile device. In addition to reduced transmission overhead and computational complexity, the retrieval protocol should be robust to variations in the image acquisition process, such as translation, rotation, scaling, and sensor-related differences. We propose to extract...
Listwise learning to rank (LTR) is aimed at constructing a ranking model from listwise training data to order objects. In most existing studies, each training instance consists of a set of objects described by preference features. In a preference feature space for the objects in training, the structure of the objects is associated with the absolute preference degrees for the objects. The degrees significantly...
Sentence clustering is often used as the first step in various information retrieval tasks such as automatic text summarization and topic detection and tracking. Researchers find it difficult to cluster sentences because a single sentence is less informative than a document. We present Feature Based Sentence Clustering (FBSC), which incorporates some sentence-level relationship features...
Document clustering is the application of cluster analysis to textual documents. It is a commonly used technique in data mining, information retrieval, knowledge discovery from data, pattern recognition, etc. In traditional document clustering, a document is considered as a bag of words, where the semantic meaning of words is not taken into consideration. However, to achieve accurate document clustering,...
In this work, we have created a system for calculating semantic similarity between text documents, to contribute to their semantic clustering. Semantic clustering of documents is a promising field of research, since it enables quick and targeted access to information. The aim of document clustering is to put together similar documents. We used the algebraic model VSM (Vector Space Model) [2]...
This paper presents the design, implementation and evaluation of a web-based application called MergeXML (MXML). MXML was developed to integrate XML documents that are similar in structure and content, producing more complete information that can be used for information retrieval. XML documents are clustered into subtrees represented as instances, using leaf-node parents as clustering points. The system...
Various dynasties ruled the Indian sub-continent and left behind an enormous and rich cultural heritage that also included intellectually enriched research in the shape of various documents written in Urdu. In order to provide efficient access to this knowledge, analysis through digitizing the existing work is the need of the hour. In addition to digitization, efficient search mechanisms also need to be...
A patent is an intellectual property document that protects new inventions. It covers how things work, what they do, how they do it, what they are made of and how they are made. The owner of a granted patent can take legal action to stop others from making, using, importing or selling the invention without permission. While applying for a patent, the inventor has issues...
The Internet is a large interconnection of smaller networks, commonly known as the World Wide Web. The amount of information available on the Internet in digital form is huge and growing at an exponential rate, following Moore's law. This makes it difficult to find exact search results that match user preferences. In this paper, we propose a method for personalized web search. Personalized web search...
We present VisIRR, an interactive visual information retrieval and recommendation system for large-scale document data. Starting with a query, VisIRR visualizes the retrieved documents in a scatter plot along with their topic summary. Next, based on interactive personalized preference feedback on the documents, VisIRR collects and visualizes potentially relevant documents out of the entire corpus...
Word spotting in graphical documents is a very challenging task. With the increased usage of electronic media, there is a need to search for objects in graphical documents by labeled text. To address such scenarios we propose a word spotting system dedicated to graphical documents with Bangla and English scripts. In our proposed system, first text-graphics layers are separated using Gabor filter...
Contemporary search engines and other automated web tools are faced with the task of extracting relevant information from huge web archives. This is considered a difficult task due to the semi-structured and unstructured nature of web documents. Users need automated ways of organizing and cataloging web documents so that they can be queried efficiently. Clustering is typically employed...
The purpose of the present work is to create an intelligent system to retrieve desired documents in the Marathi language. The system also focuses on providing personalized documents in Marathi to the end user, based on interests identified from their browsing history. This paper presents the automatic categorization of Marathi documents and a literature survey of the related work done...