In this paper, we present two text-independent writer identification methods in a closed-world context. Both methods use on-line and off-line features jointly with a classifier inspired by information retrieval methods. Both methods are local: one works at the character level, the other at the grapheme level. This writer identification engine may be used to personalize our cursive word recognition engine~\cite{icfhr2010}...
Information Retrieval in large digital document repositories is both a hard and a crucial task. While the primary type of information available in documents is usually text, images play a very important role because they pictorially describe concepts that are dealt with in the document. Unfortunately, the semantic gap separating such visual content from the underlying meaning is very wide...
A host of tools and techniques are now available for data mining on the Internet. The explosion in social media usage and people reporting brings a new range of problems related to trust and credibility. Traditional media monitoring systems have now reached such sophistication that real time situation monitoring is possible. The challenge though is deciding what reports to believe, how to index them...
In the past few years, there has been an explosive growth in scientific and legal information related to the patent system. Patents and related documents are siloed into multiple heterogeneous sources. Retrieving relevant information from diverse sources is a non-trivial task and poses many technical challenges. Among the challenges is the issue of terminological inconsistencies that are used in the...
The paper deals with the problem of anticipating performance parameters for running SPARQL queries. Canonical correlation analysis (CCA) and its kernel variant (KCCA) identify and quantify the associations between two sets of variables. CCA maximizes the correlation between a linear combination of the variables in one set and a linear combination of the variables in the other set. It measures the strength...
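As a brief illustration of the CCA objective described in this abstract, the first canonical correlation can be written as follows. The symbols are assumptions introduced here, not taken from the paper: $x$ and $y$ denote the two sets of variables, $\Sigma_{xx}$ and $\Sigma_{yy}$ their covariance matrices, $\Sigma_{xy}$ their cross-covariance, and $a$, $b$ the weight vectors of the two linear combinations.

```latex
\rho \;=\; \max_{a,\,b}\ \operatorname{corr}\!\left(a^{\top}x,\; b^{\top}y\right)
\;=\; \max_{a,\,b}\ \frac{a^{\top}\Sigma_{xy}\,b}
{\sqrt{a^{\top}\Sigma_{xx}\,a}\;\sqrt{b^{\top}\Sigma_{yy}\,b}}
```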
The overall goal of an information retrieval process is to retrieve the information relevant to the given request. The information retrieval techniques commonly used are based on keywords. These techniques use keyword lists to describe the content of information, but one problem with such lists is that they do not say anything about the semantic relationships between keywords, nor do they take into...
Users of a digital library often have difficulty formulating a query expression that exactly represents their information requirements. Query suggestion can provide recommended query expressions and help the user build a proper expression. Query suggestion for digital libraries requires higher precision and novelty ratios than query suggestion for web search engines. Based on case studies,...
The rapid expansion of information in the network environment brings many challenges for information retrieval staff. There seems to be a tension between searchers' retrieval skills and the rapid expansion of information. On the one hand, website developers must carefully arrange the retrieval keywords to ensure that users are able to meet their needs...
Web-Ear is an information retrieval system in which anyone can search "What person X said about the topic Y" on the internet or in the database of the system. If the search is made on the internet, the system first submits a query to the Google search engine and retrieves a set of information. In order to isolate speech portions of retrieved data, an extraction process is required to be performed...
In this study, we explain the implementation details of an Automatic Question Answering System for Turkish. Automatic Question Answering (QA) is a sub-field of information retrieval (IR) that deals with finding the answers to submitted free-text questions. Automatic QA is important for daily life because it retrieves the answers or answer sets of submitted questions without wasting time for...
In this paper, we report on the design and implementation of a stemmer for the Farsi language, based on a combination of Kazem Taghva's method and an improved version of Krovetz's method. The first method removes suffixes and prefixes according to the word's structure. The second method is based on storing the information in a database. This paper reports a combination of these methods. The results...
This study aims to develop a knowledge navigation framework based on realistic 3D that provides coaching and advice in real time, makes multiple relations easy to recognize, and supports retrieval based on compound knowledge objects. To this end, it proposes methods for 3D visualization of compound knowledge, intellectual retrieval, and construction of a compound knowledge repository.
Arabic web content is growing rapidly, and the need to manage it efficiently is gaining importance. The morphological complexity of Arabic raises many challenges in this regard. This paper reports on some of our work aimed at designing text mining and query pre-processing tools that are able to efficiently process and search large quantities of Arabic web data. In our research we try to...
The biomedical field publishes a huge volume of articles every year, and most of them are now available online as PDF full texts. There is still no effective search mechanism that can quickly locate the full-text articles that closely match the user's preferences. Such a mechanism can be important for many real-time applications such as information at point of care for a physician who only...
Many new storage systems provide some form of data reduction. We examine data reduction methods that might be suitable for primary storage systems serving active data (as contrasted with backup and archive systems), by analysis of file sets found in different active data environments. We address questions of: how effective are compression and variations of deduplication, both separately and...
This paper presents an application of a Natural Language Tool (NLT) to support the VPRG extraction of text-based vulnerability descriptions. The NLT is used to analyze the text-based vulnerability descriptions to retrieve vulnerability properties and evaluate their relationships. Then, a graph-based VPRG model that describes the vulnerability can be established. Finally, with fine-tuning from domain...
The autocomplete feature is widely used in search interfaces to assist users in their search. Autocomplete helps users by giving a list of options based on the characters entered in the search field. A significant amount of work has been done in this field, but the techniques used are not efficient or relevant when it comes to searching within specific content like a text file, a document or a web page. This calls...
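The behaviour this abstract describes (offering options based on the characters typed so far, restricted to the words of one specific document) can be sketched as below. This is a minimal illustration and not the technique proposed in the paper: it assumes a sorted word list with binary search, and the names `build_index`, `suggest`, and `page` are hypothetical.

```python
import bisect
import re


def build_index(text):
    """Return the sorted list of distinct lowercase words found in the text."""
    return sorted(set(re.findall(r"[a-z0-9]+", text.lower())))


def suggest(index, prefix, limit=5):
    """Return up to `limit` indexed words that start with `prefix`."""
    prefix = prefix.lower()
    if not prefix:
        return []
    # Binary search finds the first word that is >= prefix; all words
    # sharing the prefix sit contiguously from that position onward.
    start = bisect.bisect_left(index, prefix)
    matches = []
    for word in index[start:]:
        if not word.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(word)
    return matches


# Hypothetical page content, used only for illustration.
page = ("Autocomplete helps users by giving a list of options "
        "based on characters entered in the search field.")
index = build_index(page)
print(suggest(index, "se"))
print(suggest(index, "o"))
```

Because the index holds only the words of the given content, every suggestion is guaranteed to occur in that document, which is the distinction the abstract draws from general-purpose search autocomplete.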
This paper proposes a novel face recognition algorithm inspired by the selective attention of the Human Visual System (HVS). We record four observers' eye movements when they are viewing 100 FRGC [1] frontal view face images and find that the observers are highly consistent in the regions fixated. Inspired by the fact that the fovea of the HVS has a much higher spatial acuity than the periphery, a face recognition...
The majority of healthcare workers in hospitals continue to record, access and update important patient information using paper charts. Disparate patient data (clinical information, laboratory results and medical imagery) is entered by different caregivers and stored at different locations around the hospital. This is a cumbersome, time consuming process that can result in critical medical errors...
Pharming attacks - a sophisticated version of phishing attacks - aim to steal users' credentials by redirecting them to a fraudulent website using DNS-based techniques. Pharming attacks can be performed on the client side or within the Internet, using complex and well-designed techniques that often make the attack imperceptible to the user. With the deployment of broadband connections for Internet access,...