Biomedical entity extraction from unstructured web documents is an important step in discovering knowledge in the veterinary medicine domain. In general, this task can be approached by applying domain-specific ontologies, but a review of the literature shows that there is no universal dictionary or ontology for this domain. To address this issue, we manually construct...
Summary form only given. Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This talk focuses on...
Research on opinion detection has shown that a large amount of opinion-labeled data is necessary for capturing subtle opinions. However, opinion-labeled data, especially at the sub-document level, is often limited. This paper describes the application of Semi-Supervised Learning (SSL) to automatically produce more labeled data and explores the potential of SSL to improve transfer of labeled data...
A user's cognitive style has been found to affect how they search for information, how they analyze the information, and how they make decisions in an analytical process. In this paper, we propose an approach that uses Hidden Markov Models (HMM) to dynamically capture a user's cognitive style by automatically exploring the sequence of actions and relevant information with respect to the content of...
In recent years, there has been an explosion of publicly available RDF and OWL web pages. Typically, these pages are small, heterogeneous and prone to frequent change. To integrate them effectively, we propose to adapt a query reformulation algorithm and combine it with an information-retrieval-inspired index in order to select all sources relevant to a query. We treat each RDF document...
The Chem2Bio2RDF portal is a Linked Open Data (LOD) portal for systems chemical biology that aims to facilitate drug discovery. It converts around 25 different datasets on genes, compounds, drugs, pathways, side effects, diseases, and MEDLINE/PubMed documents into RDF triples and links them to other LOD bubbles, such as Bio2RDF, LODD and DBPedia. The portal is based on the D2R server and provides a SPARQL...
Our previous research developed AbraQ, a query expansion algorithm that automatically adds a term to a search query to improve the search results. AbraQ differs from other relevance feedback approaches in that it works independently of the quality of the original search result, which means it works well for hard search tasks when no relevant documents are retrieved...
Documents are called near-duplicates if they are almost identical and differ only slightly. Since near-duplicates are a major concern for Web search engines, it is necessary to identify and filter them effectively. Among existing near-duplicate identification methods, MinHashing is the best known. It identifies near-duplicates regardless of the locations of the differing parts in two...
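The MinHashing idea mentioned in this abstract can be illustrated with a minimal sketch. This is a generic textbook-style MinHash over word shingles, not the method of the paper itself. The function names, the shingle size `k=3`, the choice of `blake2b` as the hash family, and the 128-hash signature length are all assumptions made for illustration.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (k is an illustrative choice)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def seeded_hash(seed, item):
    """Deterministic 64-bit hash of an item, parameterized by an integer seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash value over the set."""
    return [min(seeded_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity of two sets, for comparison with the estimate."""
    return len(a & b) / len(a | b)


doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
doc_b = "the quick brown fox jumps over the lazy dog near the river shore"
doc_c = "stock markets closed higher today after a volatile trading session"

set_a, set_b, set_c = shingles(doc_a), shingles(doc_b), shingles(doc_c)
sig_a, sig_b, sig_c = (minhash_signature(s) for s in (set_a, set_b, set_c))

print("exact a/b:    ", exact_jaccard(set_a, set_b))
print("estimated a/b:", estimate_jaccard(sig_a, sig_b))
print("estimated a/c:", estimate_jaccard(sig_a, sig_c))
```

Because each signature is computed from a set of shingles, the estimate does not depend on where in the documents the differing shingles occur. In practice, signatures are usually combined with a banding scheme such as locality-sensitive hashing to avoid comparing every pair of documents.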
This paper addresses large-scale information retrieval, with the aim of contributing to web search. The collections of documents considered are huge and difficult to handle with classical approaches. The greater the number of documents in the collection, the more powerful the approach required. A Bees Swarm Optimization algorithm called BSO-IR is designed to explore the prohibitive number of...