Search results

chapter

Is Wikipedia a Latent Gene Ontology?

Nicoletta Dessi, Maurizio Atzori

2017 IEEE 26th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE) > 164 - 169

Despite the significant contribution from specialized ontologies and text mining methods, the evaluation of the semantic similarity of genes remains difficult because of the complex functions in which genes are involved. A less exploited resource is Wikipedia, which stores more than 10,400 articles about human genes: each gene name identifies the corresponding Wikipedia page summarizing the gene's properties...

chapter

Enhancing Wikipedia search results using Text Mining

K.D.C.G. Kapugama, S.A.S. Lorensuhewa, M.A.L. Kalyani

2016 Sixteenth International Conference on Advances in ICT for Emerging Regions (ICTer) > 168 - 175

Wikipedia is an online encyclopedia that contains millions of articles across different subject domains. Wikipedia also has its own search page, which displays links to the Wikipedia articles matching a given user query. This results page lists the search results in order of relevance, without any content-based grouping. This paper presents an experimental deduction...

chapter

An Approach of Vector Space Model to Link Concrete Concepts with Wiki Entities

Lucas Borges Monteiro, Li Weigang, Ahmed Abdelfattah Saleh

2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing > 313 - 320

2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing (CIT/IUCC/DASC/PICOM)

Entity Linking (EL) search and labeling are important research topics with various web applications. The challenge is to find the important concepts in web text, rather than only simple personal and place names, and link them to online encyclopedia databases. This paper presents a new approach to link concrete concepts from English texts with Wiki entities. Using part-of-speech tagging to detect concrete concepts,...

chapter

Arabic Text Mining a Systematic Review of the Published Literature 2002-2014

Hind Al-Mahmoud, Muna Al-Razgan

2015 International Conference on Cloud Computing (ICCC) > 1 - 7

Text mining is a set of techniques that analyze large masses of data, extract relations that are unknown beforehand, and provide solutions to help decision-making. Text mining has been used extensively to analyze English text. However, it has only recently been used to analyze Arabic text. Accordingly, the objective of this paper is to present the current state of Arabic text mining. A...

chapter

Modeling Both Coarse-Grained and Fine-Grained Topics in Massive Text Data

Weifan Zhang, Hui Zhang, Yuan Zuo, Deqing Wang

2015 IEEE First International Conference on Big Data Computing Service and Applications > 378 - 383

2015 IEEE First International Conference on Big Data Computing Service and Applications (BigDataService)

Topic modeling has attracted much attention from researchers, as it provides users with insights into huge volumes of documents. However, most previous related studies based on Non-negative Matrix Factorization (NMF) neglect to determine which topics are widespread in the documents and which are not. These widespread topics, which we refer to as coarse-grained topics, have great significance...

chapter

Cloud service for assessment of news' Popularity in internet based on Google and Wikipedia indicators

Asad Ullah Rafiq Khan, Mohammad Badruddin Khan, Khalid Mahmood

2015 5th National Symposium on Information Technology: Towards New Smart World (NSITNSW) > 1 - 8

The time-sensitive nature of news articles implies that the change in the volume of internet searches for a particular item, following the appearance of news, will last for a few days before the normal search pattern resumes. This paper presents a cloud service and describes how the popularity of mass media news can be assessed using users' online usage behavior. We used data from...

chapter

Cluster Labeling for the Blogosphere

Patrick Hennig, Philipp Berger, Claus Steuer, Christia Wuerz, et al.

2014 IEEE Fourth International Conference on Big Data and Cloud Computing > 416 - 423

2014 IEEE International Conference on Big Data and Cloud Computing (BdCloud)

Hierarchical Cluster Labeling helps users to quickly understand and analyze hierarchical clusters. It may be used to enhance search engine results or interactive browsing, as is done in the Blog Intelligence application. The hierarchical organization of data helps to represent different levels of detail. Hierarchical clustering may be quite common, but there are few good solutions for...

chapter

Text mining wikipedia to discover alternative destinations

Kenneth Cosh

The 2013 10th International Joint Conference on Computer Science and Software Engineering (JCSSE) > 43 - 48

This paper discusses an application of some statistical Natural Language Processing algorithms to a set of articles from Wikipedia about top tourist destinations. The objective is to automatically identify the key features of each destination and then discover other destinations which share similar sets of features. In this way, a method is demonstrated by which metadata about each article can be...

chapter

In-Depth Analysis of Anaphora Resolution Requirements

Helene Schmolz, David Coquil, Mario Doller

2012 23rd International Workshop on Database and Expert Systems Applications > 174 - 179

2012 23rd International Workshop on Database and Expert Systems Applications (DEXA)

This paper aims to lay the foundations of an anaphora resolution framework able to process all types of hypertexts and treat all types of anaphors for the English language. To this end, we provide a linguistically unambiguous and extensive definition and categorization of the concept of anaphora. We introduce a new corpus, and use our proposed categorization to statistically analyze it. Finally, we...

chapter

Generation of descriptive elements for text

Mutsugu Kuboki, Kazuhide Yamamoto

2011 7th International Conference on Natural Language Processing and Knowledge Engineering > 56 - 59

2011 7th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE)

Finding pages on the Web that are similar to a query page is an important component of modern search engines. In particular, a method for recognizing the content of Web pages plays an important role in a search engine. However, the fact that a Web page includes the query words does not necessarily mean that the page describes the query. The main challenge here is identifying the factors that affect the relationship between the query and...

chapter

A novel approach to sentence alignment from comparable corpora

Min-Hsiang Li, Vitaly Klyuev, Shih-Hung Wu

Proceedings of the 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems > 2 > 618 - 623

2011 IEEE 6th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS)

This paper introduces a new technique to select candidate sentences for alignment from bilingual comparable corpora. Tests were done utilizing Wikipedia as a source for bilingual data. Our test languages are English and Chinese. A high quality of sentence alignment is illustrated by a machine translation application.

chapter

Towards a Universal Text Classifier: Transfer Learning Using Encyclopedic Knowledge

Pu Wang, C. Domeniconi

2009 IEEE International Conference on Data Mining Workshops > 435 - 440

2009 IEEE International Conference on Data Mining Workshops (ICDMW 2009)

Document classification is a key task for many text mining applications. However, traditional text classification requires labeled data to construct reliable and accurate classifiers. Unfortunately, labeled data are seldom available. In this work, we propose a universal text classifier, which does not require any labeled document. Our approach simulates the capability of people to classify documents...

chapter

Using Wikipedia as a Reference for Extracting Semantic Information from a Text

A. Prato, M. Ronchetti

2009 Third International Conference on Advances in Semantic Processing > 56 - 61

2009 Third International Conference on Advances in Semantic Processing (SEMAPRO 2009)

In this paper we present an algorithm that, using Wikipedia as a reference, extracts semantic information from an arbitrary text. Our algorithm refines a procedure proposed by others, which mines all the text contained in the whole Wikipedia. Our refinement, based on a clustering approach, exploits the semantic information contained in certain types of Wikipedia hyperlinks, and also introduces an...

chapter

Leveraging Web 2.0 Sources for Web Content Classification

S. Banerjee, M. Scholz

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 1 > 300 - 306

This paper addresses practical aspects of Web page classification not captured by the classical text mining framework. Classifiers are supposed to perform well on a broad variety of pages. We argue that constructing training corpora is a bottleneck for building such classifiers, and that care has to be taken if the goal is to generalize to previously unseen kinds of pages on the Web. We study techniques...

INFONA - science communication portal
