This paper discusses the use of Wikipedia for building semantic ontologies to perform Query Expansion (QE) in order to improve the search results of search engines. In this technique, selecting related Wikipedia concepts becomes important. We propose the use of network properties (degree, closeness, and PageRank) to build an ontology graph of user query concepts which is derived directly from Wikipedia...
The organization of the information in the knowledge economy has become a priority business process. Better organization leads to faster retrieval of relevant information. The process of searching and sequencing didactic materials for course building is an articulated and time-consuming process that requires considerable effort by the user. The goal of this research is to implement a platform for supporting...
Despite the significant contribution from specialized ontologies and text mining methods, the evaluation of the semantic similarity of genes remains difficult because of the complex functions in which genes are involved. A less exploited resource is Wikipedia, which stores more than 10400 articles about human genes: each gene name identifies the corresponding Wikipedia page summarizing the gene's properties...
Social media are nowadays among the most common communication media. Millions of users use them daily to share information with their community in the network via messages, referred to as posts. The massive volume of information shared is extremely diverse and covers a vast spectrum of topics and interests. Automatically identifying the topics of the posts is of particular interest as this...
In recent years, the appearance of new and revolutionary technologies has changed society in all its aspects (economy, feeding, personal relationships…). Education is no stranger to these changes. Genuine concepts such as collaborative learning, massive courses (MOOC) or flipped classrooms have been incorporated at every educational level, although the maturity of students allows a greater capacity...
This paper introduces an automatic categorical-marking model for text categorization. Traditional classification algorithms generally rely on a labeled training set and call for a lot of manual work to tag classes beforehand. Also, due to the ambiguity and fuzziness of texts, the results of traditional text categorization algorithms may be neither clear enough nor rich in content. This paper...
As more and more learners are opting for online learning, the e-learning industry is working on improving the learning experience of online users by providing relevant content and a lot of additional references. Since online learners mostly prefer video tutorials, identifying the major topics and subtopics covered in a video tutorial is a big challenge. Recently, for efficient knowledge sharing and interoperability over...
The domain of the traditional web is gradually evolving with the adoption of newer techniques, including the semantic web. Integrating web content using ontologies in a language-independent manner is a required feature of this process. For better utilization of resources, the ontology, which works as a central knowledge repository, must be language independent as well...
Understanding changes in the mood and mental health of large populations is a challenge, with the need for large numbers of samples to uncover any regular patterns within the data. The use of data generated by online activities of healthy individuals offers the opportunity to perform such observations on the large scales and for the long periods that are required. Various studies have previously examined...
Enormous efforts by human volunteers have made Wikipedia a treasure trove of textual knowledge. Relation extraction, which aims at extracting structured knowledge from the unstructured text in Wikipedia, is an appealing but quite challenging problem because it is hard for machines to understand plain text. Existing methods are not effective enough because they understand relation types at the textual level...
This paper addresses the task of assigning multiple labels of fine-grained named entity (NE) types to Wikipedia articles. To address the sparseness of the input feature space, which is salient particularly in fine-grained type classification, we propose to learn article vectors (i.e. entity embeddings) from hypertext structure of Wikipedia using a Skip-gram model and incorporate them into the input...
As Wikipedia became the largest human knowledge repository, quality measurement of its articles received a lot of attention during the last decade. Most research efforts focused on classifying Wikipedia article quality using different feature sets. However, so far, no “golden feature set” has been proposed. In this paper, we present a novel approach for classifying Wikipedia articles by analysing...
Wikipedia is the result of a collaborative effort aiming to represent human knowledge and to make it accessible for everyone. As such it contains lots of contemporary as well as history-related information. This research looks into historical data available in Wikipedia to explore its various time-related characteristics. In particular, we study Wikipedia articles on historical persons. Our analysis...
Collaborative systems such as Wikipedia have taken an important step toward creating content and organizing knowledge. Because they allow everyone to be involved in creating content, such systems face attacks by vandals and other challenges. Therefore, in order to use this encyclopedia, it is important to be able to trust its content and measure its quality. A user's reputation is an important factor for trusting electronic...
Semantic web technology can influence the next generation of eLearning systems and applications. Ontology as a major component of semantic web can be used in creating metadata for eLearning resources to improve adaptive eLearning systems. This paper presents an approach to automatically enrich eLearning domain ontology based on the integration of graph clustering techniques and external knowledge...
Wikipedia is an online encyclopedia which contains millions of articles related to different subject domains. Wikipedia also has a search page itself to display the links corresponding to Wikipedia articles for a given user query input. This search result page displays the search results according to the relevance order, without any content based grouping. This paper presents an experimental deduction...
This paper introduces the problem of topical sequence profiling. Given a sequence of text collections such as the annual proceedings of a conference, the topical sequence profile is the most diverse explicit topic embedding for that text collection sequence that is both representative and minimal. Topic embeddings represent a text collection sequence as numerical topic vectors by storing the relevance...
Information Extraction is an important task in Natural Language Processing research. Named Entity Recognition (NER) is one of the basic tasks of information extraction, and its performance has a great impact on subsequent tasks such as Relation Extraction. A major difficulty of NER lies in unknown word identification. To address this issue, a method exploiting external information from Wikipedia was studied...
A Question Answering (QA) system addresses the task in which an arbitrary question is posed in natural language and a brief, concise text is returned as the answer. In contrast to search engines, which return a long list of relevant documents in response to a query, a QA system aims to provide the direct answer or a passage containing the answer. We propose a general purpose question answering...
This paper proposes an approach to finding answers within a single text for a given question by extracting a network of categories from Wikipedia as background knowledge to support matching between question and answer. Experiments show that the approach is effective for keyword-based QA.