Today, user-generated content and opinions shared online are gaining relevance as a source of information not only for other consumers but also for retailers. However, the huge number of posted opinions makes any manual analysis difficult. This paper proposes a new approach for gender discourse analysis based on the semantic analysis of the content of shared reviews in electronic word of mouth communities...
Personalized recommendation technology enables users to obtain academic resources quickly and accurately. However, the existing recommendation methods based on users' historical behavior and paper contents are limited in terms of expanding user perspectives. The existing methods that evaluate the authority and quality of academic resources based on academic network ignore paper...
For more than a decade, researchers have been proposing various methods and techniques to mine social tagging data and to learn structured knowledge. It is essential to conduct a comprehensive survey of the related work, which would benefit the research community by providing a better understanding of the state of the art and insights into future research directions. The paper first defines the...
This paper presents the design, implementation and evaluation of a web-based application called MergeXML (MXML). MXML was developed to integrate XML documents that are similar in structure and content into complete information that can be used for information retrieval. XML documents are clustered into subtrees, represented as instances, using leaf-node parents as clustering points. The system...
Smart Identifier NETwork (SINET), a novel proposal for a future Internet architecture, is expected to make the current Internet flexible. SINET is characterized by completely resolving the triple-binding issues of the current Internet, namely resource-location binding, user-network binding and control-data binding. Owing to a thorough and rigorous redesign of the Internet, SINET is equipped with many...
Cross-language information retrieval research has favored system-centered approaches in the past. The user is not an integral part of the translation and retrieval processes. In this paper, we investigate the problem of personalized cross-language information retrieval by exploiting query expansion techniques. The original query is augmented with terms mined from the user's historical usage information...
A knowledge base is a machine-readable set of knowledge. More and more multi-domain and large-scale knowledge bases have emerged in recent years, and they play an essential role in many information systems and semantic annotation tasks. However, we do not have a perfect knowledge base yet, and may never have one, because all knowledge bases have limited coverage while new knowledge...
Similar-case recommendation is increasingly popular in Internet inquiry. Many cases have already been resolved well, and recommending them for similar inquiries not only saves patients' waiting time but also provides better references. However, the inquiry platform cannot understand diversity of description, i.e. the same meaning expressed in different ways. This...
Existing work in the semantic relatedness literature has already considered various information sources such as WordNet, Wikipedia and Web search engines to identify the semantic relatedness between two words. We will show that existing semantic relatedness measures might not be directly applicable to microblogging content such as tweets due to i) the informality and short length of microblogging...
Multimedia content is increasingly available in multiple modalities. Each modality provides a different representation of the same entity. This paper studies the problem of joint representation of the text and image components of multimedia documents. However, most existing algorithms focus more on inter-modal connection rather than intramodal feature extraction. In this paper, a simple yet effective...
Named Entity Disambiguation (NED) aims at disambiguating named entity mentions in a text to their corresponding entries in a knowledge base such as Wikipedia. It is a fundamental task in Natural Language Processing (NLP) and has many applications such as information extraction, information retrieval, and knowledge acquisition. In the past decade, a number of methods have been proposed for the NED task...
By using a market of data, we can make decisions and solve problems for new businesses based on real-world data. When someone needs to find suitable data for ideas that are to be realized, keywords derived from those ideas can be used as queries to search the market of data. However, sometimes the keywords are not included in the descriptions of data in the database, even...
Leveraging data on social media, such as Twitter and Facebook, requires information retrieval algorithms to be able to relate very short text fragments to each other. Traditional text similarity methods based on word overlap, such as tf-idf cosine similarity, mostly fail to produce good results in this case, since word overlap is small or non-existent. Recently, distributed word representations,...
This article summarizes the developments of Sightspot Ltd, which were carried out at the University of Debrecen, and the possibilities arising from the introduction of the system. Our aim is to present the visions that were formed during the implementation of the system. The system's introduction opened new doors that earlier seemed impassable; now we feel that...
Hindi is the fourth most widely spoken language in the world. In India, using the Internet in Hindi is becoming popular. However, Hindi has several ambiguous words that affect the sense of a Hindi sentence. The word "ambiguous" refers to "having more than one meaning or sense". The technique of examining the correct meaning of a word as specified in a given...
We describe a business workflow case study with abnormal behavior management (i.e. recovery) and demonstrate how temporal logics and model checking can provide a methodology to iteratively revise the design and obtain a correct-by-construction system. To do so, we define a formal semantics by giving a compilation of generic workflow patterns into LTL, and we use the bounded model checker Zot to prove...
Today we can easily and quickly obtain huge amounts of complex information from the Internet. However, it is hard to find the appropriate information in it, since we have to go through it all and quickly grasp what it is about. Automatic summarization is therefore indispensable. In this work, assuming a concept hierarchy, we extract suitable labels for documents by abstracting and ranking characteristic words.
Today's educational systems require students to recall and apply major concepts from study material to perform competently in assessments. Crucial to achieving this is practice and self-assessment through questions. Crafting such questions can be time-consuming for teachers, while questions from external sources, e.g. assessment books, might not be tailored to suit students' study materials...
We present an approach to creating an annotated dictionary of Internet slang to help identify the level of specific attitudes and moods (specifically aggressiveness, distress, hatefulness and offensiveness) in social network and social media posts. The annotation refers to the attitudes, with the actual meaning being of low importance. The annotated dictionary is intended for automatic use in the detection...
Previous word sense disambiguation methods have several shortcomings: they generally do not consider the influence of word distance when computing semantic correlation with the context, the context available for disambiguating an ambiguous word is limited, and the use of context words that are themselves ambiguous makes word senses more ambiguous. Therefore, this paper proposes the use of dependency parse tree...