In this paper, we propose a method for image selection using Web image search for automatic video biography authoring. In the proposed method, images are selected from the image search results considering their visual contents for inclusion in the video biography. Through evaluation, we confirmed the effectiveness of the proposed image selection method compared to a baseline method which simply selects...
Venue photos, a new type of multimedia content, are exploding on the Internet because users like to take photos and share with their friends which venues they spent time in and what impressed them there. Discovering a venue from a social photo is very useful for supplementing venue retrieval and recommendation. However, little research has focused on fine-grained venue discovery by leveraging multimodal...
The interests of individual Internet users fall into a hierarchical structure, which is useful for building personalized searches and recommendations. Most studies on this subject construct the interest hierarchy of a single person from the document perspective. In this study, we constructed the user interest hierarchy via user profiles. We organized 433,397 user interests, referred to here...
Improving and assessing knowledge on a topic is a need of the current generation. In the traditional system, experts create assessments manually by reading articles and generating questions on a topic. However, reading every article and creating questions for it takes considerable time and effort. Therefore, learners find it difficult to assess their knowledge on a topic owing to no- or low-availability...
Following the trend of big data, the business value of data has become a hot research field in recent years. The novel concept of the Data Jacket introduced by Ohsawa et al. solved the difficult problem of data transactions arising from a particular characteristic of data, i.e., the safeguarding of privacy. To ascertain the mechanism of the data market, some researchers proposed a gamified...
Clinical diagnosis is a critical aspect of patient care that is typically driven by expert medical knowledge and intuition. An automated system for clinical diagnosis could reduce the cognitive burden of clinicians during patient care and medical education. In this paper, we describe a Knowledge Graph (KG)-based clinical diagnosis system that leverages publicly available knowledge sources to infer...
Massive Open Online Courses (MOOCs) have attracted millions of people who are geographically dispersed. MOOCs are mainly authored in English. However, a large proportion of the participants speak English as a foreign language. There are studies reporting that some participants struggle with understanding the language in the video lectures and are reluctant to communicate with fellow learners...
The Web is witnessing an exponential growth of distributed and heterogeneous educational material. This hampers distinguishing among contents of these materials, as well as their retrieval. While information retrieval and classification mechanisms concentrate on corpus analysis, annotation approaches either target specific formats or require that a document follows interoperable standards. Rather...
Twitter is a well-known microblogging website where millions of users exchange their opinions and thoughts. The tweets that users share have an error sum nature. The information available in tweets is insufficient. Because of the character limit, tweets are short, so many applications, such as Information Retrieval, have problems retrieving information. Here we propose a batch processing...
In the software process, unresolved natural language (NL) ambiguities in the early requirements phases may cause problems in later stages of development. Although methods exist to detect domain-independent ambiguities, ambiguities are also influenced by the domain-specific background of the stakeholders involved in the requirements process. In this paper, we aim to estimate the degree of ambiguity...
Acquiring stories and narratives about past periods is a challenge for cultural heritage preservation. In this context, we present a method to obtain from the web a corpus of texts related to the period of 1945-1975 in Luxembourg. Extracted texts are accompanied by metadata that facilitate their integration by tier applications. As a use case, this corpus will be used in software that aims at helping...
We present a comprehension-based framework for measuring semantic similarity between documents of text. In various situations, vector-based similarity measures fail to capture deep semantic relations between terms. Our computational comprehension model processes textual content in a way that resembles human readers, paying attention to context, location, and acquisition time of semantic concepts....
This paper discusses the use of Wikipedia for building semantic ontologies to perform Query Expansion (QE) in order to improve the results returned by search engines. In this technique, selecting related Wikipedia concepts becomes important. We propose the use of network properties (degree, closeness, and PageRank) to build an ontology graph of user query concepts which is derived directly from Wikipedia...
This paper studies cross-lingual semantic similarity (CLSS) between five European languages (i.e. English, French, German, Spanish and Italian) via unsupervised word embeddings from a cross-lingual lexicon. The vocabulary in each language is projected onto a separate high-dimensional vector space, and these vector spaces are then compared using several different distance measures (i.e., correlation,...
The organization of information in the knowledge economy has become a priority business process. Better organization leads to faster retrieval of relevant information. The process of searching and sequencing didactic materials for course building is an articulated and time-consuming process that requires considerable effort by the user. The goal of this research is to implement a platform for supporting...
Natural Language Processing (NLP) finds many uses in different fields of endeavor. Many tools exist for analyzing the English language. For the Polish language the situation is different, as the language itself is more complicated. In this paper we show the differences between NLP for Polish and for English. Existing solutions are presented and the TEAMS software for fact extraction is described. The...
In recent years, the rapid development of geographic information system technology and the popularity of geo-location-based mobile information services have made people pay more attention to geography-related information. Thus, information retrieval and related services based on geographic information have broad application prospects. However, the traditional search engine for the processing of geographic...
Knowledge graph technology belongs to the field of artificial intelligence. It is widely used in semantic search and intelligent question answering. Constructing a Uyghur knowledge graph is of great value for Uyghur information processing and Uyghur application software development. Firstly, this paper describes the definition and structure of the knowledge graph, then it reviews the related research...
Nowadays, cross-media retrieval is a useful technology that helps people find the information they expect in huge amounts of multimodal data more efficiently. A common cross-media retrieval framework first maps features of different modalities into an isomorphic semantic space so that the similarity between heterogeneous data can be measured. For most semantic-space-based methods, the mapping...
The web is today's primary publication medium, making web archiving an important activity for historical and analytical purposes. Web pages are increasingly interactive, resulting in pages that are correspondingly difficult to archive. JavaScript enables interactions that can potentially change the client-side state of a representation. We refer to representations that load embedded resources via...