In recent years, the rapid development of geographic information system technology and the popularity of geo-location-based mobile information services have led people to pay more attention to geography-related information. Thus, information retrieval and related services based on geographic information have broad application prospects. However, the traditional search engine's processing of geographic...
Knowledge graph technology belongs to the field of artificial intelligence. It is widely used in semantic search and intelligent question answering. Constructing a Uyghur knowledge graph is of great value for Uyghur information processing and Uyghur application software development. Firstly, this paper describes the definition and structure of the knowledge graph, then it reviews the related research...
Web archives preserve an unprecedented abundance of materials regarding major events and transformations in our society. In this paper, we present an approach for building event-centric sub-collections from such large archives, which includes not only the core documents related to the event itself but, even more importantly, documents describing related aspects (e.g., premises and consequences). This...
In this paper, we present a system for automatic MCQ (Multiple Choice Question) generation for any given input text, along with a set of distractors. The system is trained on a Wikipedia-based dataset consisting of URLs of Wikipedia articles. The important words (keywords), which consist of both bigrams and unigrams, are extracted and stored in a dictionary along with many other components of the...
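The keyword-extraction step this abstract describes can be sketched as follows. This is a minimal frequency-based stand-in, not the paper's trained system; the function name, the stopword list, and the scoring are all illustrative assumptions.

```python
from collections import Counter
import re

def extract_keywords(text, stopwords=frozenset({"the", "a", "an", "of", "in", "and", "is"}),
                     top_n=10):
    """Collect candidate unigram and bigram keywords by frequency.

    A toy stand-in for the keyword-extraction step; the actual system
    is trained on a Wikipedia-based dataset and uses richer features.
    """
    # Tokenize to lowercase alphabetic words, dropping stopwords.
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in stopwords]
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Store both unigram and bigram candidates in one dictionary,
    # mirroring the abstract's description.
    keywords = {w: c for w, c in unigrams.most_common(top_n)}
    keywords.update({" ".join(b): c for b, c in bigrams.most_common(top_n)})
    return keywords
```

In a full pipeline these candidates would then seed question stems and distractor selection.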
While a large number of well-known knowledge bases (KBs) in life science have been published as Linked Open Data, there are few KBs in Chinese. However, KBs of life science in Chinese are necessary when we want to automatically process and analyze electronic medical records (EMRs) in Chinese. Of these, a symptom KB in Chinese is the most urgently needed, since symptoms are the starting point of...
Enormous efforts by human volunteers have made Wikipedia a treasure of textual knowledge. Relation extraction, which aims at extracting structured knowledge from the unstructured texts in Wikipedia, is an appealing but quite challenging problem, because it is hard for machines to understand plain text. Existing methods are not effective enough because they understand relation types at the textual level...
In recent years, the number of entities in large knowledge bases has been increasing rapidly. Such entities can help to bridge unstructured text with structured knowledge and thus be beneficial for many entity-centric applications. The key issue is to link entity mentions in text with entities in knowledge bases, where the main challenge lies in mention ambiguity. Many methods have been proposed to...
Semantic relations play an important role in knowledge acquisition research. This paper proposes a method of semantic relation acquisition and automatic synthesis based on Wikipedia. First, we obtain three kinds of basic semantic relations from Wikipedia and extend the semantics of concepts to address the problem of semantic fuzziness in semantic relations. Then, an automatic synthesis...
The sparse information in short texts often makes traditional keyword extraction less effective than expected. In this paper, we propose a graph-based ranking algorithm that exploits Wikipedia as an external knowledge base for short text keyword extraction. To overcome the information sparsity of short texts, we introduce Wikipedia to enrich the short text....
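A graph-based keyword ranking of the kind this abstract describes can be sketched with a TextRank-style PageRank over a word co-occurrence graph. This is a generic sketch under stated assumptions: the Wikipedia enrichment step that is the paper's contribution is omitted, and the window size and damping factor are conventional defaults, not values from the paper.

```python
from collections import defaultdict
import re

def textrank_keywords(text, window=2, damping=0.85, iters=50, top_n=5):
    """PageRank-style ranking over a word co-occurrence graph.

    Minimal TextRank-like sketch; the paper additionally enriches the
    short text with Wikipedia-derived context before building the graph.
    """
    words = re.findall(r"[a-z]+", text.lower())
    # Build an undirected co-occurrence graph within a sliding window.
    graph = defaultdict(set)
    for i, w in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != w:
                graph[w].add(other)
                graph[other].add(w)
    # Iterate the PageRank update until (approximate) convergence.
    scores = {w: 1.0 for w in graph}
    for _ in range(iters):
        scores = {
            w: (1 - damping) + damping * sum(scores[n] / len(graph[n]) for n in graph[w])
            for w in graph
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Well-connected words accumulate score from their neighbors, so terms central to the text rise to the top.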
With the rapid evolution of Linked Open Data (LOD), researchers are exploiting it to solve particular problems such as semantic similarity assessment. Existing LOD-based semantic similarity approaches attach the compared data (terms or concepts) to LOD resources in order to exploit their semantic descriptions and relationships with other resources, and then estimate the degree of overlap between the resources. Current approaches...
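The overlap estimate mentioned above can be illustrated, in its simplest form, as a Jaccard coefficient over the sets of descriptors attached to two resources. This is only an illustration of the general idea; real LOD-based approaches weight properties and traverse linked resources rather than comparing flat sets.

```python
def resource_overlap(desc_a, desc_b):
    """Jaccard overlap between two sets of resource descriptors.

    A minimal illustration of overlap-based similarity: the ratio of
    shared descriptors to all descriptors of either resource.
    """
    a, b = set(desc_a), set(desc_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```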
Considering today's surge of information, the need for well-organized knowledge bases is increasing rapidly, to provide simplified access to knowledge and support its further processing. In the biomedical domain, a wealth of information is buried in scientific publications and online forums. This calls for representing this information in a more expressive semantic way by determining and storing relational information...
With the growth of Linked Data, updating knowledge bases (KBs) is becoming a crucial problem, particularly when representing knowledge linked to permanently evolving instances. Many approaches have been proposed to extract new knowledge from textual documents in order to update existing KBs. These approaches have reached maturity but rely on an adequate corpus already having been constructed. In...
A knowledge base is a machine-readable set of knowledge. More and more multi-domain and large-scale knowledge bases have emerged in recent years, and they play an essential role in many information systems and semantic annotation tasks. However, we do not yet have a perfect knowledge base, and perhaps we never will, because all knowledge bases have limited coverage while new knowledge...
This paper provides a brief survey of semantic similarity, covering both semantic similarity between concepts and semantic textual similarity. We classify methods of semantic similarity between concepts into four categories based on the background information resource used, and likewise classify methods of semantic textual similarity into four categories. As a basic methodology of text-related research and applications, semantic...
Building a knowledge base (KB) describing domain-specific entities is an important problem in industry, with examples including KBs built over companies (e.g. Dun & Bradstreet), skills (LinkedIn, CareerBuilder) and people (inome). The task involves several engineering challenges, including devising effective procedures for data extraction, aggregation and deduplication. Data extraction involves processing...
This paper presents a method of acquiring IsA assertions (hyponymy relations), AtLocation assertions (informing of the location of objects) and Located Near assertions (informing of neighboring locations) automatically from Japanese Wikipedia XML dump files. To extract IsA assertions, we use the Hyponymy extraction tool v1.0, which analyses the definition, category and hierarchy structures of Wikipedia articles...
Underspecified search queries can be handled via result-list diversification approaches, which are often computationally complex and require longer response times. In this paper, we explore an alternative, more efficient way to diversify the result list, based on query expansion. To that end, we used a knowledge-base pseudo-relevance feedback algorithm. We compared our algorithm to IA-Select,...
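The general shape of pseudo-relevance feedback, on which the abstract's algorithm builds, can be sketched as follows. This is a plain term-frequency variant, purely for orientation: the paper's method draws expansion terms from a knowledge base, and the function and parameter names here are assumptions.

```python
from collections import Counter

def expand_query(query, top_docs, n_terms=3):
    """Pseudo-relevance feedback: expand a query with frequent terms
    from the top-ranked documents of an initial retrieval run.

    Simplified sketch; the paper's approach selects expansion terms
    via a knowledge base rather than raw term frequencies.
    """
    query_terms = set(query.lower().split())
    # Count terms in the feedback documents, excluding original query terms.
    counts = Counter(
        t for doc in top_docs for t in doc.lower().split() if t not in query_terms
    )
    expansion = [t for t, _ in counts.most_common(n_terms)]
    return query + " " + " ".join(expansion)
```

The expanded query is then re-run, so that different senses or aspects of an ambiguous query surface in the result list without an explicit diversification step.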
Numerous critical Internet applications with high-quality services, such as Web directories, search engines, Web crawlers, recommendation systems and user profile detectors, depend heavily on efficient and accurate web page classification. Traditional supervised or semi-supervised machine learning methods find it increasingly difficult to adapt to the explosive growth of Internet information. This...
The evolution of named entities affects exploration and retrieval tasks in digital libraries. An information retrieval system that is aware of name changes can actively support users in finding former occurrences of evolved entities. However, current structured knowledge bases, such as DBpedia or Freebase, do not provide enough information about evolutions, even though the data is available on their...