As natural language processing (NLP) technology develops, automatic named entity recognition (NER) is increasingly needed to improve the performance of information extraction systems. This paper proposes a hybrid model for Chinese person names, based on conditional random fields (CRFs), that fuses multiple features. It differs from most previous approaches,...
This paper proposes a novel approach to entity relation extraction. Unlike previous approaches, it includes a dedicated syntactic knowledge extraction component, which automatically extracts characteristic words and patterns using hierarchical bootstrapping machine learning. It advocates using a small amount of seed information and a large collection of easily-obtained unlabeled...
Entity relation extraction (RE) is an important research field in information extraction; this paper treats RE as a classification problem. It presents a novel approach in which conditional random fields (CRFs)-based machine learning is used to extract relations between entities from Chinese texts. Ten features have been designed for entity relation extraction, which include...
This paper proposes a new approach to personal name recognition in the Chinese language domain. It combines rule-based and statistical methods and draws on linguistic knowledge: in the first step, personal names are collected as candidate entities and passed to a statistical model, here conditional random fields (CRFs), which decides whether each candidate is a relevant entity. At the same time,...
Entity relation extraction (RE) is a very important research domain in information extraction; this paper treats RE as a classification problem. RE is still a new field of study for the Chinese language, and maximum entropy (ME)-based machine learning is used here for the first time to extract relations between named entities from Chinese texts. Thirteen features have been designed for entity...
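To make the "RE as classification" framing concrete, the following is a minimal sketch of a maximum entropy classifier (equivalent to multinomial logistic regression) trained by stochastic gradient ascent. It is not the authors' implementation: the feature names, relation labels, training examples, and hyperparameters are all hypothetical, and the thirteen features mentioned in the abstract are not listed in this excerpt.

```python
import math
from collections import defaultdict

# Hypothetical relation labels and toy training data (not from the paper).
LABELS = ["EMPLOYMENT", "LOCATED_IN", "NONE"]
SAMPLES = [
    (["e1=PER", "e2=ORG", "verb=work"], "EMPLOYMENT"),
    (["e1=PER", "e2=ORG", "verb=join"], "EMPLOYMENT"),
    (["e1=ORG", "e2=LOC", "verb=locate"], "LOCATED_IN"),
    (["e1=ORG", "e2=LOC", "verb=base"], "LOCATED_IN"),
    (["e1=PER", "e2=LOC", "verb=see"], "NONE"),
]


def predict_proba(weights, feats, labels):
    """Return p(label | feats) under the maxent model (softmax over summed weights)."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train_maxent(samples, labels, epochs=200, lr=0.5):
    """Fit one weight per (feature, label) pair by stochastic gradient ascent."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in samples:
            probs = predict_proba(weights, feats, labels)
            for y in labels:
                # Log-likelihood gradient: observed count minus expected count.
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights


def classify(weights, feats, labels):
    """Pick the most probable relation label for a candidate entity pair."""
    probs = predict_proba(weights, feats, labels)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    model = train_maxent(SAMPLES, LABELS)
    print(classify(model, ["e1=PER", "e2=ORG", "verb=work"], LABELS))
```

Each candidate entity pair is reduced to a bag of binary features, and the classifier assigns it one relation label; a real system would add regularization and a much richer feature set.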
Chinese word segmentation involves two main kinds of segmentation ambiguity: overlapping ambiguity and combination ambiguity. The paper analyzes the properties of these ambiguities and proposes a multi-knowledge approach to disambiguation. Multi-knowledge refers to knowledge drawn from large-corpus statistics and from syntactic, semantic, or discourse information about ambiguous words. Class based N-gram and...
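The two ambiguity types named above can be illustrated with a small lexicon lookup. In the usual definitions, an overlapping ambiguity is a string ABC in which both AB and BC are words, and a combination ambiguity is a word AB whose parts A and B are also words. The sketch below only detects such strings; it does not disambiguate them and is not the paper's method. The lexicon, the example sentences, and the `max_len` word-length limit are assumptions made for illustration.

```python
# Hypothetical toy lexicon; a real system would load a full dictionary.
LEXICON = {"结合", "合成", "结", "成", "才能", "才", "能", "他", "出众"}


def overlapping_ambiguities(text, lexicon, max_len=4):
    """Find spans covered by two lexicon words that share at least one character."""
    found = set()
    n = len(text)
    for i in range(n):
        for k in range(i + 2, min(n, i + max_len) + 1):
            if text[i:k] not in lexicon:
                continue
            # Look for a second word starting inside text[i:k] and ending after it.
            for j in range(i + 1, k):
                for l in range(k + 1, min(n, j + max_len) + 1):
                    if text[j:l] in lexicon:
                        found.add(text[i:l])
    return found


def combination_ambiguities(text, lexicon, max_len=4):
    """Find lexicon words that can also be split into two lexicon words."""
    found = set()
    n = len(text)
    for i in range(n):
        for j in range(i + 2, min(n, i + max_len) + 1):
            word = text[i:j]
            if word in lexicon and any(
                text[i:m] in lexicon and text[m:j] in lexicon
                for m in range(i + 1, j)
            ):
                found.add(word)
    return found


if __name__ == "__main__":
    # "结合成" can be read as 结合 + 成 or as 结 + 合成.
    print(overlapping_ambiguities("结合成", LEXICON))
    # "才能" can be kept whole or split into 才 + 能.
    print(combination_ambiguities("他才能出众", LEXICON))
```

Detection of this kind only yields candidate strings; choosing the correct segmentation in context is the disambiguation step that the abstract addresses with statistical and linguistic knowledge.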
This paper proposes a prototype information service system that delivers information of interest to users by applying database-style search to unstructured text. To achieve this goal, two fundamental issues, named entity recognition and relation extraction, were studied using the maximum entropy (ME) algorithm. Our named entity recognition approach is distinguished from most...
In the Chinese word segmentation task, combination ambiguity is one of the challenges that has not been well resolved. The main obstacle lies in detecting ambiguous words in given texts and segmenting them properly. This paper puts forward a practical approach to automatically collecting ambiguous words and disambiguating them based on the maximum entropy principle. The experimental result reveals the approach of...