This paper proposes a measure, based on Minimum Edit Distance (MED), of the similarity between two sets of MultiWord Expressions (MWEs), which we use to calculate the matching degree between two documents. We test the matching algorithm in a position searching system. Experiments show that the new measure performs better than cosine distance.
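The abstract does not say how MED is lifted from single strings to sets of MWEs, so the following is only a minimal sketch under stated assumptions: each MWE is split into words, pairs of MWEs are compared with word-level Levenshtein distance normalized by the longer length, and the two sets are compared by a symmetric average of best matches. The function names `mwe_similarity` and `set_similarity`, and the aggregation rule itself, are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences, unit costs."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to empty prefix of b
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def mwe_similarity(x: str, y: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer MWE."""
    a, b = x.split(), y.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def set_similarity(left: Set[str], right: Set[str]) -> float:
    """Assumed aggregation: symmetric average of best-match similarities."""
    if not left or not right:
        return 0.0

    def directed(src: Set[str], dst: Set[str]) -> float:
        return sum(max(mwe_similarity(s, d) for d in dst) for s in src) / len(src)

    return (directed(left, right) + directed(right, left)) / 2
```

Under these assumptions, two MWEs that share most of their words receive partial credit, whereas a cosine measure over exactly matched MWEs would score them as completely different.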
This paper presents a novel application of Alternating Structure Optimization (ASO) to text chunking for Semantic Role Labeling (SRL) in Chinese texts. ASO is a competent linear algorithm based on the theory of multi-task learning. In this paper, by constructing several SRL tasks to form a multi-task problem, we are able to encode the inference obtained by the ASO algorithm...
This paper proposes a novel approach to annotating function tags for unparsed text. What distinguishes our work from other attempts at this task is that we assign function tags directly, based on lexical information rather than on parse trees. To demonstrate the effectiveness and versatility of our method, we investigate two statistical models for automatic annotation: one is log-linear maximum...
Recent years have seen great progress in the study of English question classification. In our research, we study Chinese question classification by exploiting the results of lexical, syntactic and semantic parsing of question sentences. Support vector machines are used to train a classifier on 6 coarse categories, with individual parsing results and combinations of them as features. We find that even...
Chinese named entity recognition (NER) is studied in two directions: inner structure and outer surroundings. Inner structural analyses derive the composition of person, location and organization names from a linguistic point of view. However, inner structural rules for named entities provide only necessary, not sufficient, conditions for a sequence of Chinese characters to be an entity name. Whether a...
Since noun phrases are the most common phrases in texts, noun phrase identification is one of the vital subtasks of natural language processing. Generally, Chinese noun phrases have hierarchical inner structures. This paper proposes an approach that defines various levels of granularity for noun phrases, catering for different application demands. Three levels of noun phrase granularity are proposed,...
Person, location and organization names have always been cited as a bottleneck of named entity recognition (NER) systems. Automatic recognition of Chinese organization names is the most difficult problem in NER tasks. This paper presents a new approach to Chinese organization name recognition based on cascaded conditional random fields. In the proposed approach, we first recognize the person name and...
Entity relation extraction (RE) is a very important research domain in information extraction. In this paper, we regard RE as a classification problem. RE is still a new field of study for the Chinese language, and this is the first time that maximum entropy (ME)-based machine learning has been used to extract relations between named entities from Chinese texts. Thirteen features have been designed for entity...
This paper proposes a prototype information service system that extracts information of interest from unstructured text and delivers it to users through database search. To achieve this goal, two fundamental issues, named entity recognition and relation extraction, are studied using the maximum entropy (ME) algorithm. Our named entity recognition approach is distinguished from most...