Treebanks are a crucial resource for syntactic parsing, but syntactic labeling and error checking for treebanks are time- and energy-consuming. To improve the efficiency of treebank construction, we have designed and implemented a treebank editing system based on a graphical interface. This graphical treebank editing system proved to be 5 times faster than error checking on bracket marked syntactic...
Chinese word segmentation plays an important role in Chinese text mining. It is the foundation of automatic relation extraction and identification in Chinese information processing. In this paper, we propose a method for Chinese word segmentation based on conditional random fields (CRF) with character clustering. For the character clustering, we first use the Skip-Gram model to obtain character...
Different websites generally have different web page structures, which can heavily affect extraction quality when web content is collected automatically. Based on a statistical analysis of the content features and structural characteristics of news-domain web pages, this paper proposes a maximum continuous sum of text density (MCSTD) method to efficiently and effectively extract web content...
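The abstract above is truncated and does not spell out the MCSTD procedure, so the following is only a loose sketch of how a "maximum continuous sum" over per-line text density might be computed. It assumes, hypothetically, that density is the number of visible characters per tag on a line, that the page mean serves as a threshold, and that the main content is the contiguous run of lines whose thresholded densities sum highest (a standard maximum-subarray scan). The names `text_density` and `max_continuous_density_span` are illustrative and do not come from the paper.

```python
import re

# Crude tag matcher; an assumption made for illustration, not a real HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def text_density(line: str) -> float:
    """Visible characters per tag on one line (hypothetical density definition)."""
    tags = TAG_RE.findall(line)
    text = TAG_RE.sub("", line).strip()
    return len(text) / (len(tags) + 1)


def max_continuous_density_span(lines, threshold=None):
    """Return (start, end) indices of the contiguous run of lines whose
    thresholded densities have the largest sum, or None for empty input."""
    densities = [text_density(line) for line in lines]
    if not densities:
        return None
    if threshold is None:
        # Assumed threshold: the mean density of the page.
        threshold = sum(densities) / len(densities)
    scores = [d - threshold for d in densities]

    # Maximum-subarray (Kadane) scan that also tracks the span boundaries.
    best_sum, best_start, best_end = scores[0], 0, 0
    cur_sum, cur_start = scores[0], 0
    for i in range(1, len(scores)):
        if cur_sum <= 0:
            cur_sum, cur_start = scores[i], i
        else:
            cur_sum += scores[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_start, best_end
```

Subtracting a threshold makes link-heavy navigation lines score negative, so the scan naturally stops at the edges of the text-dense block; the actual method in the paper may define density and boundaries differently.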
This paper presents the systems submitted by the joint team of Dublin City University and National Taiwan University to the IALP 2016 Shared Task: Dimensional Sentiment Analysis for Chinese Words. The systems learn a vector representation for each Chinese word using the Word2Vec algorithm for sentiment analysis. The corpus used for the calculation of vector representation is 5 years (2006 to...