Email is a semi-structured document: some important attributes are contained in its structure, and using spam-specific features in particular could improve email classification results. In this paper, we apply a decision tree data mining technique to uncover potential association rules among these email attributes, and then identify the category of an unknown email based on these rules. According...
Coreference, a ubiquitous language phenomenon, makes the topic more prominent and the narration more concise and coherent in discourse. However, it also introduces ambiguity in Natural Language Processing. Coreference resolution is the process of eliminating the indeterminacy caused by coreferential forms. To improve the current system, a method of coreference resolution combined...
This paper presents a rule-based approach for the phonetic transcription of the Romanian language. We integrate this phonetic analysis into the text processing component of a text-to-speech system for Romanian. Grapheme-to-phoneme rules are constructed based on expert information from the DOOMII dictionary. In cases where the rules are of no use, we employ decision trees constructed on engineered training...
In this paper, we analyze and compare various approaches for Thai word segmentation. The word segmentation approaches can be classified into two distinct types: dictionary-based (DCB) and machine-learning-based (MLB). The DCB approach relies on a set of terms for parsing and segmenting input texts, whereas the MLB approach relies on a model trained from a corpus using machine learning techniques...
Coreference resolution is the process of determining whether two expressions in natural language refer to the same entity in the world. We apply a machine learning approach based on decision trees to the coreference resolution of general noun phrases in unrestricted text, using well-defined features. We also use approximate matching algorithms for a string match feature and databases of American last names...