This paper proposes two types of machine-extracted linguistic features from unlimited text input for Mandarin prosody generation. One is the improved punctuation confidence (iPC), a modified version of the previously proposed punctuation confidence, which represents the likelihood of inserting major punctuation marks (PMs) at word boundaries. The other is the quotation confidence (QC), which measures...
We propose a method for adapting Semantic Role Labeling (SRL) systems from a source domain to a target domain by combining a neural language model and linguistic resources to generate additional training examples. We primarily aim to improve the results for the Location, Time, Manner and Direction roles. In our methodology, main words of selected predicates and arguments in the source-domain training data...
This paper presents a study exploring word-level break-type formation rules for Mandarin read speech. A 4-layer hierarchical structure with seven break types is adopted to represent the prosody of an utterance. The work is based on the break-type tags labeled on a large read-speech database by the previously proposed prosody labeling and modeling (PLM) algorithm. Occurrence frequencies of seven break...
Research on robot control normally studies how to control the robot, for example how to plan and control its motions; in this paper, however, we aim to teach the robot how to control people.
Linguistic research on a language always needs a large database that covers all features of the language, such as grammatical information, writing style, syntax, etc. A corpus provides a platform for investigating a natural language. Compared to other languages, very little research has been done on Urdu due to its segmentation dilemma and difficult...
This paper proposes a novel statistical linguistic feature, called punctuation confidence, to assist prosodic break prediction in Mandarin text-to-speech. The punctuation confidence, calculated from the input text, measures the likelihood of inserting a major PM at a word boundary. Since a punctuation mark in text tends to be pronounced as a break, the punctuation confidence associated...
This research compares several of the thematic roles of VerbNet (VN) to those of the Linguistic Infrastructure for Interoperable Resources and Systems (LIRICS). The purpose of this comparison is to develop a standard set of thematic roles suited to a variety of natural language processing (NLP) applications. We draw from both resources to construct a unified set of semantic roles that...
In this paper we present different methodologies for extracting semantic role labels of Bengali nouns using 5W distilling. The 5W task seeks to extract the semantic information of nouns in a natural language sentence by distilling it into the answers to the 5W questions: Who, What, When, Where and Why. As Bengali is a resource-constrained language, the building of an annotated gold-standard corpus and acquisition...