With the rapid growth of unstructured data accessible via the web, managing these data and finding undiscovered information in huge datasets has become a necessary task. Consequently, text mining, which can be defined as gleaning important information from natural language text, has emerged. In this study, in order to facilitate information management for aspect-based sentiment analysis studies, a Turkish sentiment...
Speech synthesis is the computer generation of human voice. Speech synthesis systems are often called text-to-speech (TTS) systems because of their ability to convert text into speech. A TTS synthesis system converts written orthographic text into corresponding artificial speech signals. In multi-lingual cultural settings,...
Each of the dialects of the Thai language has a distinct identity associated with its accent. Conversation between native speakers of different dialects, despite their common origin in the standard language, cannot be avoided when visiting each region. The need to communicate with people who understand only the Northern Thai Dialect (NTD) brought us to the idea of inventing the Northern Thai Dialect Text to Speech...
A speech synthesis system converts written text to speech. To build a natural-sounding speech synthesis system, it is essential that the text processing component produce an appropriate sequence of units. The syllable preserves co-articulation effects within the sound unit. In our current work, a concatenative method is used to develop a synthesis system using the syllable as the basic unit, which includes Jodhakshars,...
We constructed a system infrastructure capable of processing unstructured data, with the aim of practical application of the system for document data analysis in the manufacturing industry. Using past ISSM research paper data, papers were classified and verified. Using morphological analysis, the extracted parts of speech were used as feature quantities, and machine learning was executed. Since effective...
Examining data for similar items is one of the fundamental data-mining problems. Methods for similarity search could be useful for detecting plagiarism or near-duplicate web pages. The computerized methods developed in recent years are mainly focused on the English language. However, the Slovak language has several specific attributes, and using these methods may not be precise enough. Our...
Syntactic text analysis is a very important step in automatic text processing. The key problem is that all existing approaches are dictionary-dependent. It can be impossible to analyze a sentence because a single word is missing from the dictionary. However, even the presence of the dictionary does not resolve the ambiguity of phrase interpretation. At the same time, fusional languages contain enough...
More than 1.2 million Dai people in Yunnan province use the Dai language, so research on Dai speech synthesis has great significance in advancing the informatization of Dai. This paper focuses on the implementation of Dai speech synthesis using the HMM speech synthesis framework and the STRAIGHT synthesizer. The methods of collection and selection of Dai...
Text analysis is the front end of a TTS system and has a great influence on the naturalness of the back-end speech synthesis. Statistical parametric speech synthesis is now commonly applied and is gradually becoming an important method of speech synthesis; however, research on front-end text analysis is often overlooked in the process of current Tibetan...
Communication is a very natural characteristic of every creature. Sometimes we use different symbols, or many formed languages, to communicate with each other. Every language we use is suitable for both oral and text communication. Writing symbols is a way to express our intentions using any physical material. As we also have oral communication capability, which we could use exactly as we want to speak...
This paper investigates the problem of audio event detection and summarization, building on previous work [1,2] on the detection of perceptually important audio events based on saliency models. We take a synergistic approach to audio summarization where saliency computation of audio streams is assisted by using the text modality as well. Auditory saliency is assessed by auditory and perceptual cues...
We describe a basic framework and methodology to convert Bangla text to speech. The methodology automatically produces articulated words from Bangla input text, based on the basic pronunciation of the Bangla words. Single-tone syllables are considered the fundamental units for analysis. The methodology selects phonetic units from uttered vocabulary and then combines the appropriate diphones...
Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts...
In this paper, we use a case study to investigate how top management team (TMT) attention distribution affects a firm's choice of international expansion strategy. With the help of an automated text analysis method, we analyze the CEO's public speeches and the annual reports of Huawei, and measure TMT attention by counting sentences relating to technology seeking, global brand building, and target market positioning,...
This paper presents an approach that considers both corpus-level (global) information and localized acoustic patterns to discover prominent words in audio conversations. The global information is extracted using text analysis techniques, in particular latent Dirichlet allocation (LDA), which extracts domain-specific prominent words and also arranges them in a set of topics. The domain...
Shallow semantic parsing is an important component of all kinds of NLP applications, and Semantic Role Labeling in particular is an active research topic. This paper describes a rule-based Semantic Role Labeling system aimed at extracting semantic information from texts. The input text is processed by exploiting part-of-speech information and syntactic dependencies in...
The paper presents the Text Normalization and Phonetic Analysis Modules that are part of the frontend of the text-to-speech (TTS) system “Speak Macedonian”. First, the architecture of the frontend of the TTS system “Speak Macedonian” is briefly presented, followed by a detailed look into the two modules. For each of the modules, a short summary is given of the tasks and developed solutions, found in...
This paper describes the collection of a text and audio corpus for mobile personal communication in Hindi. Hindi is the largest of the Indian languages, and is the first language for more than 200 million people who use it not only for spoken mobile communication but also for sending text messages to each other. The main script for Hindi is Devanagari, but it is not well supported by the current generation...
In a Myanmar-to-English translation system, producing a meaningful sentence when translating from one language to another is a non-trivial task. POS tagging is used as an early stage of linguistic text analysis in many applications. POS tagging is the process of assigning the correct syntactic category to each word. Tagsets and word disambiguation rules are fundamental parts of any POS tagger. This paper...
A writing genre can be thought of as the style in which the writer chooses to present textual content to the reader. We distinguish four main essay genres, namely Narrative, Persuasive, Descriptive, and Expository. An essay's writing genre can be identified by searching for salient features present within those genres using various Natural Language Processing tools such as Named Entity Recognition,...