Morphemes are not independent units; they attach to each other according to morphotactics. However, most models in the literature assume that they are independent of each other in order to cope with the complexity. We introduce a language-independent model for unsupervised morphological segmentation using a hierarchical Dirichlet process (HDP). We model the morpheme dependencies in terms of morpheme...
This paper describes the system submitted by the BIT group to the IALP-2016 Shared Task. The system automatically acquires the valence-arousal (VA) ratings of Chinese affective words. Two methods are designed to generate a given word's VA ratings: one is based on synonym lexicons and the other on word embeddings. For the first method, we extend the annotated set based on a synonym lexicon to improve coverage...
In this paper we describe an authorship attribution system for Bengali blog texts. We present a new Bengali blog corpus of 3000 passages written by three authors. Our study proposes a text classification system based on lexical features such as character bigrams and trigrams, word n-grams (n = 1, 2, 3) and stop words, using four classifiers. We achieve the best results (more than 99%) on the...
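The lexical features named in this abstract (character bigrams and trigrams, word n-grams for n = 1, 2, 3, and stop words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the feature-key prefixes and the stop-word set passed in by the caller are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams, using whitespace tokenization (an assumption)."""
    tokens = text.split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def lexical_features(text, stop_words=frozenset()):
    """Merge all feature families into one sparse feature dictionary.

    The key prefixes (c2:, c3:, w1:, w2:, w3:, stop:) are hypothetical
    labels that keep the feature families from colliding.
    """
    features = Counter()
    for n in (2, 3):  # character bigrams and trigrams
        for gram, count in char_ngrams(text, n).items():
            features[f"c{n}:{gram}"] = count
    for n in (1, 2, 3):  # word unigrams, bigrams and trigrams
        for gram, count in word_ngrams(text, n).items():
            features[f"w{n}:{' '.join(gram)}"] = count
    for token in text.split():  # stop-word frequencies
        if token in stop_words:
            features[f"stop:{token}"] += 1
    return features
```

A feature dictionary of this kind could then be passed to any standard text classifier; the abstract does not say which four classifiers were used.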
Human motion capture (MOCAP) techniques are widely applied in many areas such as computer vision, computer animation, digital effects and virtual reality. Even with a professional MOCAP system, the acquired motion data still contains noise and outliers, which highlights the need for motion refinement methods. In recent years, many approaches for motion refinement have been developed,...
We present initial experiments with one-class classification, aimed at replacing the “classic” heuristics-based measures used to estimate the smoothness of units concatenated together within unit selection speech synthesizers. A set of spectral feature distances was computed between neighbouring frames in natural speech recordings, i.e. those representing natural joins, from which the per-vowel classifier...
Automatic construction of machine-readable dictionaries is a basic and challenging issue in the processing of less common languages. In this paper, we address the unsupervised ensemble learning (UEL) problem and investigate a UEL-based word extraction algorithm to detect multisyllabic words in large-scale Vietnamese text documents. Firstly, we design a syllable-level n-gram gluer to generate many potential...
In this paper, we introduce the application of a generic multi-level Convolutional Neural Network (CNN) approach to the scene understanding, or image parsing, task. Given an input image, a set of similar images is first retrieved from the training set based on global-level CNN feature matching similarities. Then, the input test image and the similar images are oversegmented into superpixels. Next,...
Genetic Programming (GP) is an evolutionary algorithm that has received a lot of attention lately due to its success in solving hard real-world problems. There has recently been considerable interest in the GP community in developing semantic genetic operators, i.e., operators that work on the phenotype. In this contribution, we describe EvoDAG (Evolving Directed Acyclic Graph) which is a Python library...
Personality prediction based on textual data has recently been gaining attention for its potential in large-scale personalized applications such as social media-based marketing and political campaigning. However, when applying this technology in real-world applications, users often encounter situations in which the personality traits derived from different sources (e.g., social media posts versus...
This work describes the Dynse framework, which uses dynamic selection of classifiers to deal with concept drift. Classifiers trained on new supervised batches that become available over time are added to a pool, from which a custom ensemble is selected for each test instance at classification time. The Dynse framework is highly customizable, and can be adapted to use any method for dynamic selection...
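The idea of selecting a custom ensemble from a pool for each test instance can be sketched in a few lines of Python. This is a minimal illustration, not the Dynse framework's API: the function names are hypothetical, classifiers are modeled as plain callables, and ranking classifiers by their accuracy on the k nearest labeled neighbours is just one possible dynamic selection rule, assumed here for concreteness.

```python
import math
from collections import Counter


def k_nearest(labeled, x, k):
    """Return the k labeled examples (features, label) closest to x."""
    return sorted(labeled, key=lambda item: math.dist(item[0], x))[:k]


def select_ensemble(pool, neighbours, size):
    """Keep the `size` classifiers that are most accurate on the neighbours."""
    def local_accuracy(clf):
        hits = sum(clf(features) == label for features, label in neighbours)
        return hits / len(neighbours)
    return sorted(pool, key=local_accuracy, reverse=True)[:size]


def classify(pool, labeled, x, k=3, size=1):
    """Select an ensemble for x, then predict by majority vote."""
    neighbours = k_nearest(labeled, x, k)
    ensemble = select_ensemble(pool, neighbours, size)
    votes = Counter(clf(x) for clf in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical pool: two trivial classifiers standing in for models
# trained on different batches.
def always_zero(x):
    return 0


def always_one(x):
    return 1


pool = [always_zero, always_one]
labeled = [((0.1,), 0), ((0.2,), 0), ((0.8,), 1), ((0.9,), 1)]
```

With this toy data, `classify(pool, labeled, (0.15,), k=2)` selects `always_zero` because it is correct on both nearest neighbours, while `classify(pool, labeled, (0.85,), k=2)` selects `always_one`.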
In this paper, we propose a novel learning framework for the problem of domain transfer learning. We map the data of two domains to one single common space, and learn a classifier in this common space. Then we adapt the common classifier to the two domains by adding two adaptive functions to it respectively. In the common space, the target domain data points are weighted and matched to the target...
Spam emails are a major threat that negatively impacts email users. Spam wastes time and the financial resources of businesses, consumes network bandwidth and slows down email servers. In addition, it provides a medium for distributing malicious code, and there is currently no single solution to this problem. The Bag of Words (BoW) word content feature extraction method is well established for classifying spam...
Performance assessment of human teaming in complex, real-world contexts is a fundamental challenge for research and training communities alike. We highlight a unique partnership between the cybersecurity training and research communities with the common goal of capturing human team performance. Whether in the context of a training assessment or a research endeavor, both are two sides of the same coin...
Gender classification is becoming more important with the increasing demand for automated applications, especially interactive ones. It can be used to increase the user-friendliness of interactive systems and to improve the performance of systems such as targeted advertising, automatic vending machines, and security and surveillance systems. This work focuses on implementing a gender...
Current healthcare practice is transitioning from a provider-centered model to a patient-centered model of care, in which patients are no longer passive recipients of care but are encouraged to actively engage in and take greater responsibility for medical decision-making. As part of this trend, patients are gaining access to larger and more diverse sets of medical texts through Electronic Medical Record...
Despite the growing demand by Brazilian companies for well-qualified professionals in Information Technology (IT), the Brazilian educational system has not been able to meet this demand in a satisfactory way, especially with regard to the quality of vocational training. One way to address this training problem has been the creation of specialized postgraduate courses in the IT area focused on students already...
The deep Convolutional Neural Network (CNN) is one of the most popular methods for image processing and recognition. Many research works aim to improve the performance of CNNs. However, the convolution kernel, an important part of CNNs, has rarely been discussed. As one Original Convolution Kernel (OCK) can only detect one type of visual feature with a fixed deformation, the networks using OCKs may...
Computer workers are in constant tension between meeting their deadlines and learning the tools they use to perform their jobs. Most often, the pressure to get work done overrides the importance of continually learning how to use the tool and thereby improving performance over the long run. The lack of knowledge, however, results in constant interruptions to the workflow as the engineer tries to “bend” the tool...
This Work in Progress paper describes an initiative to promote lesbian, gay, bisexual, transgender, and queer (LGBTQ) equality in engineering via online Safe Zone workshops as a means to create a visible network of LGBTQ-affirming faculty who contribute to creating a positive and inclusive climate in engineering departments. The online Safe Zone workshop series is part of a larger project that aims...
Crowdsourcing has inspired ways of effectively integrating the power of the crowd. Many current contributions on crowdsourcing focus only on the competition between the worker and the requester over resource allocation. However, most of these works pay little attention to the cooperation between the worker and the requester. In this paper, we take into consideration the competition-cooperation coexisting...