Automatic grading systems, such as WebWork, are becoming much more widely used because they relieve the instructor of the need to grade student work, provide students with automatic feedback, and can allow for immediate resubmission. They have also been shown to improve the effectiveness of teaching and learning. In this paper, we apply Item Response Theory (IRT) to a large WebWork Calculus homework dataset...
Mislabeled examples are difficult to avoid while building large scale datasets. In this paper we discuss an efficient approach for finding those mislabeled examples. Our approach involves selecting a small number of potentially mislabeled examples for review by an expert. We demonstrate the utility of our method by finding some mislabeled examples in one large scale dataset (ImageNet). We found 92...
We present work-in-progress reflecting on the initial year of a distinctive summer Research Experiences for Undergraduates (REU) program. Our REU model combines fundamental research in computational sensing with a scholarly context that connects computer science with computational liberal arts. Students are intellectually stimulated to make sense of people's behaviors and cognitive processes with...
We present a procedure for generating Abstract Meaning Representation (AMR) structures from English sentences based on a transition-based system. Our proposed solution makes use of Long Short-Term Memory networks to learn the action sequence that needs to be applied to the sentence in order to obtain the AMR graph. The action set is an extension of the arc-standard dependency parser, with several...
Different data mining techniques are employed in the stylometry domain for authorship attribution tasks. Discretization of the input data is sometimes applied to improve the decision system. In many cases this approach yields better classification results. On the other hand, there are situations in which discretization decreases the overall performance of the system. Therefore, the...
There is little practical significance in proving the equivalence between a pseudo-inverse linear discriminant (PILD) whose desired outputs are inversely proportional to the number of within-class samples and a Fisher linear discriminant (FLD) with the totally projected mean thresholds, which are disadvantageous for improving the overall classification accuracy. Even so, several examples have borne...
Previous RNN architectures have largely been superseded by LSTM, or “Long Short-Term Memory”. Since its introduction, there have been many variations on this simple design. However, it is still widely used and we are not aware of a gated-RNN architecture that outperforms LSTM in a broad sense while still being as simple and efficient. In this paper we propose a modified LSTM-like architecture. Our...
Multi-label learning has recently drawn considerable attention because it has many applications, such as text classification, image annotation, and query/keyword suggestion. In recent years, a number of approaches have been proposed to address this challenging task. However, they are either tree-based methods, which have expensive training costs, or embedding-based methods, which have relatively lower accuracy...
In this paper, we present an intelligent, state-of-the-art, mobile-based transportation system called SAFAR (Safe and Fast around the Road), which provides dynamic information to Karachi bus commuters about any type of violent incident that has occurred farther ahead of their current location on the current bus route. Using named entity recognition techniques, we have trained SAFAR to recognize...
Modern data is increasingly complex. High dimensionality, heterogeneity and independent multiple representations are the basic properties of today's data. With increasing sources of data collection, a single object can have multiple representations, which we call views. In this paper we propose a multiview classification technique, which uses fuzzy mapping to obtain maximum similarity between an object...
This paper describes the system submitted by the BIT group to the IALP-2016 Shared Task. The system automatically acquires the valence-arousal (VA) ratings of Chinese affective words. Two methods are designed to generate a given word's VA rating: one is based on synonym lexicons and the other on word embeddings. For the first, we extend the annotated set based on a synonym lexicon to improve coverage...
Many classification problems involve nodes that have a natural connection between them, such as links between people, pages, or social network accounts. Recent work has demonstrated how to learn relational dependencies from these links, then leverage them as predictive features. However, while this can often improve accuracy, the use of linked information can also lead to cascading prediction errors,...
This paper deals with the design of a weighted ensemble of classifiers for classifying imbalanced data with heterogeneous features. For this purpose, a meta ensemble model is created and, instead of class labels, the output of each base classifier used in the ensemble model is transformed into a [class label, weight] pair to deal with the problem. The performance of the proposed method on various datasets...
Deep learning has recently gained popularity in many machine learning applications, but a theoretical grounding for the strengths, weaknesses, and implicit biases of various deep learning methods is still a work in progress. Here, we analyze the role of spatial locality in Deep Belief Networks (DBN) and show that spatially local information is lost through diffusion as the network becomes deeper....
We develop a Partitioned Restricted Boltzmann Machine (PRBM) for classification. We demonstrate that this method provides both speed and accuracy. Specifically, because it is partitioned into smaller RBMs, all available data can be used for training, and individual RBMs can be trained in parallel. Moreover, as the number of dimensions increases, the number of partitions can be increased to significantly...
Because of the heavy use of data, the need for data leakage prevention is growing day by day. A data leakage prevention system decides whether access to particular data (confidential or non-confidential) is permitted. In data leakage prevention, the time stamp is very important for granting permission to access particular data, because in a particular period of time the data is confidential after the...
Undersampling is a popular technique for reducing the skew in the class distributions of unbalanced datasets. However, it is well known that undersampling one class modifies the priors of the training set and consequently biases the posterior probabilities of a classifier. In this paper, we study analytically and experimentally how undersampling affects the posterior probability of a machine learning...
In this paper, we propose a method to recognize human behavior by combining motion history images (MHI) and non-negative matrix factorization (NMF). The MHI preserves the temporal information of a behavior by holding the temporal motion appearance. Then, NMF is applied to extract the middle-level features of the moving object. The experimental results show that the proposed scheme can achieve robust...
We present LISSA (Live Interactive Social Skill Assistance), a web-based system that helps people practice their conversational skills by having short conversations with a human-like virtual agent and receiving real-time feedback on their nonverbal behavior. In this paper, we describe the development of an interface for these features and examine the viability of real-time feedback using a Wizard...
The advantages of multi-classification schemes based on decomposition strategies, and especially the One-vs-One framework, have been stressed even for those algorithms that can address multiple classes. However, there is an inherent hitch for the One-vs-One learning scheme related to the decision process: the non-competent classifier problem. This issue refers to the case where a binary classifier...