Real-time abnormal event detection in practical video surveillance has been a difficult task, because a huge amount of video data arrives continuously, normal events may change over time, and only a small portion of the video data contains abnormal events. In this paper, to address this problem, we use later-arriving data to update our model online in an incremental way. We propose a spatial-temporal...
Program anomaly detection models legitimate behaviors of complex software and detects deviations during execution. Behavior deviations may be caused by malicious exploits, design flaws, or operational errors. Probabilistic detection computes the likelihood of occurrences of observed call sequences. However, maintaining context sensitivity in detection incurs high modeling complexity and runtime overhead...
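The likelihood computation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a plain first-order Markov chain over call names with add-alpha smoothing, which is a deliberately context-insensitive stand-in and not the method of the paper; the function names, the toy traces, and the `alpha` parameter are all hypothetical.

```python
import math
from collections import defaultdict


def train_transition_counts(traces):
    """Count call-to-call transitions over a set of normal traces."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for trace in traces:
        vocab.update(trace)
        for prev, curr in zip(trace, trace[1:]):
            counts[prev][curr] += 1
    return counts, vocab


def log_likelihood(trace, counts, vocab, alpha=1.0):
    """Sum of log transition probabilities in `trace`, with add-alpha smoothing."""
    size = len(vocab) + 1  # one extra slot reserved for calls never seen in training
    total = 0.0
    for prev, curr in zip(trace, trace[1:]):
        row = counts.get(prev, {})
        numerator = row.get(curr, 0) + alpha
        denominator = sum(row.values()) + alpha * size
        total += math.log(numerator / denominator)
    return total


# Toy training data (hypothetical): call traces from normal executions.
normal_traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
    ["open", "write", "close"],
]
counts, vocab = train_transition_counts(normal_traces)

seen = log_likelihood(["open", "read", "close"], counts, vocab)
odd = log_likelihood(["open", "exec", "socket", "close"], counts, vocab)
print(seen, odd)  # the trace with unseen calls receives the lower score
```

A detector built on this idea would flag a trace whose score falls below a chosen threshold. Scores of traces with different lengths are not directly comparable, so dividing by the number of transitions is a common normalization.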
Bayesian Filtering in nonlinear stochastic dynamical systems has been addressed for a long time. Among other solutions, Particle Filtering (PF) algorithms propagate in time a Monte Carlo (MC) approximation of the a posteriori filtering measure. This paper presents an algorithm for particle prediction that takes into account a variant of the traditional Hidden Markov Model where noises (process noise...
Phone-cluster adaptive training (Phone-CAT) is a subspace-based acoustic modeling technique inspired by cluster adaptive training (CAT) and the subspace Gaussian mixture model (SGMM). This paper explores three extensions to the basic Phone-CAT model to improve its recognition performance, viz., increasing the phonetic subspace dimension, including sub-states, and including a speaker subspace. The latter two extensions...
Most facial expression recognition methods assume that training and testing data follow the same distribution. As facial image sequences may contain information from heterogeneous sources, facial data may be asymmetrically distributed between training and testing, as it may be difficult to maintain the same quality and quantity of information. In this work, we present a novel classification...
Over the last two decades, Arabic natural language processing (ANLP) has become increasingly important. One of the key issues related to ANLP is ambiguity. In Arabic, different pronunciations of one word may have different meanings. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT). The issue of ambiguity has limited the usefulness...
Pervasive computing, also called ubiquitous computing, makes computing available anytime and anywhere. It is a fast-growing, innovative trend in the field of the Internet of Things (IoT). Context awareness is a key part of pervasive computing. Context-aware computing means both sensing data from sensors and reacting according to that sensory data. Context is nothing but information about...
Named Entity Recognition (NER) is one of the major tasks in Natural Language Processing, which is widely used in the fields of Computer Science and Computational Linguistics. However, the amount of prior research done on NER for Sinhala is minimal. In this paper, we present data-driven techniques to detect Named Entities in Sinhala text, with the use of Conditional Random Fields (CRF) and...
Statistical topic models represented by Latent Dirichlet Allocation (LDA) and its variants are ubiquitously applied to understanding large corpora. However, topic models based on bag-of-words (BoW) rarely incorporate contextual information, which carries a large amount of useful knowledge in a document, into the probabilistic framework. This shortcoming of LDA leads to its failing to learn contextual...
This paper introduces a fully automated Arabic phone recognition system based on the Enhanced Wavelet Packets Best Tree Encoding (EWPBTE) 15-point speech feature. WPBTE is enhanced by adding an energy component, which is implemented in MATLAB and yields a 65% enhancement in recognizer accuracy, the main contribution of this paper. EWPBTE...
Meetings are an important communication and coordination activity of teams: status is discussed, new decisions are made, alternatives are considered, details are explained, information is presented, and new ideas are generated. As such, meetings contain a large amount of rich project information that is often not formally documented. Capturing all of this informal meeting information has been a topic...
Numerous methods have been proposed to address different aspects of human activity recognition. However, most of the previous approaches are static in terms of the data sources used for the recognition task. As sensors can be added or can fail and be replaced by different types of sensors, creating an activity recognition model that is able to leverage dynamically available sensors becomes important...
The aim of machine learning is to solve a given problem using past experience or example data. Many machine learning applications are already in use nowadays. More ambitious problems can be handled as more data become accessible. Here, in this context, we learn in detail about text mining as a multi-dimensional field which involves closely linked areas such as 1. Retrieving information,...
This paper focuses on video summarization of abnormal behavior for remote invigilation of online exams. While the last decade has seen a massive increase in e-learning and online courses offered at postsecondary institutions, preserving the integrity of online examinations still heavily relies on web video conference invigilation performed by a remote proctor. Live remote invigilation is limited in...
The linear-chain CRF is one of the most popular discriminative models for human action recognition, as it can achieve good prediction performance in temporal sequential labeling by capturing the one- or few-timestep interactions of the target states. However, existing CRF formulations have limited capabilities to capture deeper intermediate representations within the target states and higher order...
The HOWERD model for estimating the most likely alignment between an OWL ontology and an Entity Relation Diagram (ERD) is presented. Automatic alignment between a relational schema and an ontology represents a big challenge in Semantic Web research due to the different expressiveness of these representations. A relational schema is less expressive than an ontology; this is a non-trivial problem when accessing...
The rapid growth of video data demands both effective and efficient video summarization methods so that users can quickly browse and comprehend a large amount of video content. Hence, it is very challenging to store and access such audiovisual information in real time when an immense amount of video content is recorded every second. In this paper, we propose an equal partition...
Alternative features were derived from an extracted temporal envelope bank (TBANK). These simplified temporal representations were investigated in alignment procedures to generate frame-level training labels for deep neural networks (DNNs). TBANK features improved temporal alignments both for supervised training and for context-dependent tree building.
This paper adopts the maximum model for English part of speech tagging. During corpus preprocessing, it pre-tags words that have only one possible part of speech, which adds many context features that can be utilized. We also improve the tagging algorithm to take into account the overall optimization of the POS sequence without extra computation, which also improves tagging accuracy....
Prosody provides cues that are critical to human speech perception and comprehension, so it is plausible to integrate prosodic information into machine speech recognition. However, because of its supra-segmental nature, it is hard to integrate prosodic information with conventional acoustic features. Recently, RNNLMs have been shown to be state-of-the-art language models in many tasks. We thus...