We propose a statistical framework for high-level feature extraction that uses SIFT Gaussian mixture models (GMMs) and audio models. SIFT features were extracted from all the image frames and modeled by a GMM. In addition, we used mel-frequency cepstral coefficients and ergodic hidden Markov models to detect high-level features in audio streams. The best result obtained by using SIFT GMMs in terms...
Recently there has been considerable interest in topic models based on the bag-of-features representation of images. The strong independence assumption inherent in the bag-of-features representation is not realistic however: patches often overlap and share underlying image structures. Moreover, important information with respect to relative scales of the features is completely ignored, for the sake...
Producing large amounts of digital media data every day requires fast transmission, efficient storage, flexible manipulation, and reuse of visual content. Since humans tend to use high-level semantic concepts when querying and browsing multimedia databases, there is an increasing need for semantic video indexing and analysis. For this purpose, we proposed a unified framework for semantic extraction...
This paper presents a method that integrates audio and visual information for human action scene analysis. The approach is top-down: it determines and extracts action scenes in video by analyzing both audio and video data. We proposed a framework for recognizing actions by measuring image and action-based information from video with the following characteristics: feature extraction is done...
This work discusses the application of an Artificial Intelligence technique called data extraction and a process-based ontology in constructing experimental qualitative models for video retrieval and detection. We present a framework architecture that uses multimodality features as the knowledge representation scheme to model the behaviors of a number of human actions in the video scenes. The main...
Story boundary detection is the foundation of content-based news video retrieval. In this paper, the Naive Bayes model, which has been successfully used in multi-modal feature fusion, is applied to news video story segmentation. First, we obtain candidate boundaries through shot detection. Second, middle-level features such as visual features, audio type, motion and caption are extracted from shots...
Content-based video retrieval systems are fairly recent, and it is currently necessary to examine where they would merely replace existing systems, where they can bring real improvement, and where they will open new possibilities. Users want to query the content rather than the raw video data. In this paper, we surveyed the state of the art in video retrieval and proposed a basic framework for video retrieval based...
Besides the reduction of redundancy, the selection of representative segments is a core problem when summarizing collections of raw video material. We propose a novel approach for the selection of segments to be included in a video summary based on hidden Markov models (HMMs), which are trained on an annotated subset of the content. The observations of the HMM are relevance judgments of content segments...
Pervasive healthcare provides an effective solution for monitoring the wellbeing of the elderly, quantifying post-operative patient recovery, and monitoring the progression of neurodegenerative diseases such as Parkinson's. However, developing functional pervasive systems is a complex task that entails the creation of appropriate sensing platforms, integration of versatile technologies for data stream...
Surveillance operators today commonly monitor a large number of CCTV screens, trying to solve the complex cognitive tasks of analyzing crowd behavior and detecting threats and other abnormal behavior. Information overload is the rule rather than the exception. Moreover, CCTV footage lacks important indicators revealing certain threats, and can also in other respects be complemented by data from...
Lipreading is a main part of audio-visual speech recognition systems, which often suffer from redundancy in the extracted features. In this paper, a new approach is proposed to increase lipreading performance by extracting discriminant features. First, faces are detected; then lip key points are extracted, with four cubic curves characterizing the lip contours. Next, the visual...
This research proposes a model of a multidimensional metadata generation approach for detecting human action in video. The idea is to develop a multidimensional multimodal framework that uses a semantic approach at the action recognition and classification level. The main idea is that the inputs and outputs of the model will be the results of recognition processes from different modalities...
This paper presents a method that integrates audio and visual information for action scene analysis in any movie. The approach is top-down: it determines and extracts action scenes in video by analyzing both audio and video data. In this paper, we directly modelled the hierarchy and shared structures of human behaviours, and we present a framework of the hidden Markov model based application...