This paper presents a multi-attribute sparse coding approach for facial expression recognition by regarding Action-Units (AUs) as attributes. AUs describe the movements of individual facial muscles, which are detected from corresponding attribute masks in this work. They can not only be used to describe a group property that enforces, as far as possible, basis selection from groups with the same AUs,...
We present Sparse Coding trees (SC-trees), a sparse coding-based framework for resolving misclassifications arising when multiple classes map to a common set of features. SC-trees are novel supervised classification trees that use node-specific dictionaries and classifiers to direct input based on classification results in the feature space at each node. We have applied SC-trees to emotion classification...
Recently, sparse coding based image representation has achieved state-of-the-art recognition results on many benchmarks. In this paper, we propose the Multi-cue Normalized Non-Negative Sparse Encoder (MN3SE), which enforces both a non-negative constraint and a shift-invariant constraint on top of the traditional sparse coding criteria, and uses multiple cues to further boost performance. The former...
This paper presents evolutionary multi-objective approaches to tune parameters in Trace transform for invariant feature construction. It is well-known that the Trace transform involves three functionals applied consecutively to the image to produce real numbers called Triple features representing the input image. Traditionally, these functionals are chosen empirically, and the image sampling parameters...
One of the main factors limiting the wider adoption of ultrasound imaging for diagnosis and therapy is the need for highly skilled sonographers. In this paper we consider the challenge of making this technology easier to use for non-experts. Our approach follows some of the recently proposed frameworks that break the process into firstly data acquisition through a simple and task-specific scan protocol...
This paper formulates the classification of patterns into a predefined set of classes as a missing data task. This is achieved by first augmenting the feature vector of each training pattern with the corresponding binary codeword representing its class. A Restricted Boltzmann Machine (RBM) or a Dictionary Learning (DL) algorithm is then trained on the augmented feature space. During the classification stage,...
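The augmentation step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only says "binary codeword", so the one-hot encoding used here is an assumption, and the function names `class_codeword` and `augment` are hypothetical. The classification stage is truncated in the abstract and is deliberately not sketched.

```python
def class_codeword(label, num_classes):
    """Return a binary codeword for a class index.

    Assumption: a one-hot code. The abstract does not say which
    binary code the authors use.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def augment(features, label, num_classes):
    """Append the class codeword to a training pattern's feature vector."""
    return list(features) + class_codeword(label, num_classes)


# A 2-dimensional pattern of class 1 (out of 3 classes) becomes 5-dimensional.
augmented = augment([0.5, 1.2], 1, 3)
print(augmented)
```

A model trained on such augmented vectors sees the features and the class code as one joint input, which is what allows the class part to be treated as missing data later on.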
Rapid growth in information technology and communication networking has increased professionals' interest in the storage and archival of multimedia video data. Efficient and accurate retrieval of archived video data is an essential need of many professional groups, such as researchers, analysts, journalists and historians. Textual metadata based video retrieval is intuitive and subject to human perception...
Content-based video classification is becoming necessary for video analysis applications to handle the huge amounts of video data generated and shared across the Internet. This paper proposes the use of DTTBTC for color-based feature extraction from video key frames, which machine learning classifiers then use for training and testing. Experimental results show accuracy...
Research into deep learning has demonstrated performance competitive with humans on some visual tasks; however, these systems have been primarily trained through supervised and unsupervised learning algorithms. Alternatively, research is showing that evolution may have a significant role in the development of visual systems. Thus neuroevolution for deep learning is investigated in this paper. In particular,...
This work utilizes the coding information in HEVC for video copy detection. Both directional modes and residual coefficients of the I-frames are employed as the texture features for matching. These features are robust against different quantization parameters and different frame sizes. The accuracy is comparable with traditional pixel domain approaches.
With the development of artificial intelligence and pattern recognition, facial expression recognition plays an increasingly important role in intelligent human-computer interaction. In this paper, we present a model named K-order emotional intensity model (K-EIM), which is based on K-Means clustering. Unlike other related works, the proposed approach can quantify emotional intensity in an...
Visual codebook based quantization of robust appearance descriptors extracted from local image patches is an effective means of capturing image statistics for object classification. A codebook is usually constructed by using a clustering method such as k-means at object level or image level. The codebook is global. For fine-grained categorization and recognition problems, however, the global object-level...
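The global-codebook baseline mentioned in this abstract (k-means clustering followed by quantization of local descriptors) can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the paper's method: the helper names `kmeans` and `bow_histogram` are hypothetical, the initialization simply takes the first k points, and the 2-D toy descriptors stand in for real patch descriptors.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means. Assumption: naive init from the first k points."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def bow_histogram(descriptors, codebook):
    """Count how many descriptors fall on each codeword."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest(d, codebook)] += 1
    return hist


# Toy descriptors: two well-separated groups give a 2-word codebook.
train = [[0, 0], [10, 10], [0, 1], [10, 11]]
codebook = kmeans(train, k=2)
print(bow_histogram([[0, 0.4], [9, 10], [1, 1]], codebook))
```

Because every image is described with the same codebook, the histogram is a fixed-length vector regardless of how many patches an image contains, which is what makes the representation convenient for a downstream classifier.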
Deep brain stimulation (DBS) of the Subthalamic Nucleus (STN) is the best method for treating advanced Parkinson's disease (PD), leading to striking improvements in motor function and quality of life of PD patients. During DBS, online analysis of microelectrode recording (MER) signals is a powerful tool to locate the STN. Therapeutic outcomes depend on a precise positioning of a stimulator device in the...
Many mid-level representations have been developed to replace the traditional bag-of-words model (VQ+k-means), such as sparse coding, OMP-k with k-SVD, and Fisher vector with GMM in the image domain. These approaches can be split into a dictionary learning phase and a feature encoding phase, which are often closely related. In this paper, we jointly evaluate the effect of these two phases for video-based...
Most of the literature has relied on image processing approaches such as skin detection and depth thresholding for hand detection. These techniques are restricted by strong assumptions and normally show low robustness in actual applications. In this paper, we focus on an appearance approach and propose a new feature extraction method based on sparse pixel-pair-wise intensity comparisons for hand...
This paper presents a method of texture analysis based on regional rank. The method, named regional rank coding, first determines the rank of the gray level of each pixel within a region whose size and shape depend on the gray level of the pixel being processed; the regional rank code is then calculated from that rank and the gray level of the processed pixel. This encoding allows...
Text signage as a visual indicator in natural scenes plays an important role in navigation and notification in our daily life. Most previous methods of scene text extraction are developed from a single scene image. In this paper, we propose a multi-frame based scene text recognition method by tracking text regions in a video captured by a moving camera. The main contributions of this paper are as follows...
Representing images with their descriptive features is the fundamental problem in CBIR. Feature coding, as a key step in feature description, has attracted attention in recent years. Among the proposed coding strategies, Bag-of-Words (BoW) is the most widely used model. Recently, saliency has been mentioned as the fundamental characteristic of BoW. Based on this idea, Salient Coding (SaC) has been...
Many artificial intelligence techniques have been developed to process the constantly increasing volume of data to extract meaningful information from it. The accurate annotation of the unknown protein using the classification of the protein sequence into an existing superfamily is considered a critical and challenging task in bioinformatics and computational biology. This classification would be...
To segment video shots quickly and efficiently, this paper proposes a video shot boundary detection algorithm that combines Macro Block (MB) coding mode with Scale Invariant Feature Transform (SIFT) feature point matching. Firstly, the MB coding mode is extracted from the H.264/AVC bit stream to calculate the intra coding MB ratio of a frame; according to the intra coding MB ratio, some frames are selected...