This paper presents a novel local posture orientation-context descriptor and proposes an FDDL (Fisher discriminant dictionary learning) method based on local orientation preservation (LOP-FDDL) for sparse coding in the action recognition task. To make full use of the information about the position of the local body part relative to the center of the torso, and the spatial-temporal shape changes of the human...
Scene text extraction is a challenging task owing to disturbing factors such as complex image backgrounds and varied text appearance (sizes, colors, styles and alignments). This paper proposes a scene text extraction approach based on the novel concept of ‘symmetrical edge-point pairs’ (‘point-pair’), which is adopted to describe the sizes, directions and brightness information of...
As technology advances, computer vision plays an important role in enhancing smart computing systems that help people overcome obstacles in their daily lives. One common problem is the limit of human memory, especially when it comes to remembering things such as personal items. It is annoying for people to waste time finding lost items manually by recall or notes. This motivates...
Natural disasters such as earthquakes and tsunamis often have a devastating effect on human life and cause considerable damage to infrastructure. Active research has been ongoing to mitigate the impact of these catastrophes and prevent economic losses. The existing methods that utilize pre-event and post-event images not only require the immediate and guaranteed availability of the appropriate...
Realizing the automated and online detection of crowd anomalies from surveillance CCTVs is a research-intensive and application-demanding task. This research proposes a novel technique for detecting crowd abnormalities through analyzing the spatial and temporal features of the input video signals. This integrated solution defines an image descriptor that reflects the global motion information over...
A convolutional neural network extracts features from input data and classifies them with end-to-end learning. In this paper we test the performance of combining a convolutional neural network with texture features, which play an effective role in SAR image classification. For this purpose we create local images for each pixel in the image, taking into consideration neighbor pixels and pixel-based...
In this paper we focus on the characterization of singing styles in world music. We develop a set of contour features capturing pitch structure and melodic embellishments. Using these features we train a binary classifier to distinguish vocal from non-vocal contours and learn a dictionary of singing style elements. Each contour is mapped to the dictionary elements and each recording is summarized...
Estimation of people density in extremely dense crowded scenes is very challenging due to perspective differences, few pixels per target, clutter, complex backgrounds, etc. Most of the existing work is unable to handle crowds of hundreds or thousands. At this level of density, one feature is not enough to estimate the total density of an image. We propose a hybrid model which relies on multiple source...
Recognition of spatial relations between pairs of subexpressions is a key problem in the recognition of handwritten mathematical expressions. Most methods for spatial relation classification are based on handcrafted rules and geometric indices extracted from the subexpression bounding boxes. In this work, we propose new spatial relation features that combine subexpression bounding box and intra-subexpression...
This paper presents a method for human action recognition from depth sequences. First, we subdivide the normalized motion energy vector into a set of segments, whose corresponding frame indices are used to partition a video. Then each sub-action is represented by three Depth Motion Maps (DMMs) to capture motion cues in three orthogonal projection views. Multi-scale Histogram of Oriented Gradients...
In the transition from traditional to digital musicology, large-scale music data are increasingly becoming available, and they require research methods that work on the collection level and at scale. In the Digital Music Lab (DML) project, a software system has been developed that provides large-scale analysis of music audio with an interactive interface. The DML system includes distributed processing...
Interaction experience in multimedia systems can be improved by adding personalization. Current applications for building and animating characters to represent real users are typically based on pose and motion detection. In doing so, computer vision algorithms do not exploit the anatomical characteristics of the human body to improve their classification accuracy. This work presents a strategy...
The Shuttle Radar Topography Mission (SRTM) C-band product is an important source of elevation data at regional scale. The 1″ SRTM elevation data over China were released in 2014. A detailed evaluation of the 1″ SRTM data in China is still lacking. In this paper, the quality of the 1″ SRTM data is assessed by comparison with 25 m DEM data based on 1:50000 scale topographic maps, taking a loess hilly area in China as an example...
This paper introduces a novel multi-layer line grouping method for perceptual building extraction from stereo aerial images. Perceptual grouping algorithms for line features obtained from images have been widely investigated, but the existing edge grouping literature pays little attention to the building height information of the line segments. In order to enhance...
Local patterns have two problems: 1) traditional local pattern methods consider only the frequency of each pattern and ignore the co-occurrence information between adjacent pixel pairs in the image; 2) traditional methods are limited to gray texture analysis, ignoring the importance of color information. To address the above problems, a novel method is proposed for color image retrieval...
Due to the wide variety of copied videos, existing video copy detection methods that use a single feature face great challenges, especially in video content matching, and have difficulty dealing with the various copy transformations. To overcome this problem, a video copy detection method based on sparse representation of MPEG-2 spatial and temporal features is proposed in this paper. Firstly, the...
Feature extraction reduces the amount of information needed to describe the properties of an image accurately. This paper measures the performance of a CBIR system based on texture features against one based on a combination of color and texture features. A Gray Level Co-occurrence Matrix is calculated for computing the texture feature of an image. Using these texture parameters, similar images are extracted...
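The Gray Level Co-occurrence Matrix named in the abstract above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `glcm` and `texture_features`, the single horizontal offset, and the choice of contrast, energy and homogeneity as texture parameters are assumptions made for illustration only.

```python
import numpy as np


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of gray levels at offset (dy, dx).

    Assumption: `image` holds integer gray levels in the range [0, levels).
    """
    image = np.asarray(image)
    h, w = image.shape
    matrix = np.zeros((levels, levels), dtype=np.float64)
    # Visit only pixels whose neighbor at (r + dy, c + dx) lies inside the image.
    for r in range(max(0, -dy), min(h, h - dy)):
        for c in range(max(0, -dx), min(w, w - dx)):
            matrix[image[r, c], image[r + dy, c + dx]] += 1
    total = matrix.sum()
    return matrix / total if total > 0 else matrix


def texture_features(p):
    """A few common texture statistics computed from a normalized GLCM."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }


if __name__ == "__main__":
    # Hypothetical 4x4 image with four gray levels.
    sample = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    print(texture_features(glcm(sample, levels=4)))
```

In a retrieval setting, such a feature vector would be computed for every database image and compared with the query image's vector using a distance measure; which features and which distance the paper uses is not stated in the visible part of the abstract.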
The number of digital images is increasing day by day, and mining from large databases is becoming harder and harder. Indexing image data based on text is tiresome and error-prone. If the indexing is based on low-level features of the image, it may reduce the workload and make mining faster. In this research paper we propose an indexing technique which indexes the digital images in the database...
Video summarization refers to the process of recapitulating a video stream by producing an abstract of the salient keyframes that cover its overall content. However, efficient video summarization requires efficient video Shot Boundary Detection (SBD) and keyframe extraction. Against this backdrop, this paper presents a novel and efficient approach for video SBD and keyframe extraction that...
Given a video or time series of skeleton data, action recognition systems perform classification using cues such as motion, appearance, and pose. For the past decade, actions have been modeled using low-level feature representations such as Bag of Features. More recent work has shown that mid-level representations that model body part movements (e.g., hand moving forward) can be very effective. However,...