Learning to recognize pedestrian attributes (such as gender, hair style, and whether a hat is worn) in video surveillance scenarios is critical to a variety of tasks, such as crime prevention and border control. However, it remains challenging because of the low resolution and strong highlights in real surveillance footage, where traditional methods perform poorly. This paper aims to propose a robust...
This paper proposes a shared-hidden-layer deep convolutional neural network (SHL-CNN) for image character recognition. In SHL-CNN, the hidden layers are shared across characters from different languages, performing a universal feature extraction process that aims to learn character traits common to different languages, such as strokes, while the final softmax layer is made language...
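The shared-trunk idea in the abstract above can be illustrated with a minimal sketch. It is not the SHL-CNN implementation: the abstract is truncated, so this assumes each language gets its own softmax head, it replaces the convolutional hidden layers with a single dense layer for brevity, and the class name, layer sizes, language keys, and random initialization are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(weights, bias, x):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def random_layer(rng, n_out, n_in):
    """Hypothetical initialization: small uniform weights, zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class SharedHiddenLayerNet:
    """One hidden layer shared by all languages, one softmax head per language.

    A dense layer stands in for the convolutional hidden layers (assumption).
    """

    def __init__(self, n_in, n_hidden, classes_per_language, seed=0):
        rng = random.Random(seed)
        self.shared = random_layer(rng, n_hidden, n_in)
        self.heads = {
            language: random_layer(rng, n_classes, n_hidden)
            for language, n_classes in classes_per_language.items()
        }

    def features(self, x):
        # Language-independent feature extraction.
        return relu(dense(*self.shared, x))

    def predict(self, x, language):
        # Only the output layer depends on the language.
        return softmax(dense(*self.heads[language], self.features(x)))


if __name__ == "__main__":
    net = SharedHiddenLayerNet(4, 3, {"lang_a": 5, "lang_b": 2})
    print(net.predict([0.1, 0.2, 0.3, 0.4], "lang_a"))
```

In this sketch, adding a language only adds an entry to `heads`; the `shared` parameters are reused unchanged by every head.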
This paper presents a novel scheme for Chinese text recognition in images and videos. It differs from traditional paradigms, which binarize text images, feed the binarized text to an OCR engine, and take the recognized results. The proposed scheme, named grayscale-based Chinese Image Text Recognition (gCITR), performs recognition directly on grayscale pixels via the following steps: image text...
Broadcast news video plays an increasingly important role in our daily life. However, effectively segmenting a broadcast news video into meaningful semantic story units is still a challenging problem. In this paper, we propose a novel unified video structure parsing approach, named multiple style exploration-based news story segmentation (MSE-NSS), to segment broadcast news videos into...
In this paper, a novel binarization technique is introduced for natural scene text, which can be applied after the text location step in order to improve OCR recognition. In the first step, an “optimum” conversion from color image to grayscale image is performed by minimizing the L1-norm distance between the original color image and the image reconstructed on the corresponding optimum projection vector. Based on...
In this paper, a novel approach to segmenting video into topic units is presented. The approach transforms topic unit segmentation into a label identification problem by defining four types of shots that reveal the video's semantic structure. To implement our algorithm, four middle-level features including shot difference signal, scene transition graph, shot theme and...
Near-duplicate image retrieval (NDIR) is an important topic for many applications, such as multimedia content management and copyright infringement identification. In this work we propose a novel NDIR framework based on visual phrases. Compared with previous research, this paper first introduces a spatial visual phrase (SVP) model that captures relative geometry information between visual...
TV commercials play an important role in our lives, and automatic commercial detection is very useful in TV video analysis. Most previous work focuses on the visual and audio features of commercials while ignoring how commercial blocks are distributed across program types and broadcast times. In this paper, we propose a novel method to fuse visual, audio features and global characteristics...
Automatic scene detection is a fundamental step for efficient video searching and browsing. This paper presents our current work on scene detection, which integrates three effective strategies into a single framework. For each video, first, a coherence signal is constructed from a graph model obtained from the similarity matrix within a temporal interval. Second, the signal is optimized by scene transition...
With the fast development of high-speed networks and digital video recording technologies, broadcast video plays an increasingly important role in our daily life. In this paper, we propose a novel news story segmentation scheme that segments broadcast video into story units using a multi-modal information fusion (MMIF) strategy. Compared with traditional methods, the proposed scheme extracts...
By introducing concept detection results into the retrieval process, concept-based video retrieval (CBVR) has been successfully used in semantic content-based video retrieval applications. However, selecting and fusing the appropriate concepts for a specific query is still an important but difficult issue. In this paper, we propose a novel and effective concept selection method, named graph-based...
Event-related queries play an increasingly important role in video retrieval. However, they remain a challenge for existing video retrieval engines, which lack effective motion analysis. In this paper, we propose a novel re-ranking scheme for video retrieval based on motion region trajectory analysis. By focusing on the changes of the primary moving regions, we construct an intuitive motion...