The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
We present Deeply Supervised Object Detector (DSOD), a framework that can learn object detectors from scratch. State-of-the-art object objectors rely heavily on the off the-shelf networks pre-trained on large-scale classification datasets like Image Net, which incurs learning bias due to the difference on both the loss functions and the category distributions between classification and detection tasks...
The deployment of deep convolutional neural networks (CNNs) in many real world applications is largely hindered by their high computational cost. In this paper, we propose a novel learning scheme for CNNs to simultaneously 1) reduce the model size; 2) decrease the run-time memory footprint; and 3) lower the number of computing operations, without compromising accuracy. This is achieved by enforcing...
This paper focuses on a novel and challenging vision task, dense video captioning, which aims to automatically describe a video clip with multiple informative and diverse caption sentences. The proposed method is trained without explicit annotation of fine-grained sentence to video region-sequence correspondence, but is only based on weak video-level sentence annotations. It differs from existing...
Face recognition has been widely used in many application areas such as photo album management and information security. Rapid growth of handheld devices and social networks bring new challenges to face recognition algorithm design and system engineering. To be effective on a handheld device, the face recognition model must be simple and lightweight, and also needs to handle the large variations in...
We present a novel boosting cascade based face detection framework using SURF features. The framework is derived from the well-known Viola-Jones (VJ) framework but distinguished by two key contributions. First, the proposed framework deals with only several hundreds of multidimensional local SURF patches instead of hundreds of thousands of single dimensional haar features in the VJ framework. Second,...
Local binary pattern histogram (LBPH) is one of the popular and excellent image texture descriptor. However, conventional LBPH lacks of the description of spatial structure information. This paper proposes an extension of LBPH called Markov chain local binary patterns (MCLBP) to alleviate this limitation. We apply MCLBP to the task of TRECVID video concept detection. Experimental results demonstrate...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.