The development of smart cities has attracted much attention in both the research community and industry. Smart healthcare, communication, and infrastructure are required for the development of smart cities. Security is one of the major concerns in the development of smart cities. Automatic surveillance helps boost security in multiple areas such as traffic, hospitals, schools, and industries. Video camera...
Pedestrian detection is an active area of research, and the advent of autonomous vehicles for smarter mobility has spearheaded research in this field. In this paper, the design of a real-time pedestrian detection system for autonomous vehicles is proposed, and its performance is evaluated using images from standard datasets as well as real-time video input. The proposed system is designed...
To address the problems that anomalous samples are scarce and that the model is susceptible to abnormal data, this paper introduces the kernel trick into the construction of the projection classifier and constructs three kinds of projection one-class classifiers: Projection Support Vector Data Description (PSVDD), Projection K-means (PK-means) and Projection K-centers (PK-centers)...
Temporal segmentation of facial expressions in video sequences is an important and relatively unexplored problem in facial image analysis. The difficulties of temporal segmentation include irregular facial behavior, large variability in facial gestures and moderate to large head motion. To solve those problems, we propose a two-step method to segment facial expression temporally, which consists of...
This paper attempts to represent the mapped data in the radial basis function (RBF) feature space under non-negativity constraints and develops an RBF-kernel-based non-negative matrix factorization (KNMF-RBF) algorithm. Based on an objective function with the Frobenius norm, we obtain the multiplicative update rules of our KNMF-RBF approach using kernel theory and the gradient descent method. The proposed...
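For readers unfamiliar with the RBF kernel named in the entry above, the following is a minimal sketch of its conventional definition, k(x, y) = exp(-gamma * ||x - y||^2). It does not reproduce the KNMF-RBF algorithm or its update rules, which the truncated abstract does not give. The function names and the `gamma` parameter are hypothetical, chosen only for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Conventional RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2).

    `gamma` is an assumed width parameter, not a value taken from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def rbf_kernel_matrix(points, gamma=1.0):
    """Pairwise kernel (Gram) matrix for a list of equal-length vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


# Illustrative data only; these points are not from the paper.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = rbf_kernel_matrix(points, gamma=0.5)
```

Every entry of such a kernel matrix is strictly positive and the diagonal is 1, so the matrix itself is non-negative, which is one reason this kernel is a natural candidate for use alongside non-negativity constraints.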
Owing to its prominence as a diagnostic tool for probing the neural correlates of cognition, neuroimaging tensor data has been the focus of intense investigation. Although many supervised tensor learning approaches have been proposed, they either cannot capture the nonlinear relationships of tensor data or cannot preserve the complex multi-way structural information. In this paper, we propose a Multi-way...
Temporal action localization is an important yet challenging problem. Given a long, untrimmed video consisting of multiple action instances and complex background contents, we need not only to recognize their action categories, but also to localize the start time and end time of each instance. Many state-of-the-art systems use segment-level classifiers to select and rank proposal segments of pre-determined...
Row-wise exposure delay present in CMOS cameras is responsible for skew and curvature distortions known as the rolling shutter (RS) effect while imaging under camera motion. Existing RS correction methods resort to using multiple images or tailor scene-specific correction schemes. We propose a convolutional neural network (CNN) architecture that automatically learns essential scene features from a...
Scene flow describes the motion of 3D objects in the real world and could potentially be the basis of a good feature for 3D action recognition. However, its use for action recognition, especially in the context of convolutional neural networks (ConvNets), has not been previously studied. In this paper, we propose the extraction and use of scene flow for action recognition from RGB-D data. Previous works...
Person Re-identification (ReID) aims to identify the same person across different cameras. It is a challenging task due to large variations in person pose, occlusion, background clutter, etc. How to extract powerful features is a fundamental problem in ReID and remains open today. In this paper, we design a Multi-Scale Context-Aware Network (MSCAN) to learn powerful features over full...
One of the recent trends [31, 32, 14] in network architecture design is stacking small filters (e.g., 1x1 or 3x3) throughout the entire network, because stacked small filters are more efficient than a large kernel given the same computational complexity. However, in the field of semantic segmentation, where we need to perform dense per-pixel prediction, we find that the large kernel (and effective receptive...
Recently, DNN model compression based on network architecture design, e.g., SqueezeNet, has attracted a lot of attention. No accuracy drop on image classification is observed on these extremely compact networks, compared to well-known models. An emerging question, however, is whether these model compression techniques hurt DNNs' learning ability beyond classifying images on a single dataset. Our preliminary...
Due to variations in pose, angle and illumination conditions, a person's appearance differs significantly between two views, which makes person re-identification (re-id) intrinsically difficult. In this paper, we propose a person re-id method which learns Convolutional Neural Network (CNN) feature representations from joint-dataset learning. The CNN features extracted from all levels of...
In this paper, we propose a no-reference video quality assessment (VQA) method based on a Convolutional Neural Network (CNN) and Multi-Regression (CNN-MR). It is universal for non-specific types of distortion. First, we introduce a 2D convolutional neural network into the VQA model to learn the spatial quality features at frame level. Second, the motion information is extracted as temporal...
The paucity of videos in current action classification datasets (UCF-101 and HMDB-51) has made it difficult to identify good video architectures, as most methods obtain similar performance on existing small-scale benchmarks. This paper re-evaluates state-of-the-art architectures in light of the new Kinetics Human Action Video dataset. Kinetics has two orders of magnitude more data, with 400 human...
The Non-Local Means (NL-Means) algorithm is an effective denoising algorithm, but it depends heavily on how the non-local similar image blocks are measured. In order to obtain as many similar image blocks as possible, we propose the Discrete Fourier Transform of local Gabor feature (LGF-DFT), which is rotation invariant and noise robust, to measure image-block similarity, and fully utilize structural information...
To overcome the limitations of manual features and obtain the operating characteristics of equipment in complex operation processes, different deep learning models have been applied to industrial data, improving classification accuracy while introducing other limitations. In this paper, a deep hybrid model named Stochastic Convolutional and Deep Belief Network (SCDBN), which assembles...
This paper proposes a gait recognition approach for identifying people under bag-carrying and clothing variations, thus improving recognition performance. The proposed method is based on detail wavelet features extracted from the Haar-wavelet decomposition of dynamic areas in the Gait Energy Image (GEI). Spectral Regression Kernel Discriminant Analysis (SRKDA) is then...
Recently, deep learning has been introduced to classify hyperspectral images (HSIs) and has achieved effective performance. In general, previous networks are not deep enough, so they might not extract sufficiently discriminative features for classification. In addition, they do not consider the strong correlations among different hierarchical layers. To address these two problems, a hybrid deep residual network is presented...
In this work, we develop a new framework to combine ensemble learning and composite kernel learning for hyperspectral image classification. We refer to it as multiple composite kernel learning, which is based on an iterative architecture. More specifically, in each iteration, we use the rotation-based ensemble to create a rotation matrix, which is used to generate rotated features for both spectral...