Most existing saliency detection methods overemphasize local contrast while ignoring the global features of the image. Local comparison can reflect the detailed characteristics of an image, but not its overall saliency. In this paper, a saliency detection model combining local and global features is proposed. Firstly, a local feature...
This paper introduces the present status of heart sound recognition. To improve recognition performance, a new model based on the Support Vector Machine (SVM) is proposed. Firstly, the wavelet transform is used to reduce noise in the heart sound, and then MFCC features are extracted from it. On this basis, the SVM is used to build the classification model. In the...
Moving object tracking with discriminative models has been very popular in recent years; it focuses on selecting highly informative features online to maximize the separability between object and background. This paper proposes an adapted particle filter tracker with an online learning and inheriting discriminative model. Top-ranked discriminative features are selected into the appearance model by Online...
Object detection in Very High Resolution (VHR) optical remote sensing images is challenging because objects are usually dense and tiny. Random orientations, varied backgrounds, and unpredictable noise make traditional image processing methods perform poorly. In this paper, we propose using state-of-the-art Region-based Fully Convolutional Networks to solve object detection tasks in aerial...
In face recognition systems, impostors can obtain legitimate identity authentication by presenting printed images, downloaded images, or candid videos to the sensor. In this paper, an enhanced face local binary feature (ELBP) of a face map is extracted as a classification feature to identify whether the face map shows a real face or a fake one. Compared with the dynamic or static methods proposed...
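As a rough illustration of the local binary pattern idea behind the feature named above, the following is a minimal sketch of the basic 8-neighbour LBP operator on a grayscale image stored as a list of rows. It is not the enhanced ELBP variant, whose details the truncated abstract does not give; the function names `lbp_code` and `lbp_histogram` and the clockwise bit ordering are assumptions made for this example.

```python
def lbp_code(image, row, col):
    """Return the basic 8-neighbour LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    # Neighbours visited clockwise, starting from the top-left pixel.
    # This ordering is an illustrative assumption; other orderings are common.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

A histogram of this kind is the sort of vector that could be passed to a classifier to separate real from fake faces, although the classifier and feature details used in the paper are not stated in the text above.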
Building a human-computer interactive parachute simulator is an efficient way to avoid the high risk and high cost of field parachute training. In this paper, a novel dynamic recognition and simulation approach for parachute training is developed. Firstly, we process the skeletal data acquired by Kinect and enforce the indication of the trainees' parachute posture, where principal component analysis...
Recently, very deep convolutional neural networks (CNNs) have been attracting considerable attention in image restoration. However, as the depth grows, the long-term dependency problem is rarely realized for these very deep models, which results in the prior states/layers having little influence on the subsequent ones. Motivated by the fact that human thoughts have persistency, we propose a very deep...
While strong progress has been made in image captioning recently, machine and human captions are still quite distinct. This is primarily due to the deficiencies in the generated word distribution, vocabulary size, and strong bias in the generators towards frequent captions. Furthermore, humans – rightfully so – generate multiple, diverse captions, due to the inherent ambiguity in the captioning task...
The interactive image segmentation model allows users to iteratively add new inputs for refinement until a satisfactory result is finally obtained. Therefore, an ideal interactive segmentation model should learn to capture the user's intention with minimal interaction. However, existing models fail to fully utilize the valuable user input information in the segmentation refinement process and thus...
Convolutional neural networks have shown the ability to learn stereo matching costs. Recent approaches learned parameters from public datasets that have ground truth disparity maps. Because ground truth depth is difficult to label, usable training data is rather limited, making it difficult to apply such systems to real applications. In this paper, we present a framework for learning stereo...
In this paper, we propose a CNN-based framework for online MOT. This framework utilizes the merits of single object trackers in adapting appearance models and searching for the target in the next frame. Simply applying a single object tracker to MOT runs into problems of computational efficiency and drifted results caused by occlusion. Our framework achieves computational efficiency by sharing...
Considering the problems of low recognition rate and poor robustness in traditional recognition algorithms, we propose a license plate character recognition algorithm based on a convolutional neural network. In this paper, we adopt a coarse-to-fine strategy for designing the network architecture. Through the convolutional layers and pooling layers, features of input images are extracted and then sent...
Detecting pedestrians that are partially occluded remains a challenging problem due to variations and uncertainties of partial occlusion patterns. Following a commonly used framework of handling partial occlusions by part detection, we propose a multi-label learning approach to jointly learn part detectors to capture partial occlusion patterns. The part detectors share a set of decision trees via...
In this paper, we address the problem of spatio-temporal person retrieval from videos using a natural language query, in which we output a tube (i.e., a sequence of bounding boxes) which encloses the person described by the query. For this problem, we introduce a novel dataset consisting of videos containing people annotated with bounding boxes for each second and with five natural language descriptions...
We present a novel method for detecting 3D model instances and estimating their 6D poses from RGB data in a single shot. To this end, we extend the popular SSD paradigm to cover the full 6D pose space and train on synthetic model data only. Our approach competes with or surpasses current state-of-the-art methods that leverage RGBD data on multiple challenging datasets. Furthermore, our method produces...
In this paper, we propose a scale-invariant framework based on Convolutional Neural Networks (CNNs). The network exhibits robustness to scale and resolution variations in data. Previous efforts to achieve scale invariance relied on either integrating several variant-specific CNNs or data augmentation. However, these methods did not solve the fundamental problem that CNNs develop different feature...
Image contrast plays an important role in perceived image quality and is also susceptible to various factors during the image acquisition process. However, only a few image quality evaluation algorithms have focused on contrast-changed image quality assessment (IQA), and none of these methods is a blind IQA algorithm. Therefore, they cannot be applied to the case when the reference...
Binaural features of interaural level difference and interaural phase difference have proved to be very effective in training deep neural networks (DNNs) to generate time-frequency masks for target speech extraction in speech-speech mixtures. However, the effectiveness of binaural features is reduced in more common speech-noise scenarios, since the noise may overshadow the speech in adverse conditions...
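To make the two binaural features named above concrete, here is a minimal sketch that computes the interaural level difference (in dB) and interaural phase difference (in radians) for a single time-frequency bin, given the complex spectral coefficients of the left and right channels. The function name `interaural_features`, the left-over-right sign convention, and the small `eps` stabiliser are assumptions for illustration; the truncated abstract does not specify how the features were computed.

```python
import cmath
import math


def interaural_features(left, right, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for one time-frequency bin.

    `left` and `right` are complex spectral coefficients of the two
    channels. The left-over-right convention is an assumption.
    """
    left = complex(left)
    right = complex(right)
    # Level difference: ratio of magnitudes on a decibel scale.
    # eps avoids division by zero and log of zero for silent bins.
    ild_db = 20.0 * math.log10((abs(left) + eps) / (abs(right) + eps))
    # Phase difference: angle of left times the conjugate of right,
    # which cmath.phase returns wrapped into [-pi, pi].
    ipd_rad = cmath.phase(left * right.conjugate())
    return ild_db, ipd_rad
```

In a full system these per-bin values would be computed over every bin of a short-time Fourier transform and stacked into the input features of the network.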
Acoustic event detection (AED) is currently a very active research area with multiple applications in the development of smart acoustic spaces. In this context, the advances brought by Internet of Things (IoT) platforms where multiple distributed microphones are available have also contributed to this interest. In such scenarios, the use of data fusion techniques merging information from several sensors...
Links between issue reports and corresponding fix commits are widely used in software maintenance. The quality of links directly affects maintenance costs. Currently, such links are mainly maintained by error-prone manual efforts, which may result in missing links. To tackle this problem, automatic link recovery approaches have been proposed by building traditional classifiers with positive and negative...