In light of the powerful learning capability of deep neural networks (DNNs), deep (convolutional) models have been built in recent years to address the task of salient object detection. Although training such deep saliency models can significantly improve the detection performance, it requires large-scale manual supervision in the form of pixel-level human annotation, which is highly labor-intensive...
While most existing approaches for detection in videos focus on objects or human actions separately, we aim at jointly detecting objects performing actions, such as cat eating or dog jumping. We introduce an end-to-end multitask objective that jointly learns object-action relationships. We compare it with different training objectives, validate its effectiveness for detecting objects-actions in videos,...
A major impediment in rapidly deploying object detection models for instance detection is the lack of large annotated datasets. For example, finding a large labeled dataset containing instances in a particular kitchen is unlikely. Each new environment with new instances requires expensive data collection and annotation. In this paper, we propose a simple approach to generate large annotated instance...
Weakly supervised object localization remains challenging, where only image labels instead of bounding boxes are available during training. Object proposal is an effective component in localization, but often computationally expensive and incapable of joint optimization with some of the remaining modules. In this paper, to the best of our knowledge, we for the first time integrate weakly supervised...
We define the object detection from imagery problem as estimating a very large but extremely sparse bounding box dependent probability distribution. Subsequently we identify a sparse distribution estimation scheme, Directed Sparse Sampling, and employ it in a single end-to-end CNN based detection model. This methodology extends and formalizes previous state-of-the-art detection models with an additional...
We describe a method to produce a network where current methods such as DeepFool have great difficulty producing adversarial samples. Our construction suggests some insights into how deep networks work. We provide a reasonable analysis suggesting that our construction is difficult to defeat, and show experimentally that our method is hard to defeat with both Type I and Type II attacks using several standard...
Extending state-of-the-art object detectors from image to video is challenging. The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc. Existing work attempts to exploit temporal information on box level, but such methods are not trained end-to-end. We present flow-guided feature aggregation, an accurate and end-to-end learning...
Imagery texts are usually organized as a hierarchy of several visual elements, i.e. characters, words, text lines and text blocks. Among these elements, the character is the most basic one for various languages such as Western, Chinese, Japanese, mathematical expressions, etc. It is natural and convenient to construct a common text detection engine based on character detectors. However, training character...
QRS complex detection methods have been extensively studied over the past several decades, and the current common QRS detection algorithms can achieve high detection accuracy on the open-access ECG database. Although a large body of research exists on the performance of QRS detectors, the effect of the ECG signal gain is usually ignored and has not attracted researchers' attention in past studies...
The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In...
Despite their success for object detection, convolutional neural networks are ill-equipped for incremental learning, i.e., adapting the original model trained on a set of classes to additionally detect objects of new classes, in the absence of the initial training data. They suffer from “catastrophic forgetting”, an abrupt degradation of performance on the original set of classes, when the training...
Learned boundary maps are known to outperform handcrafted ones as a basis for the watershed algorithm. We show, for the first time, how to train watershed computation jointly with boundary map prediction. The estimator for the merging priorities is cast as a neural network that is convolutional (over space) and recurrent (over iterations). The latter allows learning of complex shape priors. The method...
In this paper, we propose an integrated system for scale-variance pedestrian detection. It consists of two cascaded components: a multi-layer detection neural network (MLDNN) for scale-variance pedestrian detection, and a fast decision forest (FDF) for boosting detection performance with only a slight decrease in speed. Experimental results on the Caltech Pedestrian dataset show that our approach...
Robust detection of the smallest circulating cerebral microemboli is an efficient way of preventing cerebrovascular accidents (CVA). Transcranial Doppler ultrasound is widely considered the most convenient system for the detection of microemboli. Standard detection used in commercial devices is achieved through the whole Doppler energy spectrum where constant empirical thresholds are implemented...
Robust detection of the smallest circulating cerebral micro-emboli is an efficient way of preventing cerebrovascular accidents. Transcranial Doppler ultrasound is widely considered the most convenient system for the detection of micro-emboli. Commercially realized standard detection is achieved through the whole Doppler energy spectrum where constant empirical thresholds are implemented. In this...
In the present paper, we propose a deep network architecture in order to improve the accuracy of pedestrian detection. The proposed method contains a proposal network and a classification network that are trained separately. We use a single shot multibox detector (SSD) as a proposal network to generate the set of pedestrian proposals. The proposal network is fine-tuned from a pre-trained network by...
In this paper, we present a novel approach for real-time object identification on a mobile platform. First, our system detects keypoints with a scaled pyramid-based FAST detector and then descriptors of the object of interest are computed using an Analytical Fourier-Mellin transform. The Fourier-Mellin transform is used in similarity studies due to its invariance property and discrimination power. In this...
Boosted forest (BF) is a commonly used method for object detection. With the help of a cascade strategy, it can efficiently reject non-object windows and finally, combined with the sliding-window paradigm, give the locations of target objects in an image. In the literature, many aspects of cascaded boosted forest (CBF) have been well studied, such as image representation, tree split and cascade structure...
Audio Event Detection (AED) aims to recognize sounds within audio and video recordings. AED employs machine learning algorithms commonly trained and tested on annotated datasets. However, available datasets are limited in the number of samples and hence it is difficult to model acoustic diversity. Therefore, we propose combining labeled audio from a dataset and unlabeled audio from the web to improve...
There is a common observation that audio event classification is easier to deal with than detection. So far, this observation has been accepted as a fact, but a careful analysis is lacking. In this paper, we examine the rationale behind this observation and, more importantly, leverage it to benefit the audio event detection task. We present an improved detection pipeline in which a verification step is appended...