While delineation of aortic aneurysms has been the subject of research in several publications, this represents the first contribution to address segmentation of thrombus in the case of aortic dissections. The segmentation is performed on multiplanar reformatted slices (MPRs). In 3D CTA data, thrombus hardly differs from surrounding tissue outside the aorta. Segmentation is further complicated by the high...
This work shows how to improve hyperspectral image classification by using both a deep representation and contextual information. To this end, it proposes a new Conditional Random Field (CRF) model, named DBN-CRF, with potentials defined over deep features produced by Deep Belief Networks (DBNs). The newly formulated DBN-CRF model takes advantage of the strength of the...
Scene text is one of the most important information sources in our daily lives because it serves particular functions such as disambiguation and navigation. In contrast, ordinary document text has no such function. Consequently, it is natural to hypothesize that scene text and document text have different characteristics. This paper tries to prove this hypothesis by semantic analysis of texts by...
Institutes and libraries around the globe are preserving literary heritage by digitizing historical documents. However, to make this data easily accessible, the scanned documents need to be transformed into searchable text. State-of-the-art OCR systems using Long Short-Term Memory (LSTM) networks have been applied successfully to recognize text in both printed and handwritten form. Besides the...
Automatic classification of Human Epithelial Type-2 (HEp-2) specimen patterns is an important yet challenging problem in medical image analysis. Most prior work has focused on the cell image classification problem, which is one of the early essential steps in the system pipeline, while less attention has been paid to the classification of whole-specimen images. In this work, a specimen pattern...
In this paper, we proposed a novel framework for facial expression recognition, in which face images were taken as vertices in a hypergraph and the task of expression recognition was formulated as a problem of hypergraph-based inference. A hybrid strategy was developed to construct hyperedges: we generated probabilities of facial action units by deep convolutional networks and took each action unit...
In this paper, we focus on the text/non-text classification problem: distinguishing images that contain text from a large number of natural images. To this end, we propose a novel neural network architecture, termed Convolutional Multi-Dimensional Recurrent Neural Network (CMDRNN), which distinguishes text/non-text images by classifying local image blocks, taking both region pixels and dependencies among blocks...
We consider the problem of joint modeling of videos and their corresponding textual descriptions (e.g. sentences or phrases). Our approach consists of three components: the video representation, the textual representation, and a joint model that links videos and text. Our video representation uses the state-of-the-art deep 3D ConvNet to capture the semantic information in the video. Our textual representation...
Hierarchical decomposition enables an increased number of classes in a classification problem. Class similarities guide the creation of a family of coarse-to-fine classifiers, which solve categorical problems more effectively than a single flat classifier. High accuracy requires a precise configuration for each classifier in the family. This paper proposes a method to adaptively select the configuration...
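The coarse-to-fine idea in the abstract above can be illustrated with a minimal sketch: a coarse classifier first picks a group of similar classes, then a group-specific fine classifier picks the final class. This is not the paper's method (the adaptive configuration selection is not shown). The nearest-centroid classifiers, the function names `nearest_centroid` and `hierarchical_classifier`, and the toy class labels are all assumptions made for illustration.

```python
from typing import Callable, Dict, Sequence

# A classifier maps a feature vector to a label (hypothetical type alias).
Classifier = Callable[[Sequence[float]], str]


def nearest_centroid(centroids: Dict[str, Sequence[float]]) -> Classifier:
    """Build a classifier that returns the label of the closest centroid."""
    def classify(x: Sequence[float]) -> str:
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
        )
    return classify


def hierarchical_classifier(coarse: Classifier, fine: Dict[str, Classifier]) -> Classifier:
    """Route a sample through the coarse classifier, then the matching fine one."""
    def classify(x: Sequence[float]) -> str:
        group = coarse(x)      # stage 1: pick a group of similar classes
        return fine[group](x)  # stage 2: pick the class within that group
    return classify


# Toy setup: two groups, each with its own fine classifier (assumed data).
coarse = nearest_centroid({"animals": [0.0, 0.0], "vehicles": [10.0, 10.0]})
fine = {
    "animals": nearest_centroid({"cat": [0.0, 1.0], "dog": [1.0, 0.0]}),
    "vehicles": nearest_centroid({"car": [10.0, 11.0], "bus": [11.0, 10.0]}),
}
classify = hierarchical_classifier(coarse, fine)
```

Because each fine classifier only separates the classes within its own group, each one could in principle be configured independently, which is the setting the abstract describes.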
A reliable automatic system for Human Epithelial-2 (HEp-2) cell image classification can facilitate the diagnosis of systemic autoimmune diseases. In this paper, an automatic pattern recognition system using a fully convolutional network (FCN) was proposed to address the HEp-2 specimen classification problem. The FCN in the proposed framework was adapted from VGG-16, which was trained with ICPR 2016 dataset...
In scene analysis, the availability of an initial background model that describes the scene without foreground objects underlies many computer vision applications. Multi-modal models of the scene background are frequently adopted in these applications, where each mode tries to keep track of the multiple background modes observed along the sequence. In this work we specifically address the problem...
Ocular biometrics in the visible spectrum has emerged as an area of significant research activity. In this paper, we propose two convolution-based models for verifying a pair of periocular images containing the iris, and compare the two approaches with each other as well as with a baseline model. In the first approach, we perform deep learning in an unsupervised manner using a stacked convolutional...
The task of the ChaLearn Apparent Personality Analysis: First Impressions Challenge is to rate/quantify personality traits of users in short video sequences. Although the validity of personality judgments from short interactions is questionable, studies show the possibility of predicting attributed traits (First Impressions) using facial [15] and acoustic [13] features. The challenge introduces a...
This paper addresses the problem of continuous gesture recognition from sequences of depth maps using Convolutional Neural Networks (ConvNets). The proposed method first segments individual gestures from a depth sequence based on the quantity of movement (QOM). For each segmented gesture, an Improved Depth Motion Map (IDMM), which converts the depth sequence into one image, is constructed and fed to a...
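The two preprocessing steps named in the abstract above can be sketched as follows. This is a hedged illustration, not the paper's implementation: it assumes QOM is the number of pixels whose depth differs from a reference (rest) frame by more than a threshold, and it builds a plain depth motion map by accumulating absolute differences between consecutive frames. It does not reproduce the paper's "Improved" variant, which the abstract does not specify. All function names and thresholds are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # a depth map stored as rows of depth values


def quantity_of_movement(frame: Frame, reference: Frame, pixel_threshold: int = 10) -> int:
    """Count pixels whose depth differs from the reference by more than the threshold."""
    return sum(
        1
        for row, ref_row in zip(frame, reference)
        for depth, ref_depth in zip(row, ref_row)
        if abs(depth - ref_depth) > pixel_threshold
    )


def segment_gestures(
    frames: List[Frame], reference: Frame, qom_threshold: int, pixel_threshold: int = 10
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of runs of frames with high QOM."""
    segments = []
    start = None
    for i, frame in enumerate(frames):
        moving = quantity_of_movement(frame, reference, pixel_threshold) > qom_threshold
        if moving and start is None:
            start = i                   # a gesture begins
        elif not moving and start is not None:
            segments.append((start, i))  # the gesture ended at the previous frame
            start = None
    if start is not None:               # sequence ended mid-gesture
        segments.append((start, len(frames)))
    return segments


def depth_motion_map(frames: List[Frame]) -> Frame:
    """Collapse a depth sequence into one image of accumulated absolute frame differences."""
    height, width = len(frames[0]), len(frames[0][0])
    dmm = [[0] * width for _ in range(height)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                dmm[y][x] += abs(cur[y][x] - prev[y][x])
    return dmm
```

In this sketch, each `(start, end)` pair from `segment_gestures` selects a slice of frames that `depth_motion_map` turns into a single image, which could then be passed to an image classifier.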
Multi-script writer identification consists of identifying the writer of a given text written in one script from samples by the same person written in another script. The rationale behind this is that the writing style of an individual remains constant across different scripts. While this hypothesis may hold, recent results on a multi-script writer identification competition show that classical...
In this article, a new strategy for single-image super-resolution is proposed. A selective sparse coding strategy based on patch sharpness is assumed to be invariant to patch resolution. This sharpness criterion is used at the training stage to classify image patches into different clusters. It is suggested that the use of coupled dictionary learning with a mapping function can improve the representation...
Deep neural networks are state-of-the-art methods for many learning tasks due to their ability to extract increasingly better features at each network layer. However, the improved performance of additional layers in a deep network comes at the cost of added latency and energy usage in feedforward inference. As networks continue to get deeper and larger, these costs become more prohibitive for real-time...
In this paper we propose a novel end-to-end framework for mathematical expression (ME) recognition. The method uses a convolutional neural network (CNN) to perform mathematical symbol detection and recognition simultaneously incorporating spatial context, and can handle multi-part and touching symbols effectively. To evaluate the performance, we provide a benchmark that contains MEs both from real-life...
Face detection is a vital step in the process of extracting semantic information about the driver's state, such as distraction and fatigue, from pixel values in images of the driver. Therefore, in time- and safety-critical situations like driving, efficient use of time and reliable detection of faces are essential. While challenges like lighting and occlusion are prevalent in the...
The color constancy problem is addressed by structured-output regression on the values of the fully-connected layers of a convolutional neural network. AlexNet and VGG are considered, with VGG slightly outperforming AlexNet. The best results were obtained with the first fully-connected “fc6” layer and with multi-output support vector regression. Experiments on the SFU Color Checker and Indoor Dataset...