This paper presents a self-learning structure for text localization. The proposed system can improve itself automatically by analyzing unlabelled images. The system consists of three classification modules: a component grader, a component linker, and a group classifier. First, the image is analyzed to obtain the character candidate components. Then, the grader evaluates the possibility...
In this paper, a region-based text localization method that is robust across multiple languages is presented. Maximally Stable Extremal Regions (MSERs) are used to detect candidate text areas. The MSER components are grouped based on their connectivity in a feature space, using a newly proposed rule for assigning the connectivity. The groups of components are classified into three classes that are text...
This paper presents a new approach for a text-based video content retrieval system. The proposed scheme consists of three main processes: key-frame extraction, text localization, and keyword matching. For key-frame extraction, we propose a Maximally Stable Extremal Region (MSER) based feature that is designed to segment shots of the video with different text contents. In text localization...
A new framework is proposed that uses internet-based images to detect objects and estimates the real-world location of the objects via stereo images. This framework provides a self-learning ability for detecting desired objects in the scene without pre-prepared classifiers by harvesting sample images of the objects from the internet. Histogram and co-occurrence matrices of edge orientation are used...
This paper presents a face recognition system based on Self-Organizing Map (SOM) clustering. To reduce the time consumed by nearest-neighbor search, a SOM clustering scheme is used to group the training data and determine prototypes of each group. A local feature selection process is employed to reduce the dimension of the data in each group. To show the performance of the proposed scheme over...
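The idea of searching prototypes first and group members second can be illustrated with a minimal sketch. It is not the authors' implementation: the SOM training and the local feature selection are omitted, the grouping is assumed to be given, and each prototype is assumed to be the mean of its group (in a SOM it would be a node's weight vector). The names `build_index` and `classify` are hypothetical.

```python
import math

def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def build_index(groups):
    """groups: dict mapping group id -> list of (vector, label) pairs.

    Returns a list of (prototype, members); the prototype is the group mean
    (an assumption made for this sketch).
    """
    return [(mean([v for v, _ in members]), members) for members in groups.values()]

def classify(index, query):
    """Two-stage nearest-neighbor search."""
    # Stage 1: pick the group whose prototype is closest to the query.
    _, members = min(index, key=lambda entry: math.dist(entry[0], query))
    # Stage 2: exhaustive nearest-neighbor search inside that group only.
    _, label = min(members, key=lambda m: math.dist(m[0], query))
    return label

# Toy data: two well-separated groups of 2-D feature vectors.
groups = {
    0: [((0.0, 0.0), "a"), ((0.0, 1.0), "b")],
    1: [((10.0, 10.0), "c"), ((10.0, 11.0), "d")],
}
index = build_index(groups)
print(classify(index, (9.8, 10.9)))
```

With G groups of roughly equal size over N training samples, a query costs about G + N/G distance computations instead of N. The trade-off is that the true nearest neighbor can be missed when it lies in a group other than the one with the closest prototype.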
This paper presents a new method for vehicle license plate and frontal mask localization. The proposed license plate localization initializes candidate regions based on maximally stable extremal regions (MSERs). Then, the candidate regions are categorized into three classes (license plate character components, plate background components, and other components) by using intensity, size,...
A human-interaction-based framework for manipulable object categorization is proposed in this paper. In the proposed framework, co-occurrence and spatial relationship based features are developed to improve the categorization of objects with high intra-class variation, deformable objects, and occluded objects. The descriptor in this framework is based on a co-occurrence of...
This paper presents a new method for vehicle logo detection and recognition from images of the front and back views of vehicles. The proposed method is a two-stage scheme that combines a Convolutional Neural Network (CNN) with Pyramid Histogram of Oriented Gradients (PHOG) features. The CNN is applied as the first stage for candidate region detection and recognition of the vehicle logos. Then, PHOG with Support Vector...
An improved framework for unseen place categorization using scene text is proposed. A category score calculation using a visual saliency weighting method is proposed to cope with the differing importance of word locations in scene images. Additionally, HOG feature extraction using a sliding window is proposed to obtain better holistic word recognition on scene images. As a result, the proposed...
In this paper, we introduce a two-stage recognition process for classifying 164 classes of mixed printed Thai and English characters. Various structural features based on image ratios, image projections, outer boundaries, and Pyramid Histogram of Oriented Gradients (PHOG) are extracted from the images. In the first stage, Fuzzy C-Means (FCM) clustering is applied to create prototypes of every character...
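For readers unfamiliar with FCM, the following is a minimal sketch of the standard algorithm, not the authors' code. It assumes Euclidean distance, fuzzifier m = 2, a fixed number of iterations, and caller-supplied initial centers. The resulting cluster centers are the kind of vectors that could serve as prototypes. The function names are hypothetical.

```python
import math

def memberships(points, centers, m=2.0):
    """Fuzzy membership of every point in every cluster; each row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        d = [math.dist(x, c) for c in centers]
        if 0.0 in d:
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([1.0 / sum((dj / dk) ** exponent for dk in d) for dj in d])
    return rows

def update_centers(points, u, m=2.0):
    """New centers are means of the points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        w = [row[j] ** m for row in u]
        total = sum(w)
        centers.append(tuple(
            sum(wi * p[i] for wi, p in zip(w, points)) / total for i in range(dim)
        ))
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)

# Toy 1-D data with two obvious clusters, near 0.1 and near 5.1.
points = [(0.0,), (0.2,), (5.0,), (5.2,)]
centers, u = fuzzy_c_means(points, [(1.0,), (4.0,)])
print(centers)
```

Unlike hard k-means, every sample contributes to every center, weighted by its membership, so samples near a class boundary influence more than one prototype.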
This paper proposes methods for measuring the size and distance of target objects by using mobile devices. For close-range measurement of the size of an object, users must hold the device close to the object and drag it along a desired direction. We develop a new approach for estimating the dragging distance of the device using acceleration signals retrieved from a three-axis accelerometer embedded in it...
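As a point of reference, the naive baseline for turning acceleration into distance is double numerical integration. The sketch below shows only that baseline; the abstract is truncated before the authors' own approach is described, so nothing here should be read as their method. It assumes a single axis, a fixed sampling interval, gravity already removed, and a device at rest at the start. The function name `dragging_distance` is hypothetical.

```python
def dragging_distance(accel, dt):
    """Estimate displacement along one axis by integrating acceleration twice.

    accel: acceleration samples in m/s^2 (assumed gravity-compensated)
    dt:    sampling interval in seconds
    Assumes zero initial velocity.
    """
    velocity = 0.0
    distance = 0.0
    for a_prev, a_next in zip(accel, accel[1:]):
        v_prev = velocity
        # Trapezoidal rule: acceleration -> velocity.
        velocity += 0.5 * (a_prev + a_next) * dt
        # Trapezoidal rule: velocity -> distance.
        distance += 0.5 * (v_prev + velocity) * dt
    return distance

# Constant 2 m/s^2 for 1 s sampled at 100 Hz; analytically 0.5 * a * t^2 = 1 m.
samples = [2.0] * 101
print(dragging_distance(samples, 0.01))
```

In practice this baseline drifts quickly, because any bias or noise in the acceleration signal is integrated twice, which is one reason dedicated estimation approaches are needed.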
This paper presents a new framework and feature set for a vehicle model query system. Given model names or manufacturer names as keywords, the desired vehicle images can be queried from target videos or vehicle image databases using an internet-vision approach. In this framework, sample images are automatically retrieved from the internet via a search engine or car-related websites. Logos and frontal masks...
New approaches to eye state detection and eye sequence identification for a computer interface for paralyzed patients are proposed. In this work, patients can interact via sequences of four eye states: closed, forward-glance, rightward-glance, and leftward-glance. To detect the eye states, eye images are first segmented by using an FCM clustering scheme in a feature space of RGB color components...
This paper presents a novel approach for emotional speech recognition. Instead of using the full length of the speech for classification, the proposed method decomposes speech signals into component words, groups the words into segments, and generates an acoustic model for each segment by using features such as audio power, MFCC, log attack time, spectrum spread and segment duration. Based on the proposed...
This research proposes a novel Time-Frequency (T-F) analysis method that modifies the Normalized Sub-Harmonic Summation (NSHS) method. We also tracked the problems of NSHS and observed four main characteristics of music signals. As a result, we propose the Modified Sub-Harmonic Summation (MSHS) method and introduce the Fundamental-to-Overtone Magnitude Ratio (FOMR) to solve...
An image restoration scheme using a pair of noisy and motion-blurred images is proposed. The restoration scheme combines a deconvolution algorithm with a denoising technique. The denoising technique is applied first to reduce noise. After the blur function has been identified, the residual noise from the denoising process is estimated by using a deconvolution scheme. Unlike conventional schemes...
A single-image-based motion blur identification scheme is proposed for uniform motion blurs that consist of more than one linear motion component. The proposed scheme is composed of three modules: a motion direction estimator, a motion length estimator, and a motion combination selector. In order to identify the motion directions, the proposed scheme is based on a trial restoration by using...