The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Text is the easiest means to record information but need not always be the best means for understanding a concept. In psychological theories, it is argued that when information is presented visually, it provides a better means to understand a concept. While techniques exist for generating text from a given image, the inverse problem that is to automatically fetch coherent images to represent a given...
Kotenseki is a collection of classical and ancient Japanese literature. It is comprised of image books that express Japanese stories by using comic drawings of different characters, such as humans, nature, and animals. To effectively store them for posterity, a search system is important. We propose an efficient CBIR system to assist the users in easily accessing the information and have an enjoyable...
The task of visual relationship recognition (VRR) is recognizing multiple objects and their relationships in an image. A fundamental difficulty of this task is class-number scalability, since the number of possible relationships we need to consider causes combinatorial explosion. Another difficulty of this task is modeling how to avoid outputting semantically redundant relationships. To overcome these...
Affordance learning in general, is to identify the purpose, use, and ways to interact with an object, based on information gained from observing the object. Most of the existing affordance learning approaches assume the object target has been cropped individually from images. However, the object could not be easily separated from others due to occlusion or noise. Actually, two or more neighboring...
Blind image quality assessment (BIQA) methods aim to estimate the quality of a given test image without referring to the corresponding reference (original) image. Most BIQA methods use visual sensitivity models, which take into consideration intrinsic image characteristics (e.g. contrast, luminance, and texture) to identify degradations and estimate quality. For example, texture-based BIQA methods...
In this study, we investigated the effects of mastering multiple scripts in handwritten character recognition by means of computational simulations. In particular, we trained a set of deep neural networks on two different datasets of handwritten characters: the HODA dataset, which is a collection of images of handwritten Persian digits, and the MNIST dataset, which contains Latin handwritten digits...
The level of automated unmanned surface vehicle is always dependent on human interactions. An automated collision avoidance approach is proposed which is based on the visual system in order to improve it. Deep convolutional neural network (CNN) is a popular deep neural network for pattern recognition. Three types of encounter scenes are created and recorded which are used as the CNN training samples...
As the human eye on the image of different regions of the contrast sensitivity is different, it is particularly important to segment the image region more accurately in the image quality evaluation. Based on this, this paper presents a non-reference image region division method based on deep learning. Firstly, the Canny operator performs image edge detection at low threshold to obtain the strong edge...
In this paper, a system to aid the visually impaired by providing contextual information of the surroundings using 360° view camera combined with deep learning is proposed. The system uses a 360° view camera with a mobile device to capture surrounding scene information and provide contextual information to the user in the form of audio. The scene information from the spherical camera feed is classified...
The contribution of this paper is to bridge the gap on understanding the mathematical structure and the computational implementation of a convolutional neural network using a minimal model. The proposed minimal convolutional neural network is presented using a layering approach. This approach provides a clear understanding of the main mathematical operations in a convolutional neural network. Hence,...
In the image analysis, a big trend within the field of artificial intelligence is using the Deep Learning method, which is an upgrade of the existing neural network adaptive architecture (ANN). Deep Learning is a major new field in machine learning that encompasses a wide range of neural network architectures designed to perform various tasks. In the thermography energy sector, examples that are processed...
Deep learning has brought a series of breakthroughs in image processing. Specifically, there are significant improvements in the application of food image classification using deep learning techniques. However, very little work has been studied for the classification of food ingredients. Therefore, this paper proposes a new framework, called DeepFood which not only extracts rich and effective features...
In order to realize autonomous landing of the unmanned aerial vehicle (UAV) in power patrolling, a visual method vision based on Faster Regions with Convolutional Neural Network (Faster R-CNN) for UAVs is studied. In this paper, we design the landing sign of the combination of concentric circles and pentagon, and propose the Faster R-CNN recognition algorithm which can be used to identify the target...
The contribution of this paper is to bridge the gap on understanding the mathematical structure and the computational implementation of a convolutional neural network (CNN) using a minimal model (Minimal CNN). The proposed minimal CNN is presented using a layering approach. This approach provides a concise and accessible understanding of the main mathematical operations of a CNN. Hence, it benefits...
In this article, we develop two visual impression models: recognition model and generalization model to simulate the cognition process of human visual systems. We show how the visual impression learned with a deep neural network can be efficiently transferred to other visual recognition tasks. By reusing the hidden layers trained in an unsupervised way, we show that we can largely reduce the number...
We present in this paper a novel approach for training a topological deep neural network with visual impression. We show that by combing denoising auto-encoder model and contractive auto-encoder with Hessian regularization model, we can achieve a deterministic auto-encoder aiming for robustness to small variations of the input. We exploit the tangent propagation algorithm to show how our algorithm...
This paper is concerned of the loop closure detection problem, which is one of the most critical parts for visual Simultaneous Localization and Mapping (SLAM) systems. Most of state-of-the-art methods use hand-crafted features and bag-of-visual-words (BoVW) to tackle this problem. Recent development in deep learning indicates that CNN features significantly outperform hand-crafted features for image...
Aksara jawa is an ancient Javanese character, which has been used since 17th century. The character is mostly written on stones to describe history or naming such as places, wedding, tombstones, etc. This character is however gradually ignored by people. Thus, it is extremely important to preserve this near loss heritage culture. In this paper, as a step toward preserving and converting visual information...
Convolutional neural networks (CNNs) are a staple in the fields of computer vision and image processing. These networks perform visual tasks with state-of-the-art accuracy; yet, the understanding behind the success of these algorithms is still lacking. In particular, the process by which CNNs learn effective task-specific features is still unclear. This work elucidates such phenomena by applying recent...
Fast abnormal event detection algorithm has high application value. But it is difficult to select appropriate feature representation to realize fast abnormal event detection. In view of HVS's dual pulse propagation theory and computational complexity, LBP and OF are used as temporal and spatial feature representation of video in this paper. Since human understanding involves the abstraction of the...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.