The trend extraction of the burning zone temperature is the basis for the burning state identification and stability control of the cement. In this paper, the trend extraction method of the burning zone temperature based on singular spectrum analysis is studied. Firstly, a visual inspection device is used to detect the temperature of the burning zone. Then, the singular spectrum analysis is carried...
Dexterous object manipulation requires suitable control of grip force, load force and digit positions. To keep an object stable in the air, force magnitude, force direction and digit positions should be coordinated, producing a compensatory torque that balances the external torque and keeps object roll minimal. Six males and six females were enrolled in the study. In the experiment, subjects were...
To meet the requirement of autonomous navigation in deep space exploration, this paper presents a novel visual navigation method. The visual navigation algorithm based on single feature matching is a common visual method to calculate the attitude and position of a lander. However, the algorithm based on feature line matching or crater matching is greatly limited due to the feature line extracted...
In this paper, based on quaternion and Euler angles, an attitude control algorithm is proposed for pitching and rolling of quadrotor aircraft. In addition, the target tracking algorithm of quadrotor aircraft is designed by using the collected video information and color feature recognition. The system is based on the homemade quadrotor aircraft, using gyroscope, accelerometer as the original measurement...
Geovisualization, with its capacity to provide tools for visual spatial analysis, has wide-ranging domain applications to support sense and decision making in humanitarian crisis management. The need for such tools is manifest in the Middle East in light of the Syrian civil war and ensuing mass migration of millions of refugees to neighboring countries. The Zaatari refugee camp, home to 80,000 Syrian...
In view of the docking problem of indoor Intelligent Wheelchair/Bed (IWB), a docking control method based on the combination of a simple artificial landmark and ultrasound is proposed in this paper. First, the artificial landmark on the auxiliary bed is identified and the artificial landmark is used to establish the world coordinate system; then, the relative position and orientation from the wheelchair...
In this study, a wristband design that provides distance measurements to assist blind people was implemented. Obstacles were detected with the help of ultrasonic sensors connected to an Arduino microcontroller board. A design that can be adapted to different usage areas (indoor or outdoor) and various stimulus types (vibration only, sound only, both together) is produced, which concerns user...
We consider retrieving a specific temporal segment, or moment, from a video given a natural language text description. Methods designed to retrieve whole video clips with natural language determine what occurs in a video but not when. To address this issue, we propose the Moment Context Network (MCN) which effectively localizes natural language queries in videos by integrating local and global video...
Images are usually taken to express some kind of emotion or purpose, such as love or celebrating Christmas. A better way is to combine the image with a relevant song to amplify the expression, which has drawn much attention on social networks recently. Hence, automatic selection of songs is desirable. In this paper, we propose to retrieve semantically relevant songs just by...
Many of the existing methods for learning joint embedding of images and text use only supervised information from paired images and their textual attributes. Taking advantage of the recent success of unsupervised learning in deep neural networks, we propose an end-to-end learning framework that is able to extract more robust multi-modal representations across domains. The proposed method combines representation...
The ability to ask questions is a powerful tool to gather information in order to learn about the world and resolve ambiguities. In this paper, we explore a novel problem of generating discriminative questions to help disambiguate visual instances. Our work can be seen as a complement and new extension to the rich research studies on image captioning and question answering. We introduce the first...
Recognizing fine-grained categories (e.g., bird species) highly relies on discriminative part localization and part-based fine-grained feature learning. Existing approaches predominantly solve these challenges independently, while neglecting the fact that part localization (e.g., head of a bird) and fine-grained feature learning (e.g., head shape) are mutually correlated. In this paper, we propose...
This paper presents a novel hierarchical spatiotemporal orientation representation for spacetime image analysis. It is designed to combine the benefits of the multilayer architecture of ConvNets and a more controlled approach to spacetime analysis. A distinguishing aspect of the approach is that unlike most contemporary convolutional networks no learning is involved; rather, all design decisions are...
This paper introduces a novel approach for modeling visual relations between pairs of objects. We define a relation as a triplet of the form (subject; predicate; object) where the predicate is typically a preposition (e.g., ’under’, ’in front of’) or a verb (’hold’, ’ride’) that links a pair of objects (subject; object). Learning such relations is challenging as the objects have different spatial configurations...
The power of modern image matching approaches is still fundamentally limited by the abrupt scale changes in images. In this paper, we propose a scale-invariant image matching approach to tackling the very large scale variation of views. Drawing inspiration from the scale space theory, we start with encoding the image’s scale space into a compact multi-scale representation. Then, rather than trying...
Since the beginning of early civilizations, social relationships derived from each individual have fundamentally formed the basis of social structure in our daily life. In the computer vision literature, much progress has been made in scene understanding, such as object detection and scene parsing. Recent research focuses on the relationship between objects based on their functionality and geometrical relations...
A referring expression is a kind of language expression that is used to refer to particular objects. To make the expression unambiguous, people often use attributes to describe the particular object. In this paper, we explore the role of attributes by incorporating them into both referring expression generation and comprehension. We first train an attribute learning model from visual objects...
We propose a novel measure of visual similarity for image retrieval that incorporates both structural and aesthetic (style) constraints. Our algorithm accepts a query as a sketched shape, and a set of one or more contextual images specifying the desired visual aesthetic. A triplet network is used to learn a feature embedding capable of measuring style similarity independent of structure, delivering...
Person re-identification is best known as the problem of associating a single person that is observed from one or more disjoint cameras. The existing literature has mainly addressed such an issue, neglecting the fact that people usually move in groups, like in crowded scenarios. We believe that the additional information carried by neighboring individuals provides a relevant visual context that can...
The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method. We start from two assumptions: 1) different video tracklets typically contain different persons, given that the tracklets are taken at distinct places or with long intervals; 2) within each tracklet, the frames are mostly of the...