In this paper, we propose a pedestrian attribute recognition approach and a CNN-based person re-identification framework enhanced by pedestrian attributes. The knowledge of person attributes can help video surveillance tasks like person re-identification as well as person search, semantic video indexing and retrieval to overcome viewpoint changes with their robustness to the inherent visual appearance...
A discriminative ensemble tracker employs multiple classifiers, each of which casts a vote on all of the obtained samples. The votes are then aggregated in an attempt to localize the target object. Such a method relies on collective competence and the diversity of the ensemble to approach the target/non-target classification task from different views. However, by updating all of the ensemble using a...
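The vote-and-aggregate step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state the aggregation rule, so averaging the votes is an assumption, and the names `Box`, `Classifier` and `localize` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical sample representation: (x, y, width, height) of a candidate box.
Box = Tuple[int, int, int, int]
# Hypothetical classifier interface: maps a sample to a vote in [0, 1].
Classifier = Callable[[Box], float]


def localize(samples: Sequence[Box], ensemble: Sequence[Classifier]) -> Box:
    """Return the sample with the highest mean vote across the ensemble.

    Assumption: votes are aggregated by a plain average; the abstract
    does not specify the rule actually used.
    """
    if not samples or not ensemble:
        raise ValueError("need at least one sample and one classifier")

    def mean_vote(box: Box) -> float:
        # Every classifier votes on every sample.
        return sum(clf(box) for clf in ensemble) / len(ensemble)

    return max(samples, key=mean_vote)
```

Each classifier scores every candidate, and the candidate with the highest aggregate score is taken as the target location. A weighted average or a majority vote could be substituted for the mean without changing the overall structure.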
Convolutional neural networks (CNNs) are a staple in the fields of computer vision and image processing. These networks perform visual tasks with state-of-the-art accuracy; yet, the understanding behind the success of these algorithms is still lacking. In particular, the process by which CNNs learn effective task-specific features is still unclear. This work elucidates such phenomena by applying recent...
Neuroevolution has proven effective at many reinforcement learning tasks, including tasks with incomplete information and delayed rewards, but does not seem to scale well to high-dimensional controller representations, which are needed for tasks where the input is raw pixel data. We propose a novel method where we train an autoencoder to create a comparatively low-dimensional representation of the...
Reforming teaching content, teaching methods, and the examination and evaluation system is an urgent issue in current higher-education teaching reform. The article discusses how to reform and optimize the teaching methods of a Python programming course based on visualization; under the guidance of teachers, students can change their learning methods, promote self-exploration, and form a proactive and...
Memory in healthy adults declines with age. This decline can progress from normal aging to mild cognitive impairment (MCI) and then to Alzheimer's disease dementia. To reduce the risk of dementia, cognitive training (brain training) is needed. Cognitive training can stimulate memory in healthy people and help keep it sharp as they age. Virtual...
Physical interactions between human and machine are essential in facilitating effective physical therapy training programs. Nowadays, physical training largely involves robotic assistive devices or wearable haptics. In this study, we propose a lightweight wearable sensory augmentation device using skin stretch feedback to provide individuals with additional sensory cues during balance training. The...
A visual Brain-Computer Interface (BCI) speller is a system which assists disabled persons with severe neuromuscular diseases to communicate with the external world. It acquires brain signals in response to visual stimuli shown to the person on a screen, and then analyzes them in real time to predict the desired symbol on a single-trial basis. To date most BCI design paradigms have been focused on the...
The following paper presents a new approach for analyzing learning style at the beginning of a course. Learner styles, learning style models and existing methods to identify learning style are explained. With proper use of Item Response Theory for determining learner style, a greater impact on the learning experience can be achieved, such as personalized learning, effective learning as well as high satisfaction...
Recently, kernelized correlation filter-based trackers have aroused the interest of many researchers and achieved good results in the field of tracking. However, the current tracking model based on kernelized correlation filters cannot deal effectively with changes in target appearance and scale. Therefore, in this paper, we intend to solve these two problems and improve the robustness of...
Understanding the simultaneously very diverse and intricately fine-grained set of possible human actions is a critical open problem in computer vision. Manually labeling training videos is feasible for some action classes but doesn't scale to the full long-tailed distribution of actions. A promising way to address this is to leverage noisy data from web queries to learn new actions, using semi-supervised...
One-shot learning is a challenging problem where the aim is to recognize a class identified by a single training image. Given the practical importance of one-shot learning, it seems surprising that the rich information present in the class tag itself has largely been ignored. Most existing approaches restrict the use of the class tag to finding similar classes and transferring classifiers or metrics...
We propose a high-level concept word detector that can be integrated with any video-to-language model. It takes a video as input and generates a list of concept words as useful semantic priors for language generation models. The proposed word detector has two important properties. First, it does not require any external knowledge sources for training. Second, the proposed word detector is trainable...
Convolutional Neural Networks (CNNs) with Bilinear Pooling, initially in their full form and later using compact representations, have yielded impressive performance gains on a wide range of visual tasks, including fine-grained visual categorization, visual question answering, face recognition, and description of texture and style. The key to their success lies in the spatially invariant modeling...
We consider generation and comprehension of natural language referring expressions for objects in an image. Unlike generic image captioning, which lacks natural standard evaluation criteria, the quality of a referring expression may be measured by the receiver's ability to correctly infer which object is being described. Following this intuition, we propose two approaches to utilize models trained for comprehension...
CNNs have made an undeniable impact on computer vision through the ability to learn high-capacity models with large annotated training sets. One of their remarkable properties is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset. This is usually accomplished through fine-tuning a fixed-size network on new target data. Indeed, virtually every contemporary...
Large-scale datasets have driven the rapid development of deep neural networks for visual recognition. However, annotating a massive dataset is expensive and time-consuming. Web images and their labels are, in comparison, much easier to obtain, but direct training on such automatically harvested images can lead to unsatisfactory performance, because the noisy labels of Web images adversely affect the...
End-to-end training from scratch of current deep architectures for new computer vision problems would require ImageNet-scale datasets, and this is not always possible. In this paper we present a method that is able to take advantage of freely available multi-modal content to train computer vision algorithms without human supervision. We put forward the idea of performing self-supervised learning of...
Generating diverse questions for given images is an important task for computational education, entertainment and AI assistants. Different from many conventional prediction techniques is the need for algorithms to generate a diverse set of plausible questions, which we refer to as creativity. In this paper we propose a creative algorithm for visual question generation which combines the advantages...
We address zero-shot learning using a new manifold alignment framework based on a localized multi-scale transform on graphs. Our inference approach includes a smoothness criterion for a function mapping nodes on a graph (visual representation) onto a linear space (semantic representation), which we optimize using multi-scale graph wavelets. The robustness of the ensuing scheme allows us to operate...