Peripheral vision loss (also called tunnel vision) is one of the main visual field disorders; it can be very frustrating and can affect the patient's confidence and main activities. In this paper, two promising solutions for peripheral vision loss are presented and discussed. The first one uses optical see-through glasses that are augmented by computer-generated images to notify the user about...
Traditional visual speech recognition systems consist of two stages, feature extraction and classification. Recently, several deep learning approaches have been presented which automatically extract features from the mouth images and aim to replace the feature extraction stage. However, research on joint learning of features and classification is very limited. In this work, we present an end-to-end...
It has been reported that visual attention is captured exogenously by faces. This study used pareidolia faces and examined whether subjective perception of a face is sufficient for capturing attention or the registry of an actual face is necessary for attentional capture. Three experiments demonstrated that a completely task-irrelevant face distractor captured attention exogenously, in turn disrupting...
Periodic broadcasting has achieved strong performance for VoD (Video on Demand) services in wired networks. However, the development of wireless VoD services has been slow and difficult compared with the rapid growth of mobile video services. In this paper, we propose a scalable video streaming method, HQOBA (Heterogeneous Quality-Oriented Bandwidth Allocation), to address the problems in wireless...
Segmentation of TV news broadcast into semantically meaningful stories is an essential pre-requisite for a wide range of video analytics applications. In this work we have introduced a hybrid approach for news story segmentation based on conditional random fields (CRFs). The story boundary detection problem is converted into a shot classification problem by classifying video shots into either of the...
Automatic summarization of streaming news images is critical for efficient news browsing. Although image duplicates are redundant for news reading, the number of duplicates of a news image is a good indicator for its importance. We describe the architecture used in a news aggregation system for online streaming news image summarization. Given a sequence of images for a news topic, we first cluster...
This paper provides an overview of the Joint Contest on Multimedia Challenges Beyond Visual Analysis. We organized an academic competition that focused on four problems that require effective processing of multimodal information in order to be solved. Two tracks were devoted to gesture spotting and recognition from RGB-D video, two fundamental problems for human computer interaction. Another track...
With the huge amount of web video data and its exponential growth in recent years, there are new challenges in Near-Duplicate Video Detection (NDVD) which have attracted much attention owing to its wide applications. One of the problems is how to extract discriminative features to achieve higher precision, and the other problem is how to improve the efficiency of large scale video analysis. Existing...
There is a need for automatic processing and extracting of meaningful metadata from multimedia information, especially in the audiovisual industry. This higher level information is used in a variety of practices, such as enriching multimedia content with external links, clickable objects and useful related information in general. This paper presents a system for efficient multimedia content analysis...
Motivated by the increasing popularity of depth visual sensors, such as the Kinect device, we investigate the utility of depth information in audio-visual speech activity detection. A two-subject scenario is assumed, which also allows speech overlap to be considered. Two sensory setups are employed, where depth video captures either a frontal or profile view of the subjects, and is subsequently combined with the...
User-generated content on online social media (OSM) has several data mining applications, such as extracting useful information during disaster events. Since popular / important content is often re-posted by multiple people on OSM, identifying duplicate content is an important first step in many data mining applications. In this work, we develop a methodology to identify near-duplicate images posted...
Applications such as video surveillance for traffic control in smart cities need to analyze large amounts (hours or days) of video footage in order to locate people who are violating traffic rules. Traditional computer vision techniques are unable to analyze such a huge amount of visual data generated in real time. So, there is a need for visual big data analytics, which involves processing...
Netflix, Hulu, etc. are some of the most popular video content streaming services, and they are increasingly accessed through popular consumer devices such as Apple TV, XBox, Wii, etc. It has now become possible to conveniently interact with the video content using the input hardware that these devices provide. We emulate the setups that many of these popular platforms provide in order to...
In this paper we present a scheme for pedestrian tracking from an unmanned aerial vehicle (UAV), which includes the motion control of the UAV, and the visual tracking of a specific pedestrian from the moving platform. In the visual tracking part, we use an online updating feature queue and the Locality-constrained Linear Coding (LLC) method to match the pedestrian target. The ground station receives...
This paper proposes an interactive annotation technique for 360° videos that allows the use of traditional video editing techniques to add content to immersive videos. Using the case study of immersive journalism the main objective is to diminish the entry barrier for annotating 360° video pieces, by providing a different annotation paradigm and a set of tools for annotation. The spread of virtual...
In this manuscript we propose a novel method for joint page stream segmentation and multi-page document classification. The end goal is to classify a stream of pages as belonging to different classes of documents. We take advantage of the recent state-of-the-art results achieved using deep architectures in related fields such as document image classification, and we adopt similar models to obtain...
Video analysis is an essential process to segment and summarize sports videos automatically. In this paper, we propose fast and simple computer vision algorithms which can be employed in an event segmentation system for basketball broadcasting videos. In our approach, camera panning is estimated by optical flow estimation and flow segmentation algorithms. For recognizing shot classes and clock...
In recent years, the importance of location-based services and indoor positioning systems has increased significantly for both research and industry. Visual localization systems have the advantage of not depending on dedicated infrastructure and thus they are interesting for navigation within buildings. While there are already approaches that use pre-recorded databases of reference images to obtain...
In recent years, location-based services and indoor positioning systems have gained increasing importance for both research and industry. Visual localization systems have the advantage of not being dependent on dedicated infrastructure and thus are especially interesting for navigation within buildings. While there are already approaches that use pre-recorded databases of reference images to obtain an...
Robotics is the field currently taking its place as a leading candidate for dramatic changes in everyday life. Advances in the past 10 years in sensing, actuator and power technologies have fuelled an explosion of opportunities in this exciting, and surprisingly affordable domain. Small Unmanned Aircraft Systems (drones) are being rapidly developed for research, public service, and commercial applications,...