In this paper we propose a novel method for TV news retrieval. The first stage performs a temporal segmentation into story units. Then, for each story, the most relevant concepts are extracted through a multimodal fusion of visual and textual information. By analyzing the video stream, we perform global frame representation, image retrieval and re-ranking, in order to determine, with high confidence,...
To provide more powerful video-enabled applications, e.g. in video surveillance environments, it is increasingly critical not only to have access to the decoded video but also, e.g., to search efficiently for similar videos. In this context, this paper proposes a feature-based video coding solution adopting a hybrid approach where both pixels and local visual features are exploited for coding...
Video affective content analysis is an active research area in computer vision. Live streaming video has become a common mode of communication over the past decade, so video affective content analysis plays a vital role. Existing work on video affective content analysis focuses mainly on predicting the current state of users from either visual or acoustic features. In this paper,...
In this paper, our objective is to identify the best possible bitrate for transmitting Ultra High Definition (UHD) content at different complexity levels and frame rates. We compressed several UHD videos at different bitrate levels and evaluated their quality through subjective tests. Results revealed that when the original videos are not available, naïve viewers cannot distinguish the difference...
Automatic prediction of personal preference for lecture videos is becoming increasingly important as the available volume of such content expands rapidly. Gaze information is believed to be a convenient and useful indicator of cognitive processes owing to its non-invasive nature. We build a small-scale dataset of viewers' gaze while watching TED Talk videos, and propose a set of gaze features...
This paper presents an image retargeting method that uses multiple operators to improve performance. In the method, cropping is applied first to remove unimportant content and approach the desired aspect ratio while keeping significant content intact. If the desired aspect ratio cannot be met by cropping, an aspect ratio adjusting method is then adopted to fit the image to the desired...
In ubiquitous multimedia applications, protecting the digital copyright of multimedia data has been a difficult task. In this paper, a novel self-adaptive video dual watermarking scheme is proposed that combines motion characteristics detection with the geometric invariance of the scale-invariant feature transform (SIFT). For each frame, the motion characteristics are calculated as the maximum of...
In this paper, we propose a novel Markov decision-based rate adaptation scheme for DASH aiming to maximize the quality of user experience. To this end, our proposed method takes into account the key factors that critically affect visual quality, including video playback quality, video rate switching frequency and amplitude, buffer overflow/underflow, and buffer occupancy. And a dynamic reward...
With the popularity of Internet video, video retrieval applications have become increasingly widespread. Traditional retrieval optimization algorithms no longer meet the needs of users. To improve the semantic soundness of re-ranked video search results, this paper introduces video annotation to label videos objectively based on their content. At the same time, the use of words semantic...
Community image and video platforms like Flickr and YouTube offer large image collections from different perspectives. However, the majority of publicly available imagery from online communities lacks reasonably exact location and orientation information, which is important for many geo-spatial applications like object geo-referencing, knowledge transfer or augmented reality. In this work we exploit...
The goal of surveillance video abstraction is to generate a video abstract that includes important events and objects by eliminating redundant frames that lack activity in the original video. Although much research has been done and progress made in video abstraction, the developed approaches either fail to cover the overall visual content of the video accurately and effectively or they are computationally...
In this paper, an analysis of human engagement behaviour with video is presented, based on real-life experiments. Such an engagement model could be employed in classroom education, in improving programming skills, in reading, etc. Two groups of people, independent of one another, watched eighteen video clips separately at different times. The first group's participants' eye gaze locations, right and left pupil...
Every day, an enormous amount of video is captured by surveillance systems around the world for various purposes. However, it is almost impossible for humans to analyze the vast majority of this video data. In this paper, a video summarization method is introduced that combines foreground object, motion, and visual attention cues. Foreground objects typically provide important information about video contents...
Intra-prediction Modes-based (IPM-based) descriptors are among the robust and competitive visual descriptors for near-duplicate video similarity detection in general, and content-based copy detection (CCD) in particular. IPM-based descriptors are extracted from the compressed H.264/AVC (MPEG-4/AVC) video domain. Intra-prediction Modes (IPM) are the building blocks of the key frames (I and IDR slices)...
Digital Image Processing (DIP) consists of a set of techniques to acquire, represent and transform digital images. Through these techniques, it is possible to extract and identify information from images and to improve their visual quality, facilitating both human perception and interpretation by computer systems. However, the teaching of Digital Image Processing is hindered by the complexity of implementation...
Over the past few years, multimedia traffic has been growing and now accounts for the largest share of Internet traffic. This trend is expected to continue, driven by multimedia applications. The best-effort Internet architecture imposes design limitations on multimedia traffic. IPTV-like applications require high bandwidth, low packet loss, and low delay and jitter to transmit high-quality video content...
Various video applications on mobile and wearable devices deal with private or important video data. To protect this important video information, several video encryption techniques have been proposed. Secure video processing, which combines video compression/decompression with video encryption/decryption, incurs substantial computational overhead, thereby consuming huge energy...
Researchers have carried out a great number of studies on visual object tracking and on video encoding and transmission, but separately. However, there are still no published reports on how video coding rates influence visual object tracking. To address this issue, in this paper a typical tracker using HOG features and the most commonly used video encoding method, H.264/AVC, are chosen...
Saliency detection in videos has attracted great attention in recent years due to its wide range of applications. In this paper, a novel spatiotemporal saliency detection model based on clustering is proposed. First, the discrete cosine transform coefficients are used as features to generate the spatial saliency map. We use a 2D Gaussian function to estimate the absolute feature difference in consideration...
In this talk we seek to provide insight into the general topic of visual soft biometrics (gender, age, ethnicity, etc.). First, we present a new, refined definition of soft biometrics, emphasizing the aspect of human compliance, and then proceed to identify candidate traits that satisfy this novel definition. Second, we introduce some image processing techniques related to the estimation of some traits...