There has been much recent interest, both from industry and research communities, in 3D video technologies and processing techniques. However, with the standardisation of 3D video coding well underway and researchers studying 3D multimedia delivery and users' quality of multimedia experience in 3D video environments, there exist few publicly available databases of 3D video content. Further, there...
Efficient content-based access to large multimedia collections requires annotations that are human-meaningful, and user tagging of media is one means to obtain such semantic metadata. Tags can also act as user feedback essential for quality of multimedia experience assessment; however, tags can lack user context and become ambiguous between different users. Further, user tagging is a deliberate and...
With new social media technologies arising daily, this paper reports on a pilot user survey that studies how tertiary educated users are engaging with social media. The results indicate sporadic use of social media by the tertiary educated users studied; they are generally aware of the key social media sites and facilities, but are not actively utilizing these services. The reasons for, and the implications...
Spatially squeezed surround audio coding (S3AC) has been previously shown to provide efficient coding with perceptually accurate soundfield reconstruction when applied to ITU 5.1 multichannel audio. This paper investigates the application of S3AC to the coding of Ambisonic audio recordings. Traditional ambisonics achieve compression and backward compatibility through the use of the UHJ matrixing approach...
Intelligent multimedia delivery uses semantic information about content to enhance the delivery process. This paper proposes a model for intelligent multimedia delivery that advances the state of the art by incorporating a concept of semantic distortion into the delivery optimization process. Furthermore, the model combines format-independence with rate-distortion optimization to provide a flexible...
The Bitstream Binding Language (BBL) is a new technology developed by the authors and being standardized by MPEG, which describes how multimedia content and metadata can be mapped onto streaming formats. This paper describes how BBL can be used to enhance the interoperability of multimedia content by providing a generic mechanism for the translation of content between formats. As new content formats...
This paper proposes a P2P architecture which uses MPEG-21 as a standards-based technique to dynamically adapt resources according to various usage environment attributes such as terminal capabilities and user preferences. In the architecture, a super-peer-based approach is used to cluster peers, store peer information, perform searches and instruct peers to adapt/send resources. Pull and push-based...
The bitstream binding language (BBL) is a new technology developed by the authors and being standardized by MPEG, which describes how multimedia content and metadata can be mapped onto streaming formats. This paper describes a particular application of BBL: format-independent multimedia streaming. This means that streaming servers no longer require additional software modules in order to support new...
A sequential approach to sparse component analysis (SeqTIF) is proposed in this paper. Although SeqTIF employs the estimation process of the simultaneous TIFROM algorithm, a source cancellation and deflation technique is also incorporated to sequentially estimate speech signals in the mixture. Results indicate that SeqTIF's separation performance shows a clear improvement upon the simultaneous TIFROM...
A blind signal separation algorithm (SCAtemp) that exploits both the sparse time-frequency representation and temporal structure of speech is proposed. SCAtemp compares each speech signal's adherence to the sparsity and temporal criteria, before switching to the most appropriate criterion to estimate each signal. This algorithm is shown to improve the real-time separation performance of conventional...
This paper explores user-centered metadata delivery through the example of hierarchically organized meeting audio metadata. Audio annotations that describe meeting scenarios can vary from low-level signal-based descriptors to high-level semantics. Users of meeting metadata also have widely varying requirements and hence want metadata at varying levels of detail. Thus, for efficient metadata access,...
Multiparty meetings generally involve stationary participants. Participant location information can thus be used to segment the recorded meeting speech into each speaker's 'turn' for meeting 'browsing'. To represent speaker location information from speech, previous research showed that the most reliable time delay estimates are extracted from the Hilbert envelope of the linear prediction residual...
A dynamic P2P architecture based on MPEG-21 was proposed in our previous work to support resource adaptation/personalization according to the surrounding usage environment and user preferences. In this paper, we improve the proposed system through two separate but related modifications. Firstly, peers are clustered according to registered geographic location information. Secondly, based on that registered...
Bitstream binding language (BBL) provides an abstraction layer between XML multimedia containers and the way their resources and metadata are published in a bitstream. It allows multiple bindings from a single source document to facilitate interoperability and applicability of the multimedia content to a wide range of terminals and users. BBL introduces a number of features not found in other XML...
Summary form only given. A new generation of speech coding algorithms, offering high-quality speech compression at bit rates as low as 2.4 kb/s, has been developed. The algorithms which have been successful at these rates bear little resemblance to those developed for use at higher rates such as 8 kb/s. In particular, the use of CELP and its derivative architectures at rates of 2.4 kb/s has proved...