Learning to hash has been recognized to accomplish highly efficient storage and retrieval for large-scale visual data. Particularly, ranking-based hashing techniques have recently attracted broad research attention because ranking accuracy among the retrieved data is well explored and their objective is more applicable to realistic search tasks. However, directly optimizing discrete hash codes without...
In this work we study the task of image annotation, whose goal is to describe an image using a few tags. Instead of predicting the full list of tags, we aim to provide a short list with a limited number of tags (e.g., 3) that covers as much information about the image as possible. The tags in such a short list should be representative and diverse. It means they are required to be not...
Multi-label classification is a vital problem, as it has numerous applications in computer vision, such as automatic image annotation. The label set for each instance is usually assumed to be complete. However, missing labels often occur because manual labelling is time-consuming and labor-intensive for large amounts of data. The incompleteness of labels can certainly...
In this paper, we propose to incorporate convolutional neural networks with a multi-context attention mechanism into an end-to-end framework for human pose estimation. We adopt stacked hourglass networks to generate attention maps from features at multiple resolutions with various semantics. The Conditional Random Field (CRF) is utilized to model the correlations among neighboring regions in the attention...
The CNN-RNN design pattern is increasingly widely applied in a variety of image annotation tasks including multi-label classification and captioning. Existing models use the weakly semantic CNN hidden layer or its transform as the image embedding that provides the interface between the CNN and RNN. This leaves the RNN overstretched with two jobs: predicting the visual concepts and modelling their...
Recently, zero-shot action recognition (ZSAR) has emerged with the explosive growth of action categories. In this paper, we explore ZSAR from a novel perspective by adopting the Error-Correcting Output Codes (dubbed ZSECOC). Our ZSECOC equips the conventional ECOC with the additional capability of ZSAR, by addressing the domain shift problem. In particular, we learn discriminative ZSECOC for seen...
There has been incredible growth of events over the internet in recent years. Google has become the giant source of knowledge for any event that has happened or is happening over the internet. Social networking sites such as Facebook and microblogging sites such as Twitter have evolved over time and become some of the most heavily used sites on the internet. Various e-commerce websites such as Amazon, eBay, Flipkart...
Polarimetric SAR classification is an effective approach in image understanding. This paper proposes a novel semantic method for classification of polarimetric SAR data. The method combines superpixels and a semantic model to benefit from both object-oriented classification and high-level semantic information. Firstly, pixels are grouped into superpixels via Simple Linear Iterative Clustering...
Cross-media retrieval, which uses a text query to search for images and vice versa, has attracted wide attention in recent years. Most existing cross-media retrieval methods aim at finding a common subspace and maximizing the correlations between different modalities. But these approaches do not directly capture the underlying semantic information of different modalities. This paper proposes a novel cross-media...
DNN-based cross-modal retrieval has become a research hotspot, by which users can search results across various modalities like image and text. However, existing methods mainly focus on the pairwise correlation and reconstruction error of labeled data. They ignore the semantically similar and dissimilar constraints between different modalities, and cannot take advantage of unlabeled data. This paper...
With the demand for power information construction and the application of the D5000 platform, the standardization and management of all kinds of data have become a necessary condition for data sharing and application integration between systems. In order to realize safe, convenient and efficient information resource sharing among power enterprises and improve the information management level, the fusion method...
Multi-view correlation learning has attracted great attention with the proliferation of heterogeneous data. Typical methods, such as Canonical Correlation Analysis (CCA) and its variants, usually maximize the one-to-one correlation of inter-view data, while most of them neglect discriminative multi-label information and the local structure of each view's data. In this paper, we propose multi-label...
Nowadays cross-media retrieval is a useful technology that helps people find expected information in huge amounts of multimodal data more efficiently. A common cross-media retrieval framework first maps features of different modalities into an isomorphic semantic space so that the similarity between heterogeneous data can be measured. For most semantic-space-based methods, the mapping...
A novel scheme with deep cross-modal correlation learning is developed in this paper to facilitate more effective Sketch-based Image Retrieval (SBIR) for large-scale annotated images. It integrates the deep multimodal feature generation, deep cross-modal correlation learning and similarity search optimization through mining all the beneficial multimodal information sources in sketches and images,...
Recurrent neural networks (RNNs) are able to capture context in an image by modeling long-range semantic dependencies among image units. However, existing methods only utilize RNNs to model dependencies of a single modality (e.g., RGB) for labeling. In this work we extend these single-modal RNNs to multimodal RNNs (MM-RNNs) and apply them to RGB-D scene labeling. Our MM-RNNs are capable of seamlessly...
Although organizations have widely adopted Business Process Management Systems (BPMS) as an automation and integration middleware, these systems remain limited in their orchestration capabilities. BPMS can only react to event information that enterprise applications emit and only integrate against the service interfaces these applications provide. At the same time, organizations increasingly leverage...
In the past few years, social tags and tagging systems have gained considerable momentum for categorizing services and indexing content on the Web. Tags are used freely, which leads to a random correspondence between tags and services and degrades the performance of tags in search and other applications. We propose a novel graph-based scheme for tag prediction, aiming to automatically sort the...
Service technology and the crowdsourcing movement have spawned a host of successful efforts that promote the rapid development of the human service ecosystem. In this ecosystem, a large number of globally distributed freelancers are organized to tackle a range of tasks over the web. These crowdsourcing services offer the public convenience at lower prices and with shorter response times. However,...
Automatic Image Annotation (AIA) plays an important role in large-scale intelligent image management and retrieval. Based on the correlation between low-level image features and high-level semantic concepts, images can be efficiently retrieved from large-scale image datasets. Recently, many researchers have leveraged machine learning techniques to annotate images automatically. However, these methods still...
Evaluating and scoring essay questions is an exhausting, time-consuming process that requires a lot of effort, so automated tools are needed to address these drawbacks. In this study, we propose an automated scoring approach for short answers to Arabic essay questions. The scoring process is based on the similarity between the student's answer and the model answer, cosine similarity...