Serwis Infona wykorzystuje pliki cookies (ciasteczka). Są to wartości tekstowe, zapamiętywane przez przeglądarkę na urządzeniu użytkownika. Nasz serwis ma dostęp do tych wartości oraz wykorzystuje je do zapamiętania danych dotyczących użytkownika, takich jak np. ustawienia (typu widok ekranu, wybór języka interfejsu), zapamiętanie zalogowania. Korzystanie z serwisu Infona oznacza zgodę na zapis informacji i ich wykorzystanie dla celów korzytania z serwisu. Więcej informacji można znaleźć w Polityce prywatności oraz Regulaminie serwisu. Zamknięcie tego okienka potwierdza zapoznanie się z informacją o plikach cookies, akceptację polityki prywatności i regulaminu oraz sposobu wykorzystywania plików cookies w serwisie. Możesz zmienić ustawienia obsługi cookies w swojej przeglądarce.
We propose an image-text alignment framework to match images with text, and take blog article summarization as the main application. Objects in an image are first detected, from them deep features are extracted and transformed into a space commonly shared with the text. On the other hand, sentences of a blog article are represented as vectors, and are also embedded into the common space. With these...
Convolutional neural network (CNN) has drawn increasing interest in visual tracking, among which fully-convolutional Siamese network based method (SiamFC) is quite popular due to its competitive performance in both precision and efficiency. Generally, SiamFC captures robust semantics from high-level features in the last layer but ignores detailed spatial features in earlier layers, thus tending to...
While recent advances in deep learning pushed the state-of-the-art in object detection and semantic segmentation, it often comes at the cost of a considerable annotation effort. Thus, weakly supervised learning became of increasing interest. In this paper a novel approach to the challenging task of weakly supervised segmentation and object localization will be presented. The problem is tackled from...
Trigger detection plays a key role in the extraction of biomedical events, so it will influence the results of biomedical events extraction directly. The traditional biomedical event trigger recognition method is based on artificial design features and construct feature vectors; Not only does it consume great amounts of manpower, it also lacks system generalization ability. Most of methods of trigger...
In molecular biology, phenotypes are often described using complex semantics and diverse biomedical expressions, thereby facilitating the development of named entity recognition (NER). Here, we propose a novel approach of recognizing plant phenotypes by cascading word embedding to sentence embedding with a class label enhancement. We utilized a word embedding method to find high-frequency phenotypes...
On electronic game platforms, different payment transactions have different levels of risk. Risk is generally higher for digital goods in e-commerce. However, it differs based on product and its popularity, the offer type (packaged game, virtual currency to a game or subscription service), storefront and geography. Existing fraud policies and models make decisions independently for each transaction...
In this article we address the problem of expanding the set of papers that researchers encounter when conducting bibliographic research on their scientific work. Using classical search engines or recommender systems in digital libraries, some interesting and relevant articles could be missed if they do not contain the same search key-phrases that the researcher is aware of. We propose a novel model...
Extending from limited domain to a new domain is crucial for Natural Language Generation in Dialogue, especially when there are sufficient annotated data in the source domain, but there is little labeled data in the target domain. This paper studies the performance and domain adaptation of two different Neural Network Language Generators in Spoken Dialogue Systems: a gating-based Recurrent Neural...
UniProtKB has collected more than 88 million protein sequences by July 2017. Less than 0.2% of these proteins, however, have added experimental GO annotations. To reduce this huge gap, automatic protein function prediction (AFP) becomes increasingly important. Results on CAFA (the Critical Assessment of protein Function Annotation algorithms) benchmark demonstrates that sequence homology based methods...
Biomedical semantic indexing refers to annotating biomedical citations with Medical Subject Headings, which is crucial for texting mining, information retrieval and other researches in the field of bioinformatics. The traditional methods ignore the relations among labels and need complicated feature engineering. In this paper, we present a novel model with a deep serial multi-task learning structure,...
Code bloat is a phenomenon in Genetic Programming (GP) that increases the size of individuals during the evolutionary process. Over the years, there has been a large number of research that attempted to address this problem. In this paper, we propose a new method to control code bloat and reduce the complexity of the solutions in GP. The proposed method is called Substituting a subtree with an Approximate...
In this work we introduce a structured prediction model that endows the Deep Gaussian Conditional Random Field (G-CRF) with a densely connected graph structure. We keep memory and computational complexity under control by expressing the pairwise interactions as inner products of low-dimensional, learnable embeddings. The G-CRF system matrix is therefore low-rank, allowing us to solve the resulting...
Cross-modal hashing is usually regarded as an effective technique for large-scale textual-visual cross retrieval, where data from different modalities are mapped into a shared Hamming space for matching. Most of the traditional textual-visual binary encoding methods only consider holistic image representations and fail to model descriptive sentences. This renders existing methods inappropriate to...
In this paper, we propose a cross-modal deep variational hashing (CMDVH) method for cross-modality multimedia retrieval. Unlike existing cross-modal hashing methods which learn a single pair of projections to map each example as a binary vector, we design a couple of deep neural network to learn non-linear transformations from image-text input pairs, so that unified binary codes can be obtained. We...
Many of the existing methods for learning joint embedding of images and text use only supervised information from paired images and its textual attributes. Taking advantage of the recent success of unsupervised learning in deep neural networks, we propose an end-to-end learning framework that is able to extract more robust multi-modal representations across domains. The proposed method combines representation...
We propose to help weakly supervised object localization for classes where location annotations are not available, by transferring things and stuff knowledge from a source set with available annotations. The source and target classes might share similar appearance (e.g. bear fur is similar to cat fur) or appear against similar background (e.g. horse and sheep appear against grass). To exploit this,...
We propose a novel measure of visual similarity for image retrieval that incorporates both structural and aesthetic (style) constraints. Our algorithm accepts a query as sketched shape, and a set of one or more contextual images specifying the desired visual aesthetic. A triplet network is used to learn a feature embedding capable of measuring style similarity independent of structure, delivering...
Despite the recent success of deep-learning based semantic segmentation, deploying a pre-trained road scene segmenter to a city whose images are not presented in the training set would not achieve satisfactory performance due to dataset biases. Instead of collecting a large number of annotated images of each city of interest to train or refine the segmenter, we propose an unsupervised learning approach...
We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNN). Our network aims to distinguish the target area from the background on the basis of the pixel-level similarity between two object units. The proposed network represents a target object using features from different depth layers in order to take advantage of both the spatial...
The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of...
Podaj zakres dat dla filtrowania wyświetlonych wyników. Możesz podać datę początkową, końcową lub obie daty. Daty możesz wpisać ręcznie lub wybrać za pomocą kalendarza.