and lengths of the features, including keywords, n-grams, skip-grams, and bags of words. The correlation results improve significantly: the highest correlation score increases from 0.22 to 0.70, and the average correlation score from 0.22 to 0.60.
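As a rough illustration of the n-gram and skip-gram features this abstract refers to, a minimal sketch (function names and the example tokens are illustrative, not from the paper):

```python
from itertools import combinations

def ngrams(tokens, n):
    """Contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def skipgrams(tokens, n, k):
    """n-grams that may skip up to k intermediate tokens."""
    grams = set()
    for i in range(len(tokens)):
        # A window of n + k tokens covers every gram starting at position i.
        window = tokens[i:i + n + k]
        for idx in combinations(range(1, len(window)), n - 1):
            grams.add((window[0],) + tuple(window[j] for j in idx))
    return sorted(grams)

tokens = "stock market search volume".split()
bigrams = ngrams(tokens, 2)        # contiguous pairs only
skips = skipgrams(tokens, 2, 1)    # also pairs separated by one token
```

Skip-grams subsume the contiguous n-grams (the case of zero skipped tokens), which is why feature sets built from them are strictly larger.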
In this study, we used Google Trends as a tool to predict investors' behavior and its impact on the stock market. From a behavioral and social perspective, more and more Internet users turn to Google's search engine every day; these search actions can be seen as personal votes, because users search for items they are interested in....
-industrial collaboration with a software company. The collaboration aims to suggest multiple code fragments to be changed simultaneously when a developer specifies a keyword, such as a variable name, in the source code. In this collaboration, we propose using code clone and logical coupling information to reorder the code
The paper presents a novel benefit-based query-processing strategy for efficient query routing. Using a DHT as the overlay network, it first applies Nash equilibrium to construct an optimal peer group, based on keyword correlations and on the coverage and overlap of the peers, to decrease the time cost, and then
Keyword queries over databases that store structured data provide a search option on text attributes, using a probability-based ranking technique, but such queries face the issue of poor-quality results. A keyword can match multiple entities because the user does not provide exact data, from which we
Language model adaptation using text data downloaded from the WWW is an efficient way to train a topic-specific LM. We are developing an unsupervised LM adaptation method using data from the Web. One key point of unsupervised Web-based LM adaptation is how to select keywords to compose the search query. In this
A common strategy for assigning keywords to documents is to select the most appropriate words from the document text. One of the most important criteria for a word to be selected as a keyword is its relevance to the text. The tf.idf score of a term is a widely used relevance measure. While easy to compute and giving quite
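To make the tf.idf relevance measure mentioned above concrete, a minimal sketch using the common tf · log(N/df) formulation (the toy corpus is illustrative; real keyword extractors typically add normalization and smoothing):

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """tf.idf for every term of every document; idf = log(N / df),
    where df is the number of documents containing the term."""
    N = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    return [
        {term: tf * math.log(N / df[term]) for term, tf in Counter(doc).items()}
        for doc in docs
    ]

docs = [
    ["keyword", "extraction", "keyword"],
    ["graph", "ranking"],
    ["keyword", "graph"],
]
scores = tfidf_scores(docs)
```

Note that a term occurring in every document gets idf = log(1) = 0, so tf.idf deliberately suppresses ubiquitous words as keyword candidates.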
We examine whether aggregate daily Twitter keyword volumes over eight months from November 2011 to June 2012 can be used to predict aggregate daily consumer spending as reported by Gallup. We also examine whether Twitter keyword volume improves predictive ability over prediction based solely on current spending
KSORD (keyword search over relational databases) techniques allow users to obtain information from databases much as they would with a search engine. However, existing techniques only support exact queries, not fuzzy queries. The Rocchio algorithm for learning classification is introduced, which is made a
Keyword extraction aims to find representative phrases for a document. Graph-based keyword extraction represents the input document as a graph and ranks its nodes according to their scores under a graph-based ranking method. In this paper, we propose a method to compute the importance of co-occurring words in a document and
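A minimal sketch of the graph-based scheme this abstract describes, in the style of TextRank: build a word co-occurrence graph over a sliding window and rank nodes with a PageRank-style update (window size, damping factor, and helper names are illustrative assumptions, not the paper's method):

```python
from collections import defaultdict

def cooccurrence_graph(tokens, window=2):
    """Undirected graph linking words that co-occur within `window` tokens."""
    adj = defaultdict(set)
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[i] != tokens[j]:
                adj[tokens[i]].add(tokens[j])
                adj[tokens[j]].add(tokens[i])
    return adj

def rank_nodes(adj, d=0.85, iters=50):
    """PageRank-style scores: each node shares its score equally
    among its neighbours, damped by factor d."""
    score = {v: 1.0 for v in adj}
    for _ in range(iters):
        score = {
            v: (1 - d) + d * sum(score[u] / len(adj[u]) for u in adj[v])
            for v in adj
        }
    return score

tokens = "keyword extraction keyword ranking".split()
scores = rank_nodes(cooccurrence_graph(tokens))
```

The highest-scoring nodes are then taken as the document's keywords; well-connected words accumulate score from many neighbours.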
social media. Discovering keyword-based correlated networks in these large graphs is an important primitive in data analysis, allowing users to focus on the information that concerns them in the large graph. In this paper, we propose and define the problem of keyword-based correlated network computation over a
Keyword queries provide fluent access to data in big databases, but they suffer from low ranking quality, that is, poor quality or prioritization of the results obtained after querying. To satisfy users, it is necessary to identify the queries that have low ranking quality. In this paper, we create a framework to
Evaluation of Deep Web data sources must be based on the data in the Web databases; how to select the most representative keywords as query words, so as to obtain a large amount of uniformly distributed data, is therefore a major difficulty. This paper proposes a Deep Web database sampling method based on high correlation
. In this paper, we propose a web-page-oriented, keyword-based approach to address this problem. Our approach includes two key components: keyword similarity measurement and keyword-similarity-based user segmentation. These two components serve as plugins and can be replaced with better algorithms or measurements
Does there exist a compact set of visual topics, in the form of keyword clusters, capable of representing all images' visual content within an acceptable error? In this paper, we answer this question by analyzing distribution laws for keywords from image descriptions and comparing them with traditional NLP techniques, thereby
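One standard way to analyze the keyword distribution laws this abstract mentions, from traditional NLP, is to fit the rank-frequency curve on a log-log scale; a slope near -1 indicates Zipf's law. A small sketch under that assumption (the fitting helper is illustrative, not the paper's method):

```python
import math
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) vs. log(rank).
    A slope close to -1 is the classic Zipfian signature."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var
```

For keyword clustering, a strongly Zipfian distribution suggests that a small head of frequent keywords can cover most image descriptions.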
. In view of the traditional feature extraction methods based on binary programs, this paper presents a method for extracting features from JAVA source code. The method uses the Keywords Correlation Distance to compute the correlation between key code elements such as API calls, Android permissions, common parameters, and the
The majority of scientific papers include keywords besides the obligatory title and abstract. Keywords serve not only to describe content; they are a vital part of the scientific paper that is later used for information retrieval. The aim of this paper is to present a comparison of
information is deficient and noisy on YouTube. In this paper, we propose a novel dual-updating method for YouTube video topic discovery. We first enhance the document representation of each video with its related videos; then we extract meaningful topics via keyword cores; finally, the video response links and the
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10 within the strategic scientific research and experimental development program:
SYNAT - “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.