The authors investigated censorship practices and the use of microblogs—or weibos, in Chinese—using 111 million microblogs collected between 1 January and 30 June 2012. To better control for alternative explanations for censorship decisions attributable to an individual's characteristics and choices, they used a matched case-control study design to determine a list of Chinese...
Distributed NoSQL systems aim to provide high availability for large volumes of data but lack the inherent support of complex queries often required by overlying applications. Common solutions based on inverted lists for single terms perform poorly in large-scale distributed settings. The authors thus propose a multiterm indexing technique that can store the inverted lists of combinations of terms...
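As a rough illustration of the idea of storing inverted lists for combinations of terms, the following is a minimal single-machine sketch. It is not the authors' technique: the abstract is truncated, so which combinations they store and how they distribute them is unknown. The function names `build_multiterm_index` and `lookup`, the `max_terms` parameter, and the whitespace tokenization are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_multiterm_index(docs, max_terms=2):
    """Map each sorted term combination (up to max_terms terms) to document ids.

    Hypothetical helper for illustration only; docs maps a document id to text.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Deduplicate and sort so that every combination has one canonical key.
        terms = sorted(set(text.lower().split()))
        for k in range(1, max_terms + 1):
            for combo in combinations(terms, k):
                index[combo].add(doc_id)
    return index


def lookup(index, query_terms):
    """Answer a conjunctive query with one key lookup instead of a list intersection."""
    key = tuple(sorted({t.lower() for t in query_terms}))
    return index.get(key, set())


docs = {
    1: "distributed nosql index",
    2: "nosql query index",
    3: "distributed query",
}
index = build_multiterm_index(docs)
```

Here `lookup(index, ["index", "nosql"])` reads a single stored list rather than intersecting two single-term lists. The trade-off is index size, since the number of stored combinations grows quickly with `max_terms`.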
, keywords are extracted from each advertisement, and the keywords most correlated with each category are identified through Term Frequency and Inverse Document Frequency (TF-IDF) analysis. Thus, the ontology of the advertisement world is built. Normalized Google Distance (NGD) values between keywords are computed to derive the
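For reference, the Normalized Google Distance between two keywords can be computed from hit counts using its standard definition. The sketch below assumes the counts are already available; the function name `ngd`, its parameter names, and the convention of returning infinity when a count is zero are assumptions for this example, not details taken from the abstract.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from hit counts.

    fx, fy: number of pages containing keyword x and keyword y respectively
    fxy:    number of pages containing both keywords
    n:      total number of indexed pages (must exceed fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Assumed convention: keywords that never co-occur are infinitely far apart.
        return math.inf
    log_x, log_y, log_xy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_x, log_y) - log_xy) / (math.log(n) - min(log_x, log_y))
```

Keywords that always appear together get a distance of 0, and the value grows as they co-occur less often relative to their individual frequencies.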
Customer Edge Switching (CES) is an experimental Internet architecture that provides reliable and resilient multi-domain communications. It provides resilience against security threats because domains negotiate inbound and outbound policies before admitting new traffic. As CES and its signalling protocols are being prototyped, there is a need for independent testing of the CES architecture. Hence,...
In the real world, sensitive information often has to be stored in a server's databases in a group setting. Although personal information does not need to be stored on a server, the secret information shared by group members is likely to be stored there. The shared sensitive information requires more security and privacy protection. To the best of our knowledge, there is no paper which deals with...
In this work we aim to capitalize on the availability of Internet image search engines to automatically create image training sets from user provided queries. This problem is particularly difficult due to the low precision of image search results. Unlike many existing dataset gathering approaches, we do not assume a category model based on a small subset of the noisy data or an ad-hoc validation set...
party servers. In this paper, we discuss the authenticated search results of some recent works and then present an improved scheme that ensures the authenticity of the search results corresponding to a search query over the Internet. The improved scheme is based on the scheme that uses the concept of conjunctive keyword
propose a "Hybrid Search Engine Framework for the Internet of Things based on Spatial-Temporal, Value-based, and Keyword-based Conditions" ("IoT-SVK Search Engine" for short). The experimental results
scheme using interesting keywords and manage the content table via an overlay network. The content providers with high frequency are chosen in the network. Then an overlay network is constructed using these content providers. Via the overlay network, it is possible to reduce the number of message transmissions and the overhead to manage content
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to access the pages from certain Web sites. These pages are often referred to as the hidden Web or the deep Web. Since there are no static links
Given a set of keywords, we find a maximum Web query (containing the most keywords possible) that respects user-defined bounds on the number of returned hits. We assume a real-world setting where the user is not given direct access to a Web search engine's index, i.e., querying is possible only through an interface
Search-based advertising has become very popular since it provides advertisers the ability to attract potential customers with measurable returns. In this type of advertising, advertisers bid on keywords to have an impact on their ad’s placement, which in turn affects the response from potential customers. An
The Language Model (LM) is one of the key components in Keyword Spotting (KWS). The rapid development of the World Wide Web (WWW) makes it an extremely large and valuable data source for LM training, but it is not optimal to use the raw transcripts from the WWW due to the mismatch of content between the web corpus
Search engines award their advertising space through keyword auctions. Some bidders may adopt an aggressive bidding strategy known as Competitor Busting, where they submit higher bids than strictly needed to win the auction so as to oust the other bidders. Despite the widespread concern about this practice, we
result shows that their proposal seems unlikely to be implementable with the latest technology, due to the large computational cost involved. Note that this is the first analysis and examination of the practicality of this public key encryption based keyword search protocol using PIR.
issued to the databases also contain spatial and textual components, for example, "Find shelters with emergency medical facilities in Orange County," or "Find earthquake-prone zones in Southern California." We refer to such queries as spatial-keyword queries or SK queries for short. In recent times, a lot of interest has
A new method, comparative keyword analysis, is used to compare the language of men and women with cancer in 97 research interviews and two popular internet-based support groups for people with cancer. The method is suited to the conjoint qualitative and quantitative analysis of differences between large bodies of text
We propose a novel method for mining knowledge from linked Web pages. Unlike most conventional methods for extracting knowledge from linked data, which are based on graph theory, the proposed method is based on our associated keyword space (ASKS), which is a nonlinear version of linear multidimensional scaling (MDS
Financed by the National Centre for Research and Development under grant No. SP/I/1/77065/10 by the strategic scientific research and experimental development program:
SYNAT - “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information”.