To process very large graphs, existing graph processing systems, such as Pregel and Giraph, usually partition and distribute the graph computation across a large number of nodes (i.e., workers). However, due to the heterogeneity of computing clusters (e.g., nodes with varying bandwidth or CPU resources), blindly increasing the number of workers for a job may even degrade the overall performance...
The advent of social networks and the Internet of Things has resulted in an unprecedented capability for collecting, sharing and analyzing massive amounts of data. From a security perspective, Big Data may seriously weaken confidentiality, as techniques for improving Big Data analytics performance, including early fusion of heterogeneous data sources, increase the hidden redundancy of data representation,...
A Business Cloud is defined to be a collection of company datasets that are stored on the "Cloud". For simplicity, we assume that each company has only one dataset and that there are information flows among these datasets. Within such an environment, the Chinese Wall Security Policy (CWSP) is revisited. Based on the "physical" view of Brewer and Nash, the Chinese Wall policy that regulates...
Text actionability detection is the problem of classifying user-authored natural language text according to whether it can be acted upon by a responding agent. In this paper, we propose a supervised learning framework for domain-aware, large-scale actionability classification of social media messages. We derive lexicons, perform an in-depth analysis of over 25 text-based features, and explore strategies...
This paper presents the results of an ethnographic study focused on how data science projects were conducted within a global media advertising company. Observations, via embedding a researcher within the team, as well as more structured interviews and surveys, are documented. Recommendations to improve the current data science methodology within the company are also discussed. Overall, there had been...
Big Data is an emerging research topic. The term remains fuzzy and is seen as an umbrella term. Its origin, composition, possible strategies, and outcomes are uncertain. Thus, the positioning of publications addressing business administration issues related to Big Data is impeded. From a practitioner's point of view, the ability to communicate a value proposition is impeded due to the difficulty in scoping...
With the explosion of big data, companies both small and large are increasingly motivated to make data-driven decisions. For web-based companies in particular, online controlled experiments, or A/B tests, have become essential scientific tools for decision-making. Large-scale organizations like Google, Amazon, eBay, Facebook, LinkedIn, Yahoo, and Microsoft have built mature systems and support for controlled...
Viral marketing, a marketing strategy that leverages the power of influence in close relationships, has become more prevalent due to the popularity of online social networking services in recent years. Consumers are more likely to make a purchase based on social media referrals. Since marketing through social media and traditional channels may target different audiences, how to maximize the revenue...
The widespread use of social media and the internet is an emerging trend that offers an additional interaction channel for companies to better understand customer sentiments about their brands and products. Sentiment analysis uses text data from social media, such as customer comments and reviews, which is inherently high-dimensional. Without selection, typically there are at least thousands of...
Groupon is a major e-commerce company. It is unique in the sense that it is not only a vendor of goods, but also of local deals (such as restaurants, spas, activities, etc.) that reflect various aspects of a user's interests. In this sense, Groupon has a complete view of its users' lifestyle preferences. This is different from e-commerce goods vendors, who, for instance, may not have direct insight...
Cloud services are widely used across the globe to store and analyze Big Data. These days the news seems full of stories about security breaches of these services, resulting in the exposure of huge amounts of private data. This paper studies the current security threats to Cloud Services, Big Data, and Hadoop. The paper analyzes a newly proposed Big Data security system based on the EnCoRe system...
Since the social media hype started in the early 2000s, the Internet has bloomed with user-generated data. The content generated by users in social media ranges from blogs and forums to social network platforms and video-sharing communities. This data has a special emphasis on the relationships among users of the community. As a consequence, social media data contains significant information about their...
One of the most promising areas where Big Data Analytics can be integrated into business-oriented projects, allowing research and development teams to work hand in hand with industry representatives, is the digitalization of the manufacturing industry. There are two main driving forces for the interest in this area: the promotion of key strategies such as the German government's Industrie 4.0 or General Electric's...
Nowadays, companies are usually strongly interested in discovering the latent social influences among their customers since the information is highly valuable to their marketing strategies. In this paper, we study how to model the influence probabilities among the customers of a telecommunication company by analyzing their call records and mobile web browsing histories. We first construct a directed...
Building a knowledge base (KB) describing domain-specific entities is an important problem in industry, examples including KBs built over companies (e.g. Dun & Bradstreet), skills (LinkedIn, CareerBuilder) and people (inome). The task involves several engineering challenges, including devising effective procedures for data extraction, aggregation and deduplication. Data extraction involves processing...
To reduce container management costs, ocean carrier companies rent containers from container leasing companies. Two carrier companies can exchange empty containers with each other at various ports to eliminate the transportation cost of empty containers. To minimize costs, a container leasing company has to find the maximum number of pairs of carrier companies that can exchange containers...