With more and more enterprises and organizations outsourcing their IT services to distributed clouds for cost savings, the historical and operational data generated by these services grows exponentially and is usually stored in data centers at different geographic locations in the distributed cloud. Such data, referred to as big data, has become an invaluable asset to many businesses and organizations,...
Hadoop, a popular open-source implementation of MapReduce, is widely used for large-scale data-intensive applications such as data mining, web indexing, and scientific computing. The current Hadoop implementation assumes that the nodes in a cluster are homogeneous, and the Hadoop Distributed File System (HDFS) distributes data to multiple nodes based on disk space availability. Such a data placement strategy...
Info-Kmeans, a K-means clustering method employing KL-divergence as the proximity function, is one of the representative methods in information-theoretic clustering. With the explosive growth of online texts such as online reviews and user-generated content, text data is becoming sparser and much larger, which poses significant challenges to both the effectiveness and the efficiency of text clustering...
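As an illustration of using KL-divergence as the proximity function in a K-means assignment step, here is a minimal Python sketch. It is not the paper's implementation. The function names and the `eps` floor for zero centroid entries (one simple way to keep the divergence finite on sparse data) are assumptions made for this example.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    Terms with p_i == 0 contribute nothing. The eps floor on q_i is an
    assumption of this sketch, used to avoid division by zero when a
    centroid has zero mass on a term that the document contains.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def assign_to_nearest(doc, centroids):
    """Return the index of the centroid with the smallest D(doc || centroid)."""
    return min(range(len(centroids)), key=lambda k: kl_divergence(doc, centroids[k]))

# Hypothetical normalized term-frequency vectors, for illustration only.
doc = [0.9, 0.1]
centroids = [[0.5, 0.5], [0.8, 0.2]]
cluster = assign_to_nearest(doc, centroids)
```

Here `doc` and each centroid are assumed to be probability distributions over the same vocabulary, so every vector sums to 1.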
The correlation analysis of telemetry data plays a significant role in satellite performance analysis. However, existing methods cannot be applied effectively because the telemetry data is large and high-dimensional. In this paper, an efficient algorithm named QARC Apriori is proposed. First, to eliminate redundant attributes and reduce the problem complexity, the grey relational analysis method is applied...
This work is the first to use big data on American depository receipts (ADRs) to conduct economic analysis. Foreign tax liability is the minimum taxation imposed on ADR dividend income. Identical foreign tax rates enable ADR investors to engage in transactions during ex-dividend days. This study is the first to use 5,424 ex-dividend events to simulate the impact of tax on the ADR...
Graph algorithms are an important field of big data research. The Bayesian Network (BN) is a very powerful graph model for causal relationship modeling and probabilistic reasoning. One key step in building a BN is discovering its structure, a directed acyclic graph (DAG). In the literature, numerous Bayesian network structure learning algorithms have been proposed to discover BN structure...
Data-intensive services have become one of the most challenging applications in cloud computing. The classical service composition problem faces new challenges as the services and their corresponding data grow. A typical environment is the large-scale scientific project AMS, in which we process huge amounts of streaming data. In this paper, we address the service composition problem by considering...
Mining abnormal patterns is important in many areas. To ensure efficiency as big data becomes prevalent, an algorithm named PPSpan (JOMP-based parallel Prefix Span) is proposed, building on traditional serial sequential pattern mining methods. First, redundant parameters are eliminated with grey correlation analysis. Second, outlier information is extracted according to the...
Market mechanisms constitute an efficient scheme for allocating cloud-based computing resources in the form of virtual machines. However, most existing mechanisms use a fixed-price model and ignore flexible pricing models for cloud providers. In this paper, we formulate the problem of virtual machine allocation in clouds as a combinatorial auction problem and propose a mechanism...
There is increasing interest in providing cloud services in a more energy-efficient way. The growing deployment of large-scale, complex workflow applications onto cloud computing hosts faces crucial challenges in reducing power consumption without violating the service level agreement (SLA). In this paper, we consider cloud hosts which can operate in different power states...