The recent growth of interest in highly available, data-intensive applications has created a need for flexible and portable storage technologies, e.g., NoSQL databases. Unfortunately, the lack of standard interfaces and architectures for NoSQL databases makes it difficult and expensive to create portable applications, which results in vendor lock-in. Building on previous work, we aim at providing guaranteed fault-tolerant...
This article considers a conceptual model for detecting near duplicates in electronic documents. The model separates the different data types in a document (text, numerical sequences, images, diagrams and mathematical formulas) and applies special tools to their analysis in order to identify similarities between fragments of the incoming document and documents...
The random forest algorithm is a new classification and prediction model algorithm. So far, there has been little research on the problem of unbalanced data in random forest classification, and no direct and effective method exists. Building on a feature selection algorithm based on a correlation measure, the integration feature selection method helps to increase the selection probability of classification...
In this paper, we establish a joint GARCH-GED-VaR model and carry out an empirical analysis of the CSI 300 index to study early risk warning. First, we introduce several GARCH models, the GED distribution, and the VaR and CVaR risk measurement models, and establish the joint GARCH-GED-VaR model. Second, we select the CSI 300 index closing price from 2005 to 2014 as the sample, make the model fitting and parameter...
This article constructs a three-dimensional customer segmentation model based on customer lifetime value, customer satisfaction and customer activity, which divides customers into different groups more accurately. The corresponding variables are obtained from the RFM model, the Kano model and the BG/NBD model. The customer segmentation model provides ten groups of customers with corresponding marketing strategies,...
Traditional forecasting methods demand many different types of data and are difficult to train. To avoid these problems, this paper proposes two rapid prediction methods, called “One by One Comparison” and “Regression as a Whole”. With these two methods, given the playing index for the first few days of a TV drama, the total playing index accumulation...
Whether the Chinese housing market continues to prosper affects the development of China and, in turn, world finance. Forecasting the house price index is therefore both important and challenging. In this paper we propose an unsupervised learnable neuron model (DNM) by including the nonlinear interactions between excitation and inhibition on dendrites. We use...
Generating robotic grasps for given tasks is a difficult problem. This paper proposes a learning-based approach to generate suitable partial power grasps for a set of tool-using tasks. First, a number of valid partial power grasps are sampled in simulation and encoded as a probabilistic model, which encapsulates the relations among the task-specific contact, the graspable object feature and the finger...
This paper presents comparative experimental results for code-mixed data and normal text. We first identify the languages present in social media text. In the case of code-mixed data, existing language detectors fail to detect the language at the word level because users write their own language in Roman script. We therefore bootstrap the language identification step and calculate the Code Mixed Index...
Bayesian nonparametric (BNP) models have recently become popular due to their flexibility in identifying the unknown number of clusters. However, they have difficulties handling heterogeneous data from multiple sources. Existing BNP methods either treat each of these sources independently - and hence do not benefit from the correlated information between them - or require explicitly specifying data...
Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited research in the clustering of point patterns - sets or multi-sets of unordered elements - that are found in numerous applications and data sources. In this paper, we...
This paper presents a novel multi-task learning framework for the accurate prediction of spatio-temporal data at multiple locations. The framework encodes the data as a third-order tensor and performs supervised tensor decomposition to identify the latent factors that capture the inherent spatiotemporal variabilities of the data and their relationship to the target variable of interest. The framework...
In this paper the adaptive mesh model is studied. First, we introduce the existing 3 subdivision method. Second, in order to make the simulation result more realistic and reasonable, we extend the traditional scheme and apply two different subdivision schemes on the triangular mesh. Finally, a new mesh coarsening method is proposed; with the help of this method, we build an extended adaptive...
Chronic obstructive pulmonary disease (COPD) accounts for the highest rate of hospital readmissions and is the third leading cause of death in Canada, the United States and worldwide. Predicting COPD failure provides a prognostic warning of death or readmission, and is crucial to early intervention and decision-making. The aim of this study is to perform COPD failure prediction on longitudinal data...
A relational table over a set of attributes can be mapped onto a multi-dimensional array and stored as such. Such a conceptual view of relations lends itself to easy formulations of numerous analytical algorithms. This is the view taken in the representation of relations in data-warehousing to support On-Line Analytical Processing (OLAP). The main drawback of such a storage scheme is that the equivalent...
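The mapping described in the first sentences of the abstract above can be illustrated with a minimal sketch. It is not the paper's storage scheme: the function name `relation_to_array`, the sample attributes (`product`, `region`, `sales`) and the row-major layout are all assumptions chosen for illustration.

```python
def relation_to_array(rows, dims, measure):
    """Map a relation (a list of dicts) onto a dense multi-dimensional array.

    dims:    attribute names used as array dimensions (hypothetical names)
    measure: attribute whose value is stored in each cell
    Returns (domains, shape, cells); cells is a flat list in row-major order.
    """
    # The sorted distinct values of each dimension attribute form its axis.
    domains = [sorted({r[d] for r in rows}) for d in dims]
    index = [{v: i for i, v in enumerate(dom)} for dom in domains]
    shape = [len(dom) for dom in domains]

    size = 1
    for n in shape:
        size *= n
    # None marks an attribute combination that does not occur in the relation.
    cells = [None] * size

    for r in rows:
        offset = 0
        for axis, d in enumerate(dims):
            # Row-major linearization of the coordinate tuple.
            offset = offset * shape[axis] + index[axis][r[d]]
        cells[offset] = r[measure]
    return domains, shape, cells


rows = [
    {"product": "pen", "region": "east", "sales": 10},
    {"product": "pen", "region": "west", "sales": 7},
    {"product": "ink", "region": "east", "sales": 3},
]
domains, shape, cells = relation_to_array(rows, ["product", "region"], "sales")
```

In this toy relation, one of the four cells stays empty because the combination ("ink", "west") has no tuple; in a dense layout such empty cells still occupy storage.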
In order to evaluate the energy production of a solar system, the tilted global radiation is needed. Generally, only global horizontal radiation data are available. To calculate the tilted global radiation, it is necessary to estimate the diffuse or the direct component of the horizontal solar radiation. In this article, a statistical procedure has been employed to develop correlations between the...
The usage and improvement of information and communication technologies to enhance public sector services (e-Government) was recognized as an important task for the majority of governments in developed countries. Several countries are working hard to improve their e-Government ranking to support their sustainable development. This study employed several data mining techniques to build models that...
In this paper we derive a clustering method based on the Hidden Conditional Random Field (HCRF) model in order to maximize the performance of a wireless sensor. Our novel approach to clustering lies in the application of an index invariant graph that we defined in a previous work and that precisely links a hyper-tree structure to the data set assumptions. We show that a set of conditional...
Partitioned Global Address Space (PGAS) parallel programming models can provide an efficient mechanism for managing shared data stored across multiple nodes in a distributed memory system. However, these models are traditionally directly addressed and, for applications with loosely-structured or sparse data, determining the location of a given data element within a PGAS can incur significant overheads...
A rising star is an individual who shows the potential to become a star in the near future. We investigate the problem of finding rising stars when heterogeneous data sources are available to define the same person. The proposed solution examines multiple data sources to determine how the importance of an individual improves over time. Scores from different data sources are combined using a multi-objective...