A large volume of text data is regularly published in social networks and the media. Processing and analysis of such information is in high demand. This paper focuses on how to use the entropy measure in classification when dealing with large volumes of text data. The entropy measure serves as the algorithm's quality criterion when defining a class in a data set. The work also features...
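As a rough illustration of how entropy can act as a quality criterion for a class assignment, the sketch below computes the Shannon entropy of the labels that fall into one group: 0 bits means the group is pure, and higher values mean the classes are more mixed. This is a generic textbook formulation, not the paper's method, which is not visible in this excerpt; the function name `shannon_entropy` and the sample labels are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Return the Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0  # convention: an empty group carries no uncertainty
    counts = Counter(labels)
    # H = -sum(p * log2(p)) over the observed label frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical label groups: one evenly mixed, one pure
mixed_group = ["sports", "sports", "politics", "politics"]
pure_group = ["sports", "sports", "sports", "sports"]

print(shannon_entropy(mixed_group))
print(shannon_entropy(pure_group))
```

A classifier or clustering step that produces groups with lower label entropy separates the classes more cleanly under this criterion.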
The idea behind this tool is to make a generic classifier that can be used to diagnose patients suffering from brain disorders. We have created a tool that uses machine learning algorithms from Weka (Java), Caret (R) and Scikit Learn (Python) and combines the three packages into one R package which provides the functionality of classifying the patients suffering from brain disorders...
In the evolving technology of big data, high-velocity data streams play a vital role, since data patterns change over time. The temporal pattern change in a data stream leads to a concept evolution called concept drift, in which the statistical properties of the data differ from time to time; the drift is taken into account in order to update an old, outdated classifier and make it adaptable to new...
A base classification system exists for classifying problem tickets in the Enterprise domain. Different deep learning algorithms (Gated Recurrent Unit and Long Short-Term Memory) were investigated for solving the classification problem. Experiments were conducted with different parameters and layers for these algorithms. The paper presents the architectures tried, the results obtained, our conclusions...
Wafer defects, which are primarily defective chips on a wafer, are among the key challenges facing semiconductor manufacturing companies, as they can increase yield losses to hundreds of millions of dollars. Fortunately, these wafer defects leave unique patterns due to their spatial dependence across wafer maps. It is thus possible to identify and predict them in order to find the point of...
Credit scoring profiles client relationships through empirical attributes (variables) and leverages a scoring model to assess a client's credibility. However, empirical attributes often contain a certain degree of uncertainty and require feature selection. A Bayesian network (BN) is an important tool for dealing with uncertain problems and information. Mutual information (MI) measures dependencies between...
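To make the role of mutual information in feature selection more concrete, here is a minimal sketch that estimates MI between two discrete variables from their empirical frequencies. It uses the standard definition I(X;Y) = sum over (x, y) of p(x,y) log2(p(x,y) / (p(x) p(y))); how the paper itself combines MI with the Bayesian network is not visible in this excerpt, and the function name and the toy attribute values are assumptions for illustration only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate mutual information (in bits) between two discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x,y) / (p(x) * p(y)) expressed with raw counts
        ratio = c * n / (px[x] * py[y])
        mi += p_xy * math.log2(ratio)
    return mi


# Hypothetical attribute and class label, perfectly dependent
attribute = [0, 0, 1, 1]
label = ["bad", "bad", "good", "good"]
print(mutual_information(attribute, label))
```

An attribute with MI near zero tells the model little about the class label, so a simple selection rule would rank attributes by this score and keep the highest-ranked ones.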
To help telecommunications operators accurately predict terminal replacement behavior, and to improve the success rate of marketing and the accuracy of resource allocation, large volumes of user consumption data are used to build a Deep Belief Network. The deep features that characterize terminal replacement behavior are learned, and from them a terminal replacement prediction model is constructed. Experiments...
Prediction models based on unsupervised learning are fast and do not need labeled data. However, analysis for prediction is quite difficult, since no information about the data is available for learning. This paper proposes a prediction model based on Big Data analysis using a hybrid FCM clustering algorithm to address these problems. The proposed model conducts automatic classification...
To construct a high-performance ensemble classifier, the base classifiers it contains must have high classification precision, and their classification errors must be independent of each other. In practice, it is very difficult to choose base classifiers that satisfy both conditions. Rough reduction is the core of Rough Set theory. Each...
The Quality of Experience (QoE) is an irreplaceable metric for evaluating the quality of multimedia content as perceived by consumers. Due to the subjectiveness of QoE, the most suitable way to measure it is by conducting subjective studies. However, conducting subjective studies is a complex and expensive process. Careful recreation of the viewing conditions is necessary, and a strict selection of the...
In this paper we present a comparative analysis of the predictive power of two different sets of metrics for defect prediction. We choose one set of product-related and one set of process-related software metrics and use them to classify Java files of the Eclipse project as defective or defect-free. Classification models are built using three common machine learners: logistic regression,...