In Data Mining, classification plays a prominent role in predicting outcomes. One of the best supervised classification techniques in Data Mining is Naive Bayes Classification. Naive Bayes Classification is good at predicting outcomes and often outperforms other classification techniques. One of the reasons behind the strong performance of Naive Bayes Classification is the assumption of conditional...
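The abstract is cut off before it describes its own method, so the following is only a minimal sketch of a generic categorical Naive Bayes classifier, not the paper's algorithm. It shows how the conditional independence assumption lets the class score factor into a prior times one term per feature. The function names `train_naive_bayes` and `predict`, the Laplace smoothing parameter `alpha`, and the toy data in the usage note are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Collect the counts needed for a categorical Naive Bayes model."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, label)][value] -> number of occurrences
    feature_counts = defaultdict(Counter)
    # values[feature_index] -> set of distinct values seen for that feature
    values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(i, label)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values


def predict(model, row, alpha=1.0):
    """Return the label with the highest log-posterior score.

    Conditional independence means the likelihood is a product of
    per-feature terms, which becomes a sum in log space.
    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, v in enumerate(row):
            count = feature_counts[(i, label)][v]
            score += math.log((count + alpha) / (n + alpha * len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

For example, training on the four hypothetical rows `("sunny", "hot")`, `("sunny", "mild")`, `("rainy", "mild")`, `("rainy", "cool")` with labels `"no"`, `"no"`, `"yes"`, `"yes"` and then calling `predict(model, ("sunny", "hot"))` selects `"no"`.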
This paper presents a completely automatic processing chain for orthorectification of optical pushbroom sensors. The procedure is robust and works without manual intervention from raw satellite image to orthoimage. It is modularly divided into four main steps: metadata extraction, automatic ground control point (GCP) extraction, geometric modeling, and orthorectification. The GCP extraction step uses...
With the rapid development of e-commerce, more online reviews of products and services are created, which form an important source of information for both sellers and customers. Research on sentiment and opinion mining for online review analysis has attracted increasing attention because such study helps leverage information from online reviews for potential economic impact. The paper discusses...
Internet Relay Chat (IRC) is a tool commonly used by Open Source developers. Developers use IRC channels to discuss programming-related problems, but much of the discussion is irrelevant and off-topic. Essentially, if we treat IRC discussions like email messages and apply spam filtering, we can try to filter out the spam (the off-topic discussions) from the ham (the programming discussions). Yet we...
In big data universities, an understanding of how individual learning styles and preferences interact with the instructional medium presented is needed. In this study we examined the VARK learning style inventory using the variable-centered, person-centered and social approaches. We worked on a big data set that encompasses two data sources: the first was the LMS, while the second was social media...
The widespread adoption of ubiquitous devices not only facilitates the connection of billions of people, but has also fuelled a culture of sharing rich, high-resolution locations through check-ins. Despite the profusion of GPS- and WiFi-driven location prediction techniques, the sparse and random nature of check-in data generation has ushered in diverse problems, which have prompted the prediction...
Most existing topic models focus either on extracting static topic-sentiment conjunctions or topic-wise evolution over time leaving out topic-sentiment dynamics and missing the opportunity to provide a more in-depth analysis of textual data. In this paper, we propose an LDA-based topic model for analyzing topic-sentiment evolution over time by modeling time jointly with topics and sentiments. We derive...
Various lower and upper approximations defined by the tolerance classes of objects and maximal tolerance classes are compared in the tolerance rough set models. Based on the study, some suggestions for choosing the suitable lower and upper approximations are given for the purpose of improving approximation accuracy.
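The abstract compares several definitions without stating them, so the sketch below illustrates only one common pair, given here as an assumption: the lower approximation collects objects whose tolerance class lies entirely inside the target set, and the upper approximation collects objects whose tolerance class intersects it. The names `tolerance_class`, `approximations`, and `accuracy`, and the numeric tolerance relation in the usage note, are hypothetical.

```python
def tolerance_class(x, universe, tolerant):
    """All objects y in the universe that are tolerant with x."""
    return {y for y in universe if tolerant(x, y)}


def approximations(target, universe, tolerant):
    """Lower and upper approximations of `target` from tolerance classes."""
    lower, upper = set(), set()
    for x in universe:
        cls = tolerance_class(x, universe, tolerant)
        if cls <= target:   # class fully contained in the target set
            lower.add(x)
        if cls & target:    # class overlaps the target set
            upper.add(x)
    return lower, upper


def accuracy(lower, upper):
    """Approximation accuracy |lower| / |upper| (defined as 1.0 if upper is empty)."""
    return len(lower) / len(upper) if upper else 1.0
```

With the universe `{1, 2, 3, 5, 6, 9}`, the relation `abs(a - b) <= 1`, and the target `{1, 2, 3, 5}`, the lower approximation is `{1, 2, 3}`, the upper approximation is `{1, 2, 3, 5, 6}`, and the accuracy is 0.6. Object 5 drops out of the lower approximation because its tolerance class also contains 6.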
The goal of the paper is to predict student retention by using linear discriminant analysis with bootstrapping. The result (93%) provides accuracy superior to the bootstrapping of a comparative method, as well as to the non-bootstrapping variations. In order to perform discriminant analysis, we linearize a fractional programming method by using Charnes-Cooper transformation and apply linear programming,...
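The abstract does not give the paper's specific fractional program, so the following shows only the generic form of the Charnes-Cooper transformation. All symbols ($c$, $d$, $\alpha$, $\beta$, $A$, $b$) are placeholders, and the denominator is assumed to be strictly positive on the feasible region.

```latex
% Linear-fractional program (generic form, symbols are placeholders)
\max_{x} \; \frac{c^{\top} x + \alpha}{d^{\top} x + \beta}
\quad \text{s.t.} \quad A x \le b, \; x \ge 0,
\qquad d^{\top} x + \beta > 0 .

% Substitution
t = \frac{1}{d^{\top} x + \beta}, \qquad y = t\,x .

% Equivalent linear program
\max_{y,\,t} \; c^{\top} y + \alpha t
\quad \text{s.t.} \quad A y - b\,t \le 0, \;\;
d^{\top} y + \beta t = 1, \;\; y \ge 0, \;\; t \ge 0 .
```

When the optimal $t$ is positive, a solution of the original problem is recovered as $x = y / t$.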
Reading news is one of the most important learning activities in communication school education. Learners improve their awareness and essay-writing ability by observing the contents of the news. How to create learning materials in a short time remains a gap between traditional communication education and information technology. In addition, the greatest challenge is that there...
Most proteins express their functions by binding with other proteins or molecular compounds called ligands. The local portion involved in binding is called a binding site. The characteristics of the binding site often determine the function of the protein, so clarifying the location of the binding site of the protein helps analyze the function of proteins. Binding sites that bind to similar ligands...
Hepatitis C virus patients with genotypes 1 and 4 have break-even response rates to Pegylated-Interferon (Peg-IFN) and Ribavirin (RBV) treatment. Furthermore, noncompliance with the treatment because of its high cost and related unfavorable effects makes predicting the response of paramount importance. By using machine-learning techniques, a significantly accurate predictive model was constructed to predict...
In this paper we propose the optimization of the Rough Set method using ant colony for oil-impregnated paper bushings. Ant colony is used to discretize the training data set. The ant colony optimized rough set is compared to a rough set whose data is discretized using equal frequency bin (EFB). Ant colony optimized (ACO) rough set results show an improvement compared to the EFB. The ACO rough set has an...
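The EFB baseline mentioned above can be sketched as follows. This is a minimal illustration of equal frequency binning in general, not the paper's implementation: the function names `equal_frequency_cuts` and `discretize`, and the choice of placing each cut at the midpoint between neighbouring sorted values, are assumptions.

```python
import bisect


def equal_frequency_cuts(values, n_bins):
    """Cut points that split the sorted values into bins of roughly equal size.

    Each cut is placed midway between the two neighbouring sorted values
    (a choice made for this sketch). Heavily tied data can yield repeated cuts.
    """
    if n_bins < 1 or len(values) < n_bins:
        raise ValueError("need at least one value per bin")
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for k in range(1, n_bins):
        idx = k * n // n_bins  # index of the first value in bin k
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


def discretize(value, cuts):
    """Map a continuous value to its bin index (0 .. len(cuts))."""
    return bisect.bisect_right(cuts, value)
```

For the values 1 through 9 and three bins, the cuts fall at 3.5 and 6.5, so each bin receives three values.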
Data Mining is concerned with the extraction of interesting patterns or knowledge from huge amounts of data. Generally, data mining tasks are either predictive or descriptive. Classification falls under predictive induction while clustering and association rule mining fall under descriptive induction. Subgroup discovery is a task at the intersection of supervised learning and descriptive induction. In...
Whether a node split is good or bad depends on the method used to measure impurity. We propose a new decision tree feature selection strategy based on maximum similarity, called fsms. First, the dataset is split into subsets according to each attribute value and the sum of the average similarity of each subset is calculated; then the attribute with the maximum similarity is selected as the best splitting attribute...
This paper presents a model of a monitoring method for videos and images. In the case of a fixed camera, when a moving object appears, all data indicating object movement will change greatly. The real moving target is detected by processing the data that show the larger change. The accuracy and instantaneity of the method will be tested by MATLAB simulation.
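The abstract does not say how the change is measured. One common baseline for a fixed camera is frame differencing, sketched below as an assumption rather than as the paper's method (the paper also reports MATLAB, not Python). Frames are represented as lists of rows of grayscale intensities; the names `changed_pixels` and `bounding_box` and the `threshold` parameter are hypothetical.

```python
def changed_pixels(prev_frame, frame, threshold):
    """Boolean mask marking pixels whose intensity changed by more than threshold."""
    return [
        [abs(a - b) > threshold for a, b in zip(prev_row, row)]
        for prev_row, row in zip(prev_frame, frame)
    ]


def bounding_box(mask):
    """Smallest box (top, left, bottom, right) covering all changed pixels.

    Returns None when nothing changed, i.e. no moving object was detected.
    """
    coords = [
        (r, c)
        for r, row in enumerate(mask)
        for c, flag in enumerate(row)
        if flag
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))
```

Comparing an all-zero 4x4 frame with a frame that has bright pixels at (1, 1), (1, 2), and (2, 2) yields the box `(1, 1, 2, 2)` for any threshold below the brightness difference.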
Attracting more students into science and engineering disciplines has concerned many researchers for decades. The literature has used traditional statistical methods and qualitative techniques to identify the factors that most affect student retention and to predict their persistence. In this paper we developed two neural network models using a feed-forward backpropagation network to predict retention for students...
Recommendation systems have been investigated and implemented in many respects. In particular, for collaborative filtering systems, a more important issue is how to manipulate the personalized recommendation results for better user understandability and satisfaction. A collaborative filtering system predicts items of interest for users based on predictive relationships discovered between the item and...
This paper analyzes the effect of penetration rate on the estimation error in mobile-phone-based traffic state estimation systems. More concretely, the error tolerance is analyzed based upon the penetration rate of participating mobile phones. In addition, a hybrid model is introduced in which not only real-time data but also historical data are utilized under a suitable data mining technique. This...
When information sources are unreliable, information networks have been used in data mining literature to uncover facts from large numbers of complex relations between noisy variables. The approach relies on topology analysis of graphs, where nodes represent pieces of (unreliable) information and links represent abstract relations. Such topology analysis was often empirically shown to be quite powerful...