I am very proud and honored to have been entrusted to edit the proceedings of the 2016 International Conference on Data Mining and Advanced Computing (SAPIENCE). The vision of this conference was to provide an international forum for the presentation of research in the areas of Data Mining and Advanced Computing.
In this paper, we discuss data mining and its application in the Higher Secondary Directorate of Kerala. The data mining process has a set of functionalities, among which classification has wide application in real-world data processing. We examine the Naïve Bayes classification technique. In the third section, we explain the Naïve Bayes theorem using an experiment. This experiment covers attributes like School...
In this model, we propose an innovative recruitment system using social networking websites like Twitter and LinkedIn, along with the code repository hosting website GitHub and competitive coding platforms like SPOJ. It aims to develop advanced search engines that automatically sort job-seekers based on job offer requirements using various data mining and machine learning techniques. Vritthi allows...
Meetings are an important communication and coordination activity of teams: status is discussed, new decisions are made, alternatives are considered, details are explained, information is presented, and new ideas are generated. As such, meetings contain a large amount of rich project information that is often not formally documented. Capturing all of this informal meeting information has been a topic...
Detecting unwanted, unsolicited mail, called spam, in email is an interesting area of research. Researchers, with the help of machine learning algorithms, normally look for the best classifier that distinguishes spam from benign mail, called ham. It is necessary to evaluate the performance of any new spam classifier using standard data sets. The public corpora of email data sets that are available...
Outliers are data points that deviate significantly from the remaining data. Outlier detection has emerging applications in identifying irregular credit card transactions, which is used to find credit card fraud, and in identifying patients who show abnormal symptoms due to a particular type of disease. This paper gives an idea about the various approaches and techniques used in outlier detection and the areas in which...
Document classification can be defined as the task of automatically categorizing collections of electronic documents into their annotated classes, based on their contents. It is an important problem in data mining. Due to the exponential growth of documents on the Internet and the emergent need to organize them, developing an efficient document classification method to automatically manipulate web...
Recommendation systems (RS) provide suggestions for items that are useful to a user. Initially, research on RS focused mainly on improving the accuracy of the system; however, improving accuracy alone does not improve user satisfaction. Recently, it has been identified that diversity is an important dimension for evaluating a recommendation system. Users find a diversified set of recommendations more interesting...
Business process mining, or process mining, is the intersection between data mining and business process modelling that extracts business patterns from event logs. Event logs are freely available in any organization. Business logs are a potential source of useful information. From the various patterns that are present in the logs, a lot can be inferred about the type of procedures that should be incorporated...
The online recommendation system has become a trend. Nowadays, people shop online rather than going out and buying items for themselves, because online recommendation provides an easier and quicker way to buy items, and transactions are also quick when done online. Recommendation systems are a powerful new technology that helps users find the items they want to buy. A recommendation system is broadly used...
Hierarchical clustering is of enormous importance in data analytics, especially because of the exponential growth of real-world data. Frequently these data are unlabelled and little prior domain knowledge is available. In this work, the plan is to improve efficiency by introducing a set of methods, applied to synthetic and real data, based on agglomerative hierarchical clustering followed by k-means...
Single Nucleotide Polymorphisms (SNPs) are the most common form of genetic variation in humans, comprising nearly 1/1,000th of the average human genome. The intelligent analysis of databases may be affected by the presence of unimportant features, which motivates the application of feature selection. In this work, we propose a genetic-algorithm-based feature selection method. A genetic algorithm (GA) is a search...
Sentiment analysis is the computational study of opinions, sentiments, evaluations, attitudes, views and emotions expressed in text. It refers to a classification problem where the main focus is to predict the polarity of words and then classify them into positive or negative sentiment. Sentiment analysis over Twitter offers people a fast and effective way to measure the public's feelings towards...
In this work, the popular classification technique k-Nearest Neighbour (kNN) is integrated with Ant Colony Optimization (ACO) to predict the likelihood of heart disease, and the effectiveness of the combination is examined. The analysis has been performed in two phases. In the first phase, kNN classification is used to classify the test data. In the second phase, ACO is used to initialize the population...
An agent-based library recommender system is proposed with the objective of providing effective and intelligent use of library resources, such as finding the right books and relevant research journal papers and articles. The architecture consists of a profile agent and a library recommender agent. The main task of the library recommender agent is filtering and providing recommendations. Library resources include book...
In this competitive world, business is becoming highly saturated. The field of telecommunication in particular faces complex challenges due to a number of vibrant, competitive service providers. Therefore, it has become very difficult for them to retain existing customers. Since the cost of acquiring new customers is much higher than the cost of retaining the existing customers, it is time for the...
Data mining plays an important role in the business world, and it helps educational institutions predict and make decisions related to students' academic status. In higher education, student dropout has been increasing; it affects not only the students' careers but also the reputation of the institute. The existing system is a system which maintains the student...