Inverse Reinforcement Learning (IRL) is an approach for domain-reward discovery from demonstration, where an agent mines the reward function of a Markov decision process by observing an expert acting in the domain. In the standard setting, it is assumed that the expert acts (nearly) optimally, and that a large number of trajectories, i.e., training examples, are available for reward discovery (and consequently,...
This paper discusses a topic-specific intelligent Web crawler based on Web content and structure mining. The method takes advantage of the characteristics of neural networks and introduces reinforcement learning to find the relevance between the crawled Web pages and the topic. When calculating the correlation, we select only the important HTML markup tags of the Web page to analyze the...
The biomedical community makes wide use of text mining technology. Named Entity (NE) extraction is one of the most fundamental and significant tasks in biomedical information extraction. Named Entity Recognition (NER) involves processing structured and unstructured documents to recognize specific kinds of entities and categorize them into predefined classes. Several...
Early diagnosis is an important aspect of successful treatment for breast cancer. Mammography is the most reliable imaging technique available. Detecting abnormalities in mammograms is a challenging task for radiologists, and computing helps them diagnose those abnormalities. A Computer Aided Diagnosis system involves computerized biomedical image analysis to classify...
The internet is rich in directional text (i.e., text containing opinions and emotions). The World Wide Web provides volumes of text-based data about consumer preferences, stored in online review websites, web forums, blogs, etc. Sentiment analysis, a technique for classifying people's opinions in product reviews, blogs or social networks, has emerged as a method for mining opinions from such text archives...
When multicollinearity exists among the explanatory variables of a regression model, it often complicates social post-evaluation. Ridge regression has advantages over the least squares (LS) method. The support vector machine (SVM) is a novel machine learning tool in data mining. It is based on the structural risk minimization (SRM) principle, which has been shown to be superior to the traditional empirical...
Discriminative subgraphs can be used to characterize complex graphs, construct graph classifiers and generate graph indices. The search space for discriminative subgraphs is usually prohibitively large. Most measurements of interestingness of discriminative subgraphs are neither monotonic nor antimonotonic with respect to subgraph frequencies. Therefore, branch-and-bound algorithms are unable to mine...
In recent years, one mode of data dissemination has become extremely popular: the deep web. A key characteristic of deep web data sources is that data can only be accessed through the limited query interface they support. This paper develops a methodology for mining the deep web. Because these data sources cannot be accessed directly, data mining must be performed based on sampling...
Effectively utilizing readily available auxiliary data to improve predictive performance on new modeling tasks is a key problem in data mining. In this research the goal is to transfer knowledge between sources of data, particularly when ground truth information for the new modeling task is scarce or expensive to collect, so that leveraging auxiliary sources of data becomes a necessity. Towards...
Anomaly detection in data streams is the problem of extracting subsequences that do not match an expected behavior. Its importance originates from its applicability in many fields, such as system health monitoring, event detection in sensor networks, and detecting ecosystem disturbances. In detecting anomalous subsequences from data streams, the main challenge for the existing techniques is...
Prolonging the ability of older adults to live at home independently has become an important area of smart environment research. In this proposal, we demonstrate a web-based visualization system (CASASviz) that integrates monitoring, analysis, and automated recognition of residents' behavior patterns in smart environments. In our data collection module, we collect real sensor data from the...
Data mining concerns theories, methodologies, and in particular, computer systems for knowledge extraction or mining from large amounts of data. Association rule mining is a general purpose rule discovery scheme. It has been widely used for discovering rules in medical applications. The diagnosis of diseases is a significant and tedious task in medicine. The detection of heart disease from various...
Nowadays privacy has become a major concern; security goals such as confidentiality, integrity and availability do not ensure privacy. Data mining is a threat to privacy. Researchers today focus on how to ensure privacy while performing data mining tasks. Because data mining algorithms are typically complex and the input usually consists of massive data sets, the generic protocols in such...
Efficiently distinguishing the data type of the samples in a decision table after discretization is of great significance for subsequent machine learning and data mining. This paper puts forward an annotation method for distinguishing the data type based on attribute importance and sample entropy, and runs a simulation test using part of the UCI database which was...
Automated learning systems used to extract useful information from musical scripts play a major role in optical music recognition. Optical music recognition (OMR) has been widely used to extract musical notation and knowledge from old scripts, and is thus of great importance in retrieving historical data. The field of pattern recognition and knowledge representation has to be symmetrically...
Due to the rise and rapid growth of e-commerce, the use of credit cards for online purchases has dramatically increased, which has caused an explosion in credit card fraud. As the credit card becomes the most popular mode of payment for both online and regular purchases, cases of fraud associated with it are also rising. In real life, fraudulent transactions are scattered among genuine transactions and...
Real-life databases contain many features. Many of these features may be irrelevant or redundant. For example, data recording the age of each teacher in a school is unlikely to help in assessing the success of students' results in the school. Hence, relevance analysis needs to be performed on the data in order to identify and remove any such irrelevant or redundant attributes from the learning...
In our study we use a kernel-based technique, Support Vector Machine regression, to predict the melting point of drug-like compounds in terms of topological descriptors, topological charge indices, connectivity indices and 2D autocorrelations. The machine learning model was designed, trained and tested using a dataset of 100 compounds and it was found that an SVMReg model with...
The following topics are dealt with: fuzzy system; evolutionary computation; adaptive dynamic programming; reinforcement learning; bioinformatics; data mining; computer games, virtual reality; intelligent system; soft computing; robotics; discrete event system; signal processing; speech processing; image processing; business management distributed architecture; Internet modelling; ad hoc wireless networks...
Condition Based Maintenance (CBM) software, called cbmLAD, under development at École Polytechnique de Montréal is presented in this paper. The backbone of the software is a supervised learning data mining approach called Logical Analysis of Data (LAD). LAD possesses distinctive advantages that are useful in Condition Based Maintenance (CBM), namely its independence from statistical processes and...