Iris recognition is a biometric authentication technique that is vital for ensuring security and has served as an important test case for algorithms developed in pattern recognition. The unique circular shape of the iris and its time invariance make it a versatile technique whose accuracy can be mathematically proven. In this work we propose a new segmentation technique and...
Data stored in educational databases is increasing day by day. Data mining algorithms can be used to find hidden patterns in student databases. These patterns can be used to assess the academic performance of students. The main aim of this study was to determine the factors that influence student performance. This paper proposes the Generalized Sequential Pattern mining algorithm for finding frequent...
A top-k dominating query selects the k data objects that dominate the highest number of objects in a dataset. It is a decision-support query, since it gives data analysts a way to find significant objects. This search not only examines large upper bounds early, which leads to earlier identification of results, but also eliminates partial dominance relationships between...
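As an illustration of the query itself (not of the pruning method in the truncated abstract above), here is a minimal brute-force sketch. It assumes that smaller values are better in every dimension and that an object's score is the number of objects it dominates; the function names `dominates` and `top_k_dominating` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def top_k_dominating(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Score each point by how many other points it dominates and return the
    k highest-scoring points. Brute force: O(n^2) dominance checks."""
    scored = [
        (sum(dominates(p, q) for q in points), index)
        for index, p in enumerate(points)
    ]
    # Highest score first; ties are broken by original position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(points[index], score) for score, index in scored[:k]]
```

A real system would avoid the quadratic scan, for example with index structures or upper-bound pruning; this sketch only fixes the semantics of the result.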
Mainstream work in knowledge discovery focuses on researching new high-performance, high-scalability mining algorithms. In fact, research on the process model and inner mechanism is more important, and these have received enough attention. This paper proposes a new independent knowledge discovery system framework that combines three elements: algorithm, model and mechanism. In essence,...
An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. Starting from the original batch method, an incremental formulation is given. The main idea is to minimize both matrix operations and space constraints. To this end, a straightforward per-sample correction is obtained, enabling the possibility of setting...
Finding similar crime case subsets is an important task for intelligence analysts in crime investigation. It can not only provide multiple clues for solving crimes but also make catching criminals more efficient. However, the conventional approach of querying specific attributes in relational databases has two defects: first, it is relatively inefficient when many incidents have to be handled;...
To improve the intelligibility and efficiency of knowledge expression in land evaluation, this paper proposes a land evaluation method that combines simplified fuzzy classification association rules with fuzzy decision. To reduce the complexity of land evaluation models and further improve the efficiency and intelligibility of fuzzy classification association rules, an algorithm to eliminate...
Existing trajectory prediction algorithms mainly employ kinematical models to approximate real-world routes and always ignore spatial and temporal distance. To overcome the drawbacks of existing trajectory prediction approaches, this paper proposes a novel trajectory prediction algorithm. It works as follows: (1) mining the interesting regions from trajectory data sets; (2) extracting the trajectory...
Associative classification (AC) is a promising approach for automatic malware detection. However, when the data changes (training data is added over time), traditional AC algorithms have to re-learn repeatedly, which is expensive or may even become invalid because of massive data and limited computing resources. To resolve these challenges, an efficient incremental associative classification algorithm...
Clustering analysis is one of the main analytical methods in data mining, and the choice of clustering algorithm directly influences the clustering results. This paper discusses the standard k-means clustering algorithm and analyzes its shortcomings, such as the need to calculate the distance between each data object and all cluster...
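The cost mentioned above can be made concrete with a minimal sketch of the assignment step of standard k-means. It assumes Euclidean distance and that the truncated sentence refers to cluster centers; `assign_to_nearest` is a hypothetical name, and the counter exists only to show that every point is compared against every center.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def assign_to_nearest(
    points: List[Point], centers: List[Point]
) -> Tuple[List[int], int]:
    """Assign each point to its nearest center (Euclidean distance).

    Returns the labels and the number of distance evaluations, which is
    always len(points) * len(centers) in this naive form."""
    labels = []
    distance_evaluations = 0
    for p in points:
        best_label, best_dist = 0, math.inf
        for j, c in enumerate(centers):
            d = math.dist(p, c)  # one distance per (point, center) pair
            distance_evaluations += 1
            if d < best_dist:
                best_label, best_dist = j, d
        labels.append(best_label)
    return labels, distance_evaluations
```

With n points and k centers this step costs n * k distance computations per iteration, which is the overhead that improved variants typically try to reduce.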
Clustering, an unsupervised learning process, is a challenging problem. Clustering result quality improves the overall structure. In this article, we propose an incremental stream of hierarchical clustering that improves the efficiency and accuracy, and reduces the time consumption, of a text categorization algorithm by forming an exact sub clustering. In this paper we propose a new method called multilevel clustering...
A good description of a class should be accurate and interpretable. Previous works describe classes either by analyzing the correlation of each attribute with the class, or by producing rules as in building a classifier. These solutions suffer from issues in accuracy and interpretability. A description naturally consists of sentences, where each sentence consists of a set of terms. Normally, a sentence...
To improve the efficiency of database management and the intelligence of database record classification, this paper proposes a classification method for network database records based on fuzzy theory. First, an automatic classification framework for the database is constructed, and then a standard record model, a special data record model and a new record model on fuzzy sets are given. By calculating the...
Many organizations collect large amounts of data to support their business and decision-making processes. The data collected from various sources may have data quality problems. These issues become prominent when various databases are integrated. The integrated databases inherit the data quality problems that were present in the source databases. The data in the integrated systems need...
This paper proposes an improved K-medoids clustering algorithm (IKMC) to resolve the problem of detecting near-duplicate records. It treats every record in the database as a separate data object, uses the edit-distance method and attribute weights to obtain similarity values among records, and then detects duplicate records by clustering these similarity values. This algorithm can automatically...
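A minimal sketch of how edit distance and attribute weights could be combined into a record similarity follows. The abstract does not give the formula, so the normalization (one minus edit distance over the longer field length) and the weighted average are assumptions, as are the function names; the clustering step is not shown.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming over two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        prev = cur
    return prev[-1]


def field_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer field."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def record_similarity(
    r1: Sequence[str], r2: Sequence[str], weights: Sequence[float]
) -> float:
    """Weighted average of per-attribute similarities (assumed combination)."""
    total = sum(weights)
    return sum(
        w * field_similarity(x, y) for x, y, w in zip(r1, r2, weights)
    ) / total
```

For example, `record_similarity(("John Smith", "NYC"), ("Jon Smith", "NYC"), (2.0, 1.0))` gives a value close to 1, reflecting a likely near-duplicate under these assumed weights.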
The K-means clustering algorithm is one of the important cluster analysis methods in data mining, but analysis of and experiments with the traditional K-means algorithm show that its clustering result varies considerably with the initially selected cluster centers. To address this problem, this paper proposes a method of seeking the initial cluster center...
In this paper, we propose the "added" use of proximity search to a Web search query for narrowing down the set of documents returned as answers to a keyword-based search query. This approach adds value to Web search query results by allowing users to better express what they are looking for. Most of the current search engines provide limited proximity search behaviour such as allowing only...
A music similarity measure system based on the instrumentation information is proposed in this paper. We combine our previous instrumentation analysis algorithm with a newly designed similarity measure system, which aims at comparing songs solely on their instrumentation content. Several midlevel features are designed and integrated along with the existing low-level feature MFCCs. The weighted distance...
Schema matching, which establishes whether two objects are semantically related, has been a focus of data management research due to its critical role in enterprise information integration. Current schema matching techniques typically tend to exploit only a single method selected from matching schema structure, or syntax, or semantics, or data and probability distributions, which reduces the success...
Associative classification has high classification accuracy and strong flexibility. However, it still suffers from overfitting, since the classification rules that satisfy both minimum support and minimum confidence are returned to the classifier as strong association rules. In this paper, we propose a new associative classification method based on compactness of rules; it extends the Apriori algorithm...
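To make the two thresholds concrete, here is a minimal sketch of computing support and confidence for a single class association rule (itemset -> class label). It uses the standard definitions and does not implement the compactness-based method of the abstract; the data layout and the names `support_and_confidence` and `is_strong` are assumptions.

```python
from typing import Iterable, List, Set, Tuple

Transaction = Tuple[Set[str], str]  # (items, class label)


def support_and_confidence(
    transactions: List[Transaction], antecedent: Iterable[str], label: str
) -> Tuple[float, float]:
    """Support = fraction of all transactions that contain the antecedent
    and carry the label; confidence = that count divided by the number of
    transactions containing the antecedent."""
    antecedent = frozenset(antecedent)
    n_match = sum(1 for items, _ in transactions if antecedent <= items)
    n_both = sum(
        1 for items, y in transactions if antecedent <= items and y == label
    )
    support = n_both / len(transactions)
    confidence = n_both / n_match if n_match else 0.0
    return support, confidence


def is_strong(
    transactions: List[Transaction],
    antecedent: Iterable[str],
    label: str,
    min_support: float,
    min_confidence: float,
) -> bool:
    """A rule is 'strong' when it meets both minimum thresholds."""
    s, c = support_and_confidence(transactions, antecedent, label)
    return s >= min_support and c >= min_confidence
```

Every rule that passes both thresholds is kept in this naive form, which is the behaviour the abstract identifies as a source of overfitting.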