Nowadays many crimes are related to the financial domain, so forensic analysis of the associated documents is required. Thanks to digitization, many documents can be investigated faster. If an analyst examines the documents manually, the task is time-consuming and tedious, so we follow an approach that applies a clustering algorithm to the documents for forensic analysis of the seized system, which will help the...
Among the many text mining methods, the paper adopts fuzzy c-means. To address the defect that fuzzy c-means is sensitive to its initial values and has poor stability, an improved text mining method, GAFCM, is put forward. GAFCM uses the global search capability of genetic algorithms to improve fuzzy c-means. Finally, it is shown that the improved text mining method has...
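For orientation, here is a minimal sketch of plain fuzzy c-means, the baseline this abstract starts from. It is not GAFCM: the genetic-algorithm step is not described in the visible text and is not reproduced here. The function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, and the random membership initialization are illustrative assumptions.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, u), where u[i][j] is the membership of point i in cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Assumption: random initial memberships, each row normalised to sum to 1.
    # This is the initial value the abstract describes the method as sensitive to.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )

        # Membership update from relative distances to every centre.
        for i in range(n):
            dists = [
                max(sum((points[i][d] - centers[j][d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            for j in range(c):
                u[i][j] = 1.0 / sum(
                    (dists[j] / dists[k]) ** (2.0 / (m - 1.0)) for k in range(c)
                )
    return centers, u
```

In a text mining setting the points would be document vectors (for example term weights). Changing `seed` changes the starting memberships, which is one way to observe the sensitivity the abstract refers to.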
Several features of Chinese texts create a technological bottleneck in Chinese text mining, and at present the results of Chinese text clustering obtained by traditional methods are not very satisfactory. In this paper, we propose applying an English text clustering method, Text Clustering via Particle Swarm Optimizer (TCPSO), to solve the Chinese text clustering...
Among the variety of text classification algorithms, KNN is a competitive one, with simple implementation and high efficiency. However, as the size of the text collection grows, the runtime of KNN grows so rapidly that it becomes unaffordable. In this paper, we improve KNN by introducing the kd-tree storage structure and by reducing the sample space through sample clustering methods. Experiments show...
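As a rough illustration of the kd-tree idea mentioned in this abstract, the sketch below builds a kd-tree over labelled vectors and uses it for k-nearest-neighbour classification. It is a generic textbook construction, not the paper's method: the sample-clustering reduction step is not shown, and the names `build_kdtree` and `knn_classify` are hypothetical.

```python
import heapq
from collections import Counter

def build_kdtree(items, depth=0):
    """Build a kd-tree from a list of (vector, label) pairs."""
    if not items:
        return None
    axis = depth % len(items[0][0])          # cycle through the dimensions
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2                    # median item becomes the split node
    return {
        "item": items[mid],
        "axis": axis,
        "left": build_kdtree(items[:mid], depth + 1),
        "right": build_kdtree(items[mid + 1:], depth + 1),
    }

def _search(node, query, k, heap):
    """Collect the k nearest items in `heap` as (-squared_distance, label)."""
    if node is None:
        return
    vec, label = node["item"]
    d2 = sum((a - b) ** 2 for a, b in zip(vec, query))
    if len(heap) < k:
        heapq.heappush(heap, (-d2, label))
    elif d2 < -heap[0][0]:                   # closer than the current worst
        heapq.heapreplace(heap, (-d2, label))

    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    _search(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the worst hit.
    if len(heap) < k or diff * diff < -heap[0][0]:
        _search(far, query, k, heap)

def knn_classify(tree, query, k=3):
    """Return the majority label among the k nearest neighbours of `query`."""
    heap = []
    _search(tree, query, k, heap)
    return Counter(label for _, label in heap).most_common(1)[0][0]
```

The pruning test in `_search` is what lets the tree skip whole branches instead of comparing the query against every stored sample. Its benefit tends to shrink as dimensionality grows, which is relevant for high-dimensional text vectors.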
With the tremendous amount of information available electronically, there is an increasing need for automatic text summarization systems. An extractive summarization method is presented. The weight of a Chinese word or phrase is computed from its frequency, part of speech, position and length. The weight of a Chinese sentence is computed from its content, position, length and cue words in...
Ant colony clustering, first proposed by Deneubourg in 1990, is a bionic clustering method that has been widely used in cluster analysis. In this paper, an ant colony clustering algorithm based on appropriate retention of the elites is presented. Building on the general ant colony clustering algorithm, a mechanism to retain the elites is introduced: each iteration of the algorithm always retains...
At present, graduate students need to choose some courses by themselves, which involves a degree of blindness. The paper puts forward a suite of text mining algorithms based on association rules. The algorithms are used to study the relevance between course selection and research projects, which can provide some reference for graduate students. First, a scheme for computing the relevance degree of words is put forward...
Cross-language information retrieval (CLIR) is the retrieval process where the user presents queries in one language to retrieve documents in another language. In this field the resolution of lexical ambiguity in translating queries is a key challenge. In this paper, we propose a technique for calculating translation probabilities based on creating query terms' concept graphs for selecting the right...
Text clustering is a difficult and active research field in Internet search engine research. A new text clustering algorithm is presented that uses and improves K-means clustering techniques. First, texts are preprocessed to prepare them for the subsequent steps. Second, the paper improves the gravity center (centroid) calculation method and the algorithm flow of the K-means clustering algorithm to improve efficiency and...
Based on clustering technology from data mining, we aim to establish a new mechanism for identifying schoolwork. So that the standard answer can better adapt to actual situations, we first generalized the standard answer, and then calculated the similarity between every sample and the standard answer, as well as the degree of similarity between pieces of schoolwork. Based on the similarity, we clustered all schoolwork...
The most common task for a forensic investigator is to search a hard disk for interesting evidence. However, most search tools in the digital forensics field fundamentally rely on text string matching and indexing technology, which produce high recall (100%) and low precision. Investigators frequently waste a great deal of time on huge numbers of irrelevant search hits. In this paper, we propose an improved method for ranking...