Drawing on research into feature-word weighting in texts and semantic similarity between words, we propose a method for calculating feature-word weights based on concept weight. It addresses the semantic association among text features and the high dimensionality that is common in text vector space models. This method reduces the semantic loss of the feature set and the dimension...
Text classification is the key technology for topic tracking, and the vector space model (VSM) is one of the simplest and most effective models for topic representation. On the basis of VSM and support vector machines (SVM), we have studied how the feature space dimension in VSM, as well as linearly separable and non-separable SVMs, affect topic tracking. We then obtain the pattern by which they affect topic tracking,...
By analyzing the characteristics of Chinese noun phrases, we propose an automatic identification method for Chinese noun phrases based on statistics and rules. The method combines an efficient binary statistical model, mutual information, and knowledge rules established from a large corpus. We then analyze the distribution of noun phrases in detail. The results show that the method is suitable for...
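The mutual-information component mentioned above can be illustrated with a minimal sketch. It assumes pointwise mutual information (PMI) over adjacent token pairs in an already segmented text; the function name `bigram_pmi` and the probability estimates are illustrative assumptions, not the formulation used in the paper.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return PMI (log base 2) for every adjacent token pair in `tokens`.

    Hypothetical helper: probabilities are plain relative frequencies,
    which is an assumption and not taken from the abstract.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores
```

Pairs with a high score co-occur more often than their individual frequencies would predict, which is one common signal that two words form a phrase candidate.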
This paper presents an algorithm based on SIFT features. It calculates key points in the image and extracts the image's features by calculating the orientation and modulus of the gradient at each key point. The similarity between two images is computed using Euclidean distance. The experiment shows that the feature is invariant to image scaling, translation, and rotation, and partly invariant to illumination...
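As a minimal sketch of the distance step, assuming each key point has already been reduced to a fixed-length descriptor vector: the helpers `euclidean_distance` and `nearest_descriptor` are hypothetical names, and the abstract does not say how per-key-point distances are combined into an image-level similarity.

```python
import math


def euclidean_distance(u, v):
    """Euclidean (L2) distance between two equal-length descriptor vectors."""
    if len(u) != len(v):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_descriptor(query, candidates):
    """Index of the candidate descriptor closest to `query` (assumed matching rule)."""
    return min(
        range(len(candidates)),
        key=lambda i: euclidean_distance(query, candidates[i]),
    )
```

A smaller distance means more similar descriptors, so matching a key point amounts to finding its nearest neighbour among the other image's descriptors.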
Text classification is the key technology for topic tracking, and the vector space model (VSM) is one of the simplest and most effective models for topic representation. Feature selection in VSM is an important means of data pre-processing: it can reduce the vector space dimension and improve the generalization ability of the algorithm. Therefore, it is necessary for feature selection algorithms...
In this paper, a novel algorithm for mining maximal frequent patterns is proposed, based on the projection sum frequent items tree (PSFIT). The algorithm projects the transaction database into a projection sum tree and can store the frequent itemsets in the tree in a compact manner. It builds the frequent pattern tree directly, as the FPMax algorithm does. However, all the nodes of PSFIT are sorted and ordered,...
This paper analyses the theory of information retrieval relativity and proposes a content-based industry information retrieval system built on ontology. The system can, to some degree, map retrieval results to pragmatics relativity and make querying more convenient for users, which embodies the user-oriented principle. This retrieval system uses a novel...
In the past two decades, database systems have developed rapidly, and much research has been done on indexing and retrieval techniques. In relational databases, a B-tree or B+ tree is used to index structured data; in full-text databases, an inverted list or the sorted duality inter-relevant Successive Trees Model is used to index the full text. But in an integrated database, the efficiency of using...
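The inverted-list idea mentioned above can be sketched as follows. This is a generic illustration with whitespace tokenization and hypothetical helper names (`build_inverted_index`, `search`); it is not the indexing scheme studied in the paper.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids containing it.

    `docs` is a dict of {doc_id: text}; whitespace tokenization is an assumption.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every term in `query`."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A lookup then touches only the posting sets of the query terms rather than scanning every document, which is why inverted lists suit full-text retrieval.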