Feature representation plays an important role in text classification. Feature mapping based on label information is an algorithm suitable for Binary Relevance. Compared with conventional text representation, it keeps the dimensionality of the text under control by means of word embedding. More importantly, it takes full advantage of the general characteristics of the label on text representation...
Finding security events in web logs has become an important aspect of network security. This paper proposes a website anomaly detection model based on security log analysis. After the model's anomaly feature sets were created, the C4.5 algorithm was used to improve them, so that abnormal records in the feature sets are stored hierarchically. Comparing website logs with the treated feature sets,...
Security risks brought by web page information have become a matter that can no longer be ignored. Malicious scripts are a major challenge currently facing website security. According to data from the Google Research Centre, more than 10% of web pages are malicious. In China especially, the proportion of malicious web pages has reached 43.21%. This paper presents a detection system which is...
Data prediction and classification are critical methods in medical nutrition data analysis. Because it is intuitive, efficient and easy to understand, the decision tree algorithm is widely used in this field. However, the classification rules extracted from a decision tree are not the simplest or most efficient. The paper analyzes the classical decision tree algorithm CART,...
Recently, the significantly increased IPv6 address length has posed a greater challenge to wire-speed routers for packet classification (PC). Most conventional IPv4-based PC algorithms are no longer suitable for IPv6 PC. The performance and capacity of many IPv6 algorithms and classification devices depend upon the properties of the IPv6 classifiers. However, there are no publicly available real IPv6 classifiers...
Microarray gene expression data, characterized by small sample sizes and high dimensionality, have been used in cancer discovery and prediction. This paper proposes a hybrid method based on improved Ant Colony Optimization (ACO) and Random Forests (RF) for selecting a small set of marker genes from microarray data to produce a high-accuracy cancer classifier. The method preselects top-ranked features...
Ant colony optimization (ACO) is a bionic swarm intelligence algorithm from the field of artificial intelligence (AI) and has been successfully applied to complex optimization problems. Support vector machine (SVM) is a new machine learning method with greater generalization performance, and has shown its superiority in classification and regression problems. By combining the advantages...
Text representation is the basis of text processing. Most current text representation models ignore the words' inter-relations, which results in the loss of the text's structure information. This paper proposes a novel text representation model, which uses a lexical network to represent the text and retains the text's structure. According to the different levels of words' inter-relations, co-occurrence...
Ranking is the key problem for information retrieval and other text applications. Recently, ranking methods based on machine learning approaches, called learning to rank, have become a focus for researchers and practitioners. The main idea of these methods is to apply existing, effective machine learning algorithms to ranking. However, as a learning problem, ranking is different...
Feature selection is an important task in machine learning, pattern recognition and data mining. This paper proposes a new feature selection method for classification, named SD, which is based on the scatter matrix used in linear discriminant analysis. The main feature of SD is its simplicity and independence from learning algorithms. High-dimensional data samples are first projected into a lower dimensional...
This paper proposes a new Web QoS control strategy based on user behavior. Web QoS control strategy mainly includes request classification, admission control, and content adaptation technologies. The approach we take to Web QoS control first analyzes user access factors in the request classification, admission control, and content self-adaptive strategies of the Web QoS control mechanism, and then proposes user...