Network traffic classification is currently a key part of network security systems. In recent years, several network traffic classification algorithms that apply machine learning to packet-level and flow-level features have been proposed, yet the results are frequently disappointing. On the one hand, obtaining a large, representative training data set that is fully labeled to train a classifier...
To improve the ticket-booking experience of users of the Railway Online Ticketing System and to keep the system running normally, a model for detecting abnormal ticket-booking behavior by the system's users, based on the traditional K-Means and FP-Growth algorithms, is proposed. First, user features are preliminarily filtered with the Random Forest algorithm based on Spark MLlib to identify the...
There are few Chinese dish recommendation algorithms because of the variety of Chinese dishes. It can be impossible to find one's favorite dishes in a restaurant from the name or the ingredients of a dish alone. The algorithm in this paper uses the user's ordering history to quantify the user's taste with the k-means clustering method and determines the number of the user's favorite tastes with the BWP index. With the...
Recently, with the wide use of computer systems and the Internet and the rapid growth of computer networks, intrusion detection has become an important concern in network security. In this regard, various intrusion detection systems have been developed using misuse detection and anomaly detection methodologies. These systems try to improve detection rates of variation in attack types...
Classification of network traffic is required for many network management tasks, such as flow prioritization, traffic shaping/policing, and diagnostic monitoring. Many approaches have evolved for this purpose. Classical approaches such as port number or payload analysis methods have their own limitations. For example, some applications use dynamic port numbers and encryption...
Knowing the geolocation of a router can help to predict the geolocation of an Internet user, which is important for local advertising, fraud detection, and geo-fencing applications. For example, the geolocation of the last router on the path to a user is a reasonable guess for the user's geolocation. Current methods for geolocating a router are based on parsing a router's name to find geographic hints...
Topic detection is a hot research topic in the area of information retrieval. However, the new Internet environment, whose content is usually user-generated, imposes new requirements and brings new challenges. Topic detection has to cope with the lower quality of this content and its large amount of noise. This paper not only provides a solution for detecting hot topics, but also gives its semantic...
As a new form of malicious software, phishing websites have appeared frequently in recent years, causing great harm to online financial services and data security. In this paper, we design and implement an intelligent model for detecting phishing websites. In this model, we extract 10 different types of features, such as title, keyword, and link text information, to represent the website. Heterogeneous...
Many research efforts propose the use of flow-level features (e.g., packet sizes and inter-arrival times) and machine learning algorithms to solve the traffic classification problem. However, these statistical methods have not made the anticipated impact in the real world. We attribute this to two main reasons: (a) training the classifiers and bootstrapping the system is cumbersome, (b) the resulting...
With the rapid development of the Internet, e-commerce websites now routinely have to work with log datasets of up to a few terabytes in size. Removing messy data promptly and at low cost, and extracting useful information, is a problem we have to face. The mining process involves several steps, from pre-processing the raw data to establishing the final models. In this paper we describe our method to...
Fraud is increasing with the extensive use of the Internet and the growth of online transactions. More advanced solutions are needed to protect financial service companies and credit card holders from constantly evolving online fraud attacks. The main objective of this paper is to construct an efficient fraud detection system that adapts to behavior changes by combining classification and...
Trustworthy networking is an inevitable trend in the development of highly trusted computing and the Internet. Behavior evaluation is an important research topic in trustworthy networks. Until now, most effort has focused on the validity of the host's and user's identity, such as integrity measurement and access control, which cannot guarantee the trustworthiness of a valid user's behavior. In this paper, we proposed...
The Internet has clearly become a key medium for sharing resources and exchanging information. As a special category of social activity, the behavior of network users is typically complex and diverse, which has drawn increasing attention to studying and managing it. Based upon the formation mechanism of an ant colony, this paper proposes an ant colony algorithm to do cluster analysis...
With the spread of Internet applications, more and more enterprises build their own Web sites and provide business information through Web pages. Web page classification can be used to assign enterprise Web pages to one or more predefined business categories. For the purpose of administering Internet-based enterprises in an E-government system, algorithms and applications related to web page classification...
The particular benefit of cloud computing is the simple scalability of large applications, and many companies have already decided to use the cloud for their infrastructures. An enterprise IT infrastructure often includes a workflow management system. In a cloud, various workflow engines can coexist, each with its specific functional responsibility. A central instance is in charge of distributing...
Most traditional classification methods behave undesirably on imbalanced data from real-world applications, in particular producing poor predictive accuracy for the minority class. This paper proposes a novel over-sampling strategy for handling imbalanced data based on cluster ensembles, named CE-SMOTE, which aims to provide a better training platform by introducing clustering consistency index...
Phishing fraudsters attempt to create an environment which looks and feels like a legitimate institution, while at the same time attempting to bypass filters and suspicions of their targets. This is a difficult compromise for the phishers and presents a weakness in the process of conducting this fraud. In this research, a methodology is presented that looks at the differences that occur between phishing...
The health geographical information system (GIS) has been used in many organizations for the management and visualization of public health data. As epidemiology information has become part of the health data repository in the health data management system, many health researchers have dedicated their research to geographical epidemiology information analysis and visualization. The Population Health...
The paper reports a study on information categorization based on highly efficient feature selection and a comprehensive semi-supervised learning algorithm. Feature selections or conversions are performed using maximum mutual information, including linear and non-linear feature conversions. Entropy is used and extended to find suitable features effectively with a machine learning method. Fuzzy partition...
GPGPU (general-purpose computing on graphics processing units), which uses the GPU for general-purpose computations such as numerical calculations as well as for graphics processing, is attracting a great deal of attention. As the Internet grows, the amount of data transmitted over it increases dramatically, so data compression has become more important than ever. The VQ (vector quantization) compression is one...