As Internet applications become more complex and diverse, simple network traffic matrix estimation or approximation methods such as the gravity model are no longer adequate. In this paper, we advocate a novel approach of approximating traffic matrices with multiple low-rank matrices. We build the theory behind the MULTI-LOW-RANK approximation and discuss the conditions under which it is better than...
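For context on the baseline this abstract mentions, the following is a minimal sketch of the simple gravity model in its common textbook form, where the traffic from node i to node j is estimated as proportional to the product of i's total outgoing and j's total incoming traffic. The function name and the traffic volumes are hypothetical, and the sketch does not attempt the paper's MULTI-LOW-RANK method, which the truncated abstract does not specify.

```python
def gravity_estimate(out_totals, in_totals):
    """Estimate a traffic matrix from per-node totals (simple gravity model).

    Entry (i, j) is out_totals[i] * in_totals[j] / total, so the result is
    an outer product of two vectors, i.e. a rank-1 matrix.
    """
    total = sum(in_totals)
    if total <= 0:
        raise ValueError("total incoming traffic must be positive")
    return [[o * d / total for d in in_totals] for o in out_totals]


# Hypothetical per-node traffic volumes (arbitrary units), for illustration only.
out_totals = [30.0, 50.0, 20.0]
in_totals = [40.0, 40.0, 20.0]
tm = gravity_estimate(out_totals, in_totals)
```

Because every row of the estimate is a scaled copy of the same vector, a single gravity estimate cannot represent traffic that mixes several distinct patterns, which is one way to read the abstract's motivation for combining multiple low-rank matrices.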
The cost and effort of developing software systems in a new technical area can be extensive. An organization must perform a domain analysis to discover competing products, analyze their architectures and features, and ultimately discover and specify product requirements. However, delivering high-quality products depends not only on gaining an understanding of functional requirements, but also of...
Synonym extraction is a fundamental research task that benefits text mining and information retrieval. In this paper, we propose a method to extract synonyms from text that employs spectral clustering and word2vec. First, the word2vec model is trained on a large-scale English Wikipedia corpus. Then, we extract keywords from a text and use the trained model to generate similarities among these...
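As a rough illustration of the similarity step, the sketch below builds a pairwise similarity matrix from word vectors using cosine similarity. The choice of cosine similarity is an assumption (the truncated abstract does not name the measure), and the three-dimensional vectors are made-up stand-ins for trained word2vec embeddings. A matrix of this kind is the usual input to spectral clustering, which is not implemented here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities for a dict mapping word -> vector."""
    words = list(vectors)
    matrix = [[cosine_similarity(vectors[a], vectors[b]) for b in words] for a in words]
    return words, matrix


# Hypothetical toy embeddings; real word2vec vectors have hundreds of dimensions.
toy_vectors = {
    "car": [1.0, 0.0, 0.0],
    "automobile": [2.0, 0.0, 0.0],
    "banana": [0.0, 1.0, 0.0],
}
words, sims = similarity_matrix(toy_vectors)
```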
The rapid pace of global urbanization has posed significant challenges to urban transportation infrastructures. Existing urban transit systems suffer from many well-known shortcomings: public transit has limited coverage areas and fixed schedules, while private transit is expensive and fails to meet demand in a timely manner. We thus envision a Cloud-Commuting system that employs a giant pool of centralized...
We conduct a detailed analysis of cellular communication patterns using a (voice/text-based) call detail record (CDR) dataset from a nationwide cellular network. We analyze a large 5-month dataset containing hundreds of millions of CDRs with a user population of over 5 million to dissect meaningful communication patterns, with the goal of understanding their impact on, and better managing, cellular...
This paper introduces multiple linear regression, stepwise linear regression, and the neural network method, and improves the neural network. It comprehensively analyzes the current prediction methods and their application principles, and compares the advantages and disadvantages of the various methods in detail. It proposes that improving short-term load forecasting accuracy should not only attach...
This paper proposes an intelligent model for detection of phishing emails which depends on a preprocessing phase that extracts a set of features concerning different email parts. The extracted features are classified using the J48 classification algorithm. We experimented with a total of 23 features that have been used in the literature. Ten-fold cross-validation was applied for training, testing...
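To make the evaluation protocol concrete, here is a minimal sketch of how ten-fold cross-validation partitions a dataset into train and test indices. The function name and the sample count are hypothetical, the split is unshuffled for simplicity, and the classifier itself (J48, Weka's C4.5 implementation) is not reproduced.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs. Each sample lands
    in exactly one test fold; fold sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


# Hypothetical dataset size, chosen only to show uneven fold sizes.
folds = k_fold_indices(25, k=10)
```

In practice the samples would usually be shuffled (and often stratified by class) before splitting; a model is then trained on each `train_idx` set and scored on the matching `test_idx` set, and the ten scores are averaged.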
Based on the area between the curve of the membership function and the horizontal real axis, a new index, called the expansion center for fuzzy numbers, is proposed. An intuitive and reasonable ranking method for fuzzy numbers based on their expansion center is also established. This new ranking method is useful in fuzzy decision making and fuzzy data mining.
With the prevalence of geo-positioning devices such as GPS units and smartphones, textual information, long or short, associated with GPS tags (usually denoted by a latitude and longitude coordinate) is widely encountered on the Web. For example, on Twitter, smartphones record the location of every tweet after it is authorized. The spatial objects on Google Maps are also presented with some textual descriptions...
Based on the area between the curve of the membership function and the horizontal real axis, the concepts of left and right wingspans are introduced. Using them, a new index, called the w-center for fuzzy numbers, is proposed. It is continuous with respect to the convergence of fuzzy number sequences. An intuitive and reasonable ranking method for fuzzy numbers based on their w-center is also established....
The objective function in some real optimization problems may not be differentiable with respect to the unknown parameters at some points, so the gradient does not exist at those points. As a replacement for classical gradient search, the method of pseudo gradient search has been proposed and used for solving nonlinear optimization problems, such as nonlinear multiregression based on the Choquet integral...
Feature location is the activity of identifying an initial location in the source code that implements a specific functionality in a software system. Existing techniques for feature location broadly fall into three categories, based on the type of information they use: text, static, and dynamic. Techniques based on dynamic information may generate large amounts of data that are difficult to utilize. This paper...
This study used EOS/MODIS time series to extract the planting areas of paddy rice in 15 provinces located in the southern part of China. To do so, regionalization was first carried out, and algorithms for individual zones were built by analyzing the characteristics of EVI and LSWI using ground field data. These algorithms were then applied to extract the planting areas for different...
When auditing large enterprise groups with many accounting subjects, in order to quickly find doubtful audit points in the mass of electronic data, we designed and developed an accounting report procedure for a single subject according to the logic of the balance sheet with parallel simulation; adopted association rules to mine the audit features of electronic data; and combined with the...
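The abstract does not say how its association rules are mined, so the sketch below only illustrates the two standard measures that such rules are typically scored by: support and confidence. The function names and the toy records (sets of account labels appearing together in one voucher) are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent): support(A and C) / support(A)."""
    support_a = support(transactions, antecedent)
    if support_a == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support_a


# Hypothetical vouchers, each listing the accounts it touches.
vouchers = [
    {"cash", "revenue"},
    {"cash", "revenue", "tax"},
    {"cash", "inventory"},
    {"inventory", "payables"},
]
rule_support = support(vouchers, {"cash", "revenue"})
rule_confidence = confidence(vouchers, {"cash"}, {"revenue"})
```

One plausible audit use is to flag records that violate a rule with high confidence, since they deviate from the dominant pattern; whether the paper does this is not stated in the truncated text.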
In electronic commerce, the main use of data mining is to discover trends in business development and support the right decisions. This paper elaborates on the role of data mining in electronic commerce, summarizes the data mining methods currently used in electronic commerce, and then classifies the data objects in electronic commerce. It provides references for the development and application of practical...
This study examines data packet interception and analysis at the computer network layer and application layer, network data flow filtering, protocol decoding, multi-node collaborative detection, anomaly feature extraction, pattern discovery algorithms, intrusion detection models, network-level intrusion detection algorithms, protocol analysis test methods, and so on, trying to find fast and efficient...
In this paper, we study alert analysis techniques for Network Situation Awareness (NSA). The overwhelming number of alerts makes them challenging to understand and manage. Many alert analysis techniques have already been proposed in the intrusion detection research area, but most of them are used to reduce false positives and false negatives. However, the NSA requires the alert analysis techniques to offer...
Today's large campus and enterprise networks are characterized by their complexity, i.e. containing thousands of hosts, and diversity, i.e. with various applications and usage patterns. To effectively manage and secure such networks, network operators and system administrators are faced with the challenge of characterizing, profiling and tracking activity patterns passing through their networks. Because...
More and more netizens now comment on hot social issues, and their views have become very useful for government decision-making. In particular, news and related comments often influence the decision behavior of officials. However, analyzing them automatically in order to provide references for decision-making has become a key problem. One effective way is to cluster news comments. In this paper,...
Intellectual Properties (IP), such as patents and trademarks, are among the most critical assets in today's enterprises and research organizations. They represent the core innovation and differentiators of an organization. When leveraged effectively, they not only protect a business from its competition, but also generate significant opportunities in licensing, execution, long term research and innovation...