The closest pair problem (CPP) is an important problem that has numerous applications in clustering, graph partitioning, image processing, pattern identification, intrusion detection, etc. Numerous algorithms have been presented for solving the CPP. For instance, on n points there exists an O(n log n) time algorithm for CPP (when the dimension is a constant). There also exist randomized algorithms...
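As an illustration of the divide-and-conquer idea behind such bounds, here is a minimal sketch for points in the plane. It is not taken from the paper above: the function names are hypothetical, and for brevity the strip is re-sorted at every level, which gives O(n log^2 n) rather than the O(n log n) obtained by merging pre-sorted y-orders.

```python
import math
from itertools import combinations


def closest_pair(points):
    """Return the smallest Euclidean distance between any two 2D points."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    return _solve(sorted(points))  # sort once by x (ties broken by y)


def _solve(px):
    n = len(px)
    if n <= 3:
        # Small case: compare every pair directly.
        return min(math.dist(a, b) for a, b in combinations(px, 2))
    mid = n // 2
    mid_x = px[mid][0]
    # Best distance found entirely within the left or the right half.
    d = min(_solve(px[:mid]), _solve(px[mid:]))
    # Only points closer than d to the dividing line can form a better pair.
    strip = sorted((p for p in px if abs(p[0] - mid_x) < d), key=lambda p: p[1])
    for i, p in enumerate(strip):
        # In y-order, at most 7 following points can lie within distance d.
        for q in strip[i + 1:i + 8]:
            if q[1] - p[1] >= d:
                break
            d = min(d, math.dist(p, q))
    return d
```

For example, `closest_pair([(0, 0), (3, 4), (1, 1), (1, 2)])` finds the pair (1, 1) and (1, 2).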
In view of the shortcomings of traditional clustering algorithms in intrusion detection systems, this paper proposes a method of selecting the initial clustering centers based on density, which can overcome the problem of choosing the K value in ordinary K-Means. The improved intrusion detection model can achieve good clustering results. Compared with the traditional K-Means, it is found that the improved algorithm...
This paper tackles the problem of finding the list of solutions with strictly increasing cost for the Semi-Assignment Problem. Four different algorithms are described and compared. The first two algorithms are based on a mathematical model and on a modification of Murty's algorithm, which was designed to find the list of solutions for the classical assignment problem. The third approach is a heuristic...
With the rapid development of intelligent transportation systems, modern society is producing massive amounts of data at an unprecedented speed. Making full use of the valuable information in big data is particularly important. This paper summarizes the basic concepts of association rule mining and how to use the Apriori algorithm for correlation analysis in massive data. In addition, the Apriori algorithm...
Graph models of social information systems typically contain trillions of edges. Such big graphs cannot be processed on a single machine. The graph object must be partitioned and distributed among machines and processed in parallel on a computer cluster. Programming such systems is very challenging. In this work, we present DH-Falcon, a graph DSL (domain-specific language) which can be used to implement...
Graph clustering, popularly known as community detection, is a fundamental kernel for several applications of relevance to the Defense Advanced Research Projects Agency's (DARPA) Hierarchical Identify Verify Exploit (HIVE) Program. Clusters or communities represent natural divisions within a network that are densely connected within a cluster and sparsely connected to the rest of the network. The...
Dynamic networks, especially those representing social networks, undergo constant evolution of their community structure over time. Nodes can migrate between different communities, communities can split into multiple new communities, communities can merge together, etc. In order to represent dynamic networks with evolving communities it is essential to use a dynamic model rather than a static one...
Grouping the vertices of a graph into sets of certain sizes such that a minimum number of edges cross between the sets is called graph partitioning. This NP (Non-deterministic Polynomial time)-complete problem has important applications in computing, task scheduling, and parallel processing. We are implementing Kernighan-Lin, a local algorithm on both a Central Processing Unit (CPU) and a Graphics Processing...
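The objective described above, and the quantity Kernighan-Lin evaluates when it considers exchanging two vertices, can be sketched as follows. This is a minimal illustration for unweighted, undirected graphs, not the CPU/GPU implementation from the paper; the function names and the edge-list representation are assumptions.

```python
def cut_size(edges, side):
    """Number of edges whose endpoints lie in different sets."""
    return sum(1 for u, v in edges if side[u] != side[v])


def swap_gain(edges, side, a, b):
    """Reduction in cut size if a and b (on opposite sides) exchange sets.

    Uses the standard Kernighan-Lin gain g = D(a) + D(b) - 2*c(a, b),
    where D(v) = external edges of v minus internal edges of v.
    """
    def d(v):
        external = internal = 0
        for x, y in edges:
            if v in (x, y) and x != y:
                other = y if x == v else x
                if side[other] != side[v]:
                    external += 1
                else:
                    internal += 1
        return external - internal

    # c(a, b): number of edges directly joining a and b.
    c_ab = sum(1 for x, y in edges if {x, y} == {a, b})
    return d(a) + d(b) - 2 * c_ab
```

For the 4-cycle `[(0, 1), (1, 2), (2, 3), (3, 0)]` split as {0, 2} versus {1, 3}, every edge is cut; swapping vertices 2 and 1 has gain 2 and leaves a cut of size 2.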
In GIS applications, displaying massive numbers of mark points has become a very important problem. Especially in vehicle networking applications, the high volume of moving vehicle positions increases the difficulty of the display problem. Many existing efforts propose aggregation algorithms for map mark points. However, all these works suffer from efficiency and scalability problems when we...
We present a heuristic algorithm to address the optimization problem of routing and spectrum allocation, aimed at traffic protection and restoration in an elastic optical network. The algorithm searches for working and backup disjoint paths, using the shared path protection scheme. It divides the spectrum into two partitions and prioritizes slots in one of them for backup path traffic. The way the...
We propose a heuristic for parallel partitioning of graphs into equi-sized components. In particular, we identify a relationship between the graph partitioning problem (GPP) and the traveling salesman problem (TSP), and use that to reduce partitioning to TSP. Given that better performing heuristics are known for TSP than are for GPP, this reduction also leads to improved GPP heuristics. What is more,...
We address a combinatorial optimization problem, namely the 1D array partitioning problem (1D-APP), having several real world applications such as scheduling independent tasks in parallel environments under contiguity constraint. We propose an exact binary search algorithm of pseudo-linearithmic complexity since the latter depends on the input data values. Our approach involves two phases. The first...
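To illustrate how a binary search can yield a complexity that depends on the input values, here is a minimal sketch. It assumes the common formulation of 1D-APP (split an array of non-negative integers into at most k contiguous blocks so that the largest block sum is minimal); the abstract does not state the objective, and this is not the two-phase algorithm of the paper. The names `can_split` and `min_bottleneck` are hypothetical.

```python
def can_split(weights, k, cap):
    """Greedy check: can weights be cut into at most k contiguous blocks,
    each with sum <= cap?"""
    parts, load = 1, 0
    for w in weights:
        if w > cap:
            return False  # a single element already exceeds the cap
        if load + w > cap:
            parts += 1    # start a new block at this element
            load = w
        else:
            load += w
    return parts <= k


def min_bottleneck(weights, k):
    """Smallest achievable maximum block sum, found by binary search on the
    answer between max(weights) and sum(weights)."""
    if not weights or k < 1:
        raise ValueError("need a non-empty array and k >= 1")
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if can_split(weights, k, mid):
            hi = mid       # feasible: try a smaller cap
        else:
            lo = mid + 1   # infeasible: the cap must grow
    return lo
```

Each feasibility check is linear in the array length, and the number of checks is logarithmic in the sum of the weights, so the running time depends on the data values as well as on the array length.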
We address the 1D array partitioning problem (1D-APP), an easy combinatorial optimization problem, for which an exact dynamic programming algorithm (DPA) is known in the literature. The DPA is structured in a perfect three DO-loop nest (3DLN) with affine loop bounds. Due to its cubic complexity, which may be too time consuming for large real-world problems, we propose a parallelization approach...
Scene management technology is one of the key technologies of virtual reality and visualization. In this paper, we propose a new method based on adaptive binary tree (ABT) and scene graph, which is used to improve the real-time rendering of indoor and outdoor objects and enhance the organization efficiency of the scene structure. The generation algorithm of adaptive binary tree, the scoring standard...
Future 5G networks provide network slices for applications in different areas on the same physical network. Low latency is one of the key requirements for the slices. How to establish low-latency network slices is an essential question. This article proposes a mapping algorithm for low-latency network slices based on linear programming. Firstly, this problem can be defined as a linear programming...
The K-means algorithm is a classical algorithm and has been widely used in many applications. However, the traditional K-means algorithm is easily influenced by outliers and usually obtains an unstable clustering result and poor clustering accuracy. In this paper, aiming at a K-means algorithm resistant to outliers, we propose a Capped Robust K-means Algorithm (CRK-means) by adding a capped norm and...
Clustering is an important task in the data mining area, especially for continuous streams of data, i.e. "data streams". However, some characteristics of this kind of data are neglected by the existing clustering approaches. The similarity in temporal dimension between entities is underestimated. Forgetting mechanism is adopted to remove the old patterns to save computation resources. However,...
Recently, multi-agent patrolling has become more and more crucial in security, monitoring, and similar applications. It can be used, for example, to monitor points of interest in space, such as measurement points or entrances to a guarded area. A good patrolling strategy would ensure frequent visits to all points of interest defined by a user. A variety of centralized and distributed approaches exist already...
In current large-scale distributed key-value stores for cloud computing, the tail latency of the hundreds of key-value accesses generated by an end-user request determines the response time of this request. Replica selection algorithms, which select the best replica server for each key-value access as much as possible, are crucial to reducing the tail latency. This paper summarizes current replica selection...
Telecommunications fraud, a new type of crime, has shown a rising trend in recent years. However, research from data mining perspectives to detect such frauds is scarce, especially with the behavioral sequences considered. Though the call detail records (CDRs) in telecommunication are generally snapshots, the history of a caller/callee can be treated as sequences. Indeed, the historical calling sequences...