K Nearest Neighbor Join (KNN Join) is a primitive operation widely adopted by many data mining applications. As a combination of the k nearest neighbor query and the join operation, KNN Join is a computationally intensive algorithm; however, with the increase of data volume and data dimension, the results can't be obtained within acceptable time when this algorithm runs on a single machine. Consequently,...
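To make the operation described in the abstract above concrete: a KNN join pairs every point of one dataset with its k nearest neighbors in another. The following is a minimal single-machine sketch, assuming Euclidean distance and a brute-force scan. The function name `knn_join` and the sample points are hypothetical, and this is not the distributed method the abstract goes on to propose.

```python
import heapq
import math


def knn_join(r, s, k):
    """Brute-force KNN join: map each index in r to the indices of its
    k nearest points in s, ordered from nearest to farthest."""
    result = {}
    for i, p in enumerate(r):
        # Scan every point in s; cost is O(|r| * |s|) distance evaluations.
        result[i] = heapq.nsmallest(
            k, range(len(s)), key=lambda j: math.dist(p, s[j])
        )
    return result


# Hypothetical sample data for illustration only.
R = [(0.0, 0.0), (5.0, 5.0)]
S = [(1.0, 0.0), (0.0, 2.0), (4.0, 5.0), (9.0, 9.0)]
pairs = knn_join(R, S, k=2)
print(pairs)
```

The quadratic number of distance computations in this sketch illustrates why the operation becomes expensive as data volume and dimensionality grow.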
We present a novel approach for phase denoising in Interferometric Synthetic Aperture Radar (InSAR) images, named Block-Matching InSAR (BMInSAR). It uses k-means clustering to solve the block matching similarity search problem, thus simplifying preprocessing steps and filtering several reference-blocks at once. Also, we propose a novel methodology based on ground-truth GPS measurements to assess...
In complex networks, communities often show the presence of homophily between members, since homophily is the tendency of individuals to associate and bond with similar others to form densely interconnected groups. In this paper, we propose a new bidirectional label propagation community detection algorithm, called Dis-Sim, based on the fundamental idea that a node and its most similar neighbors...
In the literature, many clustering algorithms have been proposed for vehicular ad hoc networks (VANETs) to improve network stability and scalability. However, there is a lack of comprehensive comparison among them. In this paper, we show that clustering algorithms have been compared unfairly with respect to simulators, performance metrics, simulation scenarios, and configuration of algorithms...
In order for autonomous surface vessels (ASVs) to avoid collisions at sea it is necessary to predict the future trajectories of surrounding vessels. This paper investigates the use of historical automatic identification system (AIS) data to predict such trajectories. The availability of AIS data has steadily increased in recent years as a result of more regulations, together with wider coverage...
Partitioning of electric networks into zones or areas is a procedure that has numerous applications in power system planning, operation and control. Spectral clustering based approaches are among the most favoured ones to solve the partitioning problem. Applications of spectral clustering include definition of control zones, analysis of connectivity structure of power networks, intentional controlled...
We present a new algorithm for discovering clusters in noisy data streams using dynamic and cluster-specific temporal decay factors. Our improvement helps identify and adapt to evolving trends by adapting the weighting of stream data based on both content attributes and temporal arrival patterns. Our experimental results show that the proposed algorithm can discover better quality clusters in noisy...
Radio environment maps can be a powerful tool for achieving efficient context-aware resource allocation in 5G heterogeneous networks. In this paper, we consider a heterogeneous network formed by a traditional cellular network and a wireless sensor network. The role of the wireless sensor network is to estimate the radio environment map of the cell using a geostatistical interpolation technique named...
Outlier detection has been shown to be a promising machine learning technique for a diverse array of fields and problem areas. However, traditional, supervised outlier detection is not well suited for problems such as network intrusion detection, where proper labelled data is scarce. This has created a focus on extending these approaches to be unsupervised, removing the need for explicit labels, but...
With continuously growing data, clusters also need to grow periodically to accommodate the increased demand of data processing. This is usually done by addition of newer hardware, whose configuration might differ from the existing nodes. As a result, clusters are becoming heterogeneous in nature. For many real world machine learning and data mining applications, data is represented in the form of...
We study a universal outlying sequence detection problem, in which there are M sequences of samples out of which a small subset of outliers need to be detected. A sequence is considered as an outlier if the observations therein are generated by a distribution different from those generating the observations in the majority of the sequences. In the universal setting, the goal is to identify all the...
We describe an analytical process to determine how much UAS traffic is feasible. The process consists of a simulator and data processing tools. The two are applied to the US San Francisco Bay Area and Norrköping, Sweden. The amount of UAS traffic is measured in flights per day and simulated up to 200,000 flights. A UAS traffic volume is feasible if specified metrics meet operational requirements with high...
With the increasing penetration of renewable energy sources in the modern electric grid, it becomes more technically difficult and costly for system operators to balance generation and demand as traditional providers of flexibility (i.e., flexible generation) become uneconomic. Therefore new sources of flexibility are needed to maintain reliable operation. Flexible demand, including from electric...
The community structure of complex networks reveals hidden relationships in the organization of their constituent nodes. Indeed, many practical problems stemming from different fields of knowledge such as Biology, Sociology, Chemistry and Computer Science can be modeled as a graph. Therefore, graph analysis and community detection have become a key component for understanding the inherent relational...
This work presents a computationally efficient real-time adaptive clustering algorithm that recognizes and adapts to dynamic changes observed in neural recordings. The algorithm consists of an off-line training phase that determines initial cluster positions and an on-line operation phase that continuously tracks drifts in clusters and periodically verifies acute changes in cluster composition. Analysis...
Semi-supervised clustering has been widely explored in recent years. In this paper, we present HCAC-ML (Hierarchical Confidence-based Active Clustering with Metric Learning), an innovative approach for this task which employs distance metric learning through cluster-level constraints. HCAC-ML is based on the HCAC algorithm, a state-of-the-art algorithm for hierarchical semi-supervised clustering...
Trajectory reversing is a method commonly used for estimating the region of attraction of stabilized equilibria. Using a discrete set of points obtained by trajectory reversing, this paper presents an algorithm for estimation and mathematical representation of the region of attraction using convex hulls. Several two-dimensional examples are presented to illustrate the usefulness of the algorithm....
In practice, many real-world datasets are imbalanced, with one of two classes dominating the data. These datasets are generally difficult to classify using machine learning algorithms as the skewed nature of the data has a significant impact on the training process. In order to combat this difficulty, many methods of undersampling and oversampling have been proposed...
The Service Oriented Computing (SOC) paradigm promotes building new applications by discovering and then invoking services, i.e., software components accessible through the Internet. Discovering services means inspecting registries where textual descriptions of services' functional capabilities are stored. To automate this, existing approaches index descriptions and associate users' queries to relevant...
Credit scoring plays an important role in financial institutions and debt based crowdfunding platforms as well as peer to peer lending platforms. In the last few years, adopting ensemble methods for credit scoring has become much more popular. However, the performance of ensemble methods is easily affected by the parameter settings and the number of base classifiers. Ensemble classification based...