Apache Spark is an open-source distributed data processing platform that uses a distributed memory abstraction to process large volumes of data efficiently. However, the performance of a particular job on the Apache Spark platform can vary significantly depending on the input data type and size, the design and implementation of the algorithm, and the computing capability, making it extremely difficult to predict the...
With an ever-increasing amount of information made available via the Internet, it is getting more and more difficult to find the relevant pieces of information. Recommender systems have thus become an essential part of information technology. Although a lot of research has been devoted to this area, the factors influencing the quality of recommendations are not completely understood. This paper examines...
Cohen's κ coefficient has been widely used for assessing classification results derived from remote sensing data. However, it has several limitations that prevent both its efficient use and its generalisation. This paper reviews these problems and proposes Krippendorff's α coefficient as an alternative to Cohen's κ. Krippendorff's α indeed presents less...
Satellite-borne or aircraft-borne synthetic aperture radar (SAR) is useful for high-resolution imaging of terrain surfaces for monitoring or surveillance, even in optically harsh environments. For surveillance applications, there are various approaches to automatic target recognition (ATR) in SAR images, aimed at monitoring unidentified ships or aircraft. In addition, various types...
An important aspect of research in the remote sensing field is the objective comparison of different classifiers. This is the foundation of hundreds of research projects, and in this paper we address some concerns that arise when evaluating solutions for the classification of data sets with skewed class distributions. The quality of assessment is based on the problem specified by the user and the corresponding...
Machine-learning-based vulnerability prediction models have recently been gaining popularity in the web security space, as these models provide a simple and efficient way to handle web application security issues. Existing state-of-the-art Cross-Site Scripting (XSS) vulnerability prediction approaches do not consider the context of the user input in the output statement, which is very important to identify context-sensitive...
Effort estimation is a project management activity that is mandatory for the execution of software projects. Despite its importance, there have been just a few studies published on such activities within the Agile Global Software Development (AGSD) context. Their aggregated results were recently published as part of a secondary study that reported the state of the art on effort estimation in AGSD...
In this research, we propose using time context to improve the predictive accuracy and quality of collaborative filtering for music recommendation. We use time-contextual information, known as micro-profiling. Thus, each user has multiple micro-profiles (in particular, six time slots) instead of a single profile. Recommendation is performed based on these micro-profiles. Our method takes into account...
Handwritten signature recognition is an important component of biometric authentication. It is a central process in a broad range of areas requiring personal identification, such as security, legal contracts and bank transactions. Extensive research effort has gone into the verification of handwritten signatures, which contain biometric information. Although many successful methods...
Coverage-based fault localization (CBFL) techniques leverage coverage information to identify suspicious program entities for inspection. However, coincidental correctness (CC) occurs widely during software debugging and negatively affects the effectiveness of CBFL techniques. In this paper, we propose a regression approach to identify CC executions with weighted clustering analysis. Based on...
A Particle Swarm Optimization (PSO) technique, in conjunction with Fuzzy Adaptive Resonance Theory (ART), was implemented to adapt vigilance values to appropriately compensate for a disparity in data sparsity. Gaining the ability to optimize a vigilance threshold over each cluster as it is created is useful because not all conceivable clusters have the same sparsity from the cluster centroid. Instead...
This paper presents a subject centric group feature for person re-identification. Our approach is inspired by the observation that people often tend to walk alongside others or in a group. We argue that co-travelers' information, including geometry and visual cues, can reduce the re-identification ambiguity and lead to better accuracy, compared to approaches that rely only on visual cues. We introduce...
The detection of groups of people is attracting the attention of many researchers in diverse fields, with a growing number of approaches published each year. Despite this, the evaluation metrics are not consolidated: some measures are inherited from the people detection field, while others are designed specifically for a particular approach, generating a set of non-comparable figures of merit. Moreover,...
In multi-label classification, labels are often correlated with each other. Exploiting label correlations can improve the performance of classifiers. Current multi-label classification methods mainly consider global label correlations. However, the label correlations may differ across data groups. In this paper, we propose a simple and efficient framework for multi-label classification,...
As business processes have become increasingly automated, data quality has become the limiting and penalizing factor in a business service's overall quality, and thus affects the satisfaction of the customer, whether that customer is an end user, an institutional partner or a regulatory authority. The available research on business service quality has paid very little attention to the impact of poor data quality...
In creating web pages, books, or presentation slides, consistent use of tasteful visual style(s) is quite important. In this paper, we consider the problem of style-based comparison and retrieval of illustrations. In their pioneering work, Garces et al. [2] proposed an algorithm for comparing illustrative style. The algorithm uses supervised learning that relies on stylistic labels present in a training...
A Digital Elevation Model (DEM) is crucial for many purposes, such as town planning, hydrological analysis, landslide, flash flood and earthquake studies, road construction, surface analysis, ortho-rectification of satellite imagery, 3D visualization, precision farming and forestry, base mapping, flight simulation and disaster management. Pleiades is a French constellation of very high resolution satellites....
Churn prediction, or the task of identifying customers who are likely to discontinue use of a service, is an important and lucrative concern of firms in many different industries. As these firms collect an increasing amount of large-scale, heterogeneous data on the characteristics and behaviors of customers, new methods become possible for predicting churn. In this paper, we present a unified analytic...
Amongst all the social media platforms available, Twitter is rapidly becoming the main one used for communications about real-time events. As a result, there is a lot of interest in monitoring Twitter and understanding the topics of conversations. However, the fact that tweets are short in content makes topic derivation a challenge, as most existing methods use content features only, sometimes integrated...
In today's IVHM systems, diagnostics and prognostics play a crucial part in system safety while reducing operating and maintenance costs. Structural health management is a vital part of IVHM because structures are arguably the biggest and most costly part of the system, so the failure of a structure could lead to catastrophic results. The failure of a structure is usually caused by cracks or...