The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
The data mining and machine learning community is often faced with two key problems: working with imbalanced data and selecting the best features for machine learning. This paper presents a process involving a feature selection technique for selecting the important attributes and a data sampling technique for addressing class imbalance. The application domain of this study is software engineering,...
A large system often goes through multiple software project development cycles, in part due to changes in operation and development environments. For example, rapid turnover of the development team between releases can influence software quality, making it important to mine software project data over multiple system releases when building defect predictors. Data collection of software attributes are...
Software metrics collected during project development play a critical role in software quality assurance. A software practitioner is very keen on learning which software metrics to focus on for software quality prediction. While a concise set of software metrics is often desired, a typical project collects a very large number of metrics. Minimal attention has been devoted to finding the minimum set...
There is no general consensus on which classifier performance metrics are better to use as compared to others. While some studies investigate a handful of such metrics in a comparative fashion, an evaluation of specific relationships among a large set of commonly-used performance metrics is much needed in the data mining and machine learning community. This study provides a unique insight into the...
There are several performance metrics that have been proposed for evaluating a classification model, e.g., accuracy, error rates, precision, recall, etc. While it is known that evaluating a classifier on only one performance metric is not advisable, the use of multiple performance metrics poses unique comparative challenges for the analyst. Since different performance metrics provide different perspectives...
Software quality modeling for high-assurance systems, such as safety-critical systems, is adversely affected by the skewed distribution of fault-prone program modules. This sparsity of defect occurrence within the software system impedes training and performance of software quality estimation models. Data sampling approaches presented in data mining and machine learning literature can be used to address...
The problem of class imbalance in machine learning is quite real and cumbersome when it comes to building a useful and practical classification model. We present a unique insight into addressing class imbalance for classification problems that involve three or more categories, i.e. non-binary. This study is different than related works in the literature because most works focus on addressing class...
Assuring whether the desired software quality and reliability is met for a project is as important as delivering it within scheduled budget and time. This is especially vital for high-assurance software systems where software failures can have severe consequences. To achieve the desired software quality, practitioners utilize software quality models to identify high-risk program modules: e.g., software...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.