Due to the increased importance of machine learning in software and security engineering, effective training is needed that allows software engineers to quickly learn the basic knowledge required to understand and successfully apply prediction models. In this paper, we present a two-day seminar to teach machine learning-based prediction in software engineering and the evaluation of its learning effects...
Build systems play a crucial role in modern software engineering. Recent studies have shown that many builds fail, mostly due to neglected maintenance. This blocks teams from continuing the development and costs time and resources to fix. The target of the thesis is to reduce build breakage by investigating changes that lead to failing builds, identifying bad and best practices for build configuration,...
Researchers often focus on the development process and the final product (source code) to investigate and predict software defects. Unfortunately, these models may not be applicable to software projects in which there is no access to the data sources regarding the development process. For example, in cases when a company conducts tests on behalf of its business contractors, it is only possible to evaluate...
We would like to present the idea of our Continuous Defect Prediction (CDP) research and a related dataset that we created and share. Our dataset is currently a set of more than 11 million data rows, representing files involved in Continuous Integration (CI) builds, that synthesize the results of CI builds with data we mine from software repositories. Our dataset embraces 1265 software projects, 30,022...
The amount of software in modern vehicles is constantly growing. However, the risk of functional and quality deficiencies increases with size. In industry, this results, for example, in inevitable and unexpected refactorings of software models, which in turn slow down development processes. In this industrial case study, we evaluate model growth predictors applied to foresee critical...
When there is not enough historical defect data to build an accurate prediction model, semi-supervised defect prediction (SSDP) and cross-project defect prediction (CPDP) are two feasible solutions. Existing CPDP methods assume that the available source data is well labeled. However, because labeling a large amount of defect data requires expensive human effort, usually, we can only make use of the...
Background. Software systems rely more and more on multi-core hardware, which requires a parallel approach to address problems and improve performance. Unfortunately, parallel development is error-prone, and many developers are not very experienced with this paradigm, in part because identifying, reproducing, and fixing bugs is often difficult. Objective. The main goal of this paper is the identification...
Years of research in software engineering have given us novel ways to reason about, test, and predict the behavior of complex software systems that contain hundreds of thousands of lines of code. Many of these techniques, such as genetic algorithms, swarm intelligence, and ant colony optimization, have been inspired by nature. In this paper we reverse the direction and present BioSIMP, a process that...
Build systems are an essential part of modern software engineering projects. As software projects change continuously, it is crucial to understand how the build system changes because neglecting its maintenance can lead to expensive build breakage. Recent studies have investigated the (co-)evolution of build configurations and reasons for build breakage, but they did this only on a coarse grained...
It is common practice to discretize continuous defect counts into defective and non-defective classes and use them as a target variable when building defect classifiers (discretized classifiers). However, this discretization of continuous defect counts leads to information loss that might affect the performance and interpretation of defect classifiers. Another possible approach to build defect classifiers...
There has been a significant interest in the estimation of time and effort in fixing defects among both software practitioners and researchers over the past two decades. However, most of the focus has been on prediction of time and effort in resolving bugs, without much regard to predicting time needed to complete high-level requirements, a critical step in release planning. In this paper, we describe...
Although peer code review is widely adopted in both commercial and open source development, existing studies suggest that such code reviews often contain a significant amount of non-useful review comments. Unfortunately, to date, no tools or techniques exist that can provide automatic support in improving those non-useful comments. In this paper, we first report a comparative study between useful...
Automated builds are integral to the Continuous Integration (CI) software development practice. In CI, developers are encouraged to integrate early and often. However, long build times can be an issue when integrations are frequent. This research focuses on finding a balance between integrating often and keeping developers productive. We propose and analyze models that can predict the build time of...
The objective of this research work is to develop a proficient recommender system for effective bug triaging. To build it, we begin by introducing a novel time-based model, Visheshagya, for bug report assignment. Subsequently, we propose a novel AHP-based bug assignment approach, W8Prioritizer, based on bug parameter prioritization. We further extend our work to triaging Non-reproducible (NR)...
We propose to study the impact of the representation of the data in defect prediction models. For this study, we focus on the use of developer activity data, from which we structure dependency graphs. Then, instead of manually generating features, such as network metrics, we propose a model inspired by recent advances in Representation Learning which are able to automatically learn representations...
Background. Solving the class-imbalance problem of within-project software defect prediction (SDP) is an important research topic. Although some class-imbalance learning methods have been presented, there exists room for improvement. For cross-project SDP, we found that the class-imbalanced source usually leads to misclassification of defective instances. However, only one work has paid attention...
It is well recognized that effort estimation is an essential part of successful software management. Among many estimation models, Case-Base Effort Estimation (CBEE) has been used intensively by researchers and practitioners as a promising model for better and more accurate effort prediction. The common challenges with this model are: (1) finding the nearest cases to the new case, (2) selecting...
Signature extraction is a critical preprocessing step in forensic log analysis because it enables sophisticated analysis techniques to be applied to logs. Currently, most signature extraction frameworks use either rule-based approaches or hand-crafted algorithms. Rule-based systems are error-prone and require high maintenance effort. Hand-crafted algorithms use heuristics and tend to work well only...
We introduce a bi-objective effort estimation algorithm that combines Confidence Interval Analysis and assessment of Mean Absolute Error. We evaluate our proposed algorithm on three different alternative formulations, baseline comparators and current state-of-the-art effort estimators applied to five real-world datasets from the PROMISE repository, involving 724 different software projects in total...
Defect prediction on projects with limited historical data has attracted great interest from both researchers and practitioners. Cross-project defect prediction has been the main area of progress by reusing classifiers from other projects. However, existing approaches require some degree of homogeneity (e.g., a similar distribution of metric values) between the training projects and the target project...