The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Background: Merge conflicts are a common occurrence in software development. Researchers have shown the negative impact of conflicts on the resulting code quality and the development workflow. Thus far, no one has investigated the effect of bad design (code smells) on merge conflicts. Aims: We posit that entities that exhibit certain types of code smells are more likely to be involved in a merge conflict...
Currently, open source projects receive various kinds of issues daily, because of the extreme openness of Issue Tracking System (ITS) in GitHub. ITS is a labor-intensive and time-consuming task of issue categorization for project managers. However, a contributor is only required a short textual abstract to report an issue in GitHub. Thus, most traditional classification approaches based on detailed...
Understanding the types of defects is of practical interest, which could help developers adopt proper measures in current and future software releases. As the amount of bug reports increasing, manual classification brings a heavy burden to developers. In this paper, we propose a word2vec based framework of multi-granularity automatic classification for bug reports based on fault triggers. Except classifying...
Software change histories are results of incremental updates made by developers. As a side-effect of the software development process, version history is a surprisingly useful source of information for understanding, maintaining and reusing software. However, traditional commit-based sequential organization of version histories lacks semantic structure and thus are insufficient for many development...
Binary lifting, which is to translate a binary executable to a high-level intermediate representation, is a primary step in binary analysis. Despite its importance, there are only few existing approaches to testing the correctness of binary lifters. Furthermore, the existing approaches suffer from low test coverage, because they largely depend on random test case generation. In this paper, we present...
Code changes are often reviewed before they are deployed. Popular source control systems aid code review by presenting textual differences between old and new versions of the code, leaving developers with the difficult task of determining whether the differences actually produced the desired behavior. Fortunately, we can mine such information from code repositories. We propose aiding code review with...
Numerical software is used in a wide variety of applications including safety-critical systems, which have stringent correctness requirements, and whose failures have catastrophic consequences that endanger human life. Numerical bugs are known to be particularly difficult to diagnose and fix, largely due to the use of approximate representations of numbers such as floating point. Understanding the...
Programmers always fix bugs when maintaining software. Previous studies showed that developers apply repeated bug fixes—similar or identical code changes—to multiple locations. Based on the observation, researchers built tools to identify code locations in need of similar changes, or to suggest similar bug fixes to multiple code fragments. However, some fundamental research questions, such as what...
To improve the accuracy of analysis results is one of the hard challenges for static analysis. Especially, static analyzers generally analyze all paths of a program, including infeasible paths, which undoubtedly decreases the analysis accuracy. To mitigate the issue, we design and implement a static analyzer, called ABAZER-SE, which is based on the meta-compilation and the GCC abstract syntax tree...
Differential testing uses similar programs as cross-referencing oracles to find semantic bugs that do not exhibit explicit erroneous behaviors like crashes or assertion failures. Unfortunately, existing differential testing tools are domain-specific and inefficient, requiring large numbers of test inputs to find a single bug. In this paper, we address these issues by designing and implementing NEZHA,...
The application of information retrieval techniques to search tasks in software engineering is made difficult by the lexical gap between search queries, usually expressed in natural language (e.g. English), and retrieved documents, usually expressed in code (e.g. programming languages). This is often the case in bug and feature location, community question answering, or more generally the communication...
Since debugging is a time-consuming activity, automated program repair tools such as GenProg have garnered interest. A recent study revealed that the majority of GenProg repairs avoid bugs simply by deleting functionality. We found that SPR, a state-of-the-art repair tool proposed in 2015, still deletes functionality in their many "plausible" repairs. Unlike generate-and-validate systems...
Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, where is it located in the source code files? Information retrieval (IR) approaches see a bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the...
To quickly locate the source code that maps to a specific change described in change history, establishing traceability links between release notes and source code is a necessary task. Current works on the traceability link recovery can be used to find out source code changes which are of higher textual similarities with the release note. However, these approaches rely on consistency of the text used...
Automatic Program Repair (APR) has recently been an emerging research area, addressing an important challenge in software engineering. APR techniques, if effective and efficient, can greatly help software debugging and maintenance. Recently proposed APR techniques can be generally classified into two families, namely search-based and semantics-based APR methods. To produce repairs, search based APR...
To improve software reliability, many rule-based techniques have been proposed to infer programming rules and detect violations of these rules as bugs. These rule-based approaches often rely on the highly frequent appearances of certain patterns in a project to infer rules. It is known that if a pattern does not appear frequently enough, rules are not learned, thus missing many bugs. In this paper,...
The traditional duplicate bug reports detection approaches are usually based on vector space model. However, the experimental result is rarely satisfying since this method cannot distinguish semantic correlation among bug reports which written by natural languages. Topic model, as a method to model underlying topics of texts, can solve the problem of document similarity calculation methods used in...
The major part of risk the development of software or programs is existence of duplicate code that can affect the software maintainability. The main aim of Clone identification technique is to search and detect the parts of the software code which is identical. In the passed there are various techniques that are used to identify and reflect the code identity and code fragments. Code cloning reduces...
Automated program repair can potentially reduce debugging costs and improvesoftware quality but recent studies have drawn attention to shortcomings inthe quality of automatically generated repairs. We propose a new kind ofrepair that uses the large body of existing open-source code to findpotential fixes. The key challenges lie in efficiently finding codesemantically similar (but not identical) to...
Context: Software source code is frequently changed for fixing revealed bugs. These bug-fixing changes might introduce unintended system behaviors, which are inconsistent with scenarios of existing regression test cases, and consequently break regression testing. For validating the quality of changes, regression testing is a required process before submitting changes during the development of software...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.