Despite a decade of active research, there has been a marked lack of clone detection techniques that scale to large repositories for detecting near-miss clones. In this paper, we present a token-based clone detector, SourcererCC, that can detect both exact and near-miss clones from large inter-project repositories using a standard workstation. It exploits an optimized inverted index to quickly query...
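To make the idea of token-based detection with an inverted index concrete, here is a minimal sketch. It is not SourcererCC's actual algorithm; it only assumes that each code block is treated as a bag of tokens, that an inverted index maps each token to the blocks containing it, and that two blocks count as clones when their token overlap reaches a threshold fraction of the larger block. The names `tokenize`, `overlap`, `find_clones`, and the `threshold` parameter are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: a crude language-agnostic tokenizer (identifiers, numbers, symbols).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(code):
    """Return the bag (multiset) of tokens in a code block."""
    return Counter(TOKEN_RE.findall(code))


def overlap(a, b):
    """Size of the multiset intersection of two token bags."""
    return sum((a & b).values())


def find_clones(blocks, threshold=0.8):
    """Return pairs of block names whose token overlap meets the threshold.

    blocks: dict mapping a block name to its source text.
    The inverted index limits comparisons to blocks sharing at least one token.
    """
    bags = {name: tokenize(src) for name, src in blocks.items()}
    index = defaultdict(set)  # token -> names of already-indexed blocks
    pairs = set()
    for name, bag in bags.items():
        # Query the index for candidates before adding the current block.
        candidates = set()
        for tok in bag:
            candidates |= index.get(tok, set())
        for other in candidates:
            larger = max(sum(bag.values()), sum(bags[other].values()))
            if overlap(bag, bags[other]) >= threshold * larger:
                pairs.add(tuple(sorted((name, other))))
        for tok in bag:
            index[tok].add(name)
    return pairs
```

In this sketch, a block that differs from another only by one inserted statement can still be reported, because most of its tokens are shared. A real detector would add filtering heuristics so that common tokens such as `;` do not make every block a candidate.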
Given the availability of large source-code repositories, there have been a large number of applications for large-scale clone detection. Unfortunately, despite a decade of active research, there is a marked lack of clone detectors that scale to big software systems or large repositories, specifically for detecting near-miss (Type 3) clones where significant editing activities may take place in the...
Source code similarity measurement is a fundamental technique in software engineering research. Techniques to measure code similarity have been invented and applied to various research areas such as code clone detection, finding bug fixes, and software plagiarism detection. We perform an evaluation of 30 similarity analysers for source code. The results show that specialised tools including clone...
Developers often reuse existing software by copy and paste. Source code reuse improves productivity and software quality. On the other hand, source code reuse requires several professional skills of developers. In source code reuse, developers must locate reusable code fragments, and judge whether such reusable code is adequate to copy and paste into the source file under development. This paper presents...
Duplicated code or code clones are a kind of code smell that have both positive and negative impacts on the development and maintenance of software systems. Software clone research in the past mostly focused on the detection and analysis of code clones, while research in recent years extends to the whole spectrum of clone management. In the last decade, three surveys appeared in the literature, which...
Software project forking, that is, copying an existing project and developing a new independent project from the copy, occurs frequently in software development. Analysing the code similarities between such software projects is useful as developers can use similarity information to merge the forked systems or migrate them towards a reuse approach. Several techniques for detecting cross-project similarities...
This paper focuses on the applicability of clone detectors for system evolution understanding. Specifically, it is a case study of Firefox for which the development release cycle changed from a slow release cycle to a fast release cycle two years ago. Since the transition of the release cycle, three times more versions of the software were deployed. To understand whether or not the changes between...
Many tools for clone detection exist. Each has its own model of clones and its own data format. This makes it difficult to share results, compare detectors, and replicate existing studies. Although there have been attempts to provide unified clone models in the past, no widely accepted unified clone model and data format has emerged. This paper discusses challenges that may be the reason why this...
In order to do research on code clones, it is necessary to have information about code clones. For example, if the research is to improve clone detection, this information would be used to validate the detectors or provide a benchmark to compare different detectors. Or if the research is on techniques for managing clones, then the information would be used as input to such techniques. Typically, researchers...
The advent of the internet and the growth of open source software repositories have made source code readily accessible to software developers. Although reusing source code has its own advantages, care must be taken to ensure that proprietary software does not infringe any licenses. In this context, plagiarism detection plays an important role. In this paper, we propose a robust technique to detect plagiarism...
In this paper, we propose a new semantic clone detection technique by comparing programs' abstract memory states, which are computed by a semantic-based static analyzer. Our experimental study using three large-scale open source projects shows that our technique can detect semantic clones that existing syntactic- or semantic-based clone detectors miss. Our technique can help developers identify inconsistent...
Code clone detection is an enabling technology for plenty of applications, each with different requirements for a clone detector. In this paper, we present a generic pipeline model of the code clone detection process. Based on this model, we developed the JCCD code clone detection API for implementing custom clone detectors. By combining and parameterizing predefined API components as well as by adding...
Clones are generally considered bad programming practice in software engineering folklore. They are identified as a bad smell and a major contributor to project maintenance difficulties. Clones inherently cause code bloat, thus increasing project size and maintenance costs. In this work, we try to validate the conventional wisdom empirically to see whether cloning makes code more defect prone. This...
Cloning in source code is a well-known quality defect that negatively affects software maintenance. In contrast, little is known about cloning in requirements specifications. We present a study on cloning in 11 real-world requirements specifications comprising 2,500 pages. For specification clone detection, an existing code clone detection tool is adapted and its precision analyzed. The study shows...
Code reuse through copying and pasting leads to so-called software clones. These clones can be roughly categorized into identical fragments (type-1 clones), fragments with parameter substitution (type-2 clones), and similar fragments that differ through modified, deleted, or added statements (type-3 clones). Although there has been extensive research on detecting clones, detection of type-3 clones...
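The three clone categories named in the abstract above can be illustrated with a small sketch. It is not the method of the paper. It assumes a simplified model: type-1 means identical token sequences (whitespace ignored, comments not handled), type-2 means identical sequences once every identifier and numeric literal is replaced by a placeholder (looser than consistent parameter substitution), and type-3 means the normalized sequences are similar above a threshold. The function names, the `KEYWORDS` set, and the `threshold` value are hypothetical.

```python
import difflib
import re

TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Assumption: a tiny keyword set for a C-like language, for illustration only.
KEYWORDS = {"int", "void", "return", "if", "else", "for", "while"}


def tokens(code):
    """Split source text into tokens, discarding whitespace."""
    return TOKEN_RE.findall(code)


def normalize(toks):
    """Replace identifiers with ID and numeric literals with LIT."""
    out = []
    for t in toks:
        if t in KEYWORDS:
            out.append(t)
        elif t[0].isalpha() or t[0] == "_":
            out.append("ID")
        elif t[0].isdigit():
            out.append("LIT")
        else:
            out.append(t)
    return out


def classify(a, b, threshold=0.7):
    """Classify a pair of fragments under the simplified model above."""
    ta, tb = tokens(a), tokens(b)
    if ta == tb:
        return "type-1"
    na, nb = normalize(ta), normalize(tb)
    if na == nb:
        return "type-2"
    # Statement-level edits leave most of the normalized sequence intact.
    if difflib.SequenceMatcher(None, na, nb).ratio() >= threshold:
        return "type-3"
    return "not a clone"
```

For example, under these assumptions `int f(int a) { return a + 1; }` and `int g(int x) { return x + 2; }` differ only in names and a literal, so the pair is classified as type-2.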
The area of clone detection has considerably evolved over the last decade, leading to approaches with better results, but at the same time using more elaborate algorithms and tool chains. In our opinion, a level has been reached where the initial investment required to set up a clone detection tool chain and the code infrastructure required for experimenting with new heuristics and algorithms seriously...