CBCD: Cloned Buggy Code Detector

Title: CBCD: Cloned Buggy Code Detector
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Li J, Ernst MD
Conference Name: ICSE'12, Proceedings of the 34th International Conference on Software Engineering
Date or Month Published: June 6–8
Conference Location: Zürich, Switzerland
Abstract: Developers often copy, or clone, code in order to reuse or modify functionality. When they do so, they also clone any bugs in the original code. Or, different developers may independently make the same mistake. As one example of a bug, multiple products in a product line may use a component in a similar wrong way. This paper makes two contributions. First, it presents an empirical study of cloned buggy code. In a large industrial product line, about 4% of the bugs are duplicated across more than one product or file. In three open source projects (the Linux kernel, the Git version control system, and the PostgreSQL database) we found 282, 33, and 33 duplicated bugs, respectively. Second, this paper presents a tool, CBCD, that searches for code that is semantically identical to given buggy code. CBCD tests graph isomorphism over the Program Dependency Graph (PDG) representation and uses four optimizations. We evaluated CBCD by searching for known clones of buggy code segments in the three projects and compared the results with text-based, token-based, and AST-based code clone detectors, namely Simian, CCFinder, Deckard, and CloneDR. The evaluation shows that CBCD is fast when searching for possible clones of the buggy code in a large system, and it is more precise for this purpose than the other code clone detectors.
Downloads: TR UW-CSE-11-05-02
Citation Key: LiE2012
Last changed: Mon, 2013-06-03 10:27