CBCD: Cloned Buggy Code Detector
Submitted by mernst on Wed, 2011-11-30 14:35
| Title | CBCD: Cloned Buggy Code Detector |
| Publication Type | Miscellaneous |
| Year of Publication | 2011 |
| Authors | Li J, Ernst MD |
| Abstract | Developers often copy, or clone, code in order to reuse or modify functionality. When they do so, they also clone any bugs in the original code. Or, different developers may independently make the same mistake. As one example of a bug, multiple products in a product line may use a component in a similar wrong way. This paper makes two contributions. First, it presents an empirical study of cloned buggy code. In a large industrial product line, about 4% of the bugs are duplicated across more than one product or file. In three open-source projects (the Linux kernel, the Git version control system, and the PostgreSQL database), we found 282, 33, and 33 duplicated bugs, respectively. Second, this paper presents a tool, CBCD, that searches for code that is semantically identical to given buggy code. CBCD tests graph isomorphism over the Program Dependence Graph (PDG) representation and uses four optimizations. We evaluated CBCD by searching for known clones of buggy code segments in the three projects and compared the results with text-based, token-based, and AST-based code clone detectors, namely Simian, CCFinder, and CloneDR. The results of the evaluation show that CBCD is applicable for its principal use: it is fast when searching for possible clones of the buggy code in a large system, and it is more precise than the other code clone detectors. |
| Notes | Revised October 2011 |
| Citation Key | LiE2011 |
Last changed Mon, 2013-06-03 10:27
