Abstract

There has been a broad assumption that code clones are inherently bad and that eliminating them by refactoring would solve the problems they cause. To investigate whether this assumption is valid, we developed a formal definition of clone evolution and built a clone genealogy tool that automatically extracts the history of code clones from a source code repository. Using our clone genealogy extractor, we studied the evolution of code clones in two open source Java projects.

 

Our study of clone evolution contradicts some conventional wisdom about clones: refactoring may not benefit many clones, for two reasons.

First, many code clones exist in the system for only a short time; extensive refactoring of such short-lived clones may not be worthwhile if they are likely to diverge from one another very soon.

Second, many clones, especially long-lived clones that have changed consistently with other elements in the same group, are not locally refactorable due to programming language limitations. Our study shows that there are types of clones that refactoring would not help, and it opens up opportunities for clone maintenance tools that use clone genealogy information to target these unaddressed classes of clones.

 

Model

 

 

Clone Genealogy Data

 

Dnsjava

 

 

Carol