Tags: programming-languages, duplicates, language-design, code-duplication

Is code cloning a common practice in C, Java and Python?


Code clones, also known as duplicate code, are often considered harmful to system quality.

  1. I'm wondering whether such duplicate code can be found in standard APIs or other mature tools.
  2. If that is indeed the case, which language (such as C, Java, Python, Common Lisp, etc.) do you think is more likely to lead to code cloning?

Solution

  • Code cloning is extremely common no matter what programming language is used; yes, even in C, Python and Java.

    People do it because it makes them efficient in the short term; they're doing code reuse. It's arguably bad, because it causes group inefficiencies in the long term. Clones reuse the bugs and assumptions of the original code, and when those are discovered and need to be fixed, every clone needs the same fix. The programmer doing the fixing doesn't know where the clones are, or even whether there are any.

    I don't think clones are bad, because of the code reuse effect. I think what is bad is not managing them.
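
    To make the maintenance problem concrete, here is a minimal, hypothetical sketch in Python. The function names and the bug are invented for illustration and don't come from any real codebase. Two copies of the same parsing code share one wrong assumption (that percentages are whole numbers), while a single shared helper only has to be fixed once:

```python
# Unmanaged clones: the second function was copy-pasted from the first,
# so both carry the same hidden assumption (integer percentages only).
def parse_discount(text):
    return int(text.strip("%")) / 100      # fails on "12.5%"


def parse_tax_rate(text):                  # clone of parse_discount
    return int(text.strip("%")) / 100      # same bug, must be found and fixed separately


# Managed reuse: one shared helper, so the fix is applied in one place.
def parse_percent(text):
    return float(text.strip().rstrip("%")) / 100


if __name__ == "__main__":
    print(parse_percent("12.5%"))
    try:
        parse_tax_rate("12.5%")
    except ValueError as exc:
        print("clone still broken:", exc)
```

    Fixing parse_discount alone would leave parse_tax_rate broken, and nothing in the code tells you the second copy exists.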

    To help with the latter problem, I build clone detectors (see our CloneDR) that automatically find exact and near-miss duplicated code, using the structure of the programming language to guide the search. CloneDR works for a wide variety of programming languages (including OP's set).
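
    To give a rough idea of what "using the structure of the programming language" can mean, here is a toy sketch in Python built on the standard ast module. It is not how CloneDR works; it is a simplified illustration under my own assumptions. It compares whole functions only, and it replaces every identifier and literal with a placeholder, so it catches exact copies and copies with renamed variables, but not clones with inserted or deleted statements. The names Normalizer, fingerprint and find_clones are made up for this example.

```python
import ast
import copy
from collections import defaultdict


class Normalizer(ast.NodeTransformer):
    """Replace names and literal values with placeholders (toy normalization)."""

    def visit_FunctionDef(self, node):
        node.name = "_"
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        node.id = "_"
        return node

    def visit_arg(self, node):
        node.arg = "_"
        self.generic_visit(node)
        return node

    def visit_Constant(self, node):
        node.value = None
        return node


def fingerprint(func):
    """Return a string that is equal for structurally identical functions."""
    normalized = Normalizer().visit(copy.deepcopy(func))
    return ast.dump(normalized)  # positions are excluded by default


def find_clones(source):
    """Group functions in `source` whose normalized syntax trees match."""
    tree = ast.parse(source)
    funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    groups = defaultdict(list)
    for func in funcs:
        groups[fingerprint(func)].append((func.name, func.lineno))
    return [members for members in groups.values() if len(members) > 1]


SAMPLE = '''
def total_price(items):
    total = 0
    for item in items:
        total += item.price
    return total

def sum_price(orders):
    acc = 0
    for order in orders:
        acc += order.price
    return acc

def average(values):
    return sum(values) / len(values)
'''

if __name__ == "__main__":
    for group in find_clones(SAMPLE):
        print("clone group:", group)
```

    On the sample, total_price and sum_price end up in one group because they differ only in variable names, while average stands alone. Mapping every name to the same placeholder is deliberately crude: it can also match code that uses its variables differently, and a serious detector has to handle near-miss clones at the level of statements and expressions, not just whole functions.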

    In any software system of 100K SLOC or more, at least 10% of the code is cloned. (OK, OK, Sun's JDK is built by an exceptionally good team; they have only about 9.5%.) It tends to be worse in older conventional applications, I suspect because the programmers clone more code out of self-defense. (I have seen applications in which clones comprise 50% or more of the code; yes, those programs tend to be awful for many reasons, not just cloning.)

    You can see clone reports at the link for applications in several languages; look at the statistics, and see what the clones look like.