Detect Copy-Paste Bugs

Pattern Miner can help you identify copy-paste bugs in your code base. When developers need code that is mostly similar to existing code, with only minor modifications to fit the new context, they often copy and paste to save time. It is easy to introduce bugs during this process. For example, when you copy a code segment from one place to another, you often need to rename one variable (e.g., variable i) to another (e.g., variable j). If you rename most occurrences but miss one inside the pasted block, you introduce a bug.
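
As a minimal illustration of the renaming mistake described above, consider the following Python sketch. The function and variable names are hypothetical and chosen only for this example. The second loop was pasted from the first, and i was renamed to j in the loop header but not in the index expression.

```python
def totals_buggy(rows, cols):
    # Original segment: sum the entries of rows using index i.
    row_total = 0
    for i in range(len(rows)):
        row_total += rows[i]

    # Pasted segment: i was renamed to j in the loop header only.
    col_total = 0
    for j in range(len(cols)):
        col_total += cols[i]  # BUG: should be cols[j]; i still holds its last value
    return row_total, col_total


def totals_fixed(rows, cols):
    row_total = 0
    for i in range(len(rows)):
        row_total += rows[i]

    col_total = 0
    for j in range(len(cols)):
        col_total += cols[j]  # every occurrence renamed consistently
    return row_total, col_total
```

With non-empty inputs of equal length, the buggy version raises no error, because Python keeps i bound after the first loop. The mistake surfaces only as a wrong result, which is what makes this class of bug easy to overlook.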

Linux copy-paste bug

Here's an example from the Linux kernel: the loop in lines 111-118 was copied from lines 92-99. In the pasted segment (lines 111-118), the variable prom_phys_total is replaced with prom_prom_taken everywhere except in line 117 (shown in bold font). As a result, the pointer prom_prom_taken[iter].theres_more incorrectly points to an element of prom_phys_total instead of prom_prom_taken.

Copy-paste errors are very difficult to detect. Because they are semantic errors, they are hard to spot by visual inspection and are typically missed by conventional static or dynamic analysis tools. Detection is even more challenging because, in many cases, the pasted code is further modified by adding or deleting statements.

Pattern Miner can help you detect these hard-to-find bugs. Pattern Miner efficiently detects similar code patterns that appear multiple times. It then flags code segments with multiple copies that have inconsistent changes between the copies, which indicate potential bugs.
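
The idea of flagging inconsistent changes between copies can be sketched in a few lines of Python. This is not Pattern Miner's actual algorithm, which this page does not describe. It is a simplified illustration that assumes the two copies are already known and have the same token structure. The C fragments are a reconstruction based only on the description of the Linux example above: everything except prom_phys_total, prom_prom_taken, iter, and theres_more is a placeholder, and find_inconsistent_renames is a hypothetical helper.

```python
import re
from collections import defaultdict

# Matches C-style identifiers (keywords such as `for` are matched too,
# which is harmless here because they map to themselves).
IDENT = re.compile(r"[A-Za-z_]\w*")

# Placeholder reconstruction of the original loop (not the real kernel source).
ORIGINAL = """
for (iter = 0; iter < num_regs; iter++) {
    prom_phys_total[iter].start = src[iter].addr;
    prom_phys_total[iter].size = src[iter].size;
    prom_phys_total[iter].theres_more = &prom_phys_total[iter + 1];
}
"""

# Pasted copy: one occurrence of prom_phys_total was not renamed.
PASTED = """
for (iter = 0; iter < num_regs; iter++) {
    prom_prom_taken[iter].start = src[iter].addr;
    prom_prom_taken[iter].size = src[iter].size;
    prom_prom_taken[iter].theres_more = &prom_phys_total[iter + 1];
}
"""


def find_inconsistent_renames(original, pasted):
    """Pair identifiers by position and report those mapped inconsistently."""
    orig_ids = IDENT.findall(original)
    paste_ids = IDENT.findall(pasted)
    if len(orig_ids) != len(paste_ids):
        raise ValueError("this sketch only handles copies with equal token structure")

    # mapping[old][new] counts how often `old` became `new` in the copy.
    mapping = defaultdict(lambda: defaultdict(int))
    for old, new in zip(orig_ids, paste_ids):
        mapping[old][new] += 1

    # An identifier renamed in some positions but not in others is suspicious.
    return {old: dict(counts) for old, counts in mapping.items() if len(counts) > 1}


suspects = find_inconsistent_renames(ORIGINAL, PASTED)
```

Here suspects contains a single entry for prom_phys_total, showing that it was renamed to prom_prom_taken three times and left unchanged once. A real tool also has to find the copies in the first place and tolerate inserted or deleted statements, which this sketch deliberately leaves out.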

Pattern Miner Inconsistencies Listing
Step 1: View the inconsistency listing for your project. The results are ranked by an algorithm that takes several parameters into account: the total number of modifications between replicated segments, the type of inconsistency relative to the location where it was found, and the number of inconsistencies within the segments.
Pattern Miner Inconsistencies Split View
Step 2: View each inconsistency in a split-screen view that shows both locations where the copy-paste inconsistency was discovered, with the replicated code segments clearly highlighted.
Pattern Miner Inconsistencies Annotation
Step 3: Once you have determined whether an inconsistency is a real bug or a false positive, you can annotate it accordingly. Each copy-paste inconsistency can be annotated as Unverified, Real Bug, False Positive, or Ignore, and stored with notes. Each bug is associated with a specific bug ID, so it retains all of its annotations and notes the next time you use this feature on your code base.
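
One way to picture how annotations can survive across runs is a small store keyed by bug ID. The Python sketch below is an illustration under stated assumptions, not Pattern Miner's storage format: the AnnotationStore class, the JSON file, and the way IDs are supplied are all hypothetical. Only the four status values come from the text above.

```python
import json
from enum import Enum
from pathlib import Path


class Status(Enum):
    UNVERIFIED = "Unverified"
    REAL_BUG = "Real Bug"
    FALSE_POSITIVE = "False Positive"
    IGNORE = "Ignore"


class AnnotationStore:
    """Hypothetical store that persists annotations in a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        # Reload earlier annotations if the file already exists.
        self._data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def annotate(self, bug_id, status, notes=""):
        # Record the status and notes under the bug ID, then save to disk.
        self._data[bug_id] = {"status": status.value, "notes": notes}
        self.path.write_text(json.dumps(self._data, indent=2))

    def get(self, bug_id):
        # Bugs that were never annotated are reported as Unverified.
        entry = self._data.get(bug_id)
        if entry is None:
            return Status.UNVERIFIED, ""
        return Status(entry["status"]), entry["notes"]
```

Because the lookup uses only the bug ID, a later analysis run that reports the same ID picks up the earlier status and notes without re-verification.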