Why is up to 45% of software shipped with previously fixed defects?

Previously fixed defects account for up to 45% of all bugs in production software. Nearly 50% of software releases in the field contain “known” security vulnerabilities. These statistics, from a variety of research papers on defect rates in open source software, are certainly eye-opening, especially since we are led to believe that undiscovered “new” bugs are the most damaging kind.

Importantly, these types of bugs present serious challenges to engineering teams. Customer satisfaction can be compromised. Engineering resources are consumed by manual, time-consuming, and error-prone approaches that still fail to solve the problem. And in many cases, the situation prolongs product development and release cycles.

“We had bugs appearing in releases 7 years after they were first discovered, with Pattern Insight we eliminated all of them in 4 days.”

VP of Engineering,
Large Chipset/Software Vendor

Previously Fixed Bugs

Here are common scenarios for how previously fixed bugs migrate back into production code:

Missed Branch Patch

Bug fixes made in one branch must be integrated into all other branches of a code base. In most cases, this is done manually. Companies usually impose a policy that if a bug is fixed in BRANCH A, the fix needs to be ported to BRANCH X, Y, and Z. However, with the increasing number of branches and bug fixes, the chances of a developer failing to follow this process increase, especially with the constant pressure to deliver releases on time. The patching tends to be secondary in priority and as a result, some branches are not patched.

Operator Errors

A bug fix usually consists of several snippets in different files. It is very common, when being integrated into another branch, for conflicts to occur with other changes that have been made to these snippets. At that moment, the developer only sees the conflict at the snippet level. The purpose of conflict resolution is to make the build clean, but developers commonly break other developers’ bug fixes unintentionally. To make things worse, many companies use dedicated integration or build engineers to perform this task. They typically have less specific knowledge about changes involved in the conflicts.

In many cases, a bug fix consists of several change sets in the SCM system. It’s common for a developer to mistakenly revert one of several change sets, which subsequently breaks the fix.

Code that Contains Bugs Is Reused

Research shows software typically contains 15-50% duplicated code. Copy-paste is a common practice in programming to save time. In many cases, an entire component may be reused and modified, even within the same branch. When a component of code is reused, buggy code in it is propagated. There is no adequate automated means to keep track of these duplicated bugs.

These issues stem from the proliferation of components, software releases, and product variants in modern software development. Component and software reuse, while significantly reducing development cost, also creates a large number of variants. Parallel branching, employed to streamline software development and support release management, adds to the challenge: it is common for software companies to maintain many old releases in the field while working on a new release. Developers continuously make code changes, including bug fixes, in all of these versions, and keeping them in sync is an enormous challenge.

Learn why up to 45% of software bugs shipped are previously fixed defects and what you can do about it.

Why Current Tools Fail

In most cases, companies use engineers to manually verify that the final release contains all necessary changes, including bug fixes. This manual exercise is expensive, time-consuming, and error-prone. To improve efficiency, most companies build in-house tools. These generally fall into two categories: SCM-based or keyword/regex-based. Here is how they work and why they fail:

SCM Based Tools

SCM systems have built-in mechanisms to integrate one change set from one branch to the other. If an integration action is taken, it leaves an entry in the SCM log. Based on the history in the SCM log, companies build in-house tools to track where a bug fix goes. Although they help to some extent, they fall short in solving the problem and create a false sense of security that worsens the problem. Here’s why:

  • Developers are still required to do the right thing and key in the right information. It is common for developers not to use integration commands to port bug fixes. They simply fix the bug again in other branches. When this happens, there is no trace in the SCM log. Even when the correct integration process is used, information from developers is often missing, incomplete, or misleading. For example, many companies ask developers to write the bug ID or ticket ID into the comment section of a change set before checking in code. Not doing this correctly stymies cross-reference analyses.
  • Much information does not reside in the SCM system. For example, SCM systems cannot track duplicated content. Further, if a file is removed and added back in, SCM systems cannot track changes between these two files.
  • Legacy SCM systems are still used regularly in development shops. These do not support the notion of change sets, which makes tracking bug fix routes nearly impossible.
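To make the first limitation concrete, here is a minimal sketch of the kind of log-based tracker described above. The `BUG-<number>` message convention, the branch names, and the commit messages are all hypothetical; a real tool would read them from the SCM log rather than from a dictionary.

```python
import re

# Hypothetical convention: developers put an ID such as "BUG-102" in the commit message.
BUG_ID = re.compile(r"\bBUG-\d+\b")

def fixes_by_branch(logs):
    """Map each branch to the set of bug IDs mentioned in its commit messages."""
    return {
        branch: {bug for message in messages for bug in BUG_ID.findall(message)}
        for branch, messages in logs.items()
    }

def missing_fixes(logs):
    """For each branch, list bug IDs recorded in some branch but not in this one."""
    seen = fixes_by_branch(logs)
    all_bugs = set().union(*seen.values())
    return {branch: sorted(all_bugs - bugs) for branch, bugs in seen.items()}

# Illustrative data only.
logs = {
    "branch_a": ["BUG-101: fix null check in parser", "BUG-102: close leaked handle"],
    "branch_x": ["BUG-101: fix null check in parser (ported)"],
    # Both fixes were redone by hand here, but the IDs were left out of the messages.
    "branch_y": ["fix null check in parser", "close leaked handle"],
}

report = missing_fixes(logs)
for branch, bugs in report.items():
    print(branch, bugs)
```

The tracker correctly flags BUG-102 as unported in branch_x, but it also reports both fixes as missing from branch_y, where the code was actually repaired. The report reflects what developers typed into the log, not what is in the code.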

Keyword/Regex-Based Tools

Another set of tools typically uses keyword search or regular expressions. This route has many drawbacks that make it inadequate as well:

  • It’s often hard to figure out the right query to get the results you want. A bug tends to involve many snippets in different files. How would a simple keyword search or regex-based search capture such complexity? It quickly becomes “mission impossible”.
  • The results are usually too noisy, since the search lacks context. The search usually returns tons of results that are not relevant. The overhead of processing them and narrowing down the candidate code snippets is prohibitive.
  • You can’t be certain that the results are complete. For instance, it’s very easy to miss cases in which someone has copy-pasted buggy code into another area, and then modified it. This essentially voids the query, because the statement containing the keyword or matching regular expression has been deleted or modified.
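The noise and completeness problems can be seen in a small sketch. The query and the three code lines below are invented for illustration; they stand in for a known buggy statement, a copy-pasted variant with a renamed variable, and an unrelated line that happens to match.

```python
import re

# Hypothetical query written to find a known bug: an unchecked copy into "buf".
QUERY = re.compile(r"strcpy\(\s*buf\s*,")

original_bug = "strcpy(buf, user_input);"
# The same bug after copy-paste, with the variable renamed.
renamed_clone = "strcpy(name_buf, request->name);"
# Harmless code that still matches the query.
safe_line = 'strcpy(buf, "ok");'

def matches(line):
    """Return True if the regex query hits this line."""
    return QUERY.search(line) is not None

for line in (original_bug, renamed_clone, safe_line):
    print(matches(line), line)
```

The query finds the original bug, misses the renamed clone, and flags the harmless line. Loosening the pattern to catch the clone would only increase the number of irrelevant hits.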

The Solution

There is only one way to know if a previously fixed bug is hiding in a source branch: examine the code itself. But no matter how much you strengthen your manual process, it will still break down, because “to err is human.” So, the right solution needs to gain “direct insight” into the code itself.

Research shows that 67% of code is modified after reuse. Code duplicates clearly are not exact. This means that in order to find every instance of a bug across all versions and branches, not only must all identical matches be found, but all similar matches must be identified as well.
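As a rough illustration of what "similar match" means here, the sketch below replaces every identifier with a placeholder and then compares token sequences with Python's standard difflib module. This is a generic technique chosen for illustration, not Pattern Insight's matching technology; the tokenizer, the keyword list, the 0.8 threshold, and the sample snippets are all assumptions.

```python
import difflib
import re

# Crude tokenizer: identifiers, numbers, or any single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Small, assumed keyword list; keywords are kept as-is rather than treated as names.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "char"}

def normalize(snippet):
    """Tokenize a snippet and replace each identifier with "ID" so renames do not matter."""
    tokens = []
    for tok in TOKEN.findall(snippet):
        is_identifier = (tok[0].isalpha() or tok[0] == "_") and tok not in KEYWORDS
        tokens.append("ID" if is_identifier else tok)
    return tokens

def similarity(a, b):
    """Similarity of two snippets in [0, 1], computed on normalized token sequences."""
    matcher = difflib.SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()

def find_similar(known_bug, candidates, threshold=0.8):
    """Return the candidates whose similarity to the known buggy snippet meets the threshold."""
    return [c for c in candidates if similarity(known_bug, c) >= threshold]

known_bug = "len = strlen(src); memcpy(dst, src, len);"
renamed = "n = strlen(name); memcpy(out, name, n);"            # variables renamed
edited = "n = strlen(name); log(n); memcpy(out, name, n);"     # statement inserted
unrelated = "return a + b;"

hits = find_similar(known_bug, [renamed, edited, unrelated])
```

With these inputs the renamed copy scores as identical to the known bug, the copy with an inserted statement still clears the threshold, and the unrelated line does not. Collapsing every identifier into one placeholder is deliberately crude and would produce many false positives on real code, which is why the amount of context and the level of fuzziness matter.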

Pattern Insight’s Code Assurance solution is the only direct intelligence solution that detects every instance of a bug so it can never be released again. Based on Pattern Insight’s patent-pending fuzzy matching technology, it tolerates any variable name change and statement insertion and deletion.

Often a bug fix involves multiple snippet changes in a number of files. Pattern Insight Code Assurance parses and analyzes the code, understanding its semantics. This lets it intelligently determine the right amount of context and the right level of fuzziness to be considered for each snippet and consolidate the results from multiple snippet results to draw an accurate conclusion.

It has the critical characteristics for adoption as well:

  • Fast - Pattern Insight always returns results in seconds, even for code bases in the billions of lines.
  • Accurate - Pattern Insight has extremely low false positives.
  • Easy-to-Use - Pattern Insight fully integrates with all SCM systems and is easily used in any development, build and release process.

Use Cases

Here are the three most common use cases for Pattern Insight:

1. Ensure releases are clean of previously fixed bugs

For Build/Release Owners, Pattern Insight easily integrates into the release process, or continuous integration, to identify previously fixed bugs in nightly builds or releases going out the door.

For example, customers of Pattern Insight have built catalogs containing hundreds of security vulnerabilities and other important defects and run nightly reports indicating if any of these bugs have leaked into their daily builds. If a match is found, the build is blocked and the developer is automatically notified.
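The shape of such a nightly check can be sketched as follows. This is not Pattern Insight's interface: the catalog format, the file contents, and the function names are hypothetical, and the comparison is an exact line match after whitespace stripping to keep the example short, so it would miss the modified copies discussed earlier.

```python
def scan_build(catalog, build_files):
    """Return (bug_id, path, line_number) for each place a cataloged buggy snippet appears."""
    findings = []
    for bug_id, buggy_snippet in catalog.items():
        bug_lines = [line.strip() for line in buggy_snippet.splitlines() if line.strip()]
        for path, text in build_files.items():
            lines = [line.strip() for line in text.splitlines()]
            # Slide a window the size of the buggy snippet over the file.
            for start in range(len(lines) - len(bug_lines) + 1):
                if lines[start:start + len(bug_lines)] == bug_lines:
                    findings.append((bug_id, path, start + 1))
    return findings

def gate(findings):
    """Exit code for a build step: non-zero means the build should be blocked."""
    return 1 if findings else 0

# Illustrative catalog and build contents.
catalog = {"BUG-102": "len = strlen(src);\nmemcpy(dst, src, len);"}
build_files = {
    "net/copy.c": (
        "void copy(char *dst, char *src) {\n"
        "    int len;\n"
        "    len = strlen(src);\n"
        "    memcpy(dst, src, len);\n"
        "}\n"
    ),
    "util/math.c": "int add(int a, int b) {\n    return a + b;\n}\n",
}

findings = scan_build(catalog, build_files)
exit_code = gate(findings)
```

A build script could pass `exit_code` to `sys.exit` to block the build, and use each finding's path and line number to route a notification to the responsible developer.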

2. Eliminate all instances of bugs in development

Developers use Pattern Insight to ensure bugs have been fixed across all locations, branches, and components in the development process. Simply running a Pattern Insight report identifies every instance of a bug, allowing each to be fixed immediately. Pattern Insight can also support custom workflows by enabling alerts at code review or code check-in.

3. Ensure all changes are ported into the final release

Release engineers use Pattern Insight to automatically ensure that changes made in separate branches make it into the final release. The number of changes checked can be hundreds to thousands in one single run. What takes months to manually verify takes only minutes with Pattern Insight.
