Defect Management is an essential process for fostering continuous improvement
‘Bugs, errors, imperfections, flaws, failures, deficiencies, glitches, problems, faults’: these terms are common in software development and quality assurance, and their usage varies with context and industry standards. Very often they are all grouped under one general term, a ‘defect’. ‘Issue’ is a broad term that can include defects, errors, bugs, etc. ‘Error’ refers specifically to a human mistake in development that can lead to a defect. ‘Bug’ is a type of defect that causes software to function incorrectly or unexpectedly. The list of definitions could go on. What matters most, however, is how the team manages all of them.
Raising defects is a great way to gain insight into metrics such as:
- Product quality — Are the right requirements provided? Is the implementation of requirements correct?
- Team productivity — Are defects resolved within the agreed timeline? How often does ‘Fix on fix’ happen?
- Composition of the team’s required skill set — If too many defects are raised with ‘incorrect testing’ as the root cause analysis (RCA), it might indicate the team is not well-trained in product knowledge. If too many coding issues are raised, it might be an indicator of a team skills issue.
The list does not end with the above examples. They simply show how insightful defect metrics can be for continuous improvement in different areas.
Taking action to ‘prevent’ defects is as important as ‘detecting’ them. The defect backlog should always be under continuous review; otherwise, technical debt will increase and quality will decrease. To avoid this situation in your project, proactively set up a defect triage process with clear steps and designated responsible parties. During the process it is very important to document the root cause of each defect. This analysis helps to define action items and ensure the same issue does not occur in the future.
Four integral parts of an effective Defect Management process:
- Define a Defect life cycle that works for your organization and the team that will follow it.
- Perform root cause analysis systematically to make it a routine process.
- Set up expectations regarding the Defect Service Level Agreement to ensure everyone is on the same page about key definitions.
- Continuously monitor Defect metrics to identify patterns and conduct more targeted and effective quality control measures.
Now, let’s take a closer look at these aspects.
Defect life cycle
A predefined flow for filtering out non-issues and prioritizing defects efficiently helps to set expectations about roles and responsibilities for every team member involved in the process. Of course, the example cycle below can be adapted to the project’s needs (e.g. include business representatives to decide on priority and business impact, or include the DevOps team to align on deployment windows and infrastructure changes when applicable). However, the basic rules remain the same.
Step #1: Log the defect, analyze it, and decide whether it’s valid
- Is it reproducible?
- Is it missing relevant information (screenshots, logs, steps to reproduce, etc.)?
- Is it valid / is it a defect at all?
- Is it a duplicate?
- Is it in scope of the current iteration?
Step #1 will cover statuses: ‘New’, ‘Assigned’, ‘Not a Defect’, ‘Not Reproducible’, ‘Need Information’, ‘Duplicate’, ‘Deferred’.
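The Step #1 questions map naturally onto the statuses listed above. Here is a minimal Python sketch of that mapping. The `TriageAnswers` class, the `triage_status` function, and the order in which the checks run are my assumptions for illustration, not a prescribed process; only the status names come from the step itself.

```python
from dataclasses import dataclass


@dataclass
class TriageAnswers:
    """Answers to the Step #1 questions for one logged defect (hypothetical structure)."""
    reproducible: bool
    has_required_info: bool   # screenshots, logs, steps to reproduce
    is_valid: bool            # is it a defect at all?
    is_duplicate: bool
    in_current_scope: bool


def triage_status(a: TriageAnswers) -> str:
    """Return the Step #1 status for a newly logged defect.

    The order of the checks is an assumption; adapt it to your own process.
    """
    if not a.has_required_info:
        return "Need Information"
    if not a.reproducible:
        return "Not Reproducible"
    if not a.is_valid:
        return "Not a Defect"
    if a.is_duplicate:
        return "Duplicate"
    if not a.in_current_scope:
        return "Deferred"
    return "Assigned"


# A valid, reproducible, well-documented defect outside the current iteration
report = TriageAnswers(
    reproducible=True,
    has_required_info=True,
    is_valid=True,
    is_duplicate=False,
    in_current_scope=False,
)
print(triage_status(report))  # Deferred
```

Checking for missing information first is a deliberate choice in this sketch: without steps to reproduce, the remaining questions are hard to answer reliably.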
Step #2: Prioritize based on the current iteration scope, assign priority, severity, and set a deadline by when it should be resolved. Make sure the developer has accepted the defect and understood the priority.
Step #2 will cover status ‘In Progress’.
After completing Steps #1 and #2, the test manager or test lead can start compiling the defect report and the first quality metrics.
Here are just a few examples of how early defect metrics can be helpful:
- Compare the number of valid defects versus invalid ones. If the number of invalid defects is too high, it might indicate that the team has a low understanding of the product or the scope that was delivered.
- Monitor the categories of root causes. If too many defects are raised due to incorrect requirements, the team needs to pause development and focus on refining user stories, to avoid spending time and resources on deliverables that add no value.
- If too many defects are raised due to environmental issues, it could be an early sign of immature infrastructure architecture, and the delivery manager needs to be informed to take proactive action and fix the gap.
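These early metrics are simple to compute from an export of the defect tracker. The Python sketch below is illustrative only: the sample records, the field names `status` and `root_cause`, and the choice of which statuses count as "invalid" are all assumptions on my part.

```python
from collections import Counter

# Hypothetical defect records; IDs, field names and values are illustrative only.
defects = [
    {"id": "D-1", "status": "Assigned", "root_cause": "incorrect requirements"},
    {"id": "D-2", "status": "In Progress", "root_cause": "incorrect requirements"},
    {"id": "D-3", "status": "Not a Defect", "root_cause": "incorrect testing"},
    {"id": "D-4", "status": "In Progress", "root_cause": "environment"},
    {"id": "D-5", "status": "Duplicate", "root_cause": None},
]

# Assumption: these Step #1 statuses are the ones treated as "invalid".
INVALID_STATUSES = {"Not a Defect", "Not Reproducible", "Duplicate"}


def invalid_ratio(records):
    """Share of logged defects that turned out to be invalid (0.0 to 1.0)."""
    if not records:
        return 0.0
    invalid = sum(1 for r in records if r["status"] in INVALID_STATUSES)
    return invalid / len(records)


def root_cause_counts(records):
    """Count defects per root cause category, skipping records without one."""
    return Counter(r["root_cause"] for r in records if r.get("root_cause"))


print(f"Invalid share: {invalid_ratio(defects):.0%}")  # Invalid share: 40%
print(root_cause_counts(defects).most_common(1))       # [('incorrect requirements', 2)]
```

What counts as "too high" for either number is a team decision; the sketch only produces the figures to discuss in triage or a retrospective.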
Step #3: Code the fix, add / update unit tests, add / update functional or non-functional tests, select the regression suite, and deploy the fix to the target environment. After successful unit testing and deployment, the defect will be given the status ‘Fixed.’ At this step, it is very valuable to document the nature of the fix, the build version on which it was fixed, and the applicable release notes (for example, if it’s a production issue fix).
Step #4: This is where the Quality Engineer will go through rounds of testing for the provided fix:
- Perform a smoke test by executing the ‘Steps to reproduce’.
- Execute all related failed functional or non-functional tests (depending on the nature of the defect).
- Execute selective regression to ensure no breaking changes to other modules are introduced. Refer to ‘Regression in one go’ for more insights about regression strategy.
- Link all test results, provide feedback, and determine whether it should be reopened or resolved.
Step #4 will cover statuses such as ‘QA Testing’, ‘Reopened’, or ‘Closed’.
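One way to keep everyone on the same life cycle is to write the allowed status changes down explicitly. The Python sketch below uses the status names from Steps #1 to #4, but the transition map itself is my assumption (for example, that ‘Reopened’ goes back to ‘In Progress’ and that ‘Need Information’ returns to ‘New’). Adjust it to whatever your team agrees on.

```python
# Assumed transition map: the status names come from the steps above,
# the arrows between them are illustrative and should be agreed with the team.
ALLOWED_TRANSITIONS = {
    "New": {"Assigned", "Not a Defect", "Not Reproducible",
            "Need Information", "Duplicate", "Deferred"},
    "Need Information": {"New"},      # back to triage once details are added
    "Deferred": {"Assigned"},         # picked up in a later iteration
    "Assigned": {"In Progress"},
    "In Progress": {"Fixed"},
    "Fixed": {"QA Testing"},
    "QA Testing": {"Closed", "Reopened"},
    "Reopened": {"In Progress"},
}


def move(current: str, target: str) -> str:
    """Return the new status, or raise ValueError if the change is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {target}")
    return target


# Walk one defect through the happy path
status = "New"
for step in ["Assigned", "In Progress", "Fixed", "QA Testing", "Closed"]:
    status = move(status, step)
print(status)  # Closed
```

Statuses that do not appear as keys (such as ‘Closed’ or ‘Duplicate’) are treated as final in this sketch, so any attempt to move out of them raises an error.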
As a proactive leader, you should not underestimate the value of an efficient process right from the beginning. Facilitating a proactive and strategic way of addressing defects is key to maintaining high standards of software quality, managing resources effectively, meeting project timelines, and ensuring end-user satisfaction.
Experience-based tips:
- Use a defect triage call to review priority, severity, and due dates with the team (Dev, QA, Product owner), and agree on SLAs.
- After the fix is applied, the developer has to describe what was fixed and what changes are introduced (code, configuration, etc.). This information is essential to perform fix impact analysis and ensure regression tests are selected properly.
- Document the build version in which the fix was applied so that rolling back to the latest working version can be done quickly if the fix fails.
- Defect retesting should include the re-execution of all failed tests plus a sanity regression.
- Root cause analysis and capturing the reasons for defects help to prevent similar issues in the future.
- Occasionally conduct a team retrospective on the defect report to ensure lessons learned and action items are defined.
Defect Root Cause Analysis (RCA)
Defect root cause analysis is the process of investigating the origins of defects and understanding why a defect occurred. The ultimate goal is not only to resolve the defect but also to address its root causes. In this way, action items can be identified and implemented to prevent the same situation in future iterations.
Defect service level agreement (SLA)
Conducting a thorough root cause analysis is integral to predicting defects and enhancing the efficiency of the development process. It is a proactive approach to quality assurance that focuses on continuous improvement. It is also very important to address defects within expected timelines. Keeping a defect open may affect business outcomes and customer satisfaction, or lead to a loss of team productivity. Therefore, one important step is to agree on Service Level Agreements (SLAs) and ensure that defect resolution times fall within them. Below is an example of how severity and priority combine to determine the time allocated to address the defect.
High-priority blockers should be resolved within hours. Blockers of any other priority, and high-priority critical issues, should be resolved within a day. Resolution of major defects can be extended to a few days. For minor and trivial defects, the timeline can be agreed based on team capacity; such defects can be addressed in upcoming releases if there is no business impact.
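The severity and priority rules above can be captured in a small lookup so that the agreed SLA is applied consistently. This Python sketch is only an illustration: the function name `resolution_sla` and the concrete numbers (4 hours for "within hours", 3 days for "a few days") are my assumptions, and so is the handling of critical defects that are not high priority, which the rules above do not cover.

```python
from datetime import timedelta
from typing import Optional


def resolution_sla(severity: str, priority: str) -> Optional[timedelta]:
    """Return the agreed time to resolve a defect, or None if it is negotiable.

    The concrete durations are placeholders; agree on real values with the team.
    """
    severity = severity.lower()
    priority = priority.lower()

    if severity == "blocker":
        # "Within hours" for high priority (4h assumed), "within a day" otherwise
        return timedelta(hours=4) if priority == "high" else timedelta(days=1)
    if severity == "critical":
        # High-priority critical: within a day. Other priorities are not covered
        # by the rules above; a few days (3 assumed) is used here as a placeholder.
        return timedelta(days=1) if priority == "high" else timedelta(days=3)
    if severity == "major":
        return timedelta(days=3)  # "a few days", 3 assumed
    if severity in ("minor", "trivial"):
        return None  # timeline depends on team capacity
    raise ValueError(f"Unknown severity: {severity}")


print(resolution_sla("Blocker", "High"))    # 4:00:00
print(resolution_sla("Critical", "High"))   # 1 day, 0:00:00
print(resolution_sla("Minor", "Low"))       # None
```

Returning `None` for minor and trivial defects keeps the "discuss based on capacity" rule visible instead of hiding it behind an arbitrary number.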
“I don’t agree it’s a Blocker.” Does this sound familiar? I am sure this type of back-and-forth happens between developers and test engineers. An agreement made at the beginning of the project ensures that no time is spent arguing over what counts as a ‘Blocker’, ‘Critical’, ‘Major’, ‘Minor’, or ‘Trivial’. Transparency and clear communication are key to successful defect management.
Defect metrics
Below are important metrics to keep track of in the defect report:
In summary, defect management is not just about finding and fixing bugs; it is a strategic process that contributes to the overall success of a project by improving quality, satisfying customers, controlling costs, and fostering continuous improvement. The sooner a potential defect is identified and fixed, the less it will cost in resources, time, money, and company image. It is an integral part of ensuring that software products meet high standards of reliability and performance.