Black Box · Combinatorial · CTFL 4.0

Pairwise Testing

A combinatorial technique that ensures every pair of input values is tested at least once — typically cutting a combinatorial explosion of test cases by 80–95% while still catching the majority of defects.

Junior Senior ISTQB CTFL 4.0 — 4.2

1 The Hook

A Wellington logistics firm rebuilt its parcel-booking screen. It had four dropdowns: carrier (NZ Post, CourierPost, Aramex, NZ Couriers), service level (standard, overnight, rural, Saturday), package size (satchel, small box, large box, pallet), and payment (account, credit card, on-account credit). The team tested the combinations they used most often — CourierPost overnight small box on account, NZ Post standard satchel credit card — and everything looked clean. Shipped on a Friday.

On Monday the support queue lit up. Rural overnight bookings were silently dropping the rural surcharge, but only when paid by credit card. Nobody had tried that exact combination, because each dropdown worked perfectly on its own. The carrier was fine. The service level was fine. The payment method was fine. The bug lived in the pair — rural service crossed with credit-card payment — and the team would have needed all 256 combinations to be sure of catching it by brute force, which no one has time for in a sprint.

This is the trap pairwise testing is built for. Most interaction bugs are caused by two settings clashing, not by some rare four-way alignment. You do not need every combination. You need every pair, and that is a far smaller, achievable set.

2 The Rule

Most defects are triggered by one or two interacting inputs — so instead of testing every combination, design the smallest set of cases that pairs every value of every parameter with every value of every other parameter at least once.

3 The Analogy

Analogy

Seating a wedding so every family group meets every other.

A couple wants every side of the family to have met every other side by the end of a Marlborough wedding. They could throw one giant dinner where all 256 guests share one table — complete coverage, totally impractical. Or they could rotate people across a handful of tables so that every pair of families ends up sitting together at some point during the night. The clashes that matter — Aunty A who cannot stand Uncle B — show up between two people, not among five. Cover every pair and you will surface every feud that is going to happen.

Pairwise testing is that seating plan. You do not need every guest in the room at once; you need every pair to have shared a table. A small, well-designed rotation gets you there.

What it is

Pairwise testing (also called all-pairs testing) is a combinatorial test design technique. Rather than testing every possible combination of input values, you select a minimal set of test cases that covers every pair of values at least once. The insight is that most defects are caused by interactions between one or two variables — not by complex interactions among many variables simultaneously.

This is distinct from equivalence partitioning or boundary value analysis, which test individual inputs in isolation. Pairwise is for when you have multiple inputs that could interact with each other, and you need to balance coverage against the cost of running many tests.

ISTQB definition: Pairwise testing is a black-box test design technique in which test cases are designed to cover all pairs of values of two parameters at least once. It is a specific form of combinatorial testing (CTFL 4.0 section 4.2).

The problem it solves

Imagine testing a checkout form with four configuration parameters, each with four possible values:

  • Payment method: Credit card, POLi, Afterpay, Gift card
  • Delivery type: Standard, Express, Same-day, Click & Collect
  • Customer type: Guest, Registered, VIP, Business
  • Discount code: None, 10% off, Free shipping, Bundle deal

Full combinatorial coverage: 4 × 4 × 4 × 4 = 256 test cases. Running 256 test cases for a single feature is rarely realistic in a sprint.

Pairwise testing covers all two-way interactions (every pair of values tested together at least once) in approximately 20–25 test cases. That is the same defect-catching power for combinations, at less than 10% of the testing cost.

When to use it

  • Testing systems with many configuration options where combinations matter (software settings, checkout options, filter combinations)
  • Form fields with multiple valid values that interact (discount code + payment method + delivery type)
  • Feature flags — when multiple flags can be on or off, pairwise covers the interactions without testing every permutation
  • Compatibility matrices — OS × browser × screen resolution
  • Regression suites where scope needs to be cut without eliminating coverage of interactions

NZ examples where pairwise applies

  • NZ e-commerce checkout: payment method (credit card / POLi / Afterpay) × delivery type (NZ Post standard / CourierPost / Click & Collect) × discount code (none / percentage / free shipping)
  • NZ banking: account type (everyday / savings / term deposit) × transaction type (transfer / bill payment / international) × amount range (small / medium / large)
  • Software configuration screen: browser (Chrome / Firefox / Safari / Edge) × OS (Windows / macOS / iOS / Android) × locale (en-NZ / mi-NZ)

Worked NZ example

A NZ online booking system has three parameters:

  • Payment method: Credit card, POLi, Afterpay
  • Delivery type: Standard (3–5 days), Express (next day), Click & Collect
  • Location: Auckland, Wellington, Christchurch

Full combinations: 3 × 3 × 3 = 27 test cases

Pairwise covers all pairs in 9 test cases:

NZ booking system — pairwise test set (9 cases cover all pairs)
Test # Payment method Delivery type Location Pairs covered
1Credit cardStandardAucklandCC+Std, CC+AKL, Std+AKL
2Credit cardExpressWellingtonCC+Exp, CC+WLG, Exp+WLG
3Credit cardClick & CollectChristchurchCC+C&C, CC+CHC, C&C+CHC
4POLiStandardWellingtonPOLi+Std, POLi+WLG, Std+WLG
5POLiExpressChristchurchPOLi+Exp, POLi+CHC, Exp+CHC
6POLiClick & CollectAucklandPOLi+C&C, POLi+AKL, C&C+AKL
7AfterpayStandardChristchurchAP+Std, AP+CHC, Std+CHC
8AfterpayExpressAucklandAP+Exp, AP+AKL, Exp+AKL
9AfterpayClick & CollectWellingtonAP+C&C, AP+WLG, C&C+WLG

Every pair is covered at least once across these 9 tests. If Afterpay + Click & Collect has a bug (a known issue with some NZ payment providers and in-store pickup), test #9 will catch it. If Express delivery to Christchurch has a routing issue, test #5 will catch it.

Important: do not build pairwise tables by hand for more than 3–4 parameters. The algorithm is complex and humans make pairing errors. Use a tool (PICT, AllPairs, Hexawise) to generate the table from your parameter list.

Why it works

Research by NIST (National Institute of Standards and Technology) analysed hundreds of software defects and found that:

  • ~70% of defects are caused by a single parameter (covered by equivalence partitioning)
  • ~90% of defects are caused by interactions between at most 2 parameters (covered by pairwise testing)
  • ~98% of defects are caused by interactions between at most 3 parameters (covered by 3-way combinatorial testing)

This means pairwise testing — covering all 2-way interactions — catches approximately 90% of the defects that full combinatorial testing would catch, at a fraction of the test count. The remaining ~10% require 3-way or higher-order coverage and are typically reserved for safety-critical systems.

Tools

  • PICT (Pairwise Independent Combinatorial Testing, Microsoft, free CLI tool) — the most widely used pairwise generator. You define your parameters and values in a text file; PICT outputs the minimum test set. Runs on Windows, macOS, and Linux.
  • AllPairs (free, Python-based) — similar to PICT; useful in Python-heavy automation environments
  • Hexawise (commercial, web-based) — more user-friendly interface; produces pairwise and higher-order combinatorial sets; good for larger parameter sets

Using PICT: a quick example

Create a .txt file with your parameters and values:

PICT input file — NZ booking system
File content
Payment: Credit card, POLi, Afterpay
Delivery: Standard, Express, Click & Collect
Location: Auckland, Wellington, Christchurch

Run pict booking.txt and PICT outputs the 9-row test set above. For 10 parameters with 5 values each (full combinations: 5^10 = ~10 million), PICT generates approximately 35 test cases.

When not to use it

  • Safety-critical systems — medical devices, aviation, nuclear control systems may require full combinatorial testing or even higher-order coverage. Pairwise is not sufficient when all combinations must be verified for safety.
  • Known 3-way (or higher) interactions — if you already know that three specific parameters interact in a way that causes defects, use 3-way combinatorial testing (or targeted decision table testing) rather than pairwise.
  • Very few parameters or values — if you have only 2 parameters with 3 values each, full combinatorial is only 9 tests. There’s no need for pairwise reduction.

Bugs pairwise testing typically catches

  • A discount code that fails only when combined with Afterpay as the payment method (not with credit card or POLi)
  • Click & Collect orders that fail for Wellington customers but not Auckland or Christchurch
  • A form that only breaks when a long company name is entered and the city is Christchurch (due to a character limit in the Christchurch address database lookup)
  • Express delivery unavailable for certain NZ regions, but only when a specific discount code is applied (the two interact in the pricing engine)
  • A configuration screen where two specific feature flags enabled simultaneously cause a conflict, even though each flag works correctly in isolation

ISTQB mapping

ISTQB CTFL v4.0 reference
Syllabus refTopicLevel
CTFL 4.0 — 4.2Combinatorial testing techniques — pairwise as a specific techniqueFoundation
FL-4.2 K3Apply pairwise testing to derive test cases for a given component or systemFoundation LO
CTAL-TA 3.2Advanced combinatorial techniques, n-way testing, constraint modellingAdvanced / Senior

The ISTQB Foundation exam expects you to apply pairwise testing (K3). You must be able to identify parameters and values from a specification, understand what a pairwise table covers, and know when the technique is appropriate.

Tips

Don’t build pairwise tables by hand — use PICT or AllPairs. Feed in your parameters and values, and the tool generates the minimum set in seconds. Pairwise is especially valuable for regression suites where you need to cut scope without cutting coverage of interactions. If your sprint regression suite has 200 tests covering a configuration-heavy feature, a well-designed pairwise set might achieve the same interaction coverage in 30.

  • Identify parameters carefully — the quality of your pairwise set depends entirely on correctly identifying the parameters that can interact. Think about which inputs go through the same code path together.
  • Add a few targeted tests beyond pairwise — if you have business knowledge that suggests a particular 3-way interaction is risky, add a specific test for it. Pairwise is a baseline, not a ceiling.
  • Use pairwise for compatibility matrices — if you need to test across browsers, operating systems, and screen sizes, pairwise dramatically reduces the matrix size without abandoning cross-environment coverage.
  • Combine with risk-based prioritisation — not all pairs are equally risky. Use business knowledge to flag which parameter combinations are highest-risk and verify those are included in your generated set.

Practice this technique: Try Junior Practice 06 — Dropdown & select bugs.

4 Now You Try

Three graded exercises — spot, fix, then build. Write your answer, run it for AI feedback, then compare to the model answer.

🔍 Exercise 1 of 3 — Spot: identify the parameters and count the cases

A Foodstuffs click-and-collect checkout has three settings: store brand (New World, PAK'nSAVE, Four Square), collection slot (next hour, same day, next day), and payment (credit card, account, gift card). State the number of cases for full combinatorial coverage, then explain in your own words how many a pairwise set needs and why it is so much smaller.

Show model answer
Parameters (3, each with 3 values):
- Store brand: New World, PAK'nSAVE, Four Square
- Collection slot: next hour, same day, next day
- Payment: credit card, account, gift card

Full combinatorial: 3 x 3 x 3 = 27 test cases.

Pairwise: about 9 cases. With three 3-value parameters, every pair of values can be covered in 9 well-chosen rows (the same shape as a 3x3 Latin-square layout).

Why it is so much smaller: pairwise only guarantees that every PAIR of values appears together once, not every full triple. Each test case covers three pairs at once (brand+slot, brand+payment, slot+payment), so a handful of cases mops up all the pairs. Full coverage forces every three-way triple to appear, which multiplies out fast.
🔧 Exercise 2 of 3 — Fix: repair a set with a missing pair

A tester drafted the pairwise set below for an IRD myIR login flow: device (desktop, mobile), 2FA method (SMS code, authenticator app), browser (Chrome, Safari). It claims to be a complete pairwise set, but at least one pair is never covered. Find the missing pair(s) and add the row(s) needed to complete coverage.

Drafted set:
1. desktop — SMS code — Chrome
2. mobile — authenticator app — Safari
3. desktop — authenticator app — Chrome

List the missing pair(s) and the row(s) to add:

Show model answer
Missing pairs in the drafted set:
- mobile + SMS code (never together)
- mobile + Chrome (never together)
- SMS code + Safari (never together)

The draft only ever puts mobile with the authenticator app and Safari, so several mobile and SMS pairs go untested.

A clean completing addition:
4. mobile + SMS code + Chrome  (covers mobile+SMS, mobile+Chrome, SMS+Chrome already seen but fine)
5. desktop + SMS code + Safari (covers SMS+Safari, desktop+Safari)

After adding rows 4 and 5, every pair across the three parameters appears at least once. The lesson: a set is not "pairwise" just because it has a few rows — you must audit it against the full list of required pairs.
🏗️ Exercise 3 of 3 — Build: a pairwise set from scratch

An Auckland Transport AT HOP top-up kiosk has three parameters: card type (adult, child, tertiary), top-up amount ($5, $20, $50), and payment (EFTPOS, credit card, cash). Design a pairwise test set that covers every pair at least once, and note which tool you would use rather than hand-building it for a larger set.

Show model answer
Three 3-value parameters cover all pairs in 9 cases. One valid set:

1. adult    — $5   — EFTPOS
2. adult    — $20  — credit card
3. adult    — $50  — cash
4. child    — $5   — credit card
5. child    — $20  — cash
6. child    — $50  — EFTPOS
7. tertiary — $5   — cash
8. tertiary — $20  — EFTPOS
9. tertiary — $50  — credit card

Every card-type+amount pair, card-type+payment pair, and amount+payment pair appears at least once. (A solver can sometimes squeeze 3x3x3 below 9, but 9 is the safe, easy-to-build answer.)

Tool for larger sets: PICT (Microsoft's free CLI) or AllPairs. Never hand-build beyond 3-4 parameters — humans miss pairs, exactly the mistake in Exercise 2.

Self-Check

Click each question to reveal the answer.

Q1: What exactly does a pairwise test set guarantee — and what does it not?

It guarantees that every pair of values from any two parameters appears together in at least one test case. It does not guarantee that every full combination (every three-way or higher triple) is tested. It targets the interaction bugs that involve two settings, which research shows are the large majority.

Q2: For three parameters with 4 values each, how many cases does full combinatorial coverage need, and roughly how many does pairwise need?

Full combinatorial is 4 x 4 x 4 = 64 cases. Pairwise covers all pairs in roughly 16–20 cases — well under a third — while still catching the two-way interaction defects.

Q3: Why should you not build pairwise tables by hand for more than three or four parameters?

The pairing algorithm is fiddly and people quietly miss pairs — producing a set that looks complete but leaves gaps (exactly the failure in the fix exercise). Use a generator such as PICT or AllPairs: feed in the parameters and values, and it outputs a minimal, verified set in seconds.

Q4: Roughly what share of defects do single-parameter, pairwise, and three-way coverage each catch, per NIST research?

About 70% of defects come from a single parameter, around 90% are covered once you test all two-way (pairwise) interactions, and about 98% by three-way coverage. Pairwise gets you to roughly 90% at a fraction of the cost of full combinatorial testing.

Q5: When is pairwise not enough, and what should you do instead?

For safety-critical systems (medical, aviation) or where you already know a specific three-way interaction is risky, pairwise's two-way guarantee is insufficient — step up to three-way (or higher) combinatorial coverage, or add targeted tests for the known risky combination. Pairwise is a strong baseline, not a ceiling.

Try It — Spot the missing pair

A NZ ferry booking system has three parameters. Someone has drafted a pairwise test set of 7 cases, but two pairs are not covered. Identify which pairs are missing.

Parameters:
  • Payment: Credit card (CC), POLi (PO), Afterpay (AP)
  • Passenger type: Adult (AD), Child (CH), Senior (SR)
  • Route: Interislander (IS), Bluebridge (BB)
Test #PaymentPassengerRoute
1CCADIS
2CCCHBB
3POADBB
4POCHIS
5APADBB
6APCHIS
7CCSRIS

Which two pairs are not covered by this test set?