Locator Strategies

The best test in the world is useless if it breaks every time the CSS changes. Learn how to target elements that survive refactoring.

Junior Automation ISTQB CTAL-TAE v2.0 — Chapter 3 ~10 min read + exercise

1 The Hook — Why This Matters

Research on Trade Me's automation suite (Patel & Ali, 2021) identified a painful pattern: XPath was the number-one cause of flaky tests on their dynamic site. When the marketing team added a single wrapper div to the homepage, forty-seven tests broke overnight.

The root cause? Absolute XPath expressions like /html/body/div[2]/div/form/input. Any insertion, deletion, or reordering of DOM nodes shifted the positions those expressions relied on. The team migrated to ID and data-testid locators. Flakiness dropped by sixty percent in the next quarter.
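The failure mode described above can be reproduced without a browser. The following is a minimal sketch using Python's standard-library xml.etree.ElementTree rather than Playwright; the markup, the "promo" wrapper, and the search-input test ID are invented for illustration and are not taken from the Trade Me study.

```python
import xml.etree.ElementTree as ET

# Invented markup: the search input sits inside the second <div> of <body>.
BEFORE = """
<html><body>
  <div>header</div>
  <div><div><form><input data-testid="search-input"/></form></div></div>
</body></html>
"""

# The same page after one wrapper <div> is added around the content.
AFTER = """
<html><body>
  <div class="promo">
    <div>header</div>
    <div><div><form><input data-testid="search-input"/></form></div></div>
  </div>
</body></html>
"""

# ElementTree paths are relative to the root element (<html>), so this is
# the equivalent of the absolute path /html/body/div[2]/div/form/input.
POSITIONAL = "./body/div[2]/div/form/input"
BY_TEST_ID = ".//input[@data-testid='search-input']"


def resolves(markup: str, path: str) -> bool:
    """Return True if the path finds an element in the markup."""
    root = ET.fromstring(markup.strip())
    return root.find(path) is not None


results = {
    "positional, before": resolves(BEFORE, POSITIONAL),
    "positional, after": resolves(AFTER, POSITIONAL),
    "test id, before": resolves(BEFORE, BY_TEST_ID),
    "test id, after": resolves(AFTER, BY_TEST_ID),
}
for case, ok in results.items():
    print(f"{case}: {'found' if ok else 'NOT FOUND'}")
```

Only the positional path stops resolving after the wrapper is added; the attribute-based lookup does not care how deep the input sits.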

Your locator strategy determines whether your suite is an asset or a maintenance tax. Choose wrong, and you will spend more time fixing tests than finding bugs.

2 The Rule — The One-Sentence Version

Prefer locators that describe what the element is or does, not where it sits in the DOM, because layout changes far more often than purpose.

A button that submits a payment will always be a "submit payment" button, even if the design system changes, the CSS classes are renamed, or the component is moved to a modal. A locator tied to structure dies on the first refactor.

3 The Analogy — Think Of It Like...

Giving a courier delivery directions.

You can say "third house from the corner, past the oak tree, left of the red mailbox." That's an XPath. When a new house is built, the oak tree is cut down, or the mailbox is repainted blue, the courier is lost. Or you can say "the dairy with the blue door." That's a role + accessible name. The dairy is still the dairy no matter how the street changes.

4 Watch Me Do It — Step by Step

Playwright's documentation recommends user-facing locators first. Memorise the order below and default to the top.

  1. Role + accessible name. Survives layout changes and enforces accessibility.
    page.get_by_role("button", name="Submit payment")
  2. Label. Tied to <label for="...">. Semantic and stable.
    page.get_by_label("Email address")
  3. Placeholder or visible text. Human-readable, but can collide if duplicated.
    page.get_by_placeholder("Search products...")
    page.get_by_text("Order confirmed")
  4. Test ID. Stable and decoupled from CSS, but requires front-end cooperation.
    <button data-testid="checkout-submit">Pay Now</button>
    
    # Test code
    page.get_by_test_id("checkout-submit")
  5. CSS selector. Fast and native, but brittle when classes change.
    page.locator("button.btn-checkout.primary")
  6. XPath. Last resort. Slow, hard to read, and fragile.
    page.locator("//button[contains(text(), 'Submit')]")
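
The priority order above can be written down as a small decision helper. This is a sketch only: ElementFacts and suggest_locator are hypothetical names invented here, not part of Playwright, and the helper merely returns the text of a locator call rather than driving a browser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ElementFacts:
    """What we know about a target element (hypothetical helper type)."""
    role: Optional[str] = None
    name: Optional[str] = None          # accessible name
    label: Optional[str] = None
    placeholder: Optional[str] = None
    text: Optional[str] = None
    test_id: Optional[str] = None
    css: Optional[str] = None
    xpath: Optional[str] = None


def suggest_locator(el: ElementFacts) -> str:
    """Return a Playwright locator expression, highest priority first."""
    if el.role and el.name:
        return f"page.get_by_role({el.role!r}, name={el.name!r})"
    if el.label:
        return f"page.get_by_label({el.label!r})"
    if el.placeholder:
        return f"page.get_by_placeholder({el.placeholder!r})"
    if el.text:
        return f"page.get_by_text({el.text!r})"
    if el.test_id:
        return f"page.get_by_test_id({el.test_id!r})"
    if el.css:
        return f"page.locator({el.css!r})"
    if el.xpath:
        selector = "xpath=" + el.xpath
        return f"page.locator({selector!r})"
    raise ValueError("no usable locator information")


pay = ElementFacts(role="button", name="Submit payment", test_id="checkout-submit")
legacy = ElementFacts(test_id="checkout-submit", css=".btn-checkout.primary")
print(suggest_locator(pay))     # role + name wins over the test ID
print(suggest_locator(legacy))  # test ID wins over the CSS selector
```

A helper like this is mostly useful as a code-review aid: it makes the team's agreed order explicit instead of leaving it to memory.
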

Anti-patterns: what breaks and why

Anti-pattern                             Why it breaks
div > div > div:nth-child(2) > button    Any wrapper insertion kills the depth
.css-1a2b3c4 (CSS-in-JS hash)            Rebuilds change the hash unpredictably
//*[contains(@class, "btn")]             Slow; may match multiple elements
Absolute XPath from root                 Any page structure change breaks it

Pro tip: Establish a team naming convention for data-testid attributes. Use kebab-case (user-profile-button) and apply them only to interactive or critical assertion targets. Don't pollute the DOM with test IDs on every paragraph.
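
A naming convention is easier to keep when it is checked automatically. The sketch below assumes the convention is plain kebab-case (lowercase letters and digits separated by single hyphens); is_valid_test_id is a hypothetical helper name, and a team could tighten or loosen the pattern.

```python
import re

# Lowercase alphanumeric words joined by single hyphens, e.g. user-profile-button.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_test_id(value: str) -> bool:
    """Return True if the value follows the assumed kebab-case convention."""
    return KEBAB_CASE.fullmatch(value) is not None


for candidate in ["user-profile-button", "userProfileButton", "user_profile_button"]:
    verdict = "ok" if is_valid_test_id(candidate) else "rejected"
    print(f"{candidate}: {verdict}")
```
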

5 When to Use It / When NOT to Use It

✅ Prioritise robust locators when...

  • Automating any UI with frequent front-end refactors
  • Working on dynamic SPAs with shifting DOM structures
  • Building a suite that must survive design-system updates
  • Accessibility is a project requirement
  • You want tests to act as accessibility regression checks

❌ Structural locators are acceptable when...

  • Testing API-only suites (no UI to locate)
  • Writing unit tests that don't touch the DOM
  • The page is a frozen legacy screen that never changes
  • You are prototyping and accept refactoring later

Before you apply this technique, ask:

  • Is this element accessible to users (visible, functional)?
  • Will this locator break if the page is refactored?
  • Does your team have a documented strategy for locator priority?

6 Common Mistakes — Don't Do This

🚫 Deep nested CSS selectors

I used to think: A long CSS chain is precise and therefore good.
Actually: Precision in CSS selectors is fragility in disguise. .modal > .content > .form > button breaks when any of those wrappers is renamed, moved, or wrapped in a new component. Use page.get_by_role("button", name="Save") instead.

🚫 Relying on CSS-in-JS generated hashes

I used to think: The class name in the browser is stable enough.
Actually: Hashes like .css-1a2b3c4 are generated by the build, not written by a developer. They can change when styles are edited, when dependencies update, or when the build configuration changes. Never locate elements by, or assert on, generated class names.

🚫 Using absolute XPath

I used to think: XPath is powerful, so I should use it for everything.
Actually: Absolute XPath like /html/body/div[2]/div/form/input is the most fragile locator strategy in existence. Any DOM insertion shifts the indices. Relative XPath with axes is acceptable for complex legacy DOMs, but role and test-id locators are always preferred.
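
The three mistakes above follow recognisable textual patterns, so a rough check can flag them during review. This is a heuristic sketch, not a Playwright feature: the rule names and regular expressions are assumptions chosen to match the examples in this section, and they will miss or over-report some selectors.

```python
import re
from typing import List

# One hypothetical rule per mistake described above.
RULES = [
    # A single leading slash (optionally after "xpath=") means a path from the root.
    ("absolute-xpath", re.compile(r"^(xpath=)?/(?!/)")),
    # Class names such as .css-1a2b3c4 produced by CSS-in-JS tooling.
    ("generated-class-hash", re.compile(r"\.css-[0-9a-z]{5,}")),
    # Three or more child combinators in one selector.
    ("deep-css-chain", re.compile(r"(?:[^>]+>){3,}")),
]


def audit(selector: str) -> List[str]:
    """Return the names of the rules the selector violates."""
    return [name for name, pattern in RULES if pattern.search(selector)]


for selector in [
    ".modal > .content > .form > button",
    ".css-1a2b3c4",
    "/html/body/div[2]/div/form/input",
    "//button[contains(text(), 'Submit')]",
]:
    print(selector, "->", audit(selector) or "no findings")
```

Relative XPath passes this check on purpose, since the text above allows it for complex legacy DOMs.
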

When this technique fails

CSS selectors are fast but break on class refactors. XPath is flexible but slow and hard to read. Pick the wrong strategy and a single DOM change can break 47 of your tests, as Trade Me learned.



7 Now You Try — Interview Warm-Up

🎯 Interactive Exercise

Three locators target the same "Pay Now" button. Which is the most robust?

  1. page.locator(".css-9x4a2b > button")
  2. page.get_by_role("button", name="Pay Now")
  3. page.locator("xpath=/html/body/div[3]/form/button[2]")

Write down your answer and reasoning before reading on.

The best choice is option 2:

page.get_by_role("button", name="Pay Now")

Option 1 relies on a generated CSS hash that can change on rebuild. Option 3 is an absolute XPath that breaks on any DOM change. Option 2 uses the button's semantic role and accessible name. It survives CSS refactors, wrapper insertions, and design-system upgrades. It also doubles as an accessibility check: if the button loses its role or its accessible name changes, the test fails, which is exactly what you want.

8 Self-Check — Can You Actually Do This?

Try each question before reading its answer. If you got all three, you're ready to practice.

Q1. Why is getByRole preferred over CSS selectors?

Because it targets the element's semantic purpose ("button", "heading", "navigation") rather than its visual presentation. It survives CSS refactors, wrapper insertions, and design-system updates. It also doubles as an accessibility regression check.

Q2. What makes data-testid stable?

It is decoupled from CSS and layout. The front-end team agrees on a naming convention and only changes the attribute when the component's test identity actually changes. Unlike class names, it is not affected by styling refactors or build tooling.

Q3. Why is absolute XPath considered the last resort?

It encodes the entire DOM path from the root. Any insertion, deletion, or reordering of nodes shifts the path and breaks the locator. It is also slow to evaluate and difficult for teammates to read and maintain. Use relative XPath with axes only when no better locator exists.

Interview prep

Interview questions test whether you choose locators strategically and understand the trade-offs.

Q1: A designer changes the Trade Me property card from divs to articles and updates CSS classes. Which locator strategy would survive this refactor: CSS class selector, getByRole, or XPath?
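A getByRole locator survives as long as the roles and accessible names it targets are unchanged, for example a heading or link inside the card. Note that getByRole('article') itself only matches after the change, because a plain div has no article role. The CSS class selector breaks immediately. An XPath that names div elements or relies on position breaks too. Accessible queries like getByRole are your best bet for stability.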
Q2: Your team has 47 tests failing after a CSS refactor. The old tests used `document.querySelector('.btn-checkout.primary')`. What's the fix?
Switch to `getByRole('button', { name: /checkout/i })` or another accessible query. This way, the test cares about what the button does (accessible name), not how it looks (CSS classes). Add this to your team's locator strategy document to prevent it happening again.
Q3: You're testing a BNZ form with 15 input fields. Should you use relative XPath like `//input[@placeholder='Account number']`, or test-id attributes like `data-testid='input-account'`?
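Start with user-facing locators: getByLabel if the fields have labels, or getByPlaceholder('Account number'), which uses the placeholder users see. If several fields share similar text, fall back to a test ID such as data-testid='input-account', which is explicit and survives refactors. Use XPath only if you have no other choice; it is slow and hard to maintain.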
Q4: You find an element using `getByRole('button', { name: /submit/i })`, but the test flakes sometimes. What might be wrong?
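The locator already uses a case-insensitive regex, so letter case is not the cause. More likely the button's accessible name is not what you expect: an aria-label may override the visible text, or the text may change with state, for example while the form is submitting. The regex may also match more than one button, or the button may render after the test looks for it. Inspect the button's role and computed accessible name in the browser's accessibility inspector. Flakiness often means the element structure is more complex than it looks.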
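
One possible cause of the flakiness in Q4 is that a loose regex matches more than one button once a second, similarly named button appears on the page. Playwright locators are strict, so an action on a locator that resolves to several elements raises an error. The sketch below shows only the matching logic in plain Python; the list of button names is invented for illustration.

```python
import re

# Hypothetical accessible names of the buttons currently on the page.
button_names = ["Submit", "Submit feedback", "Cancel"]

# Equivalent of name=/submit/i: a case-insensitive substring match.
pattern = re.compile(r"submit", re.IGNORECASE)
loose_matches = [name for name in button_names if pattern.search(name)]

# Equivalent of an exact name match.
exact_matches = [name for name in button_names if name == "Submit"]

print("loose:", loose_matches)
print("exact:", exact_matches)
```

In Playwright's Python API, passing exact=True to page.get_by_role together with a string name is one way to narrow the match to a single element.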