Locator Strategies

1 The Hook — Why This Matters

Research on Trade Me's automation suite (Patel & Ali, 2021) identified a painful pattern: XPath was the number-one cause of flaky tests on their dynamic site. When the marketing team added a single wrapper div to the homepage, forty-seven tests broke overnight.

The root cause? Absolute XPath paths like /html/body/div[2]/div/form/input. Any insertion, deletion, or reordering of DOM nodes shifted the entire coordinate system. The team migrated to ID and data-testid locators. Flakiness dropped by sixty percent in the next quarter.
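The failure mode is easy to reproduce without a browser. The following is a minimal sketch using Python's standard-library ElementTree rather than Playwright; the markup, the `promo-wrapper` class, and the `email-input` test ID are hypothetical stand-ins for the real page. It shows a structural path that stops resolving after one wrapper div is added, while a test-ID lookup keeps working.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the page before the marketing change.
BEFORE = (
    "<html><body>"
    "<div id='header'/>"
    "<div id='main'><div><form><input data-testid='email-input'/></form></div></div>"
    "</body></html>"
)

# The same page after a single wrapper div is added around the body content.
AFTER = (
    "<html><body><div class='promo-wrapper'>"
    "<div id='header'/>"
    "<div id='main'><div><form><input data-testid='email-input'/></form></div></div>"
    "</div></body></html>"
)

# ElementTree paths are relative to the root <html> element, so this mirrors
# the absolute XPath /html/body/div[2]/div/form/input.
STRUCTURAL_PATH = "./body/div[2]/div/form/input"
TEST_ID_PATH = ".//*[@data-testid='email-input']"


def resolves(markup: str, path: str) -> bool:
    """Return True if the path still finds an element in the markup."""
    return ET.fromstring(markup).find(path) is not None


for version, markup in (("before", BEFORE), ("after", AFTER)):
    print(version,
          "| structural:", resolves(markup, STRUCTURAL_PATH),
          "| test id:", resolves(markup, TEST_ID_PATH))
```

The structural path resolves only in the first version; the test-ID path resolves in both, because it never mentions where the input sits.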

Your locator strategy determines whether your suite is an asset or a maintenance tax. Choose wrong, and you will spend more time fixing tests than finding bugs.

2 The Rule — The One-Sentence Version

Prefer locators that describe what the element is or does, not where it sits in the DOM, because layout changes far more often than purpose.

A button that submits a payment will always be a "submit payment" button, even if the design system changes, the CSS classes are renamed, or the component is moved to a modal. A locator tied to structure dies on the first refactor.

3 The Analogy — Think Of It Like...

Giving a courier delivery directions.

You can say "third house from the corner, past the oak tree, left of the red mailbox." That's an XPath. If a new house is built, the oak tree is cut down, or the mailbox is repainted blue, the courier is lost. Or you can say "the dairy with the blue door." That's a role + accessible name. The dairy is still the dairy no matter how the street changes.

4 Watch Me Do It — Step by Step

Playwright's documentation recommends user-facing locators first. The order below follows that guidance: memorise it and default to the top.

Role + accessible name: Survives layout changes and enforces accessibility.
```
page.get_by_role("button", name="Submit payment")
```
Label: Tied to <label for="...">. Semantic and stable.
```
page.get_by_label("Email address")
```

Placeholder or visible text: Human-readable, but can collide if duplicated.
```
page.get_by_placeholder("Search products...")
page.get_by_text("Order confirmed")
```

Test ID: Stable, decoupled from CSS, but requires front-end cooperation.
```
<button data-testid="checkout-submit">Pay Now</button>
```
```
# Test code
page.get_by_test_id("checkout-submit")
```

CSS selector: Fast and native, but brittle when it depends on class names or nesting. An attribute selector such as the one below avoids both.
```
page.locator("[data-testid='login-submit']")
```

XPath: Last resort. Slow, hard to read, and fragile.
```
page.locator("//button[contains(text(), 'Submit')]")
```
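One way to make the priority order hard to forget is to encode it in a small helper. The sketch below is an illustration, not part of Playwright: `Target` and `locate` are hypothetical names, and `page` is assumed to be a Playwright `Page` (or anything exposing the same `get_by_*` and `locator` methods). It picks the highest-priority strategy for which information is available.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Target:
    """What we know about an element; fill in as much as possible."""
    role: Optional[str] = None
    name: Optional[str] = None         # accessible name, used together with role
    label: Optional[str] = None
    placeholder: Optional[str] = None
    text: Optional[str] = None
    test_id: Optional[str] = None
    css: Optional[str] = None
    xpath: Optional[str] = None        # expression without the "xpath=" prefix


def locate(page: Any, target: Target) -> Any:
    """Return a locator using the highest-priority strategy available."""
    if target.role:
        return page.get_by_role(target.role, name=target.name)
    if target.label:
        return page.get_by_label(target.label)
    if target.placeholder:
        return page.get_by_placeholder(target.placeholder)
    if target.text:
        return page.get_by_text(target.text)
    if target.test_id:
        return page.get_by_test_id(target.test_id)
    if target.css:
        return page.locator(target.css)
    if target.xpath:
        # Last resort: the explicit prefix tells Playwright this is XPath.
        return page.locator(f"xpath={target.xpath}")
    raise ValueError("Target has no usable locator information")
```

A call such as `locate(page, Target(role="button", name="Submit payment", test_id="checkout-submit"))` would use the role locator and ignore the test ID, which keeps the fallback available without making it the default.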

Anti-patterns — what breaks and why

| Anti-pattern | Why it breaks |
| --- | --- |
| `div > div > div:nth-child(2) > button` | Any wrapper insertion kills the depth |
| `.css-1a2b3c4` (CSS-in-JS hash) | Rebuilds change the hash unpredictably |
| `//*[contains(@class, "btn")]` | Slow; may match multiple elements |
| Absolute XPath from root | Any page structure change breaks it |
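These anti-patterns are regular enough to catch in code review or CI. The following is a rough sketch of a selector lint in plain Python; the rule names and regular expressions are assumptions chosen to match the four rows above, and they are heuristics rather than a complete parser.

```python
import re
from typing import List

# Heuristic rules only; the patterns are illustrative, not exhaustive.
BRITTLE_PATTERNS = [
    # Starts with a single slash (optionally after "xpath="): absolute XPath.
    ("absolute-xpath", re.compile(r"^(?:xpath=)?/(?!/)")),
    # A ".css-" class containing at least one digit: likely a CSS-in-JS hash.
    ("generated-class", re.compile(r"\.css-[0-9a-z]*\d[0-9a-z]*")),
    # Positional CSS that depends on sibling order.
    ("positional-css", re.compile(r":nth-child\(")),
    # XPath that matches on a substring of the class attribute.
    ("xpath-class-contains", re.compile(r"contains\(\s*@class")),
]


def lint_selector(selector: str) -> List[str]:
    """Return the name of every brittle pattern found in the selector."""
    return [name for name, pattern in BRITTLE_PATTERNS if pattern.search(selector)]


if __name__ == "__main__":
    for candidate in (
        "div > div > div:nth-child(2) > button",
        ".css-1a2b3c4",
        "[data-testid='login-submit']",
    ):
        print(candidate, "->", lint_selector(candidate) or "ok")
```

An empty result does not prove a selector is robust; it only means none of these known-bad shapes were found.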

Pro tip: Establish a team naming convention for data-testid attributes. Use kebab-case (user-profile-button) and apply them only to interactive or critical assertion targets. Don't pollute the DOM with test IDs on every paragraph.
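A naming convention only holds if something checks it. As a minimal sketch, assuming the convention is lowercase kebab-case with letters and digits only, a check like the following could run in a unit test or pre-commit hook; `is_valid_test_id` and `invalid_test_ids` are hypothetical helper names.

```python
import re
from typing import Iterable, List

# Lowercase words or digits separated by single hyphens, e.g. "user-profile-button".
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_test_id(value: str) -> bool:
    """Return True if the value follows the kebab-case convention."""
    return KEBAB_CASE.fullmatch(value) is not None


def invalid_test_ids(values: Iterable[str]) -> List[str]:
    """Return the test IDs that break the convention, in input order."""
    return [value for value in values if not is_valid_test_id(value)]


if __name__ == "__main__":
    found = ["user-profile-button", "checkoutSubmit", "login_submit"]
    print(invalid_test_ids(found))
```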

5 When to Use It / When NOT to Use It

✅ Prioritise robust locators when...

Automating any UI with frequent front-end refactors
Working on dynamic SPAs with shifting DOM structures
Building a suite that must survive design-system updates
Accessibility is a project requirement
You want tests to act as accessibility regression checks

❌ You can relax this rule when...

Testing API-only suites (no UI to locate)
Writing unit tests that don't touch the DOM
The page is a frozen legacy screen that never changes
You are prototyping and accept refactoring later

Before you apply this technique, ask:

Is this element accessible to users (visible, functional)?
Will this locator break if the page is refactored?
Does your team have a documented strategy for locator priority?

6 Common Mistakes — Don't Do This

🚫 Deep nested CSS selectors

I used to think: A long CSS chain is precise and therefore good.
Actually: Precision in CSS selectors is fragility in disguise. .modal > .content > .form > button breaks when any of those wrappers is renamed, moved, or wrapped in a new component. Use page.get_by_role("button", name="Save") instead.

🚫 Relying on CSS-in-JS generated hashes

I used to think: The class name in the browser is stable enough.
Actually: Hashes like .css-1a2b3c4 are generated by the build, not written by a person. They can change when dependencies update, when tree-shaking runs differently, or when a colleague adds a new import. Never locate elements by generated class names.
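The effect can be simulated without a browser. This sketch uses Python's standard-library ElementTree rather than Playwright, and the two "builds" are hypothetical markup that differs only in the generated class. The class-based lookup finds the button in one build and nothing in the other; the test-ID lookup finds it in both.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical output of two builds: same button, different generated class.
BUILD_A = ("<form><button class='css-1a2b3c4' "
           "data-testid='checkout-submit'>Pay Now</button></form>")
BUILD_B = ("<form><button class='css-9x4a2b' "
           "data-testid='checkout-submit'>Pay Now</button></form>")

BY_HASHED_CLASS = ".//button[@class='css-1a2b3c4']"
BY_TEST_ID = ".//button[@data-testid='checkout-submit']"


def find_text(markup: str, path: str) -> Optional[str]:
    """Return the matched element's text, or None if nothing matches."""
    element = ET.fromstring(markup).find(path)
    return None if element is None else element.text


for build_name, markup in (("build A", BUILD_A), ("build B", BUILD_B)):
    print(build_name,
          "| by class:", find_text(markup, BY_HASHED_CLASS),
          "| by test id:", find_text(markup, BY_TEST_ID))
```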

🚫 Using absolute XPath

I used to think: XPath is powerful, so I should use it for everything.
Actually: Absolute XPath like /html/body/div[2]/div/form/input is about as fragile as a locator gets. Any DOM insertion shifts the indices. Relative XPath with axes is acceptable for complex legacy DOMs where nothing better exists, but prefer role and test-id locators whenever they are available.

When this technique fails

CSS selectors are fast but break on class refactors. XPath is flexible but slow and hard to read. Pick a structure-bound strategy and a single wrapper div can break 47 tests overnight, as Trade Me learned.

7 Now You Try — Interview Warm-Up

🎯 Interactive Exercise

Three locators target the same "Pay Now" button. Which is the most robust?

page.locator(".css-9x4a2b > button")
page.get_by_role("button", name="Pay Now")
page.locator("xpath=/html/body/div[3]/form/button[2]")

Write down your answer and reasoning before reading on.

The best choice is option 2:

page.get_by_role("button", name="Pay Now")

Option 1 relies on a generated CSS hash that can change on rebuild. Option 3 is absolute XPath that breaks on any DOM change. Option 2 uses the button's semantic role and accessible name. It survives CSS refactors, wrapper insertions, and design-system upgrades. It also doubles as an accessibility check: if the button loses its role or its label changes, the test fails, which is exactly what you want.

8 Self-Check — Can You Actually Do This?

Answer each question yourself before reading the answer below it. If you got all three, you're ready to practice.

Q1. Why is get_by_role preferred over CSS selectors?

Because it targets the element's semantic purpose ("button", "heading", "navigation") rather than its visual presentation. It survives CSS refactors, wrapper insertions, and design-system updates. It also doubles as an accessibility regression check.

Q2. What makes data-testid stable?

It is decoupled from CSS and layout. The front-end team agrees on a naming convention and only changes the attribute when the component's test identity actually changes. Unlike class names, it is not affected by styling refactors or build tooling.

Q3. Why is absolute XPath considered the last resort?

It encodes the entire DOM path from the root. Any insertion, deletion, or reordering of nodes shifts the path and breaks the locator. It is also slow to evaluate and difficult for teammates to read and maintain. Use relative XPath with axes only when no better locator exists.
