Photo by Nick Gosset on Unsplash
7 Ways to Fix Flaky Tests
My top seven ways to minimize test flakiness
1: Prefer explicit waits
Waits are one of the most common reasons for flaky tests. An automated test typically consists of a series of steps that must be executed in order one after the other. The app the test is attempting to control, however, often depends upon APIs and systems that may perform faster or slower based on a variety of conditions.
To avoid running ahead of the app it is testing, a test must sometimes pause until the app is ready. How you implement those waits, though, is very important for improving test reliability and avoiding flakiness.
Use hard-coded waits as a last resort
What is a hard-coded wait?
Hard-coded waits are functions that pause the execution of a test script for a constant amount of time. While it may be tempting to say "waiting for this amount of time works for most conditions," odds are that:
when the app is working normally, your tests will waste time waiting for the maximum amount of time to pass
when the app or test environment is under strain, your tests will break since the app will take longer to change state than the hard-coded wait expected
Examples include Thread.sleep() in Java-based Selenium tests, cy.wait() with a number of milliseconds in Cypress, and page.waitForTimeout() in Playwright.
Now that you know what a hard-coded wait is, try not to use them! Use hard-coded waits in your automated tests only when other alternatives are impractical or unavailable, and encourage your colleagues to avoid them as well through code reviews and coding standards.
Understand the effects of implicit waits
Most test automation libraries, including Selenium, Cypress, and Playwright, support implicit waits.
An implicit wait works differently from a hard-coded wait in that:
the maximum wait time is set globally (hence "implicit" wait)
the implicit wait time automatically applies to every function in a test framework that locates an element or waits for an event to occur
if the expected element or event is detected, the wait will stop and test execution will continue even if the maximum wait time has not yet elapsed
In my experience, implicit waits generally improve the reliability of test execution and thus reduce flakiness. Implicit waits can also free less experienced testers from having to worry about wait times for common actions, which can help newer testers gain confidence.
Use explicit waits to make tests more resilient
Wait until specific UI elements or events are detected to continue a test
Selenium: ExpectedConditions
Cypress: wait strategies from Filip Hric
Playwright:
auto-waiting means that Playwright performs a range of actionability checks on the elements, such as ensuring the element is visible and enabled before it performs the click
expect timeouts will wait for an assertion to occur; the timeout for the assertion is unrelated to the test timeout
If you want to have more confidence that a test will run fluidly, you can wait for entire lists of elements to load. For example, when testing a particular page, you can wait until all of the elements you will interact with on that page have loaded, or wait until the elements that usually take the longest to load have text.
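The idea behind an explicit wait can be sketched without any browser library. The example below is a minimal illustration, not the implementation used by Selenium, Cypress, or Playwright: `wait_until` and `wait_until_all` are hypothetical helper names, and the "app" is simulated by a condition that becomes true after 0.2 seconds.

```python
import time
from typing import Callable, Iterable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0,
               poll_interval: float = 0.05) -> float:
    """Poll `condition` until it returns True; return the seconds waited.

    Raises TimeoutError if the condition is still false after `timeout`.
    """
    start = time.monotonic()
    while True:
        if condition():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


def wait_until_all(conditions: Iterable[Callable[[], bool]],
                   timeout: float = 5.0) -> None:
    """Wait until every condition holds, e.g. one check per page element."""
    checks = list(conditions)
    wait_until(lambda: all(check() for check in checks), timeout)


# Stand-in for an app whose element appears 0.2 seconds after page load.
ready_at = time.monotonic() + 0.2
waited = wait_until(lambda: time.monotonic() >= ready_at, timeout=2.0)
print(f"continued after {waited:.2f}s instead of sleeping for the full 2.0s")
```

Unlike a hard-coded wait, the test continues as soon as the condition is met, and the timeout only matters when something has actually gone wrong.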
Do not mix implicit and explicit waits, however:
As the Selenium docs warn:
- "Do not mix implicit and explicit waits. Doing so can cause unpredictable wait times. For example, setting an implicit wait of 10 seconds and an explicit wait of 15 seconds could cause a timeout to occur after 20 seconds."
If you are a more experienced tester and want consistent wait times while using both implicit and explicit waits, try turning implicit waits off (or setting the implicit wait time to zero) before using an explicit wait, catching any exceptions, and then turning implicit waits back on afterward. That way, you can enjoy the best of both wait styles!
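One way to package that pattern is a context manager. This is a sketch under assumptions: it expects a driver object with an `implicitly_wait(seconds)` method, as Selenium's Python bindings provide, and uses a hypothetical `FakeDriver` stand-in so the example runs without a browser. The helper name `implicit_wait_disabled` is also made up for illustration.

```python
from contextlib import contextmanager


class FakeDriver:
    """Hypothetical stand-in that only records the implicit wait setting."""

    def __init__(self) -> None:
        self.implicit_wait = 0.0

    def implicitly_wait(self, seconds: float) -> None:
        self.implicit_wait = seconds


@contextmanager
def implicit_wait_disabled(driver, restore_to: float):
    """Set the implicit wait to zero, then restore it even if the body raises."""
    driver.implicitly_wait(0)
    try:
        yield driver
    finally:
        driver.implicitly_wait(restore_to)


driver = FakeDriver()
driver.implicitly_wait(10)

with implicit_wait_disabled(driver, restore_to=10):
    # An explicit wait would run here with a predictable timeout.
    wait_inside_block = driver.implicit_wait

wait_after_block = driver.implicit_wait
```

The `finally` clause is what guarantees the implicit wait comes back on, even when the explicit wait times out and raises.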
2: Write independent tests
Independent tests set up their own test data and test state. Dependent tests, in contrast, rely on other tests to create test data or set the app to a certain state. For example, a test may rely on a previous test to create a test user and initialize that test user with certain settings. App state that is shared globally between tests is particularly vulnerable to flakiness, since other tests may change the state in unexpected ways.
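As a rough illustration, the sketch below uses Python's `unittest` with a hypothetical in-memory `UserStore` standing in for the app. Each test creates the user it needs in `setUp`, so the tests pass in either order.

```python
import unittest


class UserStore:
    """Hypothetical in-memory stand-in for the app's user data."""

    def __init__(self) -> None:
        self.users = {}

    def create_user(self, name: str, theme: str = "light") -> None:
        self.users[name] = {"theme": theme}

    def set_theme(self, name: str, theme: str) -> None:
        self.users[name]["theme"] = theme


class UserSettingsTests(unittest.TestCase):
    def setUp(self) -> None:
        # Every test gets a fresh store and creates the user it needs.
        self.store = UserStore()
        self.store.create_user("test-user")

    def test_default_theme_is_light(self) -> None:
        self.assertEqual(self.store.users["test-user"]["theme"], "light")

    def test_theme_can_be_changed(self) -> None:
        self.store.set_theme("test-user", "dark")
        self.assertEqual(self.store.users["test-user"]["theme"], "dark")


def run_in_order(test_names):
    """Run the named tests in the given order and report overall success."""
    suite = unittest.TestSuite(UserSettingsTests(name) for name in test_names)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


forward = run_in_order(["test_default_theme_is_light", "test_theme_can_be_changed"])
reverse = run_in_order(["test_theme_can_be_changed", "test_default_theme_is_light"])
```

If the store were a module-level global shared by both tests, the reverse order would leave the theme set to "dark" and the default-theme test would fail.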
3: Use XPath axes and CSS combinators
As shown in the diagram above, a logical tree model known as the Document Object Model (DOM) is used by most test automation tools to describe webpages. A similar XML tree is often used to describe mobile apps.
To navigate these tree models, XPath axes and CSS combinators describe the relationship between UI elements. Of these combinators, "descendant" is the axis/combinator I use most often. Typically one wants to find an element that is contained within a larger component, so finding the outer component first and then finding the inner (descendant) element makes creating XPath and CSS selectors much easier.
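The "outer component first, then descendant" approach can be tried out with Python's standard library, which supports a limited subset of XPath. The markup below is invented for illustration; a real test would run an equivalent selector through its automation tool.

```python
import xml.etree.ElementTree as ET

PAGE = """
<body>
  <nav id="main-menu">
    <ul>
      <li><a href="/news">News</a></li>
    </ul>
  </nav>
  <footer>
    <a href="/news-archive">News</a>
  </footer>
</body>
"""

root = ET.fromstring(PAGE)

# Step 1: find the outer component.
menu = root.find(".//nav[@id='main-menu']")

# Step 2: search only that component's descendants.
# Comparable selectors: XPath //nav[@id='main-menu']//a, CSS #main-menu a
menu_link = menu.find(".//a")

# Without the outer component, both "News" links would match.
all_links = root.findall(".//a")
```

Scoping the search to the menu means the test keeps finding the right link even though the footer contains a link with the same text.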
4: Use HTML data attributes
If you are testing webpages, your team should be using HTML data attributes. They are simple, resilient selectors that are easy to create, maintain, and understand.
Most test automation tools use the DOM to find elements on a webpage. There's only one problem: webpage visitors don't read the DOM! Most of your users are looking at your webpage and navigating it visually.
Your tests, meanwhile, are probably looking at the DOM to find HTML elements. HTML data attributes are designed to let any HTML element store "extra information that doesn't have any visual representation." Selectors are extra information about an element that I don't want to display visually... Sounds like a good fit!
So what does an HTML data attribute look like?
<a href='https://www.telemundo.com/noticias'
data-test-element-name='telemundo-news-link'
id='noticias'>Noticias Telemundo</a>
Above is a basic example of an HTML link. It contains a URL to a Spanish-language news website, and the text displayed is in Spanish as well ("Noticias Telemundo" translates as "Telemundo News" in English).
As seen above, HTML data attributes allow testers to:
create selectors that are easy to detect and read
provide information to our tests about an element that is independent of the other information the element contains
create selectors that are language-independent
- the selectors for your test can be in the spoken language used by your testing team, while the text displayed by your webpage can be in another language
create selectors that are used by QA only
- IDs, CSS, and other selectors may be used by the front-end framework used to create and render the webpage
establish naming conventions for data attributes that allow developers to add them easily
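To show how a test might look up the link above by its data attribute alone, here is a small sketch using Python's built-in HTML parser. The `DataAttributeIndex` class is a hypothetical helper; in a browser-based tool, the comparable CSS selector would be [data-test-element-name='telemundo-news-link'].

```python
from html.parser import HTMLParser

HTML = """
<a href='https://www.telemundo.com/noticias'
   data-test-element-name='telemundo-news-link'
   id='noticias'>Noticias Telemundo</a>
"""


class DataAttributeIndex(HTMLParser):
    """Index elements by their data-test-element-name attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("data-test-element-name")
        if name is not None:
            self.elements[name] = {"tag": tag, **attributes}


index = DataAttributeIndex()
index.feed(HTML)

# The lookup ignores the id, the URL, and the Spanish link text.
link = index.elements["telemundo-news-link"]
```

Because the lookup depends only on the data attribute, the id, URL, or displayed text can change without breaking the selector.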
While HTML data attributes are recognized as a best practice for test automation (Cypress, Playwright), some developers might complain that they don't want these attributes to be released to production. Fortunately, most front-end Web frameworks provide tools to strip HTML data attributes from code before it is packaged for production.
5: Don't use CSS style names as selectors
This is my personal opinion, and I can understand some people may find it controversial. While I am sure there are testers and development teams that use CSS styles successfully, what I have tended to encounter is that:
Developers will change styles without telling you
CSS is for styles. Cascading Style Sheets - it's in the name. Some might call that a clue! CSS is used to describe the way elements on a webpage look, not what they do.
Developers will probably be unaware of automated tests that may be using a CSS style as a selector. If they change the name of a style or remove that style from a single element - poof! Suddenly, your test may break for a reason unrelated to the functionality of the application.
Additionally, developers should be able to make updates to CSS styles without having to slow down their work.
Now, some people might say that by creating CSS styles that are QA-specific, developers will know not to remove them. Extra styles increase the file size of the CSS files that every webpage visitor has to download, however. If your developers can add a custom CSS style, why not use an HTML data attribute instead? Nikolay Advolodkin provides a battle-tested list of reasons in his article "Bulletproof Your Automated Testing: Why Data-* Attributes Trump CSS Selectors."
6: Establish naming conventions for selectors
As applications change, the projects and teams that create and update those applications will change as well. To make development faster and easier, QAs and the developer team should agree on naming conventions so that developers can code without having to ask the QA team what to name a selector.
Agreeing on naming conventions ahead of time will:
make development easier by allowing devs to write code without having to refer to a list of test automation selectors
allow testers to confidently create automated tests ahead of development because the selectors they expect are likely to be included in the app
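A naming convention is easier to keep when it can be checked automatically. The sketch below assumes a made-up convention, lowercase kebab-case in the form feature-element-type, purely for illustration; your team's convention and the `follows_convention` helper name will differ.

```python
import re

# Hypothetical convention: <feature>-<element>-<type>, lowercase kebab-case,
# ending in one of a small set of agreed element types.
ELEMENT_TYPES = ("button", "link", "input", "list")
SELECTOR_PATTERN = re.compile(
    r"^[a-z0-9]+(?:-[a-z0-9]+)+-(?:" + "|".join(ELEMENT_TYPES) + r")$"
)


def follows_convention(selector_name: str) -> bool:
    """Return True if the name matches the assumed team convention."""
    return SELECTOR_PATTERN.fullmatch(selector_name) is not None


names_to_check = [
    "telemundo-news-link",   # follows the assumed convention
    "login-submit-button",   # follows the assumed convention
    "LoginSubmitButton",     # wrong case and no hyphens
    "login_submit_button",   # underscores instead of hyphens
]
results = {name: follows_convention(name) for name in names_to_check}
```

A check like this could run in a linter or a unit test, so that a selector that breaks the convention is caught before the code is merged.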
7: Extend your framework so that you can use more than one locator for the same element
In software development, change is a fact of life, and for testers, those changes can cause tests to break. If the change is small, such as a test needing a new selector to locate an element, the test can be quickly updated and the test maintenance is minor. Wouldn't it be nice, though, to keep test maintenance due to changing selectors to a minimum?
Many tools offered by test automation vendors provide some way to use more than one selector to identify an element, whether by providing more than one selector yourself or the tool generating selectors for you. Open-source tools like Healenium can also help to generate and maintain a list of selectors in an automated fashion.
Since the WebDriver functions used to find elements receive a "By" locator that accepts different selectors, an approach I have used on previous test automation projects is to create an object that contains a list of selectors and then loop through the selectors to try to find an element. If you are using the Selenium PageFactory feature, the "@FindAll" annotation will also allow your page objects to look for the first matching selector. This approach works particularly well when dealing with feature flags or testing code branches that may not be released to production in the current sprint.
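The looping approach can be sketched roughly as follows. This is not WebDriver code: `find_first`, `fake_find`, and `FAKE_PAGE` are hypothetical names, the page is a plain dictionary, and a real implementation would catch the automation tool's own "element not found" exception rather than `LookupError`.

```python
from typing import Callable, Sequence, Tuple

Locator = Tuple[str, str]  # (strategy, value), e.g. ("id", "noticias")


def find_first(find: Callable[[str, str], object],
               locators: Sequence[Locator]) -> object:
    """Try each locator in order and return the first element found."""
    for strategy, value in locators:
        try:
            return find(strategy, value)
        except LookupError:
            continue  # this locator did not match; try the next one
    raise LookupError(f"no locator matched: {list(locators)}")


# Hypothetical page: only the data attribute selector matches.
FAKE_PAGE = {
    ("css", "[data-test-element-name='telemundo-news-link']"): "<a> element",
}


def fake_find(strategy: str, value: str) -> object:
    return FAKE_PAGE[(strategy, value)]  # KeyError is a LookupError


NEWS_LINK = [
    ("id", "noticias-v2"),  # e.g. a selector behind a feature flag
    ("css", "[data-test-element-name='telemundo-news-link']"),
]
element = find_first(fake_find, NEWS_LINK)
```

Listing the most specific or newest selector first means the test picks it up as soon as it exists, while the older selector keeps the test working until then.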
Reducing test flakiness is a team effort
Reducing test flakiness is an important goal for test automation and CI/CD. Reliable test results increase the confidence of your team and allow more members of the team to run the tests and evaluate their results. Hopefully, the techniques in this article will help to make your tests more reliable and resilient. Happy testing!