Upon completing the design phase of the software development life cycle, designers often have detailed wireframes and workflows that they can share with product managers, developers, and even clients to provide a preview of the look and feel of upcoming software features.
Testers can use these design documents to draft automated tests ahead of development, based on the expected user interface and workflows. Testers can also use the design documents to recommend to developers where to add IDs and other selectors for UI elements, or even work with developers to establish naming conventions for these selectors.
Having implemented the above process on previous test automation projects, I can attest that by agreeing on UI element selectors in advance, developers can work faster with less unplanned work to add automation support, and test engineers can rely on these selectors to consistently locate UI elements in their automated tests.
Now what if I told you that we could use those design documents to start running automated tests before a single line of code is even written?
Assuming that developers abide by the design, the code they write should eventually:
- follow the same workflows as the hi-fi design documents
- look very similar to, if not the same as, the images in the hi-fi design docs
Based on these assumptions, we can use screen captures from the hi-fi design images to enable template matching. By searching for closely matching images in the browser or device screen containing our app and interacting with the UI element at that location, we can use images to select UI elements and drive automated tests.
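To make the idea concrete, here is a minimal, dependency-free sketch of template matching. It is not how any particular tool implements it: real tools typically rely on an optimized library such as OpenCV. The function name `find_template`, the `max_mean_sq_diff` threshold, and the tiny hand-built "screen" are all assumptions made for illustration. The sketch slides the template over a grayscale screen, scores each position by mean squared pixel difference, and returns the center of the best match as the point a test driver would click.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixels, row-major, values 0-255


def find_template(
    screen: Image, template: Image, max_mean_sq_diff: float = 25.0
) -> Optional[Tuple[int, int]]:
    """Return the (x, y) center of the best match, or None if nothing is close enough."""
    s_h, s_w = len(screen), len(screen[0])
    t_h, t_w = len(template), len(template[0])
    best_score, best_xy = None, None
    for y in range(s_h - t_h + 1):
        for x in range(s_w - t_w + 1):
            # Sum of squared differences between the template and this window.
            ssd = sum(
                (screen[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(t_h)
                for dx in range(t_w)
            )
            score = ssd / (t_h * t_w)
            if best_score is None or score < best_score:
                best_score, best_xy = score, (x, y)
    # Reject the best candidate if it is still too different (hypothetical threshold).
    if best_score is None or best_score > max_mean_sq_diff:
        return None
    x, y = best_xy
    return (x + t_w // 2, y + t_h // 2)


# A 6x5 blank "screen" with a 2x2 "button" whose top-left corner is at x=3, y=2.
screen = [[0] * 6 for _ in range(5)]
button = [[200, 50], [50, 200]]
for dy, row in enumerate(button):
    for dx, value in enumerate(row):
        screen[2 + dy][3 + dx] = value

click_at = find_template(screen, button)
print(click_at)  # (4, 3)
```

The threshold is what lets a test tolerate small rendering differences between the design image and the real screen while still failing when the element is missing altogether.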
How does this work? Simple: by using computer vision to navigate the UI. Users don't look at the code of a webpage or app to locate something, so why limit your automated tests to using the DOM or an XML tree to navigate your website or app?
For example, take a look at the above image from the Where's Waldo series of books.
Looking for Waldo? Spoiler alert:
Now imagine that the Where's Waldo image works like an HTML image map, so that when you click on a region of the image (such as Waldo) you can interact with the UI. Template matching for test automation works in much the same way. Once the template image is found, the test automation driver can click on the element, send text to a cursor at that location, or otherwise interact with the element.
This all sounds great! But Selenium doesn't do this. If I'm not an OpenCV wizard, how can I do this in real life? Thankfully the people who are AI wizards have made tools to do this so the rest of us don't have to!
- As one of the first tools to enable test automation using image templates, SikuliX can be used to automate almost anything, so long as you have access to the output of the monitor or device screen to compare screenshots. To use SikuliX to test Web apps, a local computer or an AWS EC2 instance works well. For best results, keep the resolution, screen size, and browser zoom the same for your image templates and the web browser instance during a test. You can also combine SikuliX with Selenium in a hybrid test automation suite to get the advantages of both, as I did on a previous project.
- For mobile apps, Appium supports image comparison. For those using Appium 2, adding the Appium Image Plugin to your Appium server is simple, and after finding elements using image templates, you can perform most of the actions on them that you would usually expect from Appium. At the time of this writing, though, the Appium Image Plugin is not widely supported by cloud-based mobile device labs (BrowserStack is perhaps an exception), so you may want to run image-driven tests on a local device.
- For both Web and mobile apps, Applitools Eyes provides the capacity to create visual locators, as well as many other visual testing tools. Though Applitools is costlier than open-source tools such as SikuliX and Appium Image Plugin, tests can benefit from Applitools' image matching technology which uses machine learning to reduce false positives from UI elements that may change often, such as the price of an item.
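As a rough illustration of the Appium option above, the following sketch finds an element by image and taps it. It assumes the Appium Python client, an Appium 2 server with the images plugin installed and enabled (for example via `appium plugin install images` and `appium --use-plugins=images`), and an already connected `driver`. The helper names `encode_template` and `tap_by_image` and the file name `login_button.png` are hypothetical. The image locator strategy expects the template as a base64-encoded string, which is what the first helper produces.

```python
import base64
from pathlib import Path


def encode_template(path: str) -> str:
    """Read an image file and return its contents as a base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def tap_by_image(driver, template_path: str) -> None:
    """Find an on-screen element that matches the template image and tap it."""
    # Imported lazily so the encoding helper works without Appium installed.
    from appium.webdriver.common.appiumby import AppiumBy

    # Assumption: the server has the images plugin active, so the
    # "-image" locator strategy is available to the client.
    element = driver.find_element(AppiumBy.IMAGE, encode_template(template_path))
    element.click()
```

With a live session, a call might look like `tap_by_image(driver, "login_button.png")`, where the PNG is a crop taken from the hi-fi design image.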