How to use computer vision to improve your test automation
Computer vision is used in a variety of modern devices and security systems. However, it isn’t widely applied in test automation. Most often, verifications are limited to comparing two images or searching for a particular element in an overall picture. Is it possible to go beyond this? Is there a way to make automated tests ‘see’ all objects on the screen of a tablet or smartphone, for instance?
Read our article to learn how computer vision can streamline mobile test automation, which tools to choose, and why its use is sometimes limited.
Appium is the tool most often used to automate testing of iOS and Android mobile apps, and it is a solid one. However, its capabilities are at times limited.
For instance, automatically finding certain elements within a single image can be a challenge for Appium.
The issue is that Appium can’t locate individual elements of an image on the screen unless developers have previously assigned a unique identifier (locator) to each of them. Without such identifiers, Appium will overlook the elements.
What is the way out? Make use of computer vision.
Computer vision potential
Computer vision is a scientific discipline that allows machines to ‘visually’ analyze their environment. Images and videos usually serve as the input for this process.
Today, the technology recognizes people by their faces and postures and helps unmanned vehicles distinguish road signs and identify pedestrians accurately. It is also used to process images in medicine and to support manufacturing.
The technology is used to solve the following tasks:
- Object identification
Computer vision helps detect whether an image or video contains a particular object. Searching images by their content, estimating an object’s location, and recognizing characters in text also fall under this task.
- Estimating speed of moving objects
A sequence of images is processed to determine the speed of each point in the image.
- Image recovery
The core goal is to remove noise and blur (for example, the fuzzy image of an object in motion).
- Scene restoration
Given several images of a scene, it’s possible to recreate its three-dimensional model.
Mobile testing automation using computer vision
The technology also serves to automate mobile software testing.
In automation, it is used specifically to search for individual elements in the overall image. Traditionally, these elements have to be cut out of a screenshot in advance and stored. This approach doesn’t work well for mobile test automation on devices with different screen resolutions, because an element cut out at one resolution may not match the same element rendered at another.
Automating the cutting step therefore removes the need to crop elements manually and optimizes the workflow significantly.
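Searching for a stored element inside a screenshot is commonly done with template matching. The sketch below is a plain-Python illustration of the idea, not the implementation used by any particular tool. The function name `find_template` and the tiny sample data are assumptions made for this example.

```python
def find_template(image, template):
    """Return (row, col, score) of the best match of template in image.

    Both arguments are 2D lists of grayscale values (0-255). The score is
    the sum of squared differences, so 0 means a pixel-perfect match.
    Returns None if the template is larger than the image.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    # Slide the template over every position where it fully fits.
    for top in range(img_h - tpl_h + 1):
        for left in range(img_w - tpl_w + 1):
            score = sum(
                (image[top + r][left + c] - template[r][c]) ** 2
                for r in range(tpl_h)
                for c in range(tpl_w)
            )
            if best is None or score < best[2]:
                best = (top, left, score)
    return best


# A 4x5 "screenshot" with a bright 2x2 "button" at row 1, column 2.
screenshot = [
    [0, 0, 0, 0, 0],
    [0, 0, 200, 210, 0],
    [0, 0, 220, 230, 0],
    [0, 0, 0, 0, 0],
]
button = [[200, 210], [220, 230]]

match = find_template(screenshot, button)
print(match)
```

A score of 0 is only reached when the stored crop matches the screen pixel for pixel, which is why crops taken at one resolution tend to fail on another. In practice, OpenCV's `cv2.matchTemplate` performs this kind of search far more efficiently.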
How to find the necessary elements in the whole image?
Computer vision tools serve this purpose. By combining them with Appium, test automation engineers can work around the limitation described above.
Let’s discuss each tool in detail.
Overview of computer vision tools
OpenCV
This C/C++ based tool is an open source library that contains computer vision and image processing algorithms. It can also be used from Java, Python, Ruby, and some other languages.
OpenCV can binarize an image, that is, convert a full-color picture to a monochrome one with only two types of pixels (dark and light).
Binarization is performed by either an adaptive or a threshold method. The adaptive method calculates a separate threshold for each small region, which makes it suitable for unevenly lit images. The threshold method applies a single threshold to the whole image, dividing it into black and white.
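The difference between the two binarization methods can be illustrated in a few lines of plain Python. This is a simplified sketch rather than OpenCV's own code: the function names, the `margin` parameter, and the sample row of pixels are assumptions chosen for the example.

```python
def global_threshold(gray, threshold=128):
    """Threshold method: one cut-off value for the whole image."""
    return [[255 if px > threshold else 0 for px in row] for row in gray]


def adaptive_threshold(gray, radius=1, margin=15):
    """Adaptive method: compare each pixel with the mean of its neighbourhood.

    A pixel becomes light only if it exceeds the local mean by more than
    `margin`, so slow changes in lighting do not affect the result.
    """
    h, w = len(gray), len(gray[0])
    result = []
    for r in range(h):
        row = []
        for c in range(w):
            neighbours = [
                gray[rr][cc]
                for rr in range(max(0, r - radius), min(h, r + radius + 1))
                for cc in range(max(0, c - radius), min(w, c + radius + 1))
            ]
            local_mean = sum(neighbours) / len(neighbours)
            row.append(255 if gray[r][c] > local_mean + margin else 0)
        result.append(row)
    return result


# One row of pixels: the background gets brighter from left to right,
# and two small marks (at index 1 and index 6) stand out from it.
uneven = [[40, 120, 80, 100, 120, 140, 220, 180]]

global_result = global_threshold(uneven)
adaptive_result = adaptive_threshold(uneven)
print(global_result)    # [[0, 0, 0, 0, 0, 255, 255, 255]]
print(adaptive_result)  # [[0, 255, 0, 0, 0, 0, 255, 0]]
```

The single threshold misses the mark in the dark area and treats the bright background as an object, while the adaptive version finds both marks. In OpenCV itself, the corresponding functions are `cv2.threshold` and `cv2.adaptiveThreshold`.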
Built-in algorithms separate the object from the background and search for individual elements by their contours. Because the code is open source, these methods can be combined into a solution for a specific task.
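To show how individual elements can be located once an image has been binarized, here is a minimal plain-Python sketch. Note that it uses connected-component labeling (a flood fill) instead of true contour tracing, which is a simpler technique that yields the same bounding boxes for this purpose. The function name `find_element_boxes` and the sample image are assumptions.

```python
from collections import deque


def find_element_boxes(binary):
    """Return bounding boxes (left, top, right, bottom) of light regions.

    `binary` is a 2D list where 0 is background and any other value is
    part of an element. Pixels touching horizontally or vertically are
    treated as one element.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Flood-fill the region that starts at this pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, bottom, left, right = r, r, c, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] != 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


# Two separate elements: a 2x2 block and a 1x2 vertical bar.
binary_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 255, 255, 0, 0, 0],
    [0, 255, 255, 0, 255, 0],
    [0, 0, 0, 0, 255, 0],
]

boxes = find_element_boxes(binary_image)
print(boxes)  # [(1, 1, 2, 2), (4, 2, 4, 3)]
```

With OpenCV, a comparable result is usually obtained with `cv2.findContours` followed by `cv2.boundingRect` for each contour.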
SikuliX
SikuliX is an alternative tool for finding an individual element within the overall image. It is an open source, cross-platform scripting environment focused on automating GUIs using screenshots.
Scripts can be written in Python (through Jython) or Ruby (through JRuby). The program runs on Windows, Mac OS X, and Linux. SikuliX is based on OpenCV algorithms.
Tesseract OCR
This is one of the most popular and highest-quality programs for recognizing text in images. It is written in C/C++ and runs on Mac OS, Windows, and Linux.
The tool keeps adding support for new languages and fonts, and the updated version 4.0 is built around a neural network.
To start working with this solution, you need Leptonica (an image processing library), the latest version of Tesseract OCR, and the trained data for the desired language (for example, Spanish).
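Once these components are installed, Tesseract can be called from a test script through its command-line interface. The sketch below assumes the `tesseract` binary is on the PATH and the Spanish data (`spa`) is installed. The helper names and the file name `login_screen.png` are hypothetical.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, language="spa"):
    """Build the command line for recognizing text in one image.

    Passing "stdout" as the output base makes tesseract print the
    recognized text instead of writing it to a file; "-l" selects the
    language data to use.
    """
    return ["tesseract", image_path, "stdout", "-l", language]


def recognize_text(image_path, language="spa"):
    """Return the recognized text, or None if tesseract is not installed."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


command = build_tesseract_command("login_screen.png")
print(" ".join(command))  # tesseract login_screen.png stdout -l spa
```

A test could then compare the text returned by `recognize_text` for a cropped screenshot with the label it expects to see on the screen.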
Combined with Appium, these tools make automated testing of a mobile application look like this.
A screenshot of the screen is taken automatically and then divided into separate elements. Each element is added to a configuration file together with its coordinates, so engineers have the coordinates of every element in advance.
After that, Appium reads the configuration file, obtains the necessary information, and interacts with the elements using their coordinates.
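The hand-off through a configuration file can be sketched as follows. The JSON layout, the element name `login_button`, and the `tap` callback are assumptions made for this example. In a real suite the callback would wrap whatever tap call your Appium client provides.

```python
import json
import os
import tempfile


def save_elements(path, elements):
    """Write element names and their bounding boxes to a JSON config file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(elements, fh, indent=2)


def load_tap_point(path, name):
    """Read one element from the config and return the center of its box."""
    with open(path, encoding="utf-8") as fh:
        box = json.load(fh)[name]
    return (box["left"] + box["right"]) // 2, (box["top"] + box["bottom"]) // 2


def tap_element(tap, path, name):
    """Tap the element through the supplied callback (hypothetical driver hook)."""
    x, y = load_tap_point(path, name)
    tap(x, y)
    return x, y


# Stand-in for a real driver: it only records where a tap would happen.
taps = []

with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "elements.json")
    # Coordinates as they might come out of the image-splitting step.
    save_elements(config_path, {
        "login_button": {"left": 100, "top": 400, "right": 300, "bottom": 460},
    })
    tap_element(lambda x, y: taps.append((x, y)), config_path, "login_button")

print(taps)  # [(200, 430)]
```

Keeping the coordinates in a file separates the image analysis step from the test itself, so the screenshot only has to be processed again when the screen layout changes.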
Why is computer vision not a versatile solution?
Although computer vision is developing rapidly, QA and Dev teams can’t yet rely on it on a daily basis.
This is due to the following factors:
- Dependency on the image quality
Sometimes the quality of a screenshot is quite low, and computer vision tools then fail to provide an accurate result.
- Inability to automate all test cases
This is caused by the existing limitations in recognizing animated elements.
To sum up
Appium is a proven and robust mobile test automation tool. However, test automation engineers sometimes combine it with computer vision tools to build a sustainable solution, one that provides exact and quick results and can be extended to multiple applications with minimal effort.