Specifics of data warehouse and business intelligence testing
Business Intelligence software is a set of technologies aimed at enabling executives, managers, and analysts to make better and faster decisions.
Let’s consider an example of the online store.
Customers visit the website, navigate its pages, make decisions, and add the selected goods to the cart. Meanwhile, they don’t realize that every step is registered, analyzed, and alongside with other steps used to make some business decision.
Ecommerce owners try to get maximum information about the consumer: gender, age, country of origin, purchases, money spent on this or that item, payment methods preferred, etc. The data is further organized, restructured, and provided to business users (managers, marketing specialists, etc.) to help them make well-grounded and informed decisions.
What is the architecture of business intelligence software?
The company may prefer to make use of either off-the-shelf BI solution or invest in developing its own. The market today offers a wide range of BI software of different complexity and functionality.
However, any of them shall be made of three obligatory blocks:
- Data loading and transformation system
- Data warehouse system
- Reporting and data visualization system
Data comes to the BI system from a number of different sources. First of them is JSON files. These are text files with lines in JSON format that record the subset of the website activity that is of interest to us. Also, information may be entered by the company’s employees (goods description, price, etc.). Finally, the information may be extracted from the marketing campaigns.
DWH and business intelligence testing is divided into several phases:
Certainly, the well-grounded business decision can be made only if the source data is reliable and error-free. Thorough testing will help to guarantee it.
ETL testing
In data warehousing, ETL refers to data pulling out (Extract), placing it in a DWH system in the organized format (Transform and Load). Testing engineers will verify the data moves from the source to the target repository and the transformation rules have been applied as required.
ETL testing will help to
1) Make sure no data is lost.
The data may be lost on any of the stages:
- On the website itself. For example, the user made a purchase but no data entered the DWH.
- On the way from the website to the DWH. Before the information gets to the DWH system it may be stored in a cloud using Amazon services, for example.
- Inside the DWH while moving from one level to another.
Software engineers will check the entire process and detect the bottlenecks.
2) Verify key-value
The records are stored and retrieved using a key that uniquely identifies the record, and is used to quickly find the data within the database. Testers should verify that the key corresponds to the record and the data may be processed correctly. Otherwise, the data loss may be significant.
3) Check the user session parameters are recorded correctly: start and end of the session, its duration, user profile data, his or her activity during the session.
4) Make sure the SQL functional works correctly.
For example, on one of the project the a1qa engineers came across the following defect: the DATEDIFF function that returns the difference between two date values, based on the interval specified, calculated only hours. As a result, the session lasted 11 minutes but the record was an hour.
5) Verify calculations accuracy.
Data warehousing is all about information and calculations. The calculations should be performed correctly.
In the table above, user activity is measured and summarized. The ‘Total’ value should be equal to the sum of values in all columns. And it’s the tester’s job to check it.
6) Exclude data duplication.
On any stage of the processing, data entries can be duplicated. If the customer purchased one item and the system creates two files – it’s a false scenario that should be omitted.
Report testing
The QA engineer will verify the data sorting, exporting and getting to the report. This phase also includes usability testing of the reports with the focus on various data sets (time, currency, etc.).
Load / Performance testing
As the company grows, the data volume will grow as well. So it’s vitally important to validate the system performance and scalability and define its load limits.
What are the main features of the DWH/BI applications?
Data testing is a complicated topic. Up to now, there is no single testing methodology. Every data testing project will be a unique one.
The main features of the DWH/BI solutions that should be taken into account are the following:
- Complex architecture and business logic
- Large volume of heterogeneous data
- Multiple data sources
- Constantly growing data scope
- Changing business requirements
How to choose DWH and BI testing team?
In order to perform efficient testing of the data processing and storage a tester is expected to:
- have a clear idea of the ETL process
- possess good understanding of database principles
- know BI/DWH concepts and technologies
- adapt to dynamic software requirements
- be able to ‘talk’ to business users.
What is more, it will be a big plus if the QA team is able to participate in the design and requirements testing phase. This will reduce the number of late improvements and ensure the project readiness within the set time and budget limits.
Timely and unbiased BI and DWH testing will guarantee information accuracy and reporting efficiency. Valid and correct data will help you make better decisions to fulfill your needs.
Need help? Book a free consultation with an a1qa expert.