Introduction
The ecosystem-test-scripts repository is maintained by Mozilla’s Ecosystem Test Engineering (ETE) Team to support its testing operations and the service teams it works with. It provides tools and documentation for managing tests in Mozilla services.
Contents
- Command-Line Tools: Scripts for tasks such as metric collection and report updates.
- Developer Guides: Instructions for configuring environments and contributing to the repository.
- Reference Guides: Documentation for interpreting metrics and SOPs.
Command-Line Tool
COMMANDS
check
-- Run linting, formatting, security, and type checks.
clean
-- Clean up installation and cache files.
format
-- Apply formatting.
install
-- Install dependencies.
run_metric_reporter
-- Run the Test Metric Reporter.
test
-- Run tests.
test_coverage
-- Run tests with coverage reporting.
test_coverage_html
-- Run tests and generate HTML coverage report.
install
Install dependencies.
USAGE
make install
SEE ALSO
clean
-- Clean up installation and cache files.
clean
Clean up installation and cache files.
USAGE
make clean
SEE ALSO
install
-- Install dependencies.
check
Run linting, formatting, security, and type checks.
This script uses the following tools:
ruff
-- check for linting issues and code formatting.
bandit
-- check for security issues.
mypy
-- check for type issues.
USAGE
make check
SEE ALSO
format
-- Apply formatting.
format
Apply formatting.
This script will use ruff to automatically fix linting issues and format the code.
USAGE
make format
SEE ALSO
check
-- Run linting, formatting, security, and type checks.
test
Run tests.
USAGE
make test
SEE ALSO
test_coverage
-- Run tests with coverage reporting.
test_coverage_html
-- Run tests and generate HTML coverage report.
test_coverage
Run tests with coverage reporting.
USAGE
make test_coverage
SEE ALSO
test
-- Run tests.
test_coverage_html
-- Run tests and generate HTML coverage report.
test_coverage_html
Run tests and generate HTML coverage report.
USAGE
make test_coverage_html
SEE ALSO
test
-- Run tests.
test_coverage
-- Run tests with coverage reporting.
run_metric_reporter
Run the Test Metric Reporter.
USAGE
make run_metric_reporter
Developer Setup
Below are step-by-step instructions for setting up a development environment so you can contribute to and run the ecosystem test scripts.
1. Clone the ecosystem-test-scripts repository
The ecosystem test scripts are hosted on the Mozilla GitHub and can be cloned using the method of your choice (see GitHub's Cloning a Repository documentation). Contributors should follow the repository's Contributing Guidelines and Community Participation Guidelines.
2. Copy the Metric Reporter Service Account JSON Key
The metric_reporter script is set up to use the ecosystem-test-eng GCP project with the metric-reporter service account. In order to execute the script, a key for the service account, in the form of a JSON file, needs to be copied from the 1Password Ecosystem Test Engineering Team Vault into the root directory of the ecosystem-test-scripts project.
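Scripts that talk to GCP typically load this key at runtime. As a rough sketch (assuming the google-auth package; the filename below is a placeholder for the key file copied from 1Password):

```python
# Sketch: loading the service account key from the project root with google-auth.
# The filename is a placeholder; use the actual key file copied from 1Password.
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "metric-reporter-key.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
print(credentials.service_account_email)
```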
3. Set up the config.ini
All settings for the ecosystem-test-scripts are defined in the config.ini file. To set up a local config.ini file, make a copy of the config.ini.sample file found in the root directory of the ecosystem-test-scripts project and rename it to config.ini.
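As a rough sketch of how a script can consume these settings (the section and option names below are placeholders; the real ones are listed in config.ini.sample), Python's standard configparser is enough to read the file:

```python
# Sketch: reading settings from config.ini with the standard library.
# Section and option names are placeholders; see config.ini.sample for the real ones.
from configparser import ConfigParser

config = ConfigParser()
config.read("config.ini")

gcs_bucket = config.get("metric_reporter", "gcs_bucket", fallback="")
print(f"Configured GCS bucket: {gcs_bucket}")
```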
4. Set up the python virtual environment
This project uses Poetry for dependency management in conjunction with a pyproject.toml file. While you can use virtualenv to set up the dev environment, it is recommended to use pyenv and pyenv-virtualenv, as they work nicely with Poetry. Once Poetry is installed, dependencies can be installed using the following Make command from the root directory:
make install
For more information on Make commands, run:
make help
5. Start Developing!
Metric Interpretation Guide
The purpose of this document is to help individuals understand and interpret the metrics that represent the health of their test suites and ensure that tests contribute to rapid development and high product quality. Test metrics provide insights into test performance, helping teams address potential issues early and monitor improvement efforts. While these metrics offer valuable data about the health of test suites, they do not necessarily measure the effectiveness of the test cases themselves.
Test Suite Size & Success Rates
Supported Test Frameworks: jest, mocha, nextest, playwright, pytest, tap
Test Suite Size
Test suite size refers to the number of tests in a suite and serves as a control measure. Unexplained changes, such as sudden growth or shrinkage, may indicate test runner issues or attempts to manipulate the scope or quality of the suite. Test suite size should correspond to the state of the product. For example, a product under active development should show a gradual increase in the size of its test suite, while a product in maintenance should exhibit more stable trends.
Success Rates
Success rates provide a quick indication of the test suite's health. A low success rate or high failure rate signals potential quality issues, either in the product or within the test suite itself. This metric can be tracked on both a test-by-test basis and for the entire test suite.
Average Success Rates
To avoid noise from isolated failures and spot trends more easily, it's helpful to calculate averaged success rates over time. This allows teams to act early if trends toward failure begin to emerge, preventing the test suite from becoming unreliable or mistrusted. Success rate averages are calculated as:
100 x (Successful Runs / (All Runs - Cancelled Runs))
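For example, a suite with 188 successful runs out of 200 total runs, 5 of which were cancelled, averages 100 x (188 / (200 - 5)) ≈ 96.4%. A minimal sketch of the calculation:

```python
def average_success_rate(successful_runs: int, all_runs: int, cancelled_runs: int) -> float:
    """Average success rate as a percentage, excluding cancelled runs."""
    counted_runs = all_runs - cancelled_runs
    if counted_runs <= 0:
        return 0.0
    return 100 * successful_runs / counted_runs

print(round(average_success_rate(188, 200, 5), 1))  # 96.4
```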
These averages can be calculated over 30-day, 60-day, and 90-day periods, with the 90-day trend being preferred. Average success rates are typically interpreted as follows, though teams may adjust thresholds based on their specific needs:
Threshold | Interpretation |
---|---|
>= 95% | Healthy - Tests pass the majority of the time |
90% - 95% | Caution - Tests show signs of instability, requiring investigation |
< 90% | Critical - Tests are faulty and need intervention |
Time Measurements
Supported Test Frameworks: jest, mocha, nextest, playwright, pytest
Time measurements track how long it takes for tests to run. Ideally, these times should be proportional to the size of the test suite and remain stable over time. Significant increases or variations in execution time may indicate performance issues or inefficiencies within the test suite. Monitoring execution times allows teams to identify and address bottlenecks to keep test suites efficient.
Run Time
The cumulative time of all test runs in a suite.
Execution Time
The total time taken for the test suite to execute. If tests are not run in parallel, the execution time should match the run time. Execution time thresholds are typically interpreted as follows:
Threshold | Interpretation |
---|---|
> 5m | Slow - The test suite may require optimization |
<= 5m | Fast - The test suite runs within an acceptable time frame |
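Because test results are reported in JUnit XML (see the Project Onboarding Procedure), run time can be approximated by summing per-test durations from a results file. A rough sketch, assuming the common JUnit XML layout (exact attributes vary by framework):

```python
# Sketch: estimating run time by summing per-test durations from a JUnit XML report.
import xml.etree.ElementTree as ET

def run_time_seconds(junit_xml_path: str) -> float:
    root = ET.parse(junit_xml_path).getroot()
    return sum(float(case.get("time") or 0.0) for case in root.iter("testcase"))

# Execution time, by contrast, is the wall-clock time of the whole suite; with
# parallel workers it can be much shorter than the summed run time.
print(run_time_seconds("results.xml"))
```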
Coverage Metrics
Supported Coverage Frameworks: pytest-cov, llvm-cov
Coverage metrics measure the percentage of the codebase covered by tests. They help identify untested areas of the code, allowing teams to determine whether critical paths are adequately covered.
While high coverage percentages are generally good, they don’t always guarantee that the tests are meaningful. The quality and relevance of tests should be balanced with coverage goals. The following thresholds for line coverage provide general guidance but can be adjusted according to project needs:
Threshold | Interpretation |
---|---|
>= 95% | High - Potential for metric gaming or diminishing returns. * For teams using pytest-cov, the line excluded measure may offer further insight into coverage gaming. |
80% - 95% | Good - Suitable for high-risk or high-incident projects |
60% - 79% | Acceptable - Suitable for low-risk or low-incident projects |
< 60% | Low - Coverage should be improved |
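For teams using pytest-cov, which can emit coverage.py's JSON report, line coverage and excluded-line counts can be read from the report totals. A rough sketch, assuming the standard coverage.py JSON layout:

```python
# Sketch: reading line coverage from a coverage.py JSON report (e.g. produced by
# pytest-cov with a JSON coverage report). Key names assume the standard layout.
import json

with open("coverage.json") as report_file:
    totals = json.load(report_file)["totals"]

print(f"Line coverage: {totals['percent_covered']:.1f}%")
print(f"Excluded lines: {totals.get('excluded_lines', 0)}")
```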
Skip Rates
Supported Test Frameworks: jest, mocha, nextest, playwright, pytest
Skip rates indicate how often tests are temporarily excluded from execution. While skipping tests can be a necessary short-term solution to prevent flaky tests from disrupting workflows, high or sustained skip rates can signal deeper issues with the test suite's sustainability.
Long-term skips may indicate that tests have fallen into disrepair, and an increasing skip rate can point to team capacity or prioritization problems. Monitoring skip rates ensures that skipped tests are revisited and resolved promptly. Thresholds for skip rates are typically interpreted as follows:
Threshold | Interpretation |
---|---|
> 2% | Critical - Test coverage is compromised, requiring immediate intervention |
1% - 2% | Caution - Test coverage is at risk, and the suite may become prone to silent failures |
<= 1% | Healthy - Most of the test suite is running, ensuring comprehensive coverage |
Note: Playwright offers both skip and fixme annotations, allowing for further refinement of this metric.
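As a rough sketch, a suite's skip rate can be derived from the counters on JUnit XML testsuite elements (attribute names assume the common JUnit XML layout):

```python
# Sketch: computing a suite's skip rate from JUnit XML testsuite counters.
import xml.etree.ElementTree as ET

def skip_rate_percent(junit_xml_path: str) -> float:
    root = ET.parse(junit_xml_path).getroot()
    tests = skipped = 0
    for suite in root.iter("testsuite"):
        tests += int(suite.get("tests") or 0)
        skipped += int(suite.get("skipped") or 0)
    return 100 * skipped / tests if tests else 0.0

print(f"{skip_rate_percent('results.xml'):.2f}%")
```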
Retry Rates
Supported Test Frameworks: playwright
Retry rates track how often tests are re-executed following a failure. While retries can help address transient issues, such as network errors, elevated retry rates may indicate flakiness in the test suite or performance regressions in the product. High retry rates can increase execution times and negatively impact developer workflows. Monitoring retry rates helps teams identify and fix unstable tests, ensuring predictable test execution.
TestRail Metric Interpretation Guide
This document explains the TestRail test case metrics dashboard. The dashboard tracks the overall trend of test case automation within a project and helps identify test cases that are suitable for automation.
Test Case Automation Statuses
A test case can have one of five automation statuses: Suitable, Completed, Disabled, Unsuitable, or Untriaged.
Suitable
This test case is suitable and can be prioritized for automation. These cases have been evaluated by Test Engineering, SoftVision or project engineers as automatable.
Completed
This test case has been automated successfully and has been reviewed by the interested parties as well as Test Engineering.
Disabled
This test case is disabled from both manual and automated test runs. These cases should be reviewed periodically by the engineering team to evaluate whether they can be moved out of the Disabled status.
Unsuitable
This test case cannot be automated due to its complex nature. These cases should be reviewed periodically by the engineering team to evaluate whether they can be moved out of the Unsuitable status.
Untriaged
This test case has not been evaluated for automation. This is the default status for test cases added to TestRail and should be reviewed as soon as possible.
Automation Status Trend
This view tracks how test case automation statuses are updated in TestRail over time. As test cases are marked with one of the statuses above, the changes appear in this view, giving an overview of how automation work for the tracked test suites is progressing.
Test Case Coverage
This view tracks the overall percentages of the 5 statuses listed above.
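As a rough sketch of the underlying arithmetic (the case list below is hypothetical; the real dashboard is driven by TestRail data), each percentage is simply a status's share of all tracked test cases:

```python
# Sketch: each status's share of all tracked test cases. The case list is hypothetical.
from collections import Counter

statuses = ["Completed", "Completed", "Suitable", "Untriaged", "Unsuitable", "Completed"]
counts = Counter(statuses)
total = len(statuses)

for status, count in counts.most_common():
    print(f"{status}: {100 * count / total:.1f}%")
```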
Metric Update Procedure
Please follow the steps below to update the ETE Looker Dashboards. The dashboards are typically updated on Monday mornings (North America ET/PT) to ensure the values are available for team check-in meetings. Updating the dashboards can take up to an hour due to network latency when parsing files stored on GCS.
Prerequisites
Before updating the test metrics, ensure that:
- Your development environment is set up with the proper permissions (see the Developer Setup).
- You are on the latest version of the main branch.
- Your config.ini file in the root directory is up-to-date.
1. Push updates to BigQuery
To push data to BigQuery with the latest test results and test coverages, execute the following command from the ecosystem-test-scripts root directory:
make run_metric_reporter
Notes:
- Coverage results are produced only for Autopush-rs unit tests and Merino-py unit and integration tests.
Project Onboarding Procedure
Below are step-by-step instructions on how to onboard a project to report test metrics.
Prerequisites
To report test metrics for a repository, contributors must ensure the following requirements are met:
- Test results must be in JUnit XML format.
- Coverage results must be in JSON format.
- Supported test frameworks are listed in the Metric Interpretation Guide. If a test framework is not listed, contact the ETE team to look into adding support.
1. Set up CICD to push test result and coverage data to GCS
- GCP Admin Requirement: To create a cloud bucket and a service user with access to the bucket in the ecosystem-test-eng GCP project, administrative permissions are required. Contact the ETE team.
- Create A Cloud Bucket (ETE Team): Create a directory for the repository in the ecosystem-test-eng-metrics GCS bucket.
- Set up a Service User (ETE Team):
  - If test artifacts are being pushed by GitHub Actions:
    - Set up a service account for the project with Storage Object Creator and Storage Object Viewer permissions
    - The service account name should be {repository}-github
    - Allow the GitHub repo to use this service account:
      - On the Service accounts page, click your service account
      - Go to the PERMISSIONS tab
      - Under VIEW BY PRINCIPALS, click GRANT ACCESS
      - Add this principal: principalSet://iam.googleapis.com/projects/324168772199/locations/global/workloadIdentityPools/github-actions/attribute.repository/OWNER/REPOSITORY
      - Assign this role: Workload Identity User
    - Configs are expected to use the Google Cloud Authentication GitHub Action with the service_account set to the above and the workload_identity_provider set to ${{ vars.GCPV2_GITHUB_WORKLOAD_IDENTITY_PROVIDER }}
  - If test artifacts are being pushed by CircleCI:
    - Set up a service account for the project with Storage Object Creator and Storage Object Viewer permissions
    - The service account name should be {repository}-circleci
    - Create a JSON key
    - Store the credentials in the ETE 1Password vault
    - Configs are expected to use the CircleCI gcp-cli orb
    - In CircleCI under 'Project Settings > Environment Variables':
      - Option 1: Set the default environment variables for the orb: GCLOUD_SERVICE_KEY with the JSON key contents and GOOGLE_PROJECT_ID set to ecosystem-test-eng
      - Option 2: If the default environment variables are already in use, override the gcp-cli/setup variables with the following environment variables: ETE_GCLOUD_SERVICE_KEY set to the JSON key contents and ETE_GOOGLE_PROJECT_ID set to ecosystem-test-eng
- Modify CICD:
  - Update project CICD jobs to push Coverage JSON files and JUnit XML files to the GCS repository directory, under coverage and junit subdirectories respectively.
  - Coverage JSON files must follow a strict naming convention (see the composition sketch after this list): {job_number}__{utc_epoch_datetime}__{repository}__{workflow}__{test_suite}__coverage.json
    - Example: 15592__1724283071__autopush-rs__build-test-deploy__integration__coverage.json
  - JUnit XML files must follow a strict naming convention: {job_number}__{utc_epoch_datetime}__{repository}__{workflow}__{test_suite}__results{-index}.xml
    - Example: 15592__1724283071__autopush-rs__build-test-deploy__integration__results.xml
    - The index is optional and can be used in cases of parallel test execution
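A rough sketch of composing artifact names that follow these conventions (the helper function and example values are illustrative only, not part of any project's CICD configuration):

```python
# Sketch: composing GCS artifact names that follow the naming conventions above.
# The example values are illustrative; real values come from the CICD environment.
import time

def artifact_name(job_number: int, repository: str, workflow: str,
                  test_suite: str, kind: str, index: str = "") -> str:
    epoch = int(time.time())  # UTC epoch seconds
    suffix = "coverage.json" if kind == "coverage" else f"results{index}.xml"
    return f"{job_number}__{epoch}__{repository}__{workflow}__{test_suite}__{suffix}"

print(artifact_name(15592, "autopush-rs", "build-test-deploy", "integration", "coverage"))
print(artifact_name(15592, "autopush-rs", "build-test-deploy", "integration", "results", "-1"))
```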
2. Create and Populate Tables in the ETE BigQuery Dataset
- GCP Admin Requirement: To create and populate tables in the ecosystem-test-eng GCP project, administrative permissions are required. Contact the ETE team.
- Create Tables (ETE Team):
  - For test results, create one empty table named {project_name}_results.
  - For coverage results, create one empty table named {project_name}_coverage.
  - These tables should be created in the test_metrics dataset of the ETE BigQuery instance. Reference the official documentation to create empty tables with schema definitions. Schemas can be copied from existing project tables (see the sketch after this list for a programmatic alternative).
- Populate Tables (ETE Team):
  - Execute the following command to populate the tables with data:
    make run_metric_reporter
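As a rough sketch of creating an empty table programmatically with the google-cloud-bigquery client (the table name and schema fields below are placeholders; copy the real schema from an existing project table):

```python
# Sketch: creating an empty results table in the test_metrics dataset.
# The table name and schema fields are placeholders; copy the real schema from an
# existing project table.
from google.cloud import bigquery

client = bigquery.Client(project="ecosystem-test-eng")
table_id = "ecosystem-test-eng.test_metrics.myproject_results"

schema = [
    bigquery.SchemaField("repository", "STRING"),
    bigquery.SchemaField("workflow", "STRING"),
    bigquery.SchemaField("test_suite", "STRING"),
    bigquery.SchemaField("status", "STRING"),
]
client.create_table(bigquery.Table(table_id, schema=schema))
```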
3. Create a Looker Dashboard
- License Requirement: A developer license is required to create and edit dashboards in Looker. Instructions for obtaining a license and resources for learning Looker are available on the Mozilla Confluence. Additional help can be found in the #data-help and #looker-platform-discussion Slack channels.
- Update Looker Project: Update the ecosystem-test-eng Looker project model and add the required views for the new project test data. Related repository: looker-ecosystem-test-eng.
- Create Dashboard: Create a new dashboard, populate it with looks for the new project data, and add it to the ETE Test Metrics Board.
Contributors
The ecosystem-test-scripts repository is owned by the Ecosystem Test Engineering Team.
See https://github.com/mozilla/ecosystem-test-scripts/blob/main/CONTRIBUTING.md for contribution guidelines.