Enabling data to be ingested by the data platform
This page provides a step-by-step guide on how to enable data from your product to be ingested by the data platform.
This is just one of the required steps for integrating Glean successfully into a product. Check the full Glean integration checklist for a comprehensive list of all the steps involved in doing so.
Requirements
- GitHub Workflows
Add your product to probe scraper
At least one week before releasing your product, file a data engineering bug to enable your product's application id.
This will result in your product being added to probe scraper's repositories.yaml.
Validate and publish metrics
After your product has been enabled, you must submit commits to probe scraper to validate and publish metrics.
Metrics will only be published from branches defined in probe scraper's repositories.yaml, or the Git default branch if not explicitly configured.
Commits should be submitted on every CI run on the specified branches.
Nightly jobs will then automatically add published metrics to the Glean Dictionary and other data platform tools.
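The branch rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not probe scraper's actual implementation; the function names and the idea of passing the configured branches as a list are assumptions made for this example.

```python
from typing import List, Sequence


def publishing_branches(configured: Sequence[str], default_branch: str) -> List[str]:
    """Return the branches from which metrics would be published.

    `configured` stands in for the branches listed for a product in
    repositories.yaml (hypothetical representation). If none are
    configured, the Git default branch is used instead.
    """
    return list(configured) if configured else [default_branch]


def should_publish(pushed_branch: str, configured: Sequence[str], default_branch: str) -> bool:
    """Decide whether a CI run on `pushed_branch` would publish metrics."""
    return pushed_branch in publishing_branches(configured, default_branch)
```

Under these assumptions, a push to a feature branch is still validated by CI but does not publish anything, while a push to a configured branch (or to the default branch when nothing is configured) does.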
Enable the GitHub Workflow by creating a new file .github/workflows/glean-probe-scraper.yml with the following content:
---
name: Glean probe-scraper
on: [push, pull_request]
jobs:
  glean-probe-scraper:
    uses: mozilla/probe-scraper/.github/workflows/glean.yaml@main
Add your library to probe scraper
At least one week before releasing your product, file a data engineering bug to add your library to probe scraper so that it is scraped for metrics as a dependency of another product.
This will result in your library being added to probe scraper's repositories.yaml.