Schema includes
The YAML tags listed below can be used in schema.yaml files to include content from:
- stable table schemas (i.e. telemetry tables in
*_stabledatasets) schema.yamlfiles for tables or views- arbitrary YAML files (e.g.
bigquery_etl/schema/global.yaml)
Recommendations
- Be mindful to arrange the includes so the resulting rendered
schema.yamlfile matches the actual schema of the query when it runs. - Have the includes mimic the structure of the associated query:
- If the query passes through all fields from an upstream table, then use
!include-fieldswith no field parameter. - If the query selects specific fields from an upstream table, then use
!include-fieldswith thefield_namesparameter specifying fields in the same order the query selects them. - If the query selects most fields from an upstream table with
SELECT * EXCEPT (...), then use!include-fieldswith theexclude_field_namesparameter. - If the query replaces some fields from an upstream table with
SELECT * REPLACE (...), then use!include-fieldswith thefield_replacementsparameter.
- If the query passes through all fields from an upstream table, then use
- Avoid using the
bqetl query schema updatecommand onschema.yamlfiles with includes, as that will overwrite the includes.
YAML tags
!include-field
Includes a field from the specified table/view or schema YAML file.
Parameters
- Either
tableorfileis required:table: Fully qualified ID of the table/view to include from (must have aschema.yamlfile or be a stable table).file: File path of the schema YAML file to include from (relative to the root of the repository).
field: Field path of the field to include.new_type: Optional type value to use in place of the included field's original type.new_mode: Optional mode value to use in place of the included field's original mode.new_description: Optional description to use in place of the included field's original description.append_description: Optional text to append to the included field's description.prepend_description: Optional text to prepend to the included field's description.new_fields: Optional list of field definitions to use in place of the included field's original list of subfields.- Tip: include tags can be also used in this list.
Examples
# Include a top-level column from an ETL table.
fields:
- !include-field
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
field: default_search_engine
# Include a top-level column from an ETL table, but override its type.
fields:
- !include-field
table: moz-fx-data-shared-prod.stripe_external.product_v1
field: metadata
new_type: JSON
# Include a top-level column from an ETL table, but override its description.
fields:
- !include-field
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
field: country_code
new_description: What this actually means is...
# Include a top-level column from an ETL table, and append extra text to the description.
fields:
- !include-field
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
field: default_search_engine
append_description: And one more thing...
# Include a top-level column from a stable table, but override its fields.
fields:
- !include-field
table: moz-fx-data-shared-prod.firefox_desktop_stable.metrics_v1
field: client_info
new_fields:
- name: client_id_hash
type: STRING
# Include a nested field from a stable table.
fields:
- !include-field
table: moz-fx-data-shared-prod.firefox_desktop_stable.metrics_v1
field: client_info.client_id
# Include a field from a YAML file.
fields:
- !include-field
file: /bigquery_etl/schema/global.yaml
field: app_build_id
!include-fields
Includes fields from the specified table/view, schema YAML file, or struct field in the table/view/file.
If the included fields are being inserted into part of a larger list, then the !flatten-lists tag will also need to be used (see examples below).
Parameters
- Either
tableorfileis required:table: Fully qualified ID of the table/view to include from (must have aschema.yamlfile or be a stable table).file: File path of the schema YAML file to include from (relative to the root of the repository).
parent_field: Optional field path of the struct parent field to include fields from.field_names: Optional list of fields to include (either top-level columns, or nested fields ifparent_fieldis specified).exclude_field_names: Optional list of fields to exclude (either top-level columns, or nested fields ifparent_fieldis specified).field_replacements: Optional list of field definitions that will be used in place of the associated field definitions found in the include (matched by field name).- Tip: include tags can be also used in this list.
Examples
# Include all top-level columns from an ETL table.
fields: !include-fields
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
# Include all top-level columns from an ETL table alongside additional fields.
# Note the !flatten-lists tag to flatten the nested list created by using !include-fields in a list item.
fields: !flatten-lists
- name: first_seen_date
type: DATE
mode: NULLABLE
description: The date when the telemetry ping was first received on the server side.
- !include-fields
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
# Include most top-level columns from an ETL table.
fields: !include-fields
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
exclude_field_names:
- client_id
- sample_id
# Include specific top-level columns from an ETL table.
fields: !include-fields
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
field_names:
- submission_date
- normalized_channel
# Include all top-level columns from an ETL table, but replace a string field with a JSON version.
fields: !include-fields
table: moz-fx-data-shared-prod.stripe_external.product_v1
field_replacements:
- name: metadata
type: JSON
mode: NULLABLE
description: Set of key-value pairs attached to the product, stored as a JSON object.
# Include all nested fields in a struct from a stable table.
fields: !include-fields
table: moz-fx-data-shared-prod.firefox_desktop_stable.metrics_v1
parent_field: client_info
# Include specific nested fields in a struct from a stable table.
fields: !include-fields
table: moz-fx-data-shared-prod.firefox_desktop_stable.metrics_v1
parent_field: client_info
field_names:
- app_build
- app_channel
# Include specific fields from a YAML file.
fields: !include-fields
file: /bigquery_etl/schema/global.yaml
field_names:
- app_build_id
- app_name
!include-field-description
Includes a field description from the specified table/view or schema YAML file.
Parameters
- Either
tableorfileis required:table: Fully qualified ID of the table/view to include from (must have aschema.yamlfile or be a stable table).file: File path of the schema YAML file to include from (relative to the root of the repository).
field: Field path of the field to include the description from.append: Optional text to append to the description.prepend: Optional text to prepend to the description.
Examples
# Include a top-level column description from an ETL table.
fields:
- name: channel
type: STRING
mode: NULLABLE
description: !include-field-description
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
field: normalized_channel
# Include a top-level column description from an ETL table, and append extra text to the description.
fields:
- name: channel
type: STRING
mode: NULLABLE
description: !include-field-description
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
field: normalized_channel
append: And one more thing...
# Include a nested field description from a stable table.
fields:
- name: client_locale
type: STRING
mode: NULLABLE
description: !include-field-description
table: moz-fx-data-shared-prod.firefox_desktop_stable.metrics_v1
field: client_info.locale
# Include a field description from a YAML file.
fields:
- name: version
type: STRING
mode: NULLABLE
description: !include-field-description
file: /bigquery_etl/schema/global.yaml
field: app_version
!include
Includes data from a YAML file.
Parameters
file: File path of the YAML file to include from (relative to the root of the repository).jmespath: Optional JMESPath expression to select the data.
Examples
# Include an entire YAML file.
!include
file: /sql/moz-fx-data-shared-prod/firefox_desktop_derived/metrics_clients_daily_v1/schema.yaml
# Include a specific field from a `schema.yaml` file.
fields:
- !include
file: /sql/moz-fx-data-shared-prod/firefox_desktop_derived/metrics_clients_daily_v1/schema.yaml
jmespath: fields[?name == 'default_search_engine'] | [0]
# Include a specific subset of fields from a `schema.yaml` file.
fields: !include
file: /sql/moz-fx-data-shared-prod/firefox_desktop_derived/metrics_clients_daily_v1/schema.yaml
jmespath: fields[?contains(['submission_date', 'normalized_channel'], name)]
!flatten-lists
When applied to a list it will flatten any directly nested lists, like those created by using !include-fields in a list item.
Examples
# Include all top-level columns from an ETL table alongside additional fields.
fields: !flatten-lists
- name: first_seen_date
type: DATE
mode: NULLABLE
description: The date when the telemetry ping was first received on the server side.
- !include-fields
table: moz-fx-data-shared-prod.firefox_desktop_derived.metrics_clients_daily_v1
Rendering schema.yaml files with includes
The bqetl query schema render command can render schema.yaml files for queries and views, resolving all includes, and printing to the console by default.
The GitHub Actions jobs use the bqetl query schema render command to resolve all schema.yaml includes before doing diffs or publishing files to the generated-sql branch.