User-Defined Functions

Persistent user-defined functions

Source Directory

active_n_weeks_ago

Active N Weeks Ago

udf.active_n_weeks_ago

Source Directory | Metadata File


active_values_from_days_seen_map

Active Values From Days Seen Map

udf.active_values_from_days_seen_map

Given a map representing activity for STRING keys, this function returns an array of the keys that were active for the time period in question. start_offset should be at most 0. n_bits should be at most the remaining bits.

Source Directory | Metadata File


add_monthly_engine_searches

Add Monthly Engine Searches

udf.add_monthly_engine_searches

This function specifically windows searches into calendar-month windows. This means groups are not necessarily directly comparable, since different months have different numbers of days. On the first of each month, a new month is appended and the first month is dropped. On any other day, the new entry is added to the last element in the array. For example, if we were adding 12 to [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]:

  • On the first of the month, the result would be [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 12]
  • On any other day of the month, the result would be [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 24]

This happens for every aggregate (searches, ad clicks, etc.).

Source Directory | Metadata File


add_monthly_searches

Add Monthly Searches

udf.add_monthly_searches

Adds together two engine searches structs. Each engine searches struct has a MAP[engine -> search_counts_struct]. We want to add together the previous and current values for a given engine. This allows us to be flexible with the number of engines we're using.

Source Directory | Metadata File


add_searches_by_index

Add Searches By Index

udf.add_searches_by_index

Return sums of each search type grouped by the index. Results are ordered by index.

Source Directory | Metadata File


aggregate_active_addons

Aggregate Active Addons

udf.aggregate_active_addons

This function selects the most frequently occurring value for each addon_id, using the latest value in the input among ties. The type for active_addons is ARRAY<STRUCT<...>>, i.e. the output of SELECT ARRAY_CONCAT_AGG(active_addons) FROM telemetry.main_summary_v4, and is left unspecified to allow changes to the fields of the STRUCT.

Source Directory | Metadata File


aggregate_map_first

Aggregate Map First

udf.aggregate_map_first

Returns an aggregated map with all the keys and the first corresponding value from the given maps

Source Directory | Metadata File


aggregate_search_counts

Aggregate Search Counts

udf.aggregate_search_counts

Source Directory | Metadata File


aggregate_search_map

Aggregate Search Map

udf.aggregate_search_map

Aggregates the total counts of the given search counters

Source Directory | Metadata File


array_11_zeroes_then

Array 11 Zeroes Then

udf.array_11_zeroes_then

An array of 11 zeroes, followed by a supplied value

Source Directory | Metadata File


array_drop_first_and_append

Array Drop First And Append

udf.array_drop_first_and_append

Drop the first element of an array, and append the given element. Result is an array with the same length as the input.

Source Directory | Metadata File


array_of_12_zeroes

Array Of 12 Zeroes

udf.array_of_12_zeroes

An array of 12 zeroes

Source Directory | Metadata File


array_slice

Array Slice

udf.array_slice

Source Directory | Metadata File


bitcount_lowest_7

Bitcount Lowest 7

udf.bitcount_lowest_7

This function counts the 1s in lowest 7 bits of an INT64

Source Directory | Metadata File


bitmask_365

Bitmask 365

udf.bitmask_365

A bitmask for 365 bits

Source Directory | Metadata File


bitmask_lowest_28

Bitmask Lowest 28

udf.bitmask_lowest_28

Source Directory | Metadata File


bitmask_lowest_7

Bitmask Lowest 7

udf.bitmask_lowest_7

Source Directory | Metadata File


bitmask_range

Bitmask Range

udf.bitmask_range

Returns a bitmask that can be used to return a subset of an integer representing a bit array. The start_ordinal argument is an integer specifying the starting position of the slice, with start_ordinal = 1 indicating the first bit. The length argument is the number of bits to include in the mask. The arguments were chosen to match the semantics of the SUBSTR function; see https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#substr
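
As a worked illustration only (not the shipped definition), such a mask can be built by creating a run of length one-bits and shifting it into place; here start_ordinal is assumed to count from the lowest-order bit:

    SELECT
      ((1 << 7) - 1) << (1 - 1) AS mask_bits_1_to_7,  -- 0b1111111 = 127, the lowest 7 bits
      ((1 << 4) - 1) << (3 - 1) AS mask_bits_3_to_6;  -- 0b0111100 = 60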

Source Directory | Metadata File


bits28_active_in_range

Bits28 Active In Range

udf.bits28_active_in_range

Source Directory | Metadata File


bits28_days_since_seen

Bits28 Days Since Seen

udf.bits28_days_since_seen

Source Directory | Metadata File


bits28_from_string

Bits28 From String

udf.bits28_from_string

Source Directory | Metadata File


bits28_range

Bits28 Range

udf.bits28_range

Source Directory | Metadata File


bits28_retention

Bits28 Retention

udf.bits28_retention

Source Directory | Metadata File


bits28_to_dates

Bits28 To Dates

udf.bits28_to_dates

Source Directory | Metadata File


bits28_to_string

Bits28 To String

udf.bits28_to_string

Source Directory | Metadata File


bits_from_offsets

Bits From Offsets

udf.bits_from_offsets

Returns a bit pattern of type BYTES compactly encoding the given array of positive integer offsets. This is primarily useful to generate a compact encoding of dates on which a feature was used, with arbitrarily long history.

Example aggregation:

    bits_from_offsets(
      ARRAY_AGG(IF(foo, DATE_DIFF(anchor_date, submission_date, DAY), NULL) IGNORE NULLS)
    )

The resulting value can be cast to an INT64 representing the most recent 64 days via:

    CAST(CONCAT('0x', TO_HEX(RIGHT(bits >> i, 8))) AS INT64)

Or representing the most recent 28 days (compatible with bits28 functions) via:

    CAST(CONCAT('0x', TO_HEX(RIGHT(bits >> i, 4))) AS INT64) << 36 >> 36

Source Directory | Metadata File


bits_to_active_n_weeks_ago

Bits To Active N Weeks Ago

udf.bits_to_active_n_weeks_ago

Given a BYTE and an INT64, return whether the user was active that many weeks ago. NULL input returns NULL output.

Source Directory | Metadata File


bits_to_days_seen

Bits To Days Seen

udf.bits_to_days_seen

Given a BYTE, get the number of days the user was seen. NULL input returns NULL output.

Source Directory | Metadata File


bits_to_days_since_first_seen

Bits To Days Since First Seen

udf.bits_to_days_since_first_seen

Given a BYTES, return the number of days since the client was first seen. If no bits are set, returns NULL, indicating we don't know. Otherwise the result is 0-indexed, meaning that for \x01, it will return 0. Results showed this being between 5-10x faster than the simpler alternative:

    CREATE OR REPLACE FUNCTION udf.bits_to_days_since_first_seen(b BYTES) AS ((
      SELECT MAX(n)
      FROM UNNEST(GENERATE_ARRAY(0, 8 * BYTE_LENGTH(b))) AS n
      WHERE BIT_COUNT(SUBSTR(b >> n, -1) & b'\x01') > 0));

See also: bits_to_days_since_seen.sql

Source Directory | Metadata File


bits_to_days_since_seen

Bits To Days Since Seen

udf.bits_to_days_since_seen

Given a BYTES, return the number of days since the client was last seen. If no bits are set, returns NULL, indicating we don't know. Otherwise the results are 0-indexed, meaning \x01 will return 0. Tests showed this being 5-10x faster than the simpler alternative:

    CREATE OR REPLACE FUNCTION udf.bits_to_days_since_seen(b BYTES) AS ((
      SELECT MIN(n)
      FROM UNNEST(GENERATE_ARRAY(0, 364)) AS n
      WHERE BIT_COUNT(SUBSTR(b >> n, -1) & b'\x01') > 0));

See also: bits_to_days_since_first_seen.sql

Source Directory | Metadata File


bool_to_365_bits

Bool To 365 Bits

udf.bool_to_365_bits

Convert a boolean to a 365-bit byte array

Source Directory | Metadata File


boolean_histogram_to_boolean

Boolean Histogram To Boolean

udf.boolean_histogram_to_boolean

Given histogram h, return TRUE if it has a value in the "true" bucket, or FALSE if it has a value in the "false" bucket, or NULL otherwise. https://github.com/mozilla/telemetry-batch-view/blob/ea0733c/src/main/scala/com/mozilla/telemetry/utils/MainPing.scala#L309-L317

Source Directory | Metadata File


coalesce_adjacent_days_28_bits

Coalesce Adjacent Days 28 Bits

udf.coalesce_adjacent_days_28_bits

We generally want to believe only the first reasonable profile creation date that we receive from a client. Given bits representing usage from the previous day and the current day, this function shifts the first argument by one day and returns either that value if non-zero and non-null, the current day value if non-zero and non-null, or else 0.
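
A minimal sketch of that selection logic, where prev_bits and curr_bits are placeholder columns holding the previous and current day's bit patterns (an illustration, not necessarily the shipped SQL):

    COALESCE(
      NULLIF(udf.shift_28_bits_one_day(prev_bits), 0),  -- yesterday's pattern, rebased to today
      NULLIF(curr_bits, 0),                             -- otherwise today's pattern
      0                                                 -- otherwise zero
    )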

Source Directory | Metadata File


coalesce_adjacent_days_365_bits

Coalesce Adjacent Days 365 Bits

udf.coalesce_adjacent_days_365_bits

Coalesce previous data's PCD with the new data's PCD. We generally want to believe only the first reasonable profile creation date that we receive from a client. Given bytes representing usage from the previous day and the current day, this function shifts the first argument by one day and returns either that value if non-zero and non-null, the current day value if non-zero and non-null, or else 0.

Source Directory | Metadata File


combine_adjacent_days_28_bits

Combine Adjacent Days 28 Bits

udf.combine_adjacent_days_28_bits

Combines two bit patterns. The first pattern represents activity over a 28-day period ending "yesterday". The second pattern represents activity as observed today (usually just 0 or 1). We shift the bits in the first pattern by one to set the new baseline as "today", then perform a bitwise OR of the two patterns.
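
A minimal sketch of that combination, with prev_bits and curr_bits as placeholder columns (an illustration, not necessarily the shipped SQL):

    udf.shift_28_bits_one_day(prev_bits) | IFNULL(curr_bits, 0)  -- rebase yesterday's window, then OR in today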

Source Directory | Metadata File


combine_adjacent_days_365_bits

Combine Adjacent Days 365 Bits

udf.combine_adjacent_days_365_bits

Source Directory | Metadata File


combine_days_seen_maps

Combine Days Seen Maps

udf.combine_days_seen_maps

The "clients_last_seen" class of tables represent various types of client activity within a 28-day window as bit patterns. This function takes in two arrays of structs (aka maps) where each entry gives the bit pattern for days in which we saw a ping for a given user in a given key. We combine the bit patterns for the previous day and the current day, returning a single map. See udf.combine_experiment_days for a more specific example of this approach.

Source Directory | Metadata File


combine_experiment_days

Combine Experiment Days

udf.combine_experiment_days

The "clients_last_seen" class of tables represent various types of client activity within a 28-day window as bit patterns. This function takes in two arrays of structs where each entry gives the bit pattern for days in which we saw a ping for a given user in a given experiment. We combine the bit patterns for the previous day and the current day, returning a single array of experiment structs.

Source Directory | Metadata File


country_code_to_flag

Country Code To Flag

udf.country_code_to_flag

For a given two-letter ISO 3166-1 alpha-2 country code, returns a string consisting of two Unicode regional indicator symbols, which is rendered in supporting fonts (such as in the BigQuery console or STMO) as flag emoji. This is just for fun. See:

  • https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2
  • https://en.wikipedia.org/wiki/Regional_Indicator_Symbol
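
A sketch of the underlying mapping, runnable as a standalone query (an illustration, not necessarily the shipped SQL): each ASCII letter is offset by 127397 so that 'A' (code point 65) becomes the regional indicator symbol letter A (code point 127462).

    SELECT CODE_POINTS_TO_STRING(ARRAY(
      SELECT cp + 127397
      FROM UNNEST(TO_CODE_POINTS('DE')) AS cp WITH OFFSET o
      ORDER BY o
    ));  -- renders as the German flag in supporting fonts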

Source Directory | Metadata File


days_seen_bytes_to_rfm

Days Seen Bytes To Rfm

udf.days_seen_bytes_to_rfm

Return the frequency, recency, and T from a BYTE array, as defined in https://lifetimes.readthedocs.io/en/latest/Quickstart.html#the-shape-of-your-data RFM refers to Recency, Frequency, and Monetary value.
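
A sketch of how those values relate to the bit pattern, following the lifetimes definitions linked above and reusing other UDFs from this file; treat it as an assumption about the shape of the output rather than the shipped SQL (days_seen_bytes is a placeholder column):

    SELECT
      BIT_COUNT(days_seen_bytes) - 1                      AS frequency,  -- repeat active days
      udf.bits_to_days_since_first_seen(days_seen_bytes)
        - udf.bits_to_days_since_seen(days_seen_bytes)    AS recency,    -- client age at last activity
      udf.bits_to_days_since_first_seen(days_seen_bytes)  AS T           -- client age at end of observation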

Source Directory | Metadata File


days_since_created_profile_as_28_bits

Days Since Created Profile As 28 Bits

udf.days_since_created_profile_as_28_bits

Takes in a difference between submission date and profile creation date and returns a bit pattern representing the profile creation date IFF the profile date is the same as the submission date or no more than 6 days earlier. Analysis has shown that client-reported profile creation dates are much less reliable outside of this range and cannot be used as reliable indicators of new profile creation.
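
A minimal sketch of the described gating, where days_old is a placeholder for the difference between submission date and profile creation date (an illustration, not necessarily the shipped SQL):

    IF(days_old BETWEEN 0 AND 6, 1 << days_old, 0)  -- a single bit marking the creation day, else no bits set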

Source Directory | Metadata File


deanonymize_event

Deanonymize Event

udf.deanonymize_event

Rename struct fields in anonymous event tuples to meaningful names.

Source Directory | Metadata File


decode_int64

Decode Int64

udf.decode_int64

Source Directory | Metadata File


dedupe_array

Dedupe Array

udf.dedupe_array

Return an array containing only distinct values of the given array

Source Directory | Metadata File


event_code_points_to_string

udf.event_code_points_to_string

Source Directory


experiment_search_metric_to_array

Experiment search metric to array

udf.experiment_search_metric_to_array

Used for testing only. Reproduces the string transformations done in experiment_search_events_live_v1 materialized views.

Source Directory | Metadata File


extract_count_histogram_value

Extract Count Histogram Value

udf.extract_count_histogram_value

Source Directory | Metadata File


extract_document_type

Extract Document Type

udf.extract_document_type

Extract the document type from a table name e.g. _TABLE_SUFFIX.

Source Directory | Metadata File


extract_document_version

Extract Document Version

udf.extract_document_version

Extract the document version from a table name e.g. _TABLE_SUFFIX.

Source Directory | Metadata File


extract_histogram_sum

Extract Histogram Sum

udf.extract_histogram_sum

This is a performance optimization compared to the more general mozfun.hist.extract for cases where only the histogram sum is needed. It must support all the same format variants as mozfun.hist.extract but this simplification is necessary to keep the main_summary query complexity in check.
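
A simplified sketch covering only the plain JSON-encoded case, with histogram_json as a placeholder column (the real UDF must also handle the other format variants mentioned above):

    SAFE_CAST(JSON_EXTRACT_SCALAR(histogram_json, '$.sum') AS INT64)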

Source Directory | Metadata File


extract_schema_validation_path

Extract Schema Validation Path

udf.extract_schema_validation_path

Return a path derived from an error message in payload_bytes_error

Source Directory | Metadata File


fenix_build_to_datetime

Fenix Build To Datetime

udf.fenix_build_to_datetime

Convert the Fenix client_info.app_build-format string to a DATETIME. May return NULL on failure.

Fenix originally used an 8-digit app_build format

In short it is yDDDHHmm:

  • y is years since 2018
  • DDD is day of year, 0-padded, 001-366
  • HH is hour of day, 00-23
  • mm is minute of hour, 00-59

The last date seen with an 8-digit build ID is 2020-08-10.

Newer builds use a 10-digit format where the integer represents a pattern consisting of 32 bits. The 17 bits starting 13 bits from the left represent a number of hours since UTC midnight beginning 2014-12-28.

This function tolerates both formats.

After using this you may wish to DATETIME_TRUNC(result, DAY) for grouping by build date.
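
As a sketch of decoding the newer 10-digit format described above (an assumption for illustration, not the shipped SQL), with app_build as a placeholder string: the 17 bits starting 13 bits from the left of the 32-bit pattern leave 2 trailing bits, so shifting right by 2 and masking 17 bits yields the hour count.

    DATETIME_ADD(
      DATETIME '2014-12-28 00:00:00',
      INTERVAL ((SAFE_CAST(app_build AS INT64) >> 2) & ((1 << 17) - 1)) HOUR
    )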

Source Directory | Metadata File


ga_is_mozilla_browser

Is Mozilla Browser

udf.ga_is_mozilla_browser

Determine if a browser in Google Analytics data is produced by Mozilla

Source Directory | Metadata File


geo_struct

Geo Struct

udf.geo_struct

Convert geoip lookup fields to a struct, replacing '??' with NULL. Returns NULL if the required field country would be NULL. Replaces '??' with NULL because '??' is a placeholder that may be used if there was an issue during geoip lookup.

Source Directory | Metadata File


get_key

Get Key

udf.get_key

Source Directory | Metadata File


get_key_with_null

Get Key With Null

udf.get_key_with_null

Source Directory | Metadata File


glean_timespan_nanos

Glean Timespan Nanos

udf.glean_timespan_nanos

Source Directory | Metadata File


glean_timespan_seconds

Glean Timespan Seconds

udf.glean_timespan_seconds

Source Directory | Metadata File


gzip_length_footer

Gzip Length Footer

udf.gzip_length_footer

Given a gzip compressed byte string, extract the uncompressed size from the footer.

WARNING: THIS FUNCTION IS NOT RELIABLE FOR ARBITRARY GZIP STREAMS. It should, however, be safe to use for checking the decompressed size of payload in payload_bytes_decoded (and NOT payload_bytes_raw) because that payload is produced by the decoder and limited to conditions where the footer is accurate.

From https://stackoverflow.com/a/9213826

First, the only information about the uncompressed length is four bytes at the end of the gzip file (stored in little-endian order). By necessity, that is the length modulo 2^32. So if the uncompressed length is 4 GB or more, you won't know what the length is. You can only be certain that the uncompressed length is less than 4 GB if the compressed length is less than something like 2^32 / 1032 + 18, or around 4 MB. (1032 is the maximum compression factor of deflate.)

Second, and this is worse, a gzip file may actually be a concatenation of multiple gzip streams. Other than decoding, there is no way to find where each gzip stream ends in order to look at the four-byte uncompressed length of that piece. (Which may be wrong anyway due to the first reason.)

Third, gzip files will sometimes have junk after the end of the gzip stream (usually zeros). Then the last four bytes are not the length.
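
A sketch of reading that footer in SQL, with compressed as a placeholder BYTES column (an illustration, not necessarily the shipped SQL): take the last four bytes, reverse them from little-endian to big-endian, and interpret the hex as an integer.

    SAFE_CAST(CONCAT('0x', TO_HEX(REVERSE(RIGHT(compressed, 4)))) AS INT64)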

Source Directory | Metadata File


histogram_max_key_with_nonzero_value

Histogram Max Key With Nonzero Value

udf.histogram_max_key_with_nonzero_value

Find the largest numeric bucket that contains a value greater than zero. https://github.com/mozilla/telemetry-batch-view/blob/ea0733c/src/main/scala/com/mozilla/telemetry/utils/MainPing.scala#L253-L266

Source Directory | Metadata File


histogram_merge

Histogram Merge

udf.histogram_merge

Source Directory | Metadata File


histogram_normalize

Histogram Normalize

udf.histogram_normalize

Source Directory | Metadata File


histogram_percentiles

Histogram Percentiles

udf.histogram_percentiles

Source Directory | Metadata File


histogram_to_mean

Histogram To Mean

udf.histogram_to_mean

Source Directory | Metadata File


histogram_to_threshold_count

Histogram To Threshold Count

udf.histogram_to_threshold_count

Source Directory | Metadata File


hmac_sha256

Hmac Sha256

udf.hmac_sha256

Given a key and message, return the HMAC-SHA256 hash. This algorithm can be found in Wikipedia: https://en.wikipedia.org/wiki/HMAC#Implementation This implementation is validated against the NIST test vectors. See test/validation/hmac_sha256.py for more information.
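
For reference, the construction on that page is HMAC(K, m) = H((K' XOR opad) || H((K' XOR ipad) || m)), where H is SHA-256 here, K' is the key padded (or first hashed, if too long) to the block size, opad is the byte 0x5c repeated to the block size, and ipad is the byte 0x36 repeated to the block size.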

Source Directory | Metadata File


int_to_365_bits

Int To 365 Bits

udf.int_to_365_bits

Source Directory | Metadata File


int_to_hex_string

Int To Hex String

udf.int_to_hex_string

Source Directory | Metadata File


json_extract_histogram

Json Extract Histogram

udf.json_extract_histogram

Source Directory | Metadata File


json_extract_int_map

Json Extract Int Map

udf.json_extract_int_map

Source Directory | Metadata File


json_mode_last

Json Mode Last

udf.json_mode_last

Source Directory | Metadata File


keyed_histogram_get_sum

Keyed Histogram Get Sum

udf.keyed_histogram_get_sum

Take a keyed histogram (an ARRAY of STRUCTs with key and value fields), extract the histogram of the given key, and return the sum value

Source Directory | Metadata File


kv_array_append_to_json_string

Kv Array Append To Json String

udf.kv_array_append_to_json_string

Returns a JSON string which has the pair appended to the provided input JSON string. NULL is also valid for input.

Examples:

    udf.kv_array_append_to_json_string('{"foo":"bar"}', [STRUCT("baz" AS key, "boo" AS value)])
    => '{"foo":"bar","baz":"boo"}'

    udf.kv_array_append_to_json_string('{}', [STRUCT("baz" AS key, "boo" AS value)])
    => '{"baz": "boo"}'

Source Directory | Metadata File


kv_array_to_json_string

Kv Array To Json String

udf.kv_array_to_json_string

Returns a JSON string representing the input key-value array. The value type must be representable as a string; this function will cast it to a string. At Mozilla, the schema for a map is STRUCT<key_value ARRAY<STRUCT<key, value>>>. To use this with that representation, it should be used as udf.kv_array_to_json_string(struct.key_value).

Source Directory | Metadata File


main_summary_scalars

Main Summary Scalars

udf.main_summary_scalars

Source Directory | Metadata File


map_bing_revenue_country_to_country_code

Map Bing Revenue Country To Country Code

udf.map_bing_revenue_country_to_country_code

For use by LTV revenue join only. Maps the Bing country to a country code. Only keeps the country codes we want to aggregate on.

Source Directory | Metadata File


map_mode_last

Map Mode Last

udf.map_mode_last

Source Directory | Metadata File


map_revenue_country

Map Revenue Country

udf.map_revenue_country

Only for use by the LTV Revenue join. Maps country codes to the codes we have in the revenue dataset. Buckets small Bing countries into "other".

Source Directory | Metadata File


map_sum

Map Sum

udf.map_sum

Source Directory | Metadata File


merge_scalar_user_data

Merge Scalar User Data

udf.merge_scalar_user_data

Given an array of scalar metric data that might have duplicate values for a metric, merge them into one value.

Source Directory | Metadata File


mod_uint128

Mod Uint128

udf.mod_uint128

This function returns "dividend mod divisor" where the dividend and the result is encoded in bytes, and divisor is an integer.

Source Directory | Metadata File


mode_last

Mode Last

udf.mode_last

Source Directory | Metadata File


mode_last_retain_nulls

Mode Last Retain Nulls

udf.mode_last_retain_nulls

Source Directory | Metadata File


monetized_search

Monetized Search Stub

udf.monetized_search

Stub monetized_search UDF for tests

Source Directory | Metadata File


new_monthly_engine_searches_struct

New Monthly Engine Searches Struct

udf.new_monthly_engine_searches_struct

This struct represents the past year's worth of searches. Each month has its own entry, hence 12.

Source Directory | Metadata File


normalize_fenix_metrics

Normalize Fenix Metrics

udf.normalize_fenix_metrics

Accepts a glean metrics struct as input and returns a modified struct that nulls out histograms for older versions of the Glean SDK that reported pathological binning; see https://bugzilla.mozilla.org/show_bug.cgi?id=1592930

Source Directory | Metadata File


normalize_glean_baseline_client_info

Normalize Glean Baseline Client Info

udf.normalize_glean_baseline_client_info

Source Directory | Metadata File


normalize_glean_ping_info

Normalize Glean Ping Info

udf.normalize_glean_ping_info

Source Directory | Metadata File


normalize_main_payload

Normalize Main Payload

udf.normalize_main_payload

Accepts a pipeline metadata struct as input and returns a modified struct that includes a few parsed or normalized variants of the input metadata fields.

Source Directory | Metadata File


normalize_metadata

Normalize Metadata

udf.normalize_metadata

Source Directory | Metadata File


normalize_monthly_searches

Normalize Monthly Searches

udf.normalize_monthly_searches

Sum up the monthly search count arrays by normalized engine

Source Directory | Metadata File


normalize_os

Normalize Os

udf.normalize_os

Source Directory | Metadata File


normalize_search_engine

Normalize Search Engine

udf.normalize_search_engine

Return the normalized engine name for recognized engines. This is a stub implementation for use with tests; the real implementation is in private-bigquery-etl.

Source Directory | Metadata File


null_if_empty_list

Null If Empty List

udf.null_if_empty_list

Return NULL if list is empty, otherwise return list. This cannot be done with NULLIF because NULLIF does not support arrays.
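
A minimal sketch of that behavior, with list as a placeholder array column (an illustration, not necessarily the shipped SQL):

    IF(ARRAY_LENGTH(list) > 0, list, NULL)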

Source Directory | Metadata File


one_as_365_bits

One As 365 Bits

udf.one_as_365_bits

One represented as a byte array of 365 bits

Source Directory | Metadata File


pack_event_properties

udf.pack_event_properties

Source Directory


parquet_array_sum

Parquet Array Sum

udf.parquet_array_sum

Sum an array from a parquet-derived field. These are lists of an element that contains the field value.

Source Directory | Metadata File


parse_desktop_telemetry_uri

Parse Desktop Telemetry Uri

udf.parse_desktop_telemetry_uri

Parses and labels the components of a telemetry desktop ping submission uri. Per https://docs.telemetry.mozilla.org/concepts/pipeline/http_edge_spec.html#special-handling-for-firefox-desktop-telemetry the format is /submit/telemetry/docId/docType/appName/appVersion/appUpdateChannel/appBuildID, e.g. /submit/telemetry/ce39b608-f595-4c69-b6a6-f7a436604648/main/Firefox/61.0a1/nightly/20180328030202

Source Directory | Metadata File


parse_iso8601_date

Parse Iso8601 Date

udf.parse_iso8601_date

Take an ISO 8601 date or date-and-time string and return a DATE. Returns NULL if parsing fails. Possible formats: 2019-11-04, 2019-11-04T21:15:00+00:00, 2019-11-04T21:15:00Z, 20191104T211500Z
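
A sketch of the general approach, with date_str as a placeholder column and illustrative format strings (not necessarily the shipped SQL): try each supported format with SAFE parsers and keep the first that succeeds.

    COALESCE(
      SAFE.PARSE_DATE('%F', date_str),                                      -- 2019-11-04
      SAFE_CAST(SAFE.PARSE_TIMESTAMP('%FT%T%Ez', date_str) AS DATE),        -- 2019-11-04T21:15:00+00:00
      SAFE_CAST(SAFE.PARSE_TIMESTAMP('%FT%TZ', date_str) AS DATE),          -- 2019-11-04T21:15:00Z
      SAFE_CAST(SAFE.PARSE_TIMESTAMP('%Y%m%dT%H%M%SZ', date_str) AS DATE)   -- 20191104T211500Z
    )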

Source Directory | Metadata File


pos_of_leading_set_bit

Pos Of Leading Set Bit

udf.pos_of_leading_set_bit

Returns the 0-based index of the first set bit. No set bits returns NULL.

Source Directory | Metadata File


pos_of_trailing_set_bit

Pos Of Trailing Set Bit

udf.pos_of_trailing_set_bit

Identical to bits28_days_since_seen. Returns a 0-based index of the rightmost set bit in the passed bit pattern or null if no bits are set (bits = 0). To determine this position, we take a bitwise AND of the bit pattern and its complement, then we determine the position of the bit via base-2 logarithm; see https://stackoverflow.com/a/42747608/1260237
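
A sketch of the described technique, with bits as a placeholder integer column (an illustration, not necessarily the shipped SQL): bits & -bits isolates the rightmost set bit, and a base-2 logarithm gives its position.

    CAST(SAFE.LOG(bits & -bits, 2) AS INT64)  -- NULL when bits = 0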

Source Directory | Metadata File


product_info_with_baseline

Product Info with Baseline

udf.product_info_with_baseline

Similar to mozfun.norm.product_info(), but this UDF also handles "baseline" apps that were introduced to differentiate, for certain apps, whether data is sent through Glean or core pings. This UDF has been temporarily introduced as part of https://bugzilla.mozilla.org/show_bug.cgi?id=1775216

Source Directory | Metadata File


pseudonymize_ad_id

Pseudonymize Ad Ids

udf.pseudonymize_ad_id

Pseudonymize Ad IDs, handling opt-outs.

Source Directory | Metadata File


quantile_search_metric_contribution

Quantile Search Metric Contribution

udf.quantile_search_metric_contribution

This function returns how much of one metric is contributed by the quantile of another metric. The quantile variable should add an offset to get the required percentile value. Example: udf.quantile_search_metric_contribution(sap, search_with_ads, sap_percentiles[OFFSET(9)]) returns search_with_ads if the sap value is in the top 10% by volume, else NULL.

Source Directory | Metadata File


round_timestamp_to_minute

Round Timestamp To Minute

udf.round_timestamp_to_minute

Floor a timestamp object to the given minute interval.

Source Directory | Metadata File


safe_crc32_uuid

Safe CRC-32 Uuid

udf.safe_crc32_uuid

Calculate the CRC-32 hash of a 36-byte UUID, or NULL if the value isn't 36 bytes. This implementation is limited to an exact length because recursion does not work. Based on https://stackoverflow.com/a/18639999/1260237 See https://en.wikipedia.org/wiki/Cyclic_redundancy_check

Source Directory | Metadata File


safe_sample_id

Safe Sample Id

udf.safe_sample_id

Stably hash a client_id to an integer between 0 and 99, or NULL if client_id isn't 36 bytes
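
An assumed sketch of how this can be built from udf.safe_crc32_uuid above (the shipped SQL may differ):

    MOD(udf.safe_crc32_uuid(CAST(client_id AS BYTES)), 100)  -- NULL propagates when client_id isn't 36 bytes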

Source Directory | Metadata File


search_counts_map_sum

Search Counts Map Sum

udf.search_counts_map_sum

Calculate the sums of search counts per source and engine

Source Directory | Metadata File


shift_28_bits_one_day

Shift 28 Bits One Day

udf.shift_28_bits_one_day

Shift input bits one day left and drop any bits beyond 28 days.
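
A minimal sketch of that operation using udf.bitmask_lowest_28 from above, with bits as a placeholder column (an illustration, not necessarily the shipped SQL):

    (bits << 1) & udf.bitmask_lowest_28()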

Source Directory | Metadata File


shift_365_bits_one_day

Shift 365 Bits One Day

udf.shift_365_bits_one_day

Shift input bits one day left and drop any bits beyond 365 days.

Source Directory | Metadata File


shift_one_day

Shift One Day

udf.shift_one_day

Returns the bitfield shifted by one day, 0 for NULL

Source Directory | Metadata File


smoot_usage_from_28_bits

Smoot Usage From 28 Bits

udf.smoot_usage_from_28_bits

Calculates a variety of metrics based on bit patterns of daily usage for the smoot_usage_* tables.

Source Directory | Metadata File


vector_add

Vector Add

udf.vector_add

This function adds two vectors. The two vectors can have different lengths. If one vector is NULL, the other vector will be returned directly.

Source Directory | Metadata File


zero_as_365_bits

Zero As 365 Bits

udf.zero_as_365_bits

Zero represented as a 365-bit byte array

Source Directory | Metadata File


zeroed_array

Zeroed Array

udf.zeroed_array

Generates an array of all zeroes, of arbitrary length

Source Directory | Metadata File