User-Defined Functions
Persistent user-defined functions
active_n_weeks_ago
Active N Weeks Ago
udf.active_n_weeks_ago
Source Directory | Metadata File
active_values_from_days_seen_map
Active Values From Days Seen Map
udf.active_values_from_days_seen_map
Given a map of representing activity for STRING key
s, this function returns an array of which key
s were active for the time period in question. start_offset should be at most 0. n_bits should be at most the remaining bits.
Source Directory | Metadata File
add_monthly_engine_searches
Add Monthly Engine Searches
udf.add_monthly_engine_searches
This function specifically windows searches into calendar-month windows. This means groups are not necessarily directly comparable, since different months have different numbers of days. On the first of each month, a new month is appended, and the first month is dropped. If the date is not the first of the month, the new entry is added to the last element in the array. For example, if we were adding 12 to [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]: On the first of the month, the result would be [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 12] On any other day of the month, the result would be [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 24] This happens for every aggregate (searches, ad clicks, etc.)
Source Directory | Metadata File
add_monthly_searches
Add Monthly Searches
udf.add_monthly_searches
Adds together two engine searches structs. Each engine searches struct has a MAP[engine -> search_counts_struct]. We want to add add together the prev and curr's values for a certain engine. This allows us to be flexible with the number of engines we're using.
Source Directory | Metadata File
add_searches_by_index
Add Searches By Index
udf.add_searches_by_index
Return sums of each search type grouped by the index. Results are ordered by index.
Source Directory | Metadata File
aggregate_active_addons
Aggregate Active Addons
udf.aggregate_active_addons
This function selects most frequently occuring value for each addon_id, using the latest value in the input among ties. The type for active_addons is ARRAYSELECT ARRAY_CONCAT_AGG(active_addons) FROM telemetry.main_summary_v4
, and is left unspecified to allow changes to the fields of the STRUCT.
Source Directory | Metadata File
aggregate_map_first
Aggregate Map First
udf.aggregate_map_first
Returns an aggregated map with all the keys and the first corresponding value from the given maps
Source Directory | Metadata File
aggregate_search_counts
Aggregate Search Counts
udf.aggregate_search_counts
Source Directory | Metadata File
aggregate_search_map
Aggregate Search Map
udf.aggregate_search_map
Aggregates the total counts of the given search counters
Source Directory | Metadata File
array_11_zeroes_then
Array 11 Zeroes Then
udf.array_11_zeroes_then
An array of 11 zeroes, followed by a supplied value
Source Directory | Metadata File
array_drop_first_and_append
Array Drop First And Append
udf.array_drop_first_and_append
Drop the first element of an array, and append the given element. Result is an array with the same length as the input.
Source Directory | Metadata File
array_of_12_zeroes
Array Of 12 Zeroes
udf.array_of_12_zeroes
An array of 12 zeroes
Source Directory | Metadata File
array_slice
Array Slice
udf.array_slice
Source Directory | Metadata File
bitcount_lowest_7
Bitcount Lowest 7
udf.bitcount_lowest_7
This function counts the 1s in lowest 7 bits of an INT64
Source Directory | Metadata File
bitmask_365
Bitmask 365
udf.bitmask_365
A bitmask for 365 bits
Source Directory | Metadata File
bitmask_lowest_28
Bitmask Lowest 28
udf.bitmask_lowest_28
Source Directory | Metadata File
bitmask_lowest_7
Bitmask Lowest 7
udf.bitmask_lowest_7
Source Directory | Metadata File
bitmask_range
Bitmask Range
udf.bitmask_range
Returns a bitmask that can be used to return a subset of an integer representing a bit array. The start_ordinal argument is an integer specifying the starting position of the slice, with start_ordinal = 1 indicating the first bit. The length argument is the number of bits to include in the mask. The arguments were chosen to match the semantics of the SUBSTR function; see https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#substr
Source Directory | Metadata File
bits28_active_in_range
Bits28 Active In Range
udf.bits28_active_in_range
Source Directory | Metadata File
bits28_days_since_seen
Bits28 Days Since Seen
udf.bits28_days_since_seen
Source Directory | Metadata File
bits28_from_string
Bits28 From String
udf.bits28_from_string
Source Directory | Metadata File
bits28_range
Bits28 Range
udf.bits28_range
Source Directory | Metadata File
bits28_retention
Bits28 Retention
udf.bits28_retention
Source Directory | Metadata File
bits28_to_dates
Bits28 To Dates
udf.bits28_to_dates
Source Directory | Metadata File
bits28_to_string
Bits28 To String
udf.bits28_to_string
Source Directory | Metadata File
bits_from_offsets
Bits From Offsets
udf.bits_from_offsets
Returns a bit pattern of type BYTES compactly encoding the given array of positive integer offsets.
This is primarily useful to generate a compact encoding of dates on which a feature was used, with arbitrarily long history. Example aggregation:
sql bits_from_offsets(
ARRAY_AGG(IF(foo, DATE_DIFF(anchor_date, submission_date, DAY), NULL)
IGNORE NULLS)
)
The resulting value can be cast to an INT64 representing the most recent 64 days via:
sql CAST(CONCAT('0x', TO_HEX(RIGHT(bits >> i, 4))) AS INT64)
Or representing the most recent 28 days (compatible with bits28 functions) via:
sql CAST(CONCAT('0x', TO_HEX(RIGHT(bits >> i, 4))) AS INT64) << 36 >> 36
Source Directory | Metadata File
bits_to_active_n_weeks_ago
Bits To Active N Weeks Ago
udf.bits_to_active_n_weeks_ago
Given a BYTE and an INT64, return whether the user was active that many weeks ago. NULL input returns NULL output.
Source Directory | Metadata File
bits_to_days_seen
Bits To Days Seen
udf.bits_to_days_seen
Given a BYTE, get the number of days the user was seen. NULL input returns NULL output.
Source Directory | Metadata File
bits_to_days_since_first_seen
Bits To Days Since First Seen
udf.bits_to_days_since_first_seen
Given a BYTES, return the number of days since the client was first seen. If no bits are set, returns NULL, indicating we don't know. Otherwise the result is 0-indexed, meaning that for \x01, it will return 0. Results showed this being between 5-10x faster than the simpler alternative: CREATE OR REPLACE FUNCTION udf.bits_to_days_since_first_seen(b BYTES) AS (( SELECT MAX(n) FROM UNNEST(GENERATE_ARRAY( 0, 8 * BYTE_LENGTH(b))) AS n WHERE BIT_COUNT(SUBSTR(b >> n, -1) & b'\x01') > 0)); See also: bits_to_days_since_seen.sql
Source Directory | Metadata File
bits_to_days_since_seen
Bits To Days Since Seen
udf.bits_to_days_since_seen
Given a BYTES, return the number of days since the client was last seen. If no bits are set, returns NULL, indicating we don't know. Otherwise the results are 0-indexed, meaning \x01 will return 0. Tests showed this being 5-10x faster than the simpler alternative: CREATE OR REPLACE FUNCTION udf.bits_to_days_since_seen(b BYTES) AS (( SELECT MIN(n) FROM UNNEST(GENERATE_ARRAY(0, 364)) AS n WHERE BIT_COUNT(SUBSTR(b >> n, -1) & b'\x01') > 0)); See also: bits_to_days_since_first_seen.sql
Source Directory | Metadata File
bool_to_365_bits
Bool To 365 Bits
udf.bool_to_365_bits
Convert a boolean to 365 bit byte array
Source Directory | Metadata File
boolean_histogram_to_boolean
Boolean Histogram To Boolean
udf.boolean_histogram_to_boolean
Given histogram h, return TRUE if it has a value in the "true" bucket, or FALSE if it has a value in the "false" bucket, or NULL otherwise. https://github.com/mozilla/telemetry-batch-view/blob/ea0733c/src/main/scala/com/mozilla/telemetry/utils/MainPing.scala#L309-L317
Source Directory | Metadata File
coalesce_adjacent_days_28_bits
Coalesce Adjacent Days 28 Bits
udf.coalesce_adjacent_days_28_bits
We generally want to believe only the first reasonable profile creation date that we receive from a client. Given bits representing usage from the previous day and the current day, this function shifts the first argument by one day and returns either that value if non-zero and non-null, the current day value if non-zero and non-null, or else 0.
Source Directory | Metadata File
coalesce_adjacent_days_365_bits
Coalesce Adjacent Days 365 Bits
udf.coalesce_adjacent_days_365_bits
Coalesce previous data's PCD with the new data's PCD. We generally want to believe only the first reasonable profile creation date that we receive from a client. Given bytes representing usage from the previous day and the current day, this function shifts the first argument by one day and returns either that value if non-zero and non-null, the current day value if non-zero and non-null, or else 0.
Source Directory | Metadata File
combine_adjacent_days_28_bits
Combine Adjacent Days 28 Bits
udf.combine_adjacent_days_28_bits
Combines two bit patterns. The first pattern represents activity over a 28-day period ending "yesterday". The second pattern represents activity as observed today (usually just 0 or 1). We shift the bits in the first pattern by one to set the new baseline as "today", then perform a bitwise OR of the two patterns.
Source Directory | Metadata File
combine_adjacent_days_365_bits
Combine Adjacent Days 365 Bits
udf.combine_adjacent_days_365_bits
Source Directory | Metadata File
combine_days_seen_maps
Combine Days Seen Maps
udf.combine_days_seen_maps
The "clients_last_seen" class of tables represent various types of client activity within a 28-day window as bit patterns. This function takes in two arrays of structs (aka maps) where each entry gives the bit pattern for days in which we saw a ping for a given user in a given key. We combine the bit patterns for the previous day and the current day, returning a single map. See udf.combine_experiment_days
for a more specific example of this approach.
Source Directory | Metadata File
combine_experiment_days
Combine Experiment Days
udf.combine_experiment_days
The "clients_last_seen" class of tables represent various types of client activity within a 28-day window as bit patterns. This function takes in two arrays of structs where each entry gives the bit pattern for days in which we saw a ping for a given user in a given experiment. We combine the bit patterns for the previous day and the current day, returning a single array of experiment structs.
Source Directory | Metadata File
country_code_to_flag
Country Code To Flag
udf.country_code_to_flag
For a given two-letter ISO 3166-1 alpha-2 country code, returns a string consisting of two Unicode regional indicator symbols, which is rendered in supporting fonts (such as in the BigQuery console or STMO) as flag emoji. This is just for fun. See: - https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2 - https://en.wikipedia.org/wiki/Regional_Indicator_Symbol
Source Directory | Metadata File
days_seen_bytes_to_rfm
Days Seen Bytes To Rfm
udf.days_seen_bytes_to_rfm
Return the frequency, recency, and T from a BYTE array, as defined in https://lifetimes.readthedocs.io/en/latest/Quickstart.html#the-shape-of-your-data RFM refers to Recency, Frequency, and Monetary value.
Source Directory | Metadata File
days_since_created_profile_as_28_bits
Days Since Created Profile As 28 Bits
udf.days_since_created_profile_as_28_bits
Takes in a difference between submission date and profile creation date and returns a bit pattern representing the profile creation date IFF the profile date is the same as the submission date or no more than 6 days earlier. Analysis has shown that client-reported profile creation dates are much less reliable outside of this range and cannot be used as reliable indicators of new profile creation.
Source Directory | Metadata File
deanonymize_event
Deanonymize Event
udf.deanonymize_event
Rename struct fields in anonymous event tuples to meaningful names.
Source Directory | Metadata File
decode_int64
Decode Int64
udf.decode_int64
Source Directory | Metadata File
dedupe_array
Dedupe Array
udf.dedupe_array
Return an array containing only distinct values of the given array
Source Directory | Metadata File
event_code_points_to_string
udf.event_code_points_to_string
experiment_search_metric_to_array
Experiment search metric to array
udf.experiment_search_metric_to_array
Used for testing only. Reproduces the string transformations done in experiment_search_events_live_v1 materialized views.
Source Directory | Metadata File
extract_count_histogram_value
Extract Count Histogram Value
udf.extract_count_histogram_value
Source Directory | Metadata File
extract_document_type
Extract Document Type
udf.extract_document_type
Extract the document type from a table name e.g. _TABLE_SUFFIX.
Source Directory | Metadata File
extract_document_version
Extract Document Version
udf.extract_document_version
Extract the document version from a table name e.g. _TABLE_SUFFIX.
Source Directory | Metadata File
extract_histogram_sum
Extract Histogram Sum
udf.extract_histogram_sum
This is a performance optimization compared to the more general mozfun.hist.extract for cases where only the histogram sum is needed. It must support all the same format variants as mozfun.hist.extract but this simplification is necessary to keep the main_summary query complexity in check.
Source Directory | Metadata File
extract_schema_validation_path
Extract Schema Validation Path
udf.extract_schema_validation_path
Return a path derived from an error message in payload_bytes_error
Source Directory | Metadata File
fenix_build_to_datetime
Fenix Build To Datetime
udf.fenix_build_to_datetime
Convert the Fenix client_info.app_build-format string to a DATETIME. May return NULL on failure.
Fenix originally used an 8-digit app_build format
In short it is yDDDHHmm
:
- y is years since 2018
- DDD is day of year, 0-padded, 001-366
- HH is hour of day, 00-23
- mm is minute of hour, 00-59
The last date seen with an 8-digit build ID is 2020-08-10.
Newer builds use a 10-digit format where the integer represents a pattern consisting of 32 bits. The 17 bits starting 13 bits from the left represent a number of hours since UTC midnight beginning 2014-12-28.
This function tolerates both formats.
After using this you may wish to DATETIME_TRUNC(result, DAY)
for grouping
by build date.
Source Directory | Metadata File
ga_is_mozilla_browser
Is Mozilla Browser
udf.ga_is_mozilla_browser
Determine if a browser in a Google Analytics data is produced by Mozilla
Source Directory | Metadata File
geo_struct
Geo Struct
udf.geo_struct
Convert geoip lookup fields to a struct, replacing '??' with NULL. Returns NULL if if required field country would be NULL. Replaces '??' with NULL because '??' is a placeholder that may be used if there was an issue during geoip lookup in hindsight.
Source Directory | Metadata File
get_key
Get Key
udf.get_key
Source Directory | Metadata File
get_key_with_null
Get Key With Null
udf.get_key_with_null
Source Directory | Metadata File
glean_timespan_nanos
Glean Timespan Nanos
udf.glean_timespan_nanos
Source Directory | Metadata File
glean_timespan_seconds
Glean Timespan Seconds
udf.glean_timespan_seconds
Source Directory | Metadata File
gzip_length_footer
Gzip Length Footer
udf.gzip_length_footer
Given a gzip compressed byte string, extract the uncompressed size from the footer. WARNING: THIS FUNCTION IS NOT RELIABLE FOR ARBITRARY GZIP STREAMS. It should, however, be safe to use for checking the decompressed size of payload in payload_bytes_decoded (and NOT payload_bytes_raw) because that payload is produced by the decoder and limited to conditions where the footer is accurate. From https://stackoverflow.com/a/9213826 First, the only information about the uncompressed length is four bytes at the end of the gzip file (stored in little-endian order). By necessity, that is the length modulo 232. So if the uncompressed length is 4 GB or more, you won't know what the length is. You can only be certain that the uncompressed length is less than 4 GB if the compressed length is less than something like 232 / 1032 + 18, or around 4 MB. (1032 is the maximum compression factor of deflate.) Second, and this is worse, a gzip file may actually be a concatenation of multiple gzip streams. Other than decoding, there is no way to find where each gzip stream ends in order to look at the four-byte uncompressed length of that piece. (Which may be wrong anyway due to the first reason.) Third, gzip files will sometimes have junk after the end of the gzip stream (usually zeros). Then the last four bytes are not the length.
Source Directory | Metadata File
histogram_max_key_with_nonzero_value
Histogram Max Key With Nonzero Value
udf.histogram_max_key_with_nonzero_value
Find the largest numeric bucket that contains a value greater than zero. https://github.com/mozilla/telemetry-batch-view/blob/ea0733c/src/main/scala/com/mozilla/telemetry/utils/MainPing.scala#L253-L266
Source Directory | Metadata File
histogram_merge
Histogram Merge
udf.histogram_merge
Source Directory | Metadata File
histogram_normalize
Histogram Normalize
udf.histogram_normalize
Source Directory | Metadata File
histogram_percentiles
Histogram Percentiles
udf.histogram_percentiles
Source Directory | Metadata File
histogram_to_mean
Histogram To Mean
udf.histogram_to_mean
Source Directory | Metadata File
histogram_to_threshold_count
Histogram To Threshold Count
udf.histogram_to_threshold_count
Source Directory | Metadata File
hmac_sha256
Hmac Sha256
udf.hmac_sha256
Given a key and message, return the HMAC-SHA256 hash. This algorithm can be found in Wikipedia: https://en.wikipedia.org/wiki/HMAC#Implementation This implentation is validated against the NIST test vectors. See test/validation/hmac_sha256.py for more information.
Source Directory | Metadata File
int_to_365_bits
Int To 365 Bits
udf.int_to_365_bits
Source Directory | Metadata File
int_to_hex_string
Int To Hex String
udf.int_to_hex_string
Source Directory | Metadata File
json_extract_histogram
Json Extract Histogram
udf.json_extract_histogram
Source Directory | Metadata File
json_extract_int_map
Json Extract Int Map
udf.json_extract_int_map
Source Directory | Metadata File
json_mode_last
Json Mode Last
udf.json_mode_last
Source Directory | Metadata File
keyed_histogram_get_sum
Keyed Histogram Get Sum
udf.keyed_histogram_get_sum
Take a keyed histogram of type STRUCT
Source Directory | Metadata File
kv_array_append_to_json_string
Kv Array Append To Json String
udf.kv_array_append_to_json_string
Returns a JSON string which has the pair
appended to the provided input
JSON string. NULL is also valid for input
. Examples: udf.kv_array_append_to_json_string('{"foo":"bar"}', [STRUCT("baz" AS key, "boo" AS value)]) '{"foo":"bar","baz":"boo"}' udf.kv_array_append_to_json_string('{}', [STRUCT("baz" AS key, "boo" AS value)]) '{"baz": "boo"}'
Source Directory | Metadata File
kv_array_to_json_string
Kv Array To Json String
udf.kv_array_to_json_string
Returns a JSON string representing the input key-value array. Value type must be able to be represented as a string - this function will cast to a string. At Mozilla, the schema for a map is STRUCTudf.kv_array_to_json_string(struct.key_value)
.
Source Directory | Metadata File
main_summary_scalars
Main Summary Scalars
udf.main_summary_scalars
Source Directory | Metadata File
map_bing_revenue_country_to_country_code
Map Bing Revenue Country To Country Code
udf.map_bing_revenue_country_to_country_code
For use by LTV revenue join only. Maps the Bing country to a country code. Only keeps the country codes we want to aggregate on.
Source Directory | Metadata File
map_mode_last
Map Mode Last
udf.map_mode_last
Source Directory | Metadata File
map_revenue_country
Map Revenue Country
udf.map_revenue_country
Only for use by the LTV Revenue join. Maps country codes to the codes we have in the revenue dataset. Buckets small Bing countries into "other".
Source Directory | Metadata File
map_sum
Map Sum
udf.map_sum
Source Directory | Metadata File
merge_scalar_user_data
Merge Scalar User Data
udf.merge_scalar_user_data
Given an array of scalar metric data that might have duplicate values for a metric, merge them into one value.
Source Directory | Metadata File
mod_uint128
Mod Uint128
udf.mod_uint128
This function returns "dividend mod divisor" where the dividend and the result is encoded in bytes, and divisor is an integer.
Source Directory | Metadata File
mode_last
Mode Last
udf.mode_last
Source Directory | Metadata File
mode_last_retain_nulls
Mode Last Retain Nulls
udf.mode_last_retain_nulls
Source Directory | Metadata File
monetized_search
Monetized Search Stub
udf.monetized_search
Stub monetized_search UDF for tests
Source Directory | Metadata File
new_monthly_engine_searches_struct
New Monthly Engine Searches Struct
udf.new_monthly_engine_searches_struct
This struct represents the past year's worth of searches. Each month has its own entry, hence 12.
Source Directory | Metadata File
normalize_fenix_metrics
Normalize Fenix Metrics
udf.normalize_fenix_metrics
Accepts a glean metrics struct as input and returns a modified struct that nulls out histograms for older versions of the Glean SDK that reported pathological binning; see https://bugzilla.mozilla.org/show_bug.cgi?id=1592930
Source Directory | Metadata File
normalize_glean_baseline_client_info
Normalize Glean Baseline Client Info
udf.normalize_glean_baseline_client_info
Source Directory | Metadata File
normalize_glean_ping_info
Normalize Glean Ping Info
udf.normalize_glean_ping_info
Source Directory | Metadata File
normalize_main_payload
Normalize Main Payload
udf.normalize_main_payload
Accepts a pipeline metadata struct as input and returns a modified struct that includes a few parsed or normalized variants of the input metadata fields.
Source Directory | Metadata File
normalize_metadata
Normalize Metadata
udf.normalize_metadata
Source Directory | Metadata File
normalize_monthly_searches
Normalize Monthly Searches
udf.normalize_monthly_searches
Sum up the monthy search count arrays by normalized engine
Source Directory | Metadata File
normalize_os
Normalize Os
udf.normalize_os
Source Directory | Metadata File
normalize_search_engine
Normalize Search Engine
udf.normalize_search_engine
Return normalized engine name for recognized engines This is a stub implementation for use with tests; real implementation is in private-bigquery-etl
Source Directory | Metadata File
null_if_empty_list
Null If Empty List
udf.null_if_empty_list
Return NULL if list is empty, otherwise return list. This cannot be done with NULLIF because NULLIF does not support arrays.
Source Directory | Metadata File
one_as_365_bits
One As 365 Bits
udf.one_as_365_bits
One represented as a byte array of 365 bits
Source Directory | Metadata File
pack_event_properties
udf.pack_event_properties
parquet_array_sum
Parquet Array Sum
udf.parquet_array_sum
Sum an array from a parquet-derived field. These are lists of an element
that contain the field value.
Source Directory | Metadata File
parse_desktop_telemetry_uri
Parse Desktop Telemetry Uri
udf.parse_desktop_telemetry_uri
Parses and labels the components of a telemetry desktop ping submission uri Per https://docs.telemetry.mozilla.org/concepts/pipeline/http_edge_spec.html#special-handling-for-firefox-desktop-telemetry the format is /submit/telemetry/docId/docType/appName/appVersion/appUpdateChannel/appBuildID e.g. /submit/telemetry/ce39b608-f595-4c69-b6a6-f7a436604648/main/Firefox/61.0a1/nightly/20180328030202
Source Directory | Metadata File
parse_iso8601_date
Parse Iso8601 Date
udf.parse_iso8601_date
Take a ISO 8601 date or date and time string and return a DATE. Return null if parse fails. Possible formats: 2019-11-04, 2019-11-04T21:15:00+00:00, 2019-11-04T21:15:00Z, 20191104T211500Z
Source Directory | Metadata File
pos_of_leading_set_bit
Pos Of Leading Set Bit
udf.pos_of_leading_set_bit
Returns the 0-based index of the first set bit. No set bits returns NULL.
Source Directory | Metadata File
pos_of_trailing_set_bit
Pos Of Trailing Set Bit
udf.pos_of_trailing_set_bit
Identical to bits28_days_since_seen. Returns a 0-based index of the rightmost set bit in the passed bit pattern or null if no bits are set (bits = 0). To determine this position, we take a bitwise AND of the bit pattern and its complement, then we determine the position of the bit via base-2 logarithm; see https://stackoverflow.com/a/42747608/1260237
Source Directory | Metadata File
product_info_with_baseline
Product Info with Baseline
udf.product_info_with_baseline
Similar to mozfun.norm.product_info(), but this UDF also handles "baseline" apps that were introduced differentiate for certain apps whether data is sent through Glean or core pings. This UDF has been temporarily introduced as part of https://bugzilla.mozilla.org/show_bug.cgi?id=1775216
Source Directory | Metadata File
pseudonymize_ad_id
Pseudonymize Ad Ids
udf.pseudonymize_ad_id
Pseudonymize Ad IDs, handling opt-outs.
Source Directory | Metadata File
quantile_search_metric_contribution
Quantile Search Metric Contribution
udf.quantile_search_metric_contribution
This function returns how much of one metric is contributed by the quantile of another metric. Quantile variable should add an offset to get the requried percentile value. Example: udf.quantile_search_metric_contribution(sap, search_with_ads, sap_percentiles[OFFSET(9)]) It returns search_with_ads if sap value in top 10% volumn else null.
Source Directory | Metadata File
round_timestamp_to_minute
Round Timestamp To Minute
udf.round_timestamp_to_minute
Floor a timestamp object to the given minute interval.
Source Directory | Metadata File
safe_crc32_uuid
Safe CRC-32 Uuid
udf.safe_crc32_uuid
Calculate the CRC-32 hash of a 36-byte UUID, or NULL if the value isn't 36 bytes. This implementation is limited to an exact length because recursion does not work. Based on https://stackoverflow.com/a/18639999/1260237 See https://en.wikipedia.org/wiki/Cyclic_redundancy_check
Source Directory | Metadata File
safe_sample_id
Safe Sample Id
udf.safe_sample_id
Stably hash a client_id to an integer between 0 and 99, or NULL if client_id isn't 36 bytes
Source Directory | Metadata File
search_counts_map_sum
Search Counts Map Sum
udf.search_counts_map_sum
Calculate the sums of search counts per source and engine
Source Directory | Metadata File
shift_28_bits_one_day
Shift 28 Bits One Day
udf.shift_28_bits_one_day
Shift input bits one day left and drop any bits beyond 28 days.
Source Directory | Metadata File
shift_365_bits_one_day
Shift 365 Bits One Day
udf.shift_365_bits_one_day
Shift input bits one day left and drop any bits beyond 365 days.
Source Directory | Metadata File
shift_one_day
Shift One Day
udf.shift_one_day
Returns the bitfield shifted by one day, 0 for NULL
Source Directory | Metadata File
smoot_usage_from_28_bits
Smoot Usage From 28 Bits
udf.smoot_usage_from_28_bits
Calculates a variety of metrics based on bit patterns of daily usage for the smoot_usage_* tables.
Source Directory | Metadata File
vector_add
Vector Add
udf.vector_add
This function adds two vectors. The two vectors can have different length. If one vector is null, the other vector will be returned directly.
Source Directory | Metadata File
zero_as_365_bits
Zero As 365 Bits
udf.zero_as_365_bits
Zero represented as a 365-bit byte array
Source Directory | Metadata File
zeroed_array
Zeroed Array
udf.zeroed_array
Generates an array if all zeroes, of arbitrary length
Source Directory | Metadata File