mozetl.hardware_report package
Submodules
mozetl.hardware_report.check_output module
mozetl.hardware_report.hardware_dashboard module
Hardware Report Generator.
This dashboard can be found at: https://hardware.metrics.mozilla.com/
mozetl.hardware_report.summarize_json module
This job was originally located at [1].
mozetl.hardware_report.summarize_json.aggregate_data(processed_data)
Return aggregated data.
mozetl.hardware_report.summarize_json.build_device_map()
Build a dictionary that will help us map vendor/device ids to device families.
mozetl.hardware_report.summarize_json.collapse_buckets(aggregated_data, count_threshold)
Collapse uncommon configurations into generic groups to preserve privacy.
This takes the dictionary of aggregated results from |aggregate_data| and collapses entries with a value less than |count_threshold| into a generic bucket.
- Parameters
aggregated_data – The object containing aggregated data.
count_threshold – Groups (or “configurations”) with a count below this value are collapsed into a generic bucket.
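The collapsing step described above can be illustrated with a minimal sketch. The key shape (a `(category, value)` tuple), the `"Other"` label, and the function name are assumptions made for illustration; the real layout of the aggregated data is not documented here.

```python
from collections import defaultdict


def collapse_buckets_sketch(aggregated_data, count_threshold, other_label="Other"):
    """Fold configurations seen fewer than count_threshold times into one bucket.

    Assumes keys are (category, value) tuples mapped to integer counts;
    this layout is hypothetical.
    """
    collapsed = defaultdict(int)
    for (category, value), count in aggregated_data.items():
        # Rare values are replaced by a generic label within the same category,
        # so their counts are merged rather than dropped.
        if count >= count_threshold:
            key = (category, value)
        else:
            key = (category, other_label)
        collapsed[key] += count
    return dict(collapsed)


data = {
    ("os", "Windows"): 900,
    ("os", "Linux"): 80,
    ("os", "Haiku"): 3,
    ("os", "Plan9"): 2,
}
result = collapse_buckets_sketch(data, count_threshold=10)
```

Merging the rare counts keeps the total per category unchanged, which matters if the values are later converted to ratios.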
mozetl.hardware_report.summarize_json.fetch_json(uri)
Perform an HTTP GET on the given uri, return the results as json.
If there is an error fetching the data, raise an exception.
- Parameters
uri – the string URI to fetch.
- Returns
A JSON object with the response.
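A fetch of this kind might look like the following sketch. It uses only the standard library (`urllib.request`), which is an assumption; the module may well use a different HTTP client. The function name and the `timeout` parameter are hypothetical.

```python
import json
import urllib.request


def fetch_json_sketch(uri, timeout=30):
    """GET the uri and decode the body as JSON.

    Errors are not swallowed: urlopen raises urllib.error.URLError (or its
    HTTPError subclass) on failure, and json.loads raises ValueError when
    the body is not valid JSON.
    """
    with urllib.request.urlopen(uri, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Letting the exception propagate matches the documented behaviour of raising when the data cannot be fetched, so callers never proceed with a partial result.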
mozetl.hardware_report.summarize_json.fetch_previous_state(s3_source_file_name, local_file_name, bucket)
Fetch the previous state from the S3 bucket and store it locally.
- Parameters
s3_source_file_name – The name of the file on S3.
local_file_name – The name of the file to save to, locally.
mozetl.hardware_report.summarize_json.finalize_data(data, sample_count, broken_ratio, inactive_ratio, report_date)
Finalize the aggregated data.
Translate raw sample numbers to percentages and add the date for the reported week along with the percentage of discarded samples due to broken data.
Rename the keys to more human friendly names.
- Parameters
data – Data in aggregated form.
sample_count – The number of samples the aggregates were generated from.
broken_ratio – The percentage of samples discarded due to broken data.
inactive_ratio – The percentage of samples discarded due to the client not sending data.
report_date – The starting day for the reported week.
- Returns
An object containing the reported hardware statistics.
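As a rough illustration of the finalization step, the sketch below converts raw counts to ratios and attaches the report metadata. The output key names (`date`, `broken`, `inactive`) and the function name are hypothetical. It also assumes that "percentages" means fractions in the 0 to 1 range, since the validation step in this module checks for sums of roughly 1.0.

```python
from datetime import date


def finalize_sketch(counts, sample_count, broken_ratio, inactive_ratio, report_date):
    """Turn raw per-configuration counts into ratios and add report metadata."""
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    # Each count becomes its share of the samples the aggregates came from.
    report = {key: count / sample_count for key, count in counts.items()}
    # Hypothetical key names for the metadata fields.
    report["date"] = report_date.isoformat()
    report["broken"] = broken_ratio
    report["inactive"] = inactive_ratio
    return report


report = finalize_sketch(
    {"os_Windows": 3, "os_Linux": 1},
    sample_count=4,
    broken_ratio=0.02,
    inactive_ratio=0.1,
    report_date=date(2017, 1, 1),
)
```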
mozetl.hardware_report.summarize_json.generate_report(start_date, end_date, spark)
Generate the hardware survey dataset for the reference timeframe.
If the timeframe is longer than a week, split it into weekly chunks and process each chunk individually (eases backfilling).
The report for each week is saved in a local JSON file.
- Parameters
start_date – The date from which we start generating the report. If None, the report starts from the beginning of the past week (Sunday).
end_date – The date that marks the end of the reporting period. This only makes sense if a |start_date| was provided. If None, this defaults to the end of the past week (Saturday).
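The weekly chunking and the Sunday-to-Saturday defaults can be sketched as below. Both helper names are hypothetical, and truncating the final chunk at `end_date` is an assumption; the actual job may handle a partial last week differently.

```python
from datetime import date, timedelta


def weekly_chunks(start_date, end_date):
    """Yield (chunk_start, chunk_end) pairs covering start_date..end_date inclusive."""
    chunk_start = start_date
    while chunk_start <= end_date:
        # Each chunk spans at most seven days; the last one may be shorter.
        chunk_end = min(chunk_start + timedelta(days=6), end_date)
        yield chunk_start, chunk_end
        chunk_start = chunk_end + timedelta(days=1)


def past_week(today):
    """Return the Sunday..Saturday bounds of the last complete week before today."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    days_since_sunday = (today.weekday() + 1) % 7
    this_sunday = today - timedelta(days=days_since_sunday)
    start = this_sunday - timedelta(days=7)
    return start, start + timedelta(days=6)


chunks = list(weekly_chunks(date(2017, 1, 1), date(2017, 1, 17)))
default_start, default_end = past_week(date(2017, 1, 11))
```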
mozetl.hardware_report.summarize_json.get_OS_arch(browser_arch, os_name, is_wow64)
Infer the OS arch from environment data.
- Parameters
browser_arch – the browser architecture string (either “x86” or “x86-64”).
os_name – the operating system name.
is_wow64 – on Windows, indicates if the browser process is running under WOW64.
- Returns
‘x86’ if the underlying OS is 32bit, ‘x86-64’ if it’s a 64bit OS.
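One plausible way to perform this inference is sketched below. The function name is hypothetical, and treating `"Windows_NT"` as the Windows value of `os_name` is an assumption not stated in this page.

```python
def get_os_arch_sketch(browser_arch, os_name, is_wow64):
    """Infer the OS architecture from the browser build and WOW64 flag."""
    # A 64-bit browser build can only run on a 64-bit OS.
    if browser_arch == "x86-64":
        return "x86-64"
    # A 32-bit browser running under WOW64 means Windows itself is 64-bit.
    # "Windows_NT" as the os_name value is an assumption.
    if os_name == "Windows_NT" and is_wow64:
        return "x86-64"
    # Otherwise the best available guess is a 32-bit OS.
    return "x86"
```

The WOW64 flag is only consulted for 32-bit builds, because that is the one case where the browser architecture and the OS architecture can differ.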
mozetl.hardware_report.summarize_json.get_device_family_chipset(vendor_id, device_id, device_map)
Get the family and chipset strings given the vendor and device ids.
- Parameters
vendor_id – a string representing the vendor id (e.g. ‘0xabcd’).
device_id – a string representing the device id (e.g. ‘0xbcde’).
- Returns
A string in the format “Device Family Name-Chipset Name”.
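A lookup of this shape could be sketched as follows, assuming `device_map` uses the inverted layout documented for `invert_device_map` (vendor id, then device id, then a family and chipset pair). The function name, the sample ids, and the `"Unknown"` fallback for ids missing from the map are hypothetical.

```python
def family_chipset_sketch(vendor_id, device_id, device_map, unknown="Unknown"):
    """Return "Family-Chipset" for the given ids, or a fallback label."""
    devices = device_map.get(vendor_id)
    if devices is None or device_id not in devices:
        # Hypothetical fallback; the real function may handle this differently.
        return unknown
    family, chipset = devices[device_id]
    return "{}-{}".format(family, chipset)


# Made-up ids and names, for illustration only.
device_map = {"0xabcd": {"0xbcde": ["FamilyA", "ChipsetX"]}}
label = family_chipset_sketch("0xabcd", "0xbcde", device_map)
```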
mozetl.hardware_report.summarize_json.get_file_name(suffix='')
Return report file name with date appended.
mozetl.hardware_report.summarize_json.get_latest_valid_per_client(entry, time_start, time_end)
Get the most recently submitted ping for a client within the given timeframe.
Then use the index of that ping to look up the data from the other columns (we can assume that the sizes of these arrays match; otherwise the longitudinal dataset is broken). Once we have the data, we make sure it’s valid and return it.
- Parameters
entry – The record containing all the data for a single client.
time_start – The beginning of the reference timeframe.
time_end – The end of the reference timeframe.
- Returns
An object containing the valid hardware data for the client, or a string describing why the data was discarded: REASON_INACTIVE if the client didn’t submit a ping within the desired timeframe, or REASON_BROKEN_DATA if it sent broken data.
- Raises
ValueError – if the columns within the record have mismatching lengths. This means the longitudinal dataset is corrupted.
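The index-based lookup described for this function can be illustrated with a simplified sketch. It assumes `entry` is a dictionary of parallel lists with a `dates` column, and it treats any `None` value as broken data. The column names, the validity rule, the function name, and the values of the two reason constants are all hypothetical.

```python
from datetime import date

# Hypothetical values; only the constant names appear in the documentation.
REASON_INACTIVE = "inactive"
REASON_BROKEN_DATA = "broken"


def latest_in_window_sketch(entry, time_start, time_end, date_column="dates"):
    """Pick the newest sample inside the window and read it from every column."""
    lengths = {len(column) for column in entry.values()}
    if len(lengths) > 1:
        raise ValueError("columns have mismatching lengths")
    dates = entry[date_column]
    in_window = [i for i, d in enumerate(dates) if time_start <= d <= time_end]
    if not in_window:
        return REASON_INACTIVE
    # The same index addresses the matching value in each parallel column.
    latest = max(in_window, key=lambda i: dates[i])
    record = {name: column[latest] for name, column in entry.items()}
    if any(value is None for value in record.values()):
        return REASON_BROKEN_DATA
    return record


entry = {
    "dates": [date(2017, 1, 2), date(2017, 1, 5), date(2017, 1, 20)],
    "os": ["Linux", "Windows_NT", "Darwin"],
}
record = latest_in_window_sketch(entry, date(2017, 1, 1), date(2017, 1, 7))
```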
mozetl.hardware_report.summarize_json.get_valid_client_record(r, data_index)
Check if the referenced record is sane or contains partial/broken data.
- Parameters
r – The client entry in the longitudinal dataset.
data_index – The index of the sample within the client record.
- Returns
An object containing the client hardware data or REASON_BROKEN_DATA if the data is invalid.
mozetl.hardware_report.summarize_json.invert_device_map(m)
Invert a GPU device map fetched from jrmuizel’s GitHub repo.
- The layout of the fetched GPU map is:
Vendor ID -> Device Family -> Chipset -> [Device IDs]
- We should convert it to:
Vendor ID -> Device ID -> [Device Family, Chipset]
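The inversion between the two layouts above can be sketched with nested loops. The function name and the sample ids are made up for illustration, and the sketch assumes each device id appears under a single family and chipset per vendor.

```python
def invert_device_map_sketch(m):
    """Convert vendor -> family -> chipset -> [device ids]
    into vendor -> device id -> [family, chipset]."""
    inverted = {}
    for vendor_id, families in m.items():
        by_device = inverted.setdefault(vendor_id, {})
        for family, chipsets in families.items():
            for chipset, device_ids in chipsets.items():
                for device_id in device_ids:
                    # If a device id were listed twice, the last entry would win.
                    by_device[device_id] = [family, chipset]
    return inverted


# Made-up sample data in the fetched layout.
fetched = {
    "0xabcd": {
        "FamilyA": {"ChipsetX": ["0x0001", "0x0002"]},
        "FamilyB": {"ChipsetY": ["0x0003"]},
    }
}
inverted = invert_device_map_sketch(fetched)
```

The inverted layout makes the per-ping lookup a pair of dictionary accesses instead of a scan over every family and chipset.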
mozetl.hardware_report.summarize_json.prepare_data(p, device_map)
Prepare data for further analyses.
e.g. unit conversion, vendor id to string, etc.
mozetl.hardware_report.summarize_json.serialize_results(date_to_json)
Save each aggregated data item as an entry in the JSON.
mozetl.hardware_report.summarize_json.store_new_state(source_file_name, s3_dest_file_name, bucket)
Store the new state file to S3.
- Parameters
source_file_name – The name of the local source file.
s3_dest_file_name – The name of the destination file on S3.
mozetl.hardware_report.summarize_json.validate_finalized_data(data)
Validate the aggregated and finalized data.
This checks that the aggregated hardware data object contains all the expected keys and that they sum up to roughly 1.0.
When a problem is found a message is printed to make debugging easier.
- Parameters
data – Data in aggregated form.
- Returns
True if the data validates correctly, False otherwise.
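A validation of this kind might be sketched as below. Grouping keys by a shared prefix (such as `os_`), the 0.01 tolerance, the function name, and the message wording are assumptions; the actual list of expected keys is not given in this page.

```python
import math
from collections import defaultdict


def validate_sketch(data, expected_prefixes, tolerance=0.01):
    """Check that every expected key group is present and sums to roughly 1.0."""
    sums = defaultdict(float)
    for key, value in data.items():
        for prefix in expected_prefixes:
            if key.startswith(prefix):
                sums[prefix] += value
    for prefix in expected_prefixes:
        if prefix not in sums:
            # Printing the failing group makes the problem easier to debug.
            print("Missing keys with prefix: " + prefix)
            return False
        if not math.isclose(sums[prefix], 1.0, abs_tol=tolerance):
            print("Group {} sums to {}, expected ~1.0".format(prefix, sums[prefix]))
            return False
    return True


good = {"os_Windows": 0.75, "os_Linux": 0.25, "arch_x86": 0.5, "arch_x86-64": 0.5}
is_valid = validate_sketch(good, ("os_", "arch_"))
```

A tolerance is needed because ratios derived from counts rarely sum to exactly 1.0 in floating point, especially after rare configurations are collapsed.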