@akkomar
Collaborator

@akkomar akkomar commented Nov 15, 2024

@akkomar akkomar force-pushed the wclouser_fxa_db_counts branch 2 times, most recently from 9433224 to f39c372 Compare November 18, 2024 15:27
@akkomar akkomar marked this pull request as ready for review November 19, 2024 13:11
@akkomar akkomar requested a review from a team November 19, 2024 13:14
@clouserw
Member

yessss. Where is the setting for how often it runs?

@akkomar
Collaborator Author

akkomar commented Nov 19, 2024

> yessss. Where is the setting for how often it runs?

It's configured to run in the bqetl_accounts_derived DAG, which is scheduled to run daily. DAG generation magic will make sure that this runs after the MySQL tables are synced to BQ (expand the Integration report for "Fix schema" in the comment above).

Comment on lines 6 to 7
Note: because its source tables are overwritten daily, query used to
populate this table is not idempotent so this table should not be backfilled.
Contributor

@sean-rose sean-rose Nov 19, 2024

As it is, if someone re-runs a historical DAG run then historical data would be incorrectly overwritten with current data (which is highly probable since other normal daily-incremental ETLs exist in the bqetl_accounts_derived DAG).

I'd suggest using BigQuery's time travel feature in this ETL to select the data as it existed at data_interval_end so that any DAG run within the last ~7 days could potentially be backfilled, but trying to backfill further than time travel allows will fail outright rather than incorrectly overwriting data.

Here's an example of an ETL using time travel like that (though this doesn't need to be a script, and can remain a normal query ETL; a script was needed in that case because of the discrepancy between the monthly scheduling and the date partitioning setup).
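
To illustrate the trade-off described above: with time travel, a re-run either reads the snapshot it was meant to read or fails, instead of silently reading today's data. The sketch below is a minimal Python illustration and not part of this PR. `can_backfill` is a hypothetical helper, and the 7-day window is BigQuery's default time travel window, assumed here rather than read from the dataset's configuration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the dataset uses BigQuery's default 7-day time travel window.
TIME_TRAVEL_WINDOW = timedelta(days=7)


def can_backfill(data_interval_end: datetime, now: datetime) -> bool:
    """Return True if a snapshot taken at data_interval_end is still reachable.

    Hypothetical helper: it mirrors the rule that time travel can only read
    table state from within the window, and never from the future.
    """
    age = now - data_interval_end
    return timedelta(0) <= age <= TIME_TRAVEL_WINDOW


now = datetime(2024, 11, 19, 12, 0, tzinfo=timezone.utc)
recent_run = datetime(2024, 11, 15, tzinfo=timezone.utc)  # 4.5 days old
stale_run = datetime(2024, 11, 1, tzinfo=timezone.utc)    # 18.5 days old

print(can_backfill(recent_run, now))
print(can_backfill(stale_run, now))
```

In the real ETL no such check is needed in code: a query whose time travel timestamp falls outside the window is rejected by BigQuery, which is the "fail outright" behaviour suggested in the comment.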

Collaborator Author

Cool, thanks for this idea!

(
SELECT
"accounts_linked_to_google" AS table_name,
COUNT(uid) AS total_rows
Contributor

Are there cases where uid is null in this table?

Collaborator Author

Unlikely, IIUC this query has been running for a while in a custom environment.

Member

I've run it by hand a few times but not "awhile". A NULL there would be a bug in the system. There shouldn't be any.

Contributor

Usually the only reason to use COUNT(expression) rather than COUNT(*) is if the expression might be null and shouldn't be counted in that case.

Or if it was intended to count distinct uid values then it should be COUNT(DISTINCT uid).
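
The difference between the three forms is easy to check on a toy table. The sketch below uses Python's built-in `sqlite3` as a stand-in for BigQuery (an assumption made only so the example runs locally; the NULL-handling rules of `COUNT` shown here are standard SQL). The table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE linked_accounts (uid TEXT, providerId INTEGER)")
conn.executemany(
    "INSERT INTO linked_accounts VALUES (?, ?)",
    [
        ("a", 1),
        ("a", 1),   # same uid linked twice
        ("b", 1),
        (None, 1),  # a NULL uid, which "would be a bug in the system"
    ],
)

counts = conn.execute(
    """
    SELECT
      COUNT(*),             -- every row, NULL uid included
      COUNT(uid),           -- rows where uid IS NOT NULL
      COUNT(DISTINCT uid)   -- distinct non-NULL uid values
    FROM linked_accounts
    """
).fetchone()

print(counts)  # (4, 3, 2)
```

If `uid` is never NULL and never repeated, all three return the same number, so the choice mostly documents intent.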

@akkomar akkomar force-pushed the wclouser_fxa_db_counts branch 2 times, most recently from 8aa5367 to c98a15f Compare November 20, 2024 10:41
@akkomar akkomar force-pushed the wclouser_fxa_db_counts branch from c98a15f to ba2bd7c Compare November 20, 2024 12:42
@akkomar akkomar requested a review from sean-rose November 20, 2024 13:09
Comment on lines 228 to 245
SELECT
"accounts_with_secondary_emails" AS table_name,
COUNT(
DISTINCT `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid
) AS total_rows
FROM
`moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
@as_of_date,
'UTC'
)
JOIN
`moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
@as_of_date,
'UTC'
)
ON `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid = `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.uid
WHERE
`moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.isPrimary = FALSE
Contributor

Using table aliases would make this less verbose/more readable.

Suggested change
-SELECT
-  "accounts_with_secondary_emails" AS table_name,
-  COUNT(
-    DISTINCT `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid
-  ) AS total_rows
-FROM
-  `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
-    @as_of_date,
-    'UTC'
-  )
-JOIN
-  `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
-    @as_of_date,
-    'UTC'
-  )
-  ON `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid = `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.uid
-WHERE
-  `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.isPrimary = FALSE
+SELECT
+  "accounts_with_secondary_emails" AS table_name,
+  COUNT(DISTINCT fxa_accounts_v1.uid) AS total_rows
+FROM
+  `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1` AS fxa_accounts_v1 FOR SYSTEM_TIME AS OF TIMESTAMP(
+    @as_of_date,
+    'UTC'
+  )
+JOIN
+  `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1` AS fxa_emails_v1 FOR SYSTEM_TIME AS OF TIMESTAMP(
+    @as_of_date,
+    'UTC'
+  )
+  ON fxa_accounts_v1.uid = fxa_emails_v1.uid
+WHERE
+  fxa_emails_v1.isPrimary = FALSE

(same goes for the similar subquery below)

Collaborator Author

This would unfortunately fail on formatting. SQLGlot doesn't seem to be able to handle table aliases combined with another expression following them (FOR SYSTEM_TIME here).

Contributor

Dang. I'd suggest reporting that SQLGlot issue, as they're very responsive and have fixed every issue I've reported to them (often the same day on main, to be included in the next release).

Contributor

Another option: if you don't quote the whole table identifier (e.g. `moz-fx-data-shared-prod`.accounts_db_external.fxa_accounts_v1), BigQuery lets you refer to the table by just its final name segment, like an automatic alias (though I do generally prefer to quote the entire table identifier).

@akkomar akkomar force-pushed the wclouser_fxa_db_counts branch from ba2bd7c to cb9478e Compare November 20, 2024 18:22
…g_db_counts_v1/metadata.yaml

Co-authored-by: Sean Rose <[email protected]>
@akkomar akkomar requested a review from sean-rose December 3, 2024 13:34
COUNT(*) AS total_rows
FROM
`moz-fx-data-shared-prod.accounts_db_external.fxa_account_groups_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
@as_of_date,
Contributor

All of the other time travel expressions in this query should also use @as_of_date + 1 like the first one does, otherwise the data will be inconsistent.
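
One way to read this: `TIMESTAMP(@as_of_date + 1, 'UTC')` is midnight UTC at the start of the following day, i.e. the state of each table once the `as_of_date` day is complete, and every subquery has to read at that same instant for the counts to be comparable. The Python sketch below is only an illustration under that reading; `snapshot_instant`, `count_subquery`, and the two-table mapping are hypothetical and are not how bigquery-etl generates this query.

```python
from datetime import date, datetime, time, timedelta, timezone


def snapshot_instant(as_of_date: date) -> datetime:
    """Python analogue of TIMESTAMP(@as_of_date + 1, 'UTC')."""
    next_day = as_of_date + timedelta(days=1)
    return datetime.combine(next_day, time.min, tzinfo=timezone.utc)


# A single shared clause, so no subquery can drift to a different instant.
TIME_TRAVEL_CLAUSE = "FOR SYSTEM_TIME AS OF TIMESTAMP(@as_of_date + 1, 'UTC')"


def count_subquery(label: str, table: str) -> str:
    """Build one row-count subquery that reads the table at the shared instant."""
    return (
        f"SELECT '{label}' AS table_name, COUNT(*) AS total_rows "
        f"FROM `{table}` {TIME_TRAVEL_CLAUSE}"
    )


# Two of the tables from this PR, used here only as sample input.
tables = {
    "account_customers": "moz-fx-data-shared-prod.accounts_db_external.fxa_account_customers_v1",
    "account_groups": "moz-fx-data-shared-prod.accounts_db_external.fxa_account_groups_v1",
}

query = "\nUNION ALL\n".join(
    count_subquery(label, table) for label, table in tables.items()
)

print(snapshot_instant(date(2024, 12, 3)))
print(query)
```

Mixing `@as_of_date` and `@as_of_date + 1` across subqueries would compare snapshots taken a day apart, which is the inconsistency flagged in this comment.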

@dataops-ci-bot
Integration report for "s/@as_of_date,/@as_of_date+1,/g"

sql.diff

diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_accounts_derived.py /tmp/workspace/generated-sql/dags/bqetl_accounts_derived.py
--- /tmp/workspace/main-generated-sql/dags/bqetl_accounts_derived.py	2024-12-03 22:10:39.000000000 +0000
+++ /tmp/workspace/generated-sql/dags/bqetl_accounts_derived.py	2024-12-03 22:12:39.000000000 +0000
@@ -66,6 +66,17 @@
         pool="DATA_ENG_EXTERNALTASKSENSOR",
     )
 
+    accounts_backend_derived__monitoring_db_counts__v1 = bigquery_etl_query(
+        task_id="accounts_backend_derived__monitoring_db_counts__v1",
+        destination_table="monitoring_db_counts_v1",
+        dataset_id="accounts_backend_derived",
+        project_id="moz-fx-data-shared-prod",
+        owner="[email protected]",
+        email=["[email protected]", "[email protected]", "[email protected]"],
+        date_partition_parameter="as_of_date",
+        depends_on_past=False,
+    )
+
     accounts_backend_derived__users_services_daily__v1 = bigquery_etl_query(
         task_id="accounts_backend_derived__users_services_daily__v1",
         destination_table="users_services_daily_v1",
Only in /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived: monitoring_db_counts_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/metadata.yaml	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/metadata.yaml	2024-12-03 22:07:18.000000000 +0000
@@ -0,0 +1,56 @@
+friendly_name: FxA DB Counts Monitoring
+description: |-
+  Simple aggregation of counts of records in the FxA DB tables.
+  Enables to identify trends within accounts data. E.g. "How many
+  inactive accounts are there?"
+owners:
+- [email protected]
+labels:
+  incremental: true
+  owner1: wclouser
+  dag: bqetl_accounts_derived
+scheduling:
+  dag_name: bqetl_accounts_derived
+  date_partition_parameter: as_of_date
+bigquery:
+  time_partitioning:
+    type: day
+    field: as_of_date
+    require_partition_filter: false
+    expiration_days: null
+  range_partitioning: null
+  clustering: null
+workgroup_access:
+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
+references:
+  query.sql:
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_account_customers_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_account_groups_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_account_reset_tokens_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_carts_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_device_commands_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_devices_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_email_bounces_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_linked_accounts_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_oauth_codes_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_oauth_refresh_tokens_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_oauth_tokens_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_password_change_tokens_v1 FOR
+    SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_password_forgot_tokens_v1 FOR
+    SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_paypal_customers_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_recovery_codes_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_security_events_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_sent_emails_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_session_tokens_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_signin_codes_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_totp_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_unblock_codes_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_unverified_tokens_v1 FOR SYSTEM_TIME
+  - moz-fx-data-shared-prod.accounts_db_external.fxa_verification_reminders_v1 FOR
+    SYSTEM_TIME
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/query.sql	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/query.sql	2024-12-03 22:04:57.000000000 +0000
@@ -0,0 +1,300 @@
+WITH table_counts AS (
+  SELECT
+    'account_customers' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_account_customers_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'account_groups' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_account_groups_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'account_reset_tokens' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_account_reset_tokens_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'accounts' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'carts' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_carts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'device_commands' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_device_commands_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'devices' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_devices_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'email_bounces' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_email_bounces_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'emails' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'linked_accounts' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_linked_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'oauth_codes' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_oauth_codes_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'oauth_refresh_tokens' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_oauth_refresh_tokens_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'oauth_tokens' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_oauth_tokens_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'password_change_tokens' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_password_change_tokens_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'password_forgot_tokens' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_password_forgot_tokens_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'paypal_customers' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_paypal_customers_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'recovery_codes' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_recovery_codes_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'security_events' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_security_events_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'sent_emails' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_sent_emails_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'session_tokens' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_session_tokens_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'signin_codes' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_signin_codes_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'totp' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_totp_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'unblock_codes' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_unblock_codes_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'unverified_tokens' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_unverified_tokens_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+  SELECT
+    'verification_reminders' AS table_name,
+    COUNT(*) AS total_rows
+  FROM
+    `moz-fx-data-shared-prod.accounts_db_external.fxa_verification_reminders_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+      @as_of_date + 1,
+      'UTC'
+    )
+  UNION ALL
+    (
+      SELECT
+        "accounts_with_secondary_emails" AS table_name,
+        COUNT(
+          DISTINCT `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid
+        ) AS total_rows
+      FROM
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+          @as_of_date + 1,
+          'UTC'
+        )
+      JOIN
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+          @as_of_date + 1,
+          'UTC'
+        )
+        ON `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid = `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.uid
+      WHERE
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.isPrimary = FALSE
+    )
+  UNION ALL
+    (
+      SELECT
+        "accounts_with_unverified_emails" AS table_name,
+        COUNT(
+          DISTINCT `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid
+        ) AS total_rows
+      FROM
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+          @as_of_date + 1,
+          'UTC'
+        )
+      JOIN
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+          @as_of_date + 1,
+          'UTC'
+        )
+        ON `moz-fx-data-shared-prod.accounts_db_external.fxa_accounts_v1`.uid = `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.uid
+      WHERE
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_emails_v1`.isVerified = FALSE
+    )
+  UNION ALL
+    (
+      SELECT
+        "accounts_linked_to_google" AS table_name,
+        COUNT(uid) AS total_rows
+      FROM
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_linked_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+          @as_of_date + 1,
+          'UTC'
+        )
+      WHERE
+        providerId = 1 -- see LinkedAccountProviderIds at https://github.com/mozilla/fxa/blob/main/packages/fxa-settings/src/lib/types.ts
+    )
+  UNION ALL
+    (
+      SELECT
+        "accounts_linked_to_apple" AS table_name,
+        COUNT(uid) AS total_rows
+      FROM
+        `moz-fx-data-shared-prod.accounts_db_external.fxa_linked_accounts_v1` FOR SYSTEM_TIME AS OF TIMESTAMP(
+          @as_of_date + 1,
+          'UTC'
+        )
+      WHERE
+        providerId = 2 -- see LinkedAccountProviderIds at https://github.com/mozilla/fxa/blob/main/packages/fxa-settings/src/lib/types.ts
+    )
+)
+SELECT
+  @as_of_date AS as_of_date,
+  table_name,
+  total_rows
+FROM
+  table_counts
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/schema.yaml	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/accounts_backend_derived/monitoring_db_counts_v1/schema.yaml	2024-12-03 22:04:57.000000000 +0000
@@ -0,0 +1,10 @@
+fields:
+- name: as_of_date
+  type: DATE
+  mode: NULLABLE
+- name: table_name
+  type: STRING
+  mode: NULLABLE
+- name: total_rows
+  type: INTEGER
+  mode: NULLABLE
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/apple_ads_external/ios_app_campaign_stats_v1/bigconfig.yml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/apple_ads_external/ios_app_campaign_stats_v1/bigconfig.yml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/apple_ads_external/ios_app_campaign_stats_v1/bigconfig.yml	2024-12-03 22:05:06.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/apple_ads_external/ios_app_campaign_stats_v1/bigconfig.yml	2024-12-03 22:09:02.000000000 +0000
@@ -1,7 +1,6 @@
 type: BIGCONFIG_FILE
-
 tag_deployments:
-  - collection:
+- collection:
       name: Growth Program
     deployments:
       - column_selectors:
@@ -18,8 +17,7 @@
         metrics:
           - saved_metric_id: freshness
           - saved_metric_id: volume
-
-  - collection:
+- collection:
       name: Operational Checks
     deployments:
       - column_selectors:
@@ -27,3 +25,21 @@
         metrics:
           - saved_metric_id: freshness
           - saved_metric_id: volume
+- deployments:
+  - column_selectors:
+    - name: moz-fx-data-shared-prod.moz-fx-data-shared-prod.apple_ads_external.ios_app_campaign_stats_v1.*
+    metrics:
+    - metric_type:
+        type: PREDEFINED
+        predefined_metric: FRESHNESS
+      metric_name: FRESHNESS [warn]
+      metric_schedule:
+        named_schedule:
+          name: Default Schedule - 13:00 UTC
+    - metric_type:
+        type: PREDEFINED
+        predefined_metric: VOLUME
+      metric_name: VOLUME [fail]
+      metric_schedule:
+        named_schedule:
+          name: Default Schedule - 13:00 UTC

@akkomar akkomar added this pull request to the merge queue Dec 4, 2024
Merged via the queue into main with commit aa5073d Dec 4, 2024
21 checks passed
@akkomar akkomar deleted the wclouser_fxa_db_counts branch December 4, 2024 14:01
@clouserw
Member

clouserw commented Dec 5, 2024

Thanks for landing. I see data in the table. I created mozilla/lookml-generator#1116 to be able to access it in Looker.
