Let soda scan do a "dry run" #2473

Currently it's possible to examine a scan's queries by adding the `--verbose` flag to `soda scan`.

This is useful for seeing which queries were sent, but it would be great to also be able to see which queries would be sent, without actually executing them against the data source.

I'm thinking about a `--dry-run` flag for `soda scan` that would only return the rendered SQL queries. As a user, you could then feed the returned SQL into a cost estimation (see e.g. https://cloud.google.com/bigquery/docs/best-practices-costs#perform-dry-run).
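To make the idea a bit more concrete, here is a minimal sketch of the behaviour I have in mind. It is not based on soda-core's internals: `QueryRunner` and its `dry_run` attribute are hypothetical names, and SQLite only stands in for a real data source.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryRunner:
    """Hypothetical runner: executes SQL, or only records it when dry_run is set."""

    connection: sqlite3.Connection
    dry_run: bool = False
    rendered: List[str] = field(default_factory=list)

    def run(self, sql: str):
        # Always keep the rendered SQL so it can be printed or cost-estimated later.
        self.rendered.append(sql)
        if self.dry_run:
            # In dry-run mode nothing is sent to the data source.
            return None
        return self.connection.execute(sql).fetchall()


# SQLite in-memory database as a stand-in for the real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 9.5), (2, NULL)")

runner = QueryRunner(conn, dry_run=True)
result = runner.run("SELECT COUNT(*) FROM orders WHERE amount IS NULL")

# Print the queries that would have been sent.
for sql in runner.rendered:
    print(sql)
```

With `dry_run=True` the call returns `None` and the query only ends up in `runner.rendered`, which is the list a user could pass on to something like BigQuery's own dry run for a cost estimate.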

Is this something that could fit in your roadmap?

I've been looking at the soda-core codebase and would be interested in contributing such a feature. At the moment it's not clear to me where to start, since SQL queries are resolved through a succession of steps and only become available just before they are run against the data source. The scan logs would of course also need to be adapted to account for the empty query results of a dry run. So any advice would be great :)
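For the logging part, this is roughly what I imagine, again as a sketch under my own assumptions: `check_outcome` and `log_line` are hypothetical helpers, and the "not evaluated" label is only a suggestion, not something soda-core currently emits as far as I know.

```python
from typing import Optional


def check_outcome(row_count: Optional[int], max_allowed: int) -> str:
    """Hypothetical outcome for a 'row_count <= max_allowed' style check."""
    if row_count is None:
        # Dry run: the query was rendered but never executed, so there is no value.
        return "not evaluated"
    return "pass" if row_count <= max_allowed else "fail"


def log_line(check_name: str, sql: str, row_count: Optional[int], max_allowed: int) -> str:
    """Build one log line that always shows the rendered SQL next to the outcome."""
    outcome = check_outcome(row_count, max_allowed)
    return f"{check_name} [{outcome}] query: {sql}"


print(log_line("missing_count(amount) = 0", "SELECT COUNT(*) FROM orders WHERE amount IS NULL", None, 0))
```

The point is that a missing result is reported as its own state instead of being treated as a failed or passed check.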
