sqlcompyre.analysis package
Subpackages
- sqlcompyre.analysis.dialects package
- Submodules
- Module contents
DialectProtocolDialectProtocol.case_insensitive_collationDialectProtocol.case_sensitive_collationDialectProtocol.get_table_creation_timestamps()DialectProtocol.nameDialectProtocol.supports_multi_part_schemasDialectProtocol.supports_schemasDialectProtocol.verbose_nameDialectProtocol.views_support_notnull_columns
Submodules
Module contents
- class sqlcompyre.analysis.QueryInspection(engine: Engine, query: FromClause)[source]
Bases:
objectInspect the results of a SQL query.
- Note:
This class should never be initialized directly. Use the
inspect()orinspect_table()functions instead.
- Attributes:
- row_count
Get the number of rows returned by the query.
Methods
column_stats(column)Obtain statistics about a single column.
distinct_row_count(*columns)Get the number of rows with distinct values wrt.
- column_stats(column: str) ColumnStats[source]
Obtain statistics about a single column.
- Args:
column: The name of the column to obtain information about.
- Returns:
An object providing access to column statistics.
- distinct_row_count(*columns: str) int[source]
Get the number of rows with distinct values wrt. to the provided column(s).
- Args:
- columns: The set of columns to compute the number of distinct values for. If no
columns are provided, the number of distinct rows (across all columns) is computed.
- Returns:
The number of distinct rows.
- property row_count: int
Get the number of rows returned by the query.
- class sqlcompyre.analysis.SchemaComparison(engine: Engine, left_schema: str, right_schema: str, left_tables: dict[str, Table], right_tables: dict[str, Table], float_precision: float, collation: str | None, ignore_casing: bool)[source]
Bases:
objectCompare the table metadata from two different SQL database schemas.
- Note:
This class should never be initialized directly. Use the
compare_schemas()function instead.
- Attributes:
- table_counts
A comparison between the number of tables in each schema.
- table_names
A comparison between the table names of the two schemas.
Methods
compare_matched_table(name[, join_columns, ...])A full comparison between the tables in "left" and "right" schema with the specified name.
Generate a report that summarizes the schema comparison.
table_reports([ignore_tables, ...])Generate reports for all tables matched between the schemas.
- compare_matched_table(name: str, join_columns: list[str] | None = None, ignore_columns: list[str] | None = None, column_name_mapping: dict[str, str] | None = None, infer_primary_keys: bool = False) TableComparison[source]
A full comparison between the tables in “left” and “right” schema with the specified name. The name must be available in
table_names.in_common.- Args:
- name: The name of the table (possibly including a schema name if this schema comparison
compares multi-part schemas).
- join_columns: The columns to join the tables on in order to compare column values. If
not provided, defaults to the union of primary keys. The corresponding primary key(s) of the right table are determined via
column_name_mapping.- ignore_columns: Columns to ignore to evaluate equality. These column names should
reference the left table: corresponding columns in the right table are determined via
column_name_mapping. Primary key columns passed here are ignored.- column_name_mapping: A mapping from column names in the left table to column names in
the right table. If not provided, defaults to mapping columns with the same names.
- infer_primary_keys: Whether primary keys should be inferred if the tables do not have
matching primary key(s).
- Returns:
The full comparison between the tables.
- table_reports(ignore_tables: list[str] | None = None, ignore_table_columns: dict[str, list[str]] | None = None, skip_equal: bool = False, infer_primary_keys: bool = False, sort_by: Literal['name', 'creation_timestamp'] = 'name', verbose: bool = True) dict[str, Report][source]
Generate reports for all tables matched between the schemas.
- Args:
- ignore_tables: A list of regexes specifying tables to ignore completely. Consider
using
^and$to match full table names.- ignore_table_columns: A mapping from table names to column names. The provided columns
are ignored to evaluate equality and dropped from all reports. Different to the behavior for
ignore_tables, table names are not considered to be regular expressions.- ignore: A mapping from table names to column names. Each table which provides no
column names is ignored completely. If at least one column name is provided, the provided columns are ignored to evaluate equality.
- skip_equal: Whether to skip reports about tables which are equal in the two schemas.
Tables that cannot be compared for equality (e.g. because they have no primary key) are still included.
- infer_primary_keys: Whether primary keys should be inferred if compared tables do not
have matching primary key(s).
- sort_by: The strategy for sorting the table reports. Either alphabetically by the name
of the tables (
name) or chronologically by the creation timestamp of the tables in the “left” schema (creation_timestamp). Note that the latter option is not supported for all database systems.
verbose: Whether to show a progress bar for the report generation.
- Returns:
A mapping from matched table names to the reports of their comparisons, ordered by the sort strategy.
- class sqlcompyre.analysis.TableComparison(engine: Engine, left_table: FromClause, right_table: FromClause, join_columns: list[str] | None, column_name_mapping: dict[str, str] | None, ignore_columns: list[str] | None, float_precision: float, collation: str | None, ignore_casing: bool, infer_primary_keys: bool)[source]
Bases:
objectCompare the content of two SQL database tables.
- Note:
This class should never be initialized directly. Use the
compare_tables()function instead.
- Attributes:
- column_matches
A comparison between the column values of the two tables.
- column_names
A comparison between the column names of the two tables.
- equal
Whether the compared tables are equal.
- join_columns
The columns used for joining the two tables.
- row_counts
A comparison between the number of rows in each table.
- row_matches
A comparison between the contents of the individual rows in the two tables.
Methods
get_top_changes(column_name[, n])Gets the most common changes in a single column.
Generate a report that summarizes the table comparison.
- property column_matches: ColumnMatches
A comparison between the column values of the two tables.
- property equal: bool
Whether the compared tables are equal.
- get_top_changes(column_name: str, n: int = 5) dict[str, int][source]
Gets the most common changes in a single column.
- Args:
column_name: The name of the column in the “left” table. n: The number of changes to get.
- Returns:
A dictionary where the key is a change in string form (maybe “true -> false”) and the value is the number of times the change occurs.
- property join_columns: list[str]
The columns used for joining the two tables.
- property row_matches: RowMatches
A comparison between the contents of the individual rows in the two tables.