feat: Add column name mappings option #1649
FelipeKalinoski wants to merge 3 commits into GoogleCloudPlatform:develop
…generation
- Added support for --column-name-map in ConfigManager for primary keys, comparison fields, and calculated fields.
- Fixed circular dependency and iterative calculated field resolution in __main__.py.
- Added comprehensive system tests in test_oracle.py for row validation and partition generation with mapping.
- Patched unit tests in test__main.py to correctly mock GCS calls in local environments.
- Verified all 365 unit tests and new system tests pass.
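To illustrate the kind of source-to-target name resolution the commit message describes, here is a minimal sketch. It is not the PR's implementation: the helper names (parse_column_name_map, resolve_target_columns), the `source=target` comma-separated syntax, and the case-insensitive lookup are all assumptions made for the example.

```python
from typing import Dict, List


def parse_column_name_map(raw: str) -> Dict[str, str]:
    """Parse 'src1=tgt1,src2=tgt2' into a lookup keyed by case-folded source name.

    The 'source=target' syntax is an assumption for this sketch only.
    """
    mapping: Dict[str, str] = {}
    for pair in filter(None, (p.strip() for p in raw.split(","))):
        source, sep, target = pair.partition("=")
        if not sep or not source.strip() or not target.strip():
            raise ValueError(f"Malformed mapping entry: {pair!r}")
        mapping[source.strip().casefold()] = target.strip()
    return mapping


def resolve_target_columns(source_columns: List[str], mapping: Dict[str, str]) -> List[str]:
    """Return the target-side name for each source column.

    Columns without a mapping entry pass through unchanged.
    """
    return [mapping.get(col.casefold(), col) for col in source_columns]


column_map = parse_column_name_map("CUST_ID=customer_id, ORDER_DT=order_date")
targets = resolve_target_columns(["cust_id", "AMOUNT", "ORDER_DT"], column_map)
print(targets)
```

Letting unmapped columns pass through means the same lookup can serve primary keys, comparison fields, and calculated-field inputs, which is one plausible reason to centralise it.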
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
I tested the changes and my previously passing test
nj1973 left a comment
Requested a change to oracle integration tests.
I also need to try and debug the changes for column validation which no longer works.
class MockIbisClient:
Did you add this so you could test without BigQuery?
I tested test_row_validation_column_name_map_to_bigquery without this and it passed.
e.g.:
-@mock.patch("data_validation.clients.get_data_client", side_effect=mock_get_data_client)
+# @mock.patch("data_validation.clients.get_data_client", side_effect=mock_get_data_client)
 @mock.patch(
     "data_validation.state_manager.StateManager.get_connection_config",
     new=mock_get_connection_config,
 )
-def test_row_validation_column_name_map_to_bigquery(mock_get_client):
+# def test_row_validation_column_name_map_to_bigquery(mock_get_client):
+def test_row_validation_column_name_map_to_bigquery():
Ideally I'd rather these integration tests run with real tables and not mocked ones.
I appreciate this makes it harder for you to test but it matches other tests in this file.
Can we remove the mocked client please?
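For context on why the suggested diff also drops the mock_get_client parameter: a patch created with side_effect builds a MagicMock and passes it to the decorated function, while a patch created with new= passes nothing. A minimal sketch of that behaviour follows; backend, fake_get_client and fake_get_config are hypothetical stand-ins, not the real data_validation modules.

```python
import types
from unittest import mock

# Hypothetical stand-in for the patched modules (assumption for this sketch).
backend = types.SimpleNamespace(
    get_client=lambda: "real client",
    get_config=lambda: "real config",
)


def fake_get_client():
    return "fake client"


def fake_get_config():
    return "fake config"


# side_effect= makes patch create a MagicMock and inject it as an argument.
# new= swaps the attribute directly, so no argument is injected for it.
@mock.patch.object(backend, "get_client", side_effect=fake_get_client)
@mock.patch.object(backend, "get_config", new=fake_get_config)
def with_client_mock(mock_get_client):
    client = backend.get_client()
    return client, backend.get_config(), mock_get_client.call_count


# With the client patch removed, the test takes no parameters and the
# real get_client runs, while get_config is still replaced.
@mock.patch.object(backend, "get_config", new=fake_get_config)
def without_client_mock():
    return backend.get_client(), backend.get_config()


mocked_result = with_client_mock()
unmocked_result = without_client_mock()
```

After both calls return, the original attributes on backend are restored, so the two variants do not interfere with each other.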
Yes, I did use mock tables for testing without BigQuery. We can delete it.
@nj1973 did you create a unit test for testing column validation or were you executing a "manual" test?
##PLEASE CAREFULLY REVIEW THE CHANGES BEFORE ACCEPTING IT##
Description of changes
This PR implements support for column name mapping (--column-name-map) across row validation and partition generation. Key changes include:
- Updated build_config_comparison_fields, build_dependent_aliases, and build_column_configs to correctly resolve source column names to their target equivalents using the mapping provided.
- Updated _get_calculated_config and _get_source/target_ibis_calculated_table to iteratively apply calculated fields, resolving issues where dependencies (like hash__all) were not found when mappings were active.
- Added system tests in tests/system/data_sources/test_oracle.py covering row validation with --concat=* and partition generation with mapped primary keys.
- Patched tests/unit/test__main.py to correctly mock GCS directory listing, ensuring local test runs pass without requiring live Google Cloud credentials.
Issues to be closed
Closes #1617
Checklist
- CONTRIBUTING Guide.
- tests/local_check.sh script
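The iterative application of calculated fields mentioned in the description can be pictured as resolving fields in passes, where each pass only uses columns produced earlier. The sketch below is an illustration under stated assumptions, not the code in this PR: order_calculated_fields is a hypothetical helper, and every field name other than hash__all is invented for the example.

```python
from typing import Dict, List, Set


def order_calculated_fields(
    dependencies: Dict[str, List[str]], base_columns: Set[str]
) -> List[List[str]]:
    """Group calculated fields into passes; each pass needs only earlier results."""
    available = set(base_columns)
    pending = dict(dependencies)
    passes: List[List[str]] = []
    while pending:
        # A field is ready once all of its inputs already exist.
        ready = sorted(
            name for name, deps in pending.items() if all(d in available for d in deps)
        )
        if not ready:
            # Nothing can progress: a dependency is missing or circular.
            raise ValueError(f"Unresolvable or circular dependencies: {sorted(pending)}")
        passes.append(ready)
        available.update(ready)
        for name in ready:
            del pending[name]
    return passes


# Illustrative field names (assumptions), already expressed in mapped column names.
fields = {
    "cast__customer_id": ["customer_id"],
    "cast__order_date": ["order_date"],
    "concat__all": ["cast__customer_id", "cast__order_date"],
    "hash__all": ["concat__all"],
}
passes = order_calculated_fields(fields, {"customer_id", "order_date"})
print(passes)
```

Raising on an empty pass makes a missing dependency fail immediately with the offending field names, instead of surfacing later as a column-not-found error.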