
Add Snowflake Horizon community contributed connector #30

Merged
vinitachaudhary merged 10 commits into GoogleCloudPlatform:main from vasumittal-googler:add-snowflake-horizon-connector
Feb 10, 2026

Conversation

@vasumittal-googler
Contributor

No description provided.

Collaborator

Can this file be removed since it doesn't contain any useful info?

Contributor Author

Done. Please review.

Foreword
In today's complex data landscape, organizations increasingly recognize data as a critical asset. The ability to effectively discover, understand, and govern this data is paramount for informed decision-making, regulatory compliance, and innovation. As data ecosystems grow, spanning various platforms and technologies, maintaining a holistic view of data assets becomes challenging.
This connector addresses a key need for many enterprises: bridging the gap between their data warehousing capabilities and the comprehensive data governance and discovery features offered by Google Cloud's Dataplex. Dataplex provides a unified data fabric to manage, monitor, and govern data across diverse environments within Google Cloud.
The Snowflake to Dataplex Data Catalog Connector, detailed in this guide, is a testament to the power of seamless integration. It's designed to automate metadata synchronization, bringing the rich context of your data into sataplex. This not only enhances data visibility and accessibility for all stakeholders but also strengthens your data governance by centralizing metadata management, lineage tracking, and data quality initiatives.
Collaborator

Fix typo: sataplex -> dataplex.

Contributor Author

Done. Please review.

Comment on lines 69 to 71
SNOWFLAKE_WAREHOUSE = 'Enter your Warehouse'
SNOWFLAKE_DATABASE = 'Enter your Database'
SNOWFLAKE_SCHEMA = 'Enter your Schema'
Collaborator

Move all user input to top and mention in README.

Contributor Author

Done in both Script and README as well. Please review.
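As a sketch of what the consolidated configuration could look like once all user inputs sit at the top of the script (the variable names come from the snippet under review; the validation helper is a hypothetical addition):

```python
# --- User configuration (edit these before running the script) ---
# Placeholder values, as in the snippet under discussion.
SNOWFLAKE_WAREHOUSE = "Enter your Warehouse"
SNOWFLAKE_DATABASE = "Enter your Database"
SNOWFLAKE_SCHEMA = "Enter your Schema"

REQUIRED_SETTINGS = {
    "SNOWFLAKE_WAREHOUSE": SNOWFLAKE_WAREHOUSE,
    "SNOWFLAKE_DATABASE": SNOWFLAKE_DATABASE,
    "SNOWFLAKE_SCHEMA": SNOWFLAKE_SCHEMA,
}

def unset_settings(settings):
    """Return the names of settings still carrying placeholder values."""
    return [k for k, v in settings.items() if v.startswith("Enter your")]
```

Grouping the inputs this way lets the README point at a single block the user must edit, instead of values scattered through the script.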

print(entry_id)

entry_group_id = "snowflakehorizongrp" #enter your entry group id for snowflake
entry_type_id = "snowhorizondb" #enter your entry type id for snowflake dbs
Collaborator

These are also user configs? Please move to top and elaborate.

Contributor Author

These are part of a one-time setup activity performed by the user (Step 3 in the README). I have now documented this in detail in Step 3 of the README as well. Also, these cannot be moved to the top because they are section specific: for example, for Snowflake databases the entry type is "snowhorizondb", while for Snowflake tags it is "snowhorizontag". Hence, I have added a detailed comment in the script so that the user knows these values come from the one-time setup steps in the README file.
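The section-specific pairing described above can be sketched as a small lookup table. Only "snowflakehorizongrp", "snowhorizondb", and "snowhorizontag" come from the thread; the section keys and helper are illustrative, not the connector's actual code:

```python
# Sketch: section-specific Dataplex IDs from the one-time setup (README Step 3).
ENTRY_GROUP_ID = "snowflakehorizongrp"  # entry group for Snowflake Horizon

# One entry type per metadata section, as explained in the reply above.
ENTRY_TYPE_BY_SECTION = {
    "databases": "snowhorizondb",   # Snowflake databases
    "tags": "snowhorizontag",       # Snowflake tags
}

def entry_type_for(section):
    """Look up the entry type ID configured for a metadata section."""
    return ENTRY_TYPE_BY_SECTION[section]
```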


#### Step 2: Storing the connection details in Secret Manager.

So, In **Google Cloud Console** -> Navigate to **Secret Manager** -> **Create Secret** ->
Collaborator

Remove "So"

Contributor Author

Done. Please Review.
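Reading the stored details back might look like the sketch below, assuming the google-cloud-secret-manager client library; the project and secret IDs are placeholders, not values from the connector:

```python
def secret_version_name(project_id, secret_id, version="latest"):
    """Build the fully qualified resource name Secret Manager expects."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"

def fetch_secret(project_id, secret_id):
    """Fetch and decode one secret version (requires google-cloud-secret-manager)."""
    from google.cloud import secretmanager  # third-party; imported lazily
    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(
        name=secret_version_name(project_id, secret_id))
    return response.payload.data.decode("utf-8")
```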

* SNOWFLAKE_DATABASE = 'Enter your Database'
* SNOWFLAKE_SCHEMA = 'Enter your Schema'

This process involves several steps:
Collaborator

This can be removed since the steps are detailed above.

Contributor Author

Done. Please Review.

* Then, Snowflake connection details are retrieved from Secret Manager.
* Following this, a connection to Snowflake is established to extract Horizon catalog data, which is subsequently loaded directly into Dataplex.
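The extract-and-load flow in the bullets above can be sketched roughly as follows. Library calls are illustrative (snowflake-connector-python and the Dataplex client are assumed), and all identifiers are placeholders:

```python
import json

def dataplex_entry_name(project, location, entry_group, entry_id):
    """Build the resource name under which an entry is created in Dataplex."""
    return (f"projects/{project}/locations/{location}"
            f"/entryGroups/{entry_group}/entries/{entry_id}")

def extract_horizon_databases(conn_details_json):
    """Connect to Snowflake and read Horizon catalog rows (sketch only)."""
    import snowflake.connector  # third-party; assumed installed
    creds = json.loads(conn_details_json)
    conn = snowflake.connector.connect(**creds)
    try:
        rows = conn.cursor().execute(
            "SELECT database_name FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASES"
        ).fetchall()
    finally:
        conn.close()
    # ... one Dataplex entry would then be created per row ...
    return rows
```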

#### Step 5: Validate everything in Dataplex.
Collaborator

Please add details on how to validate, like navigate to the EntryGroup you specified or search using a particular syntax etc.

Contributor Author

Done. Please Review.
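Besides navigating to the entry group in the console, validation could be scripted along these lines, assuming the google-cloud-dataplex client; the search query shown is an assumption to be adapted from the Dataplex Universal Catalog search documentation:

```python
def search_scope(project_id):
    """Searches run against a project-level scope resource name."""
    return f"projects/{project_id}/locations/global"

def find_snowflake_entries(project_id, query="snowflakehorizongrp"):
    """List catalog entries matching the query (requires google-cloud-dataplex)."""
    from google.cloud import dataplex_v1  # third-party; assumed installed
    client = dataplex_v1.CatalogServiceClient()
    return list(client.search_entries(name=search_scope(project_id), query=query))
```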


This connector addresses a key need for many enterprises: bridging the gap between their data warehousing capabilities and the comprehensive data governance and discovery features offered by Google Cloud's Dataplex. Dataplex provides a unified data fabric to manage, monitor, and govern data across diverse environments within Google Cloud.

The **Snowflake to Dataplex Data Catalog Connector**, detailed here, is a testament to the power of seamless integration. It's designed to automate metadata synchronization, bringing the rich context of your data into dataplex. This not only enhances data visibility and accessibility for all stakeholders but also strengthens your data governance by centralizing metadata management, lineage tracking, and data quality initiatives.
Collaborator

Please use Dataplex Universal Catalog instead of Dataplex Data Catalog everywhere to avoid confusion with the legacy Data Catalog product.

Contributor Author

Done. Have checked it everywhere now. Thanks!

#### Step 1: Setting up the Snowflake environment from which the metadata will be loaded.
To access the Horizon catalog in Snowflake, you will need to use the **ACCOUNT_USAGE** views located under the **SNOWFLAKE** database.

![Snowflake Environment Setup](images/SnowflakeSetup.png)
Collaborator

Images are not loading correctly. Please use syntax: <img alt="create workflow" src="images/create_workflow.png" width="600">. Please see example here: https://github.com/GoogleCloudPlatform/cloud-dataplex/blob/main/managed-connectivity/cloud-workflows/README.md?plain=1

Contributor Author

Done. Have used the above mentioned syntax for all the images. Please Review.
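The ACCOUNT_USAGE views referenced in Step 1 can be queried along these lines. View and column names follow Snowflake's documented ACCOUNT_USAGE schema; the filtering and the choice of columns are illustrative:

```python
# Example queries against the SNOWFLAKE.ACCOUNT_USAGE views (sketch).
HORIZON_QUERIES = {
    "tables": (
        "SELECT table_catalog, table_schema, table_name "
        "FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES WHERE deleted IS NULL"
    ),
    "tags": (
        "SELECT tag_name, tag_value, object_name "
        "FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES"
    ),
}
```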

![New Field](images/NewFieldScreen.png)
* Similarly, add all the fields (metadata that you are trying to bring from Snowflake Horizon) and click on "Save":
![FinalAspectType](images/FinalAspectType.png)
Similarly, create all the aspect types mentioned above one by one. The required fields for each aspect type are listed in the Python script (if you plan to execute the script as is).
Collaborator

It might be better to list all the aspectType templates and entryType definitions here, rather than asking users to find them in the script. Better yet, if you can create a bash script to create these entryTypes, aspectTypes and entryGroups, it might provide a more seamless experience. For example: https://github.com/GoogleCloudPlatform/cloud-dataplex/blob/main/managed-connectivity/cloud-workflows/samples/scripts/gcloud/execute.sh

Contributor Author

Thank you for the feedback. I have updated the README.md to explicitly list all AspectType and EntryType definitions with screenshots so users don't have to search the script.

Regarding the suggestion for a bash script: while I agree automation is generally beneficial, I purposefully opted for a manual setup guide for this specific connector to prioritize customer flexibility:

  • Flexibility & Customization: Snowflake environments and business metadata requirements vary widely across organizations. By providing clear visual templates in the README, customers can easily adapt the naming conventions, field types, and descriptions to align with their specific business use cases before creation.
  • Avoiding "Locked" Schemas: A bash script often creates a black box experience. If customers execute a script and later realize the schema doesn't fit their needs, modifying the established Dataplex types can be more complex than getting it right the first time via a guided manual process.
  • Empowerment: This approach guides the user through the "why" of the setup, ensuring they maintain full control over their metadata governance from the start, making it easier for them to troubleshoot or extend the connector in the future.
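For readers who do prefer scripted setup along the lines the reviewer suggests, a rough Python sketch for creating one aspect type is below. The metadata-template field names are assumptions against the google-cloud-dataplex client, not a verified definition; the manual README flow described above remains the documented path:

```python
def aspect_type_template(name, fields):
    """Build a record-style metadata template dict for an aspect type (sketch)."""
    return {
        "name": name,
        "type_": "record",  # field names assumed from the proto-plus client
        "record_fields": [{"name": f, "type_": "string"} for f in fields],
    }

def create_aspect_type(project, location, aspect_type_id, template):
    """Create the aspect type in Dataplex (requires google-cloud-dataplex)."""
    from google.cloud import dataplex_v1  # third-party; assumed installed
    client = dataplex_v1.CatalogServiceClient()
    operation = client.create_aspect_type(
        parent=f"projects/{project}/locations/{location}",
        aspect_type_id=aspect_type_id,
        aspect_type=dataplex_v1.AspectType(metadata_template=template),
    )
    return operation.result()  # long-running operation; blocks until done
```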

@vinitachaudhary force-pushed the add-snowflake-horizon-connector branch from 6ef778a to 5f74ee6 on February 10, 2026 02:23
@vinitachaudhary merged commit ef0e1c3 into GoogleCloudPlatform:main on Feb 10, 2026
2 checks passed