8 Comments
Jan 25, 2023 · edited Jan 26, 2023 · Liked by Chad Sanderson, Daniel Dicker

Great post :) As an ex-DWH team in a company DL scenario, I frequently see us tackling problems like this. This is a really nice and broad overview!

One of our current projects is a company-wide dictionary on how to name fields and their contents. It serves as the basis for both data discovery input and contract generation. Let’s see where we end up.

Jan 25, 2023 · Liked by Chad Sanderson, Daniel Dicker

This post is pure gold. It's a must-read if you own a data platform or a data warehouse and you want to ensure that data quality gets better over time. The challenge is to get the executive buy-in for the upfront effort and investment required.

Great post! For dbt, the schema.yml file for each model also provides the space to implement data contract validations. Have you thought about utilizing these schema files for data contract enforcement/validation?
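
To make the idea concrete, here is a minimal sketch of how column-level tests declared in a schema file could double as contract checks. It is not dbt itself: the `CONTRACT` dictionary is a hypothetical, already-parsed stand-in for the `columns` section of a `schema.yml`, the `orders` model and its columns are invented, and `check_contract` is an illustrative helper. Only the test names (`not_null`, `unique`, `accepted_values`) are borrowed from dbt's built-in generic tests, and the logic only loosely mirrors their behavior.

```python
# Hypothetical stand-in for the parsed "columns" section of a dbt schema.yml.
# The model, columns, and allowed values are invented for illustration.
CONTRACT = {
    "model": "orders",
    "columns": [
        {"name": "order_id", "tests": ["not_null", "unique"]},
        {
            "name": "status",
            "tests": [
                "not_null",
                {"accepted_values": {"values": ["placed", "shipped", "returned"]}},
            ],
        },
    ],
}


def check_contract(contract, rows):
    """Return a list of violation messages; an empty list means the rows pass."""
    violations = []
    for column in contract["columns"]:
        name = column["name"]
        values = [row.get(name) for row in rows]
        for test in column["tests"]:
            if test == "not_null":
                if any(v is None for v in values):
                    violations.append(f"{name}: not_null failed")
            elif test == "unique":
                # Nulls are ignored here; not_null covers them separately.
                non_null = [v for v in values if v is not None]
                if len(non_null) != len(set(non_null)):
                    violations.append(f"{name}: unique failed")
            elif isinstance(test, dict) and "accepted_values" in test:
                allowed = set(test["accepted_values"]["values"])
                bad = sorted({v for v in values if v is not None and v not in allowed})
                if bad:
                    violations.append(f"{name}: accepted_values failed for {bad}")
    return violations
```

Calling `check_contract(CONTRACT, rows)` on a sample of rows returns the violated expectations as plain strings, which is roughly the information a failing test run would surface. In a real dbt project the tests would run inside the warehouse rather than in Python.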

Well done with this post. It is concise yet very informative.

I have one question, though. You stated that:

"Similar to contracts in production services, contracts in the warehouse should be implemented in code and version controlled. The implementation of contracts can take many forms depending on your data tech stack and can be spread across tools."

Considering our tech stack includes dbt as well, could you consider the dbt model itself (with tests, metadata, metrics, etc.) the definition of a data contract?

The advantage of that over Protobuf, for example, is that I don't need to write custom code to set up the monitor. As you mentioned, dbt + Great Expectations can validate the schema and the semantic layer.
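
For comparison, this is roughly what the hand-written alternative looks like: a small sketch of a contract kept in code (and therefore version controlled) that checks only the schema half, meaning column names, types, and nullability. Every name in it (`ColumnSpec`, `ORDERS_V1`, `schema_violations`, the `orders` columns) is hypothetical, and it uses neither Protobuf, dbt, nor Great Expectations. It is meant to show the kind of custom code those tools would replace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    """One column of a hypothetical contract: name, Python type, nullability."""
    name: str
    dtype: type
    nullable: bool = False


# Illustrative contract for an invented "orders" table.
ORDERS_V1 = (
    ColumnSpec("order_id", int),
    ColumnSpec("amount_usd", float),
    ColumnSpec("coupon_code", str, nullable=True),
)


def schema_violations(spec, row):
    """Compare one row (a dict) against the contract and list any mismatches."""
    problems = []
    expected = {col.name for col in spec}
    actual = set(row)
    for missing in sorted(expected - actual):
        problems.append(f"missing column: {missing}")
    for extra in sorted(actual - expected):
        problems.append(f"unexpected column: {extra}")
    for col in spec:
        if col.name not in row:
            continue  # already reported as missing
        value = row[col.name]
        if value is None:
            if not col.nullable:
                problems.append(f"{col.name}: null not allowed")
        elif not isinstance(value, col.dtype):
            problems.append(
                f"{col.name}: expected {col.dtype.__name__}, got {type(value).__name__}"
            )
    return problems
```

Semantic rules (for example, that an amount is never negative) would need further hand-written checks on top of this, which is the maintenance cost the question above is trying to avoid.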
