dbtmock is an utility for end to end testing DBT pipelines.
dbtmock lets you:
- pick a model to test
- mock model dependencies
- mock a final result
- assert output of a model against a final result dataset
dbtmock does not mess with your dbt project, it simply generates SQL queries based on your project manifests.json. These SQL queries can be used as regular dbt tests.
- Select a model you want to test
- Picked any arbitrarly dependencies to Mock. Any intermediate models or sources are Ok. The mocked data is provided via CSV files.
- Provide expected output result using a CSV file
- Use
dbtmockcli to generate test queries - Generated queries are run via
dbt testcommand
- Check Github releases, Download your needed binary
go install github.com/dav009/dbtmock/cmd/dbtmock@latest
dbtmock --manifest target/manifest.json --tests unit_tests --output tests
--manifestis the path to your dbt's projectmanifest.json--testsis a path to a folder contaningjsonfiles. Each json file in that folder is a test definition (see examples below)--outputis the path folder where the generated tests will be stored. By default it points totestsfolder wheredbtlooks for custom tests to run
- Go to your dbt project:
cd my_project - Create a folder to store your tests definitions:
mkdir unit_tests - Create a json file per test definition. For example the Json file below is a single test in which there are two mocks
seed.jaffle_shop.raw_customersandseed.jaffle_shop.raw_orders. The test will run the model"model.jaffle_shop.customers". The content ofoutputhas the data which will be used for assertions.
{
"name": "dummy_test",
"model": "model.jaffle_shop.customers",
"mocks": {
"seed.jaffle_shop.raw_customers": {
"filepath": "seeds/raw_customers.csv",
"types": {
"id": "INT64"
}
},
"seed.jaffle_shop.raw_orders": {
"filepath": "seeds/raw_orders.csv",
"types": {
"id": "INT64",
"user_id": "INT64",
"order_date":"DATE"
}
},
},
"output": {
"filepath": "output.csv",
"types": {
"customer_id": "INT64",
"most_recent_order": "DATE",
"number_of_orders": "INT64",
"customer_lifetime_value": "FLOAT64",
"first_order": "DATE"
}
}
}
- Go to your dbt project, make sure you generated a
manifest.json(e.g: rundbt compile) - run
dbtmock --tests unit_tests --output tests - Check the files in the output folder :
ls tests - Run
dbt test - You should see your tests in the list of test being ran by dbt
Take a look at dbtmock_example
- This a fork of the vanilla
jaffle_storeproject, which is thehello worldof dbt - This fork has a few changes:
unit_testsfolder with a sampletest.json- some mock data as csv:
mocked_data/output.csvas a dataset to assert the final output of the given test - it uses already given
csvfiles as mocked data
You can use this fork to run dbtmock
- Clone the project
git clone .. - Create a
profile.ymlwith your BigQuery settings - Run
dbt seed - Run
dbt compile - Run
dbtmock(no arguments needed) - See test generated in :
tests/dummy_test.sql - Run
dbt testsee in the output the result:START test dummy_test.. PASS dummy_test
go build cmd/dbtmock/dbtmock.go
- This project uses
manifest.jsonto figure out what are the dependencies of the model that you are testing. - Given a model to test
dbtmockrecursively rebuilds the sql queries accross dependencies - While rebuilding a model SQL
dbtmockreplaces table for given mocked data dbtmockcreates a giant SQL query with mocks and model SQL logic. The output of the query is then compared to the provided expected data via aMINUSsql operation.- Only tested with BigQuery
