Replies: 6 comments 4 replies
-
Hey, at the moment sling doesn't support this. If you run incremental mode with a file destination, it's not really incremental (just full-refresh). For that we'd need a watermark (or state stored somewhere), as you've suggested. I'd need to think about how that would work. I'm thinking the target filesystem itself would be the place to put the state, like a JSON file keeping the latest value for each table or query:
{
  "my_schema.my_table": "2023-01-24 12:34 56.6789",
  "my_schema.my_table2": "2023-01-24 12:34 56.6789",
  "my_custom_sql": "2023-01-24 12:34 56.6789"
}
Sling would read from it at the start of a run, and overwrite it at the end of the run.
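To make the read-at-start, overwrite-at-end idea concrete, here is a minimal Python sketch. It is not how sling implements anything. The function names (`load_state`, `save_state`, `run_incremental`), the `updated_at` key, and the local-filesystem path are all assumptions for illustration. It also assumes watermarks are fixed-format timestamp strings, so plain string comparison orders them correctly.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the watermark mapping, or an empty dict on the first run."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_state(path, state):
    """Overwrite the state file by writing a temp file and renaming it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2, sort_keys=True)
    os.replace(tmp, path)  # atomic on a local filesystem


def run_incremental(path, stream, rows, key="updated_at"):
    """Return only rows newer than the stored watermark, then advance it."""
    state = load_state(path)           # read at start of run
    watermark = state.get(stream)
    new_rows = [r for r in rows if watermark is None or r[key] > watermark]
    if new_rows:
        state[stream] = max(r[key] for r in new_rows)
        save_state(path, state)        # overwrite at end of run
    return new_rows
```

On object storage such as S3 there is no rename, so the overwrite step would need whatever put semantics the store provides. The sketch only covers the local case.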
-
Additionally, incremental state cannot be keyed on the source alone; it should be unique to each pair of source and destination.
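One way to picture the per-pair requirement is to build the state key from the source, the destination, and the stream together. The sketch below is a hypothetical illustration, not sling's actual state layout. The helper names and the connection names `MY_MONGO`, `MY_S3`, and `MY_GCS` are made up.

```python
import json


def state_key(source, target, stream):
    """Build a key unique to a (source, target, stream) triple.

    JSON-encoding the list avoids collisions that a plain separator
    could cause if a name itself contained the separator.
    """
    return json.dumps([source, target, stream])


def get_watermark(state, source, target, stream):
    """Look up the watermark for this exact source/target pair, if any."""
    return state.get(state_key(source, target, stream))


def set_watermark(state, source, target, stream, value):
    """Record the watermark for this exact source/target pair."""
    state[state_key(source, target, stream)] = value


state = {}
set_watermark(state, "MY_MONGO", "MY_S3", "my_schema.my_table", "2023-01-24 12:34:56")

# The same source loaded into a different destination starts from scratch.
print(get_watermark(state, "MY_MONGO", "MY_S3", "my_schema.my_table"))
print(get_watermark(state, "MY_MONGO", "MY_GCS", "my_schema.my_table"))
```

With this keying, pointing the same source at a new destination does not inherit the old destination's progress.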
-
Wouldn't saving state for a file destination end up reimplementing the basic functionality of the Iceberg or Delta formats?
Basically ACID.
-
It would be really great to support incremental loading for file targets. My use case is loading data from MongoDB to a Parquet file in blob storage. For Parquet specifically, I believe it stores min/max values for each column in the file, so it should even be efficient to query the file for, e.g. But having a separate state file alongside would also be totally fine for my use case.
-
Hello! @flarco do you have any new insights to share regarding this feature? Are there any plans to cover the use case of incremental load to file (in particular Parquet)? Has anyone found a workaround, or is the only option to go DB to DB if one needs incremental load?
-
State-based incremental is available in the latest version. See here.
-
I had a few questions about incremental change sets and managing incremental data from DBs to a file system (S3).