This task adds early filtering to the code that loads samples off disk. That lets us skip records that don't apply while loading, instead of waiting until the query stage.
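A minimal sketch of what filtering at load time could look like, assuming records arrive as an iterable of dicts. `load_samples`, `keep`, and the record fields are hypothetical names used only for illustration, not the existing loader's API.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Record = Dict[str, Any]


def load_samples(
    records: Iterable[Record],
    keep: Callable[[Record], bool],
) -> Iterator[Record]:
    """Yield only the records that pass `keep`.

    `records` stands in for whatever reads samples off disk (hypothetical).
    Rejected records are dropped here and never reach the query stage.
    """
    for record in records:
        if keep(record):
            yield record


# Usage: an in-memory list stands in for the on-disk samples.
raw = [
    {"id": 1, "label": "a"},
    {"id": 2, "label": "b"},
    {"id": 3, "label": "a"},
]
kept = list(load_samples(raw, lambda r: r["label"] == "a"))
```

Because the filter runs inside the loader, the query stage only ever sees the records in `kept`.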
According to Abadi, late materialization can speed up queries enormously, but it's not clear to me how to filter late when each thread is loading a separate column off disk.
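One possible way to square the two, sketched under assumptions: evaluate the predicate on the filter column first to get a list of matching row positions, then let each per-column thread fetch only those positions. In-memory lists stand in for on-disk columns, and `select_positions`, `gather`, and `load_filtered` are hypothetical names, not anything from the existing code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Sequence


def select_positions(
    filter_column: Sequence[Any],
    predicate: Callable[[Any], bool],
) -> List[int]:
    """Scan one column and return the row positions that pass the predicate."""
    return [i for i, value in enumerate(filter_column) if predicate(value)]


def gather(column: Sequence[Any], positions: List[int]) -> List[Any]:
    """Fetch only the selected rows of a column (stand-in for a disk read)."""
    return [column[i] for i in positions]


def load_filtered(
    columns: Dict[str, Sequence[Any]],
    filter_name: str,
    predicate: Callable[[Any], bool],
) -> Dict[str, List[Any]]:
    """Filter on one column, then load every column at the surviving positions."""
    positions = select_positions(columns[filter_name], predicate)
    # One task per column, mirroring the thread-per-column loading.
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(gather, column, positions)
            for name, column in columns.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Usage: keep rows whose timestamp is greater than 2.
columns = {"ts": [1, 2, 3, 4], "val": [10, 20, 30, 40]}
result = load_filtered(columns, "ts", lambda t: t > 2)
```

The cost of this approach is that the filter column has to be scanned before the other threads can start, so the column loads are no longer fully independent.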