@pettyalex
I observed that the skip argument of data.table::fread is very slow, and the data.table developers advise against using skip for chunking: Rdatatable/data.table#1721

This change uses pipes to read the input files incrementally in chunks, instead of repeatedly decompressing and re-reading each file for every chunk. This should make much smaller chunk sizes practical, which will significantly reduce memory usage.
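
To illustrate the idea outside of R, here is a minimal sketch in Python. It is not the code in this change: the function name `iter_chunks`, the gzip input format, and the presence of a single header line are all assumptions made for the example. The point it shows is that the compressed stream is opened once and its read position carries over from one chunk to the next.

```python
import gzip
import itertools
from typing import Iterator, List


def iter_chunks(path: str, chunk_lines: int) -> Iterator[List[str]]:
    """Yield the header line plus up to `chunk_lines` data lines per chunk.

    Hypothetical helper for illustration only; assumes a gzip-compressed
    text file whose first line is a header.
    """
    # Open the compressed file once. The stream keeps its position between
    # chunks, so no part of the file is decompressed or scanned twice.
    with gzip.open(path, "rt") as stream:
        header = stream.readline()
        while True:
            # Take the next `chunk_lines` lines from wherever the stream is.
            lines = list(itertools.islice(stream, chunk_lines))
            if not lines:
                break
            # Prepend the header so each chunk can be parsed on its own.
            yield [header] + lines
```

By contrast, a skip-based approach reopens the file for every chunk and scans past all the lines already processed, so the total work grows roughly with the square of the number of chunks. With a single open stream the cost per chunk stays proportional to the chunk size, which is what makes small chunks affordable.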
