Open
Description
Warctools uses the Content-Length field to determine the length of the body when validating and reading WARC files. In Common Crawl WARC files, bodies that were originally gzipped are now stored decompressed, so the Content-Length field no longer matches the stored body and these messages are only partially parsed.
#14 fixes this and allows Common Crawl WARC files to be parsed properly.
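
To illustrate the symptom, here is a minimal standalone sketch that uses only the Python standard library. It does not use the warctools API or reproduce the change in #14. The helper `read_body_by_content_length` and the sample payload are hypothetical, and the sketch assumes the length field still reflects the gzipped size while the stored body is decompressed.

```python
import gzip
import io


def read_body_by_content_length(stream, content_length):
    """Read exactly content_length bytes, as a length-driven parser would.

    Hypothetical helper for illustration; not part of warctools.
    """
    return stream.read(content_length)


# Hypothetical payload, chosen so that it compresses well.
original_body = b"<html>" + b"hello world " * 200 + b"</html>"
compressed_body = gzip.compress(original_body)

# Assumption: the length field was written for the gzipped body ...
stale_content_length = len(compressed_body)
# ... but the archive stores the decompressed bytes.
stored = io.BytesIO(original_body)

parsed = read_body_by_content_length(stored, stale_content_length)
truncated = len(parsed) < len(original_body)

print("stored body bytes:", len(original_body))
print("bytes actually read:", len(parsed))
print("truncated:", truncated)
```

Because the decompressed body is longer than the gzipped one, a reader that trusts the stale length stops early and the rest of the body is left unparsed.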