Open
Labels
enhancement: New feature or request
Description
Is your feature request related to a problem? Please describe.
Since we plan to include the MD5 checksum in the file name of each npz file, we can add a test that verifies each file against it. This would help guard against a malicious upload that reuses the name of an existing file but has different content.
Describe the solution you'd like
The check can be added at two levels.
1. The pytest level. Whenever a PR changes a dataset folder, we can verify all the npz files listed in metadata.json and the task JSON files.
2. The dataloader level. We could also add an assertion in the data-loading functions (maybe only when the files are downloaded). But this would break the current datasets that use the old file naming, so it can only be done after all the datasets are updated. It would also add a little computation overhead, and I'm not sure whether that is worth it.
1 is redundant if 2 is implemented.
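To make the proposal concrete, here is a minimal sketch of the check itself. It assumes the digest appears in the file name as a 32-character hex string (for example `train_<md5>.npz`). The exact naming scheme is not settled in this issue, and the helper names `md5_of_file`, `npz_matches_name`, and `find_bad_npz` are hypothetical.

```python
import hashlib
import re
from pathlib import Path

# Assumption: the file name contains a standalone 32-character hex MD5 digest.
_MD5_IN_NAME = re.compile(r"(?<![0-9a-f])([0-9a-f]{32})(?![0-9a-f])")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def npz_matches_name(path):
    """Return True if the digest in the file name equals the content digest."""
    match = _MD5_IN_NAME.search(Path(path).name.lower())
    if match is None:
        # Old-style name without a digest: it cannot be verified.
        return False
    return match.group(1) == md5_of_file(path)


def find_bad_npz(paths):
    """Return the paths whose name digest is missing or does not match."""
    return [str(p) for p in paths if not npz_matches_name(p)]
```

At the pytest level, a test could collect the npz paths referenced by a changed dataset folder and assert that `find_bad_npz(paths)` returns an empty list. At the dataloader level, the same `npz_matches_name` call could run once right after a download, which keeps the extra hashing cost off the normal load path.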