Create More Test Cases

We can create a separate test file for each of the following modules:

- [ ] generate.py
- [ ] data.py
- [ ] benchmark.py
- [ ] eval.py
- [ ] correctness.py
- [ ] sweep.py

I suggest creating a separate PR for each file to make reviewing easier.
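
As a starting point, a per-module test file might look roughly like the sketch below. It assumes pytest-style discovery (functions named `test_*` with plain `assert` statements) and a hypothetical path `tests/test_data.py`. The helper `split_batches` is a made-up stand-in defined inline so the sketch runs on its own; it is not an actual function from `data.py`. A real test file would import the functions under test from the module instead.

```python
# tests/test_data.py (hypothetical path; one such file per module)

# Stand-in for a function that would normally be imported from data.py.
# The name and behavior are assumptions made for illustration only.
def split_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def test_split_batches_even():
    # Input length is a multiple of the batch size.
    assert split_batches([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]


def test_split_batches_remainder():
    # The last batch holds the leftover items.
    assert split_batches([1, 2, 3], 2) == [[1, 2], [3]]


def test_split_batches_rejects_nonpositive_size():
    # Invalid arguments should raise rather than return silently.
    try:
        split_batches([1], 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Keeping each module's tests in its own file like this also maps cleanly onto one PR per file.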