Hi, thanks for sharing your work!
I have a question about the exploration agents used for collecting non-expert data. The paper says:
- steering angle and speed are randomly sampled from predefined sets
- another agent samples a control configuration and a behavior pattern from a predetermined set
Could you please clarify:
- How are these predefined sets constructed (which steering angles, speeds, control configurations, and behavior patterns do they contain)?
- How often does the exploration agent switch its strategy?
Thanks a lot!