There are two types of answer matching in the code:
https://github.com/0russwest0/Agent-R1/blob/7be7508f4baf75dd9a5f4618aebace6743926927/agent_r1/src/reward_score/qa_em_and_format.py#L140C1-L148C36
Since the exact matching score is commented out in the code, I wasn't sure which matching score was used for the paper's results. It'd be great if you could confirm. Thanks!