Here's the command. I also changed deduce.py so that the should-error tests print out the error as well.
python3 deduce.py --error ./test/should-error
# deduce.py, line 66
if error_expected:
    print(filename + ' caught an error as expected')
    print(e)
    print(traceback.format_exc())
You can see that term_inst_length_node.pf, term_inst_foldr.pf, and some other tests fail with "list index out of range" errors, stemming from collect_exports in the Import statement. (Maybe because associative operators don't have resolved names, on line 3695?)
This doesn't happen when running just a few of the files, so something strange is happening when a large number of files are imported, and I'm really not sure what could be causing it. I also don't know enough about the new importing mechanism to make much of a guess.
(Matei found this one, so questions should probably go to him.)
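Since the error only shows up with many files, one way to narrow it down might be to shrink the set of test files until the failure stops reproducing. Below is a rough sketch of that idea, not anything that exists in deduce.py. The names `shrink_failing_set` and `fails` are made up for illustration, and `fails` is assumed to be something you write yourself (for example, a function that runs deduce.py on just the given files and checks the output for "list index out of range").

```python
from typing import Callable, List, Sequence


def shrink_failing_set(files: Sequence[str],
                       fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop chunks of files while the failure still reproduces.

    `fails` is a hypothetical, user-supplied predicate: it should return
    True when the given subset of files still triggers the error.
    """
    current = list(files)
    chunk = max(len(current) // 2, 1)
    while True:
        i = 0
        removed_any = False
        while i < len(current):
            # Try the set with current[i:i+chunk] removed.
            candidate = current[:i] + current[i + chunk:]
            if candidate and fails(candidate):
                current = candidate      # the chunk was not needed
                removed_any = True
            else:
                i += chunk               # the chunk is needed, keep it
        if chunk == 1 and not removed_any:
            return current               # no single file can be dropped
        if chunk > 1:
            chunk //= 2                  # retry with smaller chunks


# Stand-in predicate for demonstration only: pretend the bug needs
# both "b.pf" and "f.pf" to be present.
def demo_fails(subset: List[str]) -> bool:
    return "b.pf" in subset and "f.pf" in subset


demo_files = [name + ".pf" for name in "abcdefgh"]
demo_result = shrink_failing_set(demo_files, demo_fails)
```

This only helps if the failure is deterministic for a given set of files. If it depends on import order or on files outside the test directory, the real predicate would need to account for that.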