Conversation
|
This script: demo_resume.py mocks the use of a simple model when performing single step evaluations, and shows the resumability of the process. The raw output is: Parsing this into a more readable format: {
"input": "CN",
"ground_truth": "C.N>>CN",
"num_predictions": 3,
"ground_truth_correct": [
true,
false,
false
],
"timing": {
"time_model_call": 2.1298726399739582e-05,
"time_post_processing": 5.404154459635417e-06
},
"predictions": [
"C.N>>CN",
"c1ccccc1>>CN",
"NC=O>>CN"
]
} |
kmaziarz
left a comment
There was a problem hiding this comment.
Thanks @jla-gardner, this has been annoying me since forever 🙂 I had an initial look and left a few comments, I can have another pass later. Before merging, I will also battle-test it on an actual problematic case (i.e. eval that spans multiple days yet gets pre-empted every 12h).
If you look at CHANGELOG.md, you'll find that we record all changes to syntheseus there (well, apart from those that don't affect any behaviour, e.g. changes purely in tests or fixing a typo in a comment). Feel free to add an entry there following the existing pattern (so, also append yourself to the contributors list)!
| input=d.get("input", ""), | ||
| ground_truth=d.get("ground_truth", ""), |
There was a problem hiding this comment.
Why would input or ground_truth be missing? Also, going in that direction, I'm wondering why we can't do something akin to
return cls(
**{
k: TimingRecord(**v) if k.endswith("timing") else v
for k, v in d.items()
}
)This would fail explicitly if extra stuff is provided in d, but maybe that's a good thing?
Same comment for to_dict; I suppose that also could be a for loop?
| """Reconstruct a Reaction from its reaction SMILES string.""" | ||
| sep = REACTION_SEPARATOR * 2 | ||
| reactants_str, products_str = rxn_smiles.split(sep) | ||
| reactants = Bag(Molecule(s) for s in reactants_str.split(SMILES_SEPARATOR) if s) |
There was a problem hiding this comment.
We have molecule_bag_from_smiles_strict in chem/utils.py doing almost exactly this, modulo the rejection of empty SMILES components via if s. But I'm not sure why we'd want the latter?
More generally, maybe _reaction_from_smiles could be a classmethod of the Reaction class? I was somewhat surprised we don't have it there already.
| @dataclass(frozen=True) | ||
| class TimingRecord: | ||
| """Timing information for a single sample's model call.""" | ||
|
|
||
| time_model_call: float | ||
| time_post_processing: float |
There was a problem hiding this comment.
Is this not an exact copy of ModelTimingResults from utils/metrics.py? I guess the existing one is used for timing of a step (batch) and the new one you use for timing of a single sample (by dividing by batch size), but not sure if that warrants having both of them around.
| if include_predictions: | ||
| all_predictions.extend(batch_predictions) | ||
| all_back_translation_predictions.extend(batch_back_translation_predictions) |
There was a problem hiding this comment.
So in resumable mode, are we flushing the predictions to a file as we go and also keep all of them in memory? Or am I misunderstanding things?
| input_smiles = input.smiles | ||
| else: | ||
| assert isinstance(input, Bag) | ||
| input_smiles = ".".join(mol.smiles for mol in input) |
There was a problem hiding this comment.
This is molecule_bag_to_smiles from interface/molecule.py.
AustinT
left a comment
There was a problem hiding this comment.
Thanks for working on this @jla-gardner. The implementation roughly makes sense to me- I'll leave it to @kmaziarz for detailed review. I agree with his comments so far.
This PR (attempts to) makes the
syntheseus/cli/eval_single_step.pyscript resumable, mitigating the impact of e.g. job crashes/pre-emption.To do this, the script now saves model inputs+outputs+information in a
JSON Linesfile during evaluation. Re-invoking this script after e.g. pre-emption/restarts attempts to load this file if available, and skips calling the model for any already-queried inputs.The new output format is easier to work with than the existing
stats.jsonfiles, since you can load line-by-line (see attached script and analysis for what this looks like).To preserve backwards compatibility, this feature is hidden behind the
config.resumableflag, which is turned off by default.Thank you to @kmaziarz for the suggested implementation.