
mlflow_logger_v1 - Adding option to log last model#13

Open
adin786 wants to merge 2 commits into explosion:main from adin786:mlflow_enhancement

Conversation


@adin786 adin786 commented Nov 5, 2022

I noticed the mlflow logger only seems to log the "best" model. This is a bit confusing, since the metrics summary that mlflow displays gives the results for the "last" model.

Anyway, the wandb logger and the clearml logger both look like they include some functionality to choose which model folder to track (or both). So I added functionality to log the "last" model via a boolean option in the [training.logger] config. I thought I'd stick to a similar pattern to the one implemented in the wandb logger (which handles this aspect neatly).

Note: I ended up using mlflow.log_artifact() rather than mlflow.spacy.log_model(), because when I use the latter outside of the log_step() function, spacy seems to abort the process abruptly without letting the mlflow.spacy call finish. It works if I put it in the log_step() function, but I wanted to trigger a single upload at the end; otherwise we'd seriously slow down training by forcing a model upload on every training step. I could use some feedback on this part.

Separately, I also added the "step" value to each mlflow.log_metrics() call, as this makes the mlflow plot visualisations a bit nicer if you want to view them by step rather than by time on the x-axis.
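For illustration, the step-aware metrics logging described here could be sketched roughly as below. This is a hedged sketch, not the PR's actual code: the `info` dict keys ("step", "score", "losses") mirror what spaCy passes to a logger's log_step callback, and `log_metrics` stands in for `mlflow.log_metrics(metrics, step=...)` so the sketch runs without mlflow installed.

```python
# Hedged sketch: logging metrics with an explicit step value.
# `log_metrics` is a stand-in for mlflow.log_metrics, whose real
# signature accepts a `step` keyword argument.

def log_step_metrics(info, log_metrics):
    step = info["step"]
    metrics = {"score": info["score"]}
    # Flatten per-component losses into individually named metrics.
    metrics.update({f"loss_{name}": value for name, value in info["losses"].items()})
    # Passing step= lets mlflow plot metrics by training step on the x-axis.
    log_metrics(metrics, step=step)

# Usage with a recording stub in place of mlflow.log_metrics:
calls = []
log_step_metrics(
    {"step": 200, "score": 0.85, "losses": {"ner": 3.2}},
    lambda metrics, step: calls.append((metrics, step)),
)
assert calls == [({"score": 0.85, "loss_ner": 3.2}, 200)]
```

In the real logger the lambda would simply be `mlflow.log_metrics`.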

@adin786 adin786 (Author) commented Nov 5, 2022

Just to add: making a class just to persist the model path for later uploading was probably overkill/hacky. It would be cleaner to use output_path as a nonlocal so it can be accessed later in the finalize() method.

I'll wait for feedback before messing with this. The point is that, just like the wandb logger, I want to pass the model path forward so I can call mlflow.log_artifact() etc. once at the end of the training run. I think it's fine for the "best" model to get logged on each (improved) step, because that's at least not every step.
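The nonlocal approach mentioned here could look something like the following. A minimal sketch under assumptions: `setup_logger`, `log_step`, and `finalize` only mirror the shape of a spaCy logger factory, and the actual `mlflow.log_artifacts` upload is stubbed out so the sketch is self-contained.

```python
# Hedged sketch: persist the latest model path with `nonlocal` instead of
# a helper class. `log_artifacts` is a stand-in for mlflow.log_artifacts.

def setup_logger(log_artifacts):
    latest_model_dir = None  # replaces the ModelDir wrapper class

    def log_step(output_path):
        nonlocal latest_model_dir
        if output_path is not None:
            latest_model_dir = output_path  # remember it; upload once at the end

    def finalize():
        if latest_model_dir is not None:
            # In the real logger this would be
            # mlflow.log_artifacts(latest_model_dir, "model_last")
            log_artifacts(latest_model_dir, "model_last")

    return log_step, finalize

# Usage with a recording stub in place of mlflow.log_artifacts:
uploads = []
log_step, finalize = setup_logger(lambda path, name: uploads.append((path, name)))
log_step("training/model-last")
finalize()
assert uploads == [("training/model-last", "model_last")]
```

The upload happens exactly once, in finalize(), rather than on every training step.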

@svlandeg
Contributor

Hi, thanks for your contribution! We'll review the code and get back to you :-)

@rmitsch rmitsch self-requested a review November 23, 2022 09:22
@rmitsch rmitsch added the enhancement New feature or request label Nov 23, 2022

@rmitsch rmitsch left a comment


Thanks for your contribution, @adin786! Sorry it took me so long to review this.

Only some minor comments - and I'll look into the issue with not being able to log artifacts with mlflow.spacy.log_model() outside of log_step().

Comment on lines +14 to +20
class ModelDir:
    def __init__(self) -> None:
        self.path = None

    def update(self, path: str) -> None:
        self.path = path


I think we can ditch ModelDir - let's use a simple string for the latest model path (latest_model_dir or similar).

Comment on lines +73 to +77
def log_model(path, name):
    mlflow.log_artifacts(
        path,
        name
    )

Similar: since this is just a wrapper without added functionality, let's use mlflow.log_artifacts() directly.

Comment on lines 104 to +105
mlflow.spacy.log_model(nlp, "best")
log_model(output_path, 'model_best')

It irks me that we now have three artifacts: best, model_best and model_last. Ideally I'd like to have just best and last, both with the complete set of associated data (like best has, see screenshot), for the sake of consistency. If that doesn't work out due to the issues you described, I guess this is fine. I'll look into this.
[screenshot: mlflow artifact listing]


> Note: I ended up using mlflow.log_artifact() rather than mlflow.spacy.log_model() because when I use the latter method outside of the log_step() function it looks like spacy just aborts the process abruptly, without letting the mlflow.spacy function do its thing.

Hm, I ran this with

        def finalize() -> None:
            if log_latest_dir:
                mlflow.spacy.log_model(nlp, "model_last")

now and this seemed to work (see image). What's your outcome?
[screenshot: model_last logged successfully]

Comment on lines +109 to +113

    if log_latest_dir:
        log_model(latest_model.path, 'model_last')

    print('End run')

Suggested change
-    if log_latest_dir:
-        log_model(latest_model.path, 'model_last')
-
-    print('End run')
+    if log_latest_dir:
+        log_model(latest_model.path, 'model_last')


Labels

enhancement New feature or request

3 participants