dependabot bot commented on behalf of github on Jan 20, 2023

Bumps spacy from 3.0.5 to 3.5.0.

Release notes

Sourced from spacy's releases.

v3.5.0: New CLI commands, language updates, bug fixes and much more

✨ New features and improvements

  • NEW: New apply CLI command to annotate new documents with a trained pipeline (#11376).
  • NEW: New benchmark CLI command to benchmark pipelines. The new benchmark speed subcommand measures the speed of a pipeline, while the benchmark accuracy subcommand is a new alias for evaluate (#11902).
  • NEW: New find-threshold CLI command to identify an optimal threshold for classification models (#11280).
  • NEW: New FUZZY Matcher operator for fuzzy matches based on Levenshtein edit distance. In addition, the FUZZY and REGEX operators are now supported in combination with IN/NOT_IN (#11359).
  • Language updates for Ancient Greek, Dutch, Russian, Slovenian and Ukrainian (#11345, #11162, #11426, #11753, #11811, #11997, more details below).
  • Allow up to typer v0.7.x (#11720), mypy 0.990 (#11801) and typing_extensions v4.4.x (#12036).
  • New spacy.ConsoleLogger.v3 with expanded progress tracking (#11972).
  • Improved scoring behavior for textcat with spacy.textcat_scorer.v2 (#11696 and #11971) and spacy.textcat_multilabel_scorer.v2 (#11820).
  • Improved customizability of the knowledge base used for entity linking, with the default implementation being the new InMemoryLookupKB (#11268).
  • Optional before_update callback that is invoked at the start of each training step (#11739).
  • Improve performance of SpanGroup (#11380).
  • Improve UX around displacy.serve when the default port is in use (#11948).
  • Patch a security vulnerability in extracting tar files (#11746).
  • Add equality definition for vectors (#11806).
  • Allow interpolation of variables in directory names in projects (#11235).
  • Update default component configs to use the latest tok2vec version (#11618).
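
For reviewers unfamiliar with the metric behind the new FUZZY operator, the following is a minimal, standard-library sketch of Levenshtein edit distance. It is not spaCy's implementation. The `fuzzy_equal` helper and its `max_edits` parameter are hypothetical names introduced here for illustration only, and they do not reproduce whatever thresholds spaCy applies internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def fuzzy_equal(token: str, target: str, max_edits: int = 2) -> bool:
    """Hypothetical helper: treat two strings as a fuzzy match when their
    edit distance does not exceed `max_edits` (an assumed threshold)."""
    return levenshtein(token, target) <= max_edits
```

Under this sketch, a misspelling such as "definitly" is one edit away from "definitely" and would count as a match with the assumed threshold of 2.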

🔴 Bug fixes

  • #11382: Fix lookup behavior for the French and Catalan lemmatizers.
  • #11385: Ensure that downstream components can train properly on a frozen tok2vec or transformer layer.
  • #11762: Support local file system remotes for projects.
  • #11763: Raise an error when unsupported values are used for textcat.
  • #11834: Ensure Vocab.to_disk respects the exclude setting for lookups and vectors.
  • #12009: Fix a few typing issues for SpanGroup and Span objects.
  • #12098: Correctly handle missing annotations in the edit tree lemmatizer.

⚠️ Backwards incompatibilities and model updates

The following changes may require you to update code that is using the relevant functionality:

  • An error is now raised when unsupported values are given as input to train a textcat or textcat_multilabel model - ensure that values are 0.0 or 1.0 as explained in the docs.
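
Because this upgrade turns unsupported textcat values into a hard error, it may be worth checking training annotations before merging. The snippet below is a rough sketch of such a pre-check in plain Python. `validate_cats` is a hypothetical helper, not part of spaCy, and it assumes category annotations are available as a label-to-value mapping.

```python
from typing import Mapping

# Values the release notes describe as supported for textcat training.
ALLOWED_VALUES = (0.0, 1.0)


def validate_cats(cats: Mapping[str, float]) -> None:
    """Hypothetical pre-check: raise ValueError if any category value is
    something other than 0.0 or 1.0.

    Integers 0 and 1 also pass, because they compare equal to 0.0 and 1.0.
    """
    unsupported = {
        label: value
        for label, value in cats.items()
        if value not in ALLOWED_VALUES
    }
    if unsupported:
        raise ValueError(
            f"Unsupported textcat values (expected 0.0 or 1.0): {unsupported}"
        )
```

Running a check like this over existing training data before the bump would surface soft labels such as 0.7 that earlier versions may have accepted silently.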

The following changes may influence the output of your language pipeline or trained models:

  • Updates to language defaults:
    • Extended support for Slovenian (#11162).
    • Switch Russian and Ukrainian lemmatizers to pymorphy3 (#11345, #11811).
    • Support for editorial punctuation in Ancient Greek (#11426).
    • Update to Russian tokenizer exceptions (#11753).
    • Small fix in the list of Dutch stop words (#11997).
  • Updates to model defaults:
    • Use the latest tok2vec defaults in all components (#11618).
    • Improve the default attributes used for the textcat and textcat_multilabel components (#11698).
    • Update the default scorer for textcat and textcat_multilabel to fix a bug related to threshold for textcat and to make it possible to score multiple textcat/textcat_multilabel components in a single pipeline with custom scorers. If no custom scorers are used, the cat_p/r/f scores will now only reflect the final component's labels and performance (#11696, #11820).
    • Correct the token_acc score to report the intended measure (# correct tokens / # predicted tokens, the same as in spaCy v2). The token_acc scores for v3.5 will be lower for the same performance because they were incorrectly inflated in v3.0-v3.4. The token_p/r/f scores should remain unchanged (#12073).
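
To make the corrected token_acc definition concrete, here is a small sketch that computes "# correct tokens / # predicted tokens" over character-offset spans. It is an illustration of the formula quoted above, not spaCy's scorer. The span representation, the function name, and the choice to return 0.0 when nothing is predicted are all assumptions made for this example.

```python
from typing import Iterable, Tuple

# A token is represented by its (start, end) character offsets (assumption).
Span = Tuple[int, int]


def token_acc(gold: Iterable[Span], predicted: Iterable[Span]) -> float:
    """Share of predicted tokens whose boundaries exactly match a gold token."""
    gold_set = set(gold)
    predicted_set = set(predicted)
    if not predicted_set:
        return 0.0  # assumed convention for an empty prediction
    correct = len(gold_set & predicted_set)
    return correct / len(predicted_set)


# "New York-based firm": the prediction fails to split "York-based".
gold_tokens = [(0, 3), (4, 8), (8, 9), (9, 14), (15, 19)]
predicted_tokens = [(0, 3), (4, 14), (15, 19)]
score = token_acc(gold_tokens, predicted_tokens)  # 2 correct out of 3 predicted
```

Any dashboards or CI thresholds that track token_acc may need to be revisited after this bump, since the notes state that the reported values will drop for the same underlying performance.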

... (truncated)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [spacy](https://github.com/explosion/spaCy) from 3.0.5 to 3.5.0.
- [Release notes](https://github.com/explosion/spaCy/releases)
- [Commits](explosion/spaCy@v3.0.5...v3.5.0)

---
updated-dependencies:
- dependency-name: spacy
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot bot added the dependencies label (Pull requests that update a dependency file) on Jan 20, 2023