
Consider forking and maintaining pyctcdecode or switch to torchaudio.models.decoder #41230

@FredHaa


System Info

transformers[torch-speech]==4.56.2
pyannote-audio==4.0.0

Who can help?

@Rocketknight1

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

With the release of pyannote-audio==4.0.0, a few problems arise from the pyctcdecode dependency, which appears to be abandoned.

pyannote-audio==4.0.0 depends on numpy>=2.0.0, but the latest pyctcdecode==0.5.0 (from January 2023) depends on numpy<2.0.0. A PR for numpy 2.0 support has been ignored since February (kensho-technologies/pyctcdecode/pull/116). The restriction of numpy<2.0.0 is arbitrary and was set way before numpy 2.0.0 was announced.

A second problem is that one of the last commits to pyctcdecode changed the decoder's output format from a tuple to a dataclass, which makes the main branch incompatible with the current transformers ASR pipeline code.

I.e., currently the only way to use the new pyannote-audio==4.0.0 speaker diarization lib with a Wav2Vec2ProcessorWithLM is by forking pyctcdecode, reverting the main branch to the 0.5.0 state, and then removing the numpy<2.0.0 restriction.
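To illustrate the tuple-versus-dataclass incompatibility described above, here is a minimal sketch of how calling code could accept either output format. `FakeOutputBeam` and its field names are hypothetical stand-ins, not the actual pyctcdecode class, and `beam_as_tuple` is an illustrative helper that does not exist in transformers or pyctcdecode.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Tuple


@dataclass
class FakeOutputBeam:
    """Hypothetical stand-in for a dataclass-style beam; field names are assumed."""

    text: str
    logit_score: float
    lm_score: float


def beam_as_tuple(beam: Any) -> Tuple[Any, ...]:
    """Return a beam as a plain tuple, whichever format it arrived in."""
    if is_dataclass(beam) and not isinstance(beam, type):
        # Read the fields in declaration order so positional unpacking still works.
        return tuple(getattr(beam, f.name) for f in fields(beam))
    # Already a tuple (or another sequence): normalise to a tuple.
    return tuple(beam)


old_style = ("hello world", -1.5, -2.0)                 # tuple-style beam
new_style = FakeOutputBeam("hello world", -1.5, -2.0)   # dataclass-style beam

# Both formats unpack the same way after normalisation.
text_old, logit_old, lm_old = beam_as_tuple(old_style)
text_new, logit_new, lm_new = beam_as_tuple(new_style)
```

Code that unpacks beams positionally fails on a dataclass instance, which is why the format change breaks existing callers. A shim like this only helps if the dataclass declares its fields in the same order as the old tuple, which is an assumption here.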

user@host ~> uv pip install "transformers[torch-speech]==4.56.2" pyannote-audio==4.0.0
  × No solution found when resolving dependencies:
  ╰─▶ Because transformers[torch-speech]==4.56.2 depends on pyctcdecode>=0.4.0 and pyctcdecode>=0.4.0 depends on numpy>=1.15.0,<2.0.0, we can conclude that transformers[torch-speech]==4.56.2 depends on numpy>=1.15.0,<2.0.0. (1)

      Because pyannote-core==6.0.1 depends on numpy>=2.0 and only pyannote-core<=6.0.1 is available, we can conclude that pyannote-core>=6.0.1 depends on numpy>=2.0.
      And because pyannote-audio==4.0.0 depends on pyannote-core>=6.0.1, we can conclude that pyannote-audio==4.0.0 depends on numpy>=2.0.
      And because we know from (1) that transformers[torch-speech]==4.56.2 depends on numpy>=1.15.0,<2.0.0, we can conclude that pyannote-audio==4.0.0 and transformers[torch-speech]==4.56.2 are incompatible.
      And because you require transformers[torch-speech]==4.56.2 and pyannote-audio==4.0.0, we can conclude that your requirements are unsatisfiable.
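The resolver's reasoning comes down to two numpy ranges that do not overlap. The sketch below checks this with the standard library only. The candidate version strings are illustrative, and the simplified `parse` helper handles only plain numeric versions (it is not a full PEP 440 parser).

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a plain numeric version such as '2.0' into a 3-part tuple."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))  # pad '2.0' to (2, 0, 0)
    return tuple(parts[:3])


def allowed_by_pyctcdecode(version: str) -> bool:
    # numpy>=1.15.0,<2.0.0, as reported by the resolver for pyctcdecode
    return parse("1.15.0") <= parse(version) < parse("2.0.0")


def allowed_by_pyannote_core(version: str) -> bool:
    # numpy>=2.0, as reported by the resolver for pyannote-core
    return parse(version) >= parse("2.0")


# Illustrative numpy versions on both sides of the 2.0 boundary.
candidates = ["1.15.0", "1.26.4", "2.0.0", "2.1.0"]

compatible = [
    v for v in candidates
    if allowed_by_pyctcdecode(v) and allowed_by_pyannote_core(v)
]
print(compatible)
```

Because one range ends exactly where the other begins, no version can satisfy both, whichever candidates are tried.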

Reproduction

uv venv
uv pip install "transformers[torch-speech]==4.56.2" pyannote-audio==4.0.0

Expected behavior

Both packages should install together without a dependency conflict.
