MPS: NotImplementedError: Output channels > 65536 not supported at the MPS device. #34

@kotx

Description

When running inference on macOS, the following error occurs:

Details
2024-12-29 22:14:08 | INFO | fairseq.tasks.hubert_pretraining | HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
2024-12-29 22:14:08 | INFO | fairseq.models.hubert.hubert | HubertModel Config: {'_name': 'hubert', 'label_rate': 50.0, 'extractor_mode': default, 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': gelu, 'layer_type': transformer, 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.05, 'dropout_input': 0.1, 'dropout_features': 0.1, 'final_dim': 256, 'untie_final_proj': True, 'layer_norm_first': False, 'conv_feature_layers': '[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', 'conv_bias': False, 'logit_temp': 0.1, 'target_glu': False, 'feature_grad_mult': 0.1, 'mask_length': 10, 'mask_prob': 0.8, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'conv_pos': 128, 'conv_pos_groups': 16, 'latent_temp': [2.0, 0.5, 0.999995], 'skip_masked': False, 'skip_nomask': False, 'checkpoint_activations': False, 'required_seq_len_multiple': 2, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}
2024-12-29 22:14:17 | WARNING | rvc_python.modules.vc.modules | Traceback (most recent call last):
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/rvc_python/modules/vc/modules.py", line 184, in vc_single
    audio_opt = self.pipeline.pipeline(
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/rvc_python/modules/vc/pipeline.py", line 415, in pipeline
    self.vc(
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/rvc_python/modules/vc/pipeline.py", line 222, in vc
    logits = model.extract_features(**inputs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/hubert/hubert.py", line 535, in extract_features
    res = self.forward(
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/hubert/hubert.py", line 437, in forward
    features = self.forward_features(source)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/hubert/hubert.py", line 392, in forward_features
    features = self.feature_extractor(source)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/wav2vec/wav2vec2.py", line 895, in forward
    x = conv(x)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward
    input = module(input)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 375, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 370, in _conv_forward
    return F.conv1d(
NotImplementedError: Output channels > 65536 not supported at the MPS device. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

Setting `PYTORCH_ENABLE_MPS_FALLBACK=1` doesn't work no matter how I set it, and passing `RVCInference(device="cpu")` doesn't work either. Maybe the device parameter shouldn't be overwritten when `has_mps` is `True`? At the very least, a fallback/force option would be nice.
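
For reference, this is roughly how I'm setting the variable. As far as I know it has to be exported before torch is first imported, otherwise the MPS backend never sees it (the `rvc_python` import path below is my best guess):

```python
import os

# PYTORCH_ENABLE_MPS_FALLBACK is read when PyTorch initialises the MPS backend,
# so it has to be set before torch (or anything that imports torch) is loaded.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

from rvc_python.infer import RVCInference  # import path is a guess on my part

rvc = RVCInference(device="cpu")  # explicit CPU request, which currently gets overridden
```

And a rough sketch of the kind of device selection I'd hope for, where an explicit device argument wins over MPS auto-detection (names here are illustrative, not the library's actual code):

```python
import torch

def resolve_device(requested: str | None = None) -> str:
    """Only auto-select MPS/CUDA when the caller didn't ask for a device explicitly."""
    if requested is not None:
        return requested  # honour an explicit "cpu" (or anything else) as-is
    if torch.backends.mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"
```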
