MPS: NotImplementedError: Output channels > 65536 not supported at the MPS device. #34

@kotx

Description

When running inference on macOS, the following error occurs:

Details
2024-12-29 22:14:08 | INFO | fairseq.tasks.hubert_pretraining | HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
2024-12-29 22:14:08 | INFO | fairseq.models.hubert.hubert | HubertModel Config: {'_name': 'hubert', 'label_rate': 50.0, 'extractor_mode': default, 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': gelu, 'layer_type': transformer, 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.05, 'dropout_input': 0.1, 'dropout_features': 0.1, 'final_dim': 256, 'untie_final_proj': True, 'layer_norm_first': False, 'conv_feature_layers': '[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', 'conv_bias': False, 'logit_temp': 0.1, 'target_glu': False, 'feature_grad_mult': 0.1, 'mask_length': 10, 'mask_prob': 0.8, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'conv_pos': 128, 'conv_pos_groups': 16, 'latent_temp': [2.0, 0.5, 0.999995], 'skip_masked': False, 'skip_nomask': False, 'checkpoint_activations': False, 'required_seq_len_multiple': 2, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}
2024-12-29 22:14:17 | WARNING | rvc_python.modules.vc.modules | Traceback (most recent call last):
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/rvc_python/modules/vc/modules.py", line 184, in vc_single
    audio_opt = self.pipeline.pipeline(
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/rvc_python/modules/vc/pipeline.py", line 415, in pipeline
    self.vc(
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/rvc_python/modules/vc/pipeline.py", line 222, in vc
    logits = model.extract_features(**inputs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/hubert/hubert.py", line 535, in extract_features
    res = self.forward(
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/hubert/hubert.py", line 437, in forward
    features = self.forward_features(source)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/hubert/hubert.py", line 392, in forward_features
    features = self.feature_extractor(source)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/fairseq/models/wav2vec/wav2vec2.py", line 895, in forward
    x = conv(x)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/container.py", line 250, in forward
    input = module(input)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 375, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/Users/kot/Documents/nyano/tts/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 370, in _conv_forward
    return F.conv1d(
NotImplementedError: Output channels > 65536 not supported at the MPS device. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

Setting `PYTORCH_ENABLE_MPS_FALLBACK=1` doesn't work no matter how I set it, and passing `RVCInference(device="cpu")` doesn't work either. Maybe the device parameter shouldn't be overwritten when `has_mps` is `True`? At the very least, a fallback/force option would be nice.
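
For reference, this is roughly how I'm setting the variable. As far as I know it has to be exported before torch is first imported, otherwise the MPS backend never sees it (the `rvc_python` import path below is my best guess):

```python
import os

# PYTORCH_ENABLE_MPS_FALLBACK is read when PyTorch initialises the MPS backend,
# so it has to be set before torch (or anything that imports torch) is loaded.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

from rvc_python.infer import RVCInference  # import path is a guess on my part

rvc = RVCInference(device="cpu")  # explicit CPU request, which currently gets overridden
```

And a rough sketch of the kind of device selection I'd hope for, where an explicit device argument wins over MPS auto-detection (names here are illustrative, not the library's actual code):

```python
import torch

def resolve_device(requested: str | None = None) -> str:
    """Only auto-select MPS/CUDA when the caller didn't ask for a device explicitly."""
    if requested is not None:
        return requested  # honour an explicit "cpu" (or anything else) as-is
    if torch.backends.mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"
```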
