Skip to content

pruning 之后使用 无法读取模型 #7

@JCDemon

Description

@JCDemon

我尝试使用
model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True)
但是会报错
“Traceback (most recent call last):
File "/home/ubuntu/test_scripts/benchmark_r.py", line 154, in
main()
File "/home/ubuntu/test_scripts/benchmark_r.py", line 63, in main
model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True,
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 556, in from_pretrained
return model_class.from_pretrained(
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3502, in from_pretrained
) = cls._load_pretrained_model(
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3926, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 805, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 348, in set_module_tensor_to_device
raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([2048, 2785]) in "weight" (which has shape torch.Size([2048, 5504])), this look incorrect.

我也尝试了在加载的时候添加参数 ignore_mismatched_sizes=True
model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True, ignore_mismatched_sizes=True)

同样也会报错:
Some weights of QWenLMHeadModel were not initialized from the model checkpoint at /data/xxxx and are newly initialized because the shapes did not match:

  • transformer.h.10.mlp.c_proj.weight: found shape torch.Size([2048, 2785]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.10.mlp.w1.weight: found shape torch.Size([2785, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.10.mlp.w2.weight: found shape torch.Size([2785, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.11.mlp.c_proj.weight: found shape torch.Size([2048, 2518]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.11.mlp.w1.weight: found shape torch.Size([2518, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.11.mlp.w2.weight: found shape torch.Size([2518, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.12.attn.c_attn.weight: found shape torch.Size([3840, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.12.attn.c_proj.weight: found shape torch.Size([2048, 1280]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.12.mlp.c_proj.weight: found shape torch.Size([2048, 2393]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.12.mlp.w1.weight: found shape torch.Size([2393, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.12.mlp.w2.weight: found shape torch.Size([2393, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.13.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.13.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.13.mlp.c_proj.weight: found shape torch.Size([2048, 3776]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.13.mlp.w1.weight: found shape torch.Size([3776, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.13.mlp.w2.weight: found shape torch.Size([3776, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.14.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.14.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.14.mlp.c_proj.weight: found shape torch.Size([2048, 3594]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.14.mlp.w1.weight: found shape torch.Size([3594, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.14.mlp.w2.weight: found shape torch.Size([3594, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.15.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.15.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.15.mlp.c_proj.weight: found shape torch.Size([2048, 4113]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.15.mlp.w1.weight: found shape torch.Size([4113, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.15.mlp.w2.weight: found shape torch.Size([4113, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.16.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.16.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.17.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.17.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.17.mlp.c_proj.weight: found shape torch.Size([2048, 3263]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.17.mlp.w1.weight: found shape torch.Size([3263, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.17.mlp.w2.weight: found shape torch.Size([3263, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.18.mlp.c_proj.weight: found shape torch.Size([2048, 3861]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.18.mlp.w2.weight: found shape torch.Size([3861, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.18.attn.c_attn.weight: found shape torch.Size([1536, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.18.attn.c_proj.weight: found shape torch.Size([2048, 512]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.18.mlp.w1.weight: found shape torch.Size([3861, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.19.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.19.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.20.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.20.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.20.mlp.c_proj.weight: found shape torch.Size([2048, 3291]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.20.mlp.w1.weight: found shape torch.Size([3291, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.20.mlp.w2.weight: found shape torch.Size([3291, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.21.attn.c_attn.weight: found shape torch.Size([1536, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.21.attn.c_proj.weight: found shape torch.Size([2048, 512]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.22.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
  • transformer.h.22.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
  • transformer.h.9.mlp.c_proj.weight: found shape torch.Size([2048, 2630]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
  • transformer.h.9.mlp.w1.weight: found shape torch.Size([2630, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
  • transformer.h.9.mlp.w2.weight: found shape torch.Size([2630, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
    Traceback (most recent call last):
    File "/home/ubuntu/test_scripts/benchmark_r.py", line 152, in
    main()
    File "/home/ubuntu/test_scripts/benchmark_r.py", line 63, in main
    model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True, ignore_mismatched_sizes=True)
    File "/home/ubuntu/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 556, in from_pretrained
    return model_class.from_pretrained(
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3558, in from_pretrained
    dispatch_model(model, **device_map_kwargs)
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/accelerate/big_modeling.py", line 474, in dispatch_model
    model.to(device)
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2556, in to
    return super().to(*args, **kwargs)
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1152, in to
    return self._apply(convert)
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
    [Previous line repeated 2 more times]
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)
    File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    NotImplementedError: Cannot copy out of meta tensor; no data!

请问你们在prune模型之后是怎么去加载的呢。
很着急尝试FLAP,期待您的回复,谢谢。

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions