Description
I tried loading the model with
model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True)
but it raises the following error:
Traceback (most recent call last):
File "/home/ubuntu/test_scripts/benchmark_r.py", line 154, in
main()
File "/home/ubuntu/test_scripts/benchmark_r.py", line 63, in main
model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True,
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 556, in from_pretrained
return model_class.from_pretrained(
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3502, in from_pretrained
) = cls._load_pretrained_model(
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3926, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 805, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 348, in set_module_tensor_to_device
raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([2048, 2785]) in "weight" (which has shape torch.Size([2048, 5504])), this look incorrect.
I also tried adding ignore_mismatched_sizes=True when loading:
model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True, ignore_mismatched_sizes=True)
but that also fails:
Some weights of QWenLMHeadModel were not initialized from the model checkpoint at /data/xxxx and are newly initialized because the shapes did not match:
- transformer.h.10.mlp.c_proj.weight: found shape torch.Size([2048, 2785]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.10.mlp.w1.weight: found shape torch.Size([2785, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.10.mlp.w2.weight: found shape torch.Size([2785, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.11.mlp.c_proj.weight: found shape torch.Size([2048, 2518]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.11.mlp.w1.weight: found shape torch.Size([2518, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.11.mlp.w2.weight: found shape torch.Size([2518, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.12.attn.c_attn.weight: found shape torch.Size([3840, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.12.attn.c_proj.weight: found shape torch.Size([2048, 1280]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.12.mlp.c_proj.weight: found shape torch.Size([2048, 2393]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.12.mlp.w1.weight: found shape torch.Size([2393, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.12.mlp.w2.weight: found shape torch.Size([2393, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.13.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.13.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.13.mlp.c_proj.weight: found shape torch.Size([2048, 3776]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.13.mlp.w1.weight: found shape torch.Size([3776, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.13.mlp.w2.weight: found shape torch.Size([3776, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.14.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.14.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.14.mlp.c_proj.weight: found shape torch.Size([2048, 3594]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.14.mlp.w1.weight: found shape torch.Size([3594, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.14.mlp.w2.weight: found shape torch.Size([3594, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.15.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.15.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.15.mlp.c_proj.weight: found shape torch.Size([2048, 4113]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.15.mlp.w1.weight: found shape torch.Size([4113, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.15.mlp.w2.weight: found shape torch.Size([4113, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.16.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.16.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.17.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.17.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.17.mlp.c_proj.weight: found shape torch.Size([2048, 3263]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.17.mlp.w1.weight: found shape torch.Size([3263, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.17.mlp.w2.weight: found shape torch.Size([3263, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.18.mlp.c_proj.weight: found shape torch.Size([2048, 3861]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.18.mlp.w2.weight: found shape torch.Size([3861, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.18.attn.c_attn.weight: found shape torch.Size([1536, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.18.attn.c_proj.weight: found shape torch.Size([2048, 512]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.18.mlp.w1.weight: found shape torch.Size([3861, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.19.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.19.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.20.attn.c_attn.weight: found shape torch.Size([2688, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.20.attn.c_proj.weight: found shape torch.Size([2048, 896]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.20.mlp.c_proj.weight: found shape torch.Size([2048, 3291]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.20.mlp.w1.weight: found shape torch.Size([3291, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.20.mlp.w2.weight: found shape torch.Size([3291, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.21.attn.c_attn.weight: found shape torch.Size([1536, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.21.attn.c_proj.weight: found shape torch.Size([2048, 512]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.22.attn.c_attn.weight: found shape torch.Size([3072, 2048]) in the checkpoint and torch.Size([6144, 2048]) in the model instantiated
- transformer.h.22.attn.c_proj.weight: found shape torch.Size([2048, 1024]) in the checkpoint and torch.Size([2048, 2048]) in the model instantiated
- transformer.h.9.mlp.c_proj.weight: found shape torch.Size([2048, 2630]) in the checkpoint and torch.Size([2048, 5504]) in the model instantiated
- transformer.h.9.mlp.w1.weight: found shape torch.Size([2630, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
- transformer.h.9.mlp.w2.weight: found shape torch.Size([2630, 2048]) in the checkpoint and torch.Size([5504, 2048]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
File "/home/ubuntu/test_scripts/benchmark_r.py", line 152, in
main()
File "/home/ubuntu/test_scripts/benchmark_r.py", line 63, in main
model = AutoModelForCausalLM.from_pretrained(args.model_path, device_map="auto", trust_remote_code=True, low_cpu_mem_usage=True, ignore_mismatched_sizes=True)
File "/home/ubuntu/miniconda3/envs/qwen/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 556, in from_pretrained
return model_class.from_pretrained(
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3558, in from_pretrained
dispatch_model(model, **device_map_kwargs)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/accelerate/big_modeling.py", line 474, in dispatch_model
model.to(device)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2556, in to
return super().to(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1152, in to
return self._apply(convert)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 802, in _apply
module._apply(fn)
[Previous line repeated 2 more times]
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 825, in _apply
param_applied = fn(param)
File "/home/ubuntu/miniconda3/envs/xxx/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1150, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!
How do you load the model after pruning it? I am eager to try FLAP and look forward to your reply. Thank you.
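From the shape mismatches above, my guess is that structured pruning leaves every layer with a different width, so re-instantiating the architecture from the original config and then loading a state_dict cannot work. Below is a toy sketch of that failure mode using plain torch.nn modules (not the actual FLAP or Qwen code; the 2785 vs 5504 sizes are taken from the log above), contrasting it with saving the whole module object:

```python
import io
import torch
import torch.nn as nn

# Toy stand-in for one pruned MLP projection: the pruned checkpoint
# stores a smaller intermediate size (2785) than the unpruned config (5504).
pruned = nn.Linear(2048, 2785, bias=False)

# Re-instantiating from the original config allocates the unpruned shapes,
# so loading the pruned state_dict into it fails with a size mismatch.
# This is essentially what from_pretrained runs into above.
unpruned = nn.Linear(2048, 5504, bias=False)
try:
    unpruned.load_state_dict(pruned.state_dict())
except RuntimeError as err:
    print("state_dict load failed:", type(err).__name__)

# Saving the whole module object (not just the state_dict) preserves the
# per-layer pruned shapes, so no config-derived skeleton is needed on load.
buf = io.BytesIO()
torch.save(pruned, buf)
buf.seek(0)
restored = torch.load(buf, weights_only=False)  # unpickling a module needs weights_only=False
print(restored.weight.shape)  # torch.Size([2785, 2048])
```

So if the pruned model was saved with torch.save(model, path) (the full module, not just its state_dict), would torch.load(path) be the intended way to restore it? Or is there another loading path you recommend?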