I am encountering an issue while working with the roberta-base model configuration. I noticed that the following configuration is being used:
```python
RobertaConf.hidden_size = 256
RobertaConf.intermediate_size = 1024
RobertaConf.num_attention_heads = 8
```
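For reference, this is roughly the equivalent construction with Hugging Face transformers' `RobertaConfig` (an assumption on my part about what `RobertaConf` wraps; the repo's own config object may map these fields differently):

```python
# Assumed Hugging Face equivalent of the settings above; RobertaConf in this
# repo may wrap the config differently.
from transformers import RobertaConfig

config = RobertaConfig(
    hidden_size=256,          # down from the roberta-base default of 768
    intermediate_size=1024,   # down from the default 3072
    num_attention_heads=8,    # down from 12; 256 / 8 = 32-dim heads
)
```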
However, when running the model, I encounter the following error:
```
RuntimeError: Given normalized_shape=[768], expected input with shape [*, 768], but got input of size[1, 198, 256]
```
The error indicates a shape mismatch: a LayerNorm in the model expects inputs whose last dimension is 768, but the hidden states it receives have shape [1, 198, 256]. Since I changed hidden_size to 256, I am unsure how to resolve this discrepancy.
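If it helps narrow this down, here is a minimal sketch of what the traceback describes, using plain PyTorch (the shapes are taken from the error above; the module itself is illustrative, not taken from this repo):

```python
import torch

# A LayerNorm shaped for the default roberta-base width (hidden_size=768) ...
layer_norm = torch.nn.LayerNorm(768)

# ... receiving hidden states produced by the hidden_size=256 configuration.
hidden_states = torch.randn(1, 198, 256)

layer_norm(hidden_states)
# RuntimeError: Given normalized_shape=[768], expected input with shape [*, 768],
# but got input of size[1, 198, 256]
```

So it looks like some part of the model is still sized for hidden_size=768 (most likely weights loaded from the original roberta-base checkpoint), while the hidden states follow the reduced config.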