RoBERTa-base hidden_size Mismatch Error #26

Description

@toututuya

I am encountering an issue while working with a modified roberta-base model configuration. The following settings are in use:
RobertaConf.hidden_size = 256
RobertaConf.intermediate_size = 1024
RobertaConf.num_attention_heads = 8
However, when running the model, I encounter the following error:
RuntimeError: Given normalized_shape=[768], expected input with shape [*, 768], but got input of size[1, 198, 256]
The error indicates a shape mismatch: a LayerNorm in the model still expects inputs with a hidden_size of 768, but the actual input has shape [1, 198, 256]. Since hidden_size was changed to 256, I am unsure how to resolve this discrepancy.
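For context, here is a minimal sketch of what I would expect to work, using the Hugging Face `transformers` classes `RobertaConfig` and `RobertaModel` as stand-ins for `RobertaConf` (these class names are my assumption and may not match this repo's API). Built from the modified config alone, the model is 256-dim end to end and runs without the error:

```python
import torch
from transformers import RobertaConfig, RobertaModel

# The modified dimensions from above.
config = RobertaConfig(
    hidden_size=256,
    intermediate_size=1024,
    num_attention_heads=8,
)

# Instantiating from the config (random init) sizes every layer,
# including the embedding LayerNorm, to hidden_size=256.
model = RobertaModel(config)
input_ids = torch.randint(0, config.vocab_size, (1, 198))
out = model(input_ids)
print(out.last_hidden_state.shape)  # torch.Size([1, 198, 256])
```

My guess (unconfirmed) is that the failure comes from combining the 256-dim config with pretrained roberta-base weights, which were trained with hidden_size=768: a LayerNorm keeps its 768-dim parameters while the embeddings now emit 256-dim activations, and the 198 in the error matches my sequence length.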
