
Add max-model-len param for vLLM #502

Open
oyilmaz-nvidia wants to merge 3 commits into main from onur/add_max_model_len

Conversation

@oyilmaz-nvidia
Contributor

No description provided.

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Nov 4, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@oyilmaz-nvidia
Contributor Author

/ok to test 6ffc9b3

cpu_offload_gb: float = 0,
enforce_eager: bool = False,
max_seq_len_to_capture: int = 8192,
max_model_len: int = 8192,
Contributor

Thank you @oyilmaz-nvidia. For my understanding, why is this param needed now? Was it newly introduced by vLLM?

Contributor Author

We didn't really need to set this parameter until now, but some of the large models, like Llama 70B, need some tuning to fit into the GPUs. CI is now erroring out when trying to fit this model (it worked before, but with newer versions of vLLM we may need to tune it).
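A rough sense of why lowering max_model_len helps a 70B model fit can be sketched with KV-cache arithmetic. The sketch below uses assumed Llama-2-70B-style figures (80 layers, 8 KV heads via grouped-query attention, head dim 128, fp16); these numbers are illustrative, not taken from this PR:

```python
def kv_cache_bytes(max_model_len: int,
                   num_layers: int = 80,    # assumed 70B-style config
                   num_kv_heads: int = 8,   # grouped-query attention
                   head_dim: int = 128,
                   dtype_bytes: int = 2) -> int:
    """Bytes of KV cache one sequence needs at full context length."""
    # Two tensors (K and V) per layer, one entry per token per KV head.
    per_token = 2 * num_kv_heads * head_dim * dtype_bytes * num_layers
    return max_model_len * per_token

# Halving max_model_len halves the per-sequence KV-cache reservation,
# which is the kind of headroom a 70B model may need on a tight GPU.
print(kv_cache_bytes(8192) / 2**30)  # 2.5 GiB per sequence
print(kv_cache_bytes(4096) / 2**30)  # 1.25 GiB per sequence
```

With these assumed shapes, capping max_model_len at 8192 bounds the KV cache at about 2.5 GiB per concurrent sequence, which is why the parameter becomes a useful tuning knob when the weights alone nearly fill the GPUs.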

Contributor

@athitten athitten left a comment

Thank you @oyilmaz-nvidia!

oyilmaz-nvidia and others added 2 commits November 5, 2025 14:31
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
@github-actions github-actions bot added the tests label Nov 5, 2025
@oyilmaz-nvidia
Contributor Author

/ok to test 4d16afa
