Add load pattern configuration guide to benchmarks #1

Draft
mpashkovskii wants to merge 10 commits into main from docs/burst-concurrency-req-rate
@mpashkovskii
Owner

Purpose

Add comprehensive documentation for load pattern configuration parameters in vLLM benchmarking tools. This PR enhances the benchmarks documentation by:

  • Adding detailed explanations of --burstiness, --request-rate, and --max-concurrency parameters
  • Providing mathematical foundations for traffic pattern generation using Gamma distributions
  • Including data-driven visualizations showing request arrival patterns and inter-arrival time distributions
  • Offering practical use case recommendations for different testing scenarios
  • Explaining how to interpret KV cache configuration logs for capacity planning

This addresses the need for better guidance on simulating realistic load patterns for performance testing and capacity planning.
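
As a rough illustration of the Gamma-based traffic model described above, the sketch below samples inter-arrival times whose mean is fixed by the request rate and whose variability is controlled by a burstiness factor. It assumes that burstiness is used as the Gamma shape parameter and that the scale is `1 / (request_rate * burstiness)`, so the mean gap stays at `1 / request_rate`. The function name `sample_inter_arrival_times` and its `seed` argument are hypothetical and are not part of the vLLM benchmark code.

```python
import random


def sample_inter_arrival_times(request_rate, burstiness, num_requests, seed=0):
    """Sample gaps (in seconds) between consecutive requests.

    Assumption: gaps follow Gamma(shape=burstiness, scale=1/(request_rate*burstiness)),
    so the mean gap is 1/request_rate regardless of burstiness.
    burstiness == 1 gives exponential gaps (a Poisson arrival process);
    burstiness < 1 gives burstier traffic; burstiness > 1 gives more uniform traffic.
    """
    if request_rate <= 0 or burstiness <= 0:
        raise ValueError("request_rate and burstiness must be positive")
    rng = random.Random(seed)  # local generator keeps runs reproducible
    scale = 1.0 / (request_rate * burstiness)
    # random.gammavariate(alpha, beta): alpha is the shape, beta is the scale.
    return [rng.gammavariate(burstiness, scale) for _ in range(num_requests)]


def coefficient_of_variation(samples):
    """Return std/mean, a scale-free measure of how bursty the gaps are."""
    mean = sum(samples) / len(samples)
    variance = sum((x - mean) ** 2 for x in samples) / len(samples)
    return variance ** 0.5 / mean


if __name__ == "__main__":
    for b in (0.5, 1.0, 4.0):
        gaps = sample_inter_arrival_times(request_rate=10.0, burstiness=b,
                                          num_requests=20000)
        print(f"burstiness={b}: mean gap={sum(gaps) / len(gaps):.4f}s, "
              f"CV={coefficient_of_variation(gaps):.2f}")
```

Under these assumptions, the mean gap should stay close to `1 / request_rate` for every burstiness value, while the coefficient of variation shrinks as burstiness grows (in theory it equals `1 / sqrt(burstiness)`).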

Test Plan

N/A

Test Result

N/A


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

BEFORE SUBMITTING, PLEASE READ https://docs.vllm.ai/en/latest/contributing (anything written below this line will be removed by GitHub Actions)

…rrency / burstiness use cases

Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
mpashkovskii force-pushed the docs/burst-concurrency-req-rate branch from 33850b4 to d7f101d on October 15, 2025 11:21
mpashkovskii and others added 7 commits October 15, 2025 11:52
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Matvei Pashkovskii <matvei.pashkovskii@amd.com>
Signed-off-by: Matvei Pashkovskii <mpashkov@amd.com>
2 participants