New blog: AI Dynamo on AKS series part 2 #5558
Pull request overview
This pull request adds Part 2 of the "AI Dynamo on AKS" blog series, focusing on NVIDIA Dynamo's Planner Profiler and SLO-based Planner for optimizing multi-node LLM inference. The PR also updates the Part 1 post's tags to establish series consistency.
Changes:
- New blog post introducing Dynamo Planner tools for automated performance tuning and dynamic scaling
- Added "Dynamo series" tag to Part 1 for series linking
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| website/blog/2026-01-22-dynamo-on-aks-part-2/index.md | New Part 2 blog post covering the Dynamo Planner Profiler and SLO-based Planner with real-world airline app scenario |
| website/blog/2025-10-24-dynamo-on-aks/index.md | Updated tags to include "Dynamo series" for series consistency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
website/blog/2026-01-22-dynamo-on-aks-part-2/index.md:102
- The acronym "AIC" is defined here but never used again in the document. Consider removing the acronym since it's only mentioned once, or if it will be referenced later, ensure it's used consistently.
> * **Hardware Simulation**: Using the **AI Configurator (AIC)** mode, the
allyford left a comment:
Some nits, overall lgtm!
Pull request overview
Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.
Pull request overview
Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.
> <!-- truncate -->
>
> ## The Challenge: Balancing the "Rate Matching" Equation
>
> Disaggregated serving separates the prefill and decode phases of inference
> across distinct GPU nodes. This allows each phase to be independently
> optimized with custom GPU counts and model parallelism configurations.
>
> 
Copilot (AI) commented on Jan 20, 2026:
According to the blog post content guidelines, a hero image should be placed immediately after the truncate marker. The current structure has the truncate marker on line 26, but the first image appears on line 34 after heading content. Consider adding a hero image right after the truncate marker to follow the recommended pattern.
Pull request overview
Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.
Pull request overview
Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.
> static optimization becomes a dynamic, SLO-aware reality on AKS, capable of
> weathering the unpredictable traffic spikes of production environments.
>
> Ultimately, this suite transforms your inference stack from a series of
Copilot (AI) commented on Jan 20, 2026:
Missing comma after introductory phrase. The sentence should read: "Ultimately, this suite transforms your inference stack..."
> Now, you can try this yourself by running the NVIDIA Dynamo Planner Profiler
> to capture burst and request behavior, then using the SLO-based Planner to
> translate latency targets into placement and scaling decisions on your AKS
> cluster. Setting it up in this order - profile under stress, define SLOs,
Copilot (AI) commented on Jan 20, 2026:
The dash separating items in this sentence should be an em dash (—) without spaces, or use a colon or period to separate the two independent clauses. Current usage with a hyphen is incorrect punctuation.
Suggested change:
- cluster. Setting it up in this order - profile under stress, define SLOs,
+ cluster. Setting it up in this order: profile under stress, define SLOs,
> these peaks, the underlying system requires the precise orchestration
> offered by a disaggregated architecture.
>
> To build a truly efficient disaggregated AI inference system you
Copilot (AI) commented on Jan 20, 2026:
Missing comma after introductory phrase. The sentence should read: "To build a truly efficient disaggregated AI inference system, you need to transition from manual..."
Suggested change:
- To build a truly efficient disaggregated AI inference system you
+ To build a truly efficient disaggregated AI inference system, you
> configuration that maximizes "Goodput", the maximum throughput
> achievable while staying strictly within your latency bounds.
>
> Ultimately the app developers and AI engineers reduce their time
Copilot (AI) commented on Jan 20, 2026:
Missing comma after introductory phrase. The sentence should read: "Ultimately, the app developers and AI engineers reduce their time..."
Suggested change:
- Ultimately the app developers and AI engineers reduce their time
+ Ultimately, the app developers and AI engineers reduce their time
|  | ||
|
|
||
| One of the main challenges in disaggregated serving is **rate matching**: | ||
| determining the right GPU allocation between prefill and decode stages to |
can you add more context on what the prefill and decode stages are and their importance before diving into how to solve these issues?
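For quick context (standard LLM-serving background, not text from the post): prefill runs the entire prompt through the model in one parallel pass to build the KV cache and produce the first token, so it is compute-bound and sets time-to-first-token (TTFT); decode then generates output tokens one at a time against that cache, so it is memory-bandwidth-bound and sets inter-token latency. A minimal Python sketch of the split, where `model.forward` and `model.forward_step` are hypothetical stand-ins rather than Dynamo's API:

```python
# Illustrative sketch only -- not Dynamo code. `model.forward` and
# `model.forward_step` are hypothetical stand-ins.

def prefill(model, prompt_tokens):
    # One parallel pass over the whole prompt: compute-bound.
    # Builds the KV cache and emits the first token, so it sets TTFT.
    kv_cache, first_token = model.forward(prompt_tokens)
    return kv_cache, first_token

def decode(model, kv_cache, first_token, max_new_tokens):
    # Autoregressive loop: one token per step, rereading the KV cache
    # each time. Memory-bandwidth-bound; it sets inter-token latency.
    tokens = [first_token]
    for _ in range(max_new_tokens - 1):
        kv_cache, next_token = model.forward_step(tokens[-1], kv_cache)
        tokens.append(next_token)
    return tokens
```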
> One of the main challenges in disaggregated serving is **rate matching**:
> determining the right GPU allocation between prefill and decode stages to
> meet a specific Service Level Objective (SLO). If you miscalculate the GPU
> ratio between these stages, you face two "silent killers" of performance:
is there a quick example on how the gpu ratio is calculated without the AI dynamo solution to show the benefit of AI dynamo?
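For illustration, a back-of-envelope version of the manual calculation, with made-up throughput numbers rather than measured Dynamo results:

```python
# Back-of-envelope rate matching with made-up numbers -- not measured data.
PREFILL_TOK_PER_S = 10_000  # prompt tokens/s one prefill worker sustains
DECODE_TOK_PER_S = 1_000    # output tokens/s one decode worker sustains

def prefill_to_decode_ratio(prompt_len: int, output_len: int) -> float:
    # Seconds of each stage consumed per request; their ratio is the
    # prefill:decode worker split that keeps both pools equally busy.
    prefill_s = prompt_len / PREFILL_TOK_PER_S
    decode_s = output_len / DECODE_TOK_PER_S
    return prefill_s / decode_s

print(prefill_to_decode_ratio(2_000, 200))  # 1.0 -> 1:1 worker split
print(prefill_to_decode_ratio(8_000, 200))  # 4.0 -> 4:1 worker split
```

The right ratio swings from 1:1 to 4:1 just by changing the prompt/response mix, which is what makes a static hand calculation fragile.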
> rerouting during flight delays. This use case is a 'stress test' for
> inference: it is subject to massive, sudden bursts in traffic and highly
> variable request patterns, such as a mix of short status queries and
> long-context itinerary processing. To prevent latency spikes during
Not very familiar with long-context itinerary processing. An example I could think of is when a flight gets cancelled and all the passengers on that flight are rebooking at the same time.
> profiler can simulate performance in just 20–30 seconds
> based on pre-measured performance data, allowing for rapid
> iteration before you ever touch a physical GPU.
> * **Resulting Recommendation**: The output is a highly tuned
It's a bit hard for me to follow how the Dynamo Planner Profiler automatically does this and the added benefits. Is there an example of what this looks like?
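For a rough sense of what the profiler does conceptually, a toy SLO-constrained sweep; the numbers and the selection rule here are hypothetical, not Dynamo's actual code or output format:

```python
# Toy SLO-constrained sweep -- hypothetical numbers, not Dynamo output.
candidates = [
    # (tensor_parallel, ttft_s, itl_s, tokens_per_s_per_gpu)
    (1, 0.9, 0.015, 480),
    (2, 0.5, 0.011, 310),
    (4, 0.3, 0.008, 190),
]
TTFT_SLO, ITL_SLO = 0.6, 0.012  # example latency targets

# "Goodput": keep only configs that respect both latency bounds,
# then pick the one with the highest per-GPU throughput.
feasible = [c for c in candidates if c[1] <= TTFT_SLO and c[2] <= ITL_SLO]
tp, ttft, itl, tput = max(feasible, key=lambda c: c[3])
print(f"recommend TP={tp}: {tput} tok/s/GPU within SLO")
```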
> Using the performance bounds identified earlier by the profiler, the Planner
> proactively scales the number of prefill and decode workers up or down. For
> example, if a *sudden burst of long-context itinerary queries* floods the
> system, the Planner detects the spike in the prefill queue and shifts available
> GPU resources to the prefill pool *before* your TTFT violates its SLO.
>
> Now, you can try this yourself by running the NVIDIA Dynamo Planner Profiler
> to capture burst and request behavior, then using the SLO-based Planner to
> translate latency targets into placement and scaling decisions on your AKS
> cluster. Setting it up in this order - profile under stress, define SLOs,
> and let the planner orchestrate your disaggregated inference system to
> handle sudden traffic spikes without latency spikes.
More detailed example could help here as well
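As one way to make this concrete, an illustrative queue-driven scaling rule under assumed numbers; this is not the Planner's real algorithm, just the shape of the decision it automates:

```python
import math

# Illustrative scaling rule only -- not the Planner's real algorithm.
def desired_prefill_workers(queued_prompt_tokens: int,
                            tokens_per_s_per_worker: float,
                            ttft_budget_s: float) -> int:
    # Enough workers that the queued prompt tokens drain within the
    # TTFT budget; the decode pool would get the remaining GPUs.
    capacity_per_worker = tokens_per_s_per_worker * ttft_budget_s
    return max(1, math.ceil(queued_prompt_tokens / capacity_per_worker))

# A burst of long-context itinerary queries: 600k queued prompt tokens,
# workers that each sustain 100k tokens/s, and a 0.5 s TTFT budget.
print(desired_prefill_workers(600_000, 100_000, 0.5))  # -> 12
```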
No description provided.