
Docs: Add HyperJob concept documentation for multi-cluster job splitting #477

Open
codeEvolveZenith345 wants to merge 3 commits into volcano-sh:master from codeEvolveZenith345:docs-hyperjob

@codeEvolveZenith345
Contributor

Description

Introduce HyperJob documentation in both the English and Chinese Concepts sections, covering multi-cluster job splitting, key features, use cases, and how HyperJob differs from a standard Volcano Job.

  • Please check if the PR fulfills these requirements
  • The commit message follows our guidelines
  • What kind of change does this PR introduce?

/kind documentation

  • What this PR does / why we need it:

This PR adds comprehensive documentation for the HyperJob concept to the Volcano website in both English and Chinese. HyperJob is a high-level multi-cluster scheduling abstraction built on top of Volcano Job that enables:

  • Automatic job splitting across multiple Kubernetes clusters
  • Unified status management for multi-cluster workloads
  • Simplified multi-cluster AI/ML job orchestration
  • Better resource utilization across heterogeneous clusters
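
To make the first two bullets concrete for reviewers, here is a minimal Python sketch of what "splitting" and "unified status" could mean. It is an illustration only, not HyperJob's actual algorithm or API: the `Cluster` type, the `free_gpus` field, the proportional split rule, and the phase names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of a member cluster (not a Volcano type)."""
    name: str
    free_gpus: int


def split_replicas(total: int, clusters: list[Cluster]) -> dict[str, int]:
    """Split `total` replicas across clusters in proportion to free GPUs.

    Uses the largest-remainder method so the shares always sum to `total`.
    This sketch does not cap a share at the cluster's free capacity.
    """
    capacity = sum(c.free_gpus for c in clusters)
    if total < 0 or capacity <= 0:
        raise ValueError("need a non-negative total and some free capacity")
    # Floor of each cluster's proportional share.
    shares = {c.name: total * c.free_gpus // capacity for c in clusters}
    # Hand the replicas lost to rounding to the largest remainders first.
    leftover = total - sum(shares.values())
    by_remainder = sorted(
        clusters,
        key=lambda c: (total * c.free_gpus) % capacity,
        reverse=True,
    )
    for c in by_remainder[:leftover]:
        shares[c.name] += 1
    return shares


def aggregate_status(phases: dict[str, str]) -> str:
    """Collapse per-cluster job phases into one overall phase (assumed rule)."""
    values = set(phases.values())
    if "Failed" in values:
        return "Failed"
    if values == {"Completed"}:
        return "Completed"
    if "Running" in values:
        return "Running"
    return "Pending"
```

For example, `split_replicas(10, [Cluster("a", 8), Cluster("b", 4), Cluster("c", 4)])` assigns 5, 3, and 2 replicas, and `aggregate_status` reports `Failed` as soon as any per-cluster job fails. A real controller would also need to handle capacity limits, retries, and partial placement, which this sketch leaves out.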

Additionally, cross-reference links have been added to the existing Multi-Cluster Scheduling documentation to improve discoverability.

  • Which issue(s) this PR fixes:

Fixes #475

Introduce HyperJob documentation in both English and Chinese Concepts sections, covering multi-cluster job splitting, key features, use cases, and distinctions from standard Volcano Jobs

Signed-off-by: Lakshya Saxena <suntiwari3495@gmail.com>
@volcano-sh-bot added the kind/documentation label (categorizes issue or PR as related to documentation) on Feb 5, 2026
@volcano-sh-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kevin-wangzefeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

Hello @codeEvolveZenith345, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed quickly.

This pull request significantly enhances the Volcano project's documentation by introducing the HyperJob concept. HyperJob provides a crucial high-level abstraction for managing large-scale AI/ML workloads across multiple Kubernetes clusters, simplifying complex distributed job orchestration. It aims to improve resource utilization and provide a unified control plane for jobs that span heterogeneous and geographically dispersed environments, thereby making multi-cluster AI job scheduling more accessible and efficient.

Highlights

  • New HyperJob Documentation: Introduced comprehensive documentation for the HyperJob concept in both English and Chinese, detailing its purpose, features, and usage within the Volcano ecosystem.
  • Multi-Cluster AI Workload Orchestration: HyperJob is presented as a high-level abstraction for multi-cluster AI workloads, enabling automatic job splitting, unified status management, and simplified orchestration across multiple Kubernetes clusters.
  • Key Features and Use Cases: The documentation outlines HyperJob's core capabilities such as automatic job splitting, unified status management, and high-level scheduling, along with typical use cases like multi-cluster LLM training and heterogeneous resource scheduling.
  • Comparison with Volcano Job: A clear distinction is made between HyperJob and standard Volcano Job, highlighting when to use each based on single-cluster vs. multi-cluster requirements and job splitting needs.
  • Cross-Referencing: Existing multi-cluster scheduling documentation (both English and Chinese) has been updated to include cross-reference links to the new HyperJob concept, improving discoverability.

Changelog
  • content/en/docs/hyperjob.md
    • Added new English documentation for the HyperJob concept, covering its overview, key features, comparison with Volcano Job, typical use cases, architectural overview, and a conceptual YAML example.
    • Configured the document for inclusion in the 'concepts' section of the documentation sidebar with a weight of 4.
  • content/en/docs/multi_cluster_scheduling.md
    • Appended a new paragraph and link referencing the HyperJob documentation, guiding users to a high-level multi-cluster abstraction for job splitting.
  • content/zh/docs/hyperjob.md
    • Added new Chinese documentation for the HyperJob concept, mirroring the English content, including its overview, core features, comparison with Volcano Job, typical use cases, architecture, and a conceptual YAML example.
    • Configured the document for inclusion in the 'concepts' section of the Chinese documentation sidebar with a weight of 4.
  • content/zh/docs/multi_cluster_scheduling.md
    • Appended a new paragraph and link referencing the Chinese HyperJob documentation, providing guidance on using a high-level abstraction for multi-cluster job splitting.
Activity
  • The author, codeEvolveZenith345, initiated this pull request to add new documentation for the HyperJob concept.
  • The pull request is categorized as '/kind documentation'.
  • This PR fixes issue #475, "Add HyperJob concept to concepts documentation (both EN and CN)".
  • No further review comments or activity have been recorded since the initial submission.
@volcano-sh-bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) on Feb 5, 2026
@codeEvolveZenith345
Contributor Author

/cc @JesseStutler

@gemini-code-assist bot left a comment

Code Review

This pull request successfully introduces comprehensive documentation for the HyperJob concept in both English and Chinese. The new hyperjob.md files clearly explain the purpose, key features, use cases, and architecture of HyperJob, along with a helpful conceptual YAML example. Additionally, the cross-references added to the multi_cluster_scheduling.md files significantly improve the discoverability of this new concept. The documentation is well-structured and provides valuable information for users looking to understand multi-cluster job splitting with HyperJob.

Signed-off-by: Lakshya Saxena <suntiwari3495@gmail.com>
Signed-off-by: Lakshya Saxena <suntiwari3495@gmail.com>
