
Balancer Code Analysis

Avilez-dev-11 edited this page Dec 1, 2023 · 22 revisions

MappingState class:

  • The constructor __init__ initializes various attributes of the MappingState object, such as osdmap, raw_pg_stats, raw_pool_stats, and others.
  • It calculates the pg_stat attribute based on the provided raw_pg_stats.
  • It determines common pool IDs between OSD and PG statistics.
  • OSD Statistics: OSDs (Object Storage Daemons) are responsible for managing storage devices in a Ceph cluster. OSD statistics typically include information about the status, health, and performance of individual OSDs in the cluster. These statistics may also include details about the pools that each OSD is responsible for.
  • PG Statistics: PGs (Placement Groups) are a concept in Ceph used for data placement and distribution across OSDs. PG statistics would include information about the status, distribution, and health of these PGs in the cluster.
  • It creates dictionaries for pg_up and pg_up_by_poolid.
  • The calc_misplaced_from method calculates the percentage of misplaced placement groups (PGs) between two MappingState objects.
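The misplaced-percentage idea can be sketched as follows. Here pg_up_a and pg_up_b stand in for the pg_up dictionaries (pgid → list of OSD ids) of two MappingState objects; the counting rule is an illustrative assumption, not the exact Ceph implementation:

```python
# Hypothetical sketch of a misplaced-PG calculation between two mapping
# states. pg_up_a / pg_up_b map a pgid to the list of OSDs it is placed on.
def calc_misplaced_from(pg_up_a, pg_up_b):
    """Return the fraction of PG replica placements that moved."""
    total = 0   # total number of (pg, replica) placements
    moved = 0   # placements whose OSD changed between the two states
    for pgid, up_a in pg_up_a.items():
        up_b = pg_up_b.get(pgid, [])
        total += len(up_a)
        # count replicas that are no longer on the same OSD
        moved += len(set(up_a) - set(up_b))
    return moved / total if total else 0.0
```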

Mode enum class:

  • It defines an enumeration of cluster modes, including "none," "crush-compat," and "upmap." These modes might represent different strategies or configurations for managing the Ceph cluster.
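The enumeration described above can be sketched as a minimal stand-in (the real class lives in the balancer module; member spellings here are illustrative):

```python
from enum import Enum

# Minimal stand-in for the balancer's Mode enumeration; the string values
# mirror the documented modes.
class Mode(Enum):
    none = 'none'
    crush_compat = 'crush-compat'
    upmap = 'upmap'
```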

Plan class:

  • This class represents a plan for managing the Ceph cluster.
  • The constructor __init__ initializes various attributes, including the plan's name, mode, OSD map, pools, and other related data.
  • It defines methods like dump and show, which can be used to obtain a JSON representation of the plan and a textual description of the plan, respectively.
  • It seems that this code is a part of a larger Ceph cluster management system, and these classes provide abstractions for managing and planning changes to the cluster's configuration. Depending on how these classes are used in the broader codebase, they likely play a role in orchestrating and executing changes to the Ceph cluster's OSD (Object Storage Daemon) placement or other configurations.
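A minimal sketch of the Plan shape described above, with simplified attributes and hypothetical output formats (not the actual Ceph implementation):

```python
import json

# Illustrative stand-in for the Plan class: dump() gives a JSON
# representation, show() a short textual description.
class Plan:
    def __init__(self, name, mode, osdmap, pools):
        self.name = name
        self.mode = mode
        self.osdmap = osdmap
        self.pools = pools

    def dump(self):
        # JSON representation of the plan (fields chosen for illustration)
        return json.dumps({'name': self.name, 'mode': self.mode,
                           'pools': self.pools}, indent=4)

    def show(self):
        # textual description of the plan
        return 'plan %s (mode %s) covering pools: %s' % (
            self.name, self.mode, ', '.join(self.pools))
```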

MsPlan class:

  • The MsPlan class extends the functionality of the base Plan class by adding specific methods and attributes related to a MappingState member.
  • The __init__ method serves as the constructor for the MsPlan class. It takes the following parameters: name, a string naming the plan; mode, a string giving the plan's mode; ms, a preloaded MappingState instance; and pools, a list of pool names. The constructor calls super() to invoke the parent Plan constructor, passing name, mode, ms.osdmap, and pools; ms.osdmap is an attribute of the MappingState object representing the cluster's OSD map. Finally, it stores the initial state of the plan by setting self.initial to the ms parameter.
  • final_state Method:

This method generates a final state: it updates properties of an object called self.inc and returns a new instance of a MappingState object.

  • show Method:

This method generates a textual representation (a string) of the plan. It initializes an empty list ls. It appends comments and commands to the ls list based on various properties and data stored within the class. Finally, it joins the elements of the ls list into a single string using line breaks as separators and returns this string.

Eval class:

  • The Eval Class evaluates the data distribution and statistics in the MappingState.
  • The constructor __init__ takes in a MappingState instance and initializes the dictionaries used to track per-pool and per-OSD values.
  • The show method takes in a boolean flag; depending on whether it is true or false, it prints out a detailed or brief summary of the MappingState instance.
  • The calc_stats method calculates statistics based on count, target, and total dictionaries containing values, storing the statistics in a result dictionary.
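The kind of statistics calc_stats is described as producing can be illustrated with a simplified sketch. The count/target/total parameters mirror the text; the exact score formula here is an assumption for illustration:

```python
import math

# Illustrative per-OSD distribution statistics: count maps an OSD id to
# its observed value (e.g. PG count), target maps it to the ideal value,
# and total is the sum across all OSDs.
def calc_stats(count, target, total):
    avg = total / len(count)
    stddev = math.sqrt(sum((v - avg) ** 2 for v in count.values())
                       / len(count))
    # simple score: normalized total deviation from target (0 = balanced)
    score = sum(abs(count[osd] - target[osd]) for osd in count) / total
    return {'avg': avg, 'stddev': stddev, 'score': score}
```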

Module class:

  • The class defines a set of module options (MODULE_OPTIONS) using the Option class. These options control various aspects of automatic balancing and optimization within the Ceph cluster. Each option has a name, a type, a default value, a description, and, where applicable, additional constraints such as minimum and maximum values.
  • Options overview — some notable options include:
  • active: A boolean indicating whether automatic balancing is active.
  • begin_time and end_time: Time range for automatic balancing.
  • crush_compat_max_iterations: Maximum number of iterations for optimization.
  • crush_compat_metrics: Metrics used to calculate OSD (Object Storage Daemon) utilization.
  • min_score: Minimum score for optimization.
  • mode: Balancer mode, with possible values of 'none', 'crush-compat', or 'upmap'.
  • sleep_interval: How frequently the system wakes up to attempt optimization.
  • upmap_max_optimizations and upmap_max_deviation: Parameters related to upmap optimization.
  • pool_ids: Pools to which automatic balancing is limited.
  • Runtime Parameters:
  • Many of these options are marked as runtime=True, indicating that they can be dynamically adjusted during runtime.
  • Documentation:
  • Descriptive strings such as desc and long_desc are provided for each option, offering clear explanations of their purposes and usage.
  • Enum:
  • The mode option is an enumerated type, restricting its values to a predefined set ('none', 'crush-compat', 'upmap').
  • Default Values:
  • Default values are provided for each option, ensuring that the system will function even if the user does not explicitly configure these parameters.
  • Attributes:
  • active: A boolean attribute indicating whether automatic balancing is currently active. The value is set to False.
  • run: A boolean attribute set to True.
  • plans: A dictionary attribute, initialized as an empty dictionary, presumably to store optimization plans.
  • mode: An empty string attribute.
  • optimizing: A boolean attribute set to False.
  • last_optimize_started: An empty string attribute.
  • last_optimize_duration: An empty string attribute.
  • optimize_result: An empty string attribute.
  • no_optimization_needed: A boolean attribute set to False.
  • success_string: A string attribute containing the message 'Optimization plan created successfully'.
  • in_progress_string: A string attribute containing the message 'in progress'.
  • Constructor (__init__ method):
  • The constructor initializes the class, calling the superclass constructor and setting the event attribute to an instance of the Event class.
  • CLI Commands:
  • The class defines four CLI commands: show_status, set_mode, on, and off, each annotated with decorators (@CLIReadCommand and @CLICommand) to specify their behavior in the Ceph CLI.
  • show_status Method:

  • Displays the status of the balancer, returning a tuple with an integer exit code, a JSON-formatted status string, and an empty string.
  • The status includes information such as active plans, last optimization details, and the current mode.
  • set_mode Method:

  • Sets the balancer mode based on the provided Mode enum.
  • Performs additional checks and actions based on the selected mode.
  • Returns a tuple with an integer exit code, an empty string, and a warning message if applicable.
  • on Method:

  • Enables automatic balancing if it's not already active.
  • Sets the active attribute to True and signals the event.
  • Returns a tuple with an integer exit code, an empty string, and an empty string.
  • off Method:

  • Disables automatic balancing if it's currently active.
  • Sets the active attribute to False and signals the event.
  • Returns a tuple with an integer exit code, an empty string, and an empty string.
  • These CLI commands provide a programmatic interface for interacting with and controlling the behavior of the automatic balancing module in a Ceph cluster. The show_status command allows users to inspect the current status, while set_mode, on, and off commands provide ways to configure and control the automatic balancing mode.
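The (exit code, output, error) tuple convention described above can be sketched as a simplified stand-alone function; the attribute names mirror the text, but this is not the real method:

```python
import json

# Sketch of show_status's return shape: (rc, stdout, stderr). The real
# method reads these values off the Module instance.
def show_status(active, mode, plans, optimize_result):
    status = {
        'active': active,
        'mode': mode,
        'plans': list(plans.keys()),
        'last_optimize_result': optimize_result,
    }
    return (0, json.dumps(status, indent=4, sort_keys=True), '')
```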
  • Pool Management Commands:
  • pool_ls: Lists automatic balancing pools. It retrieves the pool IDs from the module options, converts them to pool names, and prunes non-existing pools. The result is a JSON-formatted list of pool names.
  • pool_add: Enables automatic balancing for specific pools. It validates pool names, adds valid pools to the existing ones, and updates the module options.
  • pool_rm: Disables automatic balancing for specific pools. It validates pool names, removes specified pools from the existing ones, and updates the module options.
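The validate-then-update bookkeeping of pool_add and pool_rm can be sketched as plain functions. The -22 return code (-EINVAL) and the set-based representation are assumptions for illustration:

```python
# Illustrative pool bookkeeping: validate names against the pools that
# actually exist, then merge or subtract from the balanced set.
def pool_add(existing, pools, known_pools):
    """Return (rc, updated_pools, err) after validating the given names."""
    invalid = sorted(set(pools) - set(known_pools))
    if invalid:
        return -22, set(existing), 'pool(s) %s not found' % invalid  # -EINVAL
    return 0, set(existing) | set(pools), ''

def pool_rm(existing, pools):
    """Remove the given pools from the balanced set."""
    return 0, set(existing) - set(pools), ''
```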
  • State and Evaluation Commands:
  • _state_from_option: A helper method that determines the state (MappingState) based on the provided option (plan, pool, or current cluster state).
  • plan_eval_verbose: Evaluates data distribution for the current cluster, specific pool, or specific plan in a verbose manner.
  • plan_eval_brief: Evaluates data distribution for the current cluster, specific pool, or specific plan in a brief manner.
  • Optimization Planning Commands:
  • plan_optimize: Runs the optimizer to create a new plan. It performs checks, creates a new plan, runs the optimization, and updates the results accordingly.
  • plan_show: Shows details of an optimization plan.
  • plan_rm: Discards an optimization plan.
  • plan_reset: Discards all optimization plans.
  • plan_dump: Shows the details of an optimization plan in a raw format.
  • plan_ls: Lists all optimization plans.
  • plan_execute: Executes an optimization plan, performing checks and updating the results accordingly.
  • Summary and Comments:
  • The commands provide a comprehensive set of operations for managing automatic balancing, including configuring pools, evaluating data distribution, creating, showing, and executing optimization plans.
  • The methods encapsulate logic related to validation, option management, and plan execution.
  • The code follows a clear structure and leverages helper methods for readability and maintainability.
  • CLI commands are decorated with annotations (@CLIReadCommand and @CLICommand) to specify their behavior in the Ceph CLI.
  • Overall, this part of the class extends the functionality of the automatic balancing module, offering commands to manage pools, evaluate data distribution, create optimization plans, and interact with existing plans in various ways.
  • Shutdown and Serve Methods:

  • shutdown: Stops the module by setting the run flag to False and signaling the event.

  • serve: The main loop of the module, which runs as long as the run flag is True. It checks whether the module is active and within the permitted time window. If the conditions are met, it creates a new optimization plan, runs the optimizer, and executes the plan.

> THIS is the main code we will be tackling for our contributions to this project. serve is where the actual optimization takes place. Reference this method when deciding when to update pg_upmap_items.

    1. Initialization and Logging:
    • Log an informational message indicating that the process is starting.
    • Enter a while loop that continues executing as long as the self.run attribute is True.
    2. Configuration Retrieval:
    • Retrieve the values of active and sleep_interval from the module's configuration.
    3. Debug Logging:
    • Log debug information about the current state of the module (active or inactive) and the current time.
    4. Conditional Execution:
    • Check whether the module is marked as active (self.active) and whether self.time_permit() returns True. If both conditions are met, execute the block of code inside.
    5. Optimization Logic:
    • Generate a unique name based on the current time.
    • Retrieve information about the OSD (Object Storage Daemon) map.
    • Extract pool IDs from the configuration and filter them down to valid pools.
    • Create a plan for optimization.
    • Execute the optimization plan and, if successful, perform further actions.
    6. Sleeping Interval:
    • Log debug information about the sleeping interval.
    • Wait for the specified sleep_interval before continuing the loop.
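The steps above can be condensed into a sketch of the loop. The helper names (time_permit, plan_create, optimize, execute) mirror the text, while get_config, event_wait, allowed_pools, and the module object itself are hypothetical stand-ins:

```python
import time

# Condensed, hypothetical sketch of the serve loop described above.
def serve(module):
    module.log_info('Starting')
    while module.run:
        active = module.get_config('active')
        sleep_interval = module.get_config('sleep_interval')
        if active and module.time_permit():
            # unique plan name based on the current time
            name = 'auto_%s' % time.strftime('%Y-%m-%d_%H:%M:%S')
            plan = module.plan_create(name, module.get_osdmap(),
                                      module.allowed_pools())
            # run the optimizer and, on success, execute the plan
            if module.optimize(plan) == 0:
                module.execute(plan)
        # sleep until the interval elapses or the event is signalled
        module.event_wait(sleep_interval)
```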
  • Time Permit Method:

  • time_permit: Checks if the current time and day fall within the configured time and weekday window specified in the module options. It ensures that the module is allowed to run based on the time and weekday settings.
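A simplified window check in the spirit of time_permit might look like this, assuming 'HHMM' time strings and Sunday-based weekday numbers (both assumptions based on the option descriptions):

```python
import time

# Illustrative time/weekday window check; handles windows that wrap past
# midnight or past the end of the week.
def time_permit(begin_time, end_time, begin_weekday, end_weekday, now=None):
    now = now or time.localtime()
    hhmm = time.strftime('%H%M', now)
    weekday = (now.tm_wday + 1) % 7  # convert Monday=0 to Sunday=0 numbering
    if begin_time <= end_time:
        time_ok = begin_time <= hhmm < end_time
    else:  # window wraps past midnight
        time_ok = hhmm >= begin_time or hhmm < end_time
    if begin_weekday <= end_weekday:
        day_ok = begin_weekday <= weekday <= end_weekday
    else:  # window wraps past the end of the week
        day_ok = weekday >= begin_weekday or weekday <= end_weekday
    return time_ok and day_ok
```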
  • Plan Creation and Evaluation Methods:

  • plan_create: Creates a new optimization plan based on the specified name, OSD map, and list of pools. The type of plan created depends on the configured mode.
  • calc_eval: Calculates the evaluation metrics for the current state, considering pools and their distributions. It calculates actual and target distributions, averages, standard deviations, and scores.
  • evaluate: Invokes calc_eval and returns a string representation of the evaluation results, which can be verbose or brief based on the verbose parameter.
  • Optimization Method:

  • optimize: Initiates the optimization process based on the specified plan. It checks for various conditions such as unknown PGs, degraded objects, inactive PGs, and misplaced objects. If conditions are met, it proceeds to optimize based on the specified mode (upmap or crush-compat).
  • Upmap and Crush-Compat Optimization Methods:
  • do_upmap: Executes the optimization for the upmap mode (described in more detail below).
  • do_crush_compat: Executes the optimization for the crush-compat mode, which maintains compatibility with the CRUSH map (described in more detail below).
  • Logging and Error Handling:
  • The methods log relevant information, warnings, and errors using the module's logger.
  • The methods return error codes and details if certain conditions prevent optimization.
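The precondition checks can be sketched as a guard function; the field names and the 5% misplaced threshold are illustrative assumptions:

```python
# Illustrative guard mirroring the conditions optimize is described as
# checking before running a mode-specific optimization.
def can_optimize(pg_status, max_misplaced=0.05):
    """Return (ok, reason) based on simplified cluster-health ratios."""
    if pg_status.get('unknown_pgs_ratio', 0) > 0:
        return False, 'some PGs are unknown; try again later'
    if pg_status.get('degraded_ratio', 0) > 0:
        return False, 'some objects are degraded; try again later'
    if pg_status.get('inactive_pgs_ratio', 0) > 0:
        return False, 'some PGs are inactive; try again later'
    if pg_status.get('misplaced_ratio', 0) >= max_misplaced:
        return False, 'too many objects are misplaced; try again later'
    return True, ''
```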
  • do_upmap method:

  • This method is responsible for performing optimizations on the placement of PGs.
  • It shuffles and filters pools, checks for pending PG merges, and calculates the adjustments needed for PGs based on certain criteria.
  • The goal is to find a better distribution of PGs across OSDs.
  • do_crush_compat method:

  • This method attempts to optimize the CRUSH map compatibility in the Ceph storage cluster.
  • It iteratively adjusts the weights of OSDs to achieve a more balanced distribution based on specified metrics (e.g., number of PGs, bytes, or objects).
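The iterative weight-adjustment idea can be illustrated with a toy sketch; the update rule and step size are assumptions, not the actual crush-compat algorithm:

```python
# Toy illustration of one weight-adjustment step: overloaded OSDs have
# their weight nudged down, underloaded OSDs up, relative to the target
# utilization for the chosen metric (PGs, bytes, or objects).
def adjust_weights(weights, utilization, target, step=0.5):
    new = {}
    for osd, w in weights.items():
        u = utilization.get(osd, target)
        if u > 0:
            # blend the current weight with a target-proportional correction
            new[osd] = w * (1.0 - step + step * target / u)
        else:
            new[osd] = w  # no data for this OSD; leave its weight alone
    return new
```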
  • get_compat_weight_set_weights method:

  • Retrieves the weights of OSDs from the compatibility weight-set in the CRUSH map.
  • execute method:
  • Executes the planned changes in the Ceph storage cluster.
  • This includes adjusting compatibility weight-sets, OSD weights, and performing PG upmap operations.
  • gather_telemetry method:
  • Gathers telemetry data related to the current state of the storage manager, such as whether it's active and the operating mode.
  • It's important to note that the code includes logging statements (self.log.info, self.log.debug, self.log.error) for tracking the progress and debugging information.
