
Meeting Notes

Avilez-dev-11 edited this page Dec 1, 2023 · 10 revisions

Meeting Notes - 2023-09-26

  • Explain project details
  • Help set up dev environments
  • Agree on a timeline
  • Discuss deliverables

https://github.com/ceph/ceph/blob/main/src/pybind/mgr/balancer/module.py -- code you want to check out

Important ceph commands:

  • Check the status of your cluster: ./bin/ceph -s
  • Check balancer status: ./bin/ceph balancer status
  • Stop your cluster: ../src/stop.sh (good practice after you're done playing with the cluster)
  • Restart the mgr daemon: ./bin/init-ceph restart mgr (do this if you want to test out a code change you made in the balancer)
  • Run "nproc" to see how many processors you have to compile your project, and then specify a slightly lower number: ninja vstart -j(nproc - a bit)
  • Working on a shared branch (don't worry about this for now; Laura will guide you through it)
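The "nproc minus a bit" rule for the ninja job count can be sketched in Python. This is only an illustration: the helper name ninja_jobs and the default reserve of 2 cores are assumptions, not something prescribed in the meeting.

```python
import os


def ninja_jobs(total_cpus: int, reserve: int = 2) -> int:
    """Return a job count slightly below the CPU count, never less than 1.

    The reserve of 2 is an arbitrary choice for illustration.
    """
    return max(1, total_cpus - reserve)


if __name__ == "__main__":
    # os.cpu_count() may return None on some platforms, so fall back to 1.
    cpus = os.cpu_count() or 1
    print(f"ninja vstart -j{ninja_jobs(cpus)}")
```

On an 8-core machine this prints a command with -j6, which leaves a couple of cores free for everything else.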

Meeting Notes - 2023-10-03

Recording: https://drive.google.com/file/d/1BcrUELCXJc0V42xvs0VEhXPSKW4pZpAM/view

Follow these steps for VPN access: https://wiki.sepia.ceph.com/doku.php?id=vpnaccess

Relevant tracker ticket: https://tracker.ceph.com/issues/63084

Student action items:

  • Watch "Intro to Ceph" video: https://youtu.be/PmLPbrf-x9g?si=Zhrv9Nb6DR7rQKbd
  • Read this page on Balancer design: https://docs.ceph.com/en/reef/dev/balancer-design/

The goal is to get a better understanding of what Ceph is and how balancing works at a high level, which will help with your poster project.

Student action items:

  • Experiment with a vstart cluster
  • Experiment with the balancer commands
  • Try committing something and pushing it to your local repository (so you can get comfortable sharing a link to your work)
  • Get familiar with the balancer code (linked above)

Laura action items: Set up students with sepia lab computers to help with compile time

Meeting Notes - 2023-10-10

Recording: https://drive.google.com/file/d/1PGp841k0BW61DVbeXDhQ6h5lfmFgaVSb/view?usp=sharing

https://help.github.com/en/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account

When collaborating in git:

  1. Check out your local branch: git checkout <your local branch>

  2. Always pull changes from the remote repo first! git pull <remote repo> <remote branch> --rebase

  3. If there are conflicts, make sure to resolve them:

       • Open the file (e.g. with vim)
       • Search for "HEAD" to find the conflict markers
       • Edit the lines so that no conflict remains; remove "HEAD" and the surrounding marker lines
       • Save and exit the file
       • Add the file: git add <file name>
       • Continue the rebase: git rebase --continue
       • Finally, push your changes: git push <remote repo> <remote branch>

  4. If there are no conflicts, make your commit as usual, then push: git push <remote repo> <remote branch>
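As a small aid for the "search for HEAD" step, the sketch below scans text for the marker lines git writes into a conflicted file (lines beginning with <<<<<<<, ======= and >>>>>>>). The function name and the sample text are made up for illustration; it only reports where the markers are and does not resolve anything.

```python
from typing import List, Tuple

# Prefixes git uses for conflict marker lines.
CONFLICT_PREFIXES = ("<<<<<<< ", "=======", ">>>>>>> ")


def find_conflict_markers(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(CONFLICT_PREFIXES):
            hits.append((number, line))
    return hits


# Hypothetical conflicted file content, for illustration only.
sample = """\
def mode():
<<<<<<< HEAD
    return "upmap"
=======
    return "crush-compat"
>>>>>>> my-feature
"""

for number, line in find_conflict_markers(sample):
    print(number, line)
```

If the function returns an empty list for a file, there are no marker lines left and the file is ready for git add.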

Meeting Notes - 2023-10-17

Recording: https://drive.google.com/file/d/1FYhJky-LLLIQxuVLu-AgrU-k2eS74YpN/view

See this link for the demo from the recording: https://pad.ceph.com/p/unbalanced_cluster_scenario

Important Commands:

Show current evaluations: ./bin/ceph balancer eval-verbose

Show the max deviation for PGs: ./bin/ceph config get mgr mgr/balancer/upmap_max_deviation (use "config set" with a value to change it)

Show the osdmap: ./bin/ceph osd dump

Turn balancer off: ./bin/ceph balancer off

Turn balancer on: ./bin/ceph balancer on

Show the numbers assigned to pools: ./bin/ceph osd lspools

Remap a PG from one OSD to another (this moves its objects around): ./bin/ceph osd pg-upmap-items <pgid> <from-osd> <to-osd>

Restart manager: ./bin/init-ceph restart mgr
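To get a feel for what upmap_max_deviation controls, here is a rough Python sketch that flags OSDs whose PG count is further from the target than the allowed deviation. It is a simplification and not the balancer's actual algorithm: it assumes all OSDs have equal weight, so the target is just the mean, and the counts are illustrative.

```python
from typing import Dict, List


def osds_outside_deviation(pg_counts: Dict[int, int], max_deviation: int) -> List[int]:
    """Return OSD ids whose PG count differs from the mean by more than max_deviation.

    Assumption: equal OSD weights, so the target PG count is the plain mean.
    """
    target = sum(pg_counts.values()) / len(pg_counts)
    return sorted(
        osd for osd, count in pg_counts.items()
        if abs(count - target) > max_deviation
    )


# Illustrative PG counts per OSD id (mean is 12).
counts = {0: 12, 1: 16, 2: 9, 3: 11}

print(osds_outside_deviation(counts, max_deviation=1))  # strict setting
print(osds_outside_deviation(counts, max_deviation=5))  # looser setting
```

With a deviation of 1, OSDs 1 and 2 are out of range; with a deviation of 5, nothing is flagged, which is why a small value makes the balancer act on small imbalances.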

  • In-person poster project in a month! (Nov 17)
  • Balancer demo (in video)
  • Establish milestones for poster project

Create an unbalanced cluster scenario:

# Start a cluster with 4 OSDs
OSD=4 ../src/vstart.sh --debug --new -x --localhost --bluestore

Items from osdmap (epoch 61)

  • pg_upmap_items 2.5 [0,2]
  • pg_upmap_items 3.b [0,2]
  • pg_upmap_items 3.10 [0,2]
  • pg_upmap_items 3.12 [0,2]
  • pg_upmap_items 3.17 [0,2]
  • pg_upmap_items 3.18 [0,2]
  • pg_upmap_items 3.23 [0,1]
  • pg_upmap_items 3.2e [0,1]
  • pg_upmap_items 3.30 [3,1]
  • pg_upmap_items 3.39 [0,2]
  • pg_upmap_items 3.44 [0,1]
  • pg_upmap_items 3.52 [0,2]
  • pg_upmap_items 3.70 [0,2]
  • pg_upmap_items 3.79 [3,2]

BEFORE unbalancing the cluster: 'cephfs.a.meta': {'pgs': {3: 12, 1: 13, 0: 12, 2: 11}

AFTER unbalancing the cluster: 'cephfs.a.meta': {'pgs': {3: 11, 1: 16, 0: 12, 2: 9}

From osdmap epoch 67:

  • pg_upmap_items 2.1 [2,1]
  • pg_upmap_items 2.5 [0,2]
  • pg_upmap_items 2.7 [3,1]
  • pg_upmap_items 2.e [2,1]
  • pg_upmap_items 3.b [0,2]
  • pg_upmap_items 3.10 [0,2]
  • pg_upmap_items 3.12 [0,2]
  • pg_upmap_items 3.17 [0,2]
  • pg_upmap_items 3.18 [0,2]
  • pg_upmap_items 3.23 [0,1]
  • pg_upmap_items 3.2e [0,1]
  • pg_upmap_items 3.30 [3,1]
  • pg_upmap_items 3.39 [0,2]
  • pg_upmap_items 3.44 [0,1]
  • pg_upmap_items 3.52 [0,2]
  • pg_upmap_items 3.70 [0,2]
  • pg_upmap_items 3.79 [3,2]

AFTER rebalancing the cluster: 'cephfs.a.meta': {'pgs': {3: 12, 1: 13, 0: 12, 2: 11}
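The PG counts above can be checked by hand. The sketch below treats each pg_upmap_items pair as [from, to] and applies the three pool 2 entries that appear only in epoch 67 (2.1, 2.7, 2.e) to the BEFORE counts. It assumes that the 2.x PGs belong to 'cephfs.a.meta' and that each entry moves exactly one PG; the function name is made up for illustration.

```python
from typing import Dict, Iterable, Tuple


def apply_upmaps(pg_counts: Dict[int, int], moves: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Return new per-OSD PG counts after moving one PG per (from_osd, to_osd) pair."""
    result = dict(pg_counts)
    for source, target in moves:
        result[source] -= 1
        result[target] += 1
    return result


# BEFORE unbalancing, as recorded in the notes (OSD id -> PG count).
before = {3: 12, 1: 13, 0: 12, 2: 11}

# Entries present in epoch 67 but not in epoch 61: 2.1 [2,1], 2.7 [3,1], 2.e [2,1].
new_moves = [(2, 1), (3, 1), (2, 1)]

after = apply_upmaps(before, new_moves)
print(after)

# Removing the same entries (as in epoch 69) is the reverse move.
restored = apply_upmaps(after, [(target, source) for source, target in new_moves])
print(restored == before)
```

The result matches the AFTER unbalancing counts in the notes (OSD 1 goes from 13 to 16, OSD 2 from 11 to 9, OSD 3 from 12 to 11), and reversing the moves gives back the original distribution.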

From osdmap epoch 69:

  • pg_upmap_items 2.5 [0,2]
  • pg_upmap_items 3.b [0,2]
  • pg_upmap_items 3.10 [0,2]
  • pg_upmap_items 3.12 [0,2]
  • pg_upmap_items 3.17 [0,2]
  • pg_upmap_items 3.18 [0,2]
  • pg_upmap_items 3.23 [0,1]
  • pg_upmap_items 3.2e [0,1]
  • pg_upmap_items 3.30 [3,1]
  • pg_upmap_items 3.39 [0,2]
  • pg_upmap_items 3.44 [0,1]
  • pg_upmap_items 3.52 [0,2]
  • pg_upmap_items 3.70 [0,2]
  • pg_upmap_items 3.79 [3,2]

https://www.diffchecker.com/
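If you would rather compare two epochs locally than paste them into a web tool, Python's standard difflib module can do the same job. This is a minimal sketch; the lists are abbreviated versions of the epoch 61 and epoch 67 entries above, and the helper name is made up.

```python
import difflib
from typing import List


def added_lines(old: List[str], new: List[str]) -> List[str]:
    """Return the lines that a unified diff marks as added in `new`."""
    diff = difflib.unified_diff(old, new, lineterm="", n=0)
    return [
        line[1:]
        for line in diff
        # Skip the "+++" file header; keep real "+" lines.
        if line.startswith("+") and not line.startswith("+++")
    ]


# Abbreviated entries from the notes (pgid followed by [from,to]).
epoch_61 = ["2.5 [0,2]", "3.b [0,2]", "3.10 [0,2]"]
epoch_67 = ["2.1 [2,1]", "2.5 [0,2]", "2.7 [3,1]", "2.e [2,1]", "3.b [0,2]", "3.10 [0,2]"]

for line in added_lines(epoch_61, epoch_67):
    print("new mapping:", line)
```

For these inputs the added lines are the three pool 2 entries that unbalanced the cluster.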

Meeting Notes - 2023-10-24

Recording: https://drive.google.com/file/d/1okMd3D-nF2O5DLpwZwo-YY0h7mEGBBW6/view?usp=sharing

Meeting Notes - 2023-10-31

Recording: https://drive.google.com/file/d/1pWn_Zq74zaiqfXj2-aT2D4hPqlCFY1-d/view?usp=sharing

$ git diff
diff --git a/src/pybind/mgr/balancer/module.py b/src/pybind/mgr/balancer/module.py
index 1c40425115c..1c7294ae228 100644
--- a/src/pybind/mgr/balancer/module.py
+++ b/src/pybind/mgr/balancer/module.py
@@ -345,6 +345,7 @@ class Module(MgrModule):
             'optimize_result': self.optimize_result,
             'no_optimization_needed': self.no_optimization_needed,
             'mode': self.get_module_option('mode'),
+            'osdmap': self.get_osdmap().dump(),
         }
         return (0, json.dumps(s, indent=4, sort_keys=True), '')
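With the change above, the JSON printed by the balancer status command would carry an 'osdmap' key. The sketch below shows one way to pull pg_upmap_items out of that JSON. The sample string is hand-written for illustration (a real status has more fields and a much larger osdmap), and the helper name is an assumption.

```python
import json
from typing import List, Tuple


def upmap_items_from_status(status_json: str) -> List[Tuple[str, List[Tuple[int, int]]]]:
    """Return (pgid, [(from_osd, to_osd), ...]) pairs from a balancer status JSON string.

    Assumes the status contains the 'osdmap' key added by the diff above.
    """
    status = json.loads(status_json)
    osdmap = status.get("osdmap", {})
    return [
        (item["pgid"], [(m["from"], m["to"]) for m in item["mappings"]])
        for item in osdmap.get("pg_upmap_items", [])
    ]


# Hand-written, heavily shortened stand-in for real command output.
sample_status = json.dumps({
    "active": True,
    "mode": "upmap",
    "osdmap": {
        "epoch": 61,
        "pg_upmap_items": [
            {"pgid": "3.10", "mappings": [{"from": 0, "to": 2}]},
        ],
    },
})

print(upmap_items_from_status(sample_status))
```

A status without the 'osdmap' key simply yields an empty list, so the helper also works against an unpatched mgr.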

Meeting Notes - 2023-11-07

Recording: https://drive.google.com/file/d/1Tm869Xwt29ivHvEjkTuSFSRU4P3rlDsC/view?usp=sharing

Watch the ceph status: watch ./bin/ceph -s

Meeting Notes - 2023-11-14

Recording: https://drive.google.com/file/d/1Cn731yvBs2tJ5rtIPXPf-0D8aFZ2G2Uf/view?usp=sharing

Before

"pg_upmap_items": [
    { "pgid": "3.10", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.12", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.14", "mappings": [ { "from": 3, "to": 1 } ] },
    { "pgid": "3.20", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.53", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.5f", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.7d", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.7f", "mappings": [ { "from": 0, "to": 2 } ] }
],

After

"pg_upmap_items": [
    { "pgid": "3.10", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.12", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.14", "mappings": [ { "from": 3, "to": 1 } ] },
    { "pgid": "3.20", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.53", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.5f", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.7d", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.7f", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "4.0", "mappings": [ { "from": 3, "to": 2 } ] },
    { "pgid": "4.c", "mappings": [ { "from": 3, "to": 2 } ] },
    { "pgid": "4.f", "mappings": [ { "from": 1, "to": 0 } ] },
    { "pgid": "4.10", "mappings": [ { "from": 3, "to": 0 } ] },
    { "pgid": "4.18", "mappings": [ { "from": 1, "to": 0 } ] }
],

Instructions on creating changes in the balancer

  1. Create a cluster with 4 OSDs: OSD=4 ../src/vstart.sh --debug --new -x --localhost --bluestore

  2. Run the balancer: ./bin/ceph balancer on

  3. Check the status to see if it says "Optimized plan created successfully" (this means the balancer has created mappings): ./bin/ceph balancer status

     3.a OR, check the osdmap to see if any pg_upmap_items entries have been created: ./bin/ceph osd dump -f json-pretty (the json-pretty format shows the osdmap exactly as it is structured when you're accessing it in the code)

  4. Grab the pg_upmap_items output from the osdmap: ./bin/ceph osd dump -f json-pretty (copy where it says pg_upmap_items)

  5. To make the cluster need to rebalance itself, create a pool (this creates more placement groups): ./bin/ceph osd pool create <pool_name>

  6. Check the status to see if it says "Optimized plan created successfully" (this means the balancer has created mappings): ./bin/ceph balancer status

  7. Grab the pg_upmap_items output from the osdmap: ./bin/ceph osd dump -f json-pretty (copy where it says pg_upmap_items)
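Steps 4 and 7 (grabbing pg_upmap_items) can be scripted instead of copied by hand. The sketch below runs the same osd dump command from the build directory and parses its JSON. The function names are made up, and it assumes stdout contains only the JSON document. The runner is passed in as a parameter so the parsing can be tried without a running cluster.

```python
import json
import subprocess
from typing import Callable, List

# Same command as in steps 4 and 7, split into arguments.
CEPH_OSD_DUMP = ["./bin/ceph", "osd", "dump", "-f", "json-pretty"]


def run_ceph(argv: List[str]) -> str:
    """Run a command (from the build directory) and return its stdout."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def snapshot_upmap_items(run: Callable[[List[str]], str] = run_ceph) -> list:
    """Return the pg_upmap_items list from the current osdmap (empty if none exist)."""
    osdmap = json.loads(run(CEPH_OSD_DUMP))
    return osdmap.get("pg_upmap_items", [])


def fake_run(argv: List[str]) -> str:
    """Stand-in runner with hand-written output, so no cluster is needed."""
    return json.dumps({
        "epoch": 5,
        "pg_upmap_items": [{"pgid": "3.12", "mappings": [{"from": 0, "to": 2}]}],
    })


print(snapshot_upmap_items(fake_run))
```

Against a real vstart cluster you would call snapshot_upmap_items() once before creating the pool and once after, then compare the two results.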

Before:

"pg_upmap_items": [
    { "pgid": "3.12", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.15", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.1d", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.39", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.51", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.63", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.69", "mappings": [ { "from": 0, "to": 2 } ] }
],

After:

"pg_upmap_items": [
    { "pgid": "3.12", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.15", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.1d", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.39", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.51", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "3.63", "mappings": [ { "from": 0, "to": 1 } ] },
    { "pgid": "3.69", "mappings": [ { "from": 0, "to": 2 } ] },
    { "pgid": "4.b", "mappings": [ { "from": 1, "to": 2 } ] },
    { "pgid": "4.14", "mappings": [ { "from": 3, "to": 2 } ] },
    { "pgid": "4.18", "mappings": [ { "from": 3, "to": 0 } ] },
    { "pgid": "4.1b", "mappings": [ { "from": 3, "to": 0 } ] },
    { "pgid": "4.1c", "mappings": [ { "from": 1, "to": 0 } ] }
],
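To see what changed between the Before and After output without reading the JSON by eye, something like the following could be used. It picks out the entries that exist only in the After list and counts them per pool, using the fact that a PG id has the form <pool id>.<hex suffix>. The helper names are made up, and the Before list is abbreviated to two of its seven entries.

```python
from collections import Counter
from typing import List


def added_upmap_items(before: List[dict], after: List[dict]) -> List[dict]:
    """Return entries of `after` whose pgid does not appear in `before`."""
    known = {entry["pgid"] for entry in before}
    return [entry for entry in after if entry["pgid"] not in known]


def pool_of(pgid: str) -> int:
    """Extract the pool id from a PG id such as '4.1c'."""
    return int(pgid.split(".")[0])


def upmap_entry(pgid: str, source: int, target: int) -> dict:
    """Build one entry in the same shape as the osdmap JSON above."""
    return {"pgid": pgid, "mappings": [{"from": source, "to": target}]}


# Abbreviated Before list, plus the five pool 4 entries from the After output.
before = [upmap_entry("3.63", 0, 1), upmap_entry("3.69", 0, 2)]
after = before + [
    upmap_entry("4.b", 1, 2),
    upmap_entry("4.14", 3, 2),
    upmap_entry("4.18", 3, 0),
    upmap_entry("4.1b", 3, 0),
    upmap_entry("4.1c", 1, 0),
]

added = added_upmap_items(before, after)
per_pool = Counter(pool_of(entry["pgid"]) for entry in added)
print([entry["pgid"] for entry in added])
print(dict(per_pool))
```

All five new mappings belong to pool 4, the pool created in step 5, which is what you would expect if creating the pool is what triggered the rebalance.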

Meeting Notes - 2023-11-21

Recording: https://drive.google.com/file/d/1exa416wyn581DyaKQky1PMzN9lsjv4xF/view?usp=sharing

Meeting Notes - 2023-11-28

```python
@CLIReadCommand('balancer status')
def show_status(self) -> Tuple[int, str, str]:
    """
    Show balancer status
    """
    self.log.debug("osdmap_rcos {}".format(self.get_osdmap().dump().get('epoch', '')))
    s = {
        'plans': list(self.plans.keys()),
        'active': self.active,
        'last_optimize_started': self.last_optimize_started,
        'last_optimize_duration': self.last_optimize_duration,
        'optimize_result': self.optimize_result,
        'no_optimization_needed': self.no_optimization_needed,
        'mode': self.get_module_option('mode'),
    }
    return (0, json.dumps(s, indent=4, sort_keys=True), '')
```

This is the line where the actual optimization takes place. Reference this line for deciding when to update pg_upmap_items:

https://github.com/ceph/ceph/blob/785c1083fa93f41d0dcbb7f16a651615bbb44771/src/pybind/mgr/balancer/module.py#L694
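One possible way to think about "when to update", sketched outside the mgr module: keep the last osdmap epoch you saw and only refresh a stored copy of pg_upmap_items when the epoch changes. This is a hypothetical helper for illustration, not code from the balancer and not necessarily the approach the project will use; it works on plain dicts shaped like the output of the osdmap dump.

```python
from typing import List, Optional


class UpmapCache:
    """Hypothetical helper: refresh stored pg_upmap_items only when the epoch changes."""

    def __init__(self) -> None:
        self.epoch: Optional[int] = None
        self.items: List[dict] = []

    def update(self, osdmap_dump: dict) -> bool:
        """Store pg_upmap_items from the dump; return True if anything was refreshed."""
        epoch = osdmap_dump.get("epoch")
        if epoch == self.epoch:
            return False  # same epoch, nothing new to record
        self.epoch = epoch
        self.items = list(osdmap_dump.get("pg_upmap_items", []))
        return True


cache = UpmapCache()
first = cache.update({"epoch": 61, "pg_upmap_items": [{"pgid": "2.5"}]})
second = cache.update({"epoch": 61, "pg_upmap_items": [{"pgid": "2.5"}]})
third = cache.update({"epoch": 67, "pg_upmap_items": [{"pgid": "2.1"}, {"pgid": "2.5"}]})
print(first, second, third, len(cache.items))
```

Inside the module, the dict would come from self.get_osdmap().dump(), as in the show_status code above.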
