CPUID leaf 0x4 cache topology doesn't account for SMT #1001

@glitzflitz

Linux guests with SMT enabled print this on boot:

[    0.215000] smp: Brought up 1 node, 4 CPUs
[    0.215000] smpboot: Total of 4 processors activated (30011.39 BogoMIPS)
[    0.216000] BUG: arch topology borken
[    0.217000]      the SMT domain not a subset of the CLS domain
[    0.219000] BUG: arch topology borken
[    0.220000]      the SMT domain not a subset of the CLS domain
[    0.221000] BUG: arch topology borken
[    0.222000]      the SMT domain not a subset of the CLS domain
[    0.223000] BUG: arch topology borken
[    0.224000]      the SMT domain not a subset of the CLS domain

The problem is in https://github.com/oxidecomputer/propolis/blame/master/lib/propolis/src/cpuid.rs#L340. When specializing Intel's CPUID leaf 4 (Deterministic Cache Parameters), the L1 and L2 caches are marked as unshared:

  if level < 3 {
      subleaf.eax &= !LEAF4_EAX_VCPU_MASK;
      // And leave that range 0: this means only one
      // vCPU shares the cache.
  }

Per Intel's documentation (https://cdrdv2-public.intel.com/775917/intel-64-architecture-processor-topology-enumeration.pdf), Table 1-15, "Reference for CPUID Leaf 04H":

Bits 25-14: Maximum number of addressable IDs for logical processors sharing this cache**, ***
** Add one to the return value to get the result.
*** The nearest power-of-2 integer that is not smaller than (1 + EAX[25:14]) is the number of unique initial APIC IDs reserved for addressing different logical processors sharing this cache.
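
To make the footnotes concrete, here is a minimal sketch of how a guest could decode that field. The constant and function names are made up for illustration and are not taken from propolis or Linux; only the bit positions come from the table above.

```rust
// Hypothetical names; bit layout per the quoted table (EAX[25:14]).
const SHARING_SHIFT: u32 = 14;
const SHARING_BITS: u32 = 0xFFF; // 12-bit field

/// Raw value of EAX[25:14].
fn sharing_field(eax: u32) -> u32 {
    (eax >> SHARING_SHIFT) & SHARING_BITS
}

/// Footnote **: add one to the raw field to get the maximum number of
/// addressable IDs for logical processors sharing the cache.
fn max_sharing_ids(eax: u32) -> u32 {
    sharing_field(eax) + 1
}

/// Footnote ***: round (1 + field) up to a power of two to get the number
/// of initial APIC IDs reserved for the sharers.
fn reserved_apic_ids(eax: u32) -> u32 {
    max_sharing_ids(eax).next_power_of_two()
}
```

With the field left at 0 (what the `level < 3` branch produces), both helpers return 1, i.e. the cache is private to one logical processor. A field of 1 gives 2, and a field of 2 gives 3 addressable IDs rounded up to 4 reserved APIC IDs.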

Leaf 0xB tells the guest that SMT is enabled with 2 threads per core. Linux uses leaf 4's num_threads_sharing field to work out which CPUs share the L2 cache, then builds scheduler domains on the assumption that SMT siblings fall in the same cache-sharing group. So it complains when leaf 4 says "nobody shares L2" while leaf 0xB says "you have 2 threads per core".

calc_cache_topo_id() https://github.com/torvalds/linux/blob/master/arch/x86/kernel/cpu/cacheinfo.c#L413 computes L2 cache IDs from leaf 4,
cpu_clustergroup_mask() https://github.com/torvalds/linux/blob/master/arch/x86/kernel/smpboot.c#L670 returns the CPUs sharing L2, and
build_sched_domain() https://github.com/torvalds/linux/blob/master/kernel/sched/topology.c#L2472 then checks that the SMT domain is a subset of the CLS domain.
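
A simplified model of that chain, to show why the check fails. This is not the kernel code: the function names are hypothetical, and it assumes the cache ID is just the APIC ID shifted right by ceil(log2(1 + num_threads_sharing)), with SMT siblings identified by shifting off the low SMT bits reported by leaf 0xB.

```rust
/// ceil(log2(n)) for n >= 1 (assumed to match what the kernel derives
/// from num_threads_sharing + 1).
fn count_order(n: u32) -> u32 {
    n.next_power_of_two().trailing_zeros()
}

/// Cache ID derived from leaf 4's raw sharing field (EAX[25:14]).
fn cache_id(apic_id: u32, sharing_field: u32) -> u32 {
    apic_id >> count_order(sharing_field + 1)
}

/// Core ID derived from the SMT shift that leaf 0xB reports.
fn core_id(apic_id: u32, smt_shift: u32) -> u32 {
    apic_id >> smt_shift
}

/// True when every pair of SMT siblings also lands in the same cache
/// group, i.e. the SMT domain is a subset of the cache (CLS) domain.
fn smt_subset_of_cache(apic_ids: &[u32], smt_shift: u32, sharing_field: u32) -> bool {
    apic_ids.iter().all(|&a| {
        apic_ids.iter().all(|&b| {
            core_id(a, smt_shift) != core_id(b, smt_shift)
                || cache_id(a, sharing_field) == cache_id(b, sharing_field)
        })
    })
}
```

For a 4-vCPU guest with APIC IDs 0..=3 and an SMT shift of 1 (2 threads per core), a sharing field of 0 puts APIC IDs 0 and 1 in the same core but in different L2 groups, so the check fails. A sharing field of 1 makes the cache groups line up with the cores and the check passes.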

The same case is handled correctly for AMD in fix_amd_cache_topo:

  0b001 | 0b010 => {
      // L1/L2 shared by SMT siblings
      if self.has_smt { 2 } else { 1 }
  }
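
One possible shape for the Intel side, mirroring the AMD logic above. This is only a sketch under assumptions: the function name and signature are hypothetical, the value of LEAF4_EAX_VCPU_MASK is assumed to cover EAX[25:14], and the real code in cpuid.rs operates on a subleaf struct rather than a bare u32.

```rust
// Assumed definitions; only the name LEAF4_EAX_VCPU_MASK appears in the
// snippet from cpuid.rs, its value here is inferred from the bit range.
const LEAF4_EAX_VCPU_SHIFT: u32 = 14;
const LEAF4_EAX_VCPU_MASK: u32 = 0xFFF << LEAF4_EAX_VCPU_SHIFT;

/// Hypothetical helper: returns the specialized EAX for one leaf 4 subleaf.
fn specialize_leaf4_eax(eax: u32, level: u32, has_smt: bool) -> u32 {
    if level >= 3 {
        // L3 and above are left untouched in this sketch.
        return eax;
    }
    // Clear the sharing field first, as the existing code does.
    let mut out = eax & !LEAF4_EAX_VCPU_MASK;
    if has_smt {
        // The field encodes (sharers - 1), so 1 means the two SMT
        // siblings share this L1/L2.
        out |= 1 << LEAF4_EAX_VCPU_SHIFT;
    }
    out
}
```

Without SMT this behaves like the current code (field 0, one vCPU per cache). With SMT it reports two sharers, which is what the guest expects given leaf 0xB.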
