Skip to content

Conversation

@PlaidCat
Copy link
Collaborator

@PlaidCat PlaidCat commented Jan 30, 2026

This is the attempt at a re-builder built on Cron and some internal tools, but the same process is as follows as previous rebuilds

  • Download all unprocessed src.rpm
  • for each src,pm
    • Find all commits in changelog up to last known tag ... in this case 4.18.0-553
    • Re-play commits in reverse order (oldest in change log to newest) with git cherry-pick
    • After replay replace ENTIRE code in branch with rpmbuild -bp from corresponding src.rpm.
    • Tag Rebuild branch

JIRA Tickets

Rebuild Splat Inspection

kernel-4.18.0-553.97.1.el8_10

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 594898
Number of commits in rpm: 42
Number of commits matched with upstream: 35 (83.33%)
Number of commits in upstream but not in rpm: 594863
Number of commits NOT found in upstream: 7 (16.67%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.97.1.el8_10 for kernel-4.18.0-553.97.1.el8_10
Clean Cherry Picks: 19 (54.29%)
Empty Cherry Picks: 16 (45.71%)
_______________________________

__EMPTY COMMITS__________________________
ba3d58a1df460ba28bb5989ad7269ff48682375a scsi: lpfc: Fill in missing ndlp kref puts in error paths
6e5c5d246e6c16127327eeecf61ef2cf21a94ce5 scsi: lpfc: Move scsi_host_template outside dynamically allocated/freed phba
97f975823f8196d970bd795087b514271214677a scsi: lpfc: Fix double free in lpfc_cmpl_els_logo_acc() caused by lpfc_nlp_not_used()
9914a3d033d3e1d836a43e93e9738e7dd44a096a scsi: lpfc: Revise NPIV ELS unsol rcv cmpl logic to drop ndlp based on nlp_state
1dec1311b9b6cc9c5fd26a77b936f542f03c51d1 scsi: lpfc: Fix list_entry null check warning in lpfc_cmpl_els_plogi()
900db34ad26554d83ae033065a047358994bfe88 scsi: lpfc: Add condition to delete ndlp object after sending BLS_RJT to an ABTS
b5c18c9dd138733c16893613345af44deadcf05e scsi: lpfc: Fix unsolicited FLOGI kref imbalance when in direct attached topology
1f0f7679ad8942f810b0f19ee9cf098c3502d66a scsi: lpfc: Update PRLO handling in direct attached topology
4281f44ea8bfedd25938a0031bebba1473ece9ad scsi: lpfc: Prevent NDLP reference count underflow in dev_loss_tmo callback
bb33b07ac6e3fede9b54cd8bf83a66dbdc6afe89 scsi: lpfc: Delete NLP_TARGET_REMOVE flag due to obsolete usage
ee80d8c2d4ccebed1015f6c9ba6a07c85e149785 scsi: lpfc: Modify handling of ADISC based on ndlp state and RPI registration
05ae6c9c7315d844fbc15afe393f5ba5e5771126 scsi: lpfc: Fix lpfc_check_sli_ndlp() handling for GEN_REQUEST64 commands
b5162bb6aa1ec04dff4509b025883524b6d7e7ca scsi: lpfc: Avoid potential ndlp use-after-free in dev_loss_tmo_callbk
07caedc6a3887938813727beafea40f07c497705 scsi: lpfc: Fix reusing an ndlp that is marked NLP_DROPPED during FLOGI
867cefb4cb1012f42cada1c7d1f35ac8dd276071 xen: Fix x86 sched_clock() interface for xen
eed05744322da07dd7e419432dcedf3c2e017179 xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

__CHANGES NOT IN UPSTREAM________________
Adding prod certs and changed cert date to 20210620
Adding Rocky secure boot certs
Fixing vmlinuz removal
Fixing UEFI CA path
Porting to 8.10, debranding and Rocky branding
Fixing pesign_key_name values
scsi: lpfc: avoid crashing in lpfc_nlp_get() if lpfc_nodelist was freed

BUILD

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree-build
Running make mrproper...
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
[TIMER]{MRPROPER}: 5s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky8_10_rebuild-2c03cea8bcc5"
Making olddefconfig
--
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
--
  LD [M]  sound/usb/usx2y/snd-usb-usx2y.ko
  LD [M]  sound/virtio/virtio_snd.ko
  LD [M]  sound/x86/snd-hdmi-lpe-audio.ko
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 1450s
Making Modules
  INSTALL arch/x86/crypto/blowfish-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx2.ko
  INSTALL arch/x86/crypto/camellia-x86_64.ko
--
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-rocky8_10_rebuild-2c03cea8bcc5+
[TIMER]{MODULES}: 19s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-rocky8_10_rebuild-2c03cea8bcc5+ arch/x86/boot/bzImage \
	System.map "/boot"
[TIMER]{INSTALL}: 19s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-rocky8_10_rebuild-2c03cea8bcc5+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1450s
[TIMER]{MODULES}: 19s
[TIMER]{INSTALL}: 19s
[TIMER]{TOTAL} 1499s
Rebooting in 10 seconds

KSelfTests

[jmaple@devbox code]$ ~/workspace/auto_kernel_history_rebuild/Rocky10/rocky10/code/get_kselftest_diff.sh
kselftest.4.18.0-rocky8_10_rebuild-a8a2edf9cea8+.log
207
kselftest.4.18.0-rocky8_10_rebuild-4be5de5480b0+.log
207
kselftest.4.18.0-rocky8_10_rebuild-106ef6545eca+.log
207
kselftest.4.18.0-rocky8_10_rebuild-2c03cea8bcc5+.log
207
Before: kselftest.4.18.0-rocky8_10_rebuild-106ef6545eca+.log
After: kselftest.4.18.0-rocky8_10_rebuild-2c03cea8bcc5+.log
Diff:
No differences found.

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author James Smart <jsmart2021@gmail.com>
commit 3850394

The following kasan bug was called out:

 BUG: KASAN: slab-out-of-bounds in lpfc_unreg_login+0x7c/0xc0 [lpfc]
 Read of size 2 at addr ffff889fc7c50a22 by task lpfc_worker_3/6676
 ...
 Call Trace:
 dump_stack+0x96/0xe0
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 print_address_description.constprop.6+0x1b/0x220
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 __kasan_report.cold.9+0x37/0x7c
 ? lpfc_unreg_login+0x7c/0xc0 [lpfc]
 kasan_report+0xe/0x20
 lpfc_unreg_login+0x7c/0xc0 [lpfc]
 lpfc_sli_def_mbox_cmpl+0x334/0x430 [lpfc]
 ...

When processing the completion of a "Reg Rpi" login mailbox command in
lpfc_sli_def_mbox_cmpl, a call may be made to lpfc_unreg_login. The vpi is
extracted from the completing mailbox context and passed as an input for
the next. However, the vpi stored in the mailbox command context is an
absolute vpi, which for SLI4 represents both base + offset.  When used with
a non-zero base component, (function id > 0) this results in an
out-of-range access beyond the allocated phba->vpi_ids array.

Fix by subtracting the function's base value to get an accurate vpi number.

Link: https://lore.kernel.org/r/20200322181304.37655-2-jsmart2021@gmail.com
	Signed-off-by: James Smart <jsmart2021@gmail.com>
	Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 3850394)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author James Smart <jsmart2021@gmail.com>
commit ba3d58a
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/ba3d58a1.failed

Code review, following every lpfc_nlp_get() call vs calls during error
handling, discovered cases of missing put calls.

Correct by adding ndlp kref puts in the respective error paths.

Also added comments to several of the error paths to record relationships
to reference counts.

Link: https://lore.kernel.org/r/20220506035519.50908-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
	Signed-off-by: James Smart <jsmart2021@gmail.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ba3d58a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_els.c
…ed phba

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author James Smart <jsmart2021@gmail.com>
commit 6e5c5d2
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/6e5c5d24.failed

On a PCI hotplug capable system, it is possible for scsi_device_put() to
happen after lpfc_pci_remove_one() is called.  As a result, the
sdev->host->hostt->module dereference is for a previously freed memory
location because the phba structure containing the hostt template was
already freed when lpfc_pci_remove_one() returned.

Since the lpfc module is still loaded during power slot disable, all
scsi_host_templates should be declared as part of the global data segment
instead of inside the heap allocated phba structure.  This way the
sdev->host->hostt memory area is always valid as long as the module is
loaded regardless if PCI hotplug dynamically allocates or frees phba
structures.

Move all scsi_host_templates in the phba structure to global variables.
Create a small helper routine to determine appropriate sg_tablesize during
shost allocation.

Link: https://lore.kernel.org/r/20220911221505.117655-7-jsmart2021@gmail.com
Co-developed-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
	Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Co-developed-by: Daniel Wagner <dwagner@suse.de>
	Signed-off-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
	Signed-off-by: James Smart <jsmart2021@gmail.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 6e5c5d2)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc.h
#	drivers/scsi/lpfc/lpfc_init.c
#	drivers/scsi/lpfc/lpfc_scsi.c
…c_nlp_not_used()

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 97f9758
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/97f97582.failed

Smatch detected a double free path because lpfc_nlp_not_used() releases an
ndlp object before reaching lpfc_nlp_put() at the end of
lpfc_cmpl_els_logo_acc().

Remove the outdated lpfc_nlp_not_used() routine.  In
lpfc_mbx_cmpl_ns_reg_login(), replace the call with lpfc_nlp_put().  In
lpfc_cmpl_els_logo_acc(), replace the call with lpfc_unreg_rpi() and keep
the lpfc_nlp_put() at the end of the routine.  If ndlp's rpi was
registered, then lpfc_unreg_rpi()'s completion routine performs the final
ndlp clean up after lpfc_nlp_put() is called from lpfc_cmpl_els_logo_acc().
Otherwise if ndlp has no rpi registered, the lpfc_nlp_put() at the end of
lpfc_cmpl_els_logo_acc() is the final ndlp clean up.

Fixes: 4430f7f ("scsi: lpfc: Rework locations of ndlp reference taking")
	Cc: <stable@vger.kernel.org> # v5.11+
	Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/all/Y3OefhyyJNKH%2Fiaf@kili/
	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230417191558.83100-3-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 97f9758)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_els.c
…n nlp_state

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 9914a3d
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/9914a3d0.failed

When NPIV ports are zoned to devices that support both initiator and target
mode, a remote device's initiated PRLI results in unintended final kref
clean up of the device's ndlp structure.  This disrupts NPIV ports'
discovery for target devices that support both initiator and target mode.

Modify the NPIV lpfc_drop_node clause such that we allow the ndlp to live
so long as it was in NLP_STE_PLOGI_ISSUE, NLP_STE_REG_LOGIN_ISSUE, or
NLP_STE_PRLI_ISSUE nlp_state.  This allows lpfc's issued PRLI completion
routine to determine if the final kref clean up should execute rather than
a remote device's issued PRLI.

Fixes: db651ec ("scsi: lpfc: Correct used_rpi count when devloss tmo fires with no recovery")
	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230523183206.7728-5-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9914a3d)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_els.c
…opology

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 869ab8b

In lpfc_cmpl_els_flogi(), the return out: label decrements the ndlp kref
signaling that FLOGI processing on the ndlp is complete.  In loop topology
path, there is an unnecessary ndlp put because it also branches to the out:
label.  This also signals ndlp usage completion too soon.  As such, remove
the extra lpfc_nlp_put() when in loop topology.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230712180522.112722-4-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 869ab8b)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit a3c3c0a

A WCQE success completion status does not guarantee valid LS_ACC receipt
for ELS commands.  So, introduce a small helper routine that validates ELS
LS_ACC frames in ELS cmpl routines.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-5-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a3c3c0a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…ware

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit f5779b5

Because file_name and phba->ModelName are both declared a size 80 bytes,
the extra ".grp" file extension could cause an overflow into file_name.

Define a ELX_FW_NAME_SIZE macro with value 84.  84 incorporates the 4 extra
characters from ".grp".  file_name is changed to be declared as a char and
initialized to zeros i.e. null chars.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-3-justintee8345@gmail.com
	Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f5779b5)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 1dec131
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/1dec1311.failed

Smatch called out a warning for null checking a ptr that is assigned by
list_entry(). list_entry() does not return null and, if the list is empty,
can return an invalid ptr. Thus, the !psrp check does not execute properly.

 drivers/scsi/lpfc/lpfc_els.c:2133 lpfc_cmpl_els_plogi()
 warn: list_entry() does not return NULL 'prsp'

Replace list_entry() with list_get_first(), which does a list_empty() check
before returning the first entry.

Fixes: a3c3c0a ("scsi: lpfc: Validate ELS LS_ACC completion payload")
	Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-scsi/01b7568f-4ab4-4d56-bfa6-9ecc5fc261fe@moroto.mountain/
	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-4-justintee8345@gmail.com
	Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1dec131)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_els.c
…ric nodes

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit e1b3aca

Remove the early return NLP_FABRIC check in lpfc_plogi_confirm_nport()
because it is possible for switch domain controllers to change WWPN.

As a result, allow lpfc_plogi_confirm_nport() to detect that a new ndlp
should be initialized in such cases.  The old ndlp object will be cleaned
up when dev_loss_tmo callbk executes.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-6-justintee8345@gmail.com
	Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e1b3aca)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…for ndlps

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit a801d57

Upon first RSCN receipt of a target server's remote port that is initially
acting as an initiator function, the driver marks the ndlp->nlp_type as an
initiator role.  Then later, when processing an RSCN for a target function
role switch, that ndlp remote port is permanently stuck as an initiator
role and can never transition to be discovered as an updated target role
function.

Remove the NLP_RCV_PLOGI early return if statement clause so that the
NLP_NPR_2B_DISC flag gets set.  This allows for role change detections.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-7-justintee8345@gmail.com
	Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a801d57)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
… to an ABTS

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 900db34
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/900db34a.failed

The "Nodelist not empty" log message and an accompanying delay may be
observed when deleting an NPIV port or unloading the lpfc driver.  This can
occur due to receipt of an ABTS for which there is no corresponding login
context or ndlp allocated.  In such cases, the driver allocates a new ndlp
object to send a BLS_RJT after which the ndlp object unintentionally
remains in the NLP_STE_UNUSED_NODE state forever.

Add a check to conditionally remove ndlp's initial reference count when
queuing a BLS response.  If the initial reference is removed, then set
the NLP_DROPPED flag to notify other code paths.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240131185112.149731-9-justintee8345@gmail.com
	Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 900db34)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_sli.c
…allbk

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 15e21dc

In rare cases when a fabric node is recovered after a link bounce and
before dev_loss_tmo callbk is reached, the driver may leave the fabric node
in an inconsistent state with the NLP_IN_DEV_LOSS flag perpetually set.

In lpfc_dev_loss_tmo_callbk, a check is added for a recovered fabric node.
If the node is recovered, then don't queue the lpfc_dev_loss_tmo_handler
work. In lpfc_dev_loss_tmo_handler, the path taken for the recovered fabric
nodes is updated to clear the NLP_IN_DEV_LOSS flag.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-5-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 15e21dc)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…hed topology

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit b5c18c9
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/b5c18c9d.failed

In direct attached topology, certain target vendors that are quick to issue
FLOGI followed by a cable pull for more than dev_loss_tmo may result in a
kref imbalance for the remote port ndlp object.

Add an nlp_get when the defer_flogi_acc flag is set.  This is expected to
balance the nlp_put in the defer_flogi_acc clause in the
lpfc_issue_els_flogi() routine.  Because we need to retain the ndlp ptr,
reorganize all of the defer_flogi_acc information into one
lpfc_defer_flogi_acc struct.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-6-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b5c18c9)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_els.c
#	drivers/scsi/lpfc/lpfc_hbadisc.c
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 1f0f767
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/1f0f7679.failed

A kref imbalance occurs when handling an unsolicited PRLO in direct
attached topology.

Rework PRLO rcv handling when in MAPPED state.  Save the state that we were
handling a PRLO by setting nlp_last_elscmd to ELS_CMD_PRLO.  Then in the
lpfc_cmpl_els_logo_acc() completion routine, manually restart discovery.
By issuing the PLOGI, which nlp_gets, before nlp_put at the end of the
lpfc_cmpl_els_logo_acc() routine, we are saving us from a final nlp_put.
And, we are still allowing the unreg_rpi to happen.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-7-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1f0f767)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_els.c
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit d1a2ef6

With a FLOGI outstanding and loss of physical link connection to the fabric
for the duration of dev_loss_tmo, there is a fabric ndlp kref imbalance
that decrements the kref and sets the NLP_IN_RECOV_POST_DEV_LOSS flag at
the same time.

The issue is that when the FLOGI completion routine executes, the fabric
ndlp could already be freed because of the final kref put from the
dev_loss_tmo handler.  Fix by early returning before the ndlp kref put if
the ndlp is deemed a candidate for NLP_IN_RECOV_POST_DEV_LOSS in the FLOGI
completion routine.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-5-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d1a2ef6)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…instance

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 0a3c84f

Deleting an NPIV instance requires all fabric ndlps to be released before
an NPIV's resources can be torn down.  Failure to release fabric ndlps
beforehand opens kref imbalance race conditions.  Fix by forcing the DA_ID
to complete synchronously with usage of wait_queue.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-6-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0a3c84f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 4c113ac

Should an rport remain in the NOTPRESENT state it is possible that
stgt_delete_work is scheduled after dev_loss_tmo_callbk.  In such cases,
dev_loss_tmo_callbk would have cleaned up the NDLP object resulting in
stale ndlp pointers in lpfc_terminate_rport_io().

Check for the DEVLOSS_CALLBK_DONE flag to know if dev_loss_tmo_callbk
has been called.  This is a more reliable way to avoid dereferencing
stale NDLP pointers.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241031223219.152342-3-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4c113ac)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…llback

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 4281f44
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/4281f44e.failed

Current dev_loss_tmo handling checks whether there has been a previous
call to unregister with SCSI transport.  If so, the NDLP kref count is
decremented a second time in dev_loss_tmo as the final kref release.
However, this can sometimes result in a reference count underflow if
there is also a race to unregister with NVMe transport as well.  Add a
check for NVMe transport registration before decrementing the final
kref.  If NVMe transport is still registered, then the NVMe transport
unregistration is designated as the final kref decrement.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241031223219.152342-8-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4281f44)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_hbadisc.c
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit bb33b07
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/bb33b07a.failed

Remove the NLP_TARGET_REMOVE flag as its usage is obsolete.  The current
framework is to rely on the lpfc_dev_loss_tmo_callbk from upper layer to
notify final ndlp kref release.  There's no need to specifically set
NLP_EVT_DEVICE_RM when a LOGO completes.  The dev_loss_tmo_callbk is
responsible for the final kref put.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241212233309.71356-4-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit bb33b07)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_disc.h
#	drivers/scsi/lpfc/lpfc_els.c
#	drivers/scsi/lpfc/lpfc_nportdisc.c
…stration

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit ee80d8c
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/ee80d8c2.failed

In lpfc_check_adisc, remove the requirement that the ndlp object must have
been RPI registered.  Whether or not the ndlp is RPI registered is
unrelated to verifying that the received ADISC is intended for that ndlp
rport object.

After ADISC receipt, there's no need to put the ndlp state into NPR.  Let
the cmpl routines from the actions taken earlier in ADISC handling set the
proper ndlp state.

Also, refactor when a RESUME_RPI mailbox command should be sent.  It should
only be sent if the RPI registered flag is set.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241212233309.71356-5-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ee80d8c)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_nportdisc.c
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 23ed628

With repeated port swaps between separate fabrics, there can be multiple
registrations for fabric well known address 0xfffffe.  This can cause ndlp
reference confusion due to the usage of a single ndlp ptr that stores the
rport object in fc_rport struct private storage during transport
registration.  Subsequent registrations update the ndlp->rport field with
the newer rport, so when transport layer triggers dev_loss_tmo for the
earlier registered rport the ndlp->rport private storage is referencing the
newer rport instead of the older rport in dev_loss_tmo callbk.

Because the older ndlp->rport object is already cleaned up elsewhere in
driver code during the time of fabric swap, check that the rport provided
in dev_loss_tmo callbk actually matches the rport stored in the LLDD's
ndlp->rport field.  Otherwise, skip dev_loss_tmo work on a stale rport.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-4-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 23ed628)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 56c3d80

After a port swap between separate fabrics, there may be multiple nodes in
the vport's fc_nodes list with the same fabric well known address.
Duplication is temporary and eventually resolves itself after dev_loss_tmo
expires, but nameserver queries may still occur before dev_loss_tmo.  This
possibly results in returning stale fabric ndlp objects.  Fix by adding an
nlp_state check to ensure the ndlp search routine returns the correct newer
allocated ndlp fabric object.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-5-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 56c3d80)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…ands

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 05ae6c9
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/05ae6c9c.failed

In lpfc_check_sli_ndlp(), the get_job_els_rsp64_did remote_id assignment
does not apply for GEN_REQUEST64 commands as it only has meaning for a
ELS_REQUEST64 command.  So, if (iocb->ndlp == ndlp) is false, we could
erroneously return the wrong value.  Fix by replacing the fallthrough
statement with a break statement before the remote_id check.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250425194806.3585-2-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 05ae6c9)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_hbadisc.c
…RLI retry

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit df117c9

A failure to unregister with the NVMe transport may occur when a PRLI is
retried.

Remove duplicate testing of NLP_NVME_TARGET flag. Add a secondary check
of the registered state based on the nrport information.  Further qualify
the ndlp reference count modification when nvme_fc_register_remoteport()
returns an error.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250425194806.3585-5-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit df117c9)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit b5162bb
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/b5162bb6.failed

Smatch detected a potential use-after-free of an ndlp oject in
dev_loss_tmo_callbk during driver unload or fatal error handling.

Fix by reordering code to avoid potential use-after-free if initial
nodelist reference has been previously removed.

Fixes: 4281f44 ("scsi: lpfc: Prevent NDLP reference count underflow in dev_loss_tmo callback")
	Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-scsi/41c1d855-9eb5-416f-ac12-8b61929201a3@stanley.mountain/
	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250425194806.3585-6-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b5162bb)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_hbadisc.c
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Justin Tee <justin.tee@broadcom.com>
commit 07caedc
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/07caedc6.failed

It's possible for an unstable link to repeatedly bounce allowing a FLOGI
retry, but then bounce again forcing an abort of the FLOGI.  Ensure that
the initial reference count on the FLOGI ndlp is restored in this faulty
link scenario.

	Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251106224639.139176-8-justintee8345@gmail.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 07caedc)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/scsi/lpfc/lpfc_els.c
#	drivers/scsi/lpfc/lpfc_hbadisc.c
jira KERNEL-548
cve CVE-2025-40248
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Michal Luczaj <mhal@rbox.co>
commit 002541e

During connect(), acting on a signal/timeout by disconnecting an already
established socket leads to several issues:

1. connect() invoking vsock_transport_cancel_pkt() ->
   virtio_transport_purge_skbs() may race with sendmsg() invoking
   virtio_transport_get_credit(). This results in a permanently elevated
   `vvs->bytes_unsent`. Which, in turn, confuses the SOCK_LINGER handling.

2. connect() resetting a connected socket's state may race with socket
   being placed in a sockmap. A disconnected socket remaining in a sockmap
   breaks sockmap's assumptions. And gives rise to WARNs.

3. connect() transitioning SS_CONNECTED -> SS_UNCONNECTED allows for a
   transport change/drop after TCP_ESTABLISHED. Which poses a problem for
   any simultaneous sendmsg() or connect() and may result in a
   use-after-free/null-ptr-deref.

Do not disconnect socket on signal/timeout. Keep the logic for unconnected
sockets: they don't linger, can't be placed in a sockmap, are rejected by
sendmsg().

[1]: https://lore.kernel.org/netdev/e07fd95c-9a38-4eea-9638-133e38c2ec9b@rbox.co/
[2]: https://lore.kernel.org/netdev/20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co/
[3]: https://lore.kernel.org/netdev/60f1b7db-3099-4f6a-875e-af9f6ef194f6@rbox.co/

Fixes: d021c34 ("VSOCK: Introduce VM Sockets")
	Signed-off-by: Michal Luczaj <mhal@rbox.co>
	Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251119-vsock-interrupted-connect-v2-1-70734cf1233f@rbox.co
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 002541e)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…ocked()

jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Trond Myklebust <trond.myklebust@hammerspace.com>
commit 9e8f324

Check that the delegation is still attached after taking the spin lock
in nfs_start_delegation_return_locked().

	Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
(cherry picked from commit 9e8f324)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Pavel Tatashin <pasha.tatashin@oracle.com>
commit 38669ba

It is expected for sched_clock() to output data from 0, when system boots.

Add an offset xen_sched_clock_offset (similarly how it is done in other
hypervisors i.e. kvm_sched_clock_offset) to count sched_clock() from 0,
when time is first initialized.

	Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
	Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
	Cc: steven.sistare@oracle.com
	Cc: daniel.m.jordan@oracle.com
	Cc: linux@armlinux.org.uk
	Cc: schwidefsky@de.ibm.com
	Cc: heiko.carstens@de.ibm.com
	Cc: john.stultz@linaro.org
	Cc: sboyd@codeaurora.org
	Cc: hpa@zytor.com
	Cc: douly.fnst@cn.fujitsu.com
	Cc: peterz@infradead.org
	Cc: prarit@redhat.com
	Cc: feng.tang@intel.com
	Cc: pmladek@suse.com
	Cc: gnomes@lxorguk.ukuu.org.uk
	Cc: linux-s390@vger.kernel.org
	Cc: boris.ostrovsky@oracle.com
	Cc: jgross@suse.com
	Cc: pbonzini@redhat.com
Link: https://lkml.kernel.org/r/20180719205545.16512-14-pasha.tatashin@oracle.com

(cherry picked from commit 38669ba)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Juergen Gross <jgross@suse.com>
commit 867cefb
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/867cefb4.failed

Commit f94c8d1 ("sched/clock, x86/tsc: Rework the x86 'unstable'
sched_clock() interface") broke Xen guest time handling across
migration:

[  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  187.251137] OOM killer disabled.
[  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  187.252299] suspending xenstore...
[  187.266987] xen:grant_table: Grant tables using version 1 layout
[18446743811.706476] OOM killer enabled.
[18446743811.706478] Restarting tasks ... done.
[18446743811.720505] Setting capacity to 16777216

Fix that by setting xen_sched_clock_offset at resume time to ensure a
monotonic clock value.

[boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
 to avoid printing with incorrect timestamp during resume (as we
 haven't re-adjusted the clock yet)]

Fixes: f94c8d1 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
	Cc: <stable@vger.kernel.org> # 4.11
	Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
	Signed-off-by: Juergen Gross <jgross@suse.com>
	Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
	Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
(cherry picked from commit 867cefb)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/xen/events/events_base.c
jira KERNEL-548
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Dongli Zhang <dongli.zhang@oracle.com>
commit eed0574
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/eed05744.failed

The sched_clock() can be used very early since commit 857baa8
("sched/clock: Enable sched clock early"). In addition, with commit
38669ba ("x86/xen/time: Output xen sched_clock time from 0"), kdump
kernel in Xen HVM guest may panic at very early stage when accessing
&__this_cpu_read(xen_vcpu)->time as in below:

setup_arch()
 -> init_hypervisor_platform()
     -> x86_init.hyper.init_platform = xen_hvm_guest_init()
         -> xen_hvm_init_time_ops()
             -> xen_clocksource_read()
                 -> src = &__this_cpu_read(xen_vcpu)->time;

This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
embedded inside 'shared_info' during early stage until xen_vcpu_setup() is
used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address.

However, when Xen HVM guest panic on vcpu >= 32, since
xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when
vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic.

This patch calls xen_hvm_init_time_ops() again later in
xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is
registered when the boot vcpu is >= 32.

This issue can be reproduced on purpose via below command at the guest
side when kdump/kexec is enabled:

"taskset -c 33 echo c > /proc/sysrq-trigger"

The bugfix for PVM is not implemented due to the lack of testing
environment.

[boris: xen_hvm_init_time_ops() returns on errors instead of jumping to end]

	Cc: Joe Jin <joe.jin@oracle.com>
	Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
	Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220302164032.14569-3-dongli.zhang@oracle.com
	Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
(cherry picked from commit eed0574)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/x86/xen/time.c
jira KERNEL-548
cve CVE-2025-40277
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Ian Forbes <ian.forbes@broadcom.com>
commit 32b415a

This data originates from userspace and is used in buffer offset
calculations which could potentially overflow causing an out-of-bounds
access.

Fixes: 8ce75f8 ("drm/vmwgfx: Update device includes for DX device functionality")
	Reported-by: Rohit Keshri <rkeshri@redhat.com>
	Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
	Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
	Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patch.msgid.link/20251021190128.13014-1-ian.forbes@broadcom.com
(cherry picked from commit 32b415a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
cve CVE-2023-53673
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Pauli Virtanen <pav@iki.fi>
commit 7f7cfcb

In hci_cs_disconnect, we do hci_conn_del even if disconnection failed.

ISO, L2CAP and SCO connections refer to the hci_conn without
hci_conn_get, so disconn_cfm must be called so they can clean up their
conn, otherwise use-after-free occurs.

ISO:
==========================================================
iso_sock_connect:880: sk 00000000eabd6557
iso_connect_cis:356: 70:1a:b8:98:ff:a2 -> 28:3d:c2:4a:7e:da
...
iso_conn_add:140: hcon 000000001696f1fd conn 00000000b6251073
hci_dev_put:1487: hci0 orig refcnt 17
__iso_chan_add:214: conn 00000000b6251073
iso_sock_clear_timer:117: sock 00000000eabd6557 state 3
...
hci_rx_work:4085: hci0 Event packet
hci_event_packet:7601: hci0: event 0x0f
hci_cmd_status_evt:4346: hci0: opcode 0x0406
hci_cs_disconnect:2760: hci0: status 0x0c
hci_sent_cmd_data:3107: hci0 opcode 0x0406
hci_conn_del:1151: hci0 hcon 000000001696f1fd handle 2560
hci_conn_unlink:1102: hci0: hcon 000000001696f1fd
hci_conn_drop:1451: hcon 00000000d8521aaf orig refcnt 2
hci_chan_list_flush:2780: hcon 000000001696f1fd
hci_dev_put:1487: hci0 orig refcnt 21
hci_dev_put:1487: hci0 orig refcnt 20
hci_req_cmd_complete:3978: opcode 0x0406 status 0x0c
... <no iso_* activity on sk/conn> ...
iso_sock_sendmsg:1098: sock 00000000dea5e2e0, sk 00000000eabd6557
BUG: kernel NULL pointer dereference, address: 0000000000000668
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
RIP: 0010:iso_sock_sendmsg (net/bluetooth/iso.c:1112) bluetooth
==========================================================

L2CAP:
==================================================================
hci_cmd_status_evt:4359: hci0: opcode 0x0406
hci_cs_disconnect:2760: hci0: status 0x0c
hci_sent_cmd_data:3085: hci0 opcode 0x0406
hci_conn_del:1151: hci0 hcon ffff88800c999000 handle 3585
hci_conn_unlink:1102: hci0: hcon ffff88800c999000
hci_chan_list_flush:2780: hcon ffff88800c999000
hci_chan_del:2761: hci0 hcon ffff88800c999000 chan ffff888018ddd280
...
BUG: KASAN: slab-use-after-free in hci_send_acl+0x2d/0x540 [bluetooth]
Read of size 8 at addr ffff888018ddd298 by task bluetoothd/1175

CPU: 0 PID: 1175 Comm: bluetoothd Tainted: G            E      6.4.0-rc4+ #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x5b/0x90
 print_report+0xcf/0x670
 ? __virt_addr_valid+0xf8/0x180
 ? hci_send_acl+0x2d/0x540 [bluetooth]
 kasan_report+0xa8/0xe0
 ? hci_send_acl+0x2d/0x540 [bluetooth]
 hci_send_acl+0x2d/0x540 [bluetooth]
 ? __pfx___lock_acquire+0x10/0x10
 l2cap_chan_send+0x1fd/0x1300 [bluetooth]
 ? l2cap_sock_sendmsg+0xf2/0x170 [bluetooth]
 ? __pfx_l2cap_chan_send+0x10/0x10 [bluetooth]
 ? lock_release+0x1d5/0x3c0
 ? mark_held_locks+0x1a/0x90
 l2cap_sock_sendmsg+0x100/0x170 [bluetooth]
 sock_write_iter+0x275/0x280
 ? __pfx_sock_write_iter+0x10/0x10
 ? __pfx___lock_acquire+0x10/0x10
 do_iter_readv_writev+0x176/0x220
 ? __pfx_do_iter_readv_writev+0x10/0x10
 ? find_held_lock+0x83/0xa0
 ? selinux_file_permission+0x13e/0x210
 do_iter_write+0xda/0x340
 vfs_writev+0x1b4/0x400
 ? __pfx_vfs_writev+0x10/0x10
 ? __seccomp_filter+0x112/0x750
 ? populate_seccomp_data+0x182/0x220
 ? __fget_light+0xdf/0x100
 ? do_writev+0x19d/0x210
 do_writev+0x19d/0x210
 ? __pfx_do_writev+0x10/0x10
 ? mark_held_locks+0x1a/0x90
 do_syscall_64+0x60/0x90
 ? lockdep_hardirqs_on_prepare+0x149/0x210
 ? do_syscall_64+0x6c/0x90
 ? lockdep_hardirqs_on_prepare+0x149/0x210
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7ff45cb23e64
Code: 15 d1 1f 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 80 3d 9d a7 0d 00 00 74 13 b8 14 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89
RSP: 002b:00007fff21ae09b8 EFLAGS: 00000202 ORIG_RAX: 0000000000000014
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007ff45cb23e64
RDX: 0000000000000001 RSI: 00007fff21ae0aa0 RDI: 0000000000000017
RBP: 00007fff21ae0aa0 R08: 000000000095a8a0 R09: 0000607000053f40
R10: 0000000000000001 R11: 0000000000000202 R12: 00007fff21ae0ac0
R13: 00000fffe435c150 R14: 00007fff21ae0a80 R15: 000060f000000040
 </TASK>

Allocated by task 771:
 kasan_save_stack+0x33/0x60
 kasan_set_track+0x25/0x30
 __kasan_kmalloc+0xaa/0xb0
 hci_chan_create+0x67/0x1b0 [bluetooth]
 l2cap_conn_add.part.0+0x17/0x590 [bluetooth]
 l2cap_connect_cfm+0x266/0x6b0 [bluetooth]
 hci_le_remote_feat_complete_evt+0x167/0x310 [bluetooth]
 hci_event_packet+0x38d/0x800 [bluetooth]
 hci_rx_work+0x287/0xb20 [bluetooth]
 process_one_work+0x4f7/0x970
 worker_thread+0x8f/0x620
 kthread+0x17f/0x1c0
 ret_from_fork+0x2c/0x50

Freed by task 771:
 kasan_save_stack+0x33/0x60
 kasan_set_track+0x25/0x30
 kasan_save_free_info+0x2e/0x50
 ____kasan_slab_free+0x169/0x1c0
 slab_free_freelist_hook+0x9e/0x1c0
 __kmem_cache_free+0xc0/0x310
 hci_chan_list_flush+0x46/0x90 [bluetooth]
 hci_conn_cleanup+0x7d/0x330 [bluetooth]
 hci_cs_disconnect+0x35d/0x530 [bluetooth]
 hci_cmd_status_evt+0xef/0x2b0 [bluetooth]
 hci_event_packet+0x38d/0x800 [bluetooth]
 hci_rx_work+0x287/0xb20 [bluetooth]
 process_one_work+0x4f7/0x970
 worker_thread+0x8f/0x620
 kthread+0x17f/0x1c0
 ret_from_fork+0x2c/0x50
==================================================================

Fixes: b8d2905 ("Bluetooth: clean up connection in hci_cs_disconnect")
	Signed-off-by: Pauli Virtanen <pav@iki.fi>
	Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
(cherry picked from commit 7f7cfcb)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-548
cve CVE-2025-40154
Rebuild_History Non-Buildable kernel-4.18.0-553.97.1.el8_10
commit-author Takashi Iwai <tiwai@suse.de>
commit fba404e

When an invalid value is passed via quirk option, currently
bytcr_rt5640 driver only shows an error message but leaves as is.
This may lead to unepxected results like OOB access.

This patch corrects the input mapping to the certain default value if
an invalid value is passed.

Fixes: 063422c ("ASoC: Intel: bytcr_rt5640: Set card long_name based on quirks")
	Signed-off-by: Takashi Iwai <tiwai@suse.de>
Message-ID: <20250902171826.27329-3-tiwai@suse.de>
	Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit fba404e)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 594898
Number of commits in rpm: 42
Number of commits matched with upstream: 35 (83.33%)
Number of commits in upstream but not in rpm: 594863
Number of commits NOT found in upstream: 7 (16.67%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.97.1.el8_10 for kernel-4.18.0-553.97.1.el8_10
Clean Cherry Picks: 19 (54.29%)
Empty Cherry Picks: 16 (45.71%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.97.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
@PlaidCat PlaidCat self-assigned this Jan 30, 2026
@PlaidCat PlaidCat requested review from a team January 30, 2026 17:37
Copy link
Collaborator

@bmastbergen bmastbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants