Commit Graph

1397885 Commits

Author SHA1 Message Date
Linus Torvalds c668da99b9 fscrypt fix for v6.18-rc5
Fix an UBSAN warning that started occurring when the block layer started
 supporting logical_block_size > PAGE_SIZE.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaQ0DUxQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK0E/AQDUd4N4YJ1D/vUcQXjtHnH6lkSMSRgo
 ZDJmoZcwSI+28QEA31Lvjs3G1heMSpKCTcz/+C7FLwM35+9e/ZUA5z+0fQg=
 =fPMw
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux

Pull fscrypt fix from Eric Biggers:
 "Fix an UBSAN warning that started occurring when the block layer
  started supporting logical_block_size > PAGE_SIZE"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: fix left shift underflow when inode->i_blkbits > PAGE_SHIFT
2025-11-06 12:45:15 -08:00
Linus Torvalds c90841db35 hardening fixes for v6.18-rc5
- Introduce __nocfi_generic for arm32 Clang (Nathan Chancellor)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaQz8RwAKCRA2KwveOeQk
 u51dAP9YhHIttRev7rdRwDWwrlhO+i2fps0vq8G9S6keSndr+AEAv8BtQexexPhI
 1oSNLeB/kVUVp5dQWjni48IgMxaG4A8=
 =zFGT
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:
 "This is a work-around for a (now fixed) corner case in the arm32 build
  with Clang KCFI enabled.

   - Introduce __nocfi_generic for arm32 Clang (Nathan Chancellor)"

* tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  libeth: xdp: Disable generic kCFI pass for libeth_xdp_tx_xmit_bulk()
  ARM: Select ARCH_USES_CFI_GENERIC_LLVM_PASS
  compiler_types: Introduce __nocfi_generic
2025-11-06 11:54:59 -08:00
Krzysztof Kozlowski 4436f484cb gpio: tb10x: Drop unused tb10x_set_bits() function
tb10x_set_bits() is not referenced anywhere leading to W=1 warning:

  gpio-tb10x.c:59:20: error: unused function 'tb10x_set_bits' [-Werror,-Wunused-function]

After its removal, tb10x_reg_write() becomes unused as well.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20251106-gpio-of-match-v1-1-50c7115a045e@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-11-06 18:19:44 +01:00
Wayne Lin 3c6a743c69 drm/amd/display: Enable mst when it's detected but yet to be initialized
[Why]
drm_dp_mst_topology_queue_probe() is used under the assumption that
mst is already initialized. If we connect system with SST first
then switch to the mst branch during suspend, we will fail probing
topology by calling the wrong API since the mst manager is yet to
be initialized.

[How]
At dm_resume(), once it's detected as mst branc connected, check if
the mst is initialized already. If not, call
dm_helpers_dp_mst_start_top_mgr() instead to initialize mst

V2: Adjust the commit msg a bit

Fixes: bc068194f5 ("drm/amd/display: Don't write DP_MSTM_CTRL after LT")
Cc: Fangzhi Zuo <jerry.zuo@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 62320fb8d91a0bddc44a228203cfa9bfbb5395bd)
Cc: stable@vger.kernel.org
2025-11-06 11:58:55 -05:00
Lijo Lazar 570a66b48c drm/amdgpu: Fix wait after reset sequence in S3
For a mode-1 reset done at the end of S3 on PSPv11 dGPUs, only check if
TOS is unloaded.

Fixes: 32f73741d6 ("drm/amdgpu: Wait for bootloader after PSPv11 reset")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4649
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1ad25fd272753db14c5d1cc8c68e20ce01f3f888)
2025-11-06 11:58:32 -05:00
Mario Limonciello b09cb2996c drm/amd: Fix suspend failure with secure display TA
commit c760bcda83 ("drm/amd: Check whether secure display TA loaded
successfully") attempted to fix extra messages, but failed to port the
cleanup that was in commit 5c6d52ff4b ("drm/amd: Don't try to enable
secure display TA multiple times") to prevent multiple tries.

Add that to the failure handling path even on a quick failure.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4679
Fixes: c760bcda83 ("drm/amd: Check whether secure display TA loaded successfully")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4104c0a454f6a4d1e0d14895d03c0e7bdd0c8240)
2025-11-06 11:58:10 -05:00
Samuel Zhang eb6e7f520d drm/amdgpu: fix gpu page fault after hibernation on PF passthrough
On PF passthrough environment, after hibernate and then resume, coralgemm
will cause gpu page fault.

Mode1 reset happens during hibernate, but partition mode is not restored
on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right
after resume. When CP access the MQD BO, wrong stride size is used,
this will cause out of bound access on the MQD BO, resulting page fault.

The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called
when resume from a hibernation.
KFD resume is called separately during a reset recovery or resume from
suspend sequence. Hence it's not required to be called as part of
partition switch.

Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5d1b32cfe4a676fe552416cb5ae847b215463a1a)
2025-11-06 11:57:08 -05:00
Linus Torvalds c2c2ccfd4b Including fixes from bluetooth and wireless.
Current release - new code bugs:
 
  - ptp: expose raw cycles only for clocks with free-running counter
 
  - bonding: fix null-deref in actor_port_prio setting
 
  - mdio: ERR_PTR-check regmap pointer returned by device_node_to_regmap()
 
  - eth: libie: depend on DEBUG_FS when building LIBIE_FWLOG
 
 Previous releases - regressions:
 
  - virtio_net: fix perf regression due to bad alignment of
    virtio_net_hdr_v1_hash
 
  - Revert "wifi: ath10k: avoid unnecessary wait for service ready message"
    caused regressions for QCA988x and QCA9984
 
  - Revert "wifi: ath12k: Fix missing station power save configuration"
    caused regressions for WCN7850
 
  - eth: bnxt_en: shutdown FW DMA in bnxt_shutdown(), fix memory
    corruptions after kexec
 
 Previous releases - always broken:
 
  - virtio-net: fix received packet length check for big packets
 
  - sctp: fix races in socket diag handling
 
  - wifi: add an hrtimer-based delayed work item to avoid low granularity
    of timers set relatively far in the future, and use it where it matters
    (e.g. when performing AP-scheduled channel switch)
 
  - eth: mlx5e:
    - correctly propagate error in case of module EEPROM read failure
    - fix HW-GRO on systems with PAGE_SIZE == 64kB
 
  - dsa: b53: fixes for tagging, link configuration / RMII, FDB, multicast
 
  - phy: lan8842: implement latest errata
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmkMyx0ACgkQMUZtbf5S
 Irs4gA//fRGPHfEksbVO0lAR9QVUKq/Wvokt6DnLXQsi7js7kun7ymVFyU8tVacP
 sGoYbTTXO34XpijKVMTFe5WnnnSF3cri1/e+PYNXndKGHOheLvz0aq+CniiLexaT
 ag/opHX9+Io6x8DyOVkkRJYByvEcXgy1vFjhk5O04wn7tGPBNzvaKQJzjfVIiiWP
 thma/e77jVUqsNptS/VzJNCDwPB3Qi0fqylRovu7A59Kmr7GLZxaqZQpvM5/XjU3
 s3NhNjNDawmiwxN2AIztXo+vMReqFaiCr+z36OcruE04CFslkQGmE/5g3FdDGg+Q
 RE7Wfgk2UJ+Q5Y1jnQWNYOkGaMd6bMgPQD/8zZvwEf163Gh1YBh0rtDY07DWV5FR
 wp5cmeNDo2Mf+nnCM2UaoUQS2AAmkub2aAIjnlvrefeJMKciyMqtQe+V02aIxuvK
 BPmeSCN1IOyAIDCRE3Fd6JFyk+4Z4YC6Hdm/WOG517eGrDh3yV9JE/+C3jdh3JwT
 lvaScKhCdyq/9mFqEcU/yR5Kc4Od6Rw0K0s5DM4LAbKSsxI6hp6SkuoFHUsqFsRR
 HL3ucy4c4hmLxz6AA5/+UVIPzFsY5cwOnhs3AU2UVuA95fH0QWqX2u2ll0rC9mAW
 xDXa/ai0/KAGvVYqjC+MDyF8zrtzSG7C40ZsxhdcU84s94ZNVFI=
 =a/Ml
 -----END PGP SIGNATURE-----

Merge tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
  Including fixes from bluetooth and wireless.

  Current release - new code bugs:

   - ptp: expose raw cycles only for clocks with free-running counter

   - bonding: fix null-deref in actor_port_prio setting

   - mdio: ERR_PTR-check regmap pointer returned by
     device_node_to_regmap()

   - eth: libie: depend on DEBUG_FS when building LIBIE_FWLOG

  Previous releases - regressions:

   - virtio_net: fix perf regression due to bad alignment of
     virtio_net_hdr_v1_hash

   - Revert "wifi: ath10k: avoid unnecessary wait for service ready
     message" caused regressions for QCA988x and QCA9984

   - Revert "wifi: ath12k: Fix missing station power save configuration"
     caused regressions for WCN7850

   - eth: bnxt_en: shutdown FW DMA in bnxt_shutdown(), fix memory
     corruptions after kexec

  Previous releases - always broken:

   - virtio-net: fix received packet length check for big packets

   - sctp: fix races in socket diag handling

   - wifi: add an hrtimer-based delayed work item to avoid low
     granularity of timers set relatively far in the future, and use it
     where it matters (e.g. when performing AP-scheduled channel switch)

   - eth: mlx5e:
       - correctly propagate error in case of module EEPROM read failure
       - fix HW-GRO on systems with PAGE_SIZE == 64kB

   - dsa: b53: fixes for tagging, link configuration / RMII, FDB,
     multicast

   - phy: lan8842: implement latest errata"

* tag 'net-6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
  selftests/vsock: avoid false-positives when checking dmesg
  net: bridge: fix MST static key usage
  net: bridge: fix use-after-free due to MST port state bypass
  lan966x: Fix sleeping in atomic context
  bonding: fix NULL pointer dereference in actor_port_prio setting
  net: dsa: microchip: Fix reserved multicast address table programming
  net: wan: framer: pef2256: Switch to devm_mfd_add_devices()
  net: libwx: fix device bus LAN ID
  net/mlx5e: SHAMPO, Fix header formulas for higher MTUs and 64K pages
  net/mlx5e: SHAMPO, Fix skb size check for 64K pages
  net/mlx5e: SHAMPO, Fix header mapping for 64K pages
  net: ti: icssg-prueth: Fix fdb hash size configuration
  net/mlx5e: Fix return value in case of module EEPROM read error
  net: gro_cells: Reduce lock scope in gro_cell_poll
  libie: depend on DEBUG_FS when building LIBIE_FWLOG
  wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup
  netpoll: Fix deadlock in memory allocation under spinlock
  net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error
  virtio-net: fix received length check in big packets
  bnxt_en: Fix warning in bnxt_dl_reload_down()
  ...
2025-11-06 08:52:30 -08:00
Nathan Chancellor a26a6c93ed
kbuild: Strip trailing padding bytes from modules.builtin.modinfo
After commit d50f210913 ("kbuild: align modinfo section for Secureboot
Authenticode EDK2 compat"), running modules_install with certain
versions of kmod (such as 29.1 in Ubuntu Jammy) in certain
configurations may fail with:

  depmod: ERROR: kmod_builtin_iter_next: unexpected string without modname prefix

The additional padding bytes to ensure .modinfo is aligned within
vmlinux.unstripped are unexpected by kmod, as this section has always
just been null-terminated strings.

Strip the trailing padding bytes from modules.builtin.modinfo after it
has been extracted from vmlinux.unstripped to restore the format that
kmod expects while keeping .modinfo aligned within vmlinux.unstripped to
avoid regressing the Authenticode calculation fix for EDK2.

Cc: stable@vger.kernel.org
Fixes: d50f210913 ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat")
Reported-by: Omar Sandoval <osandov@fb.com>
Reported-by: Samir M <samir@linux.ibm.com>
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/7fef7507-ad64-4e51-9bb8-c9fb6532e51e@linux.ibm.com/
Tested-by: Omar Sandoval <osandov@fb.com>
Tested-by: Samir M <samir@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251105-kbuild-fix-builtin-modinfo-for-kmod-v1-1-b419d8ad4606@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-11-06 09:50:23 -07:00
Bobby Eshleman 3534e03e0e selftests/vsock: avoid false-positives when checking dmesg
Sometimes VMs will have some intermittent dmesg warnings that are
unrelated to vsock. Change the dmesg parsing to filter on strings
containing 'vsock' to avoid false positive failures that are unrelated
to vsock. The downside is that it is possible for some vsock related
warnings to not contain the substring 'vsock', so those will be missed.

Fixes: a4a65c6fe0 ("selftests/vsock: add initial vmtest.sh for vsock")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251105-vsock-vmtest-dmesg-fix-v2-1-1a042a14892c@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:34:50 -08:00
Jakub Kicinski 13fef4fb05 Merge branch 'net-bridge-fix-two-mst-bugs'
Nikolay Aleksandrov says:

====================
net: bridge: fix two MST bugs

Patch 01 fixes a race condition that exists between expired fdb deletion
and port deletion when MST is enabled. Learning can happen after the
port's state has been changed to disabled which could lead to that
port's memory being used after it's been freed. The issue was reported
by syzbot, more information in patch 01. Patch 02 fixes an issue with
MST's static key which Ido spotted, we can have multiple bridges with MST
and a single bridge can erroneously disable it for all.
====================

Link: https://patch.msgid.link/20251105111919.1499702-1-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:32:20 -08:00
Nikolay Aleksandrov ee87c63f9b net: bridge: fix MST static key usage
As Ido pointed out, the static key usage in MST is buggy and should use
inc/dec instead of enable/disable because we can have multiple bridges
with MST enabled which means a single bridge can disable MST for all.
Use static_branch_inc/dec to avoid that. When destroying a bridge decrement
the key if MST was enabled.

Fixes: ec7328b591 ("net: bridge: mst: Multiple Spanning Tree (MST) mode")
Reported-by: Ido Schimmel <idosch@nvidia.com>
Closes: https://lore.kernel.org/netdev/20251104120313.1306566-1-razor@blackwall.org/T/#m6888d87658f94ed1725433940f4f4ebb00b5a68b
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251105111919.1499702-3-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:32:17 -08:00
Nikolay Aleksandrov 8dca36978a net: bridge: fix use-after-free due to MST port state bypass
syzbot reported[1] a use-after-free when deleting an expired fdb. It is
due to a race condition between learning still happening and a port being
deleted, after all its fdbs have been flushed. The port's state has been
toggled to disabled so no learning should happen at that time, but if we
have MST enabled, it will bypass the port's state, that together with VLAN
filtering disabled can lead to fdb learning at a time when it shouldn't
happen while the port is being deleted. VLAN filtering must be disabled
because we flush the port VLANs when it's being deleted which will stop
learning. This fix adds a check for the port's vlan group which is
initialized to NULL when the port is getting deleted, that avoids the port
state bypass. When MST is enabled there would be a minimal new overhead
in the fast-path because the port's vlan group pointer is cache-hot.

[1] https://syzkaller.appspot.com/bug?extid=dd280197f0f7ab3917be

Fixes: ec7328b591 ("net: bridge: mst: Multiple Spanning Tree (MST) mode")
Reported-by: syzbot+dd280197f0f7ab3917be@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69088ffa.050a0220.29fc44.003d.GAE@google.com/
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251105111919.1499702-2-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:32:17 -08:00
Horatiu Vultur 0216721ce7 lan966x: Fix sleeping in atomic context
The following warning was seen when we try to connect using ssh to the device.

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:575
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 104, name: dropbear
preempt_count: 1, expected: 0
INFO: lockdep is turned off.
CPU: 0 UID: 0 PID: 104 Comm: dropbear Tainted: G        W           6.18.0-rc2-00399-g6f1ab1b109b9-dirty #530 NONE
Tainted: [W]=WARN
Hardware name: Generic DT based system
Call trace:
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x7c/0xac
 dump_stack_lvl from __might_resched+0x16c/0x2b0
 __might_resched from __mutex_lock+0x64/0xd34
 __mutex_lock from mutex_lock_nested+0x1c/0x24
 mutex_lock_nested from lan966x_stats_get+0x5c/0x558
 lan966x_stats_get from dev_get_stats+0x40/0x43c
 dev_get_stats from dev_seq_printf_stats+0x3c/0x184
 dev_seq_printf_stats from dev_seq_show+0x10/0x30
 dev_seq_show from seq_read_iter+0x350/0x4ec
 seq_read_iter from seq_read+0xfc/0x194
 seq_read from proc_reg_read+0xac/0x100
 proc_reg_read from vfs_read+0xb0/0x2b0
 vfs_read from ksys_read+0x6c/0xec
 ksys_read from ret_fast_syscall+0x0/0x1c
Exception stack(0xf0b11fa8 to 0xf0b11ff0)
1fa0:                   00000001 00001000 00000008 be9048d8 00001000 00000001
1fc0: 00000001 00001000 00000008 00000003 be905920 0000001e 00000000 00000001
1fe0: 0005404c be9048c0 00018684 b6ec2cd8

It seems that we are using a mutex in a atomic context which is wrong.
Change the mutex with a spinlock.

Fixes: 12c2d0a5b8 ("net: lan966x: add ethtool configuration and statistics")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251105074955.1766792-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:31:34 -08:00
Hangbin Liu 067bf016e9 bonding: fix NULL pointer dereference in actor_port_prio setting
Liang reported an issue where setting a slave’s actor_port_prio to
predefined values such as 0, 255, or 65535 would cause a system crash.

The problem occurs because in bond_opt_parse(), when the provided value
matches a predefined table entry, the function returns that table entry,
which does not contain slave information. Later, in
bond_option_actor_port_prio_set(), calling bond_slave_get_rtnl() leads
to a NULL pointer dereference.

Since actor_port_prio is defined as a u16 and initialized to the default
value of 255 in ad_initialize_port(), there is no need for the
bond_actor_port_prio_tbl. Using the BOND_OPTFLAG_RAWVAL flag is sufficient.

Fixes: 6b6dc81ee7 ("bonding: add support for per-port LACP actor priority")
Reported-by: Liang Li <liali@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251105072620.164841-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:16:37 -08:00
Tristram Ha 96baf482ca net: dsa: microchip: Fix reserved multicast address table programming
KSZ9477/KSZ9897 and LAN937X families of switches use a reserved multicast
address table for some specific forwarding with some multicast addresses,
like the one used in STP.  The hardware assumes the host port is the last
port in KSZ9897 family and port 5 in LAN937X family.  Most of the time
this assumption is correct but not in other cases like KSZ9477.
Originally the function just setups the first entry, but the others still
need update, especially for one common multicast address that is used by
PTP operation.

LAN937x also uses different register bits when accessing the reserved
table.

Fixes: 457c182af5 ("net: dsa: microchip: generic access to ksz9477 static and reserved table")
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Tested-by: Łukasz Majewski <lukma@nabladev.com>
Link: https://patch.msgid.link/20251105033741.6455-1-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-06 07:11:36 -08:00
LiangCheng Wang b750f5a9d6 drm/tiny: pixpaper: add explicit dependency on MMU
The DRM_GEM_SHMEM_HELPER helper requires MMU enabled because it uses
vmf_insert_pfn() in its mmap implementation. On NOMMU configurations
(e.g. some RISC-V randconfig builds), this symbol is unavailable and
selecting DRM_GEM_SHMEM_HELPER causes a modpost undefined reference:

    ERROR: modpost: "vmf_insert_pfn" [drivers/gpu/drm/drm_shmem_helper.ko] undefined!

Normally, Kconfig prevents this helper from being selected when
CONFIG_MMU=n. However, in some randconfig builds (such as those used by
0day CI), select statements can override unmet dependencies, triggering
the issue.

Add an explicit dependency on MMU to DRM_PIXPAPER to prevent this.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202510280213.0rlYA4T3-lkp@intel.com/
Fixes: 0c4932f6dd ("drm/tiny: pixpaper: Fix missing dependency on DRM_GEM_SHMEM_HELPER")
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: LiangCheng Wang <zaq14760@gmail.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251028-bar-v1-1-edfbd13fafff@gmail.com
2025-11-06 13:47:29 +01:00
Peter Zijlstra 4cb5ac2626 futex: Optimize per-cpu reference counting
Shrikanth noted that the per-cpu reference counter was still some 10%
slower than the old immutable option (which removes the reference
counting entirely).

Further optimize the per-cpu reference counter by:

 - switching from RCU to preempt;
 - using __this_cpu_*() since we now have preempt disabled;
 - switching from smp_load_acquire() to READ_ONCE().

This is all safe because disabling preemption inhibits the RCU grace
period exactly like rcu_read_lock().

Having preemption disabled allows using __this_cpu_*() provided the
only access to the variable is in task context -- which is the case
here.

Furthermore, since we know changing fph->state to FR_ATOMIC demands a
full RCU grace period we can rely on the implied smp_mb() from that to
replace the acquire barrier().

This is very similar to the percpu_down_read_internal() fast-path.

The reason this is significant for PowerPC is that it uses the generic
this_cpu_*() implementation which relies on local_irq_disable() (the
x86 implementation relies on it being a single memop instruction to be
IRQ-safe). Switching to preempt_disable() and __this_cpu*() avoids
this IRQ state swizzling. Also, PowerPC needs LWSYNC for the ACQUIRE
barrier, not having to use explicit barriers safes a bunch.

Combined this reduces the performance gap by half, down to some 5%.

Fixes: 760e6f7bef ("futex: Remove support for IMMUTABLE")
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20251106092929.GR4067720@noisy.programming.kicks-ass.net
2025-11-06 12:30:54 +01:00
Aaron Lu 956dfda6a7 sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining
When a cfs_rq is to be throttled, its limbo list should be empty and
that's why there is a warn in tg_throttle_down() for non empty
cfs_rq->throttled_limbo_list.

When running a test with the following hierarchy:

          root
        /      \
        A*     ...
     /  |  \   ...
        B
       /  \
      C*

where both A and C have quota settings, that warn on non empty limbo list
is triggered for a cfs_rq of C, let's call it cfs_rq_c(and ignore the cpu
part of the cfs_rq for the sake of simpler representation).

Debug showed it happened like this:
Task group C is created and quota is set, so in tg_set_cfs_bandwidth(),
cfs_rq_c is initialized with runtime_enabled set, runtime_remaining
equals to 0 and *unthrottled*. Before any tasks are enqueued to cfs_rq_c,
*multiple* throttled tasks can migrate to cfs_rq_c (e.g., due to task
group changes). When enqueue_task_fair(cfs_rq_c, throttled_task) is
called and cfs_rq_c is in a throttled hierarchy (e.g., A is throttled),
these throttled tasks are directly placed into cfs_rq_c's limbo list by
enqueue_throttled_task().

Later, when A is unthrottled, tg_unthrottle_up(cfs_rq_c) enqueues these
tasks. The first enqueue triggers check_enqueue_throttle(), and with zero
runtime_remaining, cfs_rq_c can be throttled in throttle_cfs_rq() if it
can't get more runtime and enters tg_throttle_down(), where the warning
is hit due to remaining tasks in the limbo list.

I think it's a chaos to trigger throttle on unthrottle path, the status
of a being unthrottled cfs_rq can be in a mixed state in the end, so fix
this by granting 1ns to cfs_rq in tg_set_cfs_bandwidth(). This ensures
cfs_rq_c has a positive runtime_remaining when initialized as unthrottled
and cannot enter tg_unthrottle_up() with zero runtime_remaining.

Also, update outdated comments in tg_throttle_down() since
unthrottle_cfs_rq() is no longer called with zero runtime_remaining.
While at it, remove a redundant assignment to se in tg_throttle_down().

Fixes: e1fad12dcb ("sched/fair: Switch to task based throttle model")
Reviewed-By: Benjamin Segall <bsegall@google.com>
Suggested-by: Benjamin Segall <bsegall@google.com>
Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: Hao Jia <jiahao1@lixiang.com>
Link: https://patch.msgid.link/20251030032755.560-1-ziqianlu@bytedance.com
2025-11-06 12:30:52 +01:00
Christoph Hellwig d8a823c6f0 xfs: free xfs_busy_extents structure when no RT extents are queued
kmemleak occasionally reports leaking xfs_busy_extents structure
from xfs_scrub calls after running xfs/528 (but attributed to following
tests), which seems to be caused by not freeing the xfs_busy_extents
structure when tr.queued is 0 and xfs_trim_rtgroup_extents breaks out
of the main loop.  Free the structure in this case.

Fixes: a3315d1130 ("xfs: use rtgroup busy extent list for FITRIM")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-06 08:59:19 +01:00
Vlastimil Babka c379b745e1 slab: prevent infinite loop in kmalloc_nolock() with debugging
In review of a followup work, Harry noticed a potential infinite loop.
Upon closed inspection, it already exists for kmalloc_nolock() on a
cache with debugging enabled, since commit af92793e52 ("slab:
Introduce kmalloc_nolock() and kfree_nolock().")

When alloc_single_from_new_slab() fails to trylock node list_lock, we
keep retrying to get partial slab or allocate a new slab. If we indeed
interrupted somebody holding the list_lock, the trylock fill fail
deterministically and we end up allocating and defer-freeing slabs
indefinitely with no progress.

To fix it, fail the allocation if spinning is not allowed. This is
acceptable in the restricted context of kmalloc_nolock(), especially
with debugging enabled.

Reported-by: Harry Yoo <harry.yoo@oracle.com>
Closes: https://lore.kernel.org/all/aQLqZjjq1SPD3Fml@hyeyoo/
Fixes: af92793e52 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Link: https://patch.msgid.link/20251103-fix-nolock-loop-v1-1-6e2b3e82b9da@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-11-06 08:13:12 +01:00
Jakub Kicinski 7d1988a943 Just two small fixes:
- ath12k: revert a change that caused performance regressions
  - hwsim: don't ignore netns on netlink socket matching
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmkLbLsACgkQ10qiO8sP
 aAAShQ//cCY3DzzictRKxhlgV+O4PQFzp39ItfwWqEKYW4egqUajDMOB0rneMA4h
 ZGsI3fy/5jrblcPSVSGljwkjPaYW9gbsxz0PnGKKAtlKIHSgAEoEb+EQcZFJy//u
 JCwZAN9EQuqiTDu82/NIqk+nFcSoNEtY56gkkzmcTlXjE4fswEFadayqOAhLBTWp
 gv0iC656r5IgZNbiXoR1Ja7qu6nubcZWkeOcbgqDJT29vZDFd315DDzx1kScmUM/
 KPRpv0rTMUULM5V9JxQGOCtwEQ8DZwaE75SL+/uiDvKCTJFxVoWXMuTeQqnwqOt2
 WDHCU9+oi43JKuaqT+aF1JdDAJZboizDbNqqOxtxuGBvUH84JiuhOCAF2ItBwczu
 IMsgJ9XGRRiSLNQ0qWjPW4tGXzGBieY6ec8RKhtldjLQsIZVRNamaJzIjZKFmmP+
 tYwICE26keQwgrM2AjDkXGUrtJPP7LzUUqnP6DXrb5M5nLd6zbI7AyU+2CSMXN20
 pYFZZA4j6hcFayDg5yQUoVDUab/2sTevclyuNhLsdFT2ymiyMTLjtIMczKVg+DIl
 m7oTRIMCS+7k25nG/B0+LqV3qv1Y06Ct3U/T/3S77HyvoPM1054dpDwLWHIyvpOG
 xw7crsLYRbUYcBr4ie5DOq7blzmvpdjgCMrFR5rDn2/4lhjIWI4=
 =zUoa
 -----END PGP SIGNATURE-----

Merge tag 'wireless-2025-11-05' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Just two small fixes:

 - ath12k: revert a change that caused performance regressions
 - hwsim: don't ignore netns on netlink socket matching

* tag 'wireless-2025-11-05' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup
  Revert "wifi: ath12k: Fix missing station power save configuration"
====================

Link: https://patch.msgid.link/20251105152827.53254-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 18:04:55 -08:00
Haotian Zhang 4d6ec3a793 net: wan: framer: pef2256: Switch to devm_mfd_add_devices()
The driver calls mfd_add_devices() but fails to call mfd_remove_devices()
in error paths after successful MFD device registration and in the remove
function. This leads to resource leaks where MFD child devices are not
properly unregistered.

Replace mfd_add_devices with devm_mfd_add_devices to automatically
manage the device resources.

Fixes: c96e976d9a ("net: wan: framer: Add support for the Lantiq PEF2256 framer")
Suggested-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Acked-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20251105034716.662-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 18:02:34 -08:00
Jiawen Wu a04ea57aae net: libwx: fix device bus LAN ID
The device bus LAN ID was obtained from PCI_FUNC(), but when a PF
port is passthrough to a virtual machine, the function number may not
match the actual port index on the device. This could cause the driver
to perform operations such as LAN reset on the wrong port.

Fix this by reading the LAN ID from port status register.

Fixes: a34b3e6ed8 ("net: txgbe: Store PCI info")
Cc: stable@vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/B60A670C1F52CB8E+20251104062321.40059-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:52:13 -08:00
Jakub Kicinski b1d9154878 Merge branch 'net-mlx5e-shampo-fixes-for-64kb-page-size'
Tariq Toukan says:

====================
net/mlx5e: SHAMPO fixes for 64KB page size

This series by Dragos contains fixes for HW-GRO issues found on systems
with 64KB page size.
====================

Link: https://patch.msgid.link/1762238915-1027590-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:48:41 -08:00
Dragos Tatulea d8a7ed9586 net/mlx5e: SHAMPO, Fix header formulas for higher MTUs and 64K pages
The MLX5E_SHAMPO_WQ_HEADER_PER_PAGE and
MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE macros are used directly in
several places under the assumption that there will always be more
headers per WQE than headers per page. However, this assumption doesn't
hold for 64K page sizes and higher MTUs (> 4K). This can be first
observed during header page allocation: ksm_entries will become 0 during
alignment to MLX5E_SHAMPO_WQ_HEADER_PER_PAGE.

This patch introduces 2 additional members to the mlx5e_shampo_hd struct
which are meant to be used instead of the macrose mentioned above.
When the number of headers per WQE goes below
MLX5E_SHAMPO_WQ_HEADER_PER_PAGE, clamp the number of headers per
page and expand the header size accordingly so that the headers
for one WQE cover a full page.

All the formulas are adapted to use these two new members.

Fixes: 945ca432bf ("net/mlx5e: SHAMPO, Drop info array")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1762238915-1027590-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:48:37 -08:00
Dragos Tatulea bacd8d8018 net/mlx5e: SHAMPO, Fix skb size check for 64K pages
mlx5e_hw_gro_skb_has_enough_space() uses a formula to check if there is
enough space in the skb frags to store more data. This formula is
incorrect for 64K page sizes and it triggers early GRO session
termination because the first fragment will blow up beyond
GRO_LEGACY_MAX_SIZE.

This patch adds a special case for page sizes >= GRO_LEGACY_MAX_SIZE
(64K) which uses the skb->len instead. Within this context,
the check is safe from fragment overflow because the hardware
will continuously fill the data up to the reservation size of 64K
and the driver will coalesce all data from the same page to the same
fragment. This means that the data will span one fragment or at most
two for such a large page size.

It is expected that the if statement will be optimized out as the
check is done with constants.

Fixes: 92552d3abd ("net/mlx5e: HW_GRO cqe handler implementation")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1762238915-1027590-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:48:36 -08:00
Dragos Tatulea 665a7e13c2 net/mlx5e: SHAMPO, Fix header mapping for 64K pages
HW-GRO is broken on mlx5 for 64K page sizes. The patch in the fixes tag
didn't take into account larger page sizes when doing an align down
of max_ksm_entries. For 64K page size, max_ksm_entries is 0 which will skip
mapping header pages via WQE UMR. This breaks header-data split
and will result in the following syndrome:

mlx5_core 0000:00:08.0 eth2: Error cqe on cqn 0x4c9, ci 0x0, qn 0x1133, opcode 0xe, syndrome 0x4, vendor syndrome 0x32
00000000: 00 00 00 00 04 4a 00 00 00 00 00 00 20 00 93 32
00000010: 55 00 00 00 fb cc 00 00 00 00 00 00 07 18 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4a
00000030: 00 00 3b c7 93 01 32 04 00 00 00 00 00 00 bf e0
mlx5_core 0000:00:08.0 eth2: ERR CQE on RQ: 0x1133

Furthermore, the function that fills in WQE UMRs for the headers
(mlx5e_build_shampo_hd_umr()) only supports mapping page sizes that
fit in a single UMR WQE.

This patch goes back to the old non-aligned max_ksm_entries value and it
changes mlx5e_build_shampo_hd_umr() to support mapping a large page over
multiple UMR WQEs.

This means that mlx5e_build_shampo_hd_umr() can now leave a page only
partially mapped. The caller, mlx5e_alloc_rx_hd_mpwqe(), ensures that
there are enough UMR WQEs to cover complete pages by working on
ksm_entries that are multiples of MLX5E_SHAMPO_WQ_HEADER_PER_PAGE.

Fixes: 8a0ee54027 ("net/mlx5e: SHAMPO, Simplify UMR allocation for headers")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1762238915-1027590-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:48:36 -08:00
Meghana Malladi ae4789affd net: ti: icssg-prueth: Fix fdb hash size configuration
The ICSSG driver does the initial FDB configuration which
includes setting the control registers. Other run time
management like learning is managed by the PRU's. The default
FDB hash size used by the firmware is 512 slots, which is
currently missing in the current driver. Update the driver
FDB config to include FDB hash size as well.

Please refer trm [1] 6.4.14.12.17 section on how the FDB config
register gets configured. From the table 6-1404, there is a reset
field for FDB_HAS_SIZE which is 4, meaning 1024 slots. Currently
the driver is not updating this reset value from 4(1024 slots) to
3(512 slots). This patch fixes this by updating the reset value
to 512 slots.

[1]: https://www.ti.com/lit/pdf/spruim2
Fixes: abd5576b9c ("net: ti: icssg-prueth: Add support for ICSSG switch firmware")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251104104415.3110537-1-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:43:08 -08:00
Gal Pressman d1c94bc5b9 net/mlx5e: Fix return value in case of module EEPROM read error
mlx5e_get_module_eeprom_by_page() has weird error handling.

First, it is treating -EINVAL as a special case, but it is unclear why.

Second, it tries to fail "gracefully" by returning the number of bytes
read even in case of an error. This results in wrongly returning
success (0 return value) if the error occurs before any bytes were
read.

Simplify the error handling by returning an error when such occurs. This
also aligns with the error handling we have in mlx5e_get_module_eeprom()
for the old API.

This fixes the following case where the query fails, but userspace
ethtool wrongly treats it as success and dumps an output:

  # ethtool -m eth2
  netlink warning: mlx5_core: Query module eeprom by page failed, read 0 bytes, err -5
  netlink warning: mlx5_core: Query module eeprom by page failed, read 0 bytes, err -5
  Offset		Values
  ------		------
  0x0000:		00 00 00 00 05 00 04 00 00 00 00 00 05 00 05 00
  0x0010:		00 00 00 00 05 00 06 00 50 00 00 00 67 65 20 66
  0x0020:		61 69 6c 65 64 2c 20 72 65 61 64 20 30 20 62 79
  0x0030:		74 65 73 2c 20 65 72 72 20 2d 35 00 14 00 03 00
  0x0040:		08 00 01 00 03 00 00 00 08 00 02 00 1a 00 00 00
  0x0050:		14 00 04 00 08 00 01 00 04 00 00 00 08 00 02 00
  0x0060:		0e 00 00 00 14 00 05 00 08 00 01 00 05 00 00 00
  0x0070:		08 00 02 00 1a 00 00 00 14 00 06 00 08 00 01 00

Fixes: e109d2b204 ("net/mlx5: Implement get_module_eeprom_by_page()")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Alex Lazar <alazar@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1762265736-1028868-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:42:37 -08:00
Sebastian Andrzej Siewior d917c217b6 net: gro_cells: Reduce lock scope in gro_cell_poll
One GRO-cell device's NAPI callback can nest into the GRO-cell of
another device if the underlying device is also using GRO-cell.
This is the case for IPsec over vxlan.
These two GRO-cells are separate devices. From lockdep's point of view
it is the same because each device is sharing the same lock class and so
it reports a possible deadlock assuming one device is nesting into
itself.

Hold the bh_lock only while accessing gro_cell::napi_skbs in
gro_cell_poll(). This reduces the locking scope and avoids acquiring the
same lock class multiple times.

Fixes: 25718fdcbd ("net: gro_cells: Use nested-BH locking for gro_cell")
Reported-by: Gal Pressman <gal@nvidia.com>
Closes: https://lore.kernel.org/all/66664116-edb8-48dc-ad72-d5223696dd19@nvidia.com/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20251104153435.ty88xDQt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:41:29 -08:00
Michal Swiatkowski b1d16f7c00 libie: depend on DEBUG_FS when building LIBIE_FWLOG
LIBIE_FWLOG is unusable without DEBUG_FS. Mark it in Kconfig.

Fix build error on ixgbe when DEBUG_FS is not set. To not add another
layer of #if IS_ENABLED(LIBIE_FWLOG) in ixgbe fwlog code define debugfs
dentry even when DEBUG_FS isn't enabled. In this case the dummy
functions of LIBIE_FWLOG will be used, so not initialized dentry isn't a
problem.

Fixes: 641585bc97 ("ixgbe: fwlog support for e610")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/lkml/f594c621-f9e1-49f2-af31-23fbcb176058@roeck-us.net/
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251104172333.752445-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-05 17:38:03 -08:00
James Jones 664ce10246 drm/nouveau: Advertise correct modifiers on GB20x
8 and 16 bit formats use a different layout on
GB20x than they did on prior chips. Add the
corresponding DRM format modifiers to the list of
modifiers supported by the display engine on such
chips, and filter the supported modifiers for each
format based on its bytes per pixel in
nv50_plane_format_mod_supported().

Note this logic will need to be updated when GB10
support is added, since it is a GB20x chip that
uses the pre-GB20x sector layout for all formats.

Fixes: 6cc6e08d45 ("drm/nouveau/kms: add support for GB20x")
Signed-off-by: James Jones <jajones@nvidia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251030181153.1208-3-jajones@nvidia.com
2025-11-06 11:02:08 +10:00
James Jones 1cf52a0d4b drm: define NVIDIA DRM format modifiers for GB20x
The layout of bits within the individual tiles
(referred to as sectors in the
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro)
changed for 8 and 16-bit surfaces starting in
Blackwell 2 GPUs (With the exception of GB10).
To denote the difference, extend the sector field
in the parametric format modifier definition used
to generate modifier values for NVIDIA hardware.

Without this change, it would be impossible to
differentiate the two layouts based on modifiers,
and as a result software could attempt to share
surfaces directly between pre-GB20x and GB20x
cards, resulting in corruption when the surface
was accessed on one of the GPUs after being
populated with content by the other.

Of note: This change causes the
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to
evaluate its "s" parameter twice, with the side
effects that entails. I surveyed all usage of the
modifier in the kernel and Mesa code, and that
does not appear to be problematic in any current
usage, but I thought it was worth calling out.

Fixes: 6cc6e08d45 ("drm/nouveau/kms: add support for GB20x")
Signed-off-by: James Jones <jajones@nvidia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251030181153.1208-2-jajones@nvidia.com
2025-11-06 11:01:45 +10:00
Timur Tabi ebe7556050 drm/nouveau: set DMA mask before creating the flush page
Set the DMA mask before calling nvkm_device_ctor(), so that when the
flush page is created in nvkm_fb_ctor(), the allocation will not fail
if the page is outside of DMA address space, which can easily happen if
IOMMU is disable.  In such situations, you will get an error like this:

nouveau 0000:65:00.0: DMA addr 0x0000000107c56000+4096 overflow (mask ffffffff, bus limit 0).

Commit 38f5359354 ("rm/nouveau/pci: set streaming DMA mask early")
set the mask after calling nvkm_device_ctor(), but back then there was
no flush page being created, which might explain why the mask wasn't
set earlier.

Flush page allocation was added in commit 5728d06419 ("drm/nouveau/fb:
handle sysmem flush page from common code").  nvkm_fb_ctor() calls
alloc_page(), which can allocate a page anywhere in system memory, but
then calls dma_map_page() on that page.  But since the DMA mask is still
set to 32, the map can fail if the page is allocated above 4GB.  This is
easy to reproduce on systems with a lot of memory and IOMMU disabled.

An alternative approach would be to force the allocation of the flush
page to low memory, by specifying __GFP_DMA32.  However, this would
always allocate the page in low memory, even though the hardware can
access high memory.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20251014174512.3172102-1-ttabi@nvidia.com
2025-11-06 10:26:51 +10:00
Linus Torvalds dc77806cf3 Rust fixes for v6.18
Toolchain and infrastructure:
 
   - Fix/workaround a couple Rust 1.91.0 build issues when sanitizers are
     enabled due to extra checking performed by the compiler and an
     upstream issue already fixed for Rust 1.93.0.
 
   - Fix future Rust 1.93.0 builds by supporting the stabilized name for
     the 'no-jump-tables' flag.
 
   - Fix a couple private/broken intra-doc links uncovered by the future
     move of pin-init to 'syn'.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmkLYvMACgkQGXyLc2ht
 IW2I9g/+IIP7rJEQug5EpyTuxO2GX1gKOf3lbV5YxpUi+BXIFL5ZY7Nlgi3EssSd
 Cj3LaoEiDzFeYewCnjhk3JAsuzmr/bEjN5xWiDT3Rk/yhBIX+oRBJjW+yze+gfoe
 27gqS20W08WSfIj2n91EDwbdrhiz0Lp87cMlcesdDVZKnY215Whf8zsYNGIGutSe
 ISdDPsHdoCe2RHoxLcP0Yo6OD/bRK7hTQxEp6rPh787+SK4znjdJWmOVoi4D/HdI
 fMHG3T8HfDIOTQJITDTlvne9CRvq6+JaUDQBPFDZBFwxA7S83SdiDDIqWb4N7zlh
 ETySFSblkrJdCdT2LelRu/JPv25h1vUvhQObNBY77enBeOxJdeJNWk7iY2p6GRCD
 4+qunebeRsLuAQqzAAhk5gLz5Jb7mPcGbhJzAebOlp/bixwS3YQb9710FvWwmjQY
 dPxbkxAQj1dAHgkVLDk91mzwU3R2m3OcWMrwC6rU2O2Evraq58B7N2xDOSSi1uzK
 e77oBk8CEuFLw/EIZq32dWhRf+9F8x1CrG4K6bTXTax1ZgHp/UZ4/yNrd5+7z2wZ
 5gVoCqswTqQg4YOPhwk+tACFDS0vDYMnsdT7BNNkRAUibBC7q52jbPpBsMBaJ3ji
 3VwS93idCkEGYYjNlPRKxtlL3nIFbbN4PbsvfjIKSauv2MyxV7Y=
 =0bvf
 -----END PGP SIGNATURE-----

Merge tag 'rust-fixes-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull rust fixes from Miguel Ojeda:

 - Fix/workaround a couple Rust 1.91.0 build issues when sanitizers are
   enabled due to extra checking performed by the compiler and an
   upstream issue already fixed for Rust 1.93.0

 - Fix future Rust 1.93.0 builds by supporting the stabilized name for
   the 'no-jump-tables' flag

 - Fix a couple private/broken intra-doc links uncovered by the future
   move of pin-init to 'syn'

* tag 'rust-fixes-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: kbuild: support `-Cjump-tables=n` for Rust 1.93.0
  rust: kbuild: workaround `rustdoc` doctests modifier bug
  rust: kbuild: treat `build_error` and `rustdoc` as kernel objects
  rust: condvar: fix broken intra-doc link
  rust: devres: fix private intra-doc link
2025-11-05 11:15:36 -08:00
Jason Gunthorpe afb47765f9 iommufd: Make vfio_compat's unmap succeed if the range is already empty
iommufd returns ENOENT when attempting to unmap a range that is already
empty, while vfio type1 returns success. Fix vfio_compat to match.

Fixes: d624d6652a ("iommufd: vfio container FD ioctl compatibility")
Link: https://patch.msgid.link/r/0-v1-76be45eff0be+5d-iommufd_unmap_compat_jgg@nvidia.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Alex Mastro <amastro@fb.com>
Reported-by: Alex Mastro <amastro@fb.com>
Closes: https://lore.kernel.org/r/aP0S5ZF9l3sWkJ1G@devgpu012.nha5.facebook.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-11-05 15:11:26 -04:00
Linus Torvalds 5624d4c378 platform-drivers-x86 for v6.18-3
Fixes and New Hotkey Support
 
 - input + dell-wmi-base: Electronic privacy screen on/off hotkey support
 
 - int3472: Fix unregister double free
 
 - wireless-hotkey: Fix Kconfig typo
 
 The following is an automated shortlog grouped by driver:
 
 dell-wmi-base:
  -  Handle electronic privacy screen on/off events
 
 Input:
  -  Add keycodes for electronic privacy screen on/off hotkeys
 
 int3472:
  -  Fix double free of GPIO device during unregister
 
 MAINTAINERS:
  -  Update int3472 maintainers
 
 x86: Kconfig:
  -  fix minor typo in help for WIRELESS_HOTKEY
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSCSUwRdwTNL2MhaBlZrE9hU+XOMQUCaQso/QAKCRBZrE9hU+XO
 Me2tAQCoij2NER2aThaFPzTjBfvIKF4DbpsSo9V0I2r+gR6xzAD/UWmliCDGQ0dV
 NS28/L982I716VK2Mv5SvdG9BKxAlwM=
 =Nnez
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:
 "Fixes and New Hotkey Support:

   - input + dell-wmi-base: Electronic privacy screen on/off hotkey
     support

   - int3472: Fix unregister double free

   - wireless-hotkey: Fix Kconfig typo"

* tag 'platform-drivers-x86-v6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform: x86: Kconfig: fix minor typo in help for WIRELESS_HOTKEY
  platform/x86: dell-wmi-base: Handle electronic privacy screen on/off events
  Input: Add keycodes for electronic privacy screen on/off hotkeys
  MAINTAINERS: Update int3472 maintainers
  platform/x86: int3472: Fix double free of GPIO device during unregister
2025-11-05 11:08:10 -08:00
Pavel Begunkov 1fd5367391 io_uring: fix types for region size calulation
->nr_pages is int, it needs type extension before calculating the region
size.

Fixes: a90558b36c ("io_uring/memmap: helper for pinning region pages")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-11-05 11:45:07 -07:00
Christoph Hellwig 21ab5179aa xfs: fix zone selection in xfs_select_open_zone_mru
xfs_select_open_zone_mru needs to pass XFS_ZONE_ALLOC_OK to
xfs_try_use_zone because we only want to tightly pack into zones of the
same or a compatible temperature instead of any available zone.

This got broken in commit 0301dae732 ("xfs: refactor hint based zone
allocation"), which failed to update this particular caller when
switching to an enum.  xfs/638 sometimes, but not reliably fails due to
this change.

Fixes: 0301dae732 ("xfs: refactor hint based zone allocation")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:54:38 +01:00
Christoph Hellwig f5714a3c1a xfs: fix a rtgroup leak when xfs_init_zone fails
Drop the rtgrop reference when xfs_init_zone fails for a conventional
device.

Fixes: 4e4d520755 ("xfs: add the zoned space allocator")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:53:49 +01:00
Darrick J. Wong 8d7bba1e83 xfs: fix various problems in xfs_atomic_write_cow_iomap_begin
I think there are several things wrong with this function:

A) xfs_bmapi_write can return a much larger unwritten mapping than what
   the caller asked for.  We convert part of that range to written, but
   return the entire written mapping to iomap even though that's
   inaccurate.

B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
   unwritten mapping could be *smaller* than the write range (or even
   the hole range).  In this case, we convert too much file range to
   written state because we then return a smaller mapping to iomap.

C) It doesn't handle delalloc mappings.  This I covered in the patch
   that I already sent to the list.

D) Reassigning count_fsb to handle the hole means that if the second
   cmap lookup attempt succeeds (due to racing with someone else) we
   trim the mapping more than is strictly necessary.  The changing
   meaning of count_fsb makes this harder to notice.

E) The tracepoint is kinda wrong because @length is mutated.  That makes
   it harder to chase the data flows through this function because you
   can't just grep on the pos/bytecount strings.

F) We don't actually check that the br_state = XFS_EXT_NORM assignment
   is accurate, i.e that the cow fork actually contains a written
   mapping for the range we're interested in

G) Somewhat inadequate documentation of why we need to xfs_trim_extent
   so aggressively in this function.

H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
   the write range to s_maxbytes.

Fix these issues, and then the atomic writes regressions in generic/760,
generic/617, generic/091, generic/263, and generic/521 all go away for
me.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:52:49 +01:00
Darrick J. Wong 8d54eacd82 xfs: fix delalloc write failures in software-provided atomic writes
With the 20 Oct 2025 release of fstests, generic/521 fails for me on
regular (aka non-block-atomic-writes) storage:

QA output created by 521
dowrite: write: Input/output error
LOG DUMP (8553 total operations):
1(  1 mod 256): SKIPPED (no operation)
2(  2 mod 256): WRITE    0x7e000 thru 0x8dfff	(0x10000 bytes) HOLE
3(  3 mod 256): READ     0x69000 thru 0x79fff	(0x11000 bytes)
4(  4 mod 256): FALLOC   0x53c38 thru 0x5e853	(0xac1b bytes) INTERIOR
5(  5 mod 256): COPY 0x55000 thru 0x59fff	(0x5000 bytes) to 0x25000 thru 0x29fff
6(  6 mod 256): WRITE    0x74000 thru 0x88fff	(0x15000 bytes)
7(  7 mod 256): ZERO     0xedb1 thru 0x11693	(0x28e3 bytes)

with a warning in dmesg from iomap about XFS trying to give it a
delalloc mapping for a directio write.  Fix the software atomic write
iomap_begin code to convert the reservation into a written mapping.
This doesn't fix the data corruption problems reported by generic/760,
but it's a start.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:52:49 +01:00
Johannes Berg 4c740c4d8b ath.git update for v6.18-rc5
Revert an ath12k change which resulted in a significance performance
 impact on WCN7850.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ/mtSHzPUi16IfDEksFbugiYzLewUCaQjAlgAKCRAsFbugiYzL
 e1g/AP0VjhhuC5zzElFTNf+5HTgNnXAvs6ghg2BUPkJug+X6xAD9GnKdlrqoo4qw
 iGex7lkBnIzPn+fTlF0xPHFDvF0cgww=
 =b8DL
 -----END PGP SIGNATURE-----

Merge tag 'ath-current-20251103' of git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath

Jeff Johnson says:
==================
ath.git update for v6.18-rc5

Revert an ath12k change which resulted in a significance performance
impact on WCN7850.
==================

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-05 16:18:48 +01:00
Martin Willi c74619e760 wifi: mac80211_hwsim: Limit destroy_on_close radio removal to netgroup
hwsim radios marked destroy_on_close are removed when the Netlink socket
that created them is closed. As the portid is not unique across network
namespaces, closing a socket in one namespace may remove radios in another
if it has the destroy_on_close flag set.

Instead of matching the network namespace, match the netgroup of the radio
to limit radio removal to those that have been created by the closing
Netlink socket. The netgroup of a radio identifies the network namespace
it was created in, and matching on it removes a destroy_on_close radio
even if it has been moved to another namespace.

Fixes: 100cb9ff40 ("mac80211_hwsim: Allow managing radios from non-initial namespaces")
Signed-off-by: Martin Willi <martin@strongswan.org>
Link: https://patch.msgid.link/20251103082436.30483-1-martin@strongswan.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-05 16:18:16 +01:00
Pierre-Eric Pelloux-Prayer 487df8b698 drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb
The Mesa issue referenced below pointed out a possible deadlock:

[ 1231.611031]  Possible interrupt unsafe locking scenario:

[ 1231.611033]        CPU0                    CPU1
[ 1231.611034]        ----                    ----
[ 1231.611035]   lock(&xa->xa_lock#17);
[ 1231.611038]                                local_irq_disable();
[ 1231.611039]                                lock(&fence->lock);
[ 1231.611041]                                lock(&xa->xa_lock#17);
[ 1231.611044]   <Interrupt>
[ 1231.611045]     lock(&fence->lock);
[ 1231.611047]
                *** DEADLOCK ***

In this example, CPU0 would be any function accessing job->dependencies
through the xa_* functions that don't disable interrupts (eg:
drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()).

CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling
callback so in an interrupt context. It will deadlock when trying to
grab the xa_lock which is already held by CPU0.

Replacing all xa_* usage by their xa_*_irq counterparts would fix
this issue, but Christian pointed out another issue: dma_fence_signal
takes fence.lock and so does dma_fence_add_callback.

  dma_fence_signal() // locks f1.lock
  -> drm_sched_entity_kill_jobs_cb()
  -> foreach dependencies
     -> dma_fence_add_callback() // locks f2.lock

This will deadlock if f1 and f2 share the same spinlock.

To fix both issues, the code iterating on dependencies and re-arming them
is moved out to drm_sched_entity_kill_jobs_work().

Cc: stable@vger.kernel.org # v6.2+
Fixes: 2fdb8a8f07 ("drm/scheduler: rework entity flush, kill and fini")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
[phasta: commit message nits]
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patch.msgid.link/20251104095358.15092-1-pierre-eric.pelloux-prayer@amd.com
2025-11-05 12:29:52 +01:00
Thomas Richard 5232334bae gpio: aggregator: restore the set_config operation
Restore the set_config operation, as it was lost during the refactoring of
the gpio-aggregator driver while creating the gpio forwarder library.

Fixes: b31c68fd85 ("gpio: aggregator: handle runtime registration of gpio_desc in gpiochip_fwd")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202509281206.a7334ae8-lkp@intel.com
Signed-off-by: Thomas Richard <thomas.richard@bootlin.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20250929-gpio-aggregator-fix-set-config-callback-v1-1-39046e1da609@bootlin.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-11-05 11:34:26 +01:00
Linus Torvalds 1c353dc8d9 [GIT PULL for v6.18-rc5] media fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmkLExYACgkQCF8+vY7k
 4RUbiQ//bQCnfN2a+ombBf8Qr7UWlS+3YSW1qHtAMwjsMVxcgORhOJiudYMIe4Z8
 sXb8xS6uyvZiRrnBUjB8nryfnpKXyGVeRD7nFK9AblBUgGt4UsOHc4PQPUkSeRoa
 JinCUnxwu5VHTjDaLYqVBdgKyQYHU6ctfs0PlVjmf7dtolWyF2zMRhZbPy/3mf9M
 gF3bfhoLPHXMAYtHcR9QUnib1dvbrLvVDJ9t+emssKu7SWKxcc5lxG0b5dmRaaZB
 UBwToFZR/bjbOnxwDBwbyLcARVAXc9xenmc8xBnkhR+SrIbgb24+BO+VifNKu6pu
 crgoJWTGjE8/t5fgQpgxHgRBWGwlyS/aYaufXWe4Qz6VMI9RzJelLM184LWH04Pa
 CVEPurO1B9PpqSDXZ12q19wcXXxXv5ES4LZ+guXWqPhAI2C5jH8IxIVyRPTmWK8X
 VwGootioTfR+eSpHxn4kBJqLPuNN9Sx+M8COvA1WP/LQ27ZObbOy7P1bqGnZBn9F
 sbSYfBKxcCiMcpht4pNZ3V/mC66kf3HbDubMGlFxK7rrYXv+9WiksNC1LkiCHQt8
 iklV2Vx4rgKPGfcWQELq7YELDPRZXDZPVxpAFOSCrWSeYwUacR+QznbOrGn3qnle
 3XAh+rgpa46piJuBYfknYb5rr6jkk5rwehYgh056DOCth8fUNJg=
 =RhZS
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - honour privacy led with pdx86/int3472

 - fix invalid file access on cx18 and ivtv

 - forbid remove_bufs when legacy fileio is active on videbuf2

 - add an heuristic to find stream entity on uvcvideo

* tag 'media/v6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: videobuf2: forbid remove_bufs when legacy fileio is active
  media: uvcvideo: Use heuristic to find stream entity
  media: v4l2-subdev / pdx86: int3472: Use "privacy" as con_id for the privacy LED
  media: ivtv: Fix invalid access to file *
  media: cx18: Fix invalid access to file *
2025-11-05 18:56:15 +09:00
Breno Leitao 327c20c21d netpoll: Fix deadlock in memory allocation under spinlock
Fix a AA deadlock in refill_skbs() where memory allocation while holding
skb_pool->lock can trigger a recursive lock acquisition attempt.

The deadlock scenario occurs when the system is under severe memory
pressure:

1. refill_skbs() acquires skb_pool->lock (spinlock)
2. alloc_skb() is called while holding the lock
3. Memory allocator fails and calls slab_out_of_memory()
4. This triggers printk() for the OOM warning
5. The console output path calls netpoll_send_udp()
6. netpoll_send_udp() attempts to acquire the same skb_pool->lock
7. Deadlock: the lock is already held by the same CPU

Call stack:
  refill_skbs()
    spin_lock_irqsave(&skb_pool->lock)    <- lock acquired
    __alloc_skb()
      kmem_cache_alloc_node_noprof()
        slab_out_of_memory()
          printk()
            console_flush_all()
              netpoll_send_udp()
                skb_dequeue()
                  spin_lock_irqsave(&skb_pool->lock)     <- deadlock attempt

This bug was exposed by commit 248f6571fd ("netpoll: Optimize skb
refilling on critical path") which removed refill_skbs() from the
critical path (where nested printk was being deferred), letting nested
printk being called from inside refill_skbs()

Refactor refill_skbs() to never allocate memory while holding
the spinlock.

Another possible solution to fix this problem is protecting the
refill_skbs() from nested printks, basically calling
printk_deferred_{enter,exit}() in refill_skbs(), then, any nested
pr_warn() would be deferred.

I prefer this approach, given I _think_ it might be a good idea to move
the alloc_skb() from GFP_ATOMIC to GFP_KERNEL in the future, so, having
the alloc_skb() outside of the lock will be necessary step.

There is a possible TOCTOU issue when checking for the pool length, and
queueing the new allocated skb, but, this is not an issue, given that
an extra SKB in the pool is harmless and it will be eventually used.

Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes: 248f6571fd ("netpoll: Optimize skb refilling on critical path")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251103-fix_netpoll_aa-v4-1-4cfecdf6da7c@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 19:17:00 -08:00
Nishanth Menon 90a88306eb net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error
Make knav_dma_open_channel consistently return NULL on error instead
of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h
returns NULL when the driver is disabled, but the driver
implementation does not even return NULL or ERR_PTR on failure,
causing inconsistency in the users. This results in a crash in
netcp_free_navigator_resources as followed (trimmed):

Unhandled fault: alignment exception (0x221) at 0xfffffff2
[fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000
Internal error: : 221 [#1] SMP ARM
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE
Hardware name: Keystone
PC is at knav_dma_close_channel+0x30/0x19c
LR is at netcp_free_navigator_resources+0x2c/0x28c

[... TRIM...]

Call trace:
 knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c
 netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c
 netcp_ndo_open from __dev_open+0x114/0x29c
 __dev_open from __dev_change_flags+0x190/0x208
 __dev_change_flags from netif_change_flags+0x1c/0x58
 netif_change_flags from dev_change_flags+0x38/0xa0
 dev_change_flags from ip_auto_config+0x2c4/0x11f0
 ip_auto_config from do_one_initcall+0x58/0x200
 do_one_initcall from kernel_init_freeable+0x1cc/0x238
 kernel_init_freeable from kernel_init+0x1c/0x12c
 kernel_init from ret_from_fork+0x14/0x38
[... TRIM...]

Standardize the error handling by making the function return NULL on
all error conditions. The API is used in just the netcp_core.c so the
impact is limited.

Note, this change, in effect reverts commit 5b6cb43b4d ("net:
ethernet: ti: netcp_core: return error while dma channel open issue"),
but provides a less error prone implementation.

Suggested-by: Simon Horman <horms@kernel.org>
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251103162811.3730055-1-nm@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-04 19:15:36 -08:00