Author SHA1 Message Date
Venkat Chimata
fcc965567f WIFI-14998: wifi: ap: mitigate peer-delete WMI timeout to reduce blind period & prevent peer leaks
1. When a connected client roams to another AP, the AP tries to delete the peer,
   but the WMI command times out for an unknown reason. While the driver waits for
   the response, the AP does not respond to any frames from the STA (probe
   requests, authentication, etc.). Once the wait times out (3 seconds by default),
   the AP starts responding to the stale requests, but by then the client has
   already connected to another AP. Since the root cause of the timeout is in the
   FW, add a WAR that reduces the timeout to shorten this blind period. With this
   change the AP responds after 100 ms and the client connects successfully. A
   100 ms timeout is also reasonable for this internal operation.
2. When peer deletion times out, the driver peer database is not cleared. If this
   happens often (which it does), the driver eventually hits its max-peer limit
   and all subsequent operations fail. So, on timeout, ignore the failure and
   proceed with the driver peer database cleanup.

Signed-off-by: Venkat Chimata <venkat@nearhop.com>
2025-09-04 00:16:00 +05:30
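
The peer leak described in item 2 can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `MAX_PEERS`, `struct peer_table`, `peer_create()`, `peer_delete()` and `successful_connects()` are hypothetical names invented for illustration, not ath11k code, and the firmware is modeled as never answering a delete request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_PEERS 4            /* hypothetical limit, for illustration only */

struct peer_table {
    int num_peers;             /* entries currently held by the driver */
};

/* Stand-in for waiting on the firmware's peer-delete response. */
static int wait_for_delete_done(bool fw_responds)
{
    return fw_responds ? 0 : -ETIMEDOUT;
}

static int peer_create(struct peer_table *t)
{
    if (t->num_peers >= MAX_PEERS)
        return -ENOBUFS;       /* table full: new clients are rejected */
    t->num_peers++;
    return 0;
}

/*
 * With tolerate_timeout == false, a timeout returns early and the entry
 * stays in the table. With tolerate_timeout == true, a timeout is ignored
 * and the entry is cleaned up anyway; other errors still return early.
 */
static int peer_delete(struct peer_table *t, bool fw_responds,
                       bool tolerate_timeout)
{
    int ret = wait_for_delete_done(fw_responds);

    if (ret && !(tolerate_timeout && ret == -ETIMEDOUT))
        return ret;

    assert(t->num_peers > 0);
    t->num_peers--;
    return 0;
}

/* Count how many connect/roam cycles manage to create a peer when the
 * modeled firmware never answers the delete request. */
static int successful_connects(int rounds, bool tolerate_timeout)
{
    struct peer_table t = { 0 };
    int ok = 0;

    for (int i = 0; i < rounds; i++) {
        if (peer_create(&t) == 0) {
            ok++;
            (void)peer_delete(&t, false, tolerate_timeout);
        }
    }
    return ok;
}
```

In this model, returning early on timeout fills the table after `MAX_PEERS` cycles and every later connect fails, while ignoring the timeout keeps the table from growing.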

@@ -0,0 +1,53 @@
From 375d0d25e6c02991392e44956c81cbac84909f49 Mon Sep 17 00:00:00 2001
From: Venkat Chimata <venkat@nearhop.com>
Date: Thu, 4 Sep 2025 00:09:17 +0530
Subject: [PATCH] wifi: ap: mitigate peer-delete WMI timeout to reduce blind
period & prevent peer leaks
1. When a connected client roams to another AP, the AP tries to delete the peer,
but the WMI command times out for an unknown reason. While the driver waits for
the response, the AP does not respond to any frames from the STA (probe
requests, authentication, etc.). Once the wait times out (3 seconds by default),
the AP starts responding to the stale requests, but by then the client has
already connected to another AP. Since the root cause of the timeout is in the
FW, add a WAR that reduces the timeout to shorten this blind period. With this
change the AP responds after 100 ms and the client connects successfully. A
100 ms timeout is also reasonable for this internal operation.
2. When peer deletion times out, the driver peer database is not cleared. If this
happens often (which it does), the driver eventually hits its max-peer limit
and all subsequent operations fail. So, on timeout, ignore the failure and
proceed with the driver peer database cleanup.
Signed-off-by: Venkat Chimata <venkat@nearhop.com>
---
drivers/net/wireless/ath/ath11k/peer.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/peer.c b/drivers/net/wireless/ath/ath11k/peer.c
index 1907067..aefc6ba 100644
--- a/drivers/net/wireless/ath/ath11k/peer.c
+++ b/drivers/net/wireless/ath/ath11k/peer.c
@@ -771,7 +771,7 @@ int ath11k_wait_for_peer_delete_done(struct ath11k *ar, u32 vdev_id,
}
time_left = wait_for_completion_timeout(&ar->peer_delete_done,
- 3 * HZ);
+ 100 * HZ / 1000);
if (time_left == 0) {
ath11k_warn(ar->ab, "Timeout in receiving peer delete response\n");
return -ETIMEDOUT;
@@ -857,7 +857,10 @@ int ath11k_peer_delete(struct ath11k *ar, u32 vdev_id, u8 *addr)
}
ret = ath11k_wait_for_peer_delete_done(ar, vdev_id, addr);
- if (ret)
+ /* WAR: On timeout, proceed to delete the peer anyway, since the FW is
+ * still functional. Otherwise the driver ends up hitting max peers.
+ */
+ if (ret && ret != -ETIMEDOUT)
return ret;
ATH11K_MEMORY_STATS_DEC(ar->ab, per_peer_object,
--
2.34.1
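
The timeout expression in the first hunk, `100 * HZ / 1000`, can be checked outside the kernel with a small userspace sketch. Here `hz` is passed as a parameter rather than taken from the kernel configuration, and both function names are hypothetical. The second helper shows a round-up conversion for comparison; inside the kernel, `msecs_to_jiffies(100)` would be the more conventional way to write the same timeout, though the patch does not use it.

```c
#include <assert.h>

/* Same integer expression as the patch, with HZ supplied by the caller. */
static unsigned long patch_timeout_ticks(unsigned long hz)
{
    assert(hz > 0);
    return 100 * hz / 1000;
}

/* Hypothetical helper: convert milliseconds to ticks, rounding up so a
 * nonzero timeout never truncates to fewer ticks than requested. */
static unsigned long ms_to_ticks_round_up(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return (ms * hz + 999) / 1000;
}
```

For tick rates of 100, 250, 300 and 1000, the patch expression divides evenly and yields exactly 100 ms worth of ticks. The two forms differ only for durations that are not a whole number of ticks, where plain integer division truncates.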