tcp: fix tcp_packet_delayed() for tcp_is_non_sack_preventing_reopen() behavior

[ Upstream commit d0fa59897e049e84432600e86df82aab3dce7aa5 ]

After the following commit from 2024:

commit e37ab7373696 ("tcp: fix to allow timestamp undo if no retransmits were sent")

...there was buggy behavior where TCP connections without SACK support
could easily see erroneous undo events at the end of fast recovery or
RTO recovery episodes. The erroneous undo events could cause those
connections to suffer repeated loss recovery episodes and high
retransmit rates.

The problem was an interaction between the non-SACK behavior on these
connections and the undo logic. For non-SACK connections at the end of
a loss recovery episode, if snd_una == high_seq, then
tcp_is_non_sack_preventing_reopen() keeps the connection in
CA_Recovery or CA_Loss, but clears tp->retrans_stamp to 0. Upon the
next ACK, the "tcp: fix to allow timestamp undo if no retransmits were
sent" logic saw tp->retrans_stamp at 0, erroneously concluded that no
data was retransmitted, and erroneously performed an undo of the cwnd
reduction, restoring cwnd immediately to the value it had before loss
recovery. This caused an immediate burst of traffic, a build-up of
queues, and likely another immediate loss recovery episode.

This commit fixes tcp_packet_delayed() to ignore zero retrans_stamp
values for non-SACK connections when snd_una is at or above high_seq,
because tcp_is_non_sack_preventing_reopen() clears retrans_stamp in
this case, so it's not a valid signal that we can undo.
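
As a rough illustration of the change described above, the following is a
minimal user-space sketch of the decision logic before and after this
commit. It is not kernel code: struct model_sock, its fields, and the
packet_delayed_old()/packet_delayed_new() names are hypothetical stand-ins
for struct tcp_sock, tcp_is_sack(), sk_state and tcp_tsopt_ecr_before(),
and seq_before() is assumed to mirror the kernel's before() helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct tcp_sock fields involved. */
struct model_sock {
	uint32_t retrans_stamp;		/* 0 means "no retransmit recorded" */
	uint32_t snd_una;		/* oldest unacknowledged sequence */
	uint32_t high_seq;		/* snd_nxt when recovery started */
	bool sack_ok;			/* models tcp_is_sack() */
	bool syn_sent;			/* models sk_state == TCP_SYN_SENT */
	bool ecr_before_retrans;	/* models tcp_tsopt_ecr_before() */
};

/* Wraparound-safe "a precedes b", assumed to match the kernel's before(). */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Logic before this commit: any zero retrans_stamp outside SYN_SENT
 * is treated as "nothing was retransmitted".
 */
bool packet_delayed_old(const struct model_sock *s)
{
	if (s->retrans_stamp && s->ecr_before_retrans)
		return true;
	if (!s->retrans_stamp && !s->syn_sent)
		return true;
	return false;
}

/* Logic after this commit: a zero retrans_stamp is ignored for non-SACK
 * connections once snd_una has reached high_seq, and in SYN_SENT.
 */
bool packet_delayed_new(const struct model_sock *s)
{
	if (s->retrans_stamp)
		return s->ecr_before_retrans;
	if (!s->sack_ok && !seq_before(s->snd_una, s->high_seq))
		return false;
	if (s->syn_sent)
		return false;
	return true;
}
```

In this model, a non-SACK socket with retrans_stamp == 0 and snd_una ==
high_seq makes packet_delayed_old() return true (the erroneous undo
signal), while packet_delayed_new() returns false. With sack_ok set, the
new logic still returns true for the same state.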

Note that the commit named in the Fixes footer restored long-present
behavior from roughly 2005-2019, so this bug was apparently present
for a while during that era and was simply not caught.

Fixes: e37ab7373696 ("tcp: fix to allow timestamp undo if no retransmits were sent")
Reported-by: Eric Wheeler <netdev@lists.ewheeler.net>
Closes: https://lore.kernel.org/netdev/64ea9333-e7f9-0df-b0f2-8d566143acab@ewheeler.net/
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Co-developed-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This commit is contained in:
Neal Cardwell
2025-06-13 15:30:56 -04:00
committed by Greg Kroah-Hartman
parent 3261c017a7
commit 9d0ddfb574

@@ -2463,20 +2463,33 @@ static inline bool tcp_packet_delayed(const struct tcp_sock *tp)
 {
 	const struct sock *sk = (const struct sock *)tp;
 
-	if (tp->retrans_stamp &&
-	    tcp_tsopt_ecr_before(tp, tp->retrans_stamp))
-		return true;  /* got echoed TS before first retransmission */
-
-	/* Check if nothing was retransmitted (retrans_stamp==0), which may
-	 * happen in fast recovery due to TSQ. But we ignore zero retrans_stamp
-	 * in TCP_SYN_SENT, since when we set FLAG_SYN_ACKED we also clear
-	 * retrans_stamp even if we had retransmitted the SYN.
+	/* Received an echoed timestamp before the first retransmission? */
+	if (tp->retrans_stamp)
+		return tcp_tsopt_ecr_before(tp, tp->retrans_stamp);
+
+	/* We set tp->retrans_stamp upon the first retransmission of a loss
+	 * recovery episode, so normally if tp->retrans_stamp is 0 then no
+	 * retransmission has happened yet (likely due to TSQ, which can cause
+	 * fast retransmits to be delayed). So if snd_una advanced while
+	 * (tp->retrans_stamp is 0 then apparently a packet was merely delayed,
+	 * not lost. But there are exceptions where we retransmit but then
+	 * clear tp->retrans_stamp, so we check for those exceptions.
 	 */
-	if (!tp->retrans_stamp &&	   /* no record of a retransmit/SYN? */
-	    sk->sk_state != TCP_SYN_SENT)  /* not the FLAG_SYN_ACKED case? */
-		return true;  /* nothing was retransmitted */
 
-	return false;
+	/* (1) For non-SACK connections, tcp_is_non_sack_preventing_reopen()
+	 * clears tp->retrans_stamp when snd_una == high_seq.
+	 */
+	if (!tcp_is_sack(tp) && !before(tp->snd_una, tp->high_seq))
+		return false;
+
+	/* (2) In TCP_SYN_SENT tcp_clean_rtx_queue() clears tp->retrans_stamp
+	 * when setting FLAG_SYN_ACKED is set, even if the SYN was
+	 * retransmitted.
+	 */
+	if (sk->sk_state == TCP_SYN_SENT)
+		return false;
+
+	return true;	/* tp->retrans_stamp is zero; no retransmit yet */
 }
 
 /* Undo procedures. */
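
The new check uses !before(tp->snd_una, tp->high_seq), i.e. "snd_una is at
or past high_seq" in 32-bit sequence space rather than a plain integer
comparison. The sketch below shows why that matters near wraparound.
seq_before() is assumed to mirror the kernel's before() helper, and
una_reached_high_seq() is a hypothetical name used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's before(): true if seq1 precedes seq2
 * modulo 2^32. Relies on the usual two's-complement conversion to int32_t.
 */
bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Hypothetical helper: the "snd_una at or above high_seq" test. */
bool una_reached_high_seq(uint32_t snd_una, uint32_t high_seq)
{
	return !seq_before(snd_una, high_seq);
}
```

For example, with high_seq = 0xFFFFFFF0 and snd_una = 0x00000010 (32 bytes
later, after the sequence number wrapped), una_reached_high_seq() is true
even though a naive snd_una >= high_seq comparison would be false.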