
Conversation

@wdauchy
Contributor

@wdauchy wdauchy commented Jan 5, 2026

Commit Message:
I am chasing a case where a client sends POST requests on an HTTP/1.1 connection but is systematically denied. The connection is immediately reused by the client. In some cases we seem to trigger a race where the number of open file descriptors explodes.

When a stream becomes a zombie (waiting for codec encode completion) and the connection is in DrainState::Closing, the connection is not closed, because checkForDeferredClose() is not called after the zombie stream is finally destroyed.

This commonly occurs with HTTP/1.1 connections when:

  • ext_authz (or similar filter) denies a request early, sending a 403 response before the full request body is received
  • The connection is (potentially) marked DrainState::Closing
  • The stream becomes a zombie waiting for the codec to finish encoding
  • When onCodecEncodeComplete() fires, doDeferredStreamDestroy() is called but checkForDeferredClose() is not, leaving the connection open

In scenarios where clients try to reuse the same HTTP/1.1 connection and requests are systematically denied (e.g., by ext_authz), this causes rapid FD exhaustion as each denied request leaves an orphaned connection.

The fix adds checkForDeferredClose() calls after zombie stream destruction in both onCodecEncodeComplete() and onCodecLowLevelReset().
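The state machine described above can be modeled with a minimal, self-contained sketch (hypothetical simplified types, not Envoy's actual classes): without the added checkForDeferredClose() call after zombie destruction, a draining connection whose last stream dies as a zombie is never closed.

```cpp
#include <cassert>

// Hypothetical, simplified model of the bug and fix described above;
// names mirror Envoy's but the types are illustrative only.
enum class DrainState { NotDraining, Closing };

struct Connection {
  DrainState drain_state_ = DrainState::NotDraining;
  int active_streams_ = 0;
  bool closed_ = false;

  // Stands in for checkForDeferredClose(): close only when the connection
  // is draining and no streams remain.
  void checkForDeferredClose() {
    if (drain_state_ == DrainState::Closing && active_streams_ == 0) {
      closed_ = true;
    }
  }

  // Stands in for the fixed onCodecEncodeComplete(): destroy the zombie
  // stream, then re-check whether the connection can now be closed.
  void onCodecEncodeComplete() {
    --active_streams_;        // doDeferredStreamDestroy()
    checkForDeferredClose();  // the call this PR adds
  }
};
```

In this model, the checkForDeferredClose() performed while the zombie stream is still alive is a no-op (streams_ is not empty), so the re-check after destruction is what finally closes the connection.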
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

@repokitteh-read-only

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #42859 was opened by wdauchy.


Signed-off-by: William Dauchy <[email protected]>
@botengyao
Member

botengyao commented Jan 6, 2026

@KBaichoo, Kevin, do you also want to take a look at this?

Member

@botengyao botengyao left a comment


Thanks for fixing this; I left a high-level question.

/wait

// closed. doEndStream() call that created the zombie may have set
// drain_state_ to Closing, but checkForDeferredClose() couldn't close the
// connection at that time because streams_ wasn't empty yet.
connection_manager_.checkForDeferredClose(false);
Member


then, should we distinguish all the conditions like in the doEndStream

if (check_for_deferred_close) {
  // If HTTP/1.0 has no content length, it is framed by close and won't consider
  // the request complete until the FIN is read. Don't delay close in this case.
  const bool http_10_sans_cl =
      (codec_->protocol() == Protocol::Http10) &&
      (!stream.response_headers_ || !stream.response_headers_->ContentLength());
  // We also don't delay-close in the case of HTTP/1.1 where the request is
  // fully read, as there's no race condition to avoid.
  const bool connection_close =
      stream.filter_manager_.streamInfo().shouldDrainConnectionUponCompletion();
  const bool request_complete = stream.filter_manager_.hasLastDownstreamByteReceived();
  // Don't do delay close for HTTP/1.0 or if the request is complete.
  checkForDeferredClose(connection_close && (request_complete || http_10_sans_cl));
}
to call checkForDeferredClose(false) here as well?
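One hypothetical way to keep the two call sites consistent (a sketch only, not the PR's actual code; the stand-in types and the helper name are assumptions) is to factor the quoted condition into a shared helper:

```cpp
// Illustrative stand-ins for the Envoy types involved; field names are
// assumptions that mirror the accessors in the condition quoted above.
enum class Protocol { Http10, Http11 };

struct StreamModel {
  bool has_content_length = false;   // response_headers_->ContentLength()
  bool drain_on_completion = false;  // shouldDrainConnectionUponCompletion()
  bool request_complete = false;     // hasLastDownstreamByteReceived()
};

// Hypothetical helper mirroring doEndStream()'s logic: skip the delayed
// close for HTTP/1.0 responses framed by close, or when the request was
// fully read (no race condition to avoid).
bool shouldSkipDeferredCloseDelay(Protocol protocol, const StreamModel& stream) {
  const bool http_10_sans_cl =
      protocol == Protocol::Http10 && !stream.has_content_length;
  return stream.drain_on_completion &&
         (stream.request_complete || http_10_sans_cl);
}
```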

Contributor Author


Good point. When doEndStream() creates a zombie stream, it does call checkForDeferredClose() with the computed skip_delay_close value, but that call is a no-op because streams_ is possibly not empty yet, so the computed value is effectively lost.

I chose checkForDeferredClose(false) to avoid duplicating that conditional logic, accepting the tradeoff that we always use the delay close path even when it's not strictly necessary.

If you prefer consistency with doEndStream(), I can add the same conditional logic here but I am not sure this is 100% needed in the zombie case. Alternatively, we could store the computed skip_delay_close value in stream state when it becomes a zombie and reuse it later. Let me know which approach you'd prefer.
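The "store the computed value" alternative mentioned above could look roughly like this (a hypothetical sketch; the field and method names are assumptions, not the PR's code):

```cpp
// Hypothetical sketch: when doEndStream() turns the stream into a zombie,
// cache the skip_delay_close decision so it is not lost, and replay it
// when the zombie stream is finally destroyed.
struct ActiveStream {
  bool is_zombie_stream_ = false;
  bool deferred_close_skip_delay_ = false;  // assumed new field
};

struct ConnectionManager {
  bool deferred_close_checked_ = false;
  bool last_skip_delay_ = false;

  // Stand-in for checkForDeferredClose(bool skip_delay_close).
  void checkForDeferredClose(bool skip_delay) {
    deferred_close_checked_ = true;
    last_skip_delay_ = skip_delay;
  }

  // In doEndStream(), when the stream becomes a zombie: cache the value.
  void markZombie(ActiveStream& stream, bool skip_delay_close) {
    stream.is_zombie_stream_ = true;
    stream.deferred_close_skip_delay_ = skip_delay_close;
  }

  // In onCodecEncodeComplete(), after doDeferredStreamDestroy(): replay it.
  void onZombieDestroyed(const ActiveStream& stream) {
    checkForDeferredClose(stream.deferred_close_skip_delay_);
  }
};
```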

Member

@botengyao botengyao Jan 7, 2026


Yes, we would need to keep the logic consistent with the existing logic, and maybe refactor it a bit. There is a long history behind the deferred close, as these conditions were added over time.

Contributor Author


Thanks for the feedback. I updated the PR accordingly.

@botengyao botengyao self-assigned this Jan 7, 2026
@wdauchy wdauchy marked this pull request as ready for review January 7, 2026 18:16
@wdauchy wdauchy force-pushed the fd_leak branch 6 times, most recently from 495f026 to c96a9be Compare January 8, 2026 08:42
@wdauchy wdauchy requested a review from botengyao January 8, 2026 10:34
@wdauchy
Contributor Author

wdauchy commented Jan 8, 2026

/retest transients

Member

@botengyao botengyao left a comment


/wait

if (state_.is_zombie_stream_) {
const bool skip_delay = shouldSkipDeferredCloseDelay();
connection_manager_.doDeferredStreamDestroy(*this);
// After destroying a zombie stream, check if the connection should be
Member


Do you mind adding a runtime guard and a release note? I think this will be a high risk change since the release will be next week, does merging it after early next week work for you?

