Introduce deadline for init containers#4960
Conversation
Greptile SummaryThis PR adds a
Confidence Score: 4/5The runtime logic is sound, but two newly-added tests will fail as written. The production path for init-container deadline enforcement is correct. However, two tests — Test_GetAction_InitContainerDeadlineUsesFirstInitStart and Test_GetAction_InitContainerDeadlineRequiresAllRegularInitContainersToComplete — will fail at assertion time: the expected message uses lowercase 'init' while the code emits uppercase 'Init', and both expected strings omit the running-container suffix that is appended whenever state.current is set. Both pods in those tests have a running init container, so the suffix is always appended. CI will catch this before the code ships, but the tests need to be updated before the PR can merge. internal/executor/podchecks/pod_checks_test.go — lines 119 and 160 contain assertions that don't match the message the production code produces. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[GetAction called] --> B[LastStatusChange]
B -->|error| C[return ActionWait]
B -->|ok| D{deadlineForInitContainers > 0?}
D -->|no| H[compute timeInState]
D -->|yes| E[getInitContainerState]
E --> F{first != nil AND last == nil AND elapsed > deadline?}
F -->|no| H
F -->|yes| G[return ActionFail / PodStartupIssue]
H --> I{timeInState > deadlineForNodeAssignment AND not scheduled?}
I -->|yes| J[return ActionRetry / NoNodeAssigned]
I -->|no| K{isBadNode?}
K -->|yes + over deadline| L[return ActionRetry / NoStatusUpdates]
K -->|yes + under deadline| M[return ActionWait / NoStatusUpdates]
K -->|no| N[eventChecks + containerStateChecks]
N --> O[return maxAction result]
Reviews (2): Last reviewed commit: "pod checks extended message" | Re-trigger Greptile |
|
|
||
| func Test_GetAction_BadNodeButUnderTimeLimit(t *testing.T) { | ||
| podChecks := podChecksWithMocks(ActionWait, ActionWait) | ||
| result, cause, message := podChecks.GetAction(createBasicPod(true), []*v1.Event{}, 10*time.Second) | ||
| result, cause, message := podChecks.GetAction(createBasicPodInStateFor(true, 10*time.Second), []*v1.Event{}) | ||
| assert.Equal(t, result, ActionWait) |
There was a problem hiding this comment.
Missing test for init container in Waiting/CrashLoopBackoff state
There is test coverage for Waiting (before first start), Running, and Terminated (exit 0 and multi-container partial success), but no test for the state where an init container previously ran with a non-zero exit code and is now in Waiting state (CrashLoopBackoff). That scenario exercises the LastState gap described in the sibling comment on pod_checks.go, and a test would confirm whether the deadline is expected to fire or not during the backoff window.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Signed-off-by: Nikola Jokic <jokicnikola07@gmail.com>
|
|
||
| assert.Equal(t, ActionFail, result) | ||
| assert.Equal(t, PodStartupIssue, cause) | ||
| assert.Equal(t, "init containers did not complete within deadline of 1m0s", message) |
There was a problem hiding this comment.
Test assertions will fail — message case mismatch and missing running-container suffix
Two new tests assert on the exact message string but it won't match what the code produces:
-
Case mismatch:
pod_checks.goline 67 writes"Init containers did not complete…"(capital I), while both assertions here (line 119) and at line 160 expect"init containers did not complete…"(lowercase i). -
Truncated expected string: When
state.current != nil(true for both test scenarios — the pods have a running or running-as-second init container), the code appends"\nInit container <name> has been running for <duration>". The expected strings don't include this suffix, soassert.Equalwill fail even after fixing the case.
Both Test_GetAction_InitContainerDeadlineUsesFirstInitStart and Test_GetAction_InitContainerDeadlineRequiresAllRegularInitContainersToComplete are affected. The simplest fix is to update the expected strings (or use assert.Contains for the prefix, and check the suffix separately), or update the production message to lowercase and include the running-container details in the expected strings.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
What type of PR is this?
Enhancement
What this PR does / why we need it
Init containers could take time that is counted in the startup time of main containers. This change includes init containers so that checks related to the main container are based on the time where the last transition time occurred, including the init container state change.
The deadline doesn't include side-car containers (init containers with restart policy always).