fix(cluster): tolerate partial mesh and cluster state during delete (#52) #57

Merged
dennisklein merged 2 commits into next from bugfix-52 on May 21, 2026

@dennisklein
Member

Closes #52.

  • writeDNSEntries skips the DNS-container reload when sind-dns is not
    running; the updated Corefile is loaded on next start
  • DeleteContainers / DeleteNetwork / DeleteVolumes swallow
    IsNotFound with a warning; non-IsNotFound errors still abort
  • DeregisterMesh wraps DNS-record and known_hosts removal with
    warn-and-continue
  • mesh CleanupMesh helpers drop the redundant existence precheck and
    swallow IsNotFound from the actual remove, closing the TOCTOU window
  • redundant if err := DeregisterMesh(...); err != nil checks in
    deleteClusterResources and WorkerRemove are removed
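
For illustration, the warn-and-continue shape used for the DNS-record and known_hosts removal could look roughly like the sketch below. It is not the code from this diff: `step`, `runBestEffort`, and `errHelperGone` are hypothetical names, and the standard `log` package stands in for whatever logger sind actually uses.

```go
package main

import (
	"errors"
	"log"
)

// errHelperGone is a hypothetical error standing in for a mesh helper
// that has already been stopped or removed.
var errHelperGone = errors.New("mesh helper unreachable")

// step is a hypothetical named cleanup action.
type step struct {
	name string
	run  func() error
}

// runBestEffort runs every step in order. A failing step is logged as a
// warning and does not stop the remaining steps. It returns the number
// of steps that failed, so a caller can still report partial cleanup.
func runBestEffort(steps []step) int {
	failed := 0
	for _, s := range steps {
		if err := s.run(); err != nil {
			log.Printf("warning: %s failed, continuing: %v", s.name, err)
			failed++
		}
	}
	return failed
}
```

Because a helper of this shape never returns an error, a caller-side `if err := ...; err != nil` check would have nothing to test, which is in line with the last bullet above.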

- writeDNSEntries now inspects the DNS container after writing the new
  Corefile and skips the kill/start reload if the container is not
  running; the updated Corefile is loaded when it next starts
- previously `docker kill sind-dns` failed with exit 1 against a stopped
  container, breaking `sind delete cluster` (issue #52) and the symmetric
  `sind create cluster` path
- cluster test mock dispatchers now seed an inspect→running result
  between the Corefile write and the reload kill
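
A minimal sketch of the inspect-then-reload guard described above, under stated assumptions: `containerRuntime`, `fakeRuntime`, and `reloadIfRunning` are hypothetical names introduced here, and the real writeDNSEntries talks to Docker through sind's own client rather than this interface.

```go
package main

import "fmt"

// containerRuntime is a hypothetical stand-in for the container client.
type containerRuntime interface {
	IsRunning(name string) (bool, error)
	Kill(name string) error
	Start(name string) error
}

// fakeRuntime records calls so the guard can be exercised without Docker.
type fakeRuntime struct {
	running bool
	calls   []string
}

func (f *fakeRuntime) IsRunning(string) (bool, error) { return f.running, nil }
func (f *fakeRuntime) Kill(string) error              { f.calls = append(f.calls, "kill"); return nil }
func (f *fakeRuntime) Start(string) error             { f.calls = append(f.calls, "start"); return nil }

// reloadIfRunning performs the kill/start reload only when the container
// is running. For a stopped container it does nothing and reports false;
// the new configuration is then picked up on the next start.
func reloadIfRunning(rt containerRuntime, name string) (bool, error) {
	running, err := rt.IsRunning(name)
	if err != nil {
		return false, fmt.Errorf("inspecting %s: %w", name, err)
	}
	if !running {
		return false, nil
	}
	if err := rt.Kill(name); err != nil {
		return false, fmt.Errorf("killing %s: %w", name, err)
	}
	if err := rt.Start(name); err != nil {
		return false, fmt.Errorf("starting %s: %w", name, err)
	}
	return true, nil
}
```

With a stopped `sind-dns`, the guard issues no kill at all, which avoids the exit 1 from `docker kill` that broke the delete path.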

Generalises issue #52: the delete orchestrator now treats already-gone
resources and unreachable mesh helpers as warnings instead of fatal errors.
A user who has run `docker stop`, `docker rm`, or `docker network rm` on
parts of a cluster between sind invocations can still complete the
teardown.

- DeleteContainers / DeleteNetwork / DeleteVolumes swallow IsNotFound
  with a warning; other errors still abort
- DeregisterMesh wraps RemoveDNSRecords and RemoveKnownHosts with
  warn-and-continue; failing to update a torn-down helper is no longer
  fatal
- mesh CleanupMesh's removeContainerIfExists / removeNetworkIfExists /
  removeVolumeIfExists drop the redundant existence precheck and swallow
  IsNotFound from the actual remove, closing the TOCTOU window between
  inspect and rm
- redundant `if err := DeregisterMesh(...); err != nil` checks in
  deleteClusterResources and WorkerRemove are removed
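
The remove-without-precheck pattern from the list above can be sketched as follows. This is a hedged illustration, not the merged code: `errNotFound`, `errPermission`, `isNotFound`, and `removeTolerant` are hypothetical names, and the real IsNotFound check may classify errors differently.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errNotFound is a hypothetical sentinel for "resource does not exist".
var errNotFound = errors.New("no such object")

// errPermission is an example of an error that must still abort.
var errPermission = errors.New("permission denied")

// isNotFound reports whether err, or anything it wraps, is errNotFound.
func isNotFound(err error) bool { return errors.Is(err, errNotFound) }

// removeTolerant calls remove directly, with no existence check first.
// Checking and then removing leaves a window in which the resource can
// vanish; removing and classifying the error afterwards does not.
func removeTolerant(kind, name string, remove func(string) error) error {
	err := remove(name)
	switch {
	case err == nil:
		return nil
	case isNotFound(err):
		// Already gone: the desired end state holds, so only warn.
		log.Printf("warning: %s %q already removed, skipping", kind, name)
		return nil
	default:
		return fmt.Errorf("removing %s %q: %w", kind, name, err)
	}
}
```

The same helper shape would cover containers, networks, and volumes, since only the `remove` callback differs.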

@dennisklein merged commit 60cd702 into next on May 21, 2026
5 checks passed
@dennisklein deleted the bugfix-52 branch on May 21, 2026 11:35