Skip to content

[Audit][Medium] captureScreenshot bypasses submitGuarded mutex and swallows errors #709

@MichaelFisher1997

Description

@MichaelFisher1997

🔍 Module Scanned

src/engine/graphics/vulkan/ (automated audit scan)

📝 Summary

The captureScreenshot function in screenshot.zig directly calls vkQueueSubmit on the graphics queue without going through the thread-safe submitGuarded wrapper that all other queue submissions use. Additionally, it swallows all errors with a generic catch-all, making debugging impossible.

📍 Location

  • File: src/engine/graphics/vulkan/screenshot.zig:157
  • Function/Scope: captureScreenshot

🔴 Severity: Medium

  • Critical: Crashes, data corruption, GPU device loss
  • High: Memory leaks, race conditions, incorrect rendering, broken features
  • Medium: Performance degradation, missing error handling, suboptimal patterns
  • Low: Code style, dead code, minor improvements

💥 Impact

If captureScreenshot is ever called concurrently with the normal render loop (e.g., from a background thread, or if the screenshot path is triggered during a frame), it would race with the guarded submitGuarded calls in FrameManager.endFrame(), breaking the mutex-based serialization guarantee and causing undefined Vulkan behavior.

Even if called only from the main thread, the function silently swallows all errors via a catch-all, meaning users cannot diagnose why screenshot capture failed.

🔎 Evidence

screenshot.zig:157:

Utils.checkVk(c.vkQueueSubmit(device.queue, 1, &submit_info, fence)) catch {
    log.log.err("screenshot: failed to submit command buffer", .{});
    return false;
};

Compare to the proper pattern in frame_manager.zig:195:

try self.vulkan_device.submitGuarded(submit_info, self.in_flight_fences[self.current_frame]);

The submitGuarded function (in modules/engine-graphics/src/vulkan_device.zig:463-477) properly:

  1. Acquires device.mutex before submission
  2. Checks for VK_ERROR_DEVICE_LOST and logs detailed fault info
  3. Returns a typed error for proper handling

screenshot.zig does none of this - it directly submits without mutex protection and catches all errors to a boolean false.

🛠️ Proposed Fix

Modify captureScreenshot to use the device mutex around the queue submission:

device.mutex.lock();
const result = c.vkQueueSubmit(device.queue, 1, &submit_info, fence);
device.mutex.unlock();

if (result == c.VK_ERROR_DEVICE_LOST) {
    log.log.err("screenshot: GPU device lost during command buffer submit", .{});
    return false;
}
if (result != c.VK_SUCCESS) {
    log.log.err("screenshot: vkQueueSubmit failed with result: {}", .{result});
    return false;
}

Also consider surfacing the error rather than returning bool so callers can handle specific failure modes.

✅ Acceptance Criteria

  • captureScreenshot acquires device.mutex before calling vkQueueSubmit
  • VK_ERROR_DEVICE_LOST is handled specially to provide diagnostic info
  • Error is propagated (or at minimum logged with specific error code) rather than swallowed
  • The fix has been verified with nix develop --command zig build test

📚 References

  • modules/engine-graphics/src/vulkan_device.zig:463-477submitGuarded implementation showing correct pattern
  • src/engine/graphics/vulkan/frame_manager.zig:195 — Correct usage of submitGuarded
  • Vulkan spec §7.2 "Thread Safety" — Queue submission synchronization requirements

Metadata

Metadata

Assignees

No one assigned

    Labels

    automated-auditIssues found by automated opencode audit scansbugSomething isn't workingdocumentationImprovements or additions to documentationenhancementNew feature or requesthotfixquestionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions