Skip to content

Conversation

@renaudhartert-db
Copy link
Contributor

@renaudhartert-db renaudhartert-db commented Dec 11, 2025

What changes are proposed in this pull request?

This PR improves the performance of the databricks fs cp command when copying directories by parallelizing file uploads. The command uses 8 concurrent workers by default but the number can be controlled via --concurrency.

Implementation details:

  • No ordering guarantee: Files are now copied in parallel with no guaranteed order (previously sequential).
  • Fail-fast on errors: If any file copy fails, the context is cancelled and remaining operations are stopped (first error is returned).
  • Retry responsibility: The implementation does not retry failed operations; this remains the responsibility of the underlying Filer implementation as before.

Why --concurrency? No strong preference here, it does not seem that there is a pattern in the CLI to control concurrency in other places. This is the flag name used in most Go tools but I'm happy to use something else.

How is this tested?

Added acceptance tests to exercise most code paths + unit tests to validate that the context cancellation and propagation works properly.

@eng-dev-ecosystem-bot
Copy link
Collaborator

eng-dev-ecosystem-bot commented Dec 11, 2025

Commit: ff6c248

Run: 20849815312

Env 🪲​BUG 🟨​KNOWN 🔄​flaky 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
🪲​ aws linux 9 9 14 2 381 678 33:23
🪲​ aws windows 9 17 6 2 383 676 30:35
🪲​ aws-ucws linux 9 5 2 18 2 530 555 61:30
🪲​ aws-ucws windows 9 13 10 2 534 553 52:42
🪲​ azure linux 9 3 15 3 381 677 37:51
🪲​ azure windows 9 11 7 3 383 675 34:08
🪲​ azure-ucws linux 9 3 15 3 528 554 63:21
🪲​ azure-ucws windows 9 11 7 3 530 552 58:26
🪲​ gcp linux 9 3 2 15 3 368 683 36:38
🪲​ gcp windows 9 11 7 3 372 681 31:46
38 interesting tests: 17 KNOWN, 9 BUG, 7 RECOVERED, 4 flaky, 1 SKIP
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🟨​ TestAccept 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
💚​ TestAccept/bundle/deployment/bind/alert 🙈​S 🙈​S 🙈​S 🙈​S 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🟨​ TestAccept/bundle/generate/alert 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestAccept/bundle/generate/alert/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestAccept/bundle/generate/alert/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
💚​ TestAccept/bundle/resources/alerts/basic 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/basic/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/basic/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/with_file 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🔄​ TestAccept/bundle/resources/grants/volumes 🙈​s 🙈​s 🔄​f ✅​p 🙈​s 🙈​s ✅​p ✅​p 🙈​s 🙈​s
🔄​ TestAccept/bundle/resources/grants/volumes/DATABRICKS_BUNDLE_ENGINE=terraform 🔄​f ✅​p ✅​p ✅​p
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 🟨​K 🟨​K 🟨​K 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🔄​ TestAccept/bundle/resources/pipelines/lakeflow-pipeline ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p
🔄​ TestAccept/bundle/resources/pipelines/lakeflow-pipeline/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p
🪲​ TestAccept/cmd/fs/cp/dir-to-dir 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/dir-to-dir/DATABRICKS_BUNDLE_ENGINE=direct 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/dir-to-dir/DATABRICKS_BUNDLE_ENGINE=terraform 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/file-to-dir 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/file-to-dir/DATABRICKS_BUNDLE_ENGINE=direct 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/file-to-dir/DATABRICKS_BUNDLE_ENGINE=terraform 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/file-to-file 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/file-to-file/DATABRICKS_BUNDLE_ENGINE=direct 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🪲​ TestAccept/cmd/fs/cp/file-to-file/DATABRICKS_BUNDLE_ENGINE=terraform 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B 🪲​B
🟨​ TestExport 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestExportWithFileFlag 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K 🟨​K
🟨​ TestImportDir 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestImportDirDoesNotOverwrite 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestImportDirWithOverwriteFlag 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestImportFileFormatAuto 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
🟨​ TestImportFileFormatSource 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K 💚​R 🟨​K
Top 50 slowest tests (at least 2 minutes):
duration env testname
7:04 aws-ucws linux TestAccept/bundle/resources/synced_database_tables/basic
6:10 aws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
6:04 aws-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
5:51 aws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:51 aws-ucws windows TestAccept/bundle/resources/synced_database_tables/basic
5:51 aws-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:42 gcp linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
5:32 gcp windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:18 azure-ucws windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:16 azure-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
5:05 azure linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
4:48 azure-ucws windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
4:31 azure-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
4:24 azure-ucws windows TestAccept/bundle/resources/secret_scopes/permissions/DATABRICKS_BUNDLE_ENGINE=terraform
4:08 azure-ucws windows TestAccept/bundle/resources/synced_database_tables/basic
4:08 azure-ucws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=no/PY=no/READPLAN=
3:59 azure-ucws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=no/PY=yes/READPLAN=
3:40 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=yes/PY=yes/READPLAN=
3:33 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=yes/PY=yes/READPLAN=
3:30 azure-ucws linux TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=direct/DLT=no/NBOOK=no/PY=yes/READPLAN=1
3:22 azure-ucws windows TestAccept/bundle/resources/registered_models/basic/DATABRICKS_BUNDLE_ENGINE=terraform
3:21 azure-ucws linux TestAccept/bundle/resources/registered_models/basic/DATABRICKS_BUNDLE_ENGINE=terraform
3:17 azure linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=yes/PY=yes/READPLAN=
3:13 aws-ucws linux TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=no/PY=yes/READPLAN=
3:12 aws-ucws windows TestAccept/bundle/resources/registered_models/basic/DATABRICKS_BUNDLE_ENGINE=terraform
3:12 azure-ucws linux TestAccept/bundle/resources/synced_database_tables/basic
3:08 aws-ucws windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=terraform
3:03 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=no/NBOOK=yes/PY=no/READPLAN=
3:02 azure windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.11
3:02 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=no/NBOOK=yes/PY=yes/READPLAN=
3:00 aws-ucws windows TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct/UV_PYTHON=3.13
2:59 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=no/PY=yes/READPLAN=
2:58 aws-ucws linux TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=direct/UV_PYTHON=3.13
2:57 azure-ucws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=yes/PY=yes/READPLAN=
2:57 azure-ucws windows TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=yes/PY=yes/READPLAN=
2:54 azure windows TestAccept/bundle/resources/experiments/basic/DATABRICKS_BUNDLE_ENGINE=terraform
2:53 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=yes/PY=no/READPLAN=
2:53 aws-ucws linux TestAccept/bundle/templates/default-python/combinations/serverless/DATABRICKS_BUNDLE_ENGINE=direct/DLT=no/NBOOK=no/PY=yes/READPLAN=1
2:52 aws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=no/PY=yes/READPLAN=
2:52 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=yes/NBOOK=yes/PY=no/READPLAN=1
2:50 azure linux TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.12
2:50 gcp linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=no/PY=yes/READPLAN=
2:50 azure-ucws linux TestAccept/bundle/templates/default-python/integration_classic/DATABRICKS_BUNDLE_ENGINE=terraform/UV_PYTHON=3.11
2:48 aws-ucws windows TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=direct/DLT=no/NBOOK=yes/PY=no/READPLAN=
2:48 gcp linux TestAccept/bundle/resources/models/basic/DATABRICKS_BUNDLE_ENGINE=direct
2:48 aws-ucws linux TestAccept/bundle/resources/experiments/basic/DATABRICKS_BUNDLE_ENGINE=terraform
2:47 gcp linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
2:46 gcp linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=yes/PY=yes/READPLAN=
2:45 aws-ucws linux TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=yes/PY=yes/READPLAN=
2:44 aws windows TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct

Copy link
Contributor

@shreyas-goenka shreyas-goenka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Issue: Context Shadowing Bug

Line 61 in cp.go shadows the context variable from line 57, causing the defer cancel() to cancel the wrong context. This breaks proper cleanup and cancellation propagation.

Issues

  1. Context shadowing at line 61: The errgroup.WithContext returns a new context that shadows the cancellable context from line 57. This means defer cancel() on line 58 will cancel the wrong context.

  2. Redundant context check at lines 90-92: The ctx.Err() check inside the goroutine is ineffective since the goroutine may start before cancellation, and cpFileToFile already handles context cancellation.

  3. Missing test coverage: TestCp_concurrencyValidation only tests invalid values. Should also test valid values (1, 16, 100) work correctly.

  4. No integration test for --concurrency flag: The new flag should have an integration test exercising different concurrency values.

Suggestions

  1. Consider removing the double context wrapping since errgroup.WithContext already provides cancellation on error.

  2. Document the no-ordering-guarantee behavior in command help text.

  3. Consider adding debug logs for performance monitoring.

Questions

  1. Why the double context wrapping (lines 57 and 61)? Is there a specific reason beyond what errgroup provides?

  2. Is concurrency=16 based on benchmarking? Different scenarios might benefit from different values.

  3. With concurrent output, messages will be interleaved. Is this acceptable UX?


Review generated by reviewbot

@shreyas-goenka shreyas-goenka self-requested a review December 12, 2025 10:56
Copy link
Contributor

@shreyas-goenka shreyas-goenka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to leave this review before going on vacation.

@@ -0,0 +1,20 @@
Local = true
Cloud = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend setting Cloud = true for this test as well. That validates the implementation against a real server. You can use fs cat to confirm that the files were uploaded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also nice to ensure that the local test server implementation matches the remote behaviour, atleast for the interfaces we own (like the fs commands)

renaudhartert-db and others added 2 commits January 9, 2026 09:33
Add missing blank lines and trailing newlines in fs cp test outputs
to match actual command output format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
The HEAD handler for /api/2.0/fs/directories now uses a simple heuristic:
if a path has a file extension (e.g., .txt, .json), it's assumed to be a
file, not a directory. This avoids the need for complex state tracking or
per-test Server overrides while handling the common case correctly.

Also regenerated acceptance test output files to include proper blank lines
between command outputs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Added comprehensive documentation explaining:
- WHY: test server doesn't track UC Volumes state
- WHAT: the problem of checking paths before they exist
- HOW: file extension heuristic solves the common case
- Assumptions and limitations

Also simplified code by inlining lastSlash comparison and ensured
all comments start with capital letters and end with periods.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
renaudhartert-db and others added 3 commits January 9, 2026 10:53
The file extension heuristic is now only applied to Unity Catalog Volumes
paths (/Volumes/...), which makes semantic sense since:

1. Workspace files/directories: use tracked state (correct)
2. Volume paths: use heuristic (necessary - we don't track volume state)
3. Non-existent workspace paths: return 404 (correct)

Previously, the handler returned 200 for ALL non-existent paths, which was
semantically incorrect. Now it only assumes directories exist for volume
paths where we genuinely don't have state to check.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Apply file extension heuristic only to /Volumes/ paths where state
isn't tracked. Return correct 404 for non-existent workspace paths.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Resolved conflict by keeping volume-specific file extension heuristic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants