Skip to content

increase ulimit for cutadapt#14

Open
vmikk wants to merge 1 commit intotest-v1.2.0from
demux
Open

increase ulimit for cutadapt#14
vmikk wants to merge 1 commit intotest-v1.2.0from
demux

Conversation

@vmikk
Copy link
Copy Markdown
Member

@vmikk vmikk commented Mar 9, 2026

Demultiplexing fails with ERROR: [Errno 24] Too many open files

image

This is related to the number of open files allowed for a user.
E.g., for 24 forward and 24 reverse adapters and paired-end data, you need about 25252=1250 file descriptors, plus a couple for communication between each process (ref).
The default ulimit is 1024.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a Too many open files error that occurs during demultiplexing when cutadapt needs more file descriptors than the default ulimit of 1024 allows. When many adapters are used (e.g., 24 forward × 24 reverse in paired-end mode requires ~1250 file descriptors), the process fails. The fix raises the soft open-file limit to 6000 before cutadapt is invoked.

Changes:

  • Raise the soft ulimit for open files to 6000 in the single-end demultiplexing script
  • Raise the soft ulimit for open files to 6000 in the paired-end demultiplexing script

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/pipecraft-core/service_scripts/demux_single_end_data.sh Adds ulimit -S -n 6000 before the workflow starts to prevent "Too many open files" errors during single-end demultiplexing
src/pipecraft-core/service_scripts/demux_paired_end_data.sh Adds ulimit -S -n 6000 before the workflow starts to prevent "Too many open files" errors during paired-end demultiplexing

Two moderate concerns with this approach:

  1. Soft vs. hard limit: ulimit -S only modifies the soft limit. If the Docker container's hard limit for open files happens to be below 6000 (which can occur depending on the Docker/host configuration since no explicit Ulimits are set in HostConfig when containers are created), the command will fail and the actual limit will remain at the default 1024, meaning the bug is not fixed in those environments.

  2. Hardcoded value may be insufficient for large multiplexing runs: The value 6000 is adequate for the example in the PR description (24×24 adapters, paired-end → ~1250 FDs), but for larger experiments (e.g., 96×96 dual adapters on paired-end data), the required file descriptors could exceed 6000. A more conservative choice like 65536 (a common production default) would make this more future-proof.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

run_python_module=$"python3 /scripts/submodules/assign_sample_names.demuxModule.py $indexes_file"

# Increase the number of open files limit
ulimit -S -n 6000
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command ulimit -S -n 6000 only raises the soft limit for open files. The soft limit can only be raised up to the current hard limit. If the container's hard limit is lower than 6000 (which on some Docker/OS configurations may equal the default of 1024), this command will fail with an error like bash: ulimit: open files: cannot modify limit: Operation not permitted or silently do nothing.

To ensure this works reliably across all deployment environments, consider also setting the hard limit (ulimit -H -n 6000), which requires root/privileged access (common in containers), or — more robustly — setting the nofile ulimit at the Docker container creation level in HostConfig.Ulimits. Additionally, if users run large multiplexing experiments (e.g., 96×96 adapters for paired-end), they would need 96×96×2 ≈ 18,432 file descriptors, which would exceed the hardcoded 6000 limit. Consider making this value dynamically computed based on the number of adapters or using a higher ceiling such as 65536.

Suggested change
ulimit -S -n 6000
# Allow override via DESIRED_NOFILE; default to a higher ceiling suitable for large multiplexing
DESIRED_NOFILE="${DESIRED_NOFILE:-65536}"
# Get current hard limit
current_hard_limit=$(ulimit -H -n 2>/dev/null || echo "")
if [ -n "$current_hard_limit" ]; then
# Try to raise hard limit to desired value (may fail without sufficient privileges)
if [ "$current_hard_limit" != "unlimited" ] && [ "$current_hard_limit" -lt "$DESIRED_NOFILE" ] 2>/dev/null; then
ulimit -H -n "$DESIRED_NOFILE" 2>/dev/null || true
current_hard_limit=$(ulimit -H -n 2>/dev/null || echo "")
fi
# Clamp soft limit to the (possibly updated) hard limit
target_soft_limit="$DESIRED_NOFILE"
if [ "$current_hard_limit" != "unlimited" ] && [ "$current_hard_limit" -lt "$target_soft_limit" ] 2>/dev/null; then
target_soft_limit="$current_hard_limit"
fi
ulimit -S -n "$target_soft_limit" 2>/dev/null || true
fi

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +46
# Increase the number of open files limit
ulimit -S -n 6000

Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern as in the paired-end script: ulimit -S -n 6000 only raises the soft limit. If the container's hard limit is below 6000, this command will fail. For a single-end scenario with many adapters (e.g., 96+), even 6000 might not be sufficient (96 adapters × 1 output file each + communication overhead could be fine, but the risk grows with scale).

Consider using a higher, more future-proof value (e.g., 65536) or dynamically computing the required value based on the adapter count.

Suggested change
# Increase the number of open files limit
ulimit -S -n 6000
# Increase the number of open files limit (respect hard limit and avoid lowering current soft limit)
desired_nofiles=65536
hard_limit=$(ulimit -H -n 2>/dev/null)
soft_limit=$(ulimit -S -n 2>/dev/null)
# Determine target_limit: do not exceed hard limit (if numeric), and do not go below current soft limit
if [ "$hard_limit" = "unlimited" ] || [ "$hard_limit" = "unlimited\n" ]; then
target_limit=$desired_nofiles
else
target_limit=$desired_nofiles
if [ "$hard_limit" -lt "$desired_nofiles" ] 2>/dev/null; then
target_limit=$hard_limit
fi
fi
# Never lower the existing soft limit
if [ "$soft_limit" -gt "$target_limit" ] 2>/dev/null; then
target_limit=$soft_limit
fi
ulimit -S -n "$target_limit" 2>/dev/null || {
printf 'Warning: unable to increase open file limit; continuing with current limits.\n' >&2
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants