-
Notifications
You must be signed in to change notification settings - Fork 23
Open
Labels
troubleshootingworkflow and data preparation questionsworkflow and data preparation questions
Description
pileup command is giving BED files that dmr command says invalid-bedmethyl-data. I have attached the output of DMR and the commands I have run.
OUTPUT (DMR)
> reading reference FASTA at "<genome_fasta_name>.fasta"
> 1521 common sequence(s) between FASTA and both samples
> running single-site analysis
> using default prior, Beta(α: 0.55, β: 0.55)
> estimating max coverages from data
> sampled 1094523 a records and 905404 b records, calculating max coverages for 95th percentile
> calculated max coverage for a: 6 and b: 21
> running with replicates, but not matched samples
> batch failed: invalid data, valid coverage (1) is not equal to the sum of canonical and modified counts (0), [BedMethylLine { chrom: "AAKM01000036", interval: Interval { start: 7056, stop: 7057, val: () }, raw_mod_code: Code('h'), strand: Positive, count_methylated: 0, valid_coverage: 1, count_canonical: 0, count_other: 1, count_delete: 0, count_fail: 0, count_diff: 0, count_nocall: 0 }] chrom: AAKM01000036 starting at 7056, stopping
> 12 batches processed > Error! invalid-bedmethyl-data
PILEUP LOGS
[modkit-logging/src/lib.rs::69][2026-02-07 10:41:20][DEBUG] command line: modkit pileup J_CC_5hmC.bam J_CC_5hmC.bed.gz --modified-bases 5hmC --reference /home/<genome_fasta_name>.fasta --log J_CC_5hmC.log --bgzf
[modkit-core/src/pileup/subcommand.rs::802][2026-02-07 10:41:20][INFO] discarded 1107 contigs with zero aligned reads
[modkit-core/src/pileup/subcommand.rs::576][2026-02-07 10:41:20][INFO] parsed 1 base modification(s). Base modifications other than 'C:h' will be counted as 'N_other'.
[modkit-core/src/pileup/subcommand.rs::653][2026-02-07 10:41:20][INFO] adding single-base motif: 'C 0'
[modkit-core/src/pileup/subcommand.rs::944][2026-02-07 10:41:20][INFO] using bgzf compression with 4 compression threads
[modkit-core/src/command_utils.rs::102][2026-02-07 10:41:20][INFO] attempting to sample 10042 reads
[modkit-core/src/reads_sampler/mod.rs::49][2026-02-07 10:41:20][DEBUG] found BAM index, sampling reads in 1000000 base pair chunks
[modkit-core/src/reads_sampler/sampling_schedule.rs::261][2026-02-07 10:41:20][DEBUG] removed 0 contigs from schedule with <= 1 reads
[modkit-core/src/reads_sampler/sampling_schedule.rs::141][2026-02-07 10:41:20][DEBUG] derived sampling schedule, sampling total 11547 reads from 1641 contigs, 0 unmapped reads
[modkit-core/src/reads_sampler/sampling_schedule.rs::164][2026-02-07 10:41:20][DEBUG] schedule
chrom count/frac
DMR Command
modkit dmr pair \
-a UNC_ED_5hmC.bed.gz \
-a UNC_YR1_5hmC.bed.gz \
-a UNC_YR2_5hmC.bed.gz \
-a UNC_EL_5hmC.bed.gz \
-a UNC_FY_5hmC.bed.gz \
-b CM_EL_5hmC.bed.gz \
-b CM_EJ_5hmC.bed.gz \
-o DMR_UNC_vs_CM_5hmC.bed \
--ref "${REF}" \
--base C \
--threads "${THREADS}" \
--log-filepath DMR_UNC_vs_CM_5hmC.log
PILEUP Command. mod = 5hmC
pileup () {
local bam=$1
local mod=$2
local prefix=$3
log "Running modkit pileup (${mod}) for ${prefix}"
modkit pileup \
"${bam}" \
"${prefix}_${mod}.bed.gz" \
--modified-bases "${mod}" \
--reference "${REF}" \
--log "${prefix}_${mod}.log" \
--bgzf
}
I had run modified basecalling for 5mC 5hmC to create modified BAM files. @ArtRand
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
troubleshootingworkflow and data preparation questionsworkflow and data preparation questions