chore: new release #277
…into development
Co-authored-by: Copilot <copilot@github.com>
* docs: add GSE293036 example with correct FASTQ file names
  Add GSE293036 (Granitto et al. 2025) example documentation with renamed FASTQ files matching the notebook conventions (SRR*_Barcode_1/2, SRR*_Control, SRR*_GM12878) and include QC report HTML files.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit hooks
  for more information, see https://pre-commit.ci
* adding to menu
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Max Schubach <max.schubach@bih-charite.de>
* feat(experiment)!: ✨ NGmerge for paired-end BC reads merging
  NGmerge is a replacement of the custom merge script with bam files. NGmerge will be the default!
  Co-authored-by: Copilot <copilot@github.com>
* fix: wildcards in stats
* attach with multiple files
* check
* check
* feat: update default merge tool to custom and improve documentation for NGmerge
---------
Co-authored-by: Copilot <copilot@github.com>
…#273)
* feat(assignment): 🔥 Filtering option based on CIGAR string (BWA and BBMAP)
  User can specify a regex to filter the CIGAR string in the bam file
* feat(assignment): enhance design check configuration and validation for sequence parameters
* Refactor shell commands in Snakemake rules for improved readability and consistency
  - Updated multiple Snakemake rules to enhance the formatting of shell commands by aligning pipes and parameters for better readability.
  - Replaced `&>` with `&>{log}` for consistent logging across rules.
  - Ensured proper indentation and spacing in shell commands to follow best practices.
  - Added validation requirements in the config schema for sequence collisions to enforce necessary fields when applicable.
* feat(config): update sequence_length and alignment_start options for bwa filtering
* GSE316891 example
* qc reports
* more examples + overview page
* index
* refactor release please
* update GSE284330 example config for NGmerge optimization
* Update GSE325670 example configuration and documentation for version 0.7.0
  - Changed version from 0.6.2 to 0.7.0 in config files.
  - Renamed assignment from GSE325256StrandSensitive to GSE325256CIGAR.
  - Updated alignment configurations to include min_mapping_quality and cigar_filter_regex.
  - Adjusted documentation to reflect changes in assignment results and improved mapping strategy.
* more docs
* refactor
* fix target
* Added Zahm and Koplik examples - GSE271608 and GSE307247
* [pre-commit.ci] auto fixes from pre-commit hooks
  for more information, see https://pre-commit.ci
* docs: add to index + overview
* docs: fix typos
---------
Co-authored-by: j-r-class2 <clnyc01@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Max Schubach <max.schubach@bih-charite.de>
Super-linter summary
All files and directories linted successfully. For more information, see the GitHub Actions workflow run. Powered by Super-linter.
Pull request overview
This PR broadens MPRAsnakeflow’s configuration and experiment-counting capabilities (notably adding an NGmerge-based merge path), extends mapping-time filtering options, and substantially expands/refreshes documentation and example resources in preparation for a new release.
Changes:
- Add an experiment-level `merge_tool` switch with an NGmerge-based merging/counting path (including a shared NGmerge rule template).
- Extend assignment/mapping configuration options (e.g., optional bwa min/max windows, `cigar_filter_regex`) and adjust design-check window handling.
- Add multiple new dataset example pages and update documentation/config examples accordingly.
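
To make the `cigar_filter_regex` option more concrete, here is a minimal Python sketch of filtering alignments by a regular expression on the CIGAR string. It is an illustration only: the function name `keep_alignment`, the example pattern, and the use of full-string matching are assumptions, and the Snakemake rules in this PR may implement the filter differently (for example in a shell pipeline).

```python
import re
from typing import Optional


def keep_alignment(cigar: str, pattern: Optional[str]) -> bool:
    """Return True if the CIGAR string passes the optional regex filter.

    Hypothetical helper: with no pattern configured, every alignment is kept.
    Full-string matching is an assumption made for this sketch.
    """
    if pattern is None:
        return True
    return re.fullmatch(pattern, cigar) is not None


# Hypothetical pattern: keep only alignments that consist of a single
# uninterrupted match block (no clipping, insertions, or deletions).
SINGLE_MATCH_BLOCK = r"\d+M"

cigars = ["200M", "150M2D50M", "10S190M", "*"]
kept = [c for c in cigars if keep_alignment(c, SINGLE_MATCH_BLOCK)]
print(kept)
```

Treating a missing pattern as "keep everything" mirrors the description of the filter as optional; a stricter or looser pattern can be swapped in without touching the helper.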
Reviewed changes
Copilot reviewed 43 out of 66 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| workflow/scripts/attachBCToFastQ.py | Allow multiple --reads/--barcodes pairs (used for NGmerge UMI header attachment). |
| workflow/scripts/assignment/check_design_file.py | Make --start/--length conditionally required; improve help text and erroring. |
| workflow/schemas/config.schema.yaml | Update schema for new/changed config keys (design_check window, cigar regex, experiment merge_tool/NGmerge). |
| workflow/rules/experiment/statistic/counts.smk | Minor fix/formatting; use wc.project in helper inputs. |
| workflow/rules/experiment/counts/counts_umi.smk | Add NGmerge UMI path (attach UMI to headers + NGmerge + header parsing for counts). |
| workflow/rules/experiment/counts/counts_noUMI.smk | Add NGmerge no-UMI path for paired reads. |
| workflow/rules/experiment/counts/counts_merge_ngmerge.smk | New shared NGmerge merge rule template for experiment workflows. |
| workflow/rules/experiment/counts.smk | Include the new NGmerge template rule file. |
| workflow/rules/common.smk | Add config pre/post processing and helper typing/validation tweaks. |
| workflow/rules/assignment/mapping_bwa.smk | Add optional window filtering + optional CIGAR-regex filtering to barcode extraction. |
| workflow/rules/assignment/mapping_bbmap.smk | Add optional CIGAR-regex filtering to barcode extraction. |
| workflow/rules/assignment/hybridFWDRead.smk | Formatting-only changes in awk block. |
| workflow/rules/assignment/common.smk | Add helpers for design_check access and CLI arg formatting. |
| workflow/rules/assignment.smk | Switch design-check arg construction to new helper functions. |
| resources/count_basic/experiment_noUMI.csv | New experiment CSV example (no UMI). |
| resources/count_basic/experiment_FWDUMI.csv | New experiment CSV example (FWD + UMI). |
| resources/count_basic/experiment_FWD.csv | New experiment CSV example (FWD-only). |
| docs/index.rst | Add examples overview + new example pages to the docs toctree. |
| docs/4_examples/plasmid_example.rst | Update heading/formatting and clarify dataset reference. |
| docs/4_examples/overview.rst | New “Examples Overview” landing page with dataset summaries/links. |
| docs/4_examples/GSE325670_example.rst | New detailed example page for Hauser et al. dataset. |
| docs/4_examples/GSE316891_example.rst | New detailed example page for Yan et al. dataset. |
| docs/4_examples/GSE307247_example.rst | New detailed example page for Koplik et al. dataset. |
| docs/4_examples/GSE306816_example.rst | New detailed example page for Zhang et al. dataset. |
| docs/4_examples/GSE293036_example.rst | New detailed example page for Granitto et al. dataset. |
| docs/4_examples/GSE284330_example.rst | New detailed example page for Zaratiana et al. dataset. |
| docs/4_examples/GSE271608_example.rst | New detailed example page for Zahm et al. dataset. |
| docs/4_examples/count_example1.rst | Update Klein et al. link to DOI. |
| docs/4_examples/complex_readstructure_example.rst | Update title/anchors and adjust resource guidance (memory). |
| docs/4_examples/combined_example1.rst | Update Klein et al. link to DOI. |
| docs/4_examples/assignment_example1.rst | Update Klein et al. link to DOI. |
| docs/3_further_documentation/adapter.rst | Grammar/clarity improvements. |
| docs/2_workflows/experiment.rst | Correct multiple typos/grammar in workflow documentation. |
| docs/1_getting_started/config.rst | Document new/changed config keys (merge_tool, cigar_filter_regex, design_check window). |
| config/example_count.yaml | Bump example config version. |
| config/example_config.yaml | Bump example config version. |
| config/example_assignment_pbmm2.yaml | Bump example config version; move window to design_check. |
| config/example_assignment_exact_linker.yaml | Bump example config version. |
| config/example_assignment_exact_lazy.yaml | Bump example config version. |
| config/example_assignment_bwa.yaml | Bump example config version. |
| config/example_assignment_bbmap.yaml | Bump example config version; move window to design_check. |
| .github/release-please-config.json | Configure changelog sections for release-please. |
| .github/copilot-instructions.md | Update “Adding a New Rule” snippet to match current style/structure. |
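
The table notes that `check_design_file.py` now makes `--start`/`--length` conditionally required. One common way to express that kind of rule is sketched below. It assumes `argparse` purely for illustration (the actual script may use a different CLI library), and the `--input` and `--check-collisions` options are hypothetical names that do not come from the PR.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical design-check command line.

    --start and --length are optional on their own, but become required
    when the (hypothetical) --check-collisions flag is set.
    """
    parser = argparse.ArgumentParser(description="Hypothetical design-check CLI")
    parser.add_argument("--input", required=True, help="Design file to check")
    parser.add_argument(
        "--check-collisions",
        action="store_true",
        help="Enable the sequence collision check (hypothetical flag)",
    )
    parser.add_argument("--start", type=int, default=None, help="Window start")
    parser.add_argument("--length", type=int, default=None, help="Window length")
    args = parser.parse_args(argv)

    # Enforce the conditional requirement after parsing, so the options stay
    # optional whenever the collision check is disabled.
    if args.check_collisions and (args.start is None or args.length is None):
        parser.error("--start and --length are required with --check-collisions")
    return args


args = parse_args(["--input", "design.fa"])
print(args.start, args.length)
```

Validating after `parse_args` keeps the error message explicit and lets `parser.error` print the usage text and exit with a non-zero status, which is the standard `argparse` behaviour for invalid arguments.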
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 46 out of 69 changed files in this pull request and generated 14 comments.
Comments suppressed due to low confidence (1)
docs/4_examples/GSE284330_example.rst:165
- This paragraph contains multiple typos/grammar issues (e.g. `meadian`, `less then`, `a exact`, `satting`, `highlits`) that reduce readability of the rendered docs.
* :code:`results/experiments/GSE284330/reporter_experiment.barcode.HepG2Sub1.GSE284330.BCthreshold1.all.tsv.gz`
You should also inspect the QC report in the same directory, or have a look `here <https://htmlpreview.github.io/?https://github.com/kircherlab/MPRAsnakeflow/blob/master/docs/4_examples/GSE284330.experiment.qc_report.HepG2Sub1.BCthreshold1.html>`_. As expected we have a meadian BC of 1. We also see that we only get a library coverage of less then 20%. This is expected because this assay is not really designed for barcode-based counting. We are doing a exact mapping of 200bp sequenced oligos. The assay is more similar to a STARR-seq like assay where the oligos themselves are sequenced and counted. Therefore, we have a lot of oligos that are not covered by any read and many reads that do not match any oligo. This example is therefore not ideal for MPRAsnakeflow, but it demonstrates how to use the workflow can be used in that satting and highlits potential optimizations for future versions of the workflow, like implementing a mapping strategy for the experiment sub-workflow.