6 changes: 6 additions & 0 deletions .pre-commit-hooks.yaml
@@ -0,0 +1,6 @@
- id: decaylanguage-validate
  name: Validate EvtGen decay files
  description: Validate .dec decay files with decaylanguage.DecFileParser.
  entry: decaylanguage-validate
  language: python
  files: '\.(dec|DEC)$'
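
The `files` key above is a Python regular expression that pre-commit applies to each candidate path with `re.search`. The following is a minimal sketch of which paths that pattern selects; the helper name `hook_would_check` and the sample paths are made up for illustration and are not part of this change.

```python
import re

# Same pattern as the hook's `files` key.
DEC_FILE_PATTERN = re.compile(r"\.(dec|DEC)$")


def hook_would_check(path: str) -> bool:
    """Return True if the pattern selects this path (hypothetical helper)."""
    return DEC_FILE_PATTERN.search(path) is not None


# Illustrative paths only.
candidates = ["DECAY.DEC", "decays/Bd_Kpi.dec", "notes.Dec", "old.dec.bak"]
matched = [path for path in candidates if hook_would_check(path)]
print(matched)
```

Only the all-lowercase and all-uppercase extensions are selected; a mixed-case extension such as `.Dec` is not.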
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,8 @@

* Parsing of decay files (aka .dec files):
  - Various improvements to the code, for more robustness.
  - Added `decaylanguage-validate` and a pre-commit hook for validating
    EvtGen `.dec` files with selectable diagnostic codes.
  - Performance improvements in `DecFileParser`, with caching and lazily-built indexing where possible.
  - A couple of fixes related to the decay file parser.
  - Typing modernisations.
19 changes: 19 additions & 0 deletions CONTRIBUTING.rst
@@ -82,3 +82,22 @@ Tips
To run a subset of tests::

    nox -s tests-3.9 -- -k test_myfeature

Decay file pre-commit hook
--------------------------

Downstream projects can validate EvtGen ``.dec`` files with the packaged
pre-commit hook::

    - repo: https://github.com/scikit-hep/decaylanguage
      rev: <version>
      hooks:
        - id: decaylanguage-validate

The hook reports selectable diagnostic codes. Experiments can ignore exact
codes or whole code families, for example::

    - id: decaylanguage-validate
      args: ["--ignore=DLW004"]

Use ``decaylanguage-validate --list-diagnostics`` to list the current codes.
72 changes: 70 additions & 2 deletions README.md
@@ -32,8 +32,9 @@ Just run the following:
pip install decaylanguage
```

-The amplitude `modeling` subpackage and the command-line interface need a few
-extra dependencies (NumPy, pandas and plumbum). Install them with:
+The amplitude `modeling` subpackage and the AmpGen-to-GooFit command-line
+interface need a few extra dependencies (NumPy, pandas and plumbum). Install
+them with:

```bash
pip install "decaylanguage[modeling]"
@@ -179,6 +180,73 @@ dfp.parse()
This being said, please do submit a pull request to add new models,
if you spot missing ones ...

#### Validating decay files

Decay files can be validated from the command line:

```bash
decaylanguage-validate my-decay-file.dec
decaylanguage-validate path/to/decfiles-directory
```

The validator is also available as a pre-commit hook for downstream projects:

```yaml
- repo: https://github.com/scikit-hep/decaylanguage
  rev: <version>
  hooks:
    - id: decaylanguage-validate
```

Diagnostics use stable codes, which can be disabled per experiment policy. For
example, LHCb-style files that intentionally rely on unmatched `CDecay` source
decays can ignore `DLW004`:

```yaml
- id: decaylanguage-validate
  args: ["--ignore=DLW004"]
```
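
The CONTRIBUTING notes in this change say that exact codes or whole code families can be ignored. How the validator matches an `--ignore` entry is not shown here, so the following is only a sketch under one assumption: a family is a code prefix such as `DLW`. The function `is_ignored` is hypothetical and is not the library's API.

```python
from collections.abc import Iterable


def is_ignored(code: str, ignore: Iterable[str]) -> bool:
    """Hypothetical helper: True if `code` equals an entry or starts with it.

    Assumption: a "family" is written as a code prefix, e.g. "DLW".
    Empty entries are skipped so they cannot silence everything.
    """
    return any(code.startswith(entry) for entry in ignore if entry)


# An exact code silences only that code.
print(is_ignored("DLW004", ["DLW004"]))
print(is_ignored("DLW001", ["DLW004"]))
# A family prefix silences every code in that family, but not parse errors.
print(is_ignored("DLW001", ["DLW"]))
print(is_ignored("DLP001", ["DLW"]))
```

Under this assumption, ignoring `DLW` would still let `DLP001` parse errors fail the hook.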

Run `decaylanguage-validate --list-diagnostics` to list the currently available
codes.

Available diagnostics:

| Code | Name | Meaning |
| --- | --- | --- |
| `DLP001` | `parse-error` | The file could not be parsed by `DecFileParser`. |
| `DLW001` | `duplicate-decay` | A particle has multiple `Decay` blocks; only the first is retained. |
| `DLW002` | `missing-copydecay-source` | A `CopyDecay` statement references a missing `Decay` source. |
| `DLW003` | `duplicate-cdecay` | A particle is defined with both `Decay` and `CDecay`; `CDecay` is ignored. |
| `DLW004` | `missing-cdecay-source` | A `CDecay` statement has no corresponding `Decay` source. |
| `DLW005` | `self-conjugate-cdecay` | A `CDecay` statement targets a self-conjugate particle. |
| `DLW999` | `parser-warning` | An otherwise unclassified warning was emitted by `DecFileParser`. |

When the hook finds a problem, pre-commit prints the validator output. A parser
error includes the source line and column pointer:

```text
Validate EvtGen decay files..............................................Failed
- hook id: decaylanguage-validate
- exit code: 1

DecayLanguage: 1 diagnostic(s) in 1 file(s)
tests/data/broken.dec:13:68: DLP001 parse-error: UnexpectedToken: Unexpected token Token('SIGNED_NUMBER', '2') at line 13, column 68.
13: 0.000044342 Upsilon pi0 pi0 VVPIPI;2 #[Reconstructed PDG2011]
^
summary: DLP001=1
```

Parser warnings are shorter:

```text
tests/data/example.dec: DLW004 missing-cdecay-source: missing Decay source for CDecay: anti-B0sig
summary: DLW004=1
```
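
Downstream tooling may want to post-process these lines, for example to count codes in a CI log. The sketch below parses the two line shapes shown in the sample outputs above (`path:line:column: CODE name: message` and `path: CODE name: message`). The regular expression and the `Diagnostic` tuple are assumptions inferred from those samples only, not a documented output format.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    """Hypothetical record for one validator output line."""

    path: str
    line: Optional[int]
    column: Optional[int]
    code: str
    name: str
    message: str


# Inferred from the sample output; the location part is optional.
_DIAGNOSTIC_RE = re.compile(
    r"^(?P<path>.+?)(?::(?P<line>\d+):(?P<column>\d+))?: "
    r"(?P<code>DL[A-Z]\d{3}) (?P<name>[\w-]+): (?P<message>.*)$"
)


def parse_diagnostic(text: str) -> Optional[Diagnostic]:
    """Return a Diagnostic for a diagnostic line, or None for any other line."""
    match = _DIAGNOSTIC_RE.match(text.strip())
    if match is None:
        return None
    line, column = match["line"], match["column"]
    return Diagnostic(
        path=match["path"],
        line=int(line) if line is not None else None,
        column=int(column) if column is not None else None,
        code=match["code"],
        name=match["name"],
        message=match["message"],
    )


sample = [
    "DecayLanguage: 1 diagnostic(s) in 1 file(s)",
    "tests/data/broken.dec:13:68: DLP001 parse-error: UnexpectedToken: "
    "Unexpected token Token('SIGNED_NUMBER', '2') at line 13, column 68.",
    "tests/data/example.dec: DLW004 missing-cdecay-source: "
    "missing Decay source for CDecay: anti-B0sig",
    "summary: DLW004=1",
]
diagnostics = [d for text in sample if (d := parse_diagnostic(text)) is not None]
counts = Counter(d.code for d in diagnostics)
print(counts)
```

Header and `summary:` lines do not match the pattern and are skipped, so only the two diagnostic lines are counted.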

By default, the validator prints up to 100 diagnostics and then summarizes the
rest. Use `--max-diagnostics=0` to print every diagnostic.
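
The truncation rule can be pictured with a small sketch. It assumes that a limit of `0` means "no limit" and that hidden diagnostics are replaced by a single trailing line. The function `limit_report` and the wording of that trailing line are invented for illustration and do not reproduce the validator's actual output.

```python
def limit_report(diagnostics: list[str], max_diagnostics: int = 100) -> list[str]:
    """Hypothetical helper: keep at most `max_diagnostics` lines.

    A limit of 0 disables truncation. The trailing note is illustrative only.
    """
    if max_diagnostics == 0 or len(diagnostics) <= max_diagnostics:
        return list(diagnostics)
    hidden = len(diagnostics) - max_diagnostics
    return [*diagnostics[:max_diagnostics], f"... {hidden} more diagnostic(s) not shown"]


lines = [f"file.dec: DLW001 duplicate-decay: particle {i}" for i in range(150)]
print(len(limit_report(lines)))                     # default limit of 100
print(len(limit_report(lines, max_diagnostics=0)))  # no limit
```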

### Visualize decay files

The class `DecayChainViewer` allows the visualization of parsed decay chains:
6 changes: 6 additions & 0 deletions docs/api/dec.rst
@@ -4,3 +4,9 @@ Decay file parser — :mod:`decaylanguage.dec`
.. automodule:: decaylanguage.dec.dec
   :members:
   :undoc-members:

Decay file validation — :mod:`decaylanguage.dec.validate`
---------------------------------------------------------

.. automodule:: decaylanguage.dec.validate
   :members:
85 changes: 85 additions & 0 deletions docs/examples/decfile_parsing.rst
@@ -44,4 +44,89 @@ Charge conjugation
By default, charge-conjugated decays are automatically included. This behavior
can be controlled at parse time.

Command-line validation
-----------------------

EvtGen ``.dec`` files can be validated without writing Python code:

.. code-block:: bash

   decaylanguage-validate my-decay-file.dec
   decaylanguage-validate path/to/decfiles-directory

The validator reports stable diagnostic codes. Exact codes or code families can
be disabled, which lets experiments choose their own pre-commit policy:

.. code-block:: bash

   decaylanguage-validate --ignore=DLW004 my-decay-file.dec

Use ``decaylanguage-validate --list-diagnostics`` to inspect the currently
available diagnostics.

Available diagnostics:

.. list-table::
   :header-rows: 1

   * - Code
     - Name
     - Meaning
   * - ``DLP001``
     - ``parse-error``
     - The file could not be parsed by ``DecFileParser``.
   * - ``DLW001``
     - ``duplicate-decay``
     - A particle has multiple ``Decay`` blocks; only the first is retained.
   * - ``DLW002``
     - ``missing-copydecay-source``
     - A ``CopyDecay`` statement references a missing ``Decay`` source.
   * - ``DLW003``
     - ``duplicate-cdecay``
     - A particle is defined with both ``Decay`` and ``CDecay``; ``CDecay`` is ignored.
   * - ``DLW004``
     - ``missing-cdecay-source``
     - A ``CDecay`` statement has no corresponding ``Decay`` source.
   * - ``DLW005``
     - ``self-conjugate-cdecay``
     - A ``CDecay`` statement targets a self-conjugate particle.
   * - ``DLW999``
     - ``parser-warning``
     - An otherwise unclassified warning was emitted by ``DecFileParser``.

When run through pre-commit, failures include the validator output. Parser
errors include the source location and a pointer:

.. code-block:: text

   Validate EvtGen decay files..............................................Failed
   - hook id: decaylanguage-validate
   - exit code: 1

   DecayLanguage: 1 diagnostic(s) in 1 file(s)
   tests/data/broken.dec:13:68: DLP001 parse-error: UnexpectedToken: Unexpected token Token('SIGNED_NUMBER', '2') at line 13, column 68.
   13: 0.000044342 Upsilon pi0 pi0 VVPIPI;2 #[Reconstructed PDG2011]
   ^
   summary: DLP001=1

Parser warnings are reported more compactly:

.. code-block:: text

   tests/data/example.dec: DLW004 missing-cdecay-source: missing Decay source for CDecay: anti-B0sig
   summary: DLW004=1

By default, at most 100 diagnostics are printed before the remaining diagnostics
are summarized. Pass ``--max-diagnostics=0`` to print every diagnostic.

Downstream projects can use the packaged pre-commit hook:

.. code-block:: yaml

   - repo: https://github.com/scikit-hep/decaylanguage
     rev: <version>
     hooks:
       - id: decaylanguage-validate
         args: ["--ignore=DLW004"]

For more detailed examples, see the :doc:`/examples/notebooks/index` section.
18 changes: 18 additions & 0 deletions docs/getting_started/quickstart.rst
@@ -25,6 +25,24 @@ Use :class:`~decaylanguage.dec.dec.DecFileParser` to parse EvtGen-format ``.dec`

See :doc:`/examples/decfile_parsing` for more details.

Validate ``.dec`` files from the command line:

.. code-block:: bash

   decaylanguage-validate my_decays.dec
   decaylanguage-validate path/to/decfiles-directory

Use ``decaylanguage-validate --list-diagnostics`` to list selectable
diagnostic codes. Downstream pre-commit hooks can disable experiment-specific
codes with options such as ``--ignore=DLW004``.

On failure, pre-commit shows output such as:

.. code-block:: text

   tests/data/example.dec: DLW004 missing-cdecay-source: missing Decay source for CDecay: anti-B0sig
   summary: DLW004=1

Building and visualizing decay chains
-------------------------------------

3 changes: 3 additions & 0 deletions pyproject.toml
@@ -81,6 +81,9 @@ test = [
[project.urls]
Homepage = "https://github.com/scikit-hep/decaylanguage"

[project.scripts]
decaylanguage-validate = "decaylanguage.dec.validate:main"


[tool.hatch]
version.source = "vcs"