Add support for parsing decay descriptors#573

Open
admorris wants to merge 9 commits into scikit-hep:main from admorris:descriptor_parsing

@admorris

Contributor

Implements the rest of #200

Added DecayChain.from_string method

Decay descriptors are parsed with Lark. A Transformer class converts them into DecayChainDict objects, which are then used to initialise DecayChain objects.

Custom descriptor formats can be used by passing the path to another .lark file as an argument to DecayChain.from_string. Such grammars essentially only have the freedom to redefine ARROW, LPAR and RPAR; the rest of the structure is assumed by the Transformer. For example, I did not find a way to support sub-decays written with the mother outside the brackets, like A -> B (-> C D) E.
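To make the role of ARROW, LPAR and RPAR concrete, here is a minimal stdlib-only sketch of a recursive-descent parser for the mother-inside-brackets layout. It is not the Lark grammar or the Transformer from this PR: the function names (tokenize, parse_decay, parse_descriptor), the simplified nested-dict output and the assumption that particle names contain no brackets are all illustrative.

```python
def tokenize(text, lpar="(", rpar=")"):
    # Pad the bracket tokens with spaces so a plain split() separates them.
    # Assumption: particle names themselves contain no brackets.
    return text.replace(lpar, f" {lpar} ").replace(rpar, f" {rpar} ").split()


def parse_decay(tokens, pos, arrow, lpar, rpar):
    """Parse 'mother ARROW item item ...' starting at tokens[pos].

    Returns ({mother: [items]}, next_pos), where each item is either a
    particle name or a nested {mother: [items]} dict for a sub-decay.
    """
    mother = tokens[pos]
    if pos + 1 >= len(tokens) or tokens[pos + 1] != arrow:
        raise ValueError(f"expected {arrow!r} after {mother!r}")
    pos += 2
    daughters = []
    while pos < len(tokens) and tokens[pos] != rpar:
        if tokens[pos] == lpar:
            # A bracket opens a sub-decay whose mother is the next token.
            sub, pos = parse_decay(tokens, pos + 1, arrow, lpar, rpar)
            if pos >= len(tokens) or tokens[pos] != rpar:
                raise ValueError("unbalanced brackets")
            pos += 1  # consume the closing bracket
            daughters.append(sub)
        else:
            daughters.append(tokens[pos])
            pos += 1
    return {mother: daughters}, pos


def parse_descriptor(text, arrow="->", lpar="(", rpar=")"):
    tokens = tokenize(text, lpar, rpar)
    tree, pos = parse_decay(tokens, 0, arrow, lpar, rpar)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


default = parse_descriptor("B0 -> (D*- -> D0 pi-) pi+")
custom = parse_descriptor("B0 => {D*- => D0 pi-} pi+", arrow="=>", lpar="{", rpar="}")
```

In this sketch, swapping the three tokens is enough to read the alternative format, and both calls yield the same tree. A descriptor such as A -> B (-> C D) E is rejected, because the token after an opening bracket is expected to be a mother name.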

One glaring limitation (which is inherent to DecayChain/_build_decay_modes) is that sub-decays of identically named particles are not supported: e.g. "B_s0 -> (phi -> K+ K-) (phi -> K+ K-)" will result in an exception. This could possibly be handled by adding internal/hidden uniqueness when duplicates are encountered.
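One possible reading of "internal/hidden uniqueness" is sketched below. It assumes a parsed descriptor is available as a nested dict of the form {mother: [daughters]}, where a daughter is either a name or another such dict. That format, the function names and the "#n" suffix are hypothetical and not part of DecayChain; the sketch also assumes "#" never appears in a particle name.

```python
from collections import Counter


def _flatten(tree, flat, seen):
    # Each tree node is a single-entry dict: {mother: [daughters]}.
    ((mother, daughters),) = tree.items()
    seen[mother] += 1
    # The first occurrence keeps its plain name; repeats get a hidden suffix.
    key = mother if seen[mother] == 1 else f"{mother}#{seen[mother]}"
    flat[key] = []
    for daughter in daughters:
        if isinstance(daughter, dict):
            flat[key].append(_flatten(daughter, flat, seen))
        else:
            flat[key].append(daughter)
    return key


def unique_decay_table(tree):
    """Return (root_key, {key: [daughter keys]}) with unique keys per sub-decay."""
    flat = {}
    root = _flatten(tree, flat, Counter())
    return root, flat


def display_name(key):
    # Strip the hidden suffix again when printing.
    return key.split("#", 1)[0]


tree = {"B_s0": [{"phi": ["K+", "K-"]}, {"phi": ["K+", "K-"]}]}
root, table = unique_decay_table(tree)
```

Here the second phi is stored under the key "phi#2", so the two sub-decays no longer collide in a flat mapping, while display_name recovers "phi" for output.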

@eduardo-rodrigues

Member

Hi @admorris, I have not forgotten to check this. It's just that I have been working on urgent and important stuff. Will get back to you very soon.

@eduardo-rodrigues eduardo-rodrigues added the enhancement New feature or request label May 1, 2026
Comment thread src/decaylanguage/data/descriptor.lark Outdated
Comment thread src/decaylanguage/data/descriptor.lark
Comment thread src/decaylanguage/decay/decay.py
Comment thread src/decaylanguage/decay/decay.py
return cls(mother, decay_modes)

@classmethod
def from_string(

Member

At some point it would make sense to "synchronise" this from_string function with the existing to_string one, since they should effectively be mirrors of each other. Otherwise, this function could be renamed from_descriptor. WDYT?

Contributor Author

The pytest test_from_string_to_string demonstrates that they mirror each other in the specific case of the default grammar.

What if this function is renamed from_descriptor, and from_string simply invokes from_descriptor with the default grammar?
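A minimal sketch of that delegation, using a stand-in class rather than the real DecayChain. The class name ChainSketch, the grammar parameter name and the stored attributes are hypothetical; only the idea that from_string forwards to from_descriptor with the default grammar comes from the proposal above.

```python
class ChainSketch:
    """Stand-in for DecayChain; it only records what it was built from."""

    # Default grammar file name, taken from the path touched by this PR.
    DEFAULT_GRAMMAR = "descriptor.lark"

    def __init__(self, descriptor, grammar):
        self.descriptor = descriptor
        self.grammar = grammar

    @classmethod
    def from_descriptor(cls, descriptor, grammar=DEFAULT_GRAMMAR):
        # General entry point: any grammar file may be supplied.
        # (A real implementation would parse the descriptor here.)
        return cls(descriptor, grammar)

    @classmethod
    def from_string(cls, descriptor):
        # Mirror of to_string: always uses the default grammar.
        return cls.from_descriptor(descriptor, grammar=cls.DEFAULT_GRAMMAR)


default_chain = ChainSketch.from_string("D0 -> K- pi+")
custom_chain = ChainSketch.from_descriptor("D0 => K- pi+", grammar="descriptor_alt.lark")
```

With this split, from_string stays the strict inverse of to_string, while custom formats go through from_descriptor only.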

Member

Yes, that sounds good to me.

Comment thread src/decaylanguage/decay/decay.py
Comment thread src/decaylanguage/decay/decay.py
Comment thread src/decaylanguage/decay/decay.py Outdated
Comment thread tests/data/descriptor_alt.lark

@eduardo-rodrigues eduardo-rodrigues left a comment

Member

Thank you so much for this, @admorris! It's a really nice enhancement 👍.

I left a few little suggestions but this is looking excellent anyway.

I am well aware of the limitation you point out. It does annoy me.

@eduardo-rodrigues

Member

Hey @admorris, let me know your thoughts on this enhancement and whether you would prefer to have a follow-up for the limitations and some of the matters discussed above.

BTW, on the issue of parsing parentheses in particle names: one thing that distinguishes parentheses in particle names from parentheses denoting decays is that the latter always contain an arrow between ( and ).
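A small sketch of that heuristic, independent of the Lark grammar in this PR. The function name classify_parentheses and the "decay"/"name" labels are illustrative; the only rule applied is the one stated above, namely that a parenthesised group containing the arrow denotes a decay.

```python
def classify_parentheses(text, arrow="->"):
    """Return (inner_text, kind) for every matched pair of parentheses.

    kind is "decay" if the arrow appears between "(" and ")", else "name".
    Groups are reported in the order their closing parenthesis is seen.
    """
    stack, groups = [], []
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            start = stack.pop()
            inner = text[start + 1 : i]
            groups.append((inner, "decay" if arrow in inner else "name"))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return groups


groups = classify_parentheses("B+ -> (psi(2S) -> mu+ mu-) K+")
```

For this input the inner group "2S" is classified as part of a name and the outer group as a decay, since only the outer one contains an arrow.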

@eduardo-rodrigues

Member

Hey. So aside from the trivial comment above, I think there are only 2 things to get sorted:

  • Add some more info/doc to the descriptor.lark file, since handling particle names involving parentheses has made it reasonably complicated.
  • Enhance the tests to include a couple of complicated particle names with parentheses, i.e. names such as psi(2S), Upsilon_3(1D) or anti-Lambda_b(5920)0.

@eduardo-rodrigues

Member

Hi @admorris, you may have seen the major improvements made to the package recently, thanks to @henryiii. We are moving towards a 1.0 release ...

Let me know if you will be able to pick up the work here after a rebase.

FYI you can now do the following:

In [1]: from decaylanguage import DecayChain, DecayMode

In [2]: dm_Bs = DecayMode(1, 'phi phi')
   ...: dm_phi = DecayMode(1, 'K+ K-')
   ...: dc = DecayChain('Bs', {'Bs':dm_Bs, 'phi':dm_phi})

In [3]: dc_dict = dc.to_dict()

In [4]: DecayChain.from_dict(dc_dict).to_dict() == dc_dict
Out[4]: True

In [5]: dc.to_string()
Out[5]: 'Bs -> (phi -> K+ K-) (phi -> K+ K-)'

@admorris

Contributor Author

Very good! I will have a look

@admorris admorris force-pushed the descriptor_parsing branch from 2943a59 to c8eb6de Compare June 16, 2026 17:38
@henryiii

Member

Whenever you are ready, I can do an Opus 4.8 or GPT 5.5 review of the PR if that sounds helpful. (if you are using one of those, or already ran a review with one, let me know and I'll do the other one, otherwise I'll do Opus because I've got higher limits on that)
