
Migrate repository to the new EasyScience Copier templates#225

Open
AndrewSazonov wants to merge 36 commits into develop from apply-new-templates

Conversation

@AndrewSazonov
Member

This PR migrates the repository to the new EasyScience Copier templates.

@AndrewSazonov AndrewSazonov added the [scope] maintenance Code/tooling cleanup, no feature or bugfix (major.minor.PATCH) label Mar 5, 2026
@github-advanced-security

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

Member

@rozyczko rozyczko left a comment

  • potential issue with non-tag builds
  • actual issue with source-level import

Comment on lines +125 to +126
if: startsWith(github.ref , 'refs/tags/v') != true
run: git tag --delete $(git tag)
Member

CI workflow can fail on non-tag builds when no local tags are fetched

If git tag prints nothing (when checking out a non-tagged commit), git tag --delete runs with an empty argument list and exits with a non-zero code, failing the job before the wheel is built.

Maybe add guards?

tags=$(git tag)
if [ -n "$tags" ]; then
  git tag --delete $tags
fi
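A minimal reproduction of the failure mode, runnable locally (the temporary repo is created only for the demonstration):

```shell
# A freshly initialised repo has no tags, so `git tag` prints nothing,
# which mirrors a non-tag CI checkout where no tags were fetched.
repo=$(mktemp -d)
git -C "$repo" init -q

tags=$(git -C "$repo" tag)
# Unguarded, `git tag --delete $tags` would run with no arguments and
# exit non-zero; the guard skips the delete when the tag list is empty.
if [ -n "$tags" ]; then
  git -C "$repo" tag --delete $tags
fi
echo "guarded delete step completed"
```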

Member Author

That’s a good catch, thank you. I will add this guard.

Also, looking again at this step, I remember that it was originally added to ensure that the package built in this CI workflow gets a version higher (999.0.0) than the latest one published on PyPI. This way pip would prefer the locally built package rather than installing it from PyPI. This actually happened a few times: after publishing a new release (e.g. 2.2.0), the tagged master branch was not merged back into develop, and a feature branch was created from develop that still had the previous tag (e.g. 2.1.0).

However, now we have an automated backmerge into develop, so as long as that backmerge does not fail (or is fixed quickly if it does), new feature branches should be created from a develop branch that already contains the latest tag (e.g. 2.2.0). During development the version will then look like, for example, 2.2.0+dev37 (see my other comment with more explanation). This version is already higher than 2.2.0.

So in this case the step that deletes local tags to force falling back to the default 999.0.0 version seems no longer necessary. What do you think?

Contributor

@damskii9992 damskii9992 Mar 6, 2026

Unfortunately it is still very likely that you are working in a feature branch which hasn't had the latest develop branch merged into it, either because you as a developer simply forgot to rebase it, or because something in the most recent develop branch interferes with your current feature branch and you want to test it before rebasing.
So I think it is prudent to keep this step, to protect us against ourselves.

Comment on lines +20 to +21
__version__ = version('easyscience')

Member

The lack of an explicit version number (relying on dist-info) can be confusing.
After installing from source, the queried version turns out to be 2.2.0+dev37, yet there is no dev37 anywhere in the tree. How are we supposed to know the version of the source we work with? It is always good to check __version__.py or the content of pyproject.toml and see what version is there, to assure ourselves that we are working on the current source.

The package now depends on explicitly installed distribution metadata.
When importing from source, I get

Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    import easyscience
  File "C:\Users\piotr\projects\easy\p312_1\corelib\src\easyscience\__init__.py", line 20, in <module>
    __version__ = version('easyscience')
  File "c:\Anaconda3\envs\sdk\Lib\importlib\metadata\__init__.py", line 987, in version
    return distribution(distribution_name).version
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "c:\Anaconda3\envs\sdk\Lib\importlib\metadata\__init__.py", line 960, in distribution
    return Distribution.from_name(distribution_name)
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "c:\Anaconda3\envs\sdk\Lib\importlib\metadata\__init__.py", line 409, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: No package metadata was found for easyscience

Member Author

It is an intentional decision to remove the hardcoded version from __version__.py or pyproject.toml to simplify the process. In practice, the manual change before every new release tends to get out of sync with Git tags or is simply forgotten during development.

With versioningit, the version is derived directly from the Git history (tags + commit distance), so every build corresponds exactly to a particular state of the repository.

For example, the version you saw 2.2.0+dev37 means 37 commits after the last tag 2.2.0. This number comes directly from Git. So dev37 is not something stored in the tree - it is calculated from the commit history.

It is similar to running:

git describe --tags --long

which in this case returns:

v2.2.0-37-gd6a19b2

The new version is not known in advance, so using the previous tag with the +devN suffix makes sense. The next tag/version will be determined automatically from the labels of PRs merged into the develop branch when a draft for the new release is created via CI. Manually adding a hardcoded version into __version__.py or pyproject.toml after that is a pain and also error-prone. Without manual changes, you do not need to remember that the version has to be modified somewhere before a release, which is really nice. Comparing this with my old experience of doing this in easydiffraction, I am really happy with this approach.
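For context, a sketch of how versioningit plugs into a hatchling-based pyproject.toml. The key names follow versioningit's documented format table, but the actual configuration generated by the Copier templates may differ:

```toml
[build-system]
requires = ["hatchling", "versioningit"]
build-backend = "hatchling.build"

[tool.hatch.version]
source = "versioningit"

# Produces e.g. "2.2.0+dev37": last tag v2.2.0, 37 commits since.
[tool.versioningit.format]
distance = "{base_version}+dev{distance}"
```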

The importlib approach is also a more modern way of doing this. Instead of keeping a separate __version__.py with hardcoded __version__ in the source, the recommended way is to read the version from the installed package metadata using importlib.metadata.

from importlib.metadata import version
version("easyscience")

This reads the version from the metadata generated during installation (dist-info). The importlib.metadata module is part of the Python standard library since Python 3.8 and is intended to provide access to installed package metadata.
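For the source-level import case, one common pattern (a sketch, not part of this PR; the function name and fallback string are illustrative) is to catch PackageNotFoundError instead of letting the import fail:

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name: str, fallback: str = "0.0.0+unknown") -> str:
    """Return the installed distribution version, or a fallback when no
    dist-info exists (e.g. importing straight from src/ without installing)."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return fallback
```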


And now, let’s get back to the real issue you encountered during the development process. Are you using the Pixi-based development workflow, which we agreed should be the default and which is described in the new Contributing Guide, or are you running this directly from a conda environment + pip?

If it is done via Pixi, it should in principle work without issues. In that workflow the package is installed from the repository root in editable mode. This ensures that:

  • the package is available directly from the source tree

  • distribution metadata (dist-info) is created
  • importlib.metadata works correctly

Then easyscience is imported smoothly for me, and I would expect it to work on other platforms as well. I just tried to redo the main steps in a separate location with:

git clone https://github.com/easyscience/core.git
cd core
git checkout apply-new-templates
pixi install
pixi run post-install

And then the following produces no errors:

pixi run python -c "import easyscience"

And this one returns the correct version 2.2.0+dev37:

pixi run python -c "from importlib.metadata import version; v = version('easyscience'); print(v)"

The latter one, by the way, could be assigned to a new Pixi task, something like:

pixi run version
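Such a task could be declared roughly like this in pixi.toml (or under [tool.pixi.tasks] in pyproject.toml); the task name and table location are assumptions, not what the templates currently generate:

```toml
[tasks]
# Prints the version derived from the installed dist-info metadata.
version = "python -c \"from importlib.metadata import version; print(version('easyscience'))\""
```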

But to understand what is going wrong in your case, could you share more details about your setup and the steps you follow when the error happens?

Member

Thanks for the explanation. It is very detailed and maybe you should add it to the ADR, since it is also informative.

As for the error, I said

When importing from source

Usually, when I cd to src, I should be able to just import the given module, assuming all requirements are installed.
That is,

git clone easyscience
cd easyscience
pip install -r requirements.txt (assuming there is one!)
cd src
python
>>> import easyscience
>>> easyscience.__version__
'2.2.0'

This is a quick, install-free way of checking/running a module.

Now, this becomes impossible and will have to be explicitly mentioned in the installation docs.

Member

This is especially common when we want to quickly test things without installing the module, e.g. by adding

PYTHONPATH=<module>/src

Member Author

In the scenario you described, when you import directly from src/ without installing the package, why do you even need to check the version?

You cloned a specific branch or commit and you are importing from that source tree, so you already know what you are importing from, right? Why would the version be needed in this case?

Moreover, if this is not a production release, you would likely see the old hardcoded version from the previous release anyway. So that version string would not really reflect the current state of the code.

So I’m not sure what the benefit would be in this situation. Why is the version needed here at all?

Member Author

Looking at Scipp, they use setuptools_scm to provide dynamic versioning. Should we maybe use this standard build-tool instead of this introspection?

Well, I have no experience with setuptools_scm, but from a quick look it seems to do basically the same thing as what I am suggesting here with another tool called versioningit. They both:

  • extract Python package versions from Git metadata
  • produce versions for builds/installations automatically
  • avoid manually editing __version__.py or pyproject.toml for every release

setuptools-scm seems to be centered around setuptools, which is the build backend used in Scipp. In our case we use hatch/hatchling, where versioningit is smoothly integrated.

So I am wondering what the advantage would be of using setuptools-scm together with hatch?

Member Author

To manually change the version, you'd then just have to edit it on the Develop branch. It's really not that complicated . . . Yes, you could encounter issues with people committing to develop, overwriting this version, before the PR to Master is merged. But you'd have a merge-conflict then which the person would then on purpose have to resolve wrong. And also, we don't exactly expect Develop -> Master PR's to live for very long. They're usually merged in almost instantly.

This is what you are suggesting:

  1. “simply have a workflow which runs on PR creation on the master branch, which updates this hard-coded value, and commits it directly to develop”
  2. If the version needs to be corrected, “manually change the version … on the Develop branch”
  3. Make sure no one is “committing to develop, overwriting this version, before the PR to Master is merged”
  4. If this happens, “you’d have a merge-conflict … to resolve”

Each individual step may not look complicated, but we still need to implement, maintain and keep in mind all of this. And the question is: why?

With Git-based versioning all of this simply disappears and is not needed at all. The version is derived automatically from the repository history, so there is nothing to update manually and no extra workflow logic required to keep the version in sync. And, most importantly, you never forget to do something.

Member Author

One idea: update the file on tag creation. No manual commits, no bad timing issues.

- name: Update version from tag
  if: startsWith(github.ref, 'refs/tags/')
  run: |
    TAG=${GITHUB_REF#refs/tags/v}
    sed -i "s/__version__ = .*/__version__ = \"$TAG\"/" src/easyscience/__version__.py

On tag push the action takes the existing tag, removes v and overwrites the file in place.
We can place this step after actions/checkout but before any build/publish steps.

So in our workflow, this means:

  1. I click Publish release
  2. GitHub creates/pushes the tag
  3. this workflow starts
  4. it modifies src/easyscience/__version__.py

But at this point the tag already points to the previous commit, not to the commit with the updated __version__.py.

So the version file would be changed after tagging, which means the source code and the tag would be out of sync. In addition, the source artifacts attached to this release on the GitHub page would still contain the old __version__.py.

This is exactly why Git-based versioning is cleaner: the version is derived from the tag itself, so there is no need to patch source files after tagging.

Member Author

If it is in one, well defined place, like __version__.py or pyproject.toml, then no manual search is necessary. One location is all that's needed.

But what if you contribute to multiple projects with different rules? It is not always easy to keep in mind how each project handles the version.

In that case, asking Git for the version is not really more difficult than remembering where a particular project stores its version file.

And after all, what is the strong need to inspect this version? As I mentioned above, if you clone a specific branch or commit and import from that source tree, you already know what you are importing from, right? Why would the version be needed in this case?

Member

OK.

Member

@rozyczko rozyczko left a comment

LGTM

