Skip to content

Conversation

@ZedThree
Copy link
Member

@ZedThree ZedThree commented Dec 5, 2025

We have historically been reluctant to add too many external dependencies, but with CMake and spack making it easier to pull them in and build them, I thought I'd try this library for generating stack traces.

Two useful things:

  1. it's very portable out of the box, so we don't need to rely on external utilities like addr2line and instead can always enable backtraces
  2. we can print snippets:
#3  in checkData(double)
    at /home/peter/Codes/BOUT++/BOUT-dev/include/bout/utils.hxx:493:65
     491: inline void checkData(BoutReal f) {
     492:   if (!std::isfinite(f)) {
   > 493:     throw BoutException("BoutReal: Operation on non-finite data");
                                                                          ^
     494:   }
     495: }
#4  in operator*(Field3D const&, double)
    at /home/peter/Codes/BOUT++/BOUT-dev/src/field/generated_fieldops.cxx:480:12
     478:   Field3D result{emptyFrom(lhs)};
     479:   checkData(lhs);
   > 480:   checkData(rhs);
                     ^
     481: 
     482:   result.setRegion(lhs.getRegionID());
#5  in Blob2D::rhs(double)
    at /home/peter/Codes/BOUT++/BOUT-dev/examples/blob2d/blob2d.cxx:177:41
     176:     ddt(n) = -bracket(phi, n, BRACKET_ARAKAWA) // ExB term
   > 177:              + 2 * DDZ(n) * (rho_s / R_c)      // Curvature term
                                                  ^
     178:              + D_n * Delp2(n);                 // Diffusion term
     179:     if (compressible) {

and can even enable colour:

image

It also has its own signal handler we could perhaps use instead of ours (though I've not tested this out).

Lastly, this is maybe a complete enough solution that we could also remove MsgStack/TRACE/AUTO_TRACE?

@ZedThree
Copy link
Member Author

Lots of CMake pain here

@bendudson
Copy link
Contributor

Thanks @ZedThree ! This does look nice, if the CMake pain can be overcome. I think if it works portably then it could replace all the MsgStack / AUTOTRACE code.

ZedThree and others added 3 commits January 5, 2026 16:43
Required because cpptrace prevents in-source builds

Also switch to using `subprocess.run` so we can better see errors
Seems to cause MPI problems?
@ZedThree
Copy link
Member Author

ZedThree commented Jan 6, 2026

I've removed AUTO_TRACE completely, and TRACE where it was just essentially repeating the function name (or something like intialising PETSc solver in PetscSolver::init).

That leaves uses like TRACE("PetscMatrix setting values at ({}, {})", petscRow, petscCol) which include function arguments or local variables that the stacktrace won't capture. I think they likely have some value for debugging, but quite limited?

Same goes for uses like TRACE("Creating Outer SOL communicators for Single Null operation") in BoutMesh. I've left them in for now, but I'm not sure how much more information they offer over the exception message + stacktrace.

Here's an example with CMAKE_BUILD_TYPE=RelWithDebInfo:

Error encountered: Stack trace (most recent call first):
#0 (filtered)
#1 (filtered)
#2 (filtered)
#3 in BoutMesh::createCommunicators()
   at /home/peter/Codes/BOUT++/BOUT-dev/src/mesh/impls/bout/boutmesh.cxx:709:64
     707:       }
     708:       // Unconditional exception for demo purposes
   > 709:       throw BoutException("Single null outer SOL not correct\n");      
                                                                         ^
     710:       MPI_Group_free(&group);
     711:     }
#4 in BoutMesh::load()
   at /home/peter/Codes/BOUT++/BOUT-dev/src/mesh/impls/bout/boutmesh.cxx:575:22
     573:   /// Communicator
     574: 
   > 575:   createCommunicators();
                               ^
     576:   output_debug << "Got communicators" << endl;
#5 in BoutInitialise(int&, char**&)
   at /home/peter/Codes/BOUT++/BOUT-dev/src/bout++.cxx:201:28
     199:   bout::globals::mesh = Mesh::create();
     200:   // Load from sources. Required for Field initialisation
   > 201:   bout::globals::mesh->load();
                                     ^
     202: 
     203:   // time_report options are used in BoutFinalise, i.e. after we
#6 in main
   at /home/peter/Codes/BOUT++/BOUT-dev/examples/elm-pb/elm_pb.cxx:2161:1
    2159: };
    2160: 
  > 2161: BOUTMAIN(ELMpb);
          ^
#7 (filtered)
#8 (filtered)
#9 (filtered)

====== Exception thrown ======
Single null outer SOL not correct


=== Additional information ===
 -> Creating Outer SOL communicators for Single Null operation on line 687 of '/home/peter/Codes/BOUT++/BOUT-dev/src/mesh/impls/bout/boutmesh.cxx'

And here's what it looks like with CMAKE_BUILD_TYPE=Release (no debug info):

Error encountered: Stack trace (most recent call first):
#0 (filtered)
#1 (filtered)
#2 (filtered)
#3 in BoutMesh::createCommunicators() [clone .cold]
   at /home/peter/Codes/BOUT++/BOUT-dev/build_release/lib/libbout++.so.5.2.0
#4 in BoutMesh::load()
   at /home/peter/Codes/BOUT++/BOUT-dev/build_release/lib/libbout++.so.5.2.0
#5 in BoutInitialise(int&, char**&)
   at /home/peter/Codes/BOUT++/BOUT-dev/build_release/lib/libbout++.so.5.2.0
#6 in main
   at /home/peter/Codes/BOUT++/BOUT-dev/examples/elm-pb/build/elm_pb
#7 (filtered)
#8 (filtered)
#9 (filtered)

====== Exception thrown ======
Single null outer SOL not correct


=== Additional information ===
 -> Creating Outer SOL communicators for Single Null operation on line 687 of '/home/peter/Codes/BOUT++/BOUT-dev/src/mesh/impls/bout/boutmesh.cxx'

Clearly the TRACE call is more useful when we don't have debug info, but if we recommend always building with it, it's probably not that useful?

@bendudson Thoughts on completely removing MsgStack? Maybe in a separate PR so we can more easily revert it if we change our minds?


CUDA tests are failing due to #3233

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

bout::globals::mesh = Mesh::create();
// Load from sources. Required for Field initialisation
bout::globals::mesh->load();
bout::globals::mpi = new MpiWrapper();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: assigning newly created 'gsl::owner<>' to non-owner 'MpiWrapper *' [cppcoreguidelines-owning-memory]

  bout::globals::mpi = new MpiWrapper();
  ^

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


#include <fmt/base.h>
#include <fmt/format.h>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: included header format.h is not used directly [misc-include-cleaner]

Suggested change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants