steno is a personal project to digitize my stenographic writings.
In my youth I wrote most of my notes in shorthand (Duployan system). This project aims to digitize those old notes and integrate them into my current note system (org-roam).
The method has two steps:
- scan the notes page by page
- transform each image into a text file
Another goal of this project is to check the viability of Basilisp (a Clojure dialect that runs on the Python VM). Python has many interesting libraries (especially for scientific computing) but has horrible syntax, so Basilisp could be a very good solution.
The application should be a simple pipeline:
- extractor
  - split the page image into word images
- image-processor
  - clean and simplify the word image
- converter
  - convert the word image into a sequence of numbers
- translator
  - convert the number sequence into a string of characters
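The four stages above can be sketched as plain functions chained together. This is only a toy illustration in Python, not the project's Basilisp code: the `Image` representation (rows of 0/1 pixels), the `pipe` helper, the body of every stage, and the number-to-character table are all assumptions made up for the example. A real image-processor would more likely rely on a library such as OpenCV.

```python
from functools import reduce
from typing import List

Image = List[List[int]]  # rows of 0/1 pixels, 1 = ink (assumed toy representation)


def extractor(page: Image) -> List[Image]:
    """Split a page into word images at blank columns (toy stand-in)."""
    width = len(page[0]) if page else 0
    inked = [any(row[x] for row in page) for x in range(width)]
    words, start = [], None
    for x, has_ink in enumerate(inked + [False]):  # sentinel closes the last word
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            words.append([row[start:x] for row in page])
            start = None
    return words


def image_processor(words: List[Image]) -> List[Image]:
    """Clean each word image; here this only drops blank rows."""
    return [[row for row in word if any(row)] for word in words]


def converter(words: List[Image]) -> List[List[int]]:
    """Encode each word image as numbers; here the ink count per column."""
    return [[sum(col) for col in zip(*word)] for word in words]


def translator(codes: List[List[int]]) -> str:
    """Map each number to a character through a made-up table."""
    table = {1: "a", 2: "b", 3: "c"}  # hypothetical, not a Duployan mapping
    return " ".join("".join(table.get(n, "?") for n in word) for word in codes)


def pipe(*stages):
    """Compose stages left to right: the output of one feeds the next."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


steno = pipe(extractor, image_processor, converter, translator)

page = [
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
print(steno(page))  # prints "ba ab"
```

Keeping each stage a function from one value to the next means a stage can be replaced or tested on its own without touching the others.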
- Install
  - direnv
  - nix (https://nixos.org/download/#nix-install-linux)
- Create an .envrc.local file (see .envrc.local.example).
- In the project folder run:

      direnv allow

  The first time it will be a long process to download all packages and libraries.
- Install
  - python 3.12+
  - uv (https://docs.astral.sh/uv/getting-started/installation/)
  - babashka
  - cljfmt
  - clj-kondo
- Manually create the user variables defined in .envrc.local.example.
- In the project folder run:

      uv venv
      uv sync
The application can be run with the command:

    bb app <params>

To see the available params run:

    bb app -h

To format the code run:

    bb format

To lint the code run:

    bb kondo

To make sure that no commits with formatting or lint errors end up in the main branch, run this once after cloning:

    git config --local core.hooksPath ./githooks

The pre-push script will block the push if there are style or lint errors in the code.
- https://en.wikipedia.org/wiki/Duployan_shorthand
- https://opencv.org/
- https://theailearner.com/tag/skeletonization-opencv/
- https://github.com/Wesley-Li/skeleton
- https://docs.opencv.org/4.x/d9/d61/tutorial_py_morphological_ops.html
This project is released under the GNU General Public License. See the file for details.