Welcome to the DSPS Analyse team's Data Science repository! This repository stores all of our Data Science projects.
This should be a frictionless installation process that works on various operating systems (macOS, Linux, Windows WSL) and handles all the dependencies.
Clone the repository (SSH)
git clone git@github.com:NHSDigital/dtos-analyse-data-science.git
The following software packages, or their equivalents, are expected to be installed and configured:
- Docker container runtime or a compatible tool, e.g. Podman,
- GNU make 3.82 or later,
- pip package manager for Python
Note
The version of GNU make available by default on macOS is earlier than 3.82. You will need to upgrade it or certain make tasks will fail. To do so, install Homebrew, then use it to install make:
brew install make
You will then see instructions to fix your $PATH variable to make the newly installed version available. If you are using dotfiles, this is all done for you.
- GNU sed and GNU grep are required for the scripted command-line output processing,
- GNU coreutils and GNU binutils may be required to build dependencies like Python, which may need to be compiled during installation,
Note
For macOS users, installation of the GNU toolchain has been scripted and automated as part of the dotfiles project. Please see this script for details.
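The prerequisites listed above can be checked before running any make tasks. The following is a minimal sketch, not part of the repository: `version_ge` and `check_prereqs` are hypothetical helper names, the check assumes a `sort` that supports `-V` (as GNU coreutils does), and it assumes the tools are on your $PATH under their usual names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not shipped with this repository.

# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: print each missing prerequisite and return non-zero if any is missing.
check_prereqs() {
  status=0
  # Either Docker or a compatible tool such as Podman is acceptable.
  if ! command -v docker >/dev/null 2>&1 && ! command -v podman >/dev/null 2>&1; then
    echo "missing: docker or podman"
    status=1
  fi
  for tool in make pip sed grep; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; status=1; }
  done
  # GNU make must be 3.82 or later.
  if command -v make >/dev/null 2>&1; then
    make_version="$(make --version | head -n 1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
    if ! version_ge "$make_version" 3.82; then
      echo "GNU make $make_version is older than 3.82"
      status=1
    fi
  fi
  return "$status"
}
```

After sourcing the snippet, run `check_prereqs`. It only confirms that the tools are present and that make is recent enough; it does not verify that sed and grep are the GNU variants.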
Installation and configuration of the toolchain dependencies
make config
Inside the projects folder, each project has its own README, which explains the purpose of the project and how to install and run it.
To run tests on your local branch (these are the same tests that run automatically on commit, and remotely on GitHub):
make githooks-run
Each project in the projects folder is self-contained, with its own README, pyproject.toml and Dockerfile. Projects can be developed on local machines, using Poetry for virtual environments and for package and dependency management. Alternatively, Podman or Docker can be used to run the scripts.
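As an illustration of that layout, the sketch below lists the projects that carry a pyproject.toml and then outlines, in comments, how one of them might be set up. `list_projects` is a hypothetical helper and `my-project` is a placeholder name; neither comes from the repository, and each project's own README remains the authoritative source for its commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; it is not shipped with this repository.

# list_projects ROOT: print the name of each directory under ROOT/projects
# that contains a pyproject.toml. ROOT defaults to the current directory.
list_projects() {
  root="${1:-.}"
  for dir in "$root"/projects/*/; do
    if [ -f "${dir}pyproject.toml" ]; then
      basename "$dir"
    fi
  done
  return 0
}

# Possible local workflow for one project ("my-project" is a placeholder):
#   cd projects/my-project
#   poetry install              # create the virtual environment and install dependencies
#   poetry run python --version # run a command inside that environment
#
# Container alternative, assuming the Dockerfile sits in the project root
# (podman accepts the same arguments):
#   docker build -t my-project .
#   docker run --rm my-project
```

Run `list_projects` from the repository root to see which projects are available before choosing one to install.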
Contact screening-team-analyse-data-science on Slack
Unless stated otherwise, the codebase is released under the MIT License. This covers both the codebase and any sample code in the documentation.
Any HTML or Markdown documentation is © Crown Copyright and available under the terms of the Open Government Licence v3.0.