Sentinel

Sentinel is a GNN-based tool for detecting NPM supply chain attacks. It analyzes package metadata across dependency networks to flag potentially malicious packages. It trains on real-world NPM data and uses synthetic attack simulations to learn what suspicious activity looks like.

Features

GNN-based detection - Trains a Graph Neural Network on 41 package features (scripts, maintainers, downloads, naming patterns, dependency relationships, and more)
Attack simulation - Generates realistic coordinated attacks (maintainer compromise, dependency injection, script injection) to test model performance
Real-time monitoring - Polls any NPM package and its full dependency tree for updates, with optional threat detection on changes
Data collection - Crawls the NPM registry and stores full dependency networks in a Neo4j graph database
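
As a rough illustration of the real-time monitoring feature, the sketch below compares the latest version of each package in a dependency tree against the version seen on the previous poll. It is not Sentinel's actual code: `detect_version_changes`, `fetch_latest_version`, and the version numbers are hypothetical, and the registry lookup is injected as a plain function so the example runs without network access.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# (package name, previously seen version or None, latest version)
Change = Tuple[str, Optional[str], str]


def detect_version_changes(
    package_names: Iterable[str],
    last_seen: Dict[str, str],
    fetch_latest_version: Callable[[str], str],
) -> List[Change]:
    """Return packages whose latest version differs from the last poll.

    `last_seen` is updated in place so the next poll only reports new changes.
    """
    changes: List[Change] = []
    for name in package_names:
        latest = fetch_latest_version(name)
        previous = last_seen.get(name)  # None for packages not seen before
        if previous != latest:
            changes.append((name, previous, latest))
            last_seen[name] = latest
    return changes


# Hypothetical data standing in for registry responses.
last_seen = {"lodash": "4.17.20", "left-pad": "1.3.0"}
latest_versions = {"lodash": "4.17.21", "left-pad": "1.3.0", "chalk": "5.3.0"}
tree = ["lodash", "left-pad", "chalk"]

first_poll = detect_version_changes(tree, last_seen, latest_versions.__getitem__)
second_poll = detect_version_changes(tree, last_seen, latest_versions.__getitem__)
print(first_poll)
print(second_poll)
```

In a monitoring loop, only the packages returned by each poll would need to be passed to the model for threat detection.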

Quick Start

Setup

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run the CLI

python sentinel.py

This opens an interactive menu with all options: data collection, training, evaluation, attack simulation, and live monitoring.

Run the model directly

# Train
python app.py --train --use-real-data

# Evaluate
python app.py --eval --use-real-data

The --use-real-data flag loads data from the Neo4j database. Without it, the model uses local JSON sample files.

Run data crawlers

python data_crawler.py --samples <packages_file.txt>
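
For orientation, a crawl of this kind can be pictured as a breadth-first walk over dependency trees with a visited set. The sketch below is a minimal illustration, not the contents of data_crawler.py: the function names are hypothetical, the package file is assumed to hold one name per line (with `#` comment lines, also an assumption), the in-memory set stands in for the Redis cache of visited nodes, and the registry lookup is injected as a plain function instead of a real HTTP request.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set


def read_package_list(lines: Iterable[str]) -> List[str]:
    """Return package names, skipping blank lines and '#' comments."""
    names = []
    for line in lines:
        name = line.strip()
        if name and not name.startswith("#"):
            names.append(name)
    return names


def crawl_dependency_trees(
    seeds: Iterable[str],
    fetch_dependencies: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Breadth-first crawl; returns package name -> direct dependencies."""
    visited: Set[str] = set()  # stand-in for a shared cache such as Redis
    queue = deque(seeds)
    graph: Dict[str, List[str]] = {}
    while queue:
        name = queue.popleft()
        if name in visited:
            continue  # already crawled via another path
        visited.add(name)
        deps = fetch_dependencies(name)
        graph[name] = deps
        queue.extend(dep for dep in deps if dep not in visited)
    return graph


# Hypothetical in-memory registry used in place of real registry requests.
FAKE_REGISTRY = {
    "app": ["left-pad", "lodash"],
    "left-pad": ["lodash"],
    "lodash": [],
}

seeds = read_package_list(["# seed packages", "app", ""])
dependency_graph = crawl_dependency_trees(
    seeds, lambda name: FAKE_REGISTRY.get(name, [])
)
print(dependency_graph)
```

The visited check is what keeps shared dependencies (here `lodash`) from being fetched more than once.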

Run tests

pytest -q --disable-warnings

How it works

Data collection - Crawls NPM registry packages and their full dependency trees, storing everything in Neo4j with Redis caching for visited nodes
Graph construction - Converts package data into a PyTorch Geometric graph where packages are nodes and dependencies are edges
Training - Feeds the graph into a 2-layer GCN with skip connections. Tracks the loss during training to check that the model is learning general patterns rather than memorizing the training data
Evaluation - Tests the trained model on unseen data and reports accuracy, precision, recall, and F1
Monitoring - Polls NPM for version changes in a dependency tree and runs the model against any updated packages
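
The graph construction step can be sketched in plain Python. This is an illustration under stated assumptions, not the code in model/build_graph.py: the record fields (`has_install_script`, `maintainers`) and the three features are hypothetical stand-ins for the 41 real ones. It builds a node feature matrix and an edge list in the two-row source/target (COO) layout that PyTorch Geometric uses for `edge_index`.

```python
from typing import Dict, List, Tuple

# Hypothetical package records; field names are illustrative only.
packages = {
    "app": {
        "dependencies": ["lodash", "left-pad"],
        "has_install_script": False,
        "maintainers": 3,
    },
    "left-pad": {
        "dependencies": ["lodash"],
        "has_install_script": True,
        "maintainers": 1,
    },
    "lodash": {"dependencies": [], "has_install_script": False, "maintainers": 2},
}


def build_graph(
    packages: Dict[str, dict],
) -> Tuple[List[str], List[List[float]], List[List[int]]]:
    """Return (node_names, features, edge_index).

    Nodes are packages; each edge points from a package to a dependency.
    edge_index is [sources, targets], one column per edge.
    """
    node_names = sorted(packages)  # stable node ordering
    index = {name: i for i, name in enumerate(node_names)}

    features = [
        [
            float(packages[name]["has_install_script"]),
            float(packages[name]["maintainers"]),
            float(len(packages[name]["dependencies"])),
        ]
        for name in node_names
    ]

    sources: List[int] = []
    targets: List[int] = []
    for name in node_names:
        for dep in packages[name]["dependencies"]:
            if dep in index:  # ignore dependencies outside the crawled set
                sources.append(index[name])
                targets.append(index[dep])
    return node_names, features, [sources, targets]


node_names, features, edge_index = build_graph(packages)
print(node_names)
print(features)
print(edge_index)
```

With PyTorch Geometric installed, these lists could be wrapped as `torch_geometric.data.Data(x=torch.tensor(features), edge_index=torch.tensor(edge_index, dtype=torch.long))` before being passed to a model.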

Project structure

sentinel.py        - Main CLI application
app.py             - Direct model training/evaluation interface
data_crawler.py    - NPM registry crawlers
db.py              - Neo4j database layer
cache.py           - Redis caching layer
helpers.py         - Attack simulations and synthetic data generation
model/
  model.py         - GNN model (training, evaluation, save/load)
  build_graph.py   - Graph construction and feature extraction
samples/
  benign/          - Benign package datasets
  malicious/       - Malicious/attack simulation datasets
  top*packages.txt - Package lists for data collection
tests/             - Automated test suite
static/            - Diagrams and screenshots
