A GNN-based NPM supply chain attack detection tool. Sentinel analyzes package metadata across dependency networks to flag potentially malicious packages. It trains on real-world NPM data and uses synthetic attack simulations to learn what suspicious activity looks like.
- GNN-based detection - Trains a Graph Neural Network on 41 package features (scripts, maintainers, downloads, naming patterns, dependency relationships, and more)
- Attack simulation - Generate realistic coordinated attacks (maintainer compromise, dependency injection, script injection) to test model performance
- Real-time monitoring - Poll any NPM package and its full dependency tree for updates, with optional threat detection on changes
- Data collection - Crawl the NPM registry and store full dependency networks in a Neo4j graph database
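
To make the attack simulation idea concrete, here is a minimal sketch of how a coordinated attack could be injected into a set of package records. It is not the code in helpers.py: the function names, the dictionary fields (`scripts`, `maintainers`, `label`), and the injected values are all hypothetical, chosen only to illustrate maintainer compromise and script injection applied to the same targets.

```python
import copy
import random


def simulate_maintainer_compromise(package, rng):
    """Return a copy of `package` whose maintainers were replaced (hypothetical fields)."""
    attacked = copy.deepcopy(package)
    # Assumption: a compromised package shows a single, previously unseen maintainer.
    attacked["maintainers"] = [f"user{rng.randint(1000, 9999)}"]
    attacked["label"] = 1
    return attacked


def simulate_script_injection(package):
    """Return a copy of `package` with an install-time script added (hypothetical fields)."""
    attacked = copy.deepcopy(package)
    # Assumption: the injected hook is a postinstall script; the command is a placeholder.
    attacked.setdefault("scripts", {})["postinstall"] = "node ./setup.js"
    attacked["label"] = 1
    return attacked


def simulate_coordinated_attack(packages, fraction, seed=0):
    """Attack a random fraction of `packages` (name -> record); label the rest benign."""
    rng = random.Random(seed)
    count = max(1, int(len(packages) * fraction))
    targets = set(rng.sample(sorted(packages), count))
    result = {}
    for name, record in packages.items():
        if name in targets:
            compromised = simulate_maintainer_compromise(record, rng)
            result[name] = simulate_script_injection(compromised)
        else:
            benign = copy.deepcopy(record)
            benign["label"] = 0
            result[name] = benign
    return result
```

Working on deep copies keeps the benign records untouched, so the same input can be reused for several simulated attacks with different seeds.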
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

python sentinel.py

This opens an interactive menu with all options: data collection, training, evaluation, attack simulation, and live monitoring.
# Train
python app.py --train --use-real-data
# Evaluate
python app.py --eval --use-real-data

--use-real-data pulls from the Neo4j database. Without it, the model uses local JSON sample files.
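
The flag behavior described above could be wired up roughly as follows. This is a sketch under assumptions, not the contents of app.py: `build_parser`, `data_source`, and the returned source names are hypothetical, and only the three flags shown in the commands are taken from this README.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Train or evaluate the model (sketch)")
    parser.add_argument("--train", action="store_true", help="train the model")
    parser.add_argument("--eval", action="store_true", help="evaluate the model")
    parser.add_argument(
        "--use-real-data",
        action="store_true",
        help="load packages from Neo4j instead of local JSON samples",
    )
    return parser


def data_source(args):
    """Pick the data source: Neo4j when --use-real-data is set, JSON samples otherwise."""
    # Assumption: the two source names are placeholders for the real loaders.
    return "neo4j" if args.use_real_data else "json_samples"


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    print(f"train={parsed.train} eval={parsed.eval} source={data_source(parsed)}")
```

Because `--use-real-data` is a `store_true` flag, omitting it defaults to the local samples, which matches the fallback described above.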
python data_crawler.py --samples <packages_file.txt>

pytest -q --disable-warnings

- Data collection - Crawls NPM registry packages and their full dependency trees, storing everything in Neo4j with Redis caching for visited nodes
- Graph construction - Converts package data into a PyTorch Geometric graph where packages are nodes and dependencies are edges
- Training - Feeds the graph into a 2-layer GCN with skip connections. Tracks loss to make sure the model is learning, not memorizing
- Evaluation - Tests the trained model on unseen data and reports accuracy, precision, recall, and F1
- Monitoring - Polls NPM for version changes in a dependency tree and runs the model against any updated packages
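
Two of the steps above, graph construction and evaluation, can be sketched in plain Python. This is an illustration under assumptions rather than the code in model/: `build_edge_index` and `classification_metrics` are hypothetical names, and the input is assumed to be a mapping from package name to its list of dependency names. The edge list is laid out as two parallel rows (sources, targets), which is the `[2, num_edges]` shape PyTorch Geometric expects for `edge_index`.

```python
def build_edge_index(packages):
    """Map package names to node ids and return dependency edges as [sources, targets]."""
    # Include dependencies that never appear as keys, so every edge endpoint is a node.
    names = sorted(set(packages) | {dep for deps in packages.values() for dep in deps})
    index = {name: i for i, name in enumerate(names)}
    sources, targets = [], []
    for name, deps in packages.items():
        for dep in deps:
            sources.append(index[name])  # edge runs from the dependent package...
            targets.append(index[dep])   # ...to the package it depends on
    return index, [sources, targets]


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = malicious)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

The nested list returned by `build_edge_index` can be passed to `torch.tensor(..., dtype=torch.long)` to obtain a PyTorch Geometric `edge_index`. Malicious packages are usually rare, so precision and recall tend to say more than accuracy alone.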
sentinel.py - Main CLI application
app.py - Direct model training/evaluation interface
data_crawler.py - NPM registry crawlers
db.py - Neo4j database layer
cache.py - Redis caching layer
helpers.py - Attack simulations and synthetic data generation
model/
  model.py - GNN model (training, evaluation, save/load)
  build_graph.py - Graph construction and feature extraction
samples/
  benign/ - Benign package datasets
  malicious/ - Malicious/attack simulation datasets
  top*packages.txt - Package lists for data collection
tests/ - Automated test suite
static/ - Diagrams and screenshots
- Python, PyTorch Geometric, Rich, Neo4j, Redis, Pytest



