Open

4.0 #402

621 commits
7920c78
kaiming init
jsuarez5341 Mar 3, 2026
dd8f639
delete ortho init. Speculative
jsuarez5341 Mar 3, 2026
b27de98
cleanup muon
jsuarez5341 Mar 4, 2026
fc9f98c
Initial pufferl refactor training
jsuarez5341 Mar 5, 2026
649826a
update log format
jsuarez5341 Mar 6, 2026
2fb2fb7
sweep keys
jsuarez5341 Mar 6, 2026
760c6e4
dtype fix
jsuarez5341 Mar 6, 2026
426d6e8
tsnee
jsuarez5341 Mar 6, 2026
c6e3fc6
Fix rare norm bug
jsuarez5341 Mar 6, 2026
81bdfc6
constellation fixes
jsuarez5341 Mar 7, 2026
fbacc09
prevent tooltip drawing offscreen
jsuarez5341 Mar 7, 2026
0d82a03
pong
jsuarez5341 Mar 7, 2026
30a5a21
Begin refactor constellation
jsuarez5341 Mar 7, 2026
cd0fa4c
clean ui
jsuarez5341 Mar 7, 2026
a3c3edf
UI cleanup
jsuarez5341 Mar 7, 2026
9e6e431
temp fix color scale
jsuarez5341 Mar 8, 2026
fc09814
merge precision_t kernels and prune dead code
jsuarez5341 Mar 8, 2026
069f3df
purge check macros
jsuarez5341 Mar 10, 2026
fd28ef8
delete transpose indirection
jsuarez5341 Mar 10, 2026
4fbbc36
Initial cudnn conv + nmmo encoder
jsuarez5341 Mar 10, 2026
d2bbec6
refactor kernels
jsuarez5341 Mar 12, 2026
2966410
temp determ fix
jsuarez5341 Mar 12, 2026
dd4c184
refactor muon -> simplify, keep more ops in precision_t. Changes nume…
jsuarez5341 Mar 12, 2026
8c4117e
merge grad clip into muon
jsuarez5341 Mar 12, 2026
3063ff1
refactor
jsuarez5341 Mar 13, 2026
403cec0
more refactor
jsuarez5341 Mar 13, 2026
dff4b0e
more refactors
jsuarez5341 Mar 13, 2026
5f7784d
nccl bind
jsuarez5341 Mar 13, 2026
e5a0139
Stable multigpu
jsuarez5341 Mar 13, 2026
770c270
minor refactor
jsuarez5341 Mar 13, 2026
c8be914
:qMerge branch 'static-native' of https://github.com/pufferai/pufferl…
jsuarez5341 Mar 13, 2026
432e4f1
more refactor
jsuarez5341 Mar 14, 2026
90512a8
minor
jsuarez5341 Mar 14, 2026
29d746a
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 14, 2026
f0f4df6
cursed merge fix
jsuarez5341 Mar 14, 2026
6be557c
Merge pull request #498 from PufferAI/static-native
jsuarez5341 Mar 14, 2026
695b116
remove structs file
jsuarez5341 Mar 14, 2026
28b0ad1
zero the damn rewards and terms for you
jsuarez5341 Mar 14, 2026
7901b82
Per-type tensors = minus a ton of casts. Had to get obs_dtype from sy…
jsuarez5341 Mar 14, 2026
27b45db
Small refactors
jsuarez5341 Mar 15, 2026
b940322
nmmo fixes - maybe will train?
jsuarez5341 Mar 15, 2026
38b6821
tensor
jsuarez5341 Mar 15, 2026
dd38699
initial port of python backend to match latest cuda. breakout trains.…
jsuarez5341 Mar 16, 2026
752b95e
binding files
jsuarez5341 Mar 17, 2026
7483468
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 17, 2026
e59b6a7
blah
l1onh3art88 Mar 17, 2026
2d9a7c7
nmmo3 sota (pretty sure) python only
jsuarez5341 Mar 17, 2026
29a93be
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 17, 2026
4dbddd4
init scale changes
jsuarez5341 Mar 17, 2026
f9ac3af
cuda cache
jsuarez5341 Mar 17, 2026
fe83cfa
fix threadlocal
jsuarez5341 Mar 17, 2026
a50d9ff
cursed conv
jsuarez5341 Mar 17, 2026
be6a29c
cursed cudnn
jsuarez5341 Mar 17, 2026
15667a9
stupid idiot fallback
jsuarez5341 Mar 18, 2026
3aed630
Working nmmo3! +muon scale changes, but main diff was just stupid im2…
jsuarez5341 Mar 18, 2026
d99dfc8
better sweep caching
jsuarez5341 Mar 19, 2026
550ff94
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Mar 19, 2026
fccba07
bfloat atns
jsuarez5341 Mar 19, 2026
8adcc9e
fix determinism
jsuarez5341 Mar 19, 2026
b936162
minor cleanup
jsuarez5341 Mar 20, 2026
0336926
Merge pull request #501 from PufferAI/bfloatatns
jsuarez5341 Mar 20, 2026
91e4ce9
minor refactor
jsuarez5341 Mar 20, 2026
d9241e6
Merge branch 'PufferAI:4.0' into 4.0
l1onh3art88 Mar 20, 2026
8d2752e
refactors
jsuarez5341 Mar 20, 2026
1674bea
forgot binds
jsuarez5341 Mar 20, 2026
eeddceb
Temp commit. Fixed determinism w/ rng but bloated
jsuarez5341 Mar 20, 2026
6bd83a9
Remove per-thread cublas handles. It is bloat and deterministic witho…
jsuarez5341 Mar 20, 2026
43d603f
Refactor - remove decoder bias
jsuarez5341 Mar 20, 2026
f26d98e
small refactors
jsuarez5341 Mar 21, 2026
2ff0f8d
nmmo3 float
jsuarez5341 Mar 21, 2026
275f492
quick nmmo fix
jsuarez5341 Mar 21, 2026
df4fe4c
sweep config
jsuarez5341 Mar 21, 2026
54822f0
fix axis cutoff
jsuarez5341 Mar 21, 2026
b45526f
sweep settings
jsuarez5341 Mar 21, 2026
94f9256
Fix python version (needed float32 atns)
jsuarez5341 Mar 23, 2026
982d0d4
local changes to make it work
Mar 23, 2026
9982626
working sweeps
Mar 23, 2026
bdf8ad0
Initial python refactor
jsuarez5341 Mar 23, 2026
289f1e3
Drop vecenv, emulation, gymansium, pz
jsuarez5341 Mar 23, 2026
f5dfce7
Move everything
jsuarez5341 Mar 23, 2026
0222988
easy config
Mar 23, 2026
0979617
Move configs
jsuarez5341 Mar 23, 2026
2d70d9b
Remove trash
jsuarez5341 Mar 23, 2026
2ce8c42
remove definitely dead tests
jsuarez5341 Mar 23, 2026
11586a2
Delete torch ext crap
jsuarez5341 Mar 23, 2026
d2471de
dead scripts
jsuarez5341 Mar 23, 2026
5200492
setup cleanup
jsuarez5341 Mar 23, 2026
4ba3ba4
cleanup torch models
jsuarez5341 Mar 23, 2026
5c811c6
small fixes
jsuarez5341 Mar 23, 2026
e72370b
cleanups
jsuarez5341 Mar 23, 2026
a3c1a90
delete more
jsuarez5341 Mar 23, 2026
a030af8
minor
jsuarez5341 Mar 23, 2026
fbd52ce
drop no build isolation
jsuarez5341 Mar 23, 2026
58b3fc9
uh forgot src
jsuarez5341 Mar 23, 2026
1292b81
toml license
jsuarez5341 Mar 23, 2026
032a61a
fix ocean
jsuarez5341 Mar 24, 2026
41469f3
pybind11?
jsuarez5341 Mar 24, 2026
0aa5bdf
khr compile fix
jsuarez5341 Mar 24, 2026
f51cb87
build fixes for ocean
jsuarez5341 Mar 24, 2026
e137d95
Update manifest
jsuarez5341 Mar 24, 2026
a320b24
fuck you setup.py!
jsuarez5341 Mar 24, 2026
88d0e20
Nice simple build script!
jsuarez5341 Mar 24, 2026
f931f3f
single build script
jsuarez5341 Mar 24, 2026
5b5c217
Some refactors, needs more work
jsuarez5341 Mar 24, 2026
5345067
Old extensions
jsuarez5341 Mar 24, 2026
c3c0de8
.ini
Mar 24, 2026
ff8e6c6
adjust scoring metrics
Mar 24, 2026
a1f84ab
adjust scoring metrics
Mar 24, 2026
3cc928d
remove locally changed files
Mar 24, 2026
f95a39b
revert nmmo3.ini
Mar 24, 2026
9caed86
revert toml changes
Mar 24, 2026
1266be1
threads
Mar 24, 2026
eb4c17c
Jonah's safe_logit
jsuarez5341 Mar 24, 2026
d7b33a4
Initial profile update
jsuarez5341 Mar 24, 2026
713c659
profile updates
jsuarez5341 Mar 24, 2026
3268630
load map changes to remove duplition on lots of inits and latest ini …
Mar 24, 2026
b4badb3
Update profiling
jsuarez5341 Mar 24, 2026
c79ed99
delete old profile
jsuarez5341 Mar 24, 2026
8ecc862
refactor
jsuarez5341 Mar 24, 2026
9c7bd6c
Move more stuff around
jsuarez5341 Mar 24, 2026
94511ba
fix eval
jsuarez5341 Mar 24, 2026
c1c31a2
refactor errors
jsuarez5341 Mar 24, 2026
f2008d2
Log frequency
jsuarez5341 Mar 24, 2026
a120550
latest
l1onh3art88 Mar 25, 2026
4d2787f
100x data load speed, severl fixes
jsuarez5341 Mar 25, 2026
8c17929
filter fig4
jsuarez5341 Mar 25, 2026
512fd3a
prune old code
jsuarez5341 Mar 25, 2026
cbc13d6
constellation build
jsuarez5341 Mar 25, 2026
eb93927
prune trash
jsuarez5341 Mar 26, 2026
bb15a59
move stuff around a bit
jsuarez5341 Mar 26, 2026
4e0c951
CPU fallback for mac scrubs
jsuarez5341 Mar 26, 2026
6bbcdf0
Fix hardcoded CUDA path and stuff that installs cudnn dependency.
daphne-cornelisse Mar 28, 2026
9e23fa3
Fix OBS_TENSOR build error.
daphne-cornelisse Mar 28, 2026
c0f378c
Required 4.0 env changes for drive.
daphne-cornelisse Mar 28, 2026
86c8faf
Safeguard to prevent segfault if binaries are not stored at the right…
daphne-cornelisse Mar 28, 2026
dd7b2cb
don't try to pickle backend
jsuarez5341 Mar 29, 2026
90d08e0
delete old nv flag
jsuarez5341 Mar 30, 2026
2b665d3
Merge branch 'PufferAI:4.0' into 4.0
l1onh3art88 Mar 30, 2026
fc6800f
Trailer progress
jsuarez5341 Mar 30, 2026
cd4f19a
Decent progress
jsuarez5341 Mar 31, 2026
3d75952
Small timing fixes:
jsuarez5341 Mar 31, 2026
e74e33b
Merge branch 'PufferAI:4.0' into 4.0
l1onh3art88 Mar 31, 2026
ac977eb
Data processing script with instructions.
daphne-cornelisse Mar 31, 2026
f3647f7
Trailer + constellation shader updates
jsuarez5341 Mar 31, 2026
599594b
Delete legacy bindings.h
daphne-cornelisse Mar 31, 2026
13ccdfd
Delete hardcoded logic for 8 agents. Can train at 1.9M SPS.
daphne-cornelisse Mar 31, 2026
b52dfb7
Add datapaths in .gitignore.
daphne-cornelisse Mar 31, 2026
7431f0f
Fix: Ensure save_map_binary() only has matching attributes.
daphne-cornelisse Mar 31, 2026
55f59c9
Provide map_dir.
daphne-cornelisse Mar 31, 2026
b9f3443
Typo fix.
daphne-cornelisse Mar 31, 2026
24ccfdd
move configs
jsuarez5341 Mar 31, 2026
da731f1
Merge pull request #508 from daphne-cornelisse/4.0
jsuarez5341 Mar 31, 2026
91d5128
Major bug fix on rendering; integrate initial drive
jsuarez5341 Mar 31, 2026
eb321c3
Iterate through multiple maps.
daphne-cornelisse Mar 31, 2026
2eed0f2
Use 1k maps dataset for benchmarking
daphne-cornelisse Apr 1, 2026
10e1105
moba port
jsuarez5341 Apr 1, 2026
9dd827e
Merge remote-tracking branch 'upstream/4.0' into 4.0
daphne-cornelisse Apr 1, 2026
65fced5
patch
jsuarez5341 Apr 1, 2026
ad78fc6
old drone
FinlaySanders Apr 1, 2026
5220174
new drone
FinlaySanders Apr 1, 2026
528c45c
fix: continuous action logstd indexing in ppo kernel
FinlaySanders Apr 1, 2026
c0364b1
rename drone
FinlaySanders Apr 1, 2026
dfbc4a1
Merge remote-tracking branch 'upstream/4.0' into 4.0
daphne-cornelisse Apr 1, 2026
dac58ec
Binding fix: use max_agents instead of num_agents.
daphne-cornelisse Apr 1, 2026
a824475
Drive sweep configs.
daphne-cornelisse Apr 1, 2026
8e605c5
Small changes I had to make to run sweeps.
daphne-cornelisse Apr 1, 2026
b53fe5a
Clean up drive env: Remove magic values and legacy code.
daphne-cornelisse Apr 1, 2026
254dd4b
moba race fix
jsuarez5341 Apr 1, 2026
11ff42a
faster rng
FinlaySanders Apr 1, 2026
b1966b0
terraform ported and ready to sweep
jsuarez5341 Apr 1, 2026
5d80f05
ported tower climb
jsuarez5341 Apr 1, 2026
f6f64d3
better score metric
FinlaySanders Apr 1, 2026
57f6a12
squared continuous test env
jsuarez5341 Apr 1, 2026
33fb81e
Merge pull request #511 from FinlaySanders/4.0
jsuarez5341 Apr 1, 2026
168b2a9
Merge pull request #510 from daphne-cornelisse/4.0
jsuarez5341 Apr 1, 2026
d09e1ca
Merge pull request #512 from daphne-cornelisse/4.0
jsuarez5341 Apr 1, 2026
e15e424
drive tweaks
jsuarez5341 Apr 1, 2026
d4f4ff2
full solve
FinlaySanders Apr 2, 2026
378cff7
fix nonetype
jsuarez5341 Apr 2, 2026
46c9d20
latest
l1onh3art88 Apr 2, 2026
a859c9a
Merge pull request #513 from FinlaySanders/4.0
jsuarez5341 Apr 2, 2026
ade25d2
Many small fixes
jsuarez5341 Apr 2, 2026
ab75881
refactor build
jsuarez5341 Apr 2, 2026
cdf68b0
Minor fixes
jsuarez5341 Apr 2, 2026
bbbf27f
vendor minshell
jsuarez5341 Apr 2, 2026
9f19d5e
test env fixes
jsuarez5341 Apr 2, 2026
b554919
trailer
jsuarez5341 Apr 3, 2026
4886ac8
g2048
jsuarez5341 Apr 3, 2026
5ef7fdb
default profile breakout
jsuarez5341 Apr 3, 2026
613b19d
vendo ini files
jsuarez5341 Apr 3, 2026
9b5d6d1
fix vendor
jsuarez5341 Apr 3, 2026
4a2f231
fixes
jsuarez5341 Apr 3, 2026
42b3509
logs ignore
jsuarez5341 Apr 3, 2026
5ced979
robust arch
jsuarez5341 Apr 3, 2026
0f17fa6
don't fail build
jsuarez5341 Apr 3, 2026
0c20d2e
build fix
jsuarez5341 Apr 3, 2026
d6c7525
cache
jsuarez5341 Apr 3, 2026
8e51417
cache
jsuarez5341 Apr 3, 2026
a919ca7
PufferNet fixes. pong, breakout, moba local pols. Pong fixes
jsuarez5341 Apr 4, 2026
a059fa6
env updates for 4.0
l1onh3art88 Apr 4, 2026
579f978
models
jsuarez5341 Apr 4, 2026
7039a9e
conflicts
jsuarez5341 Apr 4, 2026
1ae8df6
Merge pull request #517 from PufferAI/l1onh3art88-4.0
jsuarez5341 Apr 4, 2026
3a53e10
Nmmo3 model port
jsuarez5341 Apr 4, 2026
4126c9b
Initial env bind updates
jsuarez5341 Apr 4, 2026
cfbe174
testing go
jsuarez5341 Apr 4, 2026
5b8f120
Merge branch '4.0' of https://github.com/pufferai/pufferlib into 4.0
jsuarez5341 Apr 4, 2026
c00352e
Some tuned runs
jsuarez5341 Apr 4, 2026
bfb1d22
rware stuff blah
l1onh3art88 Apr 4, 2026
bae4041
git is stupid
l1onh3art88 Apr 4, 2026
8973613
Merge pull request #518 from l1onh3art88/4.0
jsuarez5341 Apr 4, 2026
92ec53a
tuned models
jsuarez5341 Apr 4, 2026
6aa0877
tuned models
jsuarez5341 Apr 4, 2026
d040db1
g2048
jsuarez5341 Apr 4, 2026
40fc193
go
l1onh3art88 Apr 5, 2026
12b0cef
Merge pull request #519 from l1onh3art88/4.0
jsuarez5341 Apr 5, 2026
c626bb1
go
jsuarez5341 Apr 5, 2026
344ba04
rlights
jsuarez5341 Apr 5, 2026
d276cb8
vendor
jsuarez5341 Apr 5, 2026
4717894
tower climb pol
jsuarez5341 Apr 5, 2026
e177bf8
major envs refactors
jsuarez5341 Apr 5, 2026
460a269
drive
jsuarez5341 Apr 5, 2026
4ae0b12
merge
jsuarez5341 Apr 5, 2026
e87175a
merge
jsuarez5341 Apr 5, 2026
87e8941
Drive tuned
jsuarez5341 Apr 5, 2026
cf61787
drive, tower_climb fixes
jsuarez5341 Apr 5, 2026
2552532
drive map
l1onh3art88 Apr 5, 2026
e8b4263
Merge pull request #520 from l1onh3art88/4.0
jsuarez5341 Apr 5, 2026
e57d709
trailer
jsuarez5341 Apr 5, 2026
c35b862
trailer
jsuarez5341 Apr 5, 2026
40594a5
remove old .h files
jsuarez5341 Apr 5, 2026
a2b2dc7
minor
jsuarez5341 Apr 5, 2026
75d28c3
readme
jsuarez5341 Apr 5, 2026
d21a161
small experiments file
jsuarez5341 Apr 5, 2026
d68e04a
feat: implement hex
Egiob Apr 8, 2026
b19da35
wip: better heuristic
Egiob Apr 9, 2026
1ffc53e
remove outdated installation tests
PLAZMAMA Apr 10, 2026
50cd8f6
better hyperparams
Egiob Apr 12, 2026
5ec8cdf
better hyperparams
Egiob Apr 12, 2026
8371961
boxoban port
jsuarez5341 Apr 12, 2026
6927c20
Compress assets
jsuarez5341 Apr 12, 2026
09adc2d
Merge pull request #526 from PufferAI/TBBristol-4.0-boxoban
jsuarez5341 Apr 12, 2026
6dfb5f7
Merge pull request #524 from PLAZMAMA/remove_outdated_install_tests
jsuarez5341 Apr 12, 2026
eb1ebea
better heuristic
Egiob Apr 12, 2026
1b13827
Overcooked port
jsuarez5341 Apr 12, 2026
483f1bf
Merge pull request #525 from Egiob/env_hex
jsuarez5341 Apr 12, 2026
40c2ff4
minor cleanups
jsuarez5341 Apr 12, 2026
aed88e5
Lights out
jsuarez5341 Apr 12, 2026
82346a3
odd config
jsuarez5341 Apr 12, 2026
5577300
boxoban
jsuarez5341 Apr 13, 2026
1 change: 0 additions & 1 deletion .github/workflows/install.yml
@@ -17,7 +17,6 @@ jobs:
 py:
 - "3.11"
 - "3.10"
-- "3.9"
 env:
 - pip
 - conda
10 changes: 10 additions & 0 deletions .gitignore
@@ -2,6 +2,10 @@
 c_*.c
 pufferlib/extensions.c
 pufferlib/puffernet.c
+logs/
+
+# Build dir
+build/

 # hipified cuda extensions dir [HIP/ROCM]
 pufferlib/extensions/hip/
@@ -18,6 +22,7 @@ cy_*.c

 # C extensions
 *.so
+*.o

 # Distribution / packaging
 .Python
@@ -162,3 +167,8 @@ pufferlib/ocean/impulse_wars/*-release/
 pufferlib/ocean/impulse_wars/debug-*/
 pufferlib/ocean/impulse_wars/release-*/
 pufferlib/ocean/impulse_wars/benchmark/
+
+# Data
+resources/drive/data/*
+resources/drive/binaries/*
+
19 changes: 0 additions & 19 deletions MANIFEST.in

This file was deleted.

13 changes: 4 additions & 9 deletions README.md
@@ -1,16 +1,11 @@
 ![figure](https://pufferai.github.io/source/resource/header.png)

-[![PyPI version](https://badge.fury.io/py/pufferlib.svg)](https://badge.fury.io/py/pufferlib)
-![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pufferlib)
-![Github Actions](https://github.com/PufferAI/PufferLib/actions/workflows/install.yml/badge.svg)
-[![](https://dcbadge.vercel.app/api/server/spT4huaGYV?style=plastic)](https://discord.gg/spT4huaGYV)
-[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/cloudposse.svg?style=social&label=Follow%20%40jsuarez5341)](https://twitter.com/jsuarez5341)
+[![Discord](https://dcbadge.vercel.app/api/server/spT4huaGYV?style=plastic)](https://discord.gg/spT4huaGYV)
+[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/cloudposse.svg?style=social&label=Follow%20%40jsuarez)](https://twitter.com/jsuarez)

-PufferLib is the reinforcement learning library I wish existed during my PhD. It started as a compatibility layer to make working with complex environments a breeze. Now, it's a high-performance toolkit for research and industry with optimized parallel simulation, environments that run and train at 1M+ steps/second, and tons of quality of life improvements for practitioners. All our tools are free and open source. We also offer priority service for companies, startups, and labs!
+PufferLib is a fast and sane reinforcement learning library that can train tiny, super-human models in seconds. The included learning algorithm, hyperparameter tuning, and simulation methods are the product of our own research. All our tools are free and open source. Need a high performance environment for your application? We build them professionally and offer training + extended support. Contact jsuarez🐡puffer🐡ai.

-![Trailer](https://github.com/PufferAI/puffer.ai/blob/main/docs/assets/puffer_2.gif?raw=true)
-
-All of our documentation is hosted at [puffer.ai](https://puffer.ai "PufferLib Documentation"). @jsuarez5341 on [Discord](https://discord.gg/puffer) for support -- post here before opening issues. We're always looking for new contributors, too!
+All of our documentation is hosted at [puffer.ai](https://puffer.ai "PufferLib Documentation"). @jsuarez5341 on [Discord](https://discord.gg/puffer) for support. Post there before opening issues. We're always looking for new contributors!

## Star to puff up the project!

300 changes: 300 additions & 0 deletions build.sh
@@ -0,0 +1,300 @@
#!/bin/bash
set -e

# Usage:
# ./build.sh breakout # Build _C.so with breakout statically linked
# ./build.sh breakout --float # float32 precision (required for --slowly)
# ./build.sh breakout --cpu # CPU fallback, torch only
# ./build.sh breakout --debug # Debug build
# ./build.sh breakout --local # Standalone executable (debug, sanitizers)
# ./build.sh breakout --fast # Standalone executable (optimized)
# ./build.sh breakout --web # Emscripten web build
# ./build.sh breakout --profile # Kernel profiling binary
# ./build.sh all # Build all envs with default and --float

if [ -z "$1" ]; then
echo "Usage: ./build.sh ENV_NAME [--float] [--debug] [--local|--fast|--web|--profile|--cpu|--all]"
exit 1
fi
ENV=$1
shift

for arg in "$@"; do
case $arg in
--float) PRECISION="-DPRECISION_FLOAT" ;;
--debug) DEBUG=1 ;;
--local) MODE=local ;;
--fast) MODE=fast ;;
--web) MODE=web ;;
--profile) MODE=profile ;;
--cpu) MODE=cpu; PRECISION="-DPRECISION_FLOAT" ;;
*) echo "Error: unknown argument '$arg'" && exit 1 ;;
esac
done

if [ "$ENV" = "all" ]; then
FAILED=""
for env_dir in ocean/*/; do
env=$(basename "$env_dir")
if bash "$0" "$env" && bash "$0" "$env" --float; then
echo "OK: $env"
else
echo "FAIL: $env"
FAILED="$FAILED\n $env"
fi
done

if [ -n "$FAILED" ]; then
echo -e "\nFailed builds:$FAILED"
fi
exit 0
fi

# Linux/mac
PLATFORM="$(uname -s)"
if [ "$PLATFORM" = "Linux" ]; then
RAYLIB_NAME='raylib-5.5_linux_amd64'
OMP_LIB=-lomp5
SANITIZE_FLAGS=(-fsanitize=address,undefined,bounds,pointer-overflow,leak -fno-omit-frame-pointer)
STANDALONE_LDFLAGS=(-lGL)
SHARED_LDFLAGS=(-Bsymbolic-functions)
else
RAYLIB_NAME='raylib-5.5_macos'
OMP_LIB=-lomp
SANITIZE_FLAGS=()
STANDALONE_LDFLAGS=(-framework Cocoa -framework IOKit -framework CoreVideo -framework OpenGL)
SHARED_LDFLAGS=(-framework Cocoa -framework OpenGL -framework IOKit -undefined dynamic_lookup)
fi

CLANG_WARN=(
-Wall
-ferror-limit=3
-Werror=incompatible-pointer-types
-Werror=return-type
-Wno-error=incompatible-pointer-types-discards-qualifiers
-Wno-incompatible-pointer-types-discards-qualifiers
-Wno-error=array-parameter
)

download() {
local name=$1 url=$2
[ -d "$name" ] && return
echo "Downloading $name..."
case "$url" in
*.zip) curl -sL "$url" -o "$name.zip" && unzip -q "$name.zip" && rm "$name.zip" ;;
*) curl -sL "$url" -o "$name.tar.gz" && tar xf "$name.tar.gz" && rm "$name.tar.gz" ;;
esac
}

RAYLIB_URL="https://github.com/raysan5/raylib/releases/download/5.5"
if [ "$MODE" = "web" ]; then
RAYLIB_NAME='raylib-5.5_webassembly'
download "$RAYLIB_NAME" "$RAYLIB_URL/$RAYLIB_NAME.zip"
else
download "$RAYLIB_NAME" "$RAYLIB_URL/$RAYLIB_NAME.tar.gz"
fi

RAYLIB_A="$RAYLIB_NAME/lib/libraylib.a"
INCLUDES=(-I./$RAYLIB_NAME/include -I./src -I./vendor)
LINK_ARCHIVES=("$RAYLIB_A")
EXTRA_SRC=""

if [ "$ENV" = "constellation" ]; then
SRC_DIR="constellation"
EXTRA_SRC="vendor/cJSON.c"
OUTPUT_NAME="seethestars"
elif [ "$ENV" = "trailer" ]; then
SRC_DIR="trailer"
OUTPUT_NAME="trailer/trailer"
elif [ "$ENV" = "impulse_wars" ]; then
SRC_DIR="ocean/$ENV"
if [ "$MODE" = "web" ]; then BOX2D_NAME='box2d-web'
elif [ "$PLATFORM" = "Linux" ]; then BOX2D_NAME='box2d-linux-amd64'
else BOX2D_NAME='box2d-macos-arm64'
fi
BOX2D_URL="https://github.com/capnspacehook/box2d/releases/latest/download"
download "$BOX2D_NAME" "$BOX2D_URL/$BOX2D_NAME.tar.gz"
INCLUDES+=(-I./$BOX2D_NAME/include -I./$BOX2D_NAME/src)
LINK_ARCHIVES+=("./$BOX2D_NAME/libbox2d.a")
elif [ -d "ocean/$ENV" ]; then
SRC_DIR="ocean/$ENV"
else
echo "Error: environment '$ENV' not found" && exit 1
fi

OUTPUT_NAME=${OUTPUT_NAME:-$ENV}

# Standalone environment build
if [ -n "$DEBUG" ] || [ "$MODE" = "local" ]; then
CLANG_OPT=(-g -O0 "${CLANG_WARN[@]}" "${SANITIZE_FLAGS[@]}")
NVCC_OPT="-O0 -g"
LINK_OPT="-g"
else
CLANG_OPT=(-O2 -DNDEBUG "${CLANG_WARN[@]}")
NVCC_OPT="-O2 --threads 0"
LINK_OPT="-O2"
fi
if [ "$MODE" = "local" ] || [ "$MODE" = "fast" ]; then
FLAGS=(
"${INCLUDES[@]}"
"$SRC_DIR/$ENV.c" $EXTRA_SRC -o "$OUTPUT_NAME"
"${LINK_ARCHIVES[@]}"
"${STANDALONE_LDFLAGS[@]}"
-lm -lpthread -fopenmp
-DPLATFORM_DESKTOP
)
echo "Compiling $ENV..."
${CC:-clang} "${CLANG_OPT[@]}" "${FLAGS[@]}"
echo "Built: ./$OUTPUT_NAME"
exit 0
elif [ "$MODE" = "web" ]; then
mkdir -p "build/web/$ENV"
echo "Compiling $ENV for web..."
emcc \
-o "build/web/$ENV/game.html" \
"$SRC_DIR/$ENV.c" $EXTRA_SRC \
-O3 -Wall \
"${LINK_ARCHIVES[@]}" \
"${INCLUDES[@]}" \
-L. -L./$RAYLIB_NAME/lib \
-sASSERTIONS=2 -gsource-map \
-sUSE_GLFW=3 -sUSE_WEBGL2=1 -sASYNCIFY -sFILESYSTEM -sFORCE_FILESYSTEM=1 \
--shell-file vendor/minshell.html \
-sINITIAL_MEMORY=512MB -sALLOW_MEMORY_GROWTH -sSTACK_SIZE=512KB \
-DNDEBUG -DPLATFORM_WEB -DGRAPHICS_API_OPENGL_ES3 \
--preload-file resources/$ENV@resources/$ENV \
--preload-file resources/shared@resources/shared
echo "Built: build/web/$ENV/game.html"
exit 0
fi

# Find cuDNN path
CUDA_HOME=${CUDA_HOME:-${CUDA_PATH:-$(dirname "$(dirname "$(which nvcc)")")}}
CUDNN_IFLAG=""
CUDNN_LFLAG=""
for dir in /usr/local/cuda/include /usr/include; do
if [ -f "$dir/cudnn.h" ]; then
CUDNN_IFLAG="-I$dir"
break
fi
done
for dir in /usr/local/cuda/lib64 /usr/lib/x86_64-linux-gnu; do
if [ -f "$dir/libcudnn.so" ]; then
CUDNN_LFLAG="-L$dir"
break
fi
done
if [ -z "$CUDNN_IFLAG" ]; then
CUDNN_IFLAG=$(python -c "import nvidia.cudnn, os; print('-I' + os.path.join(nvidia.cudnn.__path__[0], 'include'))" 2>/dev/null || echo "")
fi
if [ -z "$CUDNN_LFLAG" ]; then
CUDNN_LFLAG=$(python -c "import nvidia.cudnn, os; print('-L' + os.path.join(nvidia.cudnn.__path__[0], 'lib'))" 2>/dev/null || echo "")
fi

export CCACHE_DIR="${CCACHE_DIR:-$HOME/.ccache}"
export CCACHE_BASEDIR="$(pwd)"
export CCACHE_COMPILERCHECK=content
NVCC="ccache $CUDA_HOME/bin/nvcc"
CC="${CC:-$(command -v ccache >/dev/null && echo 'ccache clang' || echo 'clang')}"
ARCH=${NVCC_ARCH:-native}

PYTHON_INCLUDE=$(python -c "import sysconfig; print(sysconfig.get_path('include'))")
PYBIND_INCLUDE=$(python -c "import pybind11; print(pybind11.get_include())")
NUMPY_INCLUDE=$(python -c "import numpy; print(numpy.get_include())")
EXT_SUFFIX=$(python -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))")
OUTPUT="pufferlib/_C${EXT_SUFFIX}"

BINDING_SRC="$SRC_DIR/binding.c"
mkdir -p build
STATIC_OBJ="build/libstatic_${ENV}.o"
STATIC_LIB="build/libstatic_${ENV}.a"

if [ ! -f "$BINDING_SRC" ]; then
echo "Error: $BINDING_SRC not found"
exit 1
fi

echo "Compiling static library for $ENV..."
${CC:-clang} -c "${CLANG_OPT[@]}" \
-I. -Isrc -I$SRC_DIR -Ivendor \
-I./$RAYLIB_NAME/include -I$CUDA_HOME/include \
-DPLATFORM_DESKTOP \
-fno-semantic-interposition -fvisibility=hidden \
-fPIC -fopenmp \
"$BINDING_SRC" -o "$STATIC_OBJ"
ar rcs "$STATIC_LIB" "$STATIC_OBJ"

# Brittle hack: have to extract the tensor type from the static lib to build trainer
OBS_TENSOR_T=$(awk '/^#define OBS_TENSOR_T/{print $3}' "$BINDING_SRC")
if [ -z "$OBS_TENSOR_T" ]; then
echo "Error: Could not find OBS_TENSOR_T in $BINDING_SRC"
exit 1
fi

if [ -z "$MODE" ]; then
echo "Compiling CUDA ($ARCH) training backend..."
$NVCC -c -arch=$ARCH -Xcompiler -fPIC \
-Xcompiler=-D_GLIBCXX_USE_CXX11_ABI=1 \
-Xcompiler=-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION \
-Xcompiler=-DPLATFORM_DESKTOP \
-std=c++17 \
-I. -Isrc \
-I$PYTHON_INCLUDE -I$PYBIND_INCLUDE -I$NUMPY_INCLUDE \
-I$CUDA_HOME/include $CUDNN_IFLAG -I$RAYLIB_NAME/include \
-Xcompiler=-fopenmp \
-DOBS_TENSOR_T=$OBS_TENSOR_T \
-DENV_NAME=$ENV \
$PRECISION $NVCC_OPT \
src/bindings.cu -o build/bindings.o

LINK_CMD=(
${CXX:-g++} -shared -fPIC -fopenmp
build/bindings.o "$STATIC_LIB" "$RAYLIB_A"
-L$CUDA_HOME/lib64 $CUDNN_LFLAG
-lcudart -lnccl -lnvidia-ml -lcublas -lcusolver -lcurand -lcudnn
$OMP_LIB $LINK_OPT
"${SHARED_LDFLAGS[@]}"
-o "$OUTPUT"
)
"${LINK_CMD[@]}"
echo "Built: $OUTPUT"

elif [ "$MODE" = "cpu" ]; then
echo "Compiling CPU training backend..."
${CXX:-g++} -c -fPIC -fopenmp \
-D_GLIBCXX_USE_CXX11_ABI=1 \
-DPLATFORM_DESKTOP \
-std=c++17 \
-I. -Isrc \
-I$PYTHON_INCLUDE -I$PYBIND_INCLUDE \
-DOBS_TENSOR_T=$OBS_TENSOR_T \
-DENV_NAME=$ENV \
$PRECISION $LINK_OPT \
src/bindings_cpu.cpp -o build/bindings_cpu.o
LINK_CMD=(
${CXX:-g++} -shared -fPIC -fopenmp
build/bindings_cpu.o "$STATIC_LIB" "$RAYLIB_A"
-lm -lpthread $OMP_LIB $LINK_OPT
"${SHARED_LDFLAGS[@]}"
-o "$OUTPUT"
)
"${LINK_CMD[@]}"
echo "Built: $OUTPUT"

elif [ "$MODE" = "profile" ]; then
echo "Compiling profile binary ($ARCH)..."
$NVCC $NVCC_OPT -arch=$ARCH -std=c++17 \
-I. -Isrc -I$SRC_DIR -Ivendor \
-I$CUDA_HOME/include $CUDNN_IFLAG -I$RAYLIB_NAME/include \
-DOBS_TENSOR_T=$OBS_TENSOR_T \
-DENV_NAME=$ENV \
-Xcompiler=-DPLATFORM_DESKTOP \
$PRECISION \
-Xcompiler=-fopenmp \
tests/profile_kernels.cu vendor/ini.c \
"$STATIC_LIB" "$RAYLIB_A" \
-lnccl -lnvidia-ml -lcublas -lcurand -lcudnn \
-lGL -lm -lpthread $OMP_LIB \
-o profile
echo "Built: ./profile"
fi
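
The "brittle hack" comment in build.sh refers to reading `OBS_TENSOR_T` out of `binding.c` with awk. The sketch below reuses that exact awk program on a throwaway file to show what it matches and one case it misses. The file contents and the type name `FloatTensor` are placeholders invented for illustration, not taken from any real binding.

```shell
#!/bin/bash
set -e

# Same awk program as build.sh: match lines that start with the macro
# definition and print the third whitespace-separated field.
extract_obs_tensor_t() {
    awk '/^#define OBS_TENSOR_T/{print $3}' "$1"
}

binding=$(mktemp)

# Hypothetical binding.c contents; FloatTensor is a placeholder type name.
printf '%s\n' '#include "env.h"' '#define OBS_TENSOR_T FloatTensor' > "$binding"
found=$(extract_obs_tensor_t "$binding")
echo "column-0 define: '$found'"

# An indented define is not matched because the pattern is anchored with ^.
printf '%s\n' '  #define OBS_TENSOR_T FloatTensor' > "$binding"
missed=$(extract_obs_tensor_t "$binding")
echo "indented define: '$missed'"

rm -f "$binding"
```

When nothing matches, the variable is empty, which is the situation build.sh reports with its "Could not find OBS_TENSOR_T" error.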
1 change: 0 additions & 1 deletion config

This file was deleted.

7 changes: 2 additions & 5 deletions pufferlib/config/ocean/asteroids.ini → config/asteroids.ini
@@ -1,8 +1,5 @@
 [base]
-package = ocean
-env_name = puffer_asteroids
-policy_name = Policy
-rnn_name = Recurrent
+env_name = asteroids

 [vec]
 num_envs = 8
@@ -17,7 +14,7 @@ adam_beta2 = 0.9999436458974764
 adam_eps = 6.915036275112011e-08
 anneal_lr = true
 batch_size = auto
-bptt_horizon = 64
+horizon = 64
 checkpoint_interval = 200
 clip_coef = 0.18588778503512546
 ent_coef = 0.0016620361911332262
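
The asteroids.ini change drops `package`, `policy_name`, and `rnn_name`, strips the `puffer_` prefix from `env_name`, and renames `bptt_horizon` to `horizon`. A rough sketch of applying the same rewrite to a local config follows. It assumes other configs use the same old layout, which this diff does not confirm. `migrate_ini` is a hypothetical helper, and the `[train]` section name in the sample is assumed.

```shell
#!/bin/bash
set -e

# Hypothetical helper: print a config rewritten the way the asteroids.ini
# diff rewrites it. Writes to stdout and leaves the input file untouched.
migrate_ini() {
    sed -E \
        -e '/^(package|policy_name|rnn_name)[[:space:]]*=/d' \
        -e 's/^env_name[[:space:]]*=[[:space:]]*puffer_/env_name = /' \
        -e 's/^bptt_horizon([[:space:]]*=)/horizon\1/' \
        "$1"
}

old=$(mktemp)
# Sample old-style config; the [train] section name is an assumption.
cat > "$old" <<'EOF'
[base]
package = ocean
env_name = puffer_asteroids
policy_name = Policy
rnn_name = Recurrent

[vec]
num_envs = 8

[train]
bptt_horizon = 64
EOF

new=$(migrate_ini "$old")
echo "$new"
rm -f "$old"
```

Review the output before replacing a file, since any key not shown in this diff passes through unchanged.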