Empowering the Unseen: Mapping Physical Reality into Searchable Memory.
In a world defined by three-dimensional space, navigating and tracking objects can be a significant challenge, especially for individuals with visual impairments. SpatialVCS is designed to rewrite this narrative by creating a bridge between physical reality and digital intelligence.
By leveraging advanced spatial mapping and AI-powered computer vision (YOLO and Gemini), we transform your mobile phone into a "smart probe." Our mission is to empower users to seamlessly track, search, and understand their surroundings, ensuring that no object is ever "lost" and every space is intuitively accessible.
This project is a submission for the CTRL+HACK+DEL 2.0 competition.
- Spatial (Space): Transforming raw video data into precise XYZ coordinates and semantic understandings of the physical world.
- Version (History): Tracking changes over time, allowing users to query not just where an object is, but where it was.
- Control (System): Providing a robust, searchable management layer for physical objects and spatial data.
- Real-time Object Tracking: Seamlessly detect and map objects in 3D space using mobile sensors.
- Natural Language Querying: Ask "Where are my keys?" and get precise, AI-assisted locations.
- Semantic Memory: Powered by Gemini for deep understanding of object context and identity.
- Access Control & Diffs: Manage who sees what and see what has moved between scans.
We have prepared .command scripts to automate everything.
- Double-click
setup_env.commandto install Python/Node dependencies and generate SSL certs. - Double-click
start_app.commandto launch both backend and frontend.
If the scripts don't work, follow these steps:
- Node.js (v18+)
- Python (3.10+)
- mkcert (For SSL certificates - Required for mobile camera access)
- macOS:
brew install mkcert nss - Windows:
choco install mkcert
- macOS:
# Clone repository
git clone <repo-url>
cd gemini_toolkit
# Create Python Virtual Environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install Dependencies
pip install -r requirements.txtcd frontend
npm installSince we use the camera on mobile and connect via local network IP, we need HTTPS. Run this in the root directory:
# Install CA
mkcert -install
# Generate certs for localhost and your local IP (e.g., 100.x.y.z or 192.168.x.y)
# IMPORTANT: Replace 100.104.16.42 with YOUR actual LAN IP!
mkcert -key-file key.pem -cert-file cert.pem localhost 127.0.0.1 ::1 100.104.16.42Note: The frontend must act as HTTPS server for the mobile camera to work. If you skip this, mobile access will fail.
You need two terminal windows.
Terminal 1: Backend
source .venv/bin/activate
uvicorn main:app --reload --host 0.0.0.0 --port 8000Terminal 2: Frontend
cd frontend
npm run dev -- --host- Mobile (Probe): Open
https://<YOUR-IP>:5173/probe(Accept SSL warning).- Click "CFG" -> Enter Gemini API Key.
- Click "START SCAN".
- Desktop (Dashboard): Open
https://localhost:5173/dashboard.- Enter Gemini API Key.
- See live detections and XYZ coordinates.
- Search: "Where is the [object]?"
main.py: FastAPI backend entry point.services/: Logic for YOLO (video_processor.py) and Vector DB (spatial_memory.py).frontend/src/components/: React views (ProbeView.jsx,DashboardView.jsx).
- White Screen on Mobile?
- Check the console (remote debug).
- Ensure SSL certs are valid and generated for your specific LAN IP.
- Verify your phone is on the same Wi-Fi network as your computer.
- Try accessing
https://<YOUR-IP>:5173/probedirectly.
- Search fails? Ensure you entered the API Key in Settings ("CFG" button).
- Backend 422 Error? Fixed in latest version (scan_id is optional).
(Add screenshots here)
Distributed under the MIT License. See LICENSE for more information.


