A clean, modular ETL pipeline that extracts data from SQL Server, transforms it with Pandas, and loads it into PostgreSQL.
- Project Overview
- Architecture
- Technologies Used
- Features
- Project Structure
- Setup Instructions
- Running the Pipeline
- Environment Variables
- Future Enhancements
- Contributing
- License
Project Overview
This project demonstrates a complete ETL (Extract, Transform, Load) workflow using Python.
- 📤 Extracts data from SQL Server
- 🔄 Transforms data using Pandas
- 📥 Loads cleaned data into PostgreSQL
- ▶️ Runs end-to-end from a single entry point (main.py)
This pipeline is designed to showcase:
- Clean ETL design principles
- Modular Python code
- Secure credential handling
- A real-world data engineering workflow
Perfect for learning, portfolio projects, or extending into production systems.
Architecture
┌────────────┐
│ SQL Server │
└─────┬──────┘
      │ Extract
      ▼
┌────────────┐
│   Python   │ (Pandas Transformations)
└─────┬──────┘
      │ Load
      ▼
┌────────────┐
│ PostgreSQL │
└────────────┘
- Fully automated
- Runs with a single command
- Easily extendable to more data sources
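The middle box of the diagram (Pandas transformations) could look roughly like the sketch below. The column names (`customer_id`, `name`, `email`) and the cleaning rules are assumptions made for illustration; the real rules live in main.py and may differ.

```python
import pandas as pd


def transform_customers(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a raw customers frame (hypothetical columns)."""
    clean = raw.copy()
    # Normalise column names, e.g. "Customer ID" -> "customer_id".
    clean.columns = [c.strip().lower().replace(" ", "_") for c in clean.columns]
    # Trim stray whitespace and lower-case e-mail addresses.
    clean["name"] = clean["name"].str.strip()
    clean["email"] = clean["email"].str.strip().str.lower()
    # Drop rows without a key, then drop exact duplicates.
    clean = clean.dropna(subset=["customer_id"]).drop_duplicates()
    return clean.reset_index(drop=True)
```

Cleaning text columns before `drop_duplicates()` matters here: rows that differ only in whitespace or letter case would otherwise survive as separate records.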
⚙️ Technologies Used
| Category | Tools |
|---|---|
| Language | Python 3 |
| Data Processing | Pandas |
| SQL Server Connector | pyodbc |
| PostgreSQL Connector | psycopg2 |
| Databases | SQL Server Express, PostgreSQL |
| Dev Tools | VS Code, Git |
| Version Control | GitHub |
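For readers new to pyodbc, here is a minimal sketch of the extract step against SQL Server Express. It assumes Windows authentication (`Trusted_Connection=yes`) and the "ODBC Driver 17 for SQL Server" driver; the server and database names in the usage line are placeholders, and main.py may connect differently.

```python
def sqlserver_connection_string(server, database,
                                driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC connection string that uses Windows authentication."""
    parts = {
        "DRIVER": "{" + driver + "}",
        "SERVER": server,
        "DATABASE": database,
        "Trusted_Connection": "yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def extract_table(conn_str, table):
    """Read one table into a DataFrame; needs pyodbc and pandas installed."""
    import pandas as pd
    import pyodbc

    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        # `table` must come from a trusted, hard-coded list: it is
        # interpolated into the SQL text, not passed as a parameter.
        cursor.execute(f"SELECT * FROM {table}")
        columns = [col[0] for col in cursor.description]
        rows = [tuple(row) for row in cursor.fetchall()]
        return pd.DataFrame(rows, columns=columns)
    finally:
        conn.close()
```

Usage would be along the lines of `extract_table(sqlserver_connection_string(r"localhost\SQLEXPRESS", "SalesDB"), "Customers")`, where `SalesDB` is a made-up database name.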
🚀 Features
✔ Modular ETL design (Extract / Transform / Load)
✔ Multi-table migration support
✔ Secure .env configuration
✔ Clean, readable folder structure
✔ CSV data staging for validation
✔ Easily extendable for new datasets
✔ GitHub-ready documentation
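One way to keep the load step reusable across tables is to generate the INSERT statement from a table name and a column list. The sketch below is an illustration under that assumption, not the code in main.py; `build_insert` and `load_rows` are hypothetical helper names, and `conn` is expected to be a connection such as the one returned by `psycopg2.connect()`.

```python
def build_insert(table, columns):
    """Return a parameterised INSERT statement using %s placeholders."""
    for name in (table, *columns):
        # Identifiers are pasted into the SQL text, so reject anything unusual.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    column_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


def load_rows(conn, table, columns, rows):
    """Insert an iterable of row tuples and commit once at the end."""
    with conn.cursor() as cur:
        cur.executemany(build_insert(table, columns), rows)
    conn.commit()
```

Adding a new dataset then means adding one more `(table, columns)` pair rather than writing another load function.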
📂 Project Structure (Folder)
SQL-Server_to_PostgresSQL/
│
├── data/
│   ├── customers_raw.csv
│   ├── customers_clean.csv
│   ├── orders_raw.csv
│   └── orders_clean.csv
│
├── .env
├── .gitignore
├── main.py
└── README.md
Folder Breakdown
data/ → Raw and transformed CSV files
main.py → Orchestrates the full ETL pipeline
.env → Stores secure credentials
README.md → Project documentation
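The files in data/ follow a `<table>_<stage>.csv` naming pattern. A small helper like the one below could build those paths and write a staging file; `staging_path` and `stage_csv` are hypothetical names used for illustration, and `frame` is assumed to be a Pandas DataFrame.

```python
from pathlib import Path

DATA_DIR = Path("data")


def staging_path(table, stage, data_dir=DATA_DIR):
    """Return the staging file path, e.g. data/customers_raw.csv."""
    if stage not in ("raw", "clean"):
        raise ValueError(f"unknown stage: {stage!r}")
    return Path(data_dir) / f"{table}_{stage}.csv"


def stage_csv(frame, table, stage, data_dir=DATA_DIR):
    """Write a DataFrame to its staging file and return the path."""
    path = staging_path(table, stage, data_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    frame.to_csv(path, index=False)  # index=False keeps the CSV free of row numbers
    return path
```

Keeping both the raw and the clean file makes it easy to diff them when a transformation looks suspicious.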
🔧 Setup Instructions
1️⃣ Clone the Repository
git clone https://github.com/BenuelOmanga/SQL-Server_to_PostgresSQL.git
cd SQL-Server_to_PostgresSQL
2️⃣ Create a Virtual Environment
python -m venv .venv
3️⃣ Activate the Environment
Windows
.venv\Scripts\activate
Mac / Linux
source .venv/bin/activate
4️⃣ Install Dependencies
pip install -r requirements.txt
▶️ Running the Pipeline
Run the full ETL process with:
python main.py
Sample Output
Starting ETL Pipeline...
Extracted Customers
Extracted Orders
Transformed Customers
Transformed Orders
Loaded Customers
Loaded Orders
ETL Pipeline Completed Successfully.
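An orchestrator that produces output in this order might be structured as follows. This is a sketch, not the actual main.py: the `extract`, `transform`, and `load` callables are placeholders for the real stage functions, and `log` defaults to `print`.

```python
def run_pipeline(tables, extract, transform, load, log=print):
    """Run extract -> transform -> load for every table, stage by stage."""
    log("Starting ETL Pipeline...")

    raw = {}
    for table in tables:
        raw[table] = extract(table)
        log(f"Extracted {table.capitalize()}")

    clean = {}
    for table in tables:
        clean[table] = transform(table, raw[table])
        log(f"Transformed {table.capitalize()}")

    for table in tables:
        load(table, clean[table])
        log(f"Loaded {table.capitalize()}")

    log("ETL Pipeline Completed Successfully.")


if __name__ == "__main__":
    # Stand-in stage functions so the sketch runs without any database.
    run_pipeline(
        ["customers", "orders"],
        extract=lambda table: [],
        transform=lambda table, rows: rows,
        load=lambda table, rows: None,
    )
```

Passing the stage functions in as arguments keeps the orchestration testable without a live SQL Server or PostgreSQL instance.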
🔐 Environment Variables
Create a .env file in the project root:
POSTGRES_HOST=localhost
POSTGRES_DB=sampleetl
POSTGRES_USER=postgres
POSTGRES_PASSWORD=yourpassword
🔒 Important:
The .env file is excluded via .gitignore to keep credentials secure.
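How the project reads the .env file is not shown here. The sketch below uses only the standard library and assumes plain `KEY=VALUE` lines without quoting; `parse_env` and `load_env_file` are hypothetical names. The python-dotenv package is a common alternative if you prefer a library.

```python
import os
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_env_file(path=".env"):
    """Copy values from the file into os.environ without overriding existing ones."""
    values = parse_env(Path(path).read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

After `load_env_file()`, a setting can be read with `os.environ["POSTGRES_HOST"]`; using `setdefault` lets variables already set in the shell take precedence over the file.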
📈 Future Enhancements
🚧 Planned improvements:
- Add structured logging
- Add robust error handling
- Introduce SQLAlchemy engine
- Add Airflow / Prefect scheduling
- Create BI dashboards (Power BI / Tableau)
- Add unit & integration tests
- Dockerize the pipeline
🤝 Contributing
Contributions are welcome!
- Fork the repository
- Create a feature branch
- Submit a pull request
For major changes, please open an issue first to discuss your idea.
📜 License
This project is licensed under the MIT License.
Feel free to use, modify, and distribute it.