BenuelOmanga/SQL-Server_to_PostgresSQL

SQL Server → PostgreSQL ETL Pipeline

A clean, modular ETL pipeline that extracts data from SQL Server, transforms it with Pandas, and loads it into PostgreSQL.


📑 Table of Contents

  1. Project Overview
  2. Architecture
  3. Technologies Used
  4. Features
  5. Project Structure
  6. Setup Instructions
  7. Running the Pipeline
  8. Environment Variables
  9. Future Enhancements
  10. Contributing
  11. License

📌 Project Overview

This project demonstrates a complete ETL (Extract, Transform, Load) workflow using Python.

What it does:

  • 📤 Extracts data from SQL Server
  • 🔄 Transforms data using Pandas
  • 📥 Loads cleaned data into PostgreSQL
  • ▶️ Runs end-to-end from a single entry point (main.py)

Why this project?

This pipeline is designed to showcase:

  • Clean ETL design principles
  • Modular Python code
  • Secure credential handling
  • A real-world data engineering workflow

Perfect for learning, portfolio projects, or extending into production systems.


🧱 Architecture

┌────────────┐
│ SQL Server │
└─────┬──────┘
      │  Extract
      ▼
┌────────────┐
│   Python   │  (Pandas Transformations)
└─────┬──────┘
      │  Load
      ▼
┌────────────┐
│ PostgreSQL │
└────────────┘

  • Fully automated
  • Runs with a single command
  • Easily extendable to more data sources
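
The Extract and Load arrows in the diagram can be sketched with plain DB-API calls. This is a minimal sketch, not the code in main.py: `copy_table` and its parameters are hypothetical, and two in-memory `sqlite3` databases stand in for SQL Server and PostgreSQL so the example runs without either server. Connections from `pyodbc` and `psycopg2` expose the same cursor methods, but `psycopg2` expects `%s` placeholders rather than `?`, hence the `placeholder` argument.

```python
import sqlite3


def copy_table(src_conn, dst_conn, table, columns, placeholder="?"):
    """Copy the given columns of one table from src_conn to dst_conn.

    Hypothetical helper. `table` and `columns` are interpolated into the
    SQL text, so they must come from trusted code, never from user input.
    """
    col_list = ", ".join(columns)
    marks = ", ".join([placeholder] * len(columns))

    # Extract: read every row of the source table as plain tuples.
    src_cur = src_conn.cursor()
    src_cur.execute(f"SELECT {col_list} FROM {table}")
    rows = [tuple(row) for row in src_cur.fetchall()]

    # Load: insert the rows into a same-named table on the target.
    dst_cur = dst_conn.cursor()
    dst_cur.executemany(
        f"INSERT INTO {table} ({col_list}) VALUES ({marks})", rows
    )
    dst_conn.commit()
    return len(rows)


# Stand-in databases (assumption: a real run would open the connections
# with pyodbc.connect(...) and psycopg2.connect(...) instead).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (source, target):
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
source.commit()

copied = copy_table(source, target, "customers", ["id", "name"])
print(copied)
```

For a PostgreSQL target the call would pass `placeholder="%s"`. A transform step would sit between the extract and load halves of this function.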

⚙️ Technologies Used

Category	Tools
Language	Python 3
Data Processing	Pandas
SQL Server Connector	pyodbc
PostgreSQL Connector	psycopg2
Databases	SQL Server Express, PostgreSQL
Dev Tools	VS Code, Git
Version Control	GitHub

🚀 Features

✔ Modular ETL design (Extract / Transform / Load)
✔ Multi-table migration support
✔ Secure .env configuration
✔ Clean, readable folder structure
✔ CSV data staging for validation
✔ Easily extendable for new datasets
✔ GitHub-ready documentation

📂 Project Structure (Folder)

SQL-Server_to_PostgresSQL/
│
├── data/
│   ├── customers_raw.csv
│   ├── customers_clean.csv
│   ├── orders_raw.csv
│   └── orders_clean.csv
│
├── .env
├── .gitignore
├── main.py
└── README.md

Folder Breakdown

  • data/ → Raw and transformed CSV files
  • main.py → Orchestrates the full ETL pipeline
  • .env → Stores secure credentials
  • README.md → Project documentation
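
The raw/clean file pairs in data/ suggest that each table is written to CSV once after extraction and once after transformation. A minimal sketch of that staging step follows. `stage_csv` and `read_staged` are hypothetical names, and the standard-library `csv` module is used instead of Pandas so the example has no dependencies.

```python
import csv
from pathlib import Path


def stage_csv(rows, fieldnames, table, stage, data_dir="data"):
    """Write rows (a list of dicts) to <data_dir>/<table>_<stage>.csv."""
    path = Path(data_dir) / f"{table}_{stage}.csv"
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


def read_staged(path):
    """Read a staged CSV back as a list of dicts (all values are strings)."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling `stage_csv(rows, ["id", "name"], "customers", "raw")` would write data/customers_raw.csv. With a Pandas DataFrame, `df.to_csv(path, index=False)` does the same job.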

🔧 Setup Instructions

1️⃣ Clone the Repository

git clone https://github.com/BenuelOmanga/SQL-Server_to_PostgresSQL.git
cd SQL-Server_to_PostgresSQL

2️⃣ Create a Virtual Environment

python -m venv .venv

3️⃣ Activate the Environment

Windows

.venv\Scripts\activate

Mac / Linux

source .venv/bin/activate

4️⃣ Install Dependencies

pip install -r requirements.txt

▶️ Running the Pipeline

Run the full ETL process with:

python main.py

Sample Output

Starting ETL Pipeline...
Extracted Customers
Extracted Orders
Transformed Customers
Transformed Orders
Loaded Customers
Loaded Orders
ETL Pipeline Completed Successfully.
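
The order of the messages in the sample output implies three passes over the tables: extract all, transform all, then load all. A minimal sketch of an orchestrator with that shape follows. `run_pipeline` and the three callables are hypothetical, not taken from main.py, and in-memory fakes replace the databases. The whitespace-stripping transform is only a placeholder.

```python
def run_pipeline(tables, extract, transform, load, log=print):
    """Run extract, transform and load for every table, one phase at a time."""
    log("Starting ETL Pipeline...")

    raw = {}
    for table in tables:
        raw[table] = extract(table)
        log(f"Extracted {table}")

    clean = {}
    for table in tables:
        clean[table] = transform(table, raw[table])
        log(f"Transformed {table}")

    for table in tables:
        load(table, clean[table])
        log(f"Loaded {table}")

    log("ETL Pipeline Completed Successfully.")
    return clean


# In-memory stand-ins for the source and target databases (assumptions).
fake_source = {
    "Customers": [{"name": " Ada "}],
    "Orders": [{"item": "book "}],
}
loaded = {}
messages = []


def extract(table):
    return list(fake_source[table])


def transform(table, rows):
    # Placeholder cleaning step: trim whitespace from every string value.
    return [{key: value.strip() for key, value in row.items()} for row in rows]


def load(table, rows):
    loaded[table] = rows


run_pipeline(["Customers", "Orders"], extract, transform, load, log=messages.append)
print("\n".join(messages))
```

Passing the three steps in as callables keeps the orchestrator free of database code, so it can be exercised without SQL Server or PostgreSQL.
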
🔐 Environment Variables

Create a .env file in the project root:

POSTGRES_HOST=localhost
POSTGRES_DB=sample_etl
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_password

🔒 Important: The .env file is excluded via .gitignore to keep credentials out of version control.
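
The README does not say how main.py reads this file. A minimal, dependency-free sketch of parsing KEY=VALUE lines and failing early on missing settings follows. `parse_env` and `require` are hypothetical helpers. The python-dotenv package and its `load_dotenv` function are a common alternative.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        values[key.strip()] = value.strip()
    return values


def require(settings, keys):
    """Return only the requested keys, raising if any is missing or empty."""
    missing = [key for key in keys if not settings.get(key)]
    if missing:
        raise KeyError(f"Missing settings: {', '.join(missing)}")
    return {key: settings[key] for key in keys}


sample = "POSTGRES_HOST=localhost\nPOSTGRES_USER=postgres\n"
settings = require(parse_env(sample), ["POSTGRES_HOST", "POSTGRES_USER"])
print(settings)
```

In a real run the text would come from reading the .env file. Checking the keys up front makes a missing credential fail with a clear message before any database connection is attempted.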

📈 Future Enhancements

🚧 Planned improvements:

  • Add structured logging
  • Add robust error handling
  • Introduce SQLAlchemy engine
  • Add Airflow / Prefect scheduling
  • Create BI dashboards (Power BI / Tableau)
  • Add unit & integration tests
  • Dockerize the pipeline

🤝 Contributing

Contributions are welcome!

  1. Fork the repository
  2. Create a feature branch
  3. Submit a pull request

For major changes, please open an issue first to discuss your idea.

📜 License
This project is licensed under the MIT License.
Feel free to use, modify, and distribute it.

About

This project migrates relational data from SQL Server to PostgreSQL using Python. It extracts tables, transforms schemas and data types, and loads them into PostgreSQL. An automated pipeline syncs new and updated records continuously, ensuring consistent, analytics-ready data in a scalable open-source environment.
