
SQL Server to PostgreSQL Migration Guide

This guide explains how to use the improved migration process, which organizes database objects the same way SQL Server Management Studio (SSMS) does.

Overview

The migration process consists of three main steps:

  1. Extract DACPAC - Extract and organize all SQL Server objects into the Input folder
  2. Migrate Schema - Convert tables, constraints, and indexes using GitHub Copilot (interactive)
  3. Migrate Code - Convert procedures, functions, triggers, and views using the AI pipeline (automated)

Prerequisites

  • PowerShell 5.1+ or PowerShell Core 7+
  • Python 3.8+
  • requests Python library: pip install requests
  • Azure OpenAI endpoint (or OpenAI-compatible API)
  • SQL Server DACPAC or BACPAC file

Step 1: Extract DACPAC

The extraction script organizes all database objects into separate folders, just like SSMS:

.\scripts\extract_dacpac_objects.ps1 `
    -Package "path\to\YourDatabase.dacpac" `
    -InputRoot "examples\YourProject\Input"

What Gets Extracted

The script creates this folder structure in Input/YourDatabase/:

Input/
└── YourDatabase/
    ├── Tables/
    │   ├── dbo/
    │   │   ├── Customers.sql
    │   │   └── Orders.sql
    │   └── HumanResources/
    │       └── Employee.sql
    ├── Views/
    │   └── dbo/
    │       └── vCustomerOrders.sql
    ├── StoredProcedures/
    │   └── dbo/
    │       └── uspGetCustomer.sql
    ├── Functions/
    │   └── dbo/
    │       └── fnGetTotal.sql
    ├── Triggers/
    │   └── dbo/
    │       └── trgAudit.sql
    ├── Types/
    │   └── dbo/
    │       └── PhoneNumber.sql
    ├── Sequences/
    │   └── dbo/
    │       └── OrderNumber.sql
    ├── Constraints/
    │   ├── PrimaryKey/
    │   │   └── dbo/
    │   │       └── Customers_PK_CustomerID.sql
    │   ├── ForeignKey/
    │   │   └── dbo/
    │   │       └── Orders_FK_Orders_Customers.sql
    │   ├── Check/
    │   ├── Unique/
    │   └── Default/
    └── Indexes/
        └── dbo/
            └── Customers_IX_CustomerName.sql

Example Output

========================================
Extraction Complete!
Output directory: Input/AdventureWorks2016

Object Summary:
  Tables:              75
  Views:               20
  Stored Procedures:   12
  Functions:           21
  Triggers:            10
  Types:               6

Constraints:
  Primary Keys:        72
  Foreign Keys:        90
  Check Constraints:   89
  Default Constraints: 152

Indexes:             96
========================================

Step 2: Configure Azure OpenAI

Create a .env file in the project root with your Azure OpenAI settings:

# Azure OpenAI Configuration
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_API_VERSION=2024-02-15-preview
AZURE_OPENAI_KEY=your-api-key-here

# Deployment names for each stage
DRAFT_AZURE_OPENAI_DEPLOYMENT=gpt-4o
REFINE_AZURE_OPENAI_DEPLOYMENT=gpt-4o
VERIFY_AZURE_OPENAI_DEPLOYMENT=gpt-4o
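A missing variable surfaces later as a less obvious pipeline failure, so it can help to sanity-check the file up front. The following is a minimal illustrative sketch (not part of the shipped scripts); the variable names match the .env example above:

```python
# Required settings, mirroring the .env example above.
REQUIRED = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_KEY",
    "DRAFT_AZURE_OPENAI_DEPLOYMENT",
    "REFINE_AZURE_OPENAI_DEPLOYMENT",
    "VERIFY_AZURE_OPENAI_DEPLOYMENT",
]

def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_vars(env: dict) -> list:
    """Return required variables that are absent or empty."""
    return [k for k in REQUIRED if not env.get(k)]
```

Running `missing_vars(parse_env(Path(".env").read_text()))` before a long migration run makes configuration gaps visible immediately.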

Alternative: Use OpenAI or Claude

For OpenAI:

DRAFT_PROVIDER=openai
DRAFT_BASE_URL=https://api.openai.com/v1
DRAFT_MODEL=gpt-4
DRAFT_API_KEY=your-openai-key

For Claude (via OpenAI-compatible proxy):

DRAFT_PROVIDER=openai-compatible
DRAFT_BASE_URL=https://your-claude-proxy.com/v1
DRAFT_MODEL=claude-sonnet-4
DRAFT_API_KEY=your-claude-key

Step 3: Migrate Schema (GitHub Copilot)

Schema objects (Tables, Constraints, Indexes, Types, Sequences) are migrated using GitHub Copilot Chat for better control and review:

Open GitHub Copilot Chat

In VS Code:

  1. Open Copilot Chat (Ctrl+Shift+I or Cmd+Shift+I)
  2. Run the migration command:
/SQLtoPostgres-Migrate-Schema

What Gets Migrated

GitHub Copilot will interactively convert:

  • Tables - Structure and column definitions
  • Types - Custom user-defined types
  • Sequences - Sequence objects
  • Constraints - Primary Keys, Foreign Keys, Check, Unique, Default
  • Indexes - All index types

Benefits of Using Copilot for Schema

  • Interactive review - You can review and approve each change
  • Context-aware - Understands your specific database patterns
  • Best practices - Applies PostgreSQL best practices
  • No API costs - Uses GitHub Copilot (not Azure OpenAI)
  • Better control - You decide what gets changed

The Copilot prompt will:

  1. Read all extracted schema objects from Input/YourDatabase/
  2. Convert them to PostgreSQL DDL
  3. Write output to Output/YourDatabase/Schema/
  4. Provide execution order guidance
  5. Flag any manual steps needed

Step 4: Migrate Code (AI Pipeline)

Programmability objects (Views, Functions, Procedures, Triggers) are migrated using the automated AI pipeline:

.\scripts\migrate_code_objects.ps1 `
    -InputDir "examples\YourDatabase\Input\YourDatabase" `
    -OutputDir "examples\YourDatabase\Output\YourDatabase" `
    -EnvFile ".env" `
    -MaxParallel 4

Copilot tip: Run this in a normal terminal. Once migration starts, do not run any monitoring or polling commands (Copilot "progress checks", Get-Process, tailing logs, and so on); those probes can interrupt the long-running API calls. Leave the terminal alone until the run finishes.

Validate outputs (optional but recommended)

After code migration, check for markdown/prose leftovers or name mismatches:

.\scripts\validate_migrated_objects.ps1 -OutputDir "examples\YourDatabase\Output\YourDatabase"

See Output/<Database>/code_validation.json (or your custom -ReportPath) for flagged items.

Migration Process

The script processes code objects in dependency order:

  1. Views - After tables exist
  2. Functions - After tables and views
  3. Stored Procedures - After functions
  4. Triggers - After tables
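The ordering above can be pictured as a small helper that walks the Input tree in dependency order. This is an illustrative sketch only; the actual script's internals may differ:

```python
from pathlib import Path

# Dependency order for code objects, as described above:
# views first, triggers last.
ORDER = ["Views", "Functions", "StoredProcedures", "Triggers"]

def collect_in_order(input_root):
    """Yield (object_type, sorted .sql files) in dependency order."""
    root = Path(input_root)
    for obj_type in ORDER:
        folder = root / obj_type
        if folder.is_dir():
            # Recurse so per-schema subfolders (dbo/, sales/, ...) are included.
            yield obj_type, sorted(folder.rglob("*.sql"))
```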

Example Output

========================================
SQL Server to PostgreSQL Code Migration
========================================
Input:  Input/AdventureWorks2016
Output: Output/AdventureWorks2016

NOTE: Schema objects (Tables, Constraints, Indexes) should be
      migrated using GitHub Copilot's /SQLtoPostgres-Migrate-Schema prompt.

[1/4] Migrating Views...
  Processing: vEmployee.sql
    ✓ Completed
  Views migrated: 20

[2/4] Migrating Functions...
  Functions migrated: 21

[3/4] Migrating Stored Procedures...
  Stored Procedures migrated: 12

[4/4] Migrating Triggers...
  Triggers migrated: 10

========================================
Code Migration Summary
========================================

Programmability Objects Migrated:
  Views:               20
  Functions:           21
  Stored Procedures:   12
  Triggers:            10

All code objects migrated successfully!

========================================
Next Steps
========================================
Schema objects (Tables, Constraints, Indexes) should be migrated
using GitHub Copilot Chat with the following command:

  /SQLtoPostgres-Migrate-Schema

This will handle conversion of:
  - Tables and column definitions
  - Primary and Foreign Keys
  - Check and Default Constraints
  - Indexes
  - Custom Types and Sequences
========================================

Migration Features

Automatic Conversions

The migration pipeline automatically handles:

Data Types

  • NVARCHAR → VARCHAR
  • DATETIME / DATETIME2 → TIMESTAMP
  • UNIQUEIDENTIFIER → UUID
  • BIT → BOOLEAN
  • TINYINT → SMALLINT
  • MONEY → NUMERIC(19,4)
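These mappings amount to a lookup table applied to each column's base type. The following Python sketch is illustrative only, not the pipeline's actual code:

```python
import re

# Core type mappings from the list above. DATETIME2 must be checked
# before DATETIME would be via prefix matching; a dict keyed on the
# full base name sidesteps that ordering problem.
TYPE_MAP = {
    "NVARCHAR": "VARCHAR",
    "DATETIME": "TIMESTAMP",
    "DATETIME2": "TIMESTAMP",
    "UNIQUEIDENTIFIER": "UUID",
    "BIT": "BOOLEAN",
    "TINYINT": "SMALLINT",
    "MONEY": "NUMERIC(19,4)",
}

def map_type(tsql_type: str) -> str:
    """Map a T-SQL type (with optional length arguments) to PostgreSQL."""
    m = re.match(r"^([A-Z0-9]+)(\(.*\))?$", tsql_type.strip().upper())
    if not m:
        return tsql_type
    base, args = m.group(1), m.group(2) or ""
    pg = TYPE_MAP.get(base, base)
    # MONEY maps to a fixed precision, so drop any source arguments.
    # NVARCHAR(MAX) would need special handling (→ TEXT); omitted here.
    return pg if "(" in pg else pg + args
```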

Identity Columns

-- SQL Server
[EmployeeID] INT IDENTITY(1,1) NOT NULL

-- PostgreSQL
employee_id INTEGER GENERATED ALWAYS AS IDENTITY NOT NULL

Error Handling

-- SQL Server
BEGIN TRY
    -- code
END TRY
BEGIN CATCH
    THROW;
END CATCH

-- PostgreSQL
BEGIN
    -- code
EXCEPTION
    WHEN OTHERS THEN
        RAISE;
END;

Constraints

-- SQL Server
ALTER TABLE [dbo].[Employee]
    ADD CONSTRAINT [PK_Employee] PRIMARY KEY CLUSTERED ([EmployeeID] ASC);

-- PostgreSQL
ALTER TABLE employee
    ADD CONSTRAINT pk_employee PRIMARY KEY (employee_id);

Indexes

-- SQL Server
CREATE NONCLUSTERED INDEX [IX_Employee_Name]
    ON [dbo].[Employee]([LastName] ASC, [FirstName] ASC);

-- PostgreSQL
CREATE INDEX ix_employee_name
    ON employee(last_name, first_name);

No Double Quotes on Identifiers

The migration automatically:

  • Removes SQL Server brackets [name]
  • Converts to lowercase unquoted identifiers
  • Only uses double quotes when absolutely necessary (case-sensitive names)

-- SQL Server
CREATE TABLE [HumanResources].[Employee] (
    [EmployeeID] INT NOT NULL,
    [FirstName] NVARCHAR(50) NOT NULL
);

-- PostgreSQL (no double quotes!)
CREATE TABLE humanresources.employee (
    employee_id INTEGER NOT NULL,
    first_name VARCHAR(50) NOT NULL
);
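The bracket and casing rules above can be sketched roughly as follows. This is a hypothetical helper, not the pipeline's actual implementation (note the example output lowercases schema names without snake_casing them, hence the flag):

```python
import re

def to_pg_identifier(name: str, snake_case: bool = True) -> str:
    """Strip SQL Server brackets and lowercase; optionally snake_case."""
    name = name.strip("[]")
    if snake_case:
        # Insert "_" at lower→Upper and ACRONYMWord boundaries,
        # so EmployeeID becomes Employee_ID before lowercasing.
        name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])",
                      "_", name)
    return name.lower()
```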

Advanced Options

Migrate Specific Object Types

To migrate only specific programmability objects:

# Only migrate views and functions
.\scripts\migrate_code_objects.ps1 `
    -InputDir "Input\YourDatabase" `
    -OutputDir "Output\YourDatabase" `
    -ObjectTypes @("Views", "Functions")

# Only migrate stored procedures
.\scripts\migrate_code_objects.ps1 `
    -InputDir "Input\YourDatabase" `
    -OutputDir "Output\YourDatabase" `
    -ObjectTypes @("StoredProcedures")

Valid object types: Views, Functions, StoredProcedures, Triggers, DatabaseTriggers

Disable Quote Removal

If you want to keep double quotes (not recommended):

python scripts/run_migration_pipeline.py \
    --tsql Input/Tables/dbo/Employee.sql \
    --outdir Output \
    --no-quote-removal

Deployment to PostgreSQL

After migration, deploy objects to PostgreSQL in this order:

1. Create Schemas

psql -d your_database -c "CREATE SCHEMA IF NOT EXISTS humanresources;"
psql -d your_database -c "CREATE SCHEMA IF NOT EXISTS sales;"

2. Deploy in Order

# psql's -f flag takes a single file, so loop over each script.

# Types and sequences
for f in Output/Types/dbo/*.sql; do psql -d your_database -f "$f"; done
for f in Output/Sequences/dbo/*.sql; do psql -d your_database -f "$f"; done

# Tables (structure only)
for f in Output/Tables/dbo/*.sql; do psql -d your_database -f "$f"; done

# Primary Keys
for f in Output/Constraints/PrimaryKey/dbo/*.sql; do psql -d your_database -f "$f"; done

# Views and Functions
for f in Output/Views/dbo/*.sql; do psql -d your_database -f "$f"; done
for f in Output/Functions/dbo/*.sql; do psql -d your_database -f "$f"; done

# Procedures and Triggers
for f in Output/StoredProcedures/dbo/*.sql; do psql -d your_database -f "$f"; done
for f in Output/Triggers/dbo/*.sql; do psql -d your_database -f "$f"; done

# Foreign Keys (after all tables)
for f in Output/Constraints/ForeignKey/dbo/*.sql; do psql -d your_database -f "$f"; done

# Indexes (last)
for f in Output/Indexes/dbo/*.sql; do psql -d your_database -f "$f"; done

Troubleshooting

Error: "SqlPackage not found"

The script will auto-download SqlPackage. If the download fails, install SqlPackage manually and make sure it is on your PATH.

Error: "requests module not found"

pip install requests

Error: "Missing required environment variables"

Check your .env file has all required settings:

  • AZURE_OPENAI_ENDPOINT
  • AZURE_OPENAI_API_VERSION
  • AZURE_OPENAI_KEY
  • Deployment names for DRAFT, REFINE, VERIFY

Migration Errors

Check the error log at the end of migration output. Common issues:

  • Complex T-SQL features not supported in PostgreSQL
  • Circular dependencies between objects
  • Azure OpenAI rate limits or quota issues

Best Practices

  1. Extract first - Always start with DACPAC extraction to organize objects
  2. Schema before code - Migrate schema using GitHub Copilot before running code migration
  3. Review constraints - Check constraint logic is preserved correctly in Copilot output
  4. Test triggers - PostgreSQL trigger syntax differs significantly (each trigger calls a separate trigger function); review carefully
  5. Validate functions - Table-valued functions need special attention
  6. Check indexes - Clustered indexes don't exist in PostgreSQL (use primary keys)
  7. Manual review - Always review and test migrated code before production deployment
  8. Use Copilot for schema - Interactive review is better for structural changes
  9. Use AI pipeline for code - Automated translation works well for procedural code

Migration Workflow Summary

1. Extract DACPAC
   ↓
   .\scripts\extract_dacpac_objects.ps1 -Package YourDB.dacpac
   ↓
   Creates organized Input/YourDB/ structure

2. Migrate Schema (Interactive with GitHub Copilot)
   ↓
   Open Copilot Chat → /SQLtoPostgres-Migrate-Schema
   ↓
   Review and approve table/constraint/index conversions
   ↓
   Output → Output/YourDB/Schema/

3. Migrate Code (Automated with AI)
   ↓
   .\scripts\migrate_code_objects.ps1 -InputDir Input\YourDB -OutputDir Output\YourDB
   ↓
   Converts views, functions, procedures, triggers
   ↓
   Output → Output/YourDB/Views|Functions|StoredProcedures|Triggers/

4. Deploy to PostgreSQL
   ↓
   Run DDL scripts in proper order

Why Two Different Approaches?

GitHub Copilot for Schema (Tables, Constraints, Indexes)

  • Interactive review - You see and approve each change
  • Structural changes - Table schemas need careful review
  • Data type decisions - You can make informed choices about type mappings
  • No API costs - Uses your existing GitHub Copilot subscription
  • Context-aware - Understands your entire project structure

AI Pipeline for Code (Procedures, Functions, Triggers, Views)

  • Automated at scale - Handles hundreds of objects automatically
  • Consistent translations - Uses standardized conversion patterns
  • Multi-stage refinement - Draft → Refine → Verify for quality
  • Complex logic - Better at handling complex procedural code
  • Batch processing - Converts all objects in one run

Need Help?

  • Check the migrations in migrations/AdventureWorks/
  • Review the GitHub Copilot prompts in .github/prompts/
  • See README.md for additional information