Merged
32 changes: 0 additions & 32 deletions .devcontainer/github/devcontainer.json

This file was deleted.

6 changes: 2 additions & 4 deletions .env.sample
Original file line number Diff line number Diff line change
@@ -1,13 +1,11 @@
# API_HOST can be either azure, ollama, openai, or github:
# API_HOST can be either azure, ollama, or openai:
API_HOST=azure
# Needed for Azure:
AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com/openai/v1
AZURE_OPENAI_ENDPOINT=https://YOUR-AZURE-OPENAI-SERVICE-NAME.openai.azure.com
AZURE_OPENAI_CHAT_DEPLOYMENT=YOUR-AZURE-DEPLOYMENT-NAME
# Needed for Ollama:
OLLAMA_ENDPOINT=http://localhost:11434/v1
OLLAMA_MODEL=llama3.1
# Needed for OpenAI.com:
OPENAI_KEY=YOUR-OPENAI-KEY
OPENAI_MODEL=gpt-3.5-turbo
# Needed for GitHub models:
GITHUB_MODEL=gpt-4o
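The sample above groups variables by provider. A small sketch of checking that the variables required by the chosen `API_HOST` are present before running a script (the `REQUIRED` table mirrors `.env.sample`; the helper itself is hypothetical, not part of the repository):

```python
# Required variables per provider, mirroring .env.sample above.
REQUIRED = {
    "azure": ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_CHAT_DEPLOYMENT"],
    "ollama": ["OLLAMA_ENDPOINT", "OLLAMA_MODEL"],
    "openai": ["OPENAI_KEY", "OPENAI_MODEL"],
}


def missing_vars(api_host: str, env: dict) -> list:
    """Return the names of required variables absent for the given host."""
    return [name for name in REQUIRED.get(api_host, []) if name not in env]
```

Calling `missing_vars("azure", dict(os.environ))` before client construction would surface a misconfigured `.env` early, rather than as a `KeyError` mid-script.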
3 changes: 0 additions & 3 deletions .env.sample.github

This file was deleted.

57 changes: 11 additions & 46 deletions AGENTS.md
@@ -4,15 +4,15 @@ This document provides comprehensive instructions for coding agents working on t

## Overview

This repository contains a collection of Python scripts that demonstrate how to use the OpenAI API (and compatible APIs like Azure OpenAI, GitHub Models, and Ollama) to generate chat completions. The repository includes examples of:
This repository contains a collection of Python scripts that demonstrate how to use the OpenAI API (and compatible APIs like Azure OpenAI and Ollama) to generate chat completions. The repository includes examples of:

- Basic chat completions (streaming, async, history)
- Function calling (basic to advanced multi-function scenarios)
- Structured outputs using Pydantic models
- Retrieval-Augmented Generation (RAG) with various complexity levels
- Prompt engineering and safety features

The scripts are designed to be educational and can run with multiple LLM providers: **GitHub Models (preferred for agents)**, Azure OpenAI, OpenAI.com, or local Ollama models.
The scripts are designed to be educational and can run with multiple LLM providers: **Azure OpenAI (preferred)**, OpenAI.com, or local Ollama models.

## Code Layout

@@ -98,7 +98,6 @@ These scripts are automatically run by `azd provision` via the `azure.yaml` post
**Environment Variables:**
- `.env.sample` - Example .env file showing all possible configurations
- `.env.sample.azure` - Azure-specific example
- `.env.sample.github` - GitHub Models example
- `.env.sample.ollama` - Ollama example
- `.env.sample.openai` - OpenAI.com example

@@ -110,17 +109,10 @@ These scripts are automatically run by `azd provision` via the `azure.yaml` post
- Runs: `uv run ruff check .` and `uv run black . --check --verbose`
- **Important:** The CI uses `uv` but local development typically uses standard `pip`

**`test-github-models.yaml` - Integration Test:**
- Runs on: push to main, pull requests to main (limited paths)
- Tests: `chat.py` and `spanish/chat.py` with GitHub Models
- Uses: uv for setup, requires models: read permission
- Sets: `API_HOST=github`, `GITHUB_TOKEN=${{ secrets.GITHUB_TOKEN }}`, `GITHUB_MODEL=openai/gpt-4o-mini`

### Dev Container Files (.devcontainer/)

- `.devcontainer/devcontainer.json` - Default dev container (Azure OpenAI setup with azd)
- `.devcontainer/Dockerfile` - Base Python 3.12 image, installs all requirements-dev.txt
- `.devcontainer/github/` - GitHub Models variant
- `.devcontainer/ollama/` - Ollama variant
- `.devcontainer/openai/` - OpenAI.com variant

@@ -156,31 +148,9 @@ All dev containers install all dependencies from `requirements-dev.txt` which in

### Configuring LLM Provider

**For agents, ALWAYS prefer GitHub Models** since the agent often won't have access to Azure OpenAI or paid API keys.

The scripts read environment variables from a `.env` file. Create one based on your provider:

#### Option 1: GitHub Models (RECOMMENDED for agents)

**For agents:** Check if `GITHUB_TOKEN` environment variable is available:
```bash
if [ -n "$GITHUB_TOKEN" ]; then
echo "GitHub Models available - GITHUB_TOKEN is set"
else
echo "GitHub Models not available - GITHUB_TOKEN not found"
fi
```

In GitHub Codespaces, `GITHUB_TOKEN` is already set, so **no .env file is needed** - scripts will work immediately.

If `GITHUB_TOKEN` is available, you can optionally set a different model (default is `gpt-4o`):
```bash
export GITHUB_MODEL=openai/gpt-4o-mini
```

**Models that support function calling:** `gpt-4o`, `gpt-4o-mini`, `o3-mini`, `AI21-Jamba-1.5-Large`, `AI21-Jamba-1.5-Mini`, `Codestral-2501`, `Cohere-command-r`, `Ministral-3B`, `Mistral-Large-2411`, `Mistral-Nemo`, `Mistral-small`

#### Option 2: Azure OpenAI (requires Azure resources and costs)
#### Option 1: Azure OpenAI (recommended)

**For agents:** Check if Azure OpenAI environment variables are already configured:
```bash
@@ -199,7 +169,7 @@ azd provision

This creates real Azure resources that incur costs. The `.env` file would be created automatically with all needed variables after provisioning.

#### Option 3: OpenAI.com (requires API key and costs)
#### Option 2: OpenAI.com (requires API key and costs)

**For agents:** Check if OpenAI.com API key is available:
```bash
@@ -212,7 +182,7 @@ fi

If `OPENAI_API_KEY` is available, ensure `API_HOST=openai` and `OPENAI_MODEL` are also set (e.g., `gpt-4o-mini`).

#### Option 4: Ollama (requires local Ollama installation)
#### Option 3: Ollama (requires local Ollama installation)

**For agents:** Check if Ollama is installed and running:
```bash
@@ -292,13 +262,9 @@ pre-commit run --all-files

### Integration Tests

The repository has limited automated testing via GitHub Actions. The primary test runs basic scripts with GitHub Models:
The repository has limited automated testing via GitHub Actions. Changes to scripts should be manually verified by running them:

```bash
# This is what the CI does (requires GitHub token):
export API_HOST=github
export GITHUB_TOKEN=$YOUR_TOKEN
export GITHUB_MODEL=openai/gpt-4o-mini
python chat.py
python spanish/chat.py
```
@@ -309,13 +275,13 @@ python spanish/chat.py

### Environment Variables

- **All scripts default to `API_HOST=github`** if no .env file is present and no environment variable is set.
- **All scripts default to `API_HOST=azure`** if no .env file is present and no environment variable is set.
- Scripts use `load_dotenv(override=True)` which means .env values override environment variables.

### Model Compatibility

- **Function calling scripts require models that support tools**. Not all models support this:
- ✅ Supported: `gpt-4o`, `gpt-4o-mini`, and many others (see GitHub Models list above)
- ✅ Supported: `gpt-4o`, `gpt-4o-mini`, `gpt-5`, and many others
- ❌ Not supported: Older models, some local Ollama models
- If a script fails with a function calling error, check if your model supports the `tools` parameter.
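The `tools` parameter that these models must support takes a JSON-schema description of each callable function. A minimal sketch of the expected shape (the weather function is hypothetical, for illustration only):

```python
# One tool definition in the shape the `tools` parameter expects;
# the lookup_weather function itself is hypothetical.
tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

# A model without tool support rejects requests carrying this parameter;
# a tools-capable model may answer with tool calls instead of plain text.
```

Passing this list via `tools=tools` is what triggers the error on unsupported models, so removing or gating the parameter is the usual fallback for local Ollama models.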

@@ -365,14 +331,14 @@ python spanish/chat.py
- Solution: Install RAG dependencies: `python -m pip install -r requirements-rag.txt`

**Error: `KeyError: 'AZURE_OPENAI_ENDPOINT'`**
- Solution: Your `.env` file is missing required Azure variables, or `API_HOST` is set to `azure` but you haven't configured Azure. Switch to GitHub Models or configure Azure properly.
- Solution: Your `.env` file is missing required Azure variables, or `API_HOST` is set to `azure` but you haven't configured Azure. Run `azd provision` or configure Azure properly.

**Error: `openai.APIError: content_filter`**
- This is expected behavior for `chat_safety.py` - it's demonstrating content filtering.
- The script catches this error and prints a message.

**Error: Function calling not supported**
- Solution: Use a model that supports tools. For GitHub Models, use `gpt-4o`, `gpt-4o-mini`, or another compatible model from the list above.
- Solution: Use a model that supports tools, such as `gpt-4o`, `gpt-4o-mini`, or `gpt-5`.

**Error: `azd` command not found**
- Solution: Install Azure Developer CLI: https://aka.ms/install-azd
@@ -385,9 +351,8 @@ When making code changes:
2. **Run linters before making changes** to understand baseline: `ruff check .` and `black . --check`
3. **Make minimal, surgical changes** to the relevant scripts.
4. **Run linters again after changes**: `ruff check .` and `black .` (auto-fix)
5. **Manually test the changed script** with GitHub Models:
5. **Manually test the changed script**:
```bash
export GITHUB_TOKEN=$YOUR_TOKEN
python your_modified_script.py
```
6. **Check that Spanish translations are updated** if applicable.
33 changes: 1 addition & 32 deletions README.md
@@ -10,7 +10,6 @@ This repository contains a collection of Python scripts that demonstrate how to
* [Retrieval-Augmented Generation (RAG)](#retrieval-augmented-generation-rag)
* [Setting up the Python environment](#setting-up-the-python-environment)
* [Configuring the OpenAI environment variables](#configuring-the-openai-environment-variables)
* [Using GitHub Models](#using-github-models)
* [Using Azure OpenAI models](#using-azure-openai-models)
* [Using OpenAI.com models](#using-openaicom-models)
* [Using Ollama models](#using-ollama-models)
@@ -94,35 +93,11 @@ python -m pip install -r requirements.txt

## Configuring the OpenAI environment variables

These scripts can be run with Azure OpenAI account, OpenAI.com, local Ollama server, or GitHub models,
These scripts can be run with an Azure OpenAI account, OpenAI.com, or a local Ollama server,
depending on the environment variables you set. All the scripts reference the environment variables from a `.env` file, and an example `.env.sample` file is provided. Host-specific instructions are below.

## Using GitHub Models

If you open this repository in GitHub Codespaces, you can run the scripts for free using GitHub Models without any additional steps, as your `GITHUB_TOKEN` is already configured in the Codespaces environment.

If you want to run the scripts locally, you need to set up the `GITHUB_TOKEN` environment variable with a GitHub [personal access token (PAT)](https://github.com/settings/tokens). You can create a PAT by following these steps:

1. Go to your GitHub account settings.
2. Click on "Developer settings" in the left sidebar.
3. Click on "Personal access tokens" in the left sidebar.
4. Click on "Tokens (classic)" or "Fine-grained tokens" depending on your preference.
5. Click on "Generate new token".
6. Give your token a name and select the scopes you want to grant. For this project, you don't need any specific scopes.
7. Click on "Generate token".
8. Copy the generated token.
9. Set the `GITHUB_TOKEN` environment variable in your terminal or IDE:

```shell
export GITHUB_TOKEN=your_personal_access_token
```

10. Optionally, you can use a model other than "gpt-4o" by setting the `GITHUB_MODEL` environment variable. Use a model that supports function calling, such as: `gpt-4o`, `gpt-4o-mini`, `o3-mini`, `AI21-Jamba-1.5-Large`, `AI21-Jamba-1.5-Mini`, `Codestral-2501`, `Cohere-command-r`, `Ministral-3B`, `Mistral-Large-2411`, `Mistral-Nemo`, `Mistral-small`

## Using Azure OpenAI models

You can run all examples in this repository using GitHub Models. If you want to run the examples using models from Azure OpenAI instead, you need to provision the Azure AI resources, which will incur costs.

This project includes infrastructure as code (IaC) to provision Azure OpenAI deployments of "gpt-4o" and "text-embedding-3-large". The IaC is defined in the `infra` directory and uses the Azure Developer CLI to provision the resources.

1. Make sure the [Azure Developer CLI (azd)](https://aka.ms/install-azd) is installed.
@@ -133,12 +108,6 @@ This project includes infrastructure as code (IaC) to provision Azure OpenAI dep
azd auth login
```

For GitHub Codespaces users, if the previous command fails, try:

```shell
azd auth login --use-device-code
```

3. Provision the OpenAI account:

```shell
29 changes: 14 additions & 15 deletions chained_calls.py
@@ -6,14 +6,14 @@

# Setup the OpenAI client to use either Azure, OpenAI.com, or Ollama API
load_dotenv(override=True)
API_HOST = os.getenv("API_HOST", "github")
API_HOST = os.getenv("API_HOST", "azure")

if API_HOST == "azure":
token_provider = azure.identity.get_bearer_token_provider(
azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
client = openai.OpenAI(
base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
base_url=f"{os.environ['AZURE_OPENAI_ENDPOINT'].rstrip('/')}/openai/v1/",
api_key=token_provider,
)
MODEL_NAME = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
@@ -22,42 +22,40 @@
client = openai.OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")
MODEL_NAME = os.environ["OLLAMA_MODEL"]

elif API_HOST == "github":
client = openai.OpenAI(base_url="https://models.github.ai/inference", api_key=os.environ["GITHUB_TOKEN"])
MODEL_NAME = os.getenv("GITHUB_MODEL", "openai/gpt-4o")

else:
client = openai.OpenAI(api_key=os.environ["OPENAI_KEY"])
MODEL_NAME = os.environ["OPENAI_MODEL"]


response = client.chat.completions.create(
response = client.responses.create(
model=MODEL_NAME,
temperature=0.7,
messages=[{"role": "user", "content": "Explain how LLMs work in a single paragraph."}],
input=[{"role": "user", "content": "Explain how LLMs work in a single paragraph."}],
store=False,
)

explanation = response.choices[0].message.content
explanation = response.output_text
print("Explanation: ", explanation)
response = client.chat.completions.create(
response = client.responses.create(
model=MODEL_NAME,
temperature=0.7,
messages=[
input=[
{
"role": "user",
"content": "You're an editor. Review the explanation and provide feedback (but don't edit yourself):\n\n"
+ explanation,
}
],
store=False,
)

feedback = response.choices[0].message.content
feedback = response.output_text
print("\n\nFeedback: ", feedback)

response = client.chat.completions.create(
response = client.responses.create(
model=MODEL_NAME,
temperature=0.7,
messages=[
input=[
{
"role": "user",
"content": (
@@ -66,7 +64,8 @@
),
}
],
store=False,
)

final_article = response.choices[0].message.content
final_article = response.output_text
print("\n\nFinal Article: ", final_article)
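The new Azure `base_url` construction in this diff tolerates an endpoint configured with or without a trailing slash. The normalization can be sketched on its own (function name is illustrative, not from the repository):

```python
def azure_v1_base_url(endpoint: str) -> str:
    """Build the /openai/v1/ base URL from an Azure OpenAI endpoint,
    tolerating a trailing slash on the configured endpoint value."""
    return f"{endpoint.rstrip('/')}/openai/v1/"
```

This is why the `.env.sample` change above drops the `/openai/v1` suffix from `AZURE_OPENAI_ENDPOINT`: the scripts now append it themselves, so either form of the endpoint works.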
15 changes: 6 additions & 9 deletions chat.py
@@ -6,14 +6,14 @@

# Setup the OpenAI client to use either Azure, OpenAI.com, or Ollama API
load_dotenv(override=True)
API_HOST = os.getenv("API_HOST", "github")
API_HOST = os.getenv("API_HOST", "azure")

if API_HOST == "azure":
token_provider = azure.identity.get_bearer_token_provider(
azure.identity.DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
client = openai.OpenAI(
base_url=os.environ["AZURE_OPENAI_ENDPOINT"],
base_url=f"{os.environ['AZURE_OPENAI_ENDPOINT'].rstrip('/')}/openai/v1/",
api_key=token_provider,
)
MODEL_NAME = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
@@ -22,23 +22,20 @@
client = openai.OpenAI(base_url=os.environ["OLLAMA_ENDPOINT"], api_key="nokeyneeded")
MODEL_NAME = os.environ["OLLAMA_MODEL"]

elif API_HOST == "github":
client = openai.OpenAI(base_url="https://models.github.ai/inference", api_key=os.environ["GITHUB_TOKEN"])
MODEL_NAME = os.getenv("GITHUB_MODEL", "openai/gpt-4o")

else:
client = openai.OpenAI(api_key=os.environ["OPENAI_KEY"])
MODEL_NAME = os.environ["OPENAI_MODEL"]


response = client.chat.completions.create(
response = client.responses.create(
model=MODEL_NAME,
temperature=0.7,
messages=[
input=[
{"role": "system", "content": "You are a helpful assistant that makes lots of cat references and uses emojis."},
{"role": "user", "content": "What's the weather in SF today?"},
],
store=False,
)

print(f"Response from {API_HOST}: \n")
print(response.choices[0].message.content)
print(response.output_text)