
Commit 98d7d36

Peter Johnson authored and committed
initialise function with a basic model (nn)
1 parent fa2dc06 commit 98d7d36

13 files changed

Lines changed: 1541 additions & 261 deletions

File tree

.DS_Store

6 KB
Binary file not shown.

README.md

Lines changed: 4 additions & 45 deletions

@@ -1,51 +1,10 @@
-# Python Evaluation Function
+# langModels 'Evaluation Function'
 
-This repository contains the boilerplate code needed to create a containerized evaluation function written in Python.
+A collection of small language models, leading up to LLM-like behaviour and then calling external LLMs. The purpose of the function is to provide interactive learning materials about language models (a.k.a. 'AI'). It is primarily designed to interact with [Lambda Feedback](https://www.lambdafeedback.com), but the code is public so it can be used for other purposes.
 
-## Quickstart
+Code is free to use or adapt, but all liability lies with the user, and credit would be appreciated (Peter B. Johnson, Imperial College London).
 
-This chapter helps you to quickly set up a new Python evaluation function using this template repository.
-
-> [!NOTE]
-> After setting up the evaluation function, delete this chapter from the `README.md` file, and add your own documentation.
-
-#### 1. Create a new repository
-
-- In GitHub, choose `Use this template` > `Create a new repository` in the repository toolbar.
-
-- Choose the owner, and pick a name for the new repository.
-
-> [!IMPORTANT]
-> If you want to deploy the evaluation function to Lambda Feedback, make sure to choose the Lambda Feedback organization as the owner.
-
-- Set the visibility to `Public` or `Private`.
-
-> [!IMPORTANT]
-> If you want to use GitHub [deployment protection rules](https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment#deployment-protection-rules), make sure to set the visibility to `Public`.
-
-- Click on `Create repository`.
-
-#### 2. Clone the new repository
-
-Clone the new repository to your local machine using the following command:
-
-```bash
-git clone <repository-url>
-```
-
-#### 3. Configure the evaluation function
-
-When deploying to Lambda Feedback, set the evaluation function name in the `config.json` file. Read the [Deploy to Lambda Feedback](#deploy-to-lambda-feedback) section for more information.
-
-#### 4. Develop the evaluation function
-
-You're ready to start developing your evaluation function. Head over to the [Development](#development) section to learn more.
-
-#### 5. Update the README
-
-In the `README.md` file, change the title and description so it fits the purpose of your evaluation function.
-
-Also, don't forget to delete the Quickstart chapter from the `README.md` file after you've completed these steps.
+For more information on the function, see the `/docs` folder. The remainder of this README is generic guidance, from the boilerplate, about running the function locally during development.
 
 ## Usage

config.json

Lines changed: 1 addition & 1 deletion

@@ -1,3 +1,3 @@
 {
-    "EvaluationFunctionName": ""
+    "EvaluationFunctionName": "langModels"
 }

docs/user.md

Lines changed: 4 additions & 2 deletions

@@ -1,3 +1,5 @@
-# YourFunctionName
+# langModels
 
-Teacher-facing documentation for this function.
+A series of language models that students can run for inference to explore their performance.
 
+Created from scratch with just a neural network for modelling data from a sine wave, to ensure the software all works in integration tests. Following that, more models will be added (Shannon n-grams (letters); Shannon n-grams (words); Bengio's neural network model of language with a small context window; basic transformer models; larger transformer models; external LLMs).
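As a rough illustration of the first planned addition, a Shannon-style letter n-gram model can be sketched in a few lines. This is hypothetical code, not part of this commit; the function names are illustrative:

```python
import random
from collections import defaultdict, Counter

def train_letter_ngrams(text, n=2):
    """Count, for each (n-1)-letter context, the observed next-letter frequencies."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        context, nxt = text[i:i + n - 1], text[i + n - 1]
        counts[context][nxt] += 1
    return counts

def sample_next(counts, context):
    """Sample the next letter in proportion to its observed frequency."""
    options = counts[context]
    letters = list(options)
    weights = [options[letter] for letter in letters]
    return random.choices(letters, weights=weights)[0]

counts = train_letter_ngrams("abracadabra", n=2)
# After 'a', the observed successors are 'b' (twice), 'c', and 'd'.
```

Generating text is then just repeated sampling, feeding each sampled letter back in as the next context.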

evaluation_function/.DS_Store

8 KB
Binary file not shown.

evaluation_function/dev.py

Lines changed: 4 additions & 1 deletion

@@ -15,8 +15,11 @@ def dev():
 
     answer = sys.argv[1]
     response = sys.argv[2]
+    model = sys.argv[3] if len(sys.argv) > 3 else "basic_nn"
+    refresh = sys.argv[4].lower() == "true" if len(sys.argv) > 4 else False
+    params = Params(model=model, refresh=refresh)
 
-    result = evaluation_function(answer, response, Params())
+    result = evaluation_function(answer, response, params)
 
     print(result.to_dict())
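The optional-argument handling in `dev()` can be exercised in isolation. The sketch below (names assumed, not part of this commit) mirrors the parsing logic; note that `sys.argv[4]` is only safe to read when `len(sys.argv) > 4`, so the `refresh` guard must use a strict comparison:

```python
def parse_cli(argv):
    """Mirror dev.py's argument handling: answer and response are required;
    model and refresh are optional positionals with defaults."""
    answer, response = argv[1], argv[2]
    model = argv[3] if len(argv) > 3 else "basic_nn"
    # argv[4] exists only when len(argv) > 4; '>= 4' would raise IndexError.
    refresh = argv[4].lower() == "true" if len(argv) > 4 else False
    return answer, response, model, refresh

print(parse_cli(["dev.py", "0.0", "0.5"]))
# → ('0.0', '0.5', 'basic_nn', False)
```

With all four arguments supplied, e.g. `["dev.py", "0.0", "0.5", "basic_nn", "True"]`, the same function returns `refresh=True`.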

evaluation_function/evaluation.py

Lines changed: 13 additions & 4 deletions

@@ -1,13 +1,15 @@
 from typing import Any
 from lf_toolkit.evaluation import Result, Params
 
+from . import models
+
 def evaluation_function(
     response: Any,
     answer: Any,
     params: Params,
 ) -> Result:
     """
-    Function used to evaluate a student response.
+    Evaluation Function.
     ---
     The handler function passes three arguments to evaluation_function():

@@ -29,6 +31,13 @@ def evaluation_function(
     to output the evaluation response.
     """
 
-    return Result(
-        is_correct=response == answer
-    )
+    model_name = getattr(params, "model", "basic_nn")  # default model
+    try:
+        model = getattr(models, model_name)  # e.g. models.basic_nn
+    except AttributeError:
+        raise ValueError(f"Unknown model: {model_name}")
+
+    if not hasattr(model, "run"):
+        raise ValueError(f"Model {model_name} has no run()")
+
+    return model.run(response, answer, params)
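The `getattr`-based dispatch added here can be demonstrated with a stand-in `models` namespace. This is a minimal sketch under assumed names, not the real package:

```python
import types

# Stand-in "models" package: one module exposing run(), one without it.
basic_nn = types.SimpleNamespace(
    run=lambda response, answer, params: f"basic_nn({response})"
)
broken = types.SimpleNamespace()
models = types.SimpleNamespace(basic_nn=basic_nn, broken=broken)

def dispatch(model_name, response, answer, params=None):
    """Resolve a model module by name and delegate to its run() hook."""
    try:
        model = getattr(models, model_name)  # AttributeError if name unknown
    except AttributeError:
        raise ValueError(f"Unknown model: {model_name}")
    if not hasattr(model, "run"):
        raise ValueError(f"Model {model_name} has no run()")
    return model.run(response, answer, params)

print(dispatch("basic_nn", "0.5", None))  # → basic_nn(0.5)
```

The pattern keeps `evaluation_function()` stable while new model modules only need to be added to the `models` package with a `run()` entry point.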
6 KB

Binary file not shown.

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@
+from . import basic_nn
+
+__all__ = ["basic_nn"]
Lines changed: 115 additions & 0 deletions

@@ -0,0 +1,115 @@
+"""
+A simple feedforward neural network in PyTorch to illustrate
+the basic features of a neural network.
+
+Dev only:
+- Data: add random noise to a time series
+- Model: a tiny neural network with one hidden layer, using PyTorch nn.Module
+- Training setup: mean squared error loss and Adam optimizer
+- Training loop: runs for a fixed number of epochs, printing loss occasionally
+- Save the model to disk after training
+
+Production:
+- Load the trained model
+- Test the model for the argument given by the student (infer the value and compare it to the underlying 'true' function)
+"""
+
+import torch
+import torch.nn as nn
+import torch.optim as optim
+import matplotlib.pyplot as plt
+
+from lf_toolkit.evaluation import Result, Params
+
+from pathlib import Path
+import os
+
+# Set up paths for saving/loading the model and data
+BASE_DIR = Path(__file__).resolve().parent
+MODEL_DIR = Path(os.environ.get("MODEL_DIR", BASE_DIR / "storage"))
+MODEL_DIR.mkdir(parents=True, exist_ok=True)
+MODEL_PATH = MODEL_DIR / "basic_nn.pt"
+
+def f(x):
+    """Target function (sine wave); noise is added during training."""
+    return torch.sin(x)
+
+def x_on_model(v, dev):
+    """Helper: put a scalar value on the same device as the model."""
+    return torch.tensor([[v]], device=dev, dtype=torch.float32)
+
+class TinyNet(nn.Module):
+    """A tiny feedforward neural network."""
+    def __init__(self):
+        super().__init__()
+        self.hidden = nn.Linear(1, 16)
+        self.act = nn.Tanh()
+        self.out = nn.Linear(16, 1)
+
+    def forward(self, x):
+        return self.out(self.act(self.hidden(x)))
+
+def train_model(device):
+    torch.manual_seed(0)
+    x = torch.linspace(-2*torch.pi, 2*torch.pi, 200).unsqueeze(1).to(device)
+    y = (f(x) + 0.1*torch.randn_like(x)).to(device)
+
+    model = TinyNet().to(device)
+    loss_fn = nn.MSELoss()
+    opt = optim.Adam(model.parameters(), lr=0.01)
+
+    for epoch in range(2000):
+        y_pred = model(x)
+        loss = loss_fn(y_pred, y)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
+        if epoch % 400 == 0:
+            print(f"Epoch {epoch}: loss={loss.item():.4f}")
+
+    return model
+
+def run(response, answer, params: Params) -> Result:
+    print("GPU" if torch.backends.mps.is_available() else "CPU")
+    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
+    refresh = params.get("refresh", False)
+    if refresh:
+        model = train_model(device)
+        MODEL_DIR.mkdir(parents=True, exist_ok=True)
+        torch.save(model.state_dict(), MODEL_PATH)
+    else:
+        model = TinyNet().to(device)
+        model.load_state_dict(torch.load(MODEL_PATH, map_location=device))
+    model.eval()
+
+    with torch.no_grad():
+        # For now, just test one point
+        x_val = x_on_model(float(response), device)
+        y_pred = model(x_val).cpu().item()
+
+    absolute_tolerance = params.get("absolute_tolerance", 0.1)
+    y_true = f(torch.tensor([[float(response)]])).item()
+    diff = abs(y_pred - y_true)
+    is_correct = diff < absolute_tolerance
+    return Result(
+        is_correct=is_correct,
+        feedback_items=[(
+            "general",
+            f"Model({response}) = {y_pred:.4f}, f({response}) = {y_true:.4f} "
+            f"(this is the 'true' value), Diff = {diff:.4f} "
+            f"(tolerance {absolute_tolerance}). Valid model: {is_correct}",
+        )],
+    )
+
+# --- Runnable code: only executes if the script is run directly ---
+if __name__ == "__main__":
+    result = run("0.5", "some_answer", Params())  # response must parse as a float
+    print(result)
+
+""" # 5. Plot results (eval mode, extended domain)
+with torch.no_grad():
+    # Make the domain twice as wide as the training range
+    x_plot = torch.linspace(2*x.min().item(), 2*x.max().item(), 800, device=x.device).unsqueeze(1)
+    y_plot = model(x_plot)
+
+plt.scatter(x.cpu(), y.cpu(), s=10, label="Data")
+plt.plot(x_plot.cpu(), y_plot.cpu(), color="red", label="Model")
+plt.legend()
+plt.show() """
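Stripped of the PyTorch machinery, the correctness rule in `run()` reduces to comparing the prediction against `sin(x)` within an absolute tolerance. A minimal pure-Python sketch (the helper name is illustrative, not from the commit):

```python
import math

def check_prediction(x, y_pred, absolute_tolerance=0.1):
    """Compare a model prediction against the 'true' sine value,
    mirroring the correctness rule used in basic_nn's run()."""
    y_true = math.sin(x)
    diff = abs(y_pred - y_true)
    return diff < absolute_tolerance, diff

ok, diff = check_prediction(0.0, 0.05)  # sin(0) = 0, so diff = 0.05
print(ok)  # → True
```

A prediction of 0.2 at the same point would fail, since 0.2 exceeds the default tolerance of 0.1.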
