
Commit 98d7d36

Peter Johnson authored and committed
initialise function with a basic model (nn)
1 parent fa2dc06 commit 98d7d36

13 files changed

Lines changed: 1541 additions & 261 deletions

File tree

.DS_Store

6 KB
Binary file not shown.

README.md

Lines changed: 4 additions & 45 deletions

@@ -1,51 +1,10 @@
-# Python Evaluation Function
+# langModels 'Evaluation Function'
 
-This repository contains the boilerplate code needed to create a containerized evaluation function written in Python.
+A collection of small language models, leading up to LLM-like behaviour and then calling external LLMs. The purpose of the function is to provide interactive learning materials about language models (a.k.a. 'AI'). It is primarily designed to interact with [Lambda Feedback](https://www.lambdafeedback.com), but the code is public so it can be used for other purposes.
 
-## Quickstart
+Code is free to use or adapt, but all liability lies with the user, and credit would be appreciated (Peter B. Johnson, Imperial College London).
 
-This chapter helps you to quickly set up a new Python evaluation function using this template repository.
-
-> [!NOTE]
-> After setting up the evaluation function, delete this chapter from the `README.md` file, and add your own documentation.
-
-#### 1. Create a new repository
-
-- In GitHub, choose `Use this template` > `Create a new repository` in the repository toolbar.
-
-- Choose the owner, and pick a name for the new repository.
-
-> [!IMPORTANT]
-> If you want to deploy the evaluation function to Lambda Feedback, make sure to choose the Lambda Feedback organization as the owner.
-
-- Set the visibility to `Public` or `Private`.
-
-> [!IMPORTANT]
-> If you want to use GitHub [deployment protection rules](https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment#deployment-protection-rules), make sure to set the visibility to `Public`.
-
-- Click on `Create repository`.
-
-#### 2. Clone the new repository
-
-Clone the new repository to your local machine using the following command:
-
-```bash
-git clone <repository-url>
-```
-
-#### 3. Configure the evaluation function
-
-When deploying to Lambda Feedback, set the evaluation function name in the `config.json` file. Read the [Deploy to Lambda Feedback](#deploy-to-lambda-feedback) section for more information.
-
-#### 4. Develop the evaluation function
-
-You're ready to start developing your evaluation function. Head over to the [Development](#development) section to learn more.
-
-#### 5. Update the README
-
-In the `README.md` file, change the title and description so it fits the purpose of your evaluation function.
-
-Also, don't forget to delete the Quickstart chapter from the `README.md` file after you've completed these steps.
+For more information on the function, see the `/docs` folder. The remainder of this README is generic guidance, from the boilerplate, about running the function locally during development.
 
 ## Usage

config.json

Lines changed: 1 addition & 1 deletion

@@ -1,3 +1,3 @@
 {
-    "EvaluationFunctionName": ""
+    "EvaluationFunctionName": "langModels"
 }

docs/user.md

Lines changed: 4 additions & 2 deletions

@@ -1,3 +1,5 @@
-# YourFunctionName
+# langModels
 
-Teacher-facing documentation for this function.
+A series of language models that students can run for inference to explore their performance.
 
+Created from scratch with just a neural network for modelling data from a sine wave, to ensure the software all works in integration tests. Following that, more models will be added (Shannon n-grams (letters); Shannon n-grams (words); Bengio's neural network model of language with a small context window; basic transformer models; larger transformer models; external LLMs).
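As a rough illustration of the first planned addition, a Shannon-style letter n-gram model can be sketched in a few lines. This is hypothetical code, not part of this commit; the function names are illustrative:

```python
import random
from collections import defaultdict, Counter

def train_letter_ngrams(text, n=2):
    """Count, for each (n-1)-letter context, the observed next-letter frequencies."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        context, nxt = text[i:i + n - 1], text[i + n - 1]
        counts[context][nxt] += 1
    return counts

def sample_next(counts, context):
    """Sample the next letter in proportion to its observed frequency."""
    options = counts[context]
    letters = list(options)
    weights = [options[letter] for letter in letters]
    return random.choices(letters, weights=weights)[0]

counts = train_letter_ngrams("abracadabra", n=2)
# After 'a', the observed successors are 'b' (twice), 'c', and 'd'.
```

Generating text is then just repeated sampling, feeding each sampled letter back in as the next context.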

evaluation_function/.DS_Store

8 KB
Binary file not shown.

evaluation_function/dev.py

Lines changed: 4 additions & 1 deletion

@@ -15,8 +15,11 @@ def dev():
 
     answer = sys.argv[1]
     response = sys.argv[2]
+    model = sys.argv[3] if len(sys.argv) > 3 else "basic_nn"
+    refresh = sys.argv[4].lower() == "true" if len(sys.argv) > 4 else False
+    params = Params(model=model, refresh=refresh)
 
-    result = evaluation_function(answer, response, Params())
+    result = evaluation_function(answer, response, params)
 
     print(result.to_dict())
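The optional-argument handling in `dev()` can be exercised in isolation. The sketch below (names assumed, not part of this commit) mirrors the parsing logic; note that `sys.argv[4]` is only safe to read when `len(sys.argv) > 4`, so the `refresh` guard must use a strict comparison:

```python
def parse_cli(argv):
    """Mirror dev.py's argument handling: answer and response are required;
    model and refresh are optional positionals with defaults."""
    answer, response = argv[1], argv[2]
    model = argv[3] if len(argv) > 3 else "basic_nn"
    # argv[4] exists only when len(argv) > 4; '>= 4' would raise IndexError.
    refresh = argv[4].lower() == "true" if len(argv) > 4 else False
    return answer, response, model, refresh

print(parse_cli(["dev.py", "0.0", "0.5"]))
# → ('0.0', '0.5', 'basic_nn', False)
```

With all four arguments supplied, e.g. `["dev.py", "0.0", "0.5", "basic_nn", "True"]`, the same function returns `refresh=True`.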

evaluation_function/evaluation.py

Lines changed: 13 additions & 4 deletions

@@ -1,13 +1,15 @@
 from typing import Any
 from lf_toolkit.evaluation import Result, Params
 
+from . import models
+
 def evaluation_function(
     response: Any,
     answer: Any,
     params: Params,
 ) -> Result:
     """
-    Function used to evaluate a student response.
+    Evaluation Function.
     ---
     The handler function passes three arguments to evaluation_function():

@@ -29,6 +31,13 @@ def evaluation_function(
     to output the evaluation response.
     """
 
-    return Result(
-        is_correct=response == answer
-    )
+    model_name = getattr(params, "model", "basic_nn")  # default model
+    try:
+        model = getattr(models, model_name)  # e.g. models.basic_nn
+    except AttributeError:
+        raise ValueError(f"Unknown model: {model_name}")
+
+    if not hasattr(model, "run"):
+        raise ValueError(f"Model {model_name} has no run()")
+
+    return model.run(response, answer, params)
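The `getattr`-based dispatch added here can be demonstrated with a stand-in `models` namespace. This is a minimal sketch under assumed names, not the real package:

```python
import types

# Stand-in "models" package: one module exposing run(), one without it.
basic_nn = types.SimpleNamespace(
    run=lambda response, answer, params: f"basic_nn({response})"
)
broken = types.SimpleNamespace()
models = types.SimpleNamespace(basic_nn=basic_nn, broken=broken)

def dispatch(model_name, response, answer, params=None):
    """Resolve a model module by name and delegate to its run() hook."""
    try:
        model = getattr(models, model_name)  # AttributeError if name unknown
    except AttributeError:
        raise ValueError(f"Unknown model: {model_name}")
    if not hasattr(model, "run"):
        raise ValueError(f"Model {model_name} has no run()")
    return model.run(response, answer, params)

print(dispatch("basic_nn", "0.5", None))  # → basic_nn(0.5)
```

The pattern keeps `evaluation_function()` stable while new model modules only need to be added to the `models` package with a `run()` entry point.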
6 KB

Binary file not shown.

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@
+from . import basic_nn
+
+__all__ = ["basic_nn"]
Lines changed: 115 additions & 0 deletions

@@ -0,0 +1,115 @@
+"""
+A simple feedforward neural network in PyTorch to illustrate
+the basic features of a neural network.
+
+Dev only:
+- Data: add random noise to a time series
+- Model: a tiny neural network with one hidden layer, using PyTorch nn.Module
+- Training setup: mean squared error loss and Adam optimizer
+- Training loop: runs for a fixed number of epochs, printing loss occasionally
+- Save the model to disk after training
+
+Production:
+- Load the trained model
+- Test the model for the argument given by the student (infer the value and compare it to the underlying 'true' function)
+"""
+
+import torch
+import torch.nn as nn
+import torch.optim as optim
+import matplotlib.pyplot as plt
+
+from lf_toolkit.evaluation import Result, Params
+
+from pathlib import Path
+import os
+
+# Set up paths for saving/loading the model and data
+BASE_DIR = Path(__file__).resolve().parent
+MODEL_DIR = Path(os.environ.get("MODEL_DIR", BASE_DIR / "storage"))
+MODEL_DIR.mkdir(parents=True, exist_ok=True)
+MODEL_PATH = MODEL_DIR / "basic_nn.pt"
+
+def f(x):
+    """Target function (sine wave); noise is added during training."""
+    return torch.sin(x)
+
+def x_on_model(v, dev):
+    """Helper: put a scalar value on the same device as the model."""
+    return torch.tensor([[v]], device=dev, dtype=torch.float32)
+
+class TinyNet(nn.Module):
+    """A tiny feedforward neural network."""
+    def __init__(self):
+        super().__init__()
+        self.hidden = nn.Linear(1, 16)
+        self.act = nn.Tanh()
+        self.out = nn.Linear(16, 1)
+
+    def forward(self, x):
+        return self.out(self.act(self.hidden(x)))
+
+def train_model(device):
+    torch.manual_seed(0)
+    x = torch.linspace(-2*torch.pi, 2*torch.pi, 200).unsqueeze(1).to(device)
+    y = (f(x) + 0.1*torch.randn_like(x)).to(device)
+
+    model = TinyNet().to(device)
+    loss_fn = nn.MSELoss()
+    opt = optim.Adam(model.parameters(), lr=0.01)
+
+    for epoch in range(2000):
+        y_pred = model(x)
+        loss = loss_fn(y_pred, y)
+        opt.zero_grad()
+        loss.backward()
+        opt.step()
+        if epoch % 400 == 0:
+            print(f"Epoch {epoch}: loss={loss.item():.4f}")
+
+    return model
+
+def run(response, answer, params: Params) -> Result:
+    print("GPU" if torch.backends.mps.is_available() else "CPU")
+    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
+    refresh = params.get("refresh", False)
+    if refresh:
+        model = train_model(device)
+        MODEL_DIR.mkdir(parents=True, exist_ok=True)
+        torch.save(model.state_dict(), MODEL_PATH)
+    else:
+        model = TinyNet().to(device)
+        model.load_state_dict(torch.load(MODEL_PATH, map_location=device))
+    model.eval()
+
+    with torch.no_grad():
+        # For now, just test one point
+        x_val = x_on_model(float(response), device)
+        y_pred = model(x_val).cpu().item()
+
+    absolute_tolerance = params.get("absolute_tolerance", 0.1)
+    y_true = f(torch.tensor([[float(response)]])).item()
+    diff = abs(y_pred - y_true)
+    is_correct = diff < absolute_tolerance
+    return Result(
+        is_correct=is_correct,
+        feedback_items=[(
+            "general",
+            f"Model({response}) = {y_pred:.4f}, f({response}) = {y_true:.4f} "
+            f"(this is the 'true' value), Diff = {diff:.4f} "
+            f"(tolerance {absolute_tolerance}). Valid model: {is_correct}",
+        )],
+    )
+
+# --- Runnable code: only executes if the script is run directly ---
+if __name__ == "__main__":
+    result = run("0.5", "some_answer", Params())  # response must parse as a float
+    print(result)
+
+""" # 5. Plot results (eval mode, extended domain)
+with torch.no_grad():
+    # Make the domain twice as wide as the training range
+    x_plot = torch.linspace(2*x.min().item(), 2*x.max().item(), 800, device=x.device).unsqueeze(1)
+    y_plot = model(x_plot)
+
+plt.scatter(x.cpu(), y.cpu(), s=10, label="Data")
+plt.plot(x_plot.cpu(), y_plot.cpu(), color="red", label="Model")
+plt.legend()
+plt.show() """
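Stripped of the PyTorch machinery, the correctness rule in `run()` reduces to comparing the prediction against `sin(x)` within an absolute tolerance. A minimal pure-Python sketch (the helper name is illustrative, not from the commit):

```python
import math

def check_prediction(x, y_pred, absolute_tolerance=0.1):
    """Compare a model prediction against the 'true' sine value,
    mirroring the correctness rule used in basic_nn's run()."""
    y_true = math.sin(x)
    diff = abs(y_pred - y_true)
    return diff < absolute_tolerance, diff

ok, diff = check_prediction(0.0, 0.05)  # sin(0) = 0, so diff = 0.05
print(ok)  # → True
```

A prediction of 0.2 at the same point would fail, since 0.2 exceeds the default tolerance of 0.1.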
