README.md (33 additions, 31 deletions)
@@ -137,45 +137,47 @@ const result = await multiUserSession.prompt([

Because system prompts have the special behavior of being preserved on context window overflow, they cannot be provided this way.

### Emulating tool use or function-calling via assistant-role prompts
### Tool use

A special case of the above is using the assistant role to emulate tool use or function-calling, by marking a response as coming from the assistant side of the conversation:
The Prompt API supports **tool use** via the `tools` option, which lets you define, in a model-agnostic way, external capabilities that the language model can invoke. Each tool is described by an object whose `execute` member specifies the JavaScript function to call. When the language model issues a tool use request, the user agent calls the corresponding `execute` function and sends the result back to the model.

Here’s an example of how to use the `tools` option:

```js
const session = await LanguageModel.create({
initialPrompts: [{
role: "system",
content: `
You are a helpful assistant. You have access to the following tools:
- calculator: A calculator. To use it, write "CALCULATOR: <expression>" where <expression> is a valid mathematical expression.
`
}]
initialPrompts: [
{
role: "system",
content: `You are a helpful assistant. You can use tools to help the user.`
}
],
tools: [
{
name: "getWeather",
description: "Get the weather in a location.",
inputSchema: {
type: "object",
properties: {
location: {
type: "string",
description: "The city to check for the weather condition.",
},
},
required: ["location"],
},
async execute({ location }) {
const res = await fetch("https://weatherapi.example/?location=" + location);
// Returns the result as a JSON string.
return JSON.stringify(await res.json());
},
}
]
});

async function promptWithCalculator(prompt) {
const result = await session.prompt(prompt);

// Check if the assistant wants to use the calculator tool.
const match = /^CALCULATOR: (.*)$/.exec(result);
if (match) {
const expression = match[1];
const mathResult = evaluateMathExpression(expression);

// Add the result to the session so it's in context going forward.
await session.prompt([{ role: "assistant", content: mathResult }]);

// Return it as if that's what the assistant said to the user.
return mathResult;
}

// The assistant didn't want to use the calculator. Just return its response.
return result;
}

console.log(await promptWithCalculator("What is 2 + 2?"));
const result = await session.prompt("What is the weather in Seattle?");
```

We'll likely explore more specific APIs for tool- and function-calling in the future; follow along in [issue #7](https://github.com/webmachinelearning/prompt-api/issues/7).
In this example, the `tools` array defines a single `getWeather` tool, giving its name, description, input schema, and `execute` implementation. When the language model determines that a tool call is needed, the user agent invokes the `getWeather` tool's `execute()` function with the arguments supplied by the model and returns the result, which the model can then incorporate into its response.
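Because tools are supplied at `create()` time, they apply to the whole session, so later prompts can trigger further tool calls without redefining anything. A small usage sketch building on the example above (the quoted replies are purely illustrative; real model output will vary):

```js
// The prompt result is an ordinary string, whether or not a tool was called.
console.log(result);
// e.g. "It's currently 12 °C and raining in Seattle."

// The getWeather tool registered at create() time remains available, so a
// follow-up prompt in the same session may trigger another tool call.
const followUp = await session.prompt("How does that compare to Portland right now?");
console.log(followUp);
```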

### Multimodal inputs

index.bs (14 additions, 0 deletions)
@@ -75,6 +75,19 @@ interface LanguageModelParams {
readonly attribute float maxTemperature;
};


callback LanguageModelToolFunction = Promise<DOMString> (any... arguments);

// A description of a tool that a language model can invoke.
dictionary LanguageModelTool {
required DOMString name;
required DOMString description;
// JSON schema for the input parameters.
required object inputSchema;
// The function to be invoked by the user agent on behalf of the language model.
required LanguageModelToolFunction execute;
};

dictionary LanguageModelCreateCoreOptions {
// Note: these two have custom out-of-range handling behavior, not in the IDL layer.
// They are unrestricted double so as to allow +Infinity without failing.
@@ -83,6 +96,7 @@ dictionary LanguageModelCreateCoreOptions {

sequence<LanguageModelExpected> expectedInputs;
sequence<LanguageModelExpected> expectedOutputs;
sequence<LanguageModelTool> tools;
};

dictionary LanguageModelCreateOptions : LanguageModelCreateCoreOptions {
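To connect the IDL above back to script: a `LanguageModelTool` is a plain object with `name`, `description`, `inputSchema`, and `execute` members, and `LanguageModelToolFunction` resolves to a `DOMString`, so `execute` should return (a promise for) a string. Below is a rough sketch of a conforming entry, assuming the user agent passes a single argument object shaped by `inputSchema`, as in the README example; the tool itself is hypothetical and not part of the spec or the explainer.

```js
// Informal sketch of an object satisfying the LanguageModelTool dictionary.
// Hypothetical tool; not part of the spec or the explainer.
const convertTemperature = {
  name: "convertTemperature",
  description: "Convert a temperature between Celsius and Fahrenheit.",
  inputSchema: {
    type: "object",
    properties: {
      value: { type: "number", description: "The temperature to convert." },
      unit: { type: "string", enum: ["C", "F"], description: "The unit of the input value." },
    },
    required: ["value", "unit"],
  },
  // LanguageModelToolFunction resolves to a DOMString, so return a string.
  async execute({ value, unit }) {
    const converted = unit === "C" ? value * (9 / 5) + 32 : (value - 32) * (5 / 9);
    return `${converted.toFixed(1)} °${unit === "C" ? "F" : "C"}`;
  },
};

// Passed via the tools member of LanguageModelCreateCoreOptions:
const session = await LanguageModel.create({ tools: [convertTemperature] });
```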