Demo

Watch a Demo

My overall observation

Based on an initial vibe check with popular "non-reasoning" models, the quality improvement from MCTS-wrapped invocations ranges from minimal to moderate. How much it helps varies considerably with the specific task and the inherent capabilities of each model. You can also use this project with reasoning models. There is definitely room for improvement, particularly around tool use and function calling. Try claude-3.7-sonnet with this project.

Using MCTS with `gpt-4o-search-preview`

GPT-4o Search Preview is a specialized model trained to understand and execute web search queries with the Chat Completions API. I thought it'd be fun to share the experiment.
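For readers unfamiliar with the idea, here is a minimal sketch of what wrapping a model call in MCTS can look like. It is not this project's implementation. The names `mcts_refine`, `refine`, and `score` are hypothetical: `refine` stands in for a model call (for example to `gpt-4o-search-preview`) that rewrites an answer, and `score` stands in for whatever judge rates an answer between 0 and 1. Child selection uses the standard UCT formula.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    answer: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total_reward: float = 0.0

    def uct(self, c: float = 1.41) -> float:
        # Unvisited children are always tried first.
        if self.visits == 0:
            return math.inf
        mean = self.total_reward / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits) / self.visits)


def mcts_refine(
    draft: str,
    refine: Callable[[str], str],   # hypothetical: model call that rewrites an answer
    score: Callable[[str], float],  # hypothetical: judge returning a reward in [0, 1]
    iterations: int = 8,
    branching: int = 2,
) -> str:
    root = Node(draft)
    best_answer, best_reward = draft, score(draft)
    for _ in range(iterations):
        # Selection: walk down the tree, following the highest UCT value.
        node = root
        while node.children:
            node = max(node.children, key=lambda n: n.uct())
        # Expansion: grow the root, or any leaf that was already evaluated.
        if node is root or node.visits > 0:
            node.children = [
                Node(refine(node.answer), parent=node) for _ in range(branching)
            ]
            node = node.children[0]
        # Evaluation: rate the candidate answer held by this node.
        reward = score(node.answer)
        if reward > best_reward:
            best_answer, best_reward = node.answer, reward
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best_answer


# Toy stand-ins so the sketch runs offline; swap in real model calls.
print(mcts_refine("draft", lambda a: a + "!", lambda a: min(len(a), 8) / 8))
```

Returning the highest-scoring answer seen, rather than the most-visited child, is a design choice made for this sketch. Each iteration costs one `score` call plus up to `branching` `refine` calls, so the iteration budget directly controls how many model invocations are spent.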

mcts_gpt-4o_search_preview.mp4
