Tool & Function Calling
Tool & Function Calling
Tool & Function Calling
Tool calls (also known as function calls) give an LLM access to external tools. The LLM does not call the tools directly. Instead, it suggests the tool to call. The user then calls the tool separately and provides the results back to the LLM. Finally, the LLM formats the response into an answer to the user’s original question.
OpenRouter standardizes the tool calling interface across models and providers, making it easy to integrate external tools with any supported model.
Supported Models: You can find models that support tool calling by filtering on openrouter.ai/models?supported_parameters=tools.
If you prefer to learn from a full end-to-end example, keep reading.
Tool calling with OpenRouter involves three key steps. Here are the essential request body formats for each step:
After receiving the model’s response with tool_calls, execute the requested tool locally and prepare the result:
Note: The tools parameter must be included in every request (Steps 1 and 3) so the router can validate the tool schema on each call.
Here is Python code that gives LLMs the ability to call an external API — in this case Project Gutenberg, to search for books.
First, let’s do some basic setup:
Next, we define the tool that we want to call. Remember, the tool is going to get requested by the LLM, but the code we are writing here is ultimately responsible for executing the call and returning the results to the LLM.
Note that the “tool” is just a normal function. We then write a JSON “spec” compatible with the OpenAI function calling parameter. We’ll pass that spec to the LLM so that it knows this tool is available and how to use it. It will request the tool when needed, along with any arguments. We’ll then marshal the tool call locally, make the function call, and return the results to the LLM.
Let’s make the first OpenRouter API call to the model:
The LLM responds with a finish reason of tool_calls, and a tool_calls array. In a generic LLM response-handler, you would want to check the finish_reason before processing tool calls, but here we will assume it’s the case. Let’s keep going, by processing the tool call:
The messages array now has:
Now, we can make a second OpenRouter API call, and hopefully get our result!
The output will be something like:
We did it! We’ve successfully used a tool in a prompt.
Interleaved thinking allows models to reason between tool calls, enabling more sophisticated decision-making after receiving tool results. This feature helps models chain multiple tool calls with reasoning steps in between and make nuanced decisions based on intermediate results.
Important: Interleaved thinking increases token usage and response latency. Consider your budget and performance requirements when enabling this feature.
With interleaved thinking, the model can:
Here’s an example showing how a model might use interleaved thinking to research a topic across multiple sources:
Initial Request:
Model’s Reasoning and Tool Calls:
Initial Thinking: “I need to research electric vehicle environmental impact. Let me start with academic papers to get peer-reviewed research.”
First Tool Call: search_academic_papers({"query": "electric vehicle lifecycle environmental impact", "field": "environmental science"})
After First Tool Result: “The papers show mixed results on manufacturing impact. I need current statistics to complement this academic research.”
Second Tool Call: get_latest_statistics({"topic": "electric vehicle carbon footprint", "year": 2024})
After Second Tool Result: “Now I have both academic research and current data. Let me search for manufacturing-specific studies to address the gaps I found.”
Third Tool Call: search_academic_papers({"query": "electric vehicle battery manufacturing environmental cost", "field": "materials science"})
Final Analysis: Synthesizes all gathered information into a comprehensive response.
When implementing interleaved thinking:
In the example above, the calls are made explicitly and sequentially. To handle a wide variety of user inputs and tool calls, you can use an agentic loop.
Here’s an example of a simple agentic loop (using the same tools and initial messages as above):
When defining tools for LLMs, follow these best practices:
Clear and Descriptive Names: Use descriptive function names that clearly indicate the tool’s purpose.
Comprehensive Descriptions: Provide detailed descriptions that help the model understand when and how to use the tool.
When using streaming responses with tool calls, handle the different content types appropriately:
Control tool usage with the tool_choice parameter:
Control whether multiple tools can be called simultaneously with the parallel_tool_calls parameter (default is true for most models):
When parallel_tool_calls is false, the model will only request one tool call at a time instead of potentially multiple calls in parallel.
Design tools that work well together:
This allows the model to naturally chain operations: search → get details → check inventory.
OpenRouter tracks how reliably each provider completes tool calls and surfaces this as the Tool Call Error Rate on the Performance tab of every model page. The same signal drives Auto Exacto provider ordering on tool-calling requests. For the exact validator, JSON Schema draft, regex semantics, and per-tool-call classification, see How Tool-Calling Success Rate Is Measured.
For more details on OpenRouter’s message format and tool parameters, see the API Reference.