Asynchronous Invoke¶
Prerequisites¶
Initialize LLM and Agent¶
To use the default tools inside vinagent.tools, you should set environment variables inside .env, including TOGETHER_API_KEY (to use LLM models from the TogetherAI site) and TAVILY_API_KEY (to use the Tavily web search tool from the Tavily site):
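A minimal .env sketch with placeholder values (replace them with your own keys):

```
TOGETHER_API_KEY=your_together_api_key
TAVILY_API_KEY=your_tavily_api_key
```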
```python
from vinagent.agent.agent import Agent
from langchain_together import ChatTogether
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv('.env'))

# Step 1: Initialize LLM
llm = ChatTogether(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free"
)

# Step 2: Initialize Agent
agent = Agent(
    description="You are a Weather Analyst",
    llm=llm,
    skills=[
        "Update the weather anywhere",
        "Forecast the weather in the future",
        "Recommend a picnic based on the weather"
    ],
    tools=['vinagent.tools.websearch_tools'],
    tools_path='templates/tools.json',  # Where registered tools are saved. Default is 'templates/tools.json'
    is_reset_tools=True  # If True, tools are re-registered every time the agent is reinitialized. Default is False
)
```
INFO:httpx:HTTP Request: POST https://api.together.xyz/v1/chat/completions "HTTP/1.1 200 OK"
INFO:vinagent.register.tool:Registered search_api:
{'tool_name': 'search_api', 'arguments': {'query': {'type': 'Union[str, dict[str, str]]', 'value': '{}'}}, 'return': 'Any', 'docstring': 'Search for an answer from a query string\n Args:\n query (dict[str, str]): The input query to search\n Returns:\n The answer from search query', 'dependencies': ['os', 'dotenv', 'tavily', 'dataclasses', 'typing'], 'module_path': 'vinagent.tools.websearch_tools', 'tool_type': 'module', 'tool_call_id': 'tool_d697f931-5c00-44cf-b2f1-f70f91cc2973'}
INFO:vinagent.register.tool:Completed registration for module vinagent.tools.websearch_tools
Syntax for Async Invoke¶
Vinagent supports both synchronous (`agent.invoke`) and asynchronous (`agent.ainvoke`) execution methods. Synchronous calls block the main thread until a response is received, whereas asynchronous calls allow the program to continue running other work while waiting for a response. This makes asynchronous execution especially effective for I/O-bound tasks, such as interacting with external services like search engines, database connections, or weather APIs. In real-world usage, asynchronous calls can complete up to twice as fast as their synchronous counterparts.
Latency Benchmarking¶
This is a performance benchmarking table based on 100 requests to meta-llama/Llama-3.3-70B-Instruct-Turbo-Free on TogetherAI. It demonstrates that `ainvoke` runs nearly twice as fast as `invoke`. You may get different results due to the randomness of the requests and the state of the LLM provider's servers.
| Number of requests | ainvoke (sec/req) | invoke (sec/req) |
|---|---|---|
| 100 | 8.05-11.72 | 15.03-18.47 |
Below is the code for benchmarking the two inference methods; to save cost, we run each only 5 times. The asynchronous benchmark comes first, followed by a sketch of its synchronous counterpart.
```python
import timeit
import asyncio

async def benchmark_ainvoke():
    message = await agent.ainvoke("What is the weather in New York today?")
    print(message.content)
    return message

def sync_wrapper():
    # timeit requires a synchronous callable, so run the coroutine to completion
    asyncio.run(benchmark_ainvoke())

execution_time = timeit.timeit(sync_wrapper, number=5)
print(f"Average asynchronous execution time over 5 runs: {execution_time / 5:.2f} seconds")
```
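For comparison, a minimal sketch of the synchronous counterpart, timed the same way with the same illustrative query:

```python
def benchmark_invoke():
    # Synchronous call: blocks until the response is returned
    message = agent.invoke("What is the weather in New York today?")
    print(message.content)
    return message

execution_time = timeit.timeit(benchmark_invoke, number=5)
print(f"Average synchronous execution time over 5 runs: {execution_time / 5:.2f} seconds")
```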