Agents (browser agents) are the fundamental building block for your Simplex workflows to interact with the web. Simplex agents are designed to be reusable across workflows.

How browser agents work

At their core, browser agents are LLMs in a loop with tools that allow them to interact with the page. You give them a user prompt and they work to accomplish the task.
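
The loop described above can be sketched in a few lines of Python. This is an illustrative toy, not Simplex's implementation: `fake_model`, `run_agent`, and the in-memory `page` are hypothetical stand-ins so the example runs without a browser or an LLM, and only the tool names `type_text` and `click_element` come from this page.

```python
from typing import Callable

# Hypothetical in-memory "page" so the sketch runs without a real browser.
page = {"fields": {}, "submitted": False}

def type_text(field: str, text: str) -> str:
    page["fields"][field] = text
    return f"typed {text!r} into {field}"

def click_element(element: str) -> str:
    if element == "submit":
        page["submitted"] = True
    return f"clicked {element}"

# Tools the agent may call, keyed by tool name.
TOOLS: dict[str, Callable[..., str]] = {
    "type_text": type_text,
    "click_element": click_element,
}

def fake_model(prompt: str, history: list[str]) -> dict:
    """Stand-in for the LLM: returns the next tool call or signals completion."""
    script = [
        {"tool": "type_text", "args": {"field": "name", "text": "Ada"}},
        {"tool": "click_element", "args": {"element": "submit"}},
        {"done": True},
    ]
    return script[len(history)]

def run_agent(prompt: str, max_steps: int = 15) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        decision = fake_model(prompt, history)
        if decision.get("done"):
            return history
        result = TOOLS[decision["tool"]](**decision["args"])
        history.append(result)  # each observation becomes new context
    raise RuntimeError("agent ran out of steps before finishing")

history = run_agent("Fill in the name field and submit the form")
```

In a real agent the model call replaces `fake_model`, and the observation appended at each step is page content rather than a short string, which is why context grows so quickly.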

A typical browser agent flow

At every step, browser agents are overloaded with new context: web page content.

Browser agent context at a given step

When the content gets too large, agents get steered off course: they get stuck in loops repeating the same actions, skip steps, or explore by taking random actions on the page. Managing agent context is key to building reliable automations. With Simplex agents, you can restrict the tool calls an agent has access to and cap the number of steps it can take before failing. We also provide dense template prompts that follow prompting best practices.

Agent types

Agents can be either Single Agent or Multi-Agent. Most automations only require sequences of Single Agents. Multi-Agents are useful for long-horizon, complex tasks. Message us on Slack for help setting up your first Multi-Agent system, or read our blog on multi-agent systems if you’d like to learn more about our approach.

Available tools

In the Simplex agent editor, you need to explicitly set the tools an agent can access. This constrains agents so they’re less likely to make incorrect tool calls. For example, suppose you’re filling out a form with a form-filling agent you created, and you know the agent is already on the form page when it starts in the workflow. If the agent only has access to the tools it needs for the form, it is far more likely to simply fill out the form than to wander off the page. Custom tools are available for Simplex Growth customers. Contact us on Slack to get started.
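
To make the idea of a tool allowlist concrete, here is a minimal sketch of how a restricted tool set rejects an out-of-scope call. `AgentConfig`, `ToolNotAllowed`, and `check_tool_call` are hypothetical names for illustration; in Simplex the tool set is configured in the agent editor, not in code like this.

```python
from dataclasses import dataclass

class ToolNotAllowed(Exception):
    """Raised when an agent requests a tool outside its allowlist."""

@dataclass(frozen=True)
class AgentConfig:
    name: str
    allowed_tools: frozenset[str]

def check_tool_call(config: AgentConfig, tool_name: str) -> str:
    # Reject any tool call that was not explicitly enabled for this agent.
    if tool_name not in config.allowed_tools:
        raise ToolNotAllowed(f"{config.name} may not call {tool_name}")
    return tool_name

# A form-filling agent that starts on the form page needs no navigation tools.
form_filler = AgentConfig(
    name="form_filler",
    allowed_tools=frozenset({"type_text", "click_element", "wait_for_seconds"}),
)

check_tool_call(form_filler, "type_text")   # accepted
try:
    check_tool_call(form_filler, "go_back")  # navigation is not enabled
except ToolNotAllowed as exc:
    rejected = str(exc)
```

With fewer tools to choose from, the model has fewer ways to make a wrong move, which is the constraint this section describes.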

Recommendations

Many legacy portals take a long time to load. In this case, include wait_for_seconds so the agent can wait for the page to fully load before acting on it. Use type_text for filling out input fields. Use send_keys for special key inputs such as Ctrl+C or Alt+A.

| Tool Call | Description |
| --- | --- |
| click_element | Clicks on a specified element on the page |
| type_text | Types text into an input field or text area |
| wait_for_seconds | Pauses execution for a specified number of seconds |
| go_back | Navigates to the previous page in browser history |
| reload_page | Refreshes the current page |
| press_enter | Simulates pressing the Enter key |
| switch_tab | Switches between browser tabs |
| scroll_down | Scrolls down on the current page |
| scroll_up | Scrolls up on the current page |
| send_keys | Sends special key inputs such as Ctrl+C or Alt+A |
| scroll_to_text | Scrolls the page until specified text is visible |
| get_current_time | Returns the current system time |
| get_downloaded_files | Lists files downloaded during the session |
| store_data | Saves data to browser storage for later retrieval |

Using session store as an agent tool

The session store lets you prompt your agent to ‘remember’ important information. It acts as a persistent key-value store that maintains state across sessions and workflow executions. The following snippet of an agent prompt persists extracted customer information from a web portal to the session store, where it can then be retrieved with the Get Session Store endpoint.
Store the extracted customer data using the store_data tool with a JSON structure containing:
  - "customer_name": "{customer_name}"
  - "birth_date": "<extracted birth date>"
  - "file": "<s3 URL from the most recent get_downloaded_files response>"
You can read more on the session store here. Message us on Slack if you’d like to use the session store for your agent and we’ll activate it for you!
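
For reference, the JSON structure requested in the prompt above might look like the record built below. This is only a sketch of the data shape: `build_customer_record` is a hypothetical helper, the sample values and the S3 URL are made up, and the exact format that store_data accepts is not specified on this page.

```python
import json

# Keys taken from the prompt snippet above.
REQUIRED_KEYS = ("customer_name", "birth_date", "file")

def build_customer_record(customer_name: str, birth_date: str, file_url: str) -> str:
    """Serialize the extracted customer data as a JSON string."""
    record = {
        "customer_name": customer_name,
        "birth_date": birth_date,
        "file": file_url,
    }
    # Refuse to store a record with empty fields, so gaps surface early.
    missing = [key for key in REQUIRED_KEYS if not record[key]]
    if missing:
        raise ValueError(f"missing values for: {', '.join(missing)}")
    return json.dumps(record)

# Made-up sample values for illustration only.
payload = build_customer_record(
    customer_name="Jane Doe",
    birth_date="1990-04-12",
    file_url="s3://example-bucket/downloads/statement.pdf",
)
```

Asking the agent for a fixed set of keys like this makes the stored data easier to consume downstream, since every run yields the same fields.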

Max Steps

A step is a single sub-action an agent takes toward completing the overall task, and max steps caps how many steps the agent can take before failing. For example, filling out a form with 3 fields may take 5 steps. If your agent needs to run for a long time (e.g. to fill out a large form), increase max steps. We recommend allowing a generous number of steps (the default value is 15) to leave room for slow page loads, dropdowns, and similar extra actions.
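
As a rough way to reason about a step budget, the sketch below counts the tool calls a simple form might need and compares the total to the limit. The breakdown (one wait, one type_text per field, one click to submit) is an assumption chosen to match the 3-field, 5-step example. Real agents do not follow a fixed plan, and `plan_form_steps` and `fits_budget` are hypothetical helpers.

```python
DEFAULT_MAX_STEPS = 15  # default value stated in the text above

def plan_form_steps(field_names: list[str], wait_for_load: bool = True) -> list[str]:
    """Estimate the tool calls needed to fill out and submit a simple form."""
    steps: list[str] = []
    if wait_for_load:
        steps.append("wait_for_seconds")          # let a slow page finish loading
    steps += ["type_text"] * len(field_names)     # one call per input field
    steps.append("click_element")                 # submit button
    return steps

def fits_budget(steps: list[str], max_steps: int = DEFAULT_MAX_STEPS) -> bool:
    return len(steps) <= max_steps

small_form = plan_form_steps(["name", "email", "phone"])
large_form = plan_form_steps([f"field_{i}" for i in range(20)])

small_ok = fits_budget(small_form)   # 5 steps, within the default of 15
large_ok = fits_budget(large_form)   # 22 steps, exceeds the default
```

Dropdowns, retries, and scrolling add steps beyond an estimate like this, so it is worth setting max steps comfortably above the count you expect.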