Procedures

Automation procedures in the Browser Use book.

The following documentation is for Browser Use v1.9.9 (BDK).

to perform a task on a browser and get the visual log, the detailed plan, the result and the browser run id

Execute a browser-based web automation task.

This procedure sends your task to a remote browser agent. The agent uses browser automation to complete the task; it can navigate websites, click buttons, fill forms, extract data, and perform other web interactions. When finished, it returns a detailed plan of the steps taken, a visual log, the result, and the browser run ID.

Notes:

1. Before using this procedure, you must first provision a browser instance using "provision a browser".
2. Use the Browser Automation widget to create a task. Access it by typing / in the editor, then selecting "Browser Automation".

Input Concepts

| Concept | Description | Type | Required | Default Value |
| --- | --- | --- | --- | --- |
|  | The reference to the provisioned browser instance. | browser instance | Yes | (no default) |
|  | An object containing task configuration; its attributes are listed below. | browser task | Yes | (no default) |

Attributes of the browser task object:

- instructions (str): A natural language description of the task for the browser agent to perform (e.g., "Log into my account on example.com and check for new messages").
- output (str, optional): The desired format for the result output. If provided, the agent will structure the result accordingly.
- task_context (str, optional): Additional context or data that may be helpful for completing the task.
- browser_task_id (str): Identifier for the browser task.
- worker_id (str): Identifier for the worker executing the task.
- line_id (str): Identifier for the specific line/step.
- browser_task_version (str): Version of the browser task.
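To make the shape of the browser task object easier to picture, here is a minimal Python sketch that models the attributes listed above. The `BrowserTask` class, the placeholder identifier values, and the decision to treat every attribute not marked optional as required are assumptions made for illustration; this is not the BDK's own class, and in practice the Browser Automation widget creates the task for you.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class BrowserTask:
    """Illustrative model of the browser task attributes (hypothetical, not a BDK class)."""

    instructions: str
    browser_task_id: str
    worker_id: str
    line_id: str
    browser_task_version: str
    output: Optional[str] = None        # desired result format, if any
    task_context: Optional[str] = None  # extra context or data for the agent

    def __post_init__(self) -> None:
        # Assumption: an empty instruction string is never a meaningful task.
        if not self.instructions.strip():
            raise ValueError("instructions must be a non-empty string")


task = BrowserTask(
    instructions="Search for the 'current weather in Philadelphia' and extract the temperature",
    browser_task_id="task-001",   # placeholder value
    worker_id="worker-001",       # placeholder value
    line_id="line-001",           # placeholder value
    browser_task_version="1",     # placeholder value
)

# Drop the optional attributes that were left unset.
payload = {key: value for key, value in asdict(task).items() if value is not None}
```

Keeping the optional attributes as `None` by default mirrors the "optional" markers in the attribute list, so a task with only instructions and identifiers stays valid.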

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| visual log | Currently returns None. Future versions may include video recording of the browser interaction. | file |
| detailed plan | A natural language summary (string) generated by the agent, describing the key steps taken to accomplish the task. Returns None if summary generation fails. | text |
| result | The data extracted or produced by the agent upon completing the task. The type is Any. If the result format was provided, this result will be structured accordingly. Returns None if the agent fails to produce a result. | any |
| browser run id | The ID of the browser run. | text |
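Because three of the four outputs can be None, anything that consumes them should check for that case before use. The sketch below shows one way to do so in Python; the `BrowserRunOutputs` container and the `summarize` helper are hypothetical names introduced for illustration and are not part of Browser Use.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class BrowserRunOutputs:
    """Hypothetical container mirroring the four output concepts."""

    visual_log: Optional[bytes]   # documented as currently None
    detailed_plan: Optional[str]  # None if summary generation fails
    result: Any                   # None if the agent produces no result
    browser_run_id: str


def summarize(outputs: BrowserRunOutputs) -> str:
    """Build a short report that tolerates missing outputs."""
    lines = [f"run: {outputs.browser_run_id}"]
    if outputs.detailed_plan is None:
        lines.append("plan: (summary generation failed)")
    else:
        lines.append(f"plan: {outputs.detailed_plan}")
    if outputs.result is None:
        lines.append("result: (agent produced no result)")
    else:
        lines.append(f"result: {outputs.result!r}")
    if outputs.visual_log is None:
        lines.append("visual log: (not available)")
    return "\n".join(lines)


report = summarize(BrowserRunOutputs(None, None, None, "run-123"))
```

Checking `is None` explicitly, rather than relying on truthiness, keeps legitimate falsy results such as `0` or an empty list from being reported as failures.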

Examples

Get the Weather in Philadelphia

Create the following task using the Browser Automation widget: "Search for the 'current weather in Philadelphia' and extract the temperature"

provision a browser
perform <the task> on the browser and get the visual log, the detailed plan, the result and the browser run id

Get Laptop Prices from Amazon

Create the following task using the Browser Automation widget: "Go to www.Amazon.com and search for 'laptop.' Look at the first 5 product listings on the results page. For each product, copy the product name and its price. Show me the list of product names with their prices."

provision a browser
perform <the task> on the browser and get the visual log, the detailed plan, the result and the browser run id

to provision a browser

Provision a new browser instance for your web automation task. The browser includes a live, interactive view that allows you to see each step of the process. Note: The browser will be ready to use after a few seconds.

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
|  | None | browser instance |

Examples

Provision a browser and get the reference to the browser instance:

provision a browser
