Browser Use

Overview of the Browser Use book.

The following documentation is for Browser Use v1.9.10 (BDK).

Overview

The Browser Use book enables you to perform web automation using natural language. Use plain English to describe a task, and a remote browser agent will execute it.

The browser agent can perform a range of web automation tasks, including navigating pages, filling out forms, and extracting structured data, all with full traceability. Every action is logged with detailed step plans and unique session IDs.

Prerequisites

1. Required Books

The following Book(s) need to be added to your agent so it can learn and understand the automation procedures defined within them:

  • Browser Use

How to Add the Book(s)

  1. Go to Books > All Books.

  2. Search for the name of the book and click on it.

  3. Click on Install or Add Connection to add the book to your agent.

  4. If adding a connection, you'll be prompted for connectivity details.

Connectivity

This section outlines the available methods for connecting to the Book, along with the required configuration details for each.

The Browser Use book can be used with or without credentials.

Getting Started

Follow these steps to write an automation that performs a web-based task:

1. Provision a Browser

The first step is to provision a browser, which sets up a browser instance for automation to control. This is required for any browser automation task. In the editor, write:

provision a browser

You can also name the browser instance using "as". This allows you to run multiple browser sessions or refer to a specific one. In this example, the browser instance is named x:

provision a browser as x

2. Prepare to Perform a Task

On the following line, write this procedure to send the task to a remote browser agent:

perform <the task> on the browser and get the visual log, the detailed plan, the result and the browser run id

If you assigned a name to the browser instance, use it in place of the browser:

perform <the task> on x and get the visual log, the detailed plan, the result and the browser run id

3. Create a Web Automation Task

Replace <the task> in the previous line with a Browser Automation widget:

  1. In the editor, type / to open the widget menu and select Browser Automation.

  2. Within the widget, either select Choose an Existing Task or configure a new task by providing the following details, then click Save to finalize it.

Task Name

A short, descriptive name for the task (e.g., Check Website for Updates)

Task Instructions

Describe what you want the browser to do in plain English (e.g., Go to example.com, log in, and check for new messages).

Output Format (Optional)

Define how you want the results formatted. It is recommended to use key-value pairs for clarity (e.g., {"key": "value"}). If specified, the agent will format the result accordingly.

Credentials (Optional)

If your task requires access to a secure website or account, provide the necessary credentials (e.g., username and password). You can select from existing connections in your agent or add a new connection.


4. Run Your Automation

After provisioning the browser and defining your task, you’re ready to run the automation. The browser agent will carry out the steps and return the following:

Visual Log

A screen-by-screen view of what happened in the browser.

Detailed Plan

The step-by-step breakdown of actions taken.

Result

The output of the task (e.g., extracted data).

Browser Run ID

A unique identifier for the automation run.

Passing Data as Input

You can pass data to a task by defining data elements ("facts") in your automation and referencing them within the instructions. The agent automatically detects these facts when they are enclosed in curly brackets {} within the Task Instructions of the Browser Automation widget.

Example

In this example, we define the url and the email as data elements and then reference them in the instructions for the browser agent:

1. Define Data

the url is "https://reports.example.com"
the email is "[email protected]"

2. Provision Browser

provision a browser

3. Perform Task and Reference Data

perform <the task> on the browser and get the visual log, the detailed plan, the result and the browser run id

Reference the facts in the Task Instructions of the Browser Automation widget:

Task Name

Download and Email Report

Task Instructions

Go to {the url}, download the latest monthly report, and email the report to {the email}.

Information References Detected: the url, the email

When the agent runs, it uses https://reports.example.com as the value for the url and [email protected] for the email.
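
To make the substitution concrete, the following Python sketch mimics how curly-bracket references could be detected and filled in. It is only an illustration of the idea, not the platform's implementation: the helper names (detect_facts, fill_facts) are hypothetical, and the email address is a placeholder assumed for the example.

```python
import re

# Hypothetical illustration only; this is not how the platform is implemented.
FACT_PATTERN = re.compile(r"\{([^{}]+)\}")


def detect_facts(instructions: str) -> list[str]:
    """Return fact names found in curly brackets, in order, without duplicates."""
    seen: list[str] = []
    for name in FACT_PATTERN.findall(instructions):
        name = name.strip()
        if name not in seen:
            seen.append(name)
    return seen


def fill_facts(instructions: str, facts: dict[str, str]) -> str:
    """Replace each {fact} with its value; raises KeyError if a fact is undefined."""
    def replace(match: re.Match) -> str:
        return facts[match.group(1).strip()]
    return FACT_PATTERN.sub(replace, instructions)


instructions = (
    "Go to {the url}, download the latest monthly report, "
    "and email the report to {the email}."
)
facts = {
    "the url": "https://reports.example.com",
    "the email": "reports@example.com",  # placeholder address, assumed for illustration
}

print(detect_facts(instructions))
print(fill_facts(instructions, facts))
```

Raising an error for an undefined fact is a design choice of this sketch; it surfaces a missing definition early instead of sending the literal placeholder text to the agent.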

The Result

Once the browser task completes, you can reference the extracted data output as the result and access specific fields within that result. For example, say the task was configured to return the following JSON:

{
  "Price": "$499.99",
  "Availability": "In Stock"
}

You can retrieve the complete data or specific values in this way:

# Returns the entire JSON object: {"Price": "$499.99", "Availability": "In Stock"}
get the result

# Returns the value of the 'Price' field: $499.99
get the result's Price

# Returns the value of the 'Availability' field: In Stock
get the result's Availability
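
For readers who think in terms of general-purpose code, the field access above is roughly analogous to reading keys from a parsed JSON object. The Python sketch below assumes the result arrives as a valid JSON string; the variable names are illustrative and not part of the book.

```python
import json

# The JSON from the example above, as a string (assumption: the result is valid JSON).
raw_result = '{"Price": "$499.99", "Availability": "In Stock"}'

result = json.loads(raw_result)        # analogous to: get the result

price = result["Price"]                # analogous to: get the result's Price
availability = result["Availability"]  # analogous to: get the result's Availability

print(price)
print(availability)
```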

Best Practices

Follow these guidelines when configuring the Browser Automation widget to write clear and accurate task instructions for the browser agent.

1. Use Sequential Task Instructions

Write task instructions as a step-by-step script, where each line represents a distinct action in the workflow. For example:

1. Go to https://salesportal.com
2. Log in using stored credentials
3. Click the "Reports" tab
4. Select "Quarterly Sales Summary"
5. Click the "Download Report" button
6. Save the downloaded file to the workspace
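
If the steps are assembled programmatically before being pasted into the Task Instructions field, a small helper can keep them in the one-action-per-line form shown above. This is a minimal Python sketch; build_task_instructions is a hypothetical helper, not a feature of the book.

```python
def build_task_instructions(steps: list[str]) -> str:
    """Join individual actions into a numbered script, one action per line."""
    # Drop blank entries so the numbering stays contiguous.
    cleaned = [step.strip() for step in steps if step.strip()]
    return "\n".join(f"{i}. {step}" for i, step in enumerate(cleaned, start=1))


steps = [
    "Go to https://salesportal.com",
    "Log in using stored credentials",
    'Click the "Reports" tab',
]

print(build_task_instructions(steps))
```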

2. Define the Output Format

Specify the Output Format to ensure the data is returned in an expected format. For example:

JSON in the format {"city": "temperature"}
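
A declared output format also makes the result easy to check downstream. The Python sketch below shows one way a consumer might verify that a returned string is a JSON object of string key-value pairs, as in the format above. The function name and the sample values are assumptions made for illustration.

```python
import json


def matches_city_temperature_format(raw: str) -> bool:
    """Return True if raw is a JSON object whose keys and values are all strings."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and all(
        isinstance(key, str) and isinstance(value, str)
        for key, value in data.items()
    )


# Sample values are hypothetical.
print(matches_city_temperature_format('{"Paris": "18 C"}'))
print(matches_city_temperature_format("Paris is 18 C"))
```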

3. Be Precise

Refer to webpage elements using their exact visible text or label to prevent ambiguity.

Example

  • Click the 'Download Report' button.

  • Type the data into the field labeled 'Customer ID'.

Instead of

  • Click the button.

  • Find the table.

4. Use Dynamic Input

Use facts to dynamically pass data into your task, rather than hardcoding values.

Example

Search for the client profile that has the name {the client name}.

Instead of

Search for "John Doe".
