Browser Use

Procedures and concepts for the Browser Use integration.

The following documentation is for Browser Use v1.14.1 (BDK).

Overview

The Browser Use book enables you to perform web automation using natural language. Use plain English to describe a task, and a remote browser agent will execute it.

The browser agent can perform a range of web automation tasks, including navigating pages, filling out forms, and extracting structured data, all with full traceability. Every action is logged with detailed step plans and unique session IDs.

Prerequisites

1. Required Books

The following Book(s) need to be added to your agent so it can learn and understand the automation procedures defined within them:

Browser Use

How to Add the Book(s)

Go to Books → All Books.
Search for the name of the book and click on it.
Click on Install or Add Connection to add the book to your agent.
If adding a connection, you'll be prompted for connectivity details.

Procedures

to close a browser

Close a browser instance.

Input Concepts

| Concept | Description | Type | Required | Default Value |
| --- | --- | --- | --- | --- |
| browser | A reference to the browser instance to delete. | browser instance | Yes | (no default) |

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| answer | None | boolean |

Examples

Close a browser instance:

close the browser

to perform a task on a browser and get the visual log, the detailed plan, the result, the browser run id and the browser files

Execute a browser-based web automation task.

Note: Before using this procedure, you must first "provision a browser" for your automation task.

Input Concepts

| Concept | Description | Type | Required | Default Value |
| --- | --- | --- | --- | --- |
| browser | The reference to the provisioned browser instance. | browser instance | Yes | (no default) |
| task | An object containing task configuration (created with the Browser Automation widget). | browser task | Yes | (no default) |

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| visual log | A series of screenshots showing each step the browser took. | file |
| detailed plan | A text summary of the steps taken to automate the task. | text |
| result | The data extracted or produced by the agent upon completing the task. | any |
| browser run id | The identifier of the browser run. | text |
| browser files | A list of files generated by the agent upon completing the task. | file |

Examples

Get the Weather in Philadelphia. Task instructions: "Search for the 'current weather in Philadelphia' and extract the temperature"

provision a browser
perform <the task> on the browser and get the visual log, the detailed plan, the result, the browser run id and the browser files

Get Laptop Prices from Amazon. Task instructions: "Go to www.Amazon.com and search for 'laptop.' Look at the first 5 product listings on the results page. For each product, copy the product name and its price. Show me the list of product names with their prices."

provision a browser
perform <the task> on the browser and get the visual log, the detailed plan, the result, the browser run id and the browser files

to provision a browser

Provision a new K8s browser instance for a web automation task.

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| answer | None | browser instance |

Examples

Provision a browser and get the reference to the browser instance:

provision a browser

to provision an aws browser

Provision a new AWS AgentCore browser instance for a web automation task.

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| answer | None | browser instance |

Examples

Provision an AWS browser and get the reference to the browser instance:

provision an aws browser

to run a playwright script on a browser and get the result, the browser run id and the browser files

Execute a custom Playwright script on the provisioned browser.

Note: Before using this procedure, you must first "provision a browser". The script must:

1. Be stored in S3.
2. Contain an async function: run(page, context, browser, params, llm=None).
3. Be accessible using the browser farm's AWS credentials.
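The signature requirement in the note above can be checked locally before a script is uploaded to S3. The following is a minimal sketch, not part of Browser Use: `has_valid_run` and `REQUIRED_ARGS` are hypothetical names, and the check only uses Python's standard `ast` module to confirm that the source defines a top-level `async def run(...)` with the documented parameter names.

```python
import ast

# Parameter names taken from the documented signature:
# run(page, context, browser, params, llm=None)
REQUIRED_ARGS = ["page", "context", "browser", "params", "llm"]


def has_valid_run(source: str) -> bool:
    """Return True if `source` defines a top-level async `run` with the expected parameters."""
    tree = ast.parse(source)
    for node in tree.body:
        # Only an `async def` counts; a plain `def run(...)` is rejected.
        if isinstance(node, ast.AsyncFunctionDef) and node.name == "run":
            arg_names = [arg.arg for arg in node.args.args]
            return arg_names == REQUIRED_ARGS
    return False
```

A check like this catches a misnamed or non-async entry point early. It says nothing about S3 permissions, which still have to be verified separately.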

Input Concepts

| Concept | Description | Type | Required | Default Value |
| --- | --- | --- | --- | --- |
| browser | Reference to the provisioned browser instance. | browser instance | Yes | (no default) |
| playwright script | S3 URI of the Playwright script (e.g., "s3://bucket/scripts/my_script.py"). | text | Yes | (no default) |
| parameter json | Optional dictionary of parameters to pass to the script. | json | No | (no default) |

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| result | Data returned by the script execution. | any |
| browser run id | Identifier of the browser run. | text |
| browser files | Files generated during script execution. | file |

Examples

Run a custom data extraction script with parameters:

provision a browser
the script s3 uri is "s3://my-bucket/scripts/123456789/weather_search.py"
run the script s3 uri on the browser with parameters {"location": "San Francisco, CA"} and get the result, the browser run id and the browser files
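For illustration, a script such as `weather_search.py` might look like the sketch below. Only the `run(page, context, browser, params, llm=None)` signature comes from this documentation. The search URL is a placeholder, and the sketch assumes that `page` is a Playwright async `Page`, that `params` receives the parameter json, and that the return value becomes the procedure's result.

```python
# Hypothetical contents of weather_search.py. The URL and the returned fields
# are illustrative assumptions, not part of the Browser Use documentation.
from urllib.parse import quote_plus


async def run(page, context, browser, params, llm=None):
    """Entry point with the signature the procedure requires."""
    # Assumption: `params` carries the parameter json passed to the procedure.
    location = (params or {}).get("location", "")
    query = quote_plus(f"current weather in {location}")
    url = "https://www.example.com/search?q=" + query  # placeholder search URL

    # Assumption: `page` is a Playwright async Page object.
    await page.goto(url)
    title = await page.title()

    # Assumption: the returned value surfaces as the procedure's "result".
    return {"location": location, "page_title": title}
```

With the parameters from the example above, `params["location"]` would be "San Francisco, CA". The `context`, `browser`, and `llm` arguments are accepted but unused in this sketch.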

Concepts

Browser instance

Represents attributes of an instance with browser support.

| Field Name | Description | Type |
| --- | --- | --- |
| browser_name | The name of the browser instance (pod name or AWS session ID). | text |
| vnc | The Virtual Network Computing (VNC) URL to access the browser. | text |
| browser_type | The type of browser backend ("k8s" or "aws"). | optional[text] |
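The fields above could be modelled in a script or test harness roughly as follows. This is a sketch only: the `BrowserInstance` class is hypothetical and is not an object exposed by Browser Use. It mirrors the three documented fields and rejects a `browser_type` other than the two documented values.

```python
from dataclasses import dataclass
from typing import Optional

# The two backend values named in the field description.
ALLOWED_BROWSER_TYPES = {"k8s", "aws"}


@dataclass(frozen=True)
class BrowserInstance:
    """Hypothetical local mirror of the documented browser instance fields."""

    browser_name: str                   # pod name or AWS session ID
    vnc: str                            # VNC URL used to access the browser
    browser_type: Optional[str] = None  # "k8s" or "aws" when present

    def __post_init__(self):
        # browser_type is optional, so None is accepted.
        if self.browser_type is not None and self.browser_type not in ALLOWED_BROWSER_TYPES:
            raise ValueError(f"unexpected browser_type: {self.browser_type!r}")
```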

Browser task

Represents a browser automation task.

| Field Name | Description | Type |
| --- | --- | --- |
| instructions | A natural language description of the task for the browser agent to perform (e.g., "Log into my account on example.com and check for new messages"). | text |
| browser_task_id | Identifier for the browser task. | text |
| output | The desired structure or format of the results. It is recommended to use key-value pairs for clarity (e.g., {"key": "value"}). If specified, the agent will format the result accordingly. | optional[text] |
| worker_id | Identifier for the Kognitos worker executing the task. | text |
| line_id | Identifier for the specific line/step in the Kognitos automation. | text |
| browser_task_version | Version of the browser task. | text |
| task_context | Optional additional context or data to assist in completing the task. Contains the resolved reference facts for the task. | optional[json] |
| credential_names | The names of the authentication credentials needed for the task. | optional[list of text] |
| files | The files to be uploaded to the browser. | optional[list of text] |
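To make the `output` recommendation concrete, the sketch below builds a key-value format description for the laptop price example and pairs it with the task instructions. Browser tasks are created with the Browser Automation widget, so the `task_fields` dictionary is only an illustration of values one might enter there. The field names inside `output_format` are assumptions, not a documented schema.

```python
import json

instructions = (
    "Go to www.Amazon.com and search for 'laptop.' Look at the first 5 product "
    "listings on the results page. For each product, copy the product name and its price."
)

# Hypothetical key-value description of the desired result shape.
# `output` is typed optional[text], so the structure is serialized to a string.
output_format = json.dumps({"products": [{"name": "text", "price": "text"}]})

# Illustrative pairing of the two fields; not a documented request payload.
task_fields = {"instructions": instructions, "output": output_format}
```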
