Browser Use

Procedures and concepts for the Browser Use integration.

The following documentation is for Browser Use v1.14.0 (BDK).

Overview

The Browser Use book enables you to perform web automation using natural language. Use plain English to describe a task, and a remote browser agent will execute it.

The browser agent can navigate pages, fill out forms, and extract structured data, all with full traceability. Every action is logged with a detailed step plan and a unique session ID.

Prerequisites

1. Required Books

Add the following Book(s) to your agent so it can learn the automation procedures they define:

  • Browser Use

How to Add the Book(s)

  1. Go to Books > All Books.

  2. Search for the name of the book and click on it.

  3. Click on Install or Add Connection to add the book to your agent.

  4. If adding a connection, you'll be prompted for connectivity details.

Procedures

to close a browser

Close a browser instance.

Input Concepts

| Concept | Description | Type | Required | Default Value |
| --- | --- | --- | --- | --- |
| | A reference to the browser instance to delete. | browser instance | Yes | (no default) |

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| answer | None | boolean |

Examples

Close a browser instance:

to perform a task on a browser and get the visual log, the detailed plan, the result, the browser run id and the browser files

Execute a browser-based web automation task.

Note: Before using this procedure, you must first "provision a browser" for your automation task.

Input Concepts

| Concept | Description | Type | Required | Default Value |
| --- | --- | --- | --- | --- |
| | The reference to the provisioned browser instance. | browser instance | Yes | (no default) |
| | An object containing task configuration (created with the Browser Automation widget). | browser task | Yes | (no default) |

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| visual log | A series of screenshots showing each step the browser took. | file |
| detailed plan | A text summary of the steps taken to automate the task. | text |
| result | The data extracted or produced by the agent upon completing the task. | any |
| browser run id | The identifier of the browser run. | text |
| browser files | A list of files generated by the agent upon completing the task. | file |

Examples

Get the Weather in Philadelphia. Task instructions: "Search for 'current weather in Philadelphia' and extract the temperature."

Get Laptop Prices from Amazon. Task instructions: "Go to www.Amazon.com and search for 'laptop'. Look at the first 5 product listings on the results page. For each product, copy the product name and its price. Show me the list of product names with their prices."

to provision a browser

Provision a new Kubernetes (K8s) browser instance for a web automation task.

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| | None | browser instance |

Examples

Provision a browser and get the reference to the browser instance:

to provision an aws browser

Provision a new AWS AgentCore browser instance for a web automation task.

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| | None | browser instance |

Examples

Provision an AWS browser and get the reference to the browser instance:

to run a playwright script on a browser and get the result, the browser run id and the browser files

Execute a custom Playwright script on the provisioned browser.

Note: Before using this procedure, you must first "provision a browser". The script must:

  1. Be stored in S3.

  2. Contain an async function: run(page, context, browser, params, llm=None).

  3. Be readable with the browser farm's AWS credentials.
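As an illustration of the required entry point, here is a minimal sketch of a script body. Only the run(page, context, browser, params, llm=None) signature comes from the note above. The "url" parameter key, the shape of the returned dictionary, and the StubPage helper are assumptions made for this example. page.goto and page.title are standard Playwright async page methods.

```python
import asyncio


async def run(page, context, browser, params, llm=None):
    """Entry point with the signature required of the script."""
    # Assumption: `params` is the dictionary supplied as "parameter json" and
    # may be None when no parameters are given. "url" is a hypothetical key.
    params = params or {}
    url = params.get("url", "https://example.com")

    # Navigate and read the page title using Playwright's async page methods.
    await page.goto(url)
    title = await page.title()

    # Assumption: the returned value is what the procedure reports as "result".
    return {"url": url, "title": title}


class StubPage:
    """Stand-in for a Playwright page so the function can be dry-run locally."""

    async def goto(self, url):
        self.url = url

    async def title(self):
        return "Stub title"


# Local dry run without a real browser.
dry_run = asyncio.run(run(StubPage(), None, None, {"url": "https://example.org"}))
```

The stub is only there so the function can be exercised before uploading it. In the browser farm, the real page, context, and browser objects would be supplied by the platform.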

Input Concepts

| Concept | Description | Type | Required | Default Value |
| --- | --- | --- | --- | --- |
| | Reference to the provisioned browser instance. | browser instance | Yes | (no default) |
| playwright script | S3 URI of the Playwright script (e.g., "s3://bucket/scripts/my_script.py"). | text | Yes | (no default) |
| parameter json | Optional dictionary of parameters to pass to the script. | json | No | (no default) |

Output Concepts

| Concept | Description | Type |
| --- | --- | --- |
| result | Data returned by the script execution. | any |
| browser run id | Identifier of the browser run. | text |
| browser files | Files generated during script execution. | file |

Examples

Run a custom data extraction script with parameters:
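The sketch below is written in Python rather than in the procedure's own syntax, and it only illustrates how the two script-related inputs might be prepared. It checks that the script location is a well-formed S3 URI and serializes a parameter dictionary to JSON. The split_s3_uri helper and the parameter keys ("url", "max_items") are hypothetical names chosen for this example.

```python
import json
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key), raising ValueError if malformed."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    return parsed.netloc, key


# The "playwright script" input: an S3 URI, as in the table above.
script_uri = "s3://bucket/scripts/my_script.py"
bucket, key = split_s3_uri(script_uri)

# The "parameter json" input: hypothetical parameters for an extraction script.
params = {"url": "https://example.com", "max_items": 5}
parameter_json = json.dumps(params)
```

Validating the URI up front is a design choice for this sketch: it surfaces a malformed location before a browser run is started.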

Concepts

Browser instance

Represents a provisioned browser instance and its attributes.

| Field Name | Description | Type |
| --- | --- | --- |
| browser_name | The name of the browser instance (pod name or AWS session ID). | text |
| vnc | The Virtual Network Computing (VNC) URL to access the browser. | text |
| browser_type | The type of browser backend ("k8s" or "aws"). | optional[text] |
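To make the shape of this concept concrete, the fields can be mirrored as a small Python data structure. This is an illustrative model, not the platform's implementation. The class name, the validation rule, and the example values are assumptions; only the field names, types, and the two backend values come from the table above.

```python
from dataclasses import dataclass
from typing import Optional

# Backend values listed for browser_type.
KNOWN_BROWSER_TYPES = {"k8s", "aws"}


@dataclass(frozen=True)
class BrowserInstance:
    browser_name: str                   # pod name or AWS session ID
    vnc: str                            # VNC URL for accessing the browser
    browser_type: Optional[str] = None  # optional[text]: "k8s" or "aws"

    def __post_init__(self):
        # Assumption: reject backend values other than the two documented ones.
        if self.browser_type is not None and self.browser_type not in KNOWN_BROWSER_TYPES:
            raise ValueError(f"unknown browser_type: {self.browser_type!r}")


# Hypothetical values, for illustration only.
instance = BrowserInstance(
    browser_name="browser-pod-1",
    vnc="https://vnc.example.com/browser-pod-1",
    browser_type="k8s",
)
```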

Browser task

Represents a browser automation task.

| Field Name | Description | Type |
| --- | --- | --- |
| instructions | A natural language description of the task for the browser agent to perform (e.g., "Log into my account on example.com and check for new messages"). | text |
| browser_task_id | Identifier for the browser task. | text |
| output | The desired structure or format of the results. It is recommended to use key-value pairs for clarity (e.g., {"key": "value"}). If specified, the agent will format the result accordingly. | optional[text] |
| worker_id | Identifier for the Kognitos worker executing the task. | text |
| line_id | Identifier for the specific line/step in the Kognitos automation. | text |
| browser_task_version | Version of the browser task. | text |
| task_context | Optional additional context or data to assist in completing the task. Contains the resolved reference facts for the task. | optional[json] |
| credential_names | The names of the authentication credentials needed for the task. | optional[list of text] |
| files | The files to be uploaded to the browser. | optional[list of text] |
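As a rough illustration of how these fields fit together, the sketch below models a browser task in Python and fills in an output specification using key-value pairs, as recommended for the output field. The class is not the platform's implementation, and all identifier values are hypothetical. In practice the task object is created with the Browser Automation widget.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class BrowserTask:
    # Required text fields come first because dataclasses require fields
    # without defaults to precede those with defaults.
    instructions: str
    browser_task_id: str
    worker_id: str
    line_id: str
    browser_task_version: str
    # Optional fields, as marked in the table above.
    output: Optional[str] = None
    task_context: Optional[Dict[str, Any]] = None
    credential_names: Optional[List[str]] = None
    files: Optional[List[str]] = None


# Key-value description of the desired result, serialized to text.
output_spec = json.dumps({"temperature": "current temperature in Philadelphia"})

# Hypothetical identifiers, for illustration only.
task = BrowserTask(
    instructions="Search for 'current weather in Philadelphia' and extract the temperature",
    browser_task_id="task-001",
    worker_id="worker-001",
    line_id="line-001",
    browser_task_version="1",
    output=output_spec,
)
```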
