Commit 8b43bde

Simplify prompt for pagination (#18)
* Simplified prompt for robust pagination
* Required output schema when specified in output (no longer `.nullable()`)
* Provided an example of ai pagination
1 parent ca021cb commit 8b43bde

6 files changed: +106 -66 lines changed

scripts/test-page-ai-local.ts

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+import { BrowserAgent } from "../src/agent";
+import dotenv from "dotenv";
+
+dotenv.config();
+
+const agent = new BrowserAgent({
+  browserProvider: "Local",
+  debug: true,
+});
+
+(async () => {
+  const page = await agent.newPage();
+  await page.goto("https://www.loblaws.ca/en/food/bakery/bread/c/28251");
+  try {
+    await page.waitForLoadState("networkidle", { timeout: 10000 });
+  } catch {
+    console.log("Network idle timeout, continuing...");
+  }
+  await page.ai("Find pagination links and go to the next page", { maxSteps: 2 });
+})();

scripts/test-pagination-local.ts

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
+import { BrowserAgent } from "../src/agent";
+import { ChatOpenAI } from "@langchain/openai";
+import dotenv from "dotenv";
+import { z } from "zod";
+import { TaskStatus } from "../dist/types";
+
+dotenv.config();
+
+const agent = new BrowserAgent({
+  browserProvider: "Local",
+  debug: true,
+  llm: new ChatOpenAI({
+    apiKey: process.env.OPENAI_API_KEY,
+    model: "gpt-4.1-mini",
+  }),
+});
+
+(async () => {
+  const page = await agent.newPage();
+  await page.goto("https://www.loblaws.ca/en/food/bakery/bread/c/28251");
+  try {
+    await page.waitForLoadState("networkidle", { timeout: 10000 });
+  } catch {
+    console.log("Network idle timeout, continuing...");
+  }
+
+  // Run first extraction
+  // --- some extraction code ---
+  console.log("Run extraction for page 1");
+  let nextPageNumber = 2;
+  let hasNextPage = true;
+
+  while (hasNextPage) {
+    const result = await page.ai(
+      `Go to page ${nextPageNumber} of the results. If page ${nextPageNumber} does not exist, return early and complete the task.`,
+      {
+        maxSteps: 3,
+        outputSchema: z.object({
+          success: z.boolean(),
+          currentPageNumber: z.number(),
+          hasNextPage: z.boolean(),
+        }),
+      }
+    );
+    if (result.status === TaskStatus.COMPLETED) {
+      const structuredOutput = JSON.parse(result.output) as {
+        success: boolean;
+        currentPageNumber: number;
+        hasNextPage: boolean;
+      };
+
+      if (structuredOutput.currentPageNumber === nextPageNumber) {
+        // Run extraction
+        // --- some extraction code ---
+        console.log(
+          "Run extraction for page",
+          structuredOutput.currentPageNumber
+        );
+      } else {
+        console.error(
+          `Expected page ${nextPageNumber}, but got page ${structuredOutput.currentPageNumber}`
+        );
+        break;
+      }
+
+      if (structuredOutput.hasNextPage) {
+        nextPageNumber += 1;
+        hasNextPage = true;
+      } else {
+        console.log(
+          "No more pages available at page",
+          structuredOutput.currentPageNumber
+        );
+        break;
+      }
+    } else {
+      console.error("Task failed", JSON.stringify(result, null, 2));
+      break;
+    }
+  }
+})();
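
The loop in this script casts the parsed output with `as { ... }`, which trusts whatever JSON the agent returns. A minimal sketch of a stricter alternative follows, assuming `result.output` is a JSON string shaped like the `outputSchema` passed to `page.ai`. `parsePaginationOutput` and `decideNextStep` are hypothetical helper names used for illustration only; they are not part of the library.

```typescript
// Hypothetical helpers for illustration; not part of the BrowserAgent API.
interface PaginationOutput {
  success: boolean;
  currentPageNumber: number;
  hasNextPage: boolean;
}

type NextStep = "extract-and-continue" | "extract-and-stop" | "abort";

// Parse the raw output string and check each field's runtime type.
// Returns null instead of throwing when the input is malformed.
function parsePaginationOutput(raw: string): PaginationOutput | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) {
    return null;
  }
  const { success, currentPageNumber, hasNextPage } = value as Record<
    string,
    unknown
  >;
  if (
    typeof success !== "boolean" ||
    typeof currentPageNumber !== "number" ||
    typeof hasNextPage !== "boolean"
  ) {
    return null;
  }
  return { success, currentPageNumber, hasNextPage };
}

// Mirror the branching of the loop in the script: abort when the output is
// unusable or the agent landed on an unexpected page, otherwise extract and
// either continue or stop depending on hasNextPage.
function decideNextStep(
  output: PaginationOutput | null,
  expectedPage: number
): NextStep {
  if (output === null || output.currentPageNumber !== expectedPage) {
    return "abort";
  }
  return output.hasNextPage ? "extract-and-continue" : "extract-and-stop";
}
```

In the script, `decideNextStep(parsePaginationOutput(result.output), nextPageNumber)` could replace the two nested `if`/`else` blocks, with `"abort"` and `"extract-and-stop"` both leading to `break`.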

src/agent/actions/complete-with-output-schema.ts

Lines changed: 3 additions & 5 deletions
@@ -9,11 +9,9 @@ export const generateCompleteActionWithOutputDefinition = (
       success: z
         .boolean()
         .describe("Whether the task was completed successfully."),
-      outputSchema: outputSchema
-        .nullable()
-        .describe(
-          "The output model to return the response in. Given the previous data, try your best to fit the final response into the given schema."
-        ),
+      outputSchema: outputSchema.describe(
+        "The output model to return the response in. Given the previous data, try your best to fit the final response into the given schema."
+      ),
     })
     .describe(
       "Complete the task. An output schema has been provided to you. Try your best to provide your response so that it fits the output schema provided."
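
The hunk above removes `.nullable()`, so a completion whose `outputSchema` field is `null` no longer validates. A dependency-free sketch of that difference follows. The real check is performed by zod; `isValidCompletion`, `CompletionParams`, and the `allowNullOutput` flag are hypothetical names that only emulate the before and after behavior.

```typescript
// Hypothetical emulation of the validation change; the library itself uses zod.
interface CompletionParams {
  success: boolean;
  outputSchema: unknown;
}

// allowNullOutput = true emulates the old `.nullable()` schema;
// allowNullOutput = false emulates the schema after this commit.
function isValidCompletion(
  params: CompletionParams,
  isValidOutput: (value: unknown) => boolean,
  allowNullOutput: boolean
): boolean {
  if (params.outputSchema === null) {
    return allowNullOutput;
  }
  return isValidOutput(params.outputSchema);
}

// Stand-in for the caller's schema check: requires a numeric currentPageNumber.
const hasPageNumber = (value: unknown): boolean =>
  typeof value === "object" &&
  value !== null &&
  typeof (value as Record<string, unknown>).currentPageNumber === "number";

const nullCompletion: CompletionParams = { success: true, outputSchema: null };

// Before the commit a null output passed validation; after it, it does not.
const acceptedBefore = isValidCompletion(nullCompletion, hasPageNumber, true);
const acceptedAfter = isValidCompletion(nullCompletion, hasPageNumber, false);
```

Under this reading, callers such as the pagination script can rely on structured output being present whenever the task completes with a schema supplied.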

src/agent/messages/examples-actions.ts

Lines changed: 0 additions & 3 deletions
@@ -5,9 +5,6 @@ export const EXAMPLE_ACTIONS = `- Search: [
 - Clicking on an element: [
   {"type": "clickElement", "params": {"index": 1}}
 ]
-- Extracting content (if your goal is to find any information on a page): [
-  {"type": "extractContent", "params": {"goal": "what specifically you need to extract"}}
-]
 - Forms: [
   {"type": "inputText", "params": {"index": 1, "text": "first name"}},
   {"type": "inputText", "params": {"index": 2, "text": "last name"}},

src/agent/messages/system-prompt.ts

Lines changed: 2 additions & 58 deletions
@@ -52,69 +52,13 @@ ${EXAMPLE_ACTIONS}
 - Use scroll to find elements you are looking for
 - If you want to research something, open a new tab instead of using the current tab
 
-2. GETTING UNSTUCK
-- Avoid getting stuck in loops.
-  * You know your previous actions, and you know your current state. Do not keep repeating yourself expecting something to change.
-- If stuck, try:
-  * Going back to a previous page
-  * Starting a new search
-  * Opening a new tab
-  * Using alternative navigation paths
-  * Trying a different website or source
-  * Use the thinking action to think about the task and how to accomplish it
-
-3. SPECIAL CASES
+2. SPECIAL CASES
 - Cookies: Either try accepting the banner or closing it
 - Captcha: First try to solve it, otherwise try to refresh the website, if that doesn't work, try a different method to accomplish the task
 
-4. Form filling:
+3. Form filling:
 - If your action sequence is interrupted after filling an input field, it likely means the page changed (e.g., autocomplete suggestions appeared).
 - When suggestions appear, select an appropriate one before continuing. Important thing to note with this, you should prioritize selecting the most specific/detailed option when hierarchical or nested options are available.
 - For date selection, use the calendar/date picker controls (usually arrows to navigate through the months and years) or type the date directly into the input field rather than scrolling. Ensure the dates selected are the correct ones.
 - After completing all form fields, remember to click the submit/search button to process the form.
-
-5. For Date Pickers with Calendars:
-- First try to type the date directly into the input field and send the enter key press action
-  * Be sure to send the enter key press action after typing the date, if you don't do that, the date will not be selected
-- If that doesn't work, use the right arrow key to navigate through months and years until finding the correct date
-  * Be patient and persistent with calendar navigation - it may take multiple attempts to reach the target month/year
-  * Verify the correct date is selected before proceeding
-
-5. For Flight Search:
-- If you are typing in the where from, ALWAYS send an enter key press action after typing the value
-- If you are typing in the where to, ALWAYS send an enter key press action after typing the value
-
-5. For flight sources and destinations:
-- Send enter key press action after typing the source or destination
-
-# Search Strategy
-When searching, follow these best practices:
-
-1. Primary Search Method:
-- Use textInput action followed by keyPress action with 'Enter'
-- If unsuccessful, look for clickable 'Search' text or magnifying glass icon
-- Only click search elements that are marked as interactive
-
-2. Query Construction:
-- Search Engines (Google, Bing):
-  * Can handle complex, natural language queries
-  * Example: "trending python repositories" or "wizards latest game score"
-
-- Specific Websites:
-  * Use simpler, more targeted queries
-  * Follow up with filters and sorting
-  * Example on GitHub: Search "language:python", then sort by trending/stars
-  * Example on ESPN: Search "wizards", navigate to team page, find latest score
-
-3. Important Considerations:
-- For date-based queries, use current date: ${DATE_STRING}
-- Use relative dates only when explicitly requested
-- With autocomplete:
-  * You can ignore suggestions and enter custom input
-  * Verify suggested options match requirements before selecting
-
-4. Search Refinement:
-- Use available filters and sort options
-- Consider in-memory filtering when site options are limited
-- Break down complex searches into smaller, manageable steps
 `;
