In this video, we'll build an agent that mimics a prompt engineer: it iterates on a prompt, runs evaluations against it, and reflects on the eval findings to guide the next iteration, just as a real prompt engineer would.
I'll also cover, more generally, how to set up, run, and score prompt evaluations, which is essential for this agent to work.
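As a rough sketch of what "set up, run, and score" means here, the loop below runs a prompt against a set of test cases, scores each output, and averages the scores. The model call is stubbed so the example runs without an API key, and the case format and exact-match scorer are illustrative assumptions, not the exact setup from the video or repo.

```python
def run_model(prompt: str, case_input: str) -> str:
    """Stub standing in for a real LLM API call."""
    # A real implementation would send `prompt` + `case_input` to an LLM.
    # Here we pretend the prompt asks the model to uppercase the input.
    return case_input.upper()

def score(output: str, expected: str) -> float:
    """Exact-match scorer; real evals often use rubrics or an LLM-as-judge."""
    return 1.0 if output.strip() == expected.strip() else 0.0

def evaluate(prompt: str, cases: list[dict]) -> float:
    """Run every case through the model and return the mean score."""
    scores = [score(run_model(prompt, c["input"]), c["expected"]) for c in cases]
    return sum(scores) / len(scores)

cases = [
    {"input": "hello", "expected": "HELLO"},
    {"input": "world", "expected": "WORLD"},
]
print(evaluate("Uppercase the input.", cases))  # 1.0
```

A prompt-engineering agent wraps a loop around `evaluate`: rewrite the prompt, re-run the eval, and keep the version whose score improves.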
You might want to watch the previous video on building an agent with long-term memory, which you can find here: [ Link ]
Interested in talking about a project? Reach out!
Email: christian@botany-ai.com
LinkedIn: linkedin.com/in/christianerice
Timestamps:
0:00 - Intro
0:55 - Project Overview
1:48 - Evaluation Approach
7:54 - System Overview
9:24 - Theory Behind My Approach
12:09 - System Discussion, Continued
17:41 - Step Through the Agent Logs
21:56 - Code Walkthrough
32:26 - Future Optimizations
Follow along with the code on GitHub:
[ Link ]