All Posts

  • Published on
    Learn how to build the core evaluation pipeline for prompt testing. This guide covers the three essential functions—run_prompt, run_test_case, and run_eval—that process test cases through Claude and collect structured results for analysis.
  • Published on
    Learn how to build a custom prompt evaluation workflow for AWS-specific code generation. This guide covers creating baseline prompts, automatically generating test datasets using Claude, and setting up a systematic evaluation framework for Python, JSON, and regex outputs.
  • Published on
    Learn how to build hooks in Claude Code to intercept and control tool calls. Master PreToolUse and PostToolUse hooks, understand tool call data structures, and implement access control for sensitive files.