
Build an automated large language model evaluation pipeline on AWS

Large Language Models (LLMs) have gained significant attention as key tools for understanding, generating, and manipulating text with unprecedented proficiency. Their potential applications span conversational agents, content generation, and information retrieval. However, maximizing LLM capabilities while ensuring responsible and effective use of these models hinges on the critical process of LLM evaluation. Join us as we dive into the solution framework and demonstrate how you can efficiently evaluate different LLMs and prompt templates by temporarily launching endpoints and running test sets against them. We show how the evaluation process is automated by converting LLM evaluation into a classification problem, in which a second "judge" LLM assesses the output of the model under test, much as a human evaluator would, saving significant cost and effort during the evaluation stage.
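
To make the classification-based approach concrete, the sketch below shows one way the "LLM-as-a-judge" idea could look in Python. It assumes two temporary SageMaker real-time endpoints (one serving the model under test, one serving the judge model) and a simple JSON request/response shape; the endpoint payload keys, prompt template, and test-set format are illustrative assumptions, not the exact implementation presented in the session.

```python
# Minimal sketch of classification-based LLM evaluation ("LLM as a judge").
# Assumes two SageMaker real-time endpoints are already deployed; the
# request/response JSON keys ("inputs", "generated_text") depend on the
# model container and are assumptions here.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def ask(endpoint_name: str, prompt: str) -> str:
    """Send a prompt to a SageMaker endpoint and return the generated text."""
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}),
    )
    body = json.loads(response["Body"].read())
    return body[0]["generated_text"]

# Hypothetical judge prompt: the judge model turns open-ended grading into a
# binary classification (CORRECT / INCORRECT), mimicking a human evaluator.
JUDGE_TEMPLATE = (
    "You are grading an answer.\n"
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Candidate answer: {candidate}\n"
    "Reply with exactly one word: CORRECT or INCORRECT."
)

def evaluate(candidate_endpoint: str, judge_endpoint: str, test_set: list[dict]) -> float:
    """Return the fraction of test cases the judge labels CORRECT."""
    correct = 0
    for case in test_set:
        candidate_answer = ask(candidate_endpoint, case["question"])
        verdict = ask(
            judge_endpoint,
            JUDGE_TEMPLATE.format(
                question=case["question"],
                reference=case["reference"],
                candidate=candidate_answer,
            ),
        )
        correct += verdict.strip().upper().startswith("CORRECT")
    return correct / len(test_set)
```

Because the endpoints are launched only for the duration of the test run and torn down afterwards, and the judge model replaces manual review, the per-evaluation cost stays bounded even as the number of candidate models and prompt templates grows.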

Speakers: 
Melanie Li, PhD, Senior AI/ML Specialist Technical Account Manager, AWS
Sam Edwards, Cloud Support Engineer, AWS
Rafa Xu, Senior Cloud Architect, AWS Professional Services