Choosing the right compute for ML training and inference
Organizations across industries are increasingly adopting machine learning (ML) for a wide range of use cases, including natural language processing (NLP), computer vision, voice assistants, fraud detection, and recommendation engines. Large language models (LLMs) with hundreds of billions of parameters are unlocking new generative AI use cases, such as image and text generation. But the growth of ML applications has driven up the usage, management overhead, and cost of compute, storage, and networking resources. This session explains why identifying and choosing the right compute infrastructure is important for reducing power consumption and costs, as well as for managing the complexities of taking ML models from training and deployment to production. We explain how AWS offers a combination of high-performance, cost-effective, and energy-efficient tools and accelerators purpose-built and optimized for ML applications. Learn how to choose the right infrastructure for your AI/ML workload requirements. The session also explores performant, scalable, and cost-effective ML infrastructure from AWS, ranging from the latest GPU-based instances, such as Amazon EC2 P5, to purpose-built accelerators, including AWS Trainium and AWS Inferentia, designed for training and running models.
Speaker: Smiti Guru, Senior Solutions Architect, AWS India