Scalable data preparation & ML using Apache Spark on AWS (Level 200)

Analyzing, transforming, and preparing large amounts of data is a foundational step of any data science and ML workflow. This session shows how to build end-to-end data preparation and machine learning (ML) workflows. We explain how to connect from Amazon SageMaker Studio to Apache Spark, for fast data preparation, in your data processing environments on Amazon EMR and AWS Glue interactive sessions. Discover how to access data governed by AWS Lake Formation so you can interactively query, explore, and visualize data, and run and debug Spark jobs as you prepare large-scale data for use in ML.
Speaker: Suman Debnath, Principal Developer Advocate, Data Engineering, AWS
Duration: 30 mins

Previous Video
Implement unified text and image search application with analytics and ML (Level 200)

While text and semantic search engines have enabled many organizations to search for information quickly, or...

Next Video
Build an intelligent document processing solution (Level 200)

Organizations have millions of physical documents and forms that hold critical business data. These documen...