15 Free YouTube subscribers for your channel
Get Free YouTube Subscribers, Views and Likes

Scalable advanced ML systems with Ray Google Kubernetes Engine and ML accelerators

Follow
Google Cloud Tech

As machine learning (ML) systems continue to evolve, the ability to scale complex ML workloads becomes crucial. Scalability can be considered along two dimensions: expansive training of large language models (LLMs) and intricate distribution of reinforcement learning (RL) systems. Each has its own set of challenges, from computational demands of LLMs to complex synchronization in distributed RL.

This session explores the integration of Ray, Google Kubernetes Engine (GKE) and ML accelerators like tensor processing units (TPUs) as a powerful combination to develop advanced ML systems at scale. We discuss Ray and its scalable APIs, its mature integration with GKE and ML accelerators, and demonstrate how it has been used for LLMs and reimplementing the powerful RL algorithm, Muzero.

Speakers:

Watch more:
All sessions from Google Cloud Next → https://goo.gle/next24

#GoogleCloudNext

OPS213

posted by 4KLPx1