| Page 658 | Kisaco Research

Large Language Models (LLMs) have revolutionized natural language processing but have posed significant challenges in training and inference due to their enormous memory requirements. In this talk, we delve into techniques and optimizations to mitigate memory constraints across the entire lifecycle of LLMs.

The first segment explores memory-optimized LLM training. We discuss training challenges and cover different techniques under Parameter-Efficient Fine-Tuning (PEFT), such as prompt tuning with LoRA and adapters.

LLM inference is more memory-bound than compute-bound. In this section, we explore inference optimizations, mostly for transformer architectures, such as Paged Key-Value (KV) Cache, Speculative Decoding, Quantization, In-flight Batching strategies, and Flash Attention, each contributing to enhanced inference speed and efficiency.
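To give a rough sense of why the KV cache dominates inference memory, the following is a minimal back-of-the-envelope sketch, not material from the talk. It uses the standard estimate of one key and one value vector per token, per layer, per KV head. The function name `kv_cache_bytes` and the model shape (32 layers, 32 KV heads, head dimension 128, 16-bit values) are hypothetical and chosen only for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_elem=2):
    """Approximate KV cache size in bytes.

    The leading factor of 2 accounts for storing both a key and a value
    vector for every token, in every layer, for every KV head.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model shape (an assumption for illustration only).
size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                      seq_len=4096, batch_size=1)
print(f"{size / 1024**3:.1f} GiB")  # 2.0 GiB for one 4096-token sequence
```

Under these assumptions the cache grows linearly with both sequence length and batch size, which is the pressure that paging, quantizing, or offloading the KV cache is meant to relieve.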

Finally, we explore the concept of Coherent Memory and how it helps with inference optimizations through KV Cache offloading and LoRA weight re-computation.

By illuminating these advancements, this talk aims to provide a comprehensive understanding of state-of-the-art memory optimization techniques for LLMs, empowering practitioners to push the boundaries of natural language processing further.

Systems Infrastructure/Architecture
AI/ML Compute

Author:

Arun Raman

Deep Learning Solutions Architect
NVIDIA

Arun Raman is an AI solution architect at NVIDIA, adept at navigating the intricate challenges of deploying AI applications across edge, cloud, and on-premises environments within the consumer Internet industry. In his current role, he works on the design of end-to-end accelerated AI pipelines for consumer internet customers, meticulously addressing preprocessing, training, and inference optimizations. His experience extends beyond AI, having worked with distributed systems and multi-cloud infrastructure. He shares practical strategies and real-world experiences, empowering organizations to leverage AI effectively.

HBM
CXL
Interconnects
AI/ML Compute

Author:

Jin-Hyeok Choi

EVP, Memory Solution & Product Development
Samsung Electronics

Jin-Hyeok Choi leads Device Solution’s R&D – Memory division, which develops new memory technologies and enables memory products.

Jin-Hyeok joined Samsung Electronics in 2003 as a SoC design engineer, working on the development of mobile storage. From 2012 to 2019, he was in charge of the development team for controllers, a core component of SoCs based on NAND Flash. He developed and commercialized the world's first eMMC and UFS products, as well as various controllers for SATA/SAS/NVMe SSDs. He also developed the first-ever enterprise premium SSD with high endurance VNAND and has contributed significantly to the expansion of the storage market.

Jin-Hyeok received his B.S., M.S., and Ph.D. degrees in Electronics Engineering from Seoul National University in 1989, 1991, and 1996, respectively. He also studied low-power circuits at the University of Tokyo's Institute of Industrial Science.

Author:

SangJoon Hwang

Corporate EVP, Head of DRAM Product & Technology
Samsung Electronics

SangJoon Hwang received B.S., M.S., and Ph.D. degrees in electrical engineering from Korea University in 1994, 1996, and 2008, respectively.

He joined Samsung Electronics, Hwaseong, South Korea, in 1996, where he led a DRAM design group in 2014 and the Flash design team in 2017 as a Vice President, and the Memory Product Planning team in 2019 as a Senior Vice President. Having led teams in areas ranging from product planning to design, he brings experience that enhances the overall quality of Samsung DRAM products.

Since 2023, he has been leading DRAM Product & Technology in the Samsung memory division. His current research interests include architecture for next-generation DRAM and product development utilizing new process technology for new product line-ups.

Author:

Paul Turner

VP
Broadcom/VMware

Paul Turner is the Vice President of the vSphere Product Management team, covering vCenter, ESXi, vMotion and Project Pacific. He is leading the next generation of vSphere and moving the platform to become the leading infrastructure platform for all apps: VMs, containers, and machine learning applications. Paul brings more than 20 years of expertise in enterprise software product management and marketing, having held leadership roles at VMware, NetApp, Oracle, Cloudian and Scality. Under his leadership, Scality was recognized as a leader in Gartner's Magic Quadrant and also by IDC in their MarketScape report for Object Storage. Prior to this, at NetApp, he led the product management and technical marketing for their management software and also ran the Product Strategy Office, where he guided their investments into all-flash, Iongrid, CacheIQ, Onaro and Akorri.

Paul holds a computer science degree from Trinity College in Ireland. He lives in Los Altos, Silicon Valley, with his wife Kristy and their children Conor and Aoife.

Author:

Gunnar Hellekson

VP & GM
Red Hat

Gunnar Hellekson is Vice President and General Manager for the Red Hat Enterprise Linux business. Before that, he was Chief Strategist for Red Hat’s US Public Sector group.  He is a founder of Open Source for America, one of Federal Computer Week’s Fed 100 for 2010, and was voted one of the FedScoop 50 for industry leadership. He was a founder of the Military Open Source working group, a member of the SIIA Software Division Board, the Board of Directors for the Public Sector Innovation Group, the Open Technology Fund Advisory Council, New America’s California Civic Innovation Project Advisory Council, and the CivicCommons Board of Advisors. He perks up when people talk about commoditization and the industrial mobilization of World War II. He is also co-host of the Dave and Gunnar Show.

Prior to joining Red Hat, he worked as a developer, systems administrator, and IT director for a number of Internet businesses. He has also been a business and IT consultant to not-for-profit organizations in New York City. During that time, he spearheaded the reform of safety regulations for New York State’s electrical utilities through the Jodie Lane Project.

Gunnar’s CV is available in HTML, PDF, and on GitHub.

Richelle Marting

Director, Managed Care Contracting
North Kansas City Hospital, Meritas Health Corporation

Gregory Bryant

Chief Information Officer
Gov Juan F. Luis Hospital & Medical Center

Novelette Wallace

AVP, Payment Integrity and FWA
Johns Hopkins HealthCare

Anthony Baize

Inspector General
Wisconsin Department of Health Services

Anthony J. Baize is the Inspector General for the Wisconsin Department of Health Services.  Baize took the position in early 2016 after eight years with Kentucky state government in the Kentucky Cabinet for Health and Family Services, serving as the Deputy Director of Audits and Investigations for the Office of Inspector General and the Director of Business Informatics with the Department of Behavioral Health, Developmental and Intellectual Disabilities.

Baize has served as the Region V representative for the National Association of Medicaid Program Integrity Directors and on the Advisory Board for the Centers for Medicare and Medicaid Services’ Medicaid Integrity Institute. He regularly speaks at national conferences on topics related to Medicaid Program Integrity.

Baize became a certified inspector general in 2022 after completing the Association of Inspectors General Institute. He is also a member of the International Association of Financial Crimes Investigators.

Baize was a civil rights consultant for nearly 20 years, serving on the Board of Directors for the National Fair Housing Alliance and the Lexington (KY) Fair Housing Council. Baize has given presentations on fair housing requirements across the United States, but especially in Kentucky, Indiana, Ohio and Tennessee. He has a master’s degree in public administration from Indiana State University, has been married for 29 years and has two daughters.

Dr. Ahmad Kilani MD, MBA, MLS, MSIT, CHCQM-PHYADV, FACP, FACHE

Medical Director
Cleveland Clinic

Lennart Stradler

General Manager & Senior Analyst, Wound
SMARTTRAK

Thariea Whisker

Director of Minuteful for Wound Services UK
Healthy.io
