
Reference Architecture: Enterprise AI/ML Workloads on Lenovo ThinkSystem and ThinkAgile Platforms

Reference Architecture

Published: 26 Dec 2023
Form Number: LP1880
PDF size: 69 pages, 1.6 MB

Abstract

This document describes the reference architecture for deploying Enterprise AI workloads on Lenovo ThinkSystem, ThinkAgile, and ThinkEdge platforms. The reference architecture provides the system architecture for on-premises AI infrastructure and the software stack for the end-to-end AI/ML life cycle. The objective of the solution is to provide design guidance for consolidating enterprise-wide AI/ML workloads and the relevant data science applications on Lenovo server platforms with the respective private cloud management software and Kubernetes platforms.

The RA provides system design guidelines for selecting appropriate Lenovo servers, AI technologies and frameworks, and for model selection and evaluation. The document provides system design recommendations for AI/ML use cases such as time series forecasting, conversational AI, generative AI, and cognitive services from Lenovo AI Innovation partners. Lenovo systems are engineered, tailored, and validated to run a wide range of data science and AI solutions at any scale, anywhere.

The Lenovo Enterprise AI solution supports the following software and frameworks from leading partners. These choices provide flexibility and enable customers to choose the best software for different AI use cases. The solution uses NVIDIA AI Enterprise as a common framework to provide GPU-accelerated analytics, optimization libraries, and pretrained models for computer vision, video analytics, and private large language models. The following platforms are interoperable with NVIDIA AI Enterprise (a minimal GPU scheduling sketch follows the list):

  • Lenovo Intelligent Computing Orchestration (LiCO) solution for HPC/AI workload deployment and orchestration
  • VMware Private AI Foundation solution with VMware vSAN, VMware Cloud Foundation and Tanzu
  • Nutanix Enterprise AI solution with Nutanix AOS, Nutanix Cloud Platform, and Nutanix Kubernetes Engine
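To illustrate how an AI workload consumes GPUs on these Kubernetes platforms, the following is a minimal sketch (not taken from the RA itself) that uses the Kubernetes Python client to schedule a single-GPU pod. The image name, namespace, and entrypoint are placeholders; the GPU is requested through the standard nvidia.com/gpu extended resource exposed by the NVIDIA device plugin.

```python
# Hypothetical sketch: requesting one NVIDIA GPU for a containerized AI workload
# on a Kubernetes platform such as Tanzu or Nutanix Kubernetes Engine.
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig for the target cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llm-inference-example"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="nvcr.io/nvidia/pytorch:24.01-py3",  # placeholder image
                command=["python", "serve.py"],            # placeholder entrypoint
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}         # one GPU per replica
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The same pattern applies regardless of which of the listed platforms provides the underlying cluster; only the kubeconfig and storage classes differ.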

The document also covers components and frameworks for machine learning orchestration, MLOps, and distributed training and deployment to provide a complete solution for building and deploying scalable AI/ML applications.
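To give a feel for the distributed training layer mentioned above, below is a minimal, illustrative PyTorch DistributedDataParallel sketch (not taken from the RA). The model, data, and hyperparameters are placeholders, and the script assumes it is launched with torchrun so that one process runs per GPU.

```python
# Minimal data-parallel training sketch with PyTorch DistributedDataParallel.
# Assumed launch: torchrun --nproc_per_node=<num_gpus> train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")        # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])     # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(128, 10).cuda(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):                        # placeholder training loop
        inputs = torch.randn(32, 128, device=local_rank)
        targets = torch.randint(0, 10, (32,), device=local_rank)
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()                            # gradients all-reduced by DDP
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

MLOps frameworks discussed later in the document build on this kind of job, adding experiment tracking, scheduling, and model deployment around it.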

Table of Contents

1. Introduction
2. Business problem and business value
3. Requirements
4. Architectural Overview
5. Component Model
6. AI/ML System Design
7. Sample AI/ML Applications Solution Design
8. Operational model
9. Conclusion
10. Appendix: Bill of materials
Resources

Related product families

Product families related to this document are the following: