FriendliAI Makes LLMs Easy: Managed Service Streamlines Generative AI Deployment



By Greg Tavarez

Generative AI has changed how industries operate, and it has been great to watch it do so since the start of its surge. We won't risk sounding like a broken record, though; we're sure you're already aware of GenAI's benefits and the opportunities therein.

However, many organizations still face a widespread adoption hurdle: complex infrastructure requirements. Deploying and serving LLMs demands expertise in containerization and in managing high-performance computing resources like GPUs. This technical barrier often limits the technology to well-resourced organizations with dedicated AI teams.

FriendliAI, a frontrunner in inference serving for generative AI, aims to bridge this gap with Friendli Dedicated Endpoints, a managed service offering built upon the foundation of their Friendli Container technology. This new addition to the Friendli Suite streamlines the deployment of LLMs by automating complex processes and delivering cost-effective, high-performance custom model serving.

Friendli Dedicated Endpoints functions as the managed cloud alternative to Friendli Container. Friendli Container, already adopted by startups and large enterprises alike, allows for the deployment of LLMs at scale within private environments. It achieves reductions in GPU costs through the power of the Friendli Engine, a highly GPU-optimized engine that also serves as the core of Friendli Dedicated Endpoints.

Friendli Dedicated Endpoints also simplifies the entire LLM development and serving process through automation, covering everything from model fine-tuning and cloud resource procurement to deployment monitoring. In practice, users can fine-tune and deploy cutting-edge, quantized models like Llama 2 or Mixtral with just a few clicks, thanks to the Friendli Engine's power. This allows users of all technical backgrounds to leverage Friendli's GPU-optimized serving capabilities.
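Once a model is deployed, querying it typically comes down to a single HTTP call. The sketch below is illustrative only: the URL, endpoint ID, and token are hypothetical placeholders, and it assumes an OpenAI-style chat-completions request shape rather than FriendliAI's documented API.

```python
import json

def build_chat_request(endpoint_id: str, token: str, prompt: str) -> dict:
    """Assemble the pieces of a single chat-completion HTTP request.

    All names here (URL, endpoint ID, token) are placeholders for
    illustration, not FriendliAI's actual API surface.
    """
    return {
        "url": f"https://api.example.com/v1/endpoints/{endpoint_id}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        }),
    }

req = build_chat_request("my-llama-endpoint", "MY_TOKEN",
                         "Summarize LLM serving in one line.")
# An HTTP client such as `requests` would then POST req["url"]
# with req["headers"] and req["body"].
```

The point is that the managed service hides everything behind that call: GPU provisioning, scaling, and the serving engine itself.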

In the announcement, Byung-Gon Chun, CEO of FriendliAI, emphasized the importance of making generative AI accessible to a wider audience and highlighted its potential to drive innovation and boost organizational productivity.

“Friendli Dedicated Endpoints eliminates the burden of infrastructure management,” Chun said. “This allows our customers to unlock the full potential of generative AI with the Friendli Engine. Whether it's text generation, image creation, or anything else, our service opens doors to endless possibilities for users regardless of their technical expertise.”

More specifically, here are the key features Friendli Dedicated Endpoints offers.

Dedicated GPU instances allow users to reserve entire GPUs for their custom generative AI models to guarantee consistent and dependable access to high-performance computing resources.

Also, a single GPU powered by the optimized Friendli Engine delivers performance equivalent to up to seven GPUs running a vanilla LLM. This translates to cost savings of 50% to 90% on GPUs and up to 10 times faster response times for queries.
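The cost figures above can be sanity-checked with simple arithmetic: if one optimized GPU replaces up to seven vanilla GPUs, the implied maximum saving is 1 - 1/7, or roughly 86%, which sits inside the quoted 50% to 90% range.

```python
# Back-of-the-envelope check of the GPU cost-saving claim:
# one optimized GPU doing the work of up to seven vanilla GPUs.
vanilla_gpus = 7
optimized_gpus = 1

saving = 1 - optimized_gpus / vanilla_gpus  # fraction of GPU spend avoided
print(f"Implied maximum GPU cost saving: {saving:.0%}")  # → 86%
```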

Furthermore, Friendli Dedicated Endpoints automatically adapts to fluctuating workloads and handles failures seamlessly. This includes features like automated failure management and auto-scaling, which adjusts resource allocation based on real-time traffic patterns. In other words, say hello to uninterrupted operations and optimal resource utilization during peak demand periods.

By eliminating technical hurdles and optimizing GPU usage, FriendliAI aims to remove infrastructure constraints as a barrier to innovation in generative AI.

“We're excited to welcome new users on our mission to make generative AI models fast and affordable,” Chun said.

By offering a user-friendly managed service with exceptional performance and efficiency, Friendli Dedicated Endpoints has the potential to better equip a wider range of users to leverage the power of LLMs and unlock new possibilities in various fields.

Edited by Alex Passett
