Anyscale Launches Cost-Effective Endpoints for Embedding Open-Source LLMs into Apps


San Francisco-based Anyscale unveiled its new Anyscale Endpoints service at the recent Ray Summit 2023 conference. Anyscale Endpoints provides developers an easy and affordable way to integrate large language models into their applications. The service promises speed, scalability, and a substantial reduction in cost, offering a compelling alternative to existing solutions.

Traditionally, embedding these models involved building machine learning pipelines, training custom models, and then handling deployment and scalability—processes fraught with high costs and significant delays in time-to-market. With Endpoints, Anyscale gives developers simple API access to powerful GPUs and the latest open-source models at a fraction of the cost of proprietary solutions.
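To make the "simple API access" concrete, here is a minimal sketch of what calling a hosted LLM endpoint like this typically looks like. It assumes an OpenAI-style chat-completions JSON interface, which many hosted endpoints expose; the base URL and model identifier below are illustrative placeholders, not details confirmed by this article — consult Anyscale's own documentation for the real values.

```python
import json

# Hypothetical values for illustration only -- the real base URL, model id,
# and auth scheme come from Anyscale's docs, not this article.
ANYSCALE_BASE_URL = "https://api.endpoints.anyscale.com/v1"
MODEL = "meta-llama/Llama-2-70b-chat-hf"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Assemble a chat-completion request body in the common
    OpenAI-style JSON shape that many hosted LLM endpoints accept."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
    }

# Sending it is then a plain authenticated POST, e.g. with `requests`:
#
#   import requests
#   resp = requests.post(
#       f"{ANYSCALE_BASE_URL}/chat/completions",
#       headers={"Authorization": f"Bearer {api_key}"},
#       json=build_chat_request("Summarize Ray Summit 2023 in one line."),
#   )
#   reply = resp.json()["choices"][0]["message"]["content"]

body = build_chat_request("Hello!")
print(json.dumps(body, indent=2))
```

The appeal is that the entire integration reduces to an HTTP request: no training pipeline, no GPU provisioning, no deployment step.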

The service distinguishes itself by being significantly less expensive than other market offerings—up to 10 times cheaper for certain tasks. Pricing starts at $1 per million tokens for models like Llama-2 70B, providing an accessible entry point for developers keen to leverage advanced natural language capabilities.
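A quick back-of-the-envelope calculation shows what that rate means in practice. The $1 per million tokens figure comes from the article; the monthly token volume below is a made-up illustration, not Anyscale data.

```python
# Flat per-token rate for Llama-2 70B, as quoted in the article (USD).
PRICE_PER_MILLION_TOKENS = 1.00

def monthly_cost(tokens_per_month: int) -> float:
    """Estimated monthly spend in USD at a flat per-million-token rate."""
    return tokens_per_month / 1_000_000 * PRICE_PER_MILLION_TOKENS

# e.g. a hypothetical app processing 50 million tokens a month:
print(monthly_cost(50_000_000))  # -> 50.0
```

At that rate, even a fairly busy application stays in the tens of dollars per month, which is the crux of the cost argument against pricier proprietary APIs.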

Anyscale also touts the service's rapid adaptability. The company can integrate new models in a matter of hours, not weeks. This speed enables developers to stay updated with the latest advancements in the open-source LLM community, a crucial factor for maintaining competitive applications.

Moreover, Endpoints can operate within a customer's existing cloud accounts on AWS or GCP, providing an added layer of security. This feature allows for the reuse of existing security protocols and makes it easier to process proprietary data within a familiar environment.

Endpoints integrates seamlessly with popular Python libraries and machine learning platforms like Hugging Face, Weights & Biases, and Arize to facilitate diverse use cases and applications.

The new service aligns well with the company's focus on accelerating AI application development through its Ray framework. With Endpoints' cost and speed advantages, the company hopes to expand access and drive user growth.

This flexibility has already proven valuable to early adopters. Shaun Wei, CEO and co-founder, highlighted the speed and cost-effectiveness of Endpoints, which enabled his team to launch new services in hours instead of weeks. Siddartha Saxena, Co-Founder and CTO at Merlin, cited a 5x-8x cost advantage over other solutions, enhancing the affordability of their consumer-facing services.

For developers and businesses eyeing more cost-effective and efficient ways to leverage large language models, Anyscale Endpoints could serve as a critical resource in their AI toolbox. The service is available starting today.
