Empowering Generative AI with Alibaba Cloud PAI’s Advanced LLM and LangChain Features

Nov 23, 2023

Are you ready to dive into the world of Generative AI (GenAI) and see what Large Language Models (LLMs) can do? Meet Alibaba Cloud Platform for AI (PAI). This platform combines Alibaba Cloud’s AI capabilities with LangChain and LLM features to help you build GenAI applications.

PAI offers plenty of exciting features that will leave you awe-inspired. Let’s explore some of its key highlights:

Seamless LangChain Integration: PAI integrates with LangChain, an open-source framework. With LangChain’s tools and resources, PAI draws on the full potential of language models, enabling you to create dynamic and context-aware AI-generated content.
Advanced LLM Capabilities: PAI leverages state-of-the-art LLM technology for text generation. Whether you want to generate natural language responses, write creative stories, or engage in multi-turn conversations, PAI is designed to handle it accurately and fluently.
Real-time Streaming Output: Say goodbye to waiting for AI-generated responses! PAI offers real-time streaming output, ensuring instant and dynamic feedback. Experience the thrill of watching the AI-generated content unfold before your eyes, making your interactions with the model more engaging and interactive.
Template-based Access: PAI introduces a template-based access feature. With the help of DSW Gallery, you can create your own GenAI or AI model from templates for your business tasks. This lets you tailor the AI model to specific areas or create an industry-specific LLM that gives responses relevant to your industry and context.
Knowledge Base System (KBS) or Retrieval Augmented Generation (RAG): Unlock the power of PAI’s KBS feature within LLM. Seamlessly retrieve relevant information from vast knowledge bases to enhance the quality and accuracy of AI-generated responses. Leverage the extensive knowledge stored in PAI to make LLM an intelligent and reliable companion.
Interactive Agent Capabilities: Transform your AI-driven conversations with PAI into an interactive and immersive experience. With PAI as the platform, you can establish your multi-agent system to engage in lifelike and multi-turn dialogues. Witness an AI companion that understands your queries, responds intelligently, and adapts to your conversation flow.

Covering all of these features in one article would be challenging, so we will publish a series of articles on LLMs on PAI. Let’s start with the LangChain integration.

Unleashing the Power of Chat Models with Alibaba Cloud PAI-EAS

Introduction

PAI-EAS

PAI-EAS (Elastic Algorithm Service of Platform for AI) is the PAI service for online model inference. It offers automatic scaling, blue-green deployment, and resource group management. PAI-EAS supports real-time and near-real-time synchronous inference scenarios and provides a flexible infrastructure, efficient container scheduling, and simplified model deployment, with the aim of high throughput and low latency for various AI applications.

LangChain

LangChain is a robust framework for developing applications powered by language models. It enables context-aware applications that can connect a language model to various context sources, such as prompt instructions, few-shot examples, or content to ground its response in. LangChain allows applications to reason and make decisions based on the provided context, utilizing the capabilities of the language model.

The LangChain framework consists of several components. The LangChain Libraries provide Python and JavaScript libraries with interfaces and integrations for working with language models. These libraries include a runtime for combining features into chains and agents and pre-built implementations of chains and agents for different tasks. The LangChain Templates offer easily deployable reference architectures for various applications.
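To make the idea of a chain concrete, here is a minimal plain-Python sketch that does not use LangChain itself. It only illustrates the pattern of filling a prompt template and passing the result to a model. The names make_chain and fake_llm are hypothetical, and fake_llm is a stand-in for a real language model.

```python
from typing import Callable


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[..., str]:
    """Return a function that fills the template and then calls the model."""

    def run(**variables: str) -> str:
        prompt = template.format(**variables)  # step 1: build the prompt
        return llm(prompt).strip()  # step 2: call the model and tidy the output

    return run


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model: reports the prompt length."""
    return f" received {len(prompt)} characters "


chain = make_chain("Question: {question}\nAnswer:", fake_llm)
result = chain(question="What is PAI-EAS?")
print(result)
```

In a real application, the LangChain libraries supply the template, model, and chain objects, so the stand-in function would be replaced by an actual model integration.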

By using LangChain, developers can simplify the entire application lifecycle. They can develop applications using LangChain libraries and templates, productionize them by inspecting and monitoring with LangSmith, and deploy the chains as APIs using LangServe.

LangChain offers standard and extendable interfaces and integrations for modules such as Model I/O, Retrieval, and Agents. It provides many resources, including use cases, walkthroughs, best practices, API references, and a developer’s guide. The LangChain community is active and supportive, offering places to ask questions, share feedback, and collaborate on the future of language model-powered applications.

In this tutorial, we will walk through the steps to set up PAI-EAS, deploy a chat model, and run it using different configurations. Here is the documentation link.

Steps to Build LangChain and LLM with PAI-EAS

Step 1: Setting Up LLM on PAI-EAS

We begin by launching an LLM on PAI-EAS. Here, you can find a tutorial or documentation on how to set up EAS and obtain the PAI-EAS service URL and token. These credentials are essential for connecting to the EAS service.

Suppose you would like to try the Qwen model. To do so, modify the “Command to Run” in field number 4. Qwen is open-source and comes as a series of models: the base language models Qwen-7B and Qwen-14B, and the chat models Qwen-7B-Chat and Qwen-14B-Chat.

python api/api_server.py --port=8000 --model-path=Qwen/Qwen-7B-Chat
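If you switch between models often, a small helper can assemble the command string for you. This is only a sketch: build_run_command is a hypothetical name, and the script path and flags are copied from the command above rather than from any official reference.

```python
import shlex


def build_run_command(model_path: str, port: int = 8000) -> str:
    """Build the "Command to Run" string for a given model path."""
    if not model_path:
        raise ValueError("model_path must not be empty")
    args = [
        "python",
        "api/api_server.py",
        f"--port={port}",
        f"--model-path={model_path}",
    ]
    # shlex.join quotes any argument that would be unsafe in a shell
    return shlex.join(args)


command = build_run_command("Qwen/Qwen-7B-Chat")
print(command)
```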

Step 2: Environment Variables

Next, set the PAI-EAS service URL and token as environment variables. You can export these variables in your terminal or set them programmatically in your code. This keeps the credentials out of your source code while still letting your application reach the EAS service.

Export variables in the terminal:

export EAS_SERVICE_URL=XXX
export EAS_SERVICE_TOKEN=XXX
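On the Python side, it can help to fail early with a clear message when either variable is missing. The sketch below assumes only the two variable names used above. The helper name load_eas_credentials is hypothetical, and the demo values are placeholders.

```python
import os
from typing import Mapping, Tuple

REQUIRED_VARIABLES = ("EAS_SERVICE_URL", "EAS_SERVICE_TOKEN")


def load_eas_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (url, token) from the environment, or raise a clear error."""
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["EAS_SERVICE_URL"], env["EAS_SERVICE_TOKEN"]


# Demo with placeholder values; in real use, call load_eas_credentials()
# with no arguments so that it reads os.environ.
demo_env = {"EAS_SERVICE_URL": "placeholder-url", "EAS_SERVICE_TOKEN": "placeholder-token"}
url, token = load_eas_credentials(demo_env)
print("URL loaded:", bool(url), "| token loaded:", bool(token))
```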

To get these values from PAI-EAS, follow the steps below to find the host URL and token required for the code:

Go to your deployed model on PAI-EAS, as shown above.
Click Service Details to access the detailed information on the deployed model.
Look for the View Endpoint Information section and click it.

In the Public Endpoint field, you will find the host URL. Copy this URL.
Next, locate the Authorization field, which contains the token required for authentication. Copy this token.
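To show where these two values end up, the sketch below builds (but does not send) an HTTP request with the token in the Authorization header. Treat it as an illustration under assumptions: the URL is a made-up placeholder, build_eas_request is a hypothetical helper, and the JSON payload with a "prompt" field is an assumed shape that depends on the service you deployed. In practice, the LangChain classes used in the next steps handle the request for you.

```python
import json
import urllib.request


def build_eas_request(service_url: str, token: str, prompt: str) -> urllib.request.Request:
    """Prepare a POST request for an EAS endpoint without sending it."""
    # Assumed payload shape; the real format depends on the deployed service.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        service_url,
        data=body,
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL and token, for illustration only
request = build_eas_request("http://example.invalid/api/predict/demo", "demo-token", "hello")
print(request.get_method(), request.full_url)
```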

Step 3: Import Dependencies

Now import the necessary dependencies in your code. This includes the PaiEasChatEndpoint class from the langchain.chat_models module and the HumanMessage class from the langchain.chat_models.base module. These classes provide the foundation for running chat models with PAI-EAS.

import os

from langchain.chat_models import PaiEasChatEndpoint
from langchain.chat_models.base import HumanMessage

os.environ["EAS_SERVICE_URL"] = "Your_EAS_Service_URL"
os.environ["EAS_SERVICE_TOKEN"] = "Your_EAS_Service_Token"
chat = PaiEasChatEndpoint(
    eas_service_url=os.environ["EAS_SERVICE_URL"],
    eas_service_token=os.environ["EAS_SERVICE_TOKEN"],
)

Step 4a: Initializing the Chat Model

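Create an instance of the PaiEasChatEndpoint class by passing the PAI-EAS service URL and token as parameters, as done at the end of the Step 3 snippet. This initializes the connection to the EAS service and prepares it for chat model interactions. You can then send a first message to check that the connection works: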

output = chat([HumanMessage(content="write a funny joke")])
print("output:", output)

Step 4b: Running the Chat Model

Run the chat model by calling the chat object with a list containing a HumanMessage that holds the desired input content. By default, the chat model uses its default inference settings. You can customize inference by passing additional keyword arguments such as temperature, top_p, and top_k, which lets you tune the model’s behavior to your requirements.

kwargs = {"temperature": 0.8, "top_p": 0.8, "top_k": 5}
output = chat([HumanMessage(content="write a funny joke")], **kwargs)
print("output:", output)
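As a rough guide, temperature scales how random the sampling is, top_p keeps only the most likely tokens whose cumulative probability reaches that value, and top_k keeps only the k most likely tokens. The sketch below checks the values before they are sent. The helper name build_inference_kwargs is hypothetical, and the accepted ranges are common conventions assumed here, not limits documented for PAI-EAS.

```python
from typing import Any, Dict


def build_inference_kwargs(
    temperature: float = 0.8, top_p: float = 0.8, top_k: int = 5
) -> Dict[str, Any]:
    """Validate sampling parameters and return them as keyword arguments."""
    # Assumed ranges; the deployed service may accept different ones.
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in the range (0, 1]")
    if not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    return {"temperature": temperature, "top_p": top_p, "top_k": top_k}


kwargs = build_inference_kwargs()
print(kwargs)
```

The returned dictionary can be unpacked into the call in the same way as the kwargs dictionary in the snippet above.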

Here is the list of other parameters for PaiEasEndpoint:

Step 4c: Stream Response (Optional)

To obtain a streamed response for more dynamic interactions, call the stream method with the streaming parameter set to True, then iterate over the outputs to process each chunk in real time.

outputs = chat.stream([HumanMessage(content="hi")], streaming=True)
for output in outputs:
    print("stream output:", output)
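A common pattern is to print each chunk as it arrives while also collecting the full text for later use. The sketch below uses a fake stream so that it runs without a deployed service. FakeChunk and fake_stream are hypothetical stand-ins, and the sketch assumes that each streamed chunk exposes its text in a content attribute.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed message chunk."""

    content: str


def fake_stream() -> Iterator[FakeChunk]:
    """Yield a few chunks, mimicking a streamed reply."""
    for piece in ["Hel", "lo", " there"]:
        yield FakeChunk(content=piece)


def collect_stream(chunks: Iterable) -> str:
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    for chunk in chunks:
        # Fall back to str(chunk) if the chunk has no content attribute
        text = getattr(chunk, "content", str(chunk))
        print(text, end="", flush=True)
        parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)


full_text = collect_stream(fake_stream())
```

With a real endpoint, the iterator returned by chat.stream would take the place of fake_stream().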

Bonus: Ready-to-Run Code

import os

from langchain.chains import LLMChain
from langchain.llms.pai_eas_endpoint import PaiEasEndpoint
from langchain.prompts import PromptTemplate

# Initialize the EAS service endpoint (replace the placeholders with your own values)
os.environ["EAS_SERVICE_URL"] = "EAS_SERVICE_URL"
os.environ["EAS_SERVICE_TOKEN"] = "EAS_SERVICE_TOKEN"
llm = PaiEasEndpoint(
    eas_service_url=os.environ["EAS_SERVICE_URL"],
    eas_service_token=os.environ["EAS_SERVICE_TOKEN"],
)

# Accessing EAS LLM services:
# Access method 1: Direct access
kwargs = {"temperature": 0.8, "top_p": 0.8, "top_k": 5}
output = llm("Say foo:", **kwargs)
print("output:", output)

# Access method 2: Template-based access
# 2.1 Prepare the question template
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
# 2.2 Wrap with LLMChain
llm_chain = LLMChain(prompt=prompt, llm=llm)
# 2.3 Access the service
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
print("chain output:", llm_chain.run(question))

Conclusion

Alibaba Cloud PAI blends a cutting-edge platform with advanced technology and captivating features, creating a transformative experience in the realm of GenAI. Brace yourself for a journey where language becomes a canvas for AI-powered creativity and innovation. Get ready to unlock the true potential of LLMs with PAI, your gateway to the future of GenAI!

With Alibaba Cloud PAI-EAS, deploying and running chat models becomes seamless and efficient. Using the above steps and information, you can unlock the full potential of chat models and create intelligent conversational experiences. PAI-EAS empowers developers to harness the capabilities of machine learning with ease while providing high throughput, low latency, and comprehensive operations and maintenance features. Start leveraging chat models within your applications today and embark on a journey of exciting AI-driven conversations.

Experience the power of LLMs and LangChain on Alibaba Cloud’s PAI-EAS and embark on a journey of accelerated performance and cost savings. Contact Alibaba Cloud to explore the world of generative AI and discover how it can transform your applications and business.