Structured Response from LLMs
Large language models (LLMs) have revolutionized the way we interact with unstructured text data. They can search for specific information, summarize key points, and even answer straightforward yes-or-no questions with corresponding explanations.
However, for us developers, the outputs generated by LLMs can be cumbersome to handle. These models can certainly generate a paragraph based on our requirements, but the output is unstructured, which poses a challenge when we prefer structured data. Instead of presenting users with the raw output from the LLM, we want the flexibility of structured data.
Making LLMs Generate Structured Data
Function calling is a feature offered by OpenAI on gpt-3.5-turbo-0613 and gpt-4-0613. It lets us describe predefined functions to the LLM; the model can then respond with the name of a function to call and the arguments to pass it, which our code can execute (for example, to make an API call) before feeding the result back to the model. For instance, the following function can be provided to the Chat Completions API:
get_current_weather(location: string, unit: 'celsius' | 'fahrenheit')
Upon prompting the LLM with a query like "What is the weather like in London?", GPT responds with a call to the get_current_weather() function with the location set to "London". After the function is executed, its output can be used to generate a response such as "It's 30 degrees Celsius in London." Impressive, isn't it?
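Concretely, the Chat Completions API expects each function to be described as a JSON-Schema-style object passed in the functions parameter. A minimal sketch of how get_current_weather might be declared (the description strings here are illustrative assumptions, not OpenAI's wording):

```python
# Illustrative sketch: a JSON-Schema-style description of get_current_weather,
# in the shape the Chat Completions "functions" parameter expects.
get_current_weather_fn = {
    "name": "get_current_weather",
    "description": "Get the current weather for a location.",  # assumed wording
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city to look up, e.g. London.",
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
            },
        },
        "required": ["location"],
    },
}
```

The model never runs this function itself; it only fills in arguments matching the schema, and our code does the actual work.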
Let's delve even deeper!
What if, instead of retrieving data for the LLM through function calls, we equip it with a function to log our desired actions? Let's assume we specify some parameters for the log entry output and instruct the LLM to log this information. You might be surprised to find that GPT will happily comply!
The Code in Action
Let's walk through a sentiment analysis example. Our aim is for GPT to identify the sentiment in an article, assign a sentiment score, and provide instances that reinforce the identified sentiment.
We can shape our structured response using the options available in the function's parameters. To log the sentiment analysis, we can detail the parameters as a function, as outlined below:
structuredResponseFn = {
    "name": "logger",
    "description": "The logger function takes a given text and logs the sentiment, a sentiment score, and supporting evidence from the text.",
    "parameters": {
        "type": "object",
        "properties": {
            "sentiment": {
                "type": "string",
                "description": "The overarching sentiment of the article.",
            },
            "sentimentScore": {
                "type": "number",
                "description": "A number between 0-100 describing the strength of the sentiment.",
            },
            "supportingEvidence": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "example": {
                            "type": "string",
                            "description": "An example of the sentiment in the text.",
                        },
                        "score": {
                            "type": "number",
                            "description": "A number between 0-100 describing the strength of the sentiment example.",
                        },
                    },
                    "required": ["example", "score"],
                },
                "description": "A list of supporting evidence for the sentiment, sorted by score.",
            },
        },
        "required": [
            "sentiment",
            "sentimentScore",
            "supportingEvidence",
        ],
    },
}
Next, we'll use the following prompt for GPT, allowing it to utilize the function and generate a response based on sentiment analysis. Remember, GPT might not fill in all the details, so be sure to prompt it for all the return values you'd like it to respond with.
structuredResponseContent = f"""{article}
Log the sentiment of the article and provide the top 3 supporting evidences.
"""
Here's the Python snippet that brings the two code pieces together:
import json

import openai

def run_structured_response(structuredResponseContent, structuredResponseFn):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-0613",
        messages=[{"role": "user", "content": structuredResponseContent}],
        functions=[structuredResponseFn],
        temperature=0.1,
        function_call="auto",
    )
    response_message = response["choices"][0]["message"]
    if response_message.get("function_call"):
        # The model returns the function arguments as a JSON string
        function_args = json.loads(response_message["function_call"]["arguments"])
        print(function_args)  # print out the structured response
        return function_args
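If you want to exercise the argument-parsing step without hitting the API, you can run it against a hand-made response of the same shape the (pre-1.0) openai SDK returns. The values below are made up for illustration:

```python
import json

# A hand-made stand-in for a Chat Completions response (values are made up)
mock_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "function_call": {
                    "name": "logger",
                    "arguments": '{"sentiment": "negative", "sentimentScore": 80, "supportingEvidence": []}',
                },
            }
        }
    ]
}

response_message = mock_response["choices"][0]["message"]
if response_message.get("function_call"):
    # The arguments arrive as a JSON string and must be parsed before use
    function_args = json.loads(response_message["function_call"]["arguments"])
    print(function_args["sentiment"])  # → negative
```

This also makes the parsing logic easy to unit-test without an API key.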
Experimenting
Let's put this function to the test using a sample customer complaint letter. As the letter is a complaint, we can anticipate a negative sentiment. Let's explore how well GPT performs.
Here is the output after invoking `run_structured_response`. We received a JSON object that reflects the sentiment type, a sentiment score, and the top 3 instances from the text that back up the sentiment.
{
  "sentiment": "negative",
  "sentimentScore": 80,
  "supportingEvidence": [
    {
      "example": "The sofa is defective.",
      "score": 90
    },
    {
      "example": "One of the legs broke off on March 31, 2021.",
      "score": 85
    },
    {
      "example": "The store manager, Aaron, would not speak to me.",
      "score": 75
    }
  ]
}
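One caveat: the model is not guaranteed to respect the schema, so it's worth validating the parsed arguments before relying on them downstream. A minimal sanity check (the field names match the logger schema above; the helper name and range checks are my own assumptions):

```python
def validate_sentiment_log(args):
    """Return True if the parsed function arguments match the expected shape."""
    required = ("sentiment", "sentimentScore", "supportingEvidence")
    if not all(key in args for key in required):
        return False
    if not 0 <= args["sentimentScore"] <= 100:
        return False
    # Each piece of evidence must carry an example string and an in-range score
    return all(
        isinstance(item.get("example"), str) and 0 <= item.get("score", -1) <= 100
        for item in args["supportingEvidence"]
    )

sample = {
    "sentiment": "negative",
    "sentimentScore": 80,
    "supportingEvidence": [{"example": "The sofa is defective.", "score": 90}],
}
print(validate_sentiment_log(sample))  # → True
```

If validation fails, a simple fallback is to re-prompt the model with the error, or to fall back to the raw text response.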