How to use @postman to test LLMs with audio data (Transcribe and Understand)

AssemblyAI
Science & Technology | 2 min read | 21 min video
May 13, 2024|3,200 views|66|3

TL;DR

Use Postman to test the AssemblyAI API for audio transcription and LLM-driven insights.

Key Insights

1. Postman is a valuable tool for initial API testing and understanding responses before coding.

2. AssemblyAI API can transcribe audio/video files via URL or direct upload.

3. Speaker diarization can identify different speakers within an audio file.

4. AssemblyAI's LeMUR framework enables LLM-driven analysis of audio transcripts.

5. LeMUR can extract action items, answer questions, and summarize audio content.

6. Postman's browser version may time out on long requests; a desktop agent can help.

INTRODUCTION TO POSTMAN AND ASSEMBLYAI

This tutorial demonstrates how to use Postman to interact with AssemblyAI's API for audio processing and Large Language Model (LLM) applications. Postman is presented as a good tool for beginners who want to understand API requests and responses before writing any code. It lets you test endpoints and parameters directly and inspect the structure of the returned data, which flattens the learning curve for a new API. The same approach is useful for exploring any API, not only the one covered here.

AUDIO TRANSCRIPTION VIA POSTMAN

The process begins with a POST request in Postman to AssemblyAI's transcript endpoint. The request must carry your AssemblyAI API key in the Authorization header and set the content type to 'application/json'. An audio or video file can be processed by passing a publicly accessible URL in the request body. Alternatively, a local file can be uploaded to AssemblyAI first: this uses the upload endpoint instead, with the content type changed to 'application/octet-stream' and the file sent as the request body.
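As a rough sketch of what Postman sends in this step, the two requests could be assembled in Python as shown here. The endpoint URLs and the `audio_url` body field are assumptions that this summary does not state, and the helper names are hypothetical. The code only builds the requests; nothing is sent over the network.

```python
import json
import urllib.request

# Assumption: these endpoint URLs are not given in the summary.
TRANSCRIPT_URL = "https://api.assemblyai.com/v2/transcript"
UPLOAD_URL = "https://api.assemblyai.com/v2/upload"


def build_transcript_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST that starts a transcription job."""
    # Assumption: the JSON body names the file location "audio_url".
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        TRANSCRIPT_URL,
        data=body,
        method="POST",
        headers={"Authorization": api_key, "Content-Type": "application/json"},
    )


def build_upload_request(api_key: str, file_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST that uploads raw file bytes."""
    return urllib.request.Request(
        UPLOAD_URL,
        data=file_bytes,
        method="POST",
        headers={
            "Authorization": api_key,
            "Content-Type": "application/octet-stream",
        },
    )
```

To actually send one of these, pass it to `urllib.request.urlopen` and parse the JSON response, which is the same information Postman shows in its response pane.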

RETRIEVING TRANSCRIPTION RESULTS

After starting a transcription job, you poll for the result from Postman. Send a GET request to the same transcript endpoint with the transcript ID from the initial POST response appended to the URL. The response reports the current status of the job. Once the job has completed, the response includes the full text transcript along with other metadata, such as the audio URL and the model used.
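The polling step can be sketched as a small loop. This is a minimal illustration, not AssemblyAI's client code: the `status` field and its `completed` and `error` values are assumptions about the response shape, and `fetch` is a hypothetical callable that performs the GET request and returns the parsed JSON, so the loop can be tried without network access.

```python
import time
from typing import Callable


def poll_transcript(
    fetch: Callable[[str], dict],
    transcript_id: str,
    interval_s: float = 3.0,
    max_attempts: int = 100,
) -> dict:
    """Call fetch(transcript_id) until the job completes, fails, or we give up."""
    for _ in range(max_attempts):
        result = fetch(transcript_id)
        status = result.get("status")  # assumed field name
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        # Any other status (for example queued or processing): wait and retry.
        time.sleep(interval_s)
    raise TimeoutError(f"transcript {transcript_id} not ready after {max_attempts} attempts")
```

In Postman the equivalent is simply pressing Send on the GET request again until the status changes.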

LEVERAGING CONVERSATIONAL INTELLIGENCE FEATURES

AssemblyAI offers several 'conversational intelligence' models beyond basic transcription. Each is enabled by setting a parameter in the initial transcription request. For instance, setting 'speaker_labels' to true activates speaker diarization, which identifies and labels the different speakers in the audio. The results appear in the 'utterances' section of the response, which attributes each text segment to a speaker and includes word-level timestamps.
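To make the diarization output concrete, here is a small sketch that turns a completed transcript response into readable lines. It assumes the response holds a list under `utterances`, that each entry has `speaker`, `text`, and `start` fields, and that `start` is in milliseconds; none of these details are spelled out in this summary, and the function name is hypothetical.

```python
def format_utterances(transcript: dict) -> list[str]:
    """Return one 'timestamp, speaker, text' line per utterance."""
    lines = []
    # The list may be absent or null when speaker_labels was not enabled.
    for utterance in transcript.get("utterances") or []:
        start_s = utterance["start"] / 1000  # assumed to be milliseconds
        lines.append(
            f"[{start_s:.1f}s] Speaker {utterance['speaker']}: {utterance['text']}"
        )
    return lines
```

For a response containing two utterances from speakers A and B, this yields two lines such as `[0.0s] Speaker A: Hi there.`, which is easier to scan than the raw JSON in Postman.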

USING LEMUR FOR LLM-POWERED ANALYSIS

AssemblyAI's LeMUR framework exposes LLM capabilities directly through the API for deeper analysis of transcripts. The tutorial demonstrates two use cases: extracting action items and answering specific questions. The 'extract action items' endpoint accepts one or more transcript IDs and returns a list of the actionable tasks discussed, for example in a meeting. You can specify the output format and supply context to guide LeMUR's analysis.
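A request body for the action-items call might be put together as in the sketch below. The endpoint path and the `transcript_ids`, `context`, and `answer_format` field names are assumptions not confirmed by this summary, and the helper name is hypothetical. Optional fields are left out of the body when they are not provided.

```python
from typing import Optional

# Assumption: this endpoint URL is not given in the summary.
ACTION_ITEMS_URL = "https://api.assemblyai.com/lemur/v3/generate/action-items"


def build_action_items_payload(
    transcript_ids: list[str],
    context: Optional[str] = None,
    answer_format: Optional[str] = None,
) -> dict:
    """Build the JSON body for an action-items request over one or more transcripts."""
    if not transcript_ids:
        raise ValueError("at least one transcript ID is required")
    payload = {
        "transcript_ids": list(transcript_ids),
        "context": context,              # background that guides the analysis
        "answer_format": answer_format,  # free-text description of the output shape
    }
    # Drop optional fields that were not supplied.
    return {key: value for key, value in payload.items() if value is not None}
```

The resulting dictionary corresponds to what would be pasted into Postman's raw JSON body tab for the POST request.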

ADVANCED QUERYING AND RESPONSE FORMATTING

LeMUR's question-answering capability is demonstrated by sending POST requests to a dedicated endpoint. You can supply your own questions, specify the desired answer format, and choose the underlying LLM. The 'max_output_size' parameter limits the length of the response, and 'temperature' controls how creative or deterministic it is. The tutorial also notes that requests from the browser version of Postman that run longer than 30 seconds may time out, and that using the desktop agent avoids this.
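The question-answering request can be sketched in the same way. Only 'max_output_size' and 'temperature' are named in the text above; the `questions`, `question`, `answer_format`, `transcript_ids`, and `final_model` field names are assumptions, as is the hypothetical helper name, and no specific model identifier is implied.

```python
from typing import Optional


def build_question_answer_payload(
    transcript_ids: list[str],
    questions: list[dict],
    final_model: Optional[str] = None,
    max_output_size: Optional[int] = None,
    temperature: Optional[float] = None,
) -> dict:
    """Build the JSON body for a question-answer request.

    Each entry in `questions` is a dict with a "question" key and,
    optionally, an "answer_format" key (assumed field names).
    """
    for entry in questions:
        if not entry.get("question"):
            raise ValueError("every entry needs a non-empty 'question'")
    payload = {
        "transcript_ids": list(transcript_ids),
        "questions": [dict(entry) for entry in questions],
        "final_model": final_model,          # which LLM answers (assumed name)
        "max_output_size": max_output_size,  # caps the response length
        "temperature": temperature,          # lower values are more deterministic
    }
    # Omit tuning parameters that were not supplied.
    return {key: value for key, value in payload.items() if value is not None}
```

Starting with a low temperature and a modest 'max_output_size' keeps responses short and repeatable while experimenting, and both can be adjusted per request in Postman.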

Common Questions

Postman is a great tool for testing APIs before integrating them into a codebase. It allows you to send HTTP requests, inspect responses, and experiment with different parameters without writing any code.
