How to use @postman to test LLMs with audio data (Transcribe and Understand)
Key Moments
Use Postman to test AssemblyAI API for audio transcription and LLM insights.
Key Insights
Postman is a valuable tool for initial API testing and understanding responses before coding.
AssemblyAI API can transcribe audio/video files via URL or direct upload.
Speaker diarization can identify different speakers within an audio file.
AssemblyAI's LeMUR framework enables LLM-driven analysis of audio transcripts.
LeMUR can extract action items, answer questions, and summarize audio content.
Postman's browser version may time out on long requests; a desktop agent can help.
INTRODUCTION TO POSTMAN AND ASSEMBLYAI
This tutorial demonstrates using Postman to interact with AssemblyAI's API for audio processing and Large Language Model (LLM) applications. Postman is presented as a good tool for beginners who want to understand API requests and responses before writing any code. It lets users test endpoints and parameters directly and inspect the structure of the returned data, which flattens the learning curve for a new API. The same workflow applies to exploring any other API, not only the one covered in this tutorial.
AUDIO TRANSCRIPTION VIA POSTMAN
The process begins with a POST request in Postman to AssemblyAI's transcript endpoint. The request carries the AssemblyAI API key in the authorization header and sets the content type to 'application/json'. Audio or video files can be processed by passing a publicly accessible URL in the request body. Alternatively, a local file can first be sent to AssemblyAI's upload endpoint with the content type changed to 'application/octet-stream'; the upload URL returned by that call is then used in the transcript request.
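The two requests described above can be sketched in Python as follows. This is a minimal sketch rather than AssemblyAI's official client: the base URL and the '/v2/transcript' and '/v2/upload' paths are assumptions based on AssemblyAI's public documentation and should be checked against the current API reference, and the function names are made up for illustration. The functions only build the requests; nothing is sent until a request is passed to urllib.request.urlopen.

```python
import json
import urllib.request

# Assumed base URL and paths; confirm them against AssemblyAI's current API reference.
API_BASE = "https://api.assemblyai.com"


def build_transcript_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (without sending) the POST that starts a transcription job."""
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v2/transcript",
        data=body,
        method="POST",
        headers={"Authorization": api_key, "Content-Type": "application/json"},
    )


def build_upload_request(api_key: str, file_bytes: bytes) -> urllib.request.Request:
    """Build (without sending) the POST that uploads raw audio bytes."""
    return urllib.request.Request(
        f"{API_BASE}/v2/upload",
        data=file_bytes,
        method="POST",
        headers={
            "Authorization": api_key,
            "Content-Type": "application/octet-stream",
        },
    )
```

Building the request separately from sending it mirrors what Postman shows on screen: the method, URL, headers, and body can all be inspected before anything goes over the network.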
RETRIEVING TRANSCRIPTION RESULTS
After initiating a transcription job, Postman is used to poll for the results. A GET request is sent to the same transcript endpoint with the unique transcript ID from the initial POST response appended to the URL. The response reports the status of the transcription job. Once the job has completed, the response includes the full text transcript along with other metadata, such as the audio URL and the model used.
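In code, the repeated GET request becomes a polling loop. The sketch below assumes the status field takes the values 'queued', 'processing', 'completed', and 'error', and that the transcript path is '/v2/transcript/<id>'; both should be verified against AssemblyAI's documentation. The names fetch_transcript and poll_transcript are hypothetical helpers, and the fetch function is passed in so the loop can be exercised without a network call.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumed base URL; confirm against AssemblyAI's current API reference.
API_BASE = "https://api.assemblyai.com"


def fetch_transcript(api_key: str, transcript_id: str) -> dict:
    """Send one GET request for the transcript and return the parsed JSON."""
    req = urllib.request.Request(
        f"{API_BASE}/v2/transcript/{transcript_id}",
        headers={"Authorization": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def poll_transcript(
    fetch: Callable[[str], dict],
    transcript_id: str,
    interval_s: float = 3.0,
    max_attempts: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(transcript_id) until the job completes, fails, or attempts run out."""
    for _ in range(max_attempts):
        result = fetch(transcript_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(interval_s)  # job is still queued or processing
    raise TimeoutError(f"transcript {transcript_id} not ready after {max_attempts} attempts")
```

A real call would look like poll_transcript(lambda tid: fetch_transcript(api_key, tid), transcript_id), where api_key and transcript_id come from the earlier steps.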
LEVERAGING CONVERSATIONAL INTELLIGENCE FEATURES
AssemblyAI offers various 'conversational intelligence' models beyond basic transcription. These are enabled by setting specific parameters in the initial transcription request. For instance, setting 'speaker_labels' to true activates speaker diarization, which identifies and labels the different speakers throughout the audio. The results, found in the 'utterances' section of the response, attribute each text segment to a speaker and include word-level timestamps.
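As an illustration, the sketch below builds a request body with speaker diarization enabled and turns the speaker-attributed segments of a completed response into readable lines. It assumes each entry in the response's utterances list carries 'speaker', 'text', and a 'start' time in milliseconds; those field names and units follow AssemblyAI's public documentation as best understood here and should be confirmed. Both function names are hypothetical.

```python
def build_diarization_body(audio_url: str) -> dict:
    """Request body for a transcription job with speaker diarization enabled."""
    return {"audio_url": audio_url, "speaker_labels": True}


def format_utterances(transcript: dict) -> list[str]:
    """Render each utterance as '[start] Speaker X: text'."""
    lines = []
    # The list may be missing or null if speaker_labels was not enabled.
    for utterance in transcript.get("utterances") or []:
        start_s = utterance["start"] / 1000  # assumed to be milliseconds
        lines.append(
            f"[{start_s:.1f}s] Speaker {utterance['speaker']}: {utterance['text']}"
        )
    return lines
```

Given a response containing two utterances from speakers A and B, format_utterances returns one line per utterance in the order the API returned them.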
USING LEMUR FOR LLM-POWERED ANALYSIS
AssemblyAI's LeMUR framework allows direct LLM interaction through the API for advanced analysis. Two key use cases demonstrated are extracting action items and answering specific questions from transcripts. The 'extract action items' endpoint can process one or more transcript IDs to generate a list of actionable tasks discussed in meetings. Users can specify output formats and provide context to guide LeMUR's analysis.
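A request body for the action-items call might be assembled as sketched below. The endpoint path and the 'transcript_ids', 'context', and 'answer_format' field names are assumptions drawn from AssemblyAI's public LeMUR documentation and may have changed, so treat them as placeholders to verify. The helper name is hypothetical.

```python
from typing import Iterable, Optional

# Assumed endpoint; confirm against AssemblyAI's current LeMUR documentation.
LEMUR_ACTION_ITEMS_URL = "https://api.assemblyai.com/lemur/v3/generate/action-items"


def build_action_items_payload(
    transcript_ids: Iterable[str],
    context: Optional[str] = None,
    answer_format: Optional[str] = None,
) -> dict:
    """Build the JSON body for an action-items request over one or more transcripts."""
    ids = list(transcript_ids)
    if not ids:
        raise ValueError("at least one transcript ID is required")
    payload: dict = {"transcript_ids": ids}
    # Optional fields are left out entirely when not provided.
    if context is not None:
        payload["context"] = context
    if answer_format is not None:
        payload["answer_format"] = answer_format
    return payload
```

The resulting dictionary is what would be pasted into Postman's raw JSON body, or serialized with json.dumps and sent as a POST with the same authorization header used for transcription.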
ADVANCED QUERYING AND RESPONSE FORMATTING
LeMUR's question-answering capability is demonstrated by sending POST requests to a dedicated endpoint. Users can supply their own questions, specify the desired answer format, and choose which LLM handles the request. Parameters such as 'max_output_size' and 'temperature' control the length and creativity of the response. The tutorial also notes that requests sent from the browser version of Postman may time out if they run longer than 30 seconds, in which case the desktop agent is needed.
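The question-answering body can be sketched the same way. The 'max_output_size' and 'temperature' names come from the summary above; the endpoint path, the 'questions' list of objects, and the 'final_model' field are assumptions based on AssemblyAI's public LeMUR documentation and should be verified. The helper name is hypothetical, and no specific model identifier is assumed.

```python
from typing import Iterable, Optional, Tuple

# Assumed endpoint; confirm against AssemblyAI's current LeMUR documentation.
LEMUR_QA_URL = "https://api.assemblyai.com/lemur/v3/generate/question-answer"


def build_question_answer_payload(
    transcript_ids: Iterable[str],
    questions: Iterable[Tuple[str, Optional[str]]],
    final_model: Optional[str] = None,
    max_output_size: Optional[int] = None,
    temperature: Optional[float] = None,
) -> dict:
    """Build the JSON body for a question-answer request.

    Each item in `questions` is a (question, answer_format) pair;
    answer_format may be None to leave the format up to the model.
    """
    question_objects = []
    for question, answer_format in questions:
        entry = {"question": question}
        if answer_format is not None:
            entry["answer_format"] = answer_format
        question_objects.append(entry)

    payload: dict = {
        "transcript_ids": list(transcript_ids),
        "questions": question_objects,
    }
    # Compare against None so that a temperature of 0.0 is still sent.
    if final_model is not None:
        payload["final_model"] = final_model
    if max_output_size is not None:
        payload["max_output_size"] = max_output_size
    if temperature is not None:
        payload["temperature"] = temperature
    return payload
```

Lower temperature values generally make the answers more deterministic, which tends to suit factual questions about a transcript.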
Common Questions
Postman is a great tool for testing APIs before integrating them into a codebase. It allows you to send HTTP requests, inspect responses, and experiment with different parameters without writing any code.
Topics
Mentioned in this video
API key: A unique key required to authenticate requests to the AssemblyAI API, obtained from the AssemblyAI dashboard.
application/octet-stream: The content type required for uploading audio files directly to the AssemblyAI API.
Postman: A platform for API development and testing, used in the video to interact with AssemblyAI's API.
Action items: Tasks or follow-up items generated from meeting discussions, extractable using AssemblyAI's LLM features.
LeMUR: AssemblyAI's framework for large language models, used for tasks like generating summaries, action items, and answering questions from audio data.
A company whose meeting recordings were used as an example for transcription and analysis.
Speaker diarization: A feature of AssemblyAI that identifies and separates different speakers within an audio file.