Guide for creating and using multimodal AI agents in Praison Labs for processing images, videos, and other media types
Install Package
First, install the Praison Labs Agents package:
Set API Key
Set your OpenAI API key as an environment variable in your terminal:
Create a file
Create a new file app.py
with the
basic setup:
Start Agents
Type this in your terminal to run your agents:
Install Package
First, install the Praison Labs Agents package:
Set API Key
Set your OpenAI API key as an environment variable in your terminal:
Create a file
Create a new file app.py
with the
basic setup:
Start Agents
Type this in your terminal to run your agents:
Install Package
Install the Praison Labs package:
Set API Key
Set your OpenAI API key as an environment variable in your terminal:
Create a file
Create a new file agents.yaml
with
the basic setup:
Start Agents
Type this in your terminal to run your agents:
Requirements
Multimodal agents are designed to:
Analyze images, detect objects, and understand visual content.
Process video content for events and actions.
Extract and analyze text from images and documents.
Integrate insights across different media types.
Extract and analyze text from document images.
Monitor security feeds for suspicious activity.
Analyze medical scans for abnormalities.
Study architectural features and designs.
Learn about automatically created and managed AI agents
Explore lightweight, focused AI agents
For optimal results, ensure your media files are in supported formats and sizes for processing.