
UPCOMING
Vision Possible: Agent Protocol
Your mission, should you choose to accept it: Build multi-modal AI agents that watch, listen, and understand video in real-time.
23 Feb - 1 Mar
$6,000+
+ exclusive agent swag
+ interview at WeMakeDevs
Mission Briefing
Your mission, should you choose to accept it: Build multi-modal AI agents that watch, listen, and understand video in real-time. Vision Agents gives you the building blocks to create intelligent, low-latency video experiences powered by your models, your infrastructure, and your use cases. Whether you're building security systems, sports coaching AI, interactive gaming, or something we haven't imagined yet - this hackathon is your proving ground. This message will self-destruct... after you build something amazing.
Your Mission Objectives
In the world of AI, video remains the final frontier. Static image analysis is yesterday's mission. Real-time video understanding is the protocol for the future.
Whether you're building security systems, sports coaching AI, drone detection, or interactive gaming experiences - this hackathon is your chance to push the boundaries of what's possible with real-time Vision AI.
Video AI
Real-time video intelligence
Combine YOLO, Roboflow, Moondream, and other vision models with Gemini/OpenAI in real-time. Build agents that truly see and understand.
Ultra-Low Latency
Stream's edge network
Join in 500 ms and keep audio/video latency under 30 ms. Your agents respond in real-time, not real-later.
Native LLM APIs
Direct access to the latest models
Call native SDK methods from OpenAI, Gemini, and Claude directly, so you always have access to the latest LLM capabilities without waiting for wrapper updates.
Cross-Platform SDKs
Build anywhere
SDKs for React, Android, iOS, Flutter, React Native, and Unity. Your vision agents can run on any platform.
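One way to picture the "fast vision model plus LLM" combination described above is a per-frame loop that runs a cheap detector on every frame and calls the slower language model only occasionally. The sketch below is a plain-Python illustration under stated assumptions, not the Vision Agents API: `FramePipeline`, `fake_detect`, `fake_describe`, and `llm_every_n` are hypothetical names, and the two stub functions stand in for a real detector (such as YOLO) and a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FramePipeline:
    """Hypothetical pipeline: detect on every frame, describe every N frames."""
    detect: Callable[[bytes], List[str]]     # fast per-frame model (stand-in)
    describe: Callable[[List[str]], str]     # slower LLM call (stand-in)
    llm_every_n: int = 30                    # assumed throttle, e.g. once per second at 30 fps
    frames_seen: int = 0
    last_summary: str = ""

    def on_frame(self, frame: bytes) -> dict:
        labels = self.detect(frame)
        # Only pay for the slow call on frames 0, N, 2N, ...
        if self.frames_seen % self.llm_every_n == 0:
            self.last_summary = self.describe(labels)
        self.frames_seen += 1
        # Every frame gets fresh labels plus the most recent summary.
        return {"labels": labels, "summary": self.last_summary}


def fake_detect(frame: bytes) -> List[str]:
    """Stub detector: reports a person for any non-empty frame."""
    return ["person"] if frame else []


def fake_describe(labels: List[str]) -> str:
    """Stub LLM: turns labels into a sentence."""
    if not labels:
        return "Nothing detected."
    return "I can see: " + ", ".join(labels)


if __name__ == "__main__":
    pipeline = FramePipeline(fake_detect, fake_describe, llm_every_n=3)
    for _ in range(4):
        print(pipeline.on_frame(b"frame-bytes"))
```

Throttling the slow call keeps per-frame work bounded by the detector, which is the part that has to keep up with the video.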
Mission Rewards
Complete your mission and earn elite status with substantial rewards
Top 3 Elite Agents
Intel Network Rewards
Join the network! Star the Vision Agents repository on GitHub and share your mission progress on social media (tag @VisionAgents). The top 10 intel reports win swag bundles.
Career Opportunities
Outstanding agents may be recruited for positions at Stream. Showcase your vision AI skills and join the team building the future of real-time video!
Mission Sponsor

Vision Agents is an open-source SDK by Stream for building real-time Vision AI agents. It provides the building blocks to create intelligent, low-latency video experiences - combining vision models like YOLO, Roboflow, and Moondream with LLMs like Gemini and OpenAI. With native SDK methods, ultra-low latency via Stream's edge network, and support for React, Android, iOS, Flutter, React Native, and Unity, you have everything you need to build the next generation of video AI applications. Your mission: push the boundaries of what's possible with real-time video intelligence.
Mission Evaluation Criteria
Potential Impact
How effectively does the project address a meaningful problem or unlock a valuable use case in the Vision AI space?
Creativity & Innovation
How unique is the idea? Does it push the boundaries of what's possible with real-time video AI?
Technical Excellence
How well is the project implemented? Does it demonstrate mastery of the Vision Agents SDK and related technologies?
Real-Time Performance
Does the agent truly operate in real-time? Is it as responsive and low-latency as Vision Agents allows?
User Experience
Is the agent intuitive to interact with? Does it provide a seamless, polished experience?
Best Use of Vision Agents
How effectively does the project leverage Vision Agents' capabilities - video AI, low latency, native APIs, and multi-platform support?
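For the Real-Time Performance criterion, it can help to back a "real-time" claim with numbers. The sketch below is one possible way to do that, assuming you can wrap your agent's per-frame work in a single callable: it times each call and reports the median, the 95th percentile, and how many frames exceeded a budget. `measure_latencies_ms`, `summarize`, and the 33.3 ms budget (one frame interval at 30 fps) are illustrative assumptions, not part of the judging rules or the Vision Agents SDK.

```python
import math
import statistics
import time
from typing import Callable, Iterable, List


def measure_latencies_ms(process: Callable[[bytes], object],
                         frames: Iterable[bytes]) -> List[float]:
    """Time each call to `process` and return the durations in milliseconds."""
    latencies = []
    for frame in frames:
        start = time.perf_counter()
        process(frame)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms: List[float], budget_ms: float) -> dict:
    """Report median, nearest-rank 95th percentile, and over-budget count."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "over_budget": sum(1 for value in ordered if value > budget_ms),
    }


if __name__ == "__main__":
    # Stand-in workload; replace the lambda with your own per-frame function.
    samples = measure_latencies_ms(lambda frame: sum(frame), [b"\x00" * 1024] * 100)
    print(summarize(samples, budget_ms=33.3))
```

Reporting a tail percentile alongside the median matters here because a few slow frames are what viewers notice as stutter.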