01001010
10110101
01101001
11010010
AGENT://
PROTOCOL
ACTIVE
Vision Possible: Agent Protocol

UPCOMING

Vision Possible: Agent Protocol

Your mission, should you choose to accept it: Build multi-modal AI agents that watch, listen, and understand video in real-time.

Submit your projects

23 Feb - 1 Mar

Starts in 0 minutes

Mission Rewards

$4,000+

+ exclusive agent swag

+ interview at WeMakeDevs

Mission Briefing

Your mission, should you choose to accept it: Build multi-modal AI agents that watch, listen, and understand video in real-time. Vision Agents gives you the building blocks to create intelligent, low-latency video experiences powered by your models, your infrastructure, and your use cases. Whether you're building security systems, sports coaching AI, interactive gaming, or something we haven't imagined yet - this hackathon is your proving ground. This message will self-destruct... after you build something amazing.

Your Mission Objectives

Build multi-modal AI agents that watch, listen, and understand video in real-time

In the world of AI, video remains the final frontier. Static image analysis is yesterday's mission. Real-time video understanding is the protocol for the future.

Vision Agents gives you the building blocks to create intelligent, low-latency video experiences powered by your models, your infrastructure, and your use cases.

Whether you're building security systems, sports coaching AI, drone detection, or interactive gaming experiences - this hackathon is your chance to push the boundaries of what's possible with real-time Vision AI.

500ms
Join latency
<30ms
Audio/Video latency

Video AI

Real-time video intelligence

Combine YOLO, Roboflow, Moondream, and other vision models with Gemini/OpenAI in real-time. Build agents that truly see and understand.

Ultra-Low Latency

Stream's edge network

Join quickly (500ms) and maintain audio/video latency under 30ms. Your agents respond in real-time, not real-later.

Native LLM APIs

Direct access to the latest models

Native SDK methods from OpenAI, Gemini, and Claude. Always access the latest LLM capabilities without waiting for wrapper updates.

Cross-Platform SDKs

Build anywhere

SDKs for React, Android, iOS, Flutter, React Native, and Unity. Your vision agents can run on any platform.

Mission Rewards

Complete your mission and earn elite status with substantial rewards

Elite Agent Rewards

Alpha Protocol
$2,000

1st Place

Bravo Protocol🎯
$1,500

2nd Place

Best Blog Submission📝
$500

Share your experience using Vision Agents SDK in a blog

+ exclusive agent swag+ interview at WeMakeDevs

Intel Network Rewards

Join the network! Star the Vision Agents repository on GitHub and share your mission progress on social media ( tag @VisionAgents). Top 10 intel reports win swag bundles.

Top 10 Posts Win Swag

Career Opportunities

Outstanding agents may be recruited for positions at WeMakeDevs. Showcase your vision AI skills and join the team building the future of real-time video!

Join the WeMakeDevs Team

Mission Evaluation Criteria

Potential Impact

How effectively does the project address a meaningful problem or unlock a valuable use case in the Vision AI space?

Creativity & Innovation

How unique is the idea? Does it push the boundaries of what's possible with real-time video AI?

Technical Excellence

How well is the project implemented? Does it demonstrate mastery of Vision Agents SDK and related technologies?

Real-Time Performance

Does the agent truly operate in real-time? Is it responsive and low-latency as Vision Agents enables?

User Experience

Is the agent intuitive to interact with? Does it provide a seamless, polished experience?

Best Use of Vision Agents

How effectively does the project leverage Vision Agents' capabilities - video AI, low latency, native APIs, and multi-platform support?

Mission Stats

Participants

Prizes worth

$

Project submissions

Countries

Agent Testimonials

Quotation symbol

Just finished building FitAgent for the #VisionPossible hackathon! It's a real-time AI personal trainer that lives right in your browser. No apps, no sensors—just your webcam, YOLO11, and Gemini Live automatically detecting your exercises and giving you voice coaching! Huge thanks to @WeMakeDevs and the amazing SDK from @visionagents_ai that made building this under 30ms latency possible!

TosifTosif
X
Quotation symbol

A huge thanks to @WeMakeDevs, @visionagents_ai and @kunalstwt for conducting this amazing hackathon. Got to learn a lot about building vision agents!

SwayamSwayam
X
Quotation symbol

Built a real-time cognitive AI tutor. Every 15s it analyzes gaze, posture & cognitive load → adapts live via Gemini + WebRTC. Not passive video. Adaptive learning in real time.

VickyVicky
X
Quotation symbol

Wrote the full build diary for "RoundZero" — Every bug. Every 2AM debugging session. Every honest take on what @VisionAgents actually feels like to build with. I want to thank @kunalstwt and whole @WeMakeDevs for this.

RahulRahul
X
Quotation symbol

Working on a hackathon by @WeMakeDevs, where I need to orchestrate an audio-visual agent using visionagents.ai and it truly has been more fun than expected, from ideating around what to build and exploring the docs and building.

MrityunjayMrityunjay
X
Quotation symbol

Built IntentLens for the Vision Possible Hackathon — a real-time Vision AI agent that tracks people, detects gestures, analyzes movement, and answers questions about what it sees. Learned a lot building this one. Thanks @kunalstwt, @WeMakeDevs & @visionagents_ai!

RishiRishi
X
Quotation symbol

My CCTV fire alarm project is coming along nicely for #visionpossible hackathon.

GauravGaurav
X
Quotation symbol

Built an AI Sports Broadcaster PlayCast: point phone at action → get live ESPN-style commentary. No app install, it runs in browser.

SrinivasSrinivas
X
Quotation symbol

I tried to do something on this weekend. For Vision-Agents Hackathon. @WeMakeDevs Created Semsorter, a multimodal robotic arm system that could help in hazardous item sorting.

KaustubhKaustubh
X
Quotation symbol

Beyond the Webcam: I built a Real-Time AI Medical Wellness Assistant. What if your webcam could detect posture issues, estimate breathing rate, and flag fatigue — in under 10 seconds?

DhruvaDhruva
X

Frequently Asked Questions