Original Reddit post

So I’ve always argued that Physical AI for robotics needs actionable outputs like 3D coordinates, not bullet points or nice paragraphs. So I decided to experiment by combining a VLM with Monocular Depth Estimation, essentially projecting 2D reasoning into 3D. I called it Odyseus - Spatial VLM Tech Stack:
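The post doesn’t include code, but the core idea of lifting a VLM’s 2D output into 3D can be sketched with the standard pinhole back-projection: given a pixel the VLM points at and a per-pixel depth from a monocular depth model, recover a 3D point in the camera frame. All names and intrinsics below are illustrative assumptions, not the actual Odyseus implementation:

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with estimated depth (meters) into a
    3D point in the camera frame, using pinhole intrinsics (fx, fy, cx, cy).
    The VLM supplies (u, v); a monocular depth model supplies `depth`."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Hypothetical example: the VLM points at pixel (320, 240) and the depth
# model estimates 2.0 m; intrinsics here are made up for illustration.
point = backproject(320, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(point)
```

A pixel at the principal point back-projects straight along the optical axis, so the example above yields a point at (0, 0, 2.0). A real stack would batch this over all pixels the VLM grounds and transform the result into the robot’s base frame with the camera extrinsics.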

Originally posted by u/L42ARO on r/ArtificialInteligence