Ant Group just open-sourced LingBot-Depth.
It solves the hardest depth perception challenge in robotics: handling transparent and reflective objects.
Robots have "eyes" (sensors), but they are usually blind to things like glass cups or shiny metal bowls. They literally look through them or get blinded by reflections.
LingBot-Depth fixes this blindness, allowing robots to "see" and interact with the invisible.
TLDR:
- 10M training samples (~3.1M curated + 7M public)
- SOTA on depth completion benchmarks
- Works for monocular depth, stereo, video depth, and 3D tracking
- Successfully grasps transparent/reflective objects in real robot tests
More details below 👇 1/6
2/6
The core problem: standard robot depth cameras (RGB-D) measure distance by projecting light and reading what bounces back.
But when that light hits a glass window or a mirror, it doesn't bounce back correctly; it passes through or scatters. The robot just sees a "black hole" or noise. It thinks nothing is there, so it tries to walk through the glass door or crush the cup.
Solution: LingBot-Depth flips this. Instead of filtering out those "black holes," it uses them as a learning signal. It teaches the AI to use the surrounding context (the table, the shadow) to "fill in the blanks" and reconstruct the invisible object.
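To make that concrete, here's a minimal sketch (my illustration, not LingBot-Depth's actual code) of how a sensor's natural failures become the "blanks" a model is asked to fill in:

```python
import numpy as np

def natural_failure_mask(raw_depth: np.ndarray) -> np.ndarray:
    # Pixels where the sensor returned nothing (0 or NaN), typically
    # glass, mirrors, or shiny metal, become the "blanks" to fill in.
    return (raw_depth == 0) | np.isnan(raw_depth)

# Toy 4x4 depth map (meters) with a hole left by a glass cup.
depth = np.array([
    [1.2, 1.2, 1.3, 1.3],
    [1.2, 0.0, 0.0, 1.3],  # 0.0 = sensor failure on the transparent object
    [1.2, 0.0, 0.0, 1.3],
    [1.1, 1.1, 1.2, 1.2],
], dtype=np.float32)

mask = natural_failure_mask(depth)
print(int(mask.sum()), "pixels to reconstruct from context")  # -> 4
```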

3/6
They took a vision model (ViT encoder) and trained it to play a "fill-in-the-blanks" game with broken depth maps.
The model learns to look at:
- What the RGB camera sees (colors, edges, shadows)
- The partial depth data that IS working
- The patterns of what's missing
Then it reconstructs the full scene, including the invisible parts.
The clever bit: they didn't create fake masks. They just used the sensor's natural failures as the training data. Every time the camera failed to see glass or metal, that became a lesson.
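Schematically, a training step for this recipe could look like the sketch below. The model signature and the L1 loss are hypothetical placeholders, not the paper's actual setup; the point is the pattern: natural sensor holes as masks, surrounding context as the thing the model learns to exploit.

```python
import torch
import torch.nn.functional as F

def train_step(model, rgb, raw_depth, gt_depth, optimizer):
    """One schematic "fill-in-the-blanks" depth completion step.

    rgb:       (B, 3, H, W) color image (colors, edges, shadows)
    raw_depth: (B, 1, H, W) sensor depth, 0 where the sensor failed
    gt_depth:  (B, 1, H, W) reference depth used for supervision
    """
    hole_mask = raw_depth == 0               # natural failures, not fake masks
    pred = model(rgb, raw_depth, hole_mask)  # reconstruct the full depth map

    # Supervise wherever a reference value exists, so the model learns
    # to infer the missing regions from the surrounding context.
    valid = gt_depth > 0
    loss = F.l1_loss(pred[valid], gt_depth[valid])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```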

4/6
LingBot-Depth beats existing methods on standard depth benchmarks (iBims, NYUv2) and works across multiple tasks without retraining:
- Video depth: Keeps depth consistent across frames, even for moving transparent objects
- Stereo matching: Improves accuracy when combined with stereo camera systems
- 3D tracking: Helps track objects through space more smoothly
It generalizes because it learned to handle "missing information" as a core skill, not as an edge case.
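For context, benchmarks like NYUv2 and iBims score depth models with a small set of standard metrics. A generic implementation (not tied to this release) looks like:

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    # Evaluate only where ground truth is valid (gt > 0).
    valid = gt > 0
    p, g = pred[valid], gt[valid]
    ratio = np.maximum(p / g, g / p)
    return {
        "AbsRel": float(np.mean(np.abs(p - g) / g)),      # mean relative error
        "RMSE":   float(np.sqrt(np.mean((p - g) ** 2))),  # root mean squared error
        "delta1": float(np.mean(ratio < 1.25)),           # share of pixels within 25% of GT
    }
```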

5/6
Real Robot Test
They mounted the system on a robot arm (Rokae XMate SR5) and gave it two tasks that normally defeat depth sensors:
Transparent storage box
- Standard depth sensor: complete failure (0 percent success, could not even detect it)
- LingBot-Depth: 50 percent success rate (saw the box, planned grasp correctly)
Reflective steel cup
- Standard sensor: confused by reflections
- LingBot-Depth: consistent success (reconstructed plausible geometry)
This is not just better numbers on a benchmark.
It is a robot that can actually grab your water glass without knocking it over.
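Why does a filled-in depth map translate into a grasp? Because once depth is dense, back-projecting it through the camera intrinsics gives the 3D point cloud a grasp planner needs. A generic pinhole-model sketch (fx, fy, cx, cy come from camera calibration; nothing here is LingBot-specific):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    # Back-project each pixel (u, v) with depth z into camera coordinates:
    #   x = (u - cx) * z / fx,  y = (v - cy) * z / fy
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)  # (H*W, 3) points
```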
