[EN] CostNav: A Navigation Benchmark for Cost-Aware Evaluation of Embodied Agents

Haebin Seong

Nov 12, 2025 · 5 min read

Hi, I’m Haebin Seong, Senior Researcher on the WoRV (World Model for Robotics and Vehicle Control) team at Maum.AI. I serve as the Project Lead for CostNav, one of the company’s most strategically important initiatives, which engages all members of the WoRV team. CostNav is a cornerstone program that directly informs Maum.AI’s long-term business strategy by answering a pivotal question for the company:

How can our embodied-AI technologies be translated into commercially scalable, high-value products?

CostNav: It's Time to Talk About Money

Let's say you've built an amazing autonomous delivery robot. It navigates sidewalks flawlessly, avoids pedestrians with grace, and completes deliveries with impressive success rates. Your research paper is accepted at a top conference. Congratulations!

Now comes the real question: Will it make money?

The Elephant in the Research Lab

Here's an uncomfortable truth about autonomous navigation research: we've been optimizing for the wrong things.

We obsess over success rates, path efficiency, and collision counts. We celebrate when our robot achieves 95% task completion or shaves seconds off navigation time. These metrics are important—but they're not what keeps startup founders awake at night.

What keeps them awake is this: "Should I spend $8,000 on LiDAR sensors with classical planning, or $400 on RGB-D cameras with a learning-based approach that costs $2,400 to train? How many deliveries until I break even? What's my actual cost per delivery when I factor in energy, crashes, and sensor degradation?"

Academic benchmarks don't answer these questions. CostNav does.
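To make the founder's question concrete, here is a minimal break-even sketch. It is not CostNav's cost model. The upfront figures ($8,000 for LiDAR, $400 + $2,400 for cameras plus training) come from the question above, while the per-delivery profit values and the function name `break_even_deliveries` are hypothetical placeholders chosen only for illustration.

```python
import math


def break_even_deliveries(upfront_cost: float, profit_per_delivery: float) -> float:
    """Deliveries needed before cumulative profit covers the upfront cost."""
    if profit_per_delivery <= 0:
        return math.inf  # a system that loses money per delivery never breaks even
    return math.ceil(upfront_cost / profit_per_delivery)


# Upfront costs taken from the question in the text.
lidar_upfront = 8_000.0
camera_upfront = 400.0 + 2_400.0  # RGB-D cameras + training

# Hypothetical profit per delivery (assumption, not measured data).
lidar_profit_per_delivery = 1.00
camera_profit_per_delivery = 0.50

lidar_break_even = break_even_deliveries(lidar_upfront, lidar_profit_per_delivery)
camera_break_even = break_even_deliveries(camera_upfront, camera_profit_per_delivery)

print(f"LiDAR + classical planning: {lidar_break_even} deliveries")   # 8000
print(f"RGB-D + learned policy:     {camera_break_even} deliveries")  # 5600
```

Under these assumed numbers, the camera-based system recovers its investment sooner even though it earns half as much per delivery. Different profit assumptions can reverse that, which is exactly why the per-delivery figure needs to be measured rather than guessed.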

A Different Kind of Benchmark

CostNav is the first navigation benchmark that evaluates robots the way businesses actually evaluate them: by profit per run.

Instead of just measuring whether your robot successfully navigated from A to B, CostNav asks: How much did that navigation cost? How much revenue did it generate? When will this system become profitable?

This isn't just about adding a cost column to existing metrics. It's a fundamental shift in how we think about navigation performance. Because here's the thing: a robot with an 80% success rate using cheap sensors might be more profitable than a robot with a 95% success rate using expensive LiDAR. Traditional benchmarks would crown the 95% robot as superior. CostNav reveals the economic reality.
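A rough sketch of how the 80% robot can come out ahead. This is an illustration under stated assumptions, not CostNav's metric: the revenue, running cost, and lifetime values are made up, the upfront costs reuse the $2,800 and $8,000 figures from the founder's question purely as examples, and `expected_profit_per_run` is a hypothetical helper.

```python
def expected_profit_per_run(
    success_rate: float,
    revenue_per_success: float,
    running_cost_per_run: float,
    upfront_cost: float,
    lifetime_runs: int,
) -> float:
    """Expected profit of one attempt, with upfront cost spread evenly over the robot's lifetime."""
    amortized_upfront = upfront_cost / lifetime_runs
    # Revenue is earned only on successful runs; costs are paid on every attempt.
    return success_rate * revenue_per_success - running_cost_per_run - amortized_upfront


# All values below are hypothetical assumptions.
REVENUE = 4.00          # dollars earned per successful delivery
RUNNING_COST = 1.00     # energy, wear, etc. per attempt
LIFETIME_RUNS = 5_000   # runs over which the upfront cost is amortized

cheap_sensor_profit = expected_profit_per_run(0.80, REVENUE, RUNNING_COST, 2_800.0, LIFETIME_RUNS)
lidar_profit = expected_profit_per_run(0.95, REVENUE, RUNNING_COST, 8_000.0, LIFETIME_RUNS)

print(f"Cheap sensors, 80% success: ${cheap_sensor_profit:.2f} per run")  # $1.64
print(f"LiDAR, 95% success:         ${lidar_profit:.2f} per run")         # $1.20
```

With these numbers the cheaper robot wins, but the ranking is sensitive to the inputs: at roughly $7 of revenue per delivery, the extra 15 points of success rate would start to outweigh the LiDAR's higher amortized cost.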

The Complete Economic Picture

CostNav models the entire economic lifecycle of a delivery robot:

Before the robot even starts: Hardware costs, sensor investments, training expenses, data collection—all the upfront investments that need to be recovered.

During every delivery: Energy consumption from motors and sensors, battery degradation from charge cycles, maintenance costs from wear and tear, crash damage from collisions.

Revenue that actually matters: Not just "did it deliver?" but "did it deliver within the service-level agreement (SLA)?" Because in the real world, a delivery that takes 35 minutes when you promised 30 gets refunded. A delivery that arrives on time but spoils the food because of aggressive driving is refunded. In practice, both timing and quality constraints define whether a delivery has truly created "economic value".

All of this is grounded in real-world data: actual delivery service pricing, industry energy rates, hardware costs from commercial robots. This isn't theoretical—it's what real companies face every day.
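The per-delivery part of this picture can be sketched in a few lines. This is a simplified illustration of the idea, not the CostNav implementation: the class and function names, the flat "full refund" rule, and every dollar amount are assumptions. Only the cost categories and the 30-minute promise come from the text above.

```python
from dataclasses import dataclass


@dataclass
class DeliveryOutcome:
    """What happened on a single delivery run (hypothetical structure)."""
    delivered: bool
    minutes_taken: float
    cargo_intact: bool         # False if e.g. food was spoiled by aggressive driving
    energy_cost: float         # motors and sensors
    battery_wear_cost: float   # degradation from charge cycles
    maintenance_cost: float    # wear and tear
    crash_damage_cost: float   # collision repairs


def delivery_revenue(outcome: DeliveryOutcome, fee: float, sla_minutes: float) -> float:
    """The fee counts only if the delivery is complete, on time, and intact; otherwise it is refunded."""
    on_time = outcome.minutes_taken <= sla_minutes
    if outcome.delivered and on_time and outcome.cargo_intact:
        return fee
    return 0.0


def delivery_profit(outcome: DeliveryOutcome, fee: float, sla_minutes: float) -> float:
    """Revenue minus the operating costs, which are incurred whether or not the fee is kept."""
    operating_cost = (
        outcome.energy_cost
        + outcome.battery_wear_cost
        + outcome.maintenance_cost
        + outcome.crash_damage_cost
    )
    return delivery_revenue(outcome, fee, sla_minutes) - operating_cost


# Hypothetical fee and costs; the 30-minute SLA mirrors the example in the text.
on_time = DeliveryOutcome(True, 28.0, True, 0.25, 0.125, 0.5, 0.0)
late = DeliveryOutcome(True, 35.0, True, 0.25, 0.125, 0.5, 0.0)

print(delivery_profit(on_time, fee=4.0, sla_minutes=30.0))  # 3.125
print(delivery_profit(late, fee=4.0, sla_minutes=30.0))     # -0.875
```

The late run completes the task and would count as a success in a traditional benchmark, yet it loses money here because the operating costs are still paid while the fee is refunded.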

What We're Building

Our initial release establishes a learning-based navigation baseline in realistic urban environments. But this is just the beginning.

We're going to compare fundamentally different approaches:

- Classical rule-based planning with expensive sensors vs. learning-based methods with cheap cameras

- On-device inference vs. cloud-based computation

- Traditional training vs. cost-aware reinforcement learning that directly optimizes for profit

We're going to test in challenging scenarios:

- Dense crowds where collision avoidance becomes critical

- Nighttime conditions where sensor choices matter

- Adverse weather that tests robustness

- Outdated maps that reflect real-world deployment

We're going to answer questions that matter:

- Which navigation approach maximizes profit, not just performance?

- How do hardware choices affect break-even time?

- What's the true cost of collisions beyond just counting them?

- When does investing in better sensors pay for itself?

Why This Matters

If you're a researcher, CostNav lets you optimize for what actually matters in deployment. You can explore cost-aware reward functions, evaluate trade-offs between sensor cost and performance, and publish work that directly translates to commercial value.

If you're a startup founder or engineer, CostNav gives you data-driven answers to deployment decisions. No more guessing whether expensive sensors are worth it—you'll see the break-even analysis. No more wondering if cloud inference pays for itself—you'll see the profit margins.

If you're building the future of autonomous systems, CostNav bridges the gap between impressive demos and sustainable businesses.

The Vision

Imagine a world where navigation research papers include a "profitability" section alongside accuracy metrics. Where we optimize for dollars per delivery, not just success rates. Where choosing between navigation approaches is guided by break-even analysis, not just technical performance.

That's the world CostNav is building.

We're not saying traditional metrics don't matter—they absolutely do. But they're incomplete. A robot that's technically impressive but economically unviable won't change the world. A robot that's profitable at scale will.

What's Next

We're releasing everything: the benchmark framework, cost models validated against industry data, simulation environment, evaluation code, and our baseline results. We want the community to build on this.

Coming soon:

- Comprehensive comparison of rule-based vs. learning-based navigation economics

- Cloud vs. edge inference trade-off analysis

- Imitation learning, with the wage cost of human annotation included in the economics

- Cost-aware RL training that directly optimizes profit

- Diverse maps and robots that reflect the wide range of choices available in the real world

- Expanded scenarios testing robustness under challenging conditions

- Open challenges for the community to beat our baselines

The autonomous navigation field has made incredible technical progress. Now it's time to make it economically viable.

It's time to talk about money. It's time for CostNav.

CostNav will be presented at CES 2026. The technical report, benchmark, code, and models will be released soon, with continual updates to follow. Stay tuned, and get ready to rethink how you evaluate navigation systems.

The current pre-release GitHub repo includes our initial implementations for simulation, task design, training, evaluation, and, most importantly, metrics. We’ll be rolling out continual improvements, so keep an eye on upcoming updates!

GitHub repo: https://github.com/worv-ai/CostNav

Our technical report is available at the following arXiv link.
