Nori Bot: A Sub-$1,000 Floor-to-Counter Mobile Manipulator

Source: arXiv:2605.16537 · Published 2026-05-15 · By Antonio Li, Sungjoon Park, Wen Ni Chew

TL;DR

This paper presents Nori Bot, a dual-arm mobile manipulator that costs $947 in parts, around 3% of the price of comparable commercial platforms, and addresses three limitations common to sub-$1,000 open-source manipulators. First, it adds a 600 mm linear Z-axis lift driven over the same Feetech servo bus as the arms, giving a floor-to-counter reach without any extra controller. Second, it uses a thin-client compute architecture, pairing a Raspberry Pi 4 with the OpenClaw proactive agent runtime, so tasks can run on a schedule or autonomously via cron jobs, hooks, and heartbeats rather than only in response to human prompts. Third, it includes a software safety stack that prevents the stall-induced burn-out common in cheap Feetech servos through calibration clamping, stall detection, and firmware EEPROM backstops. The same stack recovers continuous grip-force feedback by mapping motor current on a soft TPU finger to a force estimate.

The authors demonstrate end-to-end task executions spanning the full Z-axis envelope, including book re-shelving, trash pickup from the floor, laundry sorting, cloth folding on a table, and an autonomous make_coffee task triggered by a cron job. They also train an ACT imitation learning policy from 30 teleoperation demonstrations that reliably grasps a small DC motor from a desk. Zero servo burnouts were observed over four weeks of routine operation after deploying the protection stack, compared to two burnouts before. The paper delivers a practical low-cost hardware plus software foundation for capable household mobile manipulation with proactive behavior and servo safety, aiming to enable future large-scale deployment and multi-robot data pooling.

Key findings

Nori Bot costs $947 in parts, roughly 3 to 4% of the cost of comparable commercial mobile manipulators (e.g., about 3.8% of the $25,000 Hello Robot Stretch).
The 600 mm Z-axis lift uses a single Feetech STS3215 servo on the existing right-arm servo bus, avoiding a second controller board or motor protocol.
The thin-client Raspberry Pi 4 (1 GB RAM) runs motor I/O and a WebSocket bridge, with heavy compute delegated offboard via the OpenClaw agent runtime.
OpenClaw proactive triggers (cron, hooks, heartbeats) enable autonomous scheduled and reactive-but-unprompted tasks, e.g., autonomous make_coffee at 8am.
The onboard software safety stack (calibration clamping, a stall detector, and EEPROM backstops) kept servo burnouts at zero over four weeks of operation, compared with two burnouts before it was deployed.
Grip force is estimated sensorlessly by mapping motor current on a soft TPU gripper finger to a normalized force signal [0,1], usable as an input observation for imitation learning.
An ACT imitation-learning policy trained on 30 demonstrations is reported to reliably grasp and lift a small DC motor resting on a desk, starting from a neutral pose; trial counts are not yet reported.
End-to-end task latency from OpenClaw cron trigger to first arm motion averages 2.4s, suitable for scheduled household tasks but too slow for closed-loop reactive control.

Threat model

No malicious adversary is modeled. The threats considered are accidental or environmental: servo stalls and mechanical limit violations that can damage hardware. The protections defend against unintentional stall-induced gear stripping. Attackers able to subvert the software or inject malicious commands are out of scope; the stack targets physical faults and control errors during normal operation.

Methodology — deep read

Threat Model and Assumptions: The adversary model is implicit: the robot operates in unstructured household environments with unknown obstacles and contacts. There is an assumption that servo stall can occur unintentionally due to commanded positions exceeding joint limits or physical obstruction. The system is not designed against malicious, adversarial manipulation but instead focuses on robustness to ordinary mechanical and control failures.
Data: Demonstrations for imitation learning were collected via leader-follower teleoperation at 50 Hz, recording arm and gripper joint states along with the new grip-force channels derived from motor current. The main trained skill is pick_motor, a grasp of a small DC motor, learned from 30 recorded episodes. Skills were not trained jointly; the stated reason is to avoid overfitting at this small data scale.
Architecture / Algorithm: The hardware consists of a base XLeRobot platform with 2 SO-101 6-DoF arms powered by Feetech STS3215 servos, augmented by a 1-DoF linear Z-axis lift using a CNC-style 600 mm ball-screw rail driven by a Feetech servo on the same half-duplex bus as the right arm. Robot computation runs on a Raspberry Pi 4 (1 GB RAM) acting as a thin client for motor I/O and WebSocket bridging. Heavy compute and ACT imitation learning policies run offboard and are served via LeRobot's gRPC PolicyServer.

The proactive agent runtime is OpenClaw, an open-source framework that schedules tasks through cron jobs, event hooks, and periodic heartbeats by exposing the robot as an OpenClaw skill manifest. Skills translate into JSON commands sent to the robot client, which dispatches them as ACT policy checkpoints or scripted motions.
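The skill-to-command path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not OpenClaw's or LeRobot's actual API: the JSON field name `skill`, the `SKILLS` registry, the checkpoint and script names, and the `dispatch` function are all hypothetical, as is the choice to treat make_coffee as a scripted motion. Only the skill names and the 8 a.m. cron trigger come from the summary.

```python
import json

# Hypothetical registry: maps a skill name to how the robot client runs it.
# "act" means an ACT policy checkpoint; "script" means a scripted motion.
SKILLS = {
    "pick_motor": {"kind": "act", "target": "checkpoints/pick_motor"},
    "make_coffee": {"kind": "script", "target": "make_coffee_sequence"},
}

# Standard five-field cron expression for "every day at 08:00".
MAKE_COFFEE_SCHEDULE = "0 8 * * *"


def dispatch(raw_command: str) -> str:
    """Parse a JSON command and describe the action the client would take."""
    command = json.loads(raw_command)
    name = command["skill"]  # assumed field name
    if name not in SKILLS:
        raise ValueError(f"unknown skill: {name}")
    entry = SKILLS[name]
    if entry["kind"] == "act":
        return f"run ACT checkpoint {entry['target']}"
    return f"run scripted motion {entry['target']}"


if __name__ == "__main__":
    print(dispatch(json.dumps({"skill": "make_coffee"})))
```

Keeping the registry on the robot client means the agent runtime only needs to know skill names, which fits the thin-client split described in this section.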

A software safety layer prevents Feetech servo burnouts via multi-layer calibration clamping (bounding goal positions within calibrated joint limits), stall detection (detecting sustained zero-position-change with high motor current over a 15-loop window), and persistent EEPROM backstops to limit torque and current.
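The first two layers can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the 15-loop window comes from the summary, while the class and function names, the current threshold, and its units are assumptions. The EEPROM backstops live in servo firmware and are not modeled here.

```python
from collections import deque
from typing import Optional


def clamp_goal(goal: int, lower: int, upper: int) -> int:
    """Bound a goal position to the calibrated joint range."""
    return max(lower, min(upper, goal))


class StallDetector:
    """Flags a stall when the position does not change while motor current
    stays high for `window` consecutive control loops."""

    def __init__(self, window: int = 15, move_eps: int = 0,
                 current_limit: float = 0.8):
        self.window = window
        self.move_eps = move_eps            # assumed tolerance, encoder ticks
        self.current_limit = current_limit  # assumed threshold and units
        self.flags = deque(maxlen=window)
        self.last_position: Optional[int] = None

    def update(self, position: int, current: float) -> bool:
        """Feed one control-loop sample; return True if a stall is detected."""
        if self.last_position is None:
            stuck = False  # no previous sample to compare against
        else:
            stuck = abs(position - self.last_position) <= self.move_eps
        self.last_position = position
        self.flags.append(stuck and current >= self.current_limit)
        return len(self.flags) == self.window and all(self.flags)
```

A caller would pass every commanded position through `clamp_goal` before writing it to the bus, and drop torque on the affected servo as soon as `update` returns True.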

Grip force estimation piggybacks on the stall detector's existing current register reading from the servo. A soft TPU finger converts the motor current step into a continuous force ramp, normalized per gripper and used as an observation channel in the ACT imitation learning policy.
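One simple way to turn a current reading into a normalized force channel is a clamped linear map between two per-gripper calibration points. The summary does not give the authors' mapping, so the linear form, the function name, and both calibration parameters below are assumptions.

```python
def normalize_grip_force(current: float, idle_current: float,
                         max_current: float) -> float:
    """Map a raw gripper motor current reading to a force estimate in [0, 1].

    idle_current: reading with the gripper closing on nothing (assumed
                  per-gripper calibration value).
    max_current:  reading at the firmest grip the gripper should apply
                  (assumed per-gripper calibration value).
    """
    if max_current <= idle_current:
        raise ValueError("max_current must exceed idle_current")
    scaled = (current - idle_current) / (max_current - idle_current)
    return max(0.0, min(1.0, scaled))
```

Normalizing per gripper keeps the observation on the same scale across units, so demonstrations from different grippers stay comparable as policy inputs.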

Training Regime: ACT policies were trained separately per skill (no multi-task co-training) with chunk size 100 (2 s at 50 Hz) and 40 action steps (~0.8 s between re-queries). Each policy has around 80 million parameters and was trained for 40,000 gradient steps on a consumer laptop GPU.
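The timing figures in the training regime follow directly from the 50 Hz rate. The short check below reproduces them; the constant and function names are illustrative, and the values are the ones quoted in the summary.

```python
CONTROL_HZ = 50        # demonstration rate quoted in the summary
CHUNK_SIZE = 100       # actions predicted per policy query
N_ACTION_STEPS = 40    # actions executed before the policy is queried again


def steps_to_seconds(steps: int, rate_hz: int = CONTROL_HZ) -> float:
    """Convert a number of control steps to seconds at the given rate."""
    return steps / rate_hz


chunk_horizon_s = steps_to_seconds(CHUNK_SIZE)        # length of one chunk
requery_period_s = steps_to_seconds(N_ACTION_STEPS)   # time between queries
unused_actions = CHUNK_SIZE - N_ACTION_STEPS          # predicted, not executed

if __name__ == "__main__":
    print(chunk_horizon_s, requery_period_s, unused_actions)
```

With these settings each query predicts 2 s of motion but only the first 0.8 s is executed, so 60 of every 100 predicted actions are discarded in favor of a fresher prediction.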
Evaluation Protocol: Five end-to-end household tasks spanning the vertical workspace were demonstrated qualitatively: floor-level trash pickup, mid-level book shelving, table-top folding, laundry sorting, and autonomous scheduled coffee making. Quantitative benchmarking is pending for the full publication. Only the pick_motor ACT skill was evaluated, with success judged by completed grasps under neutral conditions; trial counts and failure modes are not yet reported. Protection events were logged continuously over four weeks to count servo burnouts and stall events.
Reproducibility: Code, CAD models, and skill manifests are planned for open-source release. The platform uses off-the-shelf hardware parts and open-source compute/runtime stacks (LeRobot, OpenClaw). Release plans for the imitation-learning dataset are not yet fully described.

Example walk-through: In the pick_motor task, the lift is first scripted to a fixed height. The ACT policy then observes the RGB cameras, the right-arm joint states, and the gripper force signal, and outputs 6-DoF joint target positions at 10 Hz. The robot grasps and lifts a small DC motor from a desk under neutral lighting and object position. The force signal derived from motor current showed a smooth ramp during successful grasps, giving the policy a grip-force cue it can react to implicitly.

The safety stack prevented any servo burnouts during four weeks of routine operation by clamping out-of-range commands and dropping torque on stalls. The protection log recorded 14 clamp events and 29 stall detections, none of which led to a servo failure, a meaningful reliability gain for cheap Feetech servos.

Technical innovations

Integrating a 600 mm ball-screw linear Z-axis lift driven by a single Feetech servo on the arm's existing half-duplex bus to enable floor-to-counter vertical reach without extra controllers.
Employing a thin-client compute architecture with a Raspberry Pi 4 for low-level motor I/O and delegating heavy computation offboard via OpenClaw, supporting proactive task scheduling (cron, hooks, heartbeats).
Developing a multi-layer onboard software safety stack that prevents Feetech servo burnout through calibration position clamping, a real-time stall detector, and firmware EEPROM backstops.
Recovering continuous grip-force feedback on sub-$1,000 hardware by mapping servo motor current signals into a normalized force estimate on a soft TPU gripper finger, usable as an observation for imitation learning.
First demonstration of plugging a low-cost physical dual-arm mobile manipulator into the OpenClaw proactive agent runtime as a skill manifest with autonomous scheduled task triggering.

Datasets

pick_motor demonstrations — 30 episodes — teleoperated leader-follower data recorded at 50 Hz with RGB cameras, joint states, force signals

Baselines vs proposed

Servo protection effectiveness: zero servo burnouts logged over four weeks after safety-stack deployment vs two burnouts over an equal period of operation before it.
End-to-end task latency: 2.4 s from OpenClaw cron trigger to arm motion start (proactive execution) vs <0.5 s typical in reactive systems.
Cost comparison: Nori Bot $947 in parts vs commercial platforms e.g. Hello Robot Stretch at $25,000, Mobile ALOHA at $32,000.
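The cost share depends on which commercial platform is taken as the reference. The quick check below uses only the prices listed in the cost comparison; the variable and function names are illustrative.

```python
NORI_BOT_USD = 947
COMMERCIAL_USD = {
    "Hello Robot Stretch": 25_000,
    "Mobile ALOHA": 32_000,
}


def cost_share_percent(parts_cost: float, reference_cost: float) -> float:
    """Return parts_cost as a percentage of reference_cost."""
    return 100 * parts_cost / reference_cost


shares = {
    name: round(cost_share_percent(NORI_BOT_USD, price), 1)
    for name, price in COMMERCIAL_USD.items()
}

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {share}%")
```

Against Mobile ALOHA the share is about 3.0%, and against Stretch about 3.8%, so the headline figure of roughly 3% matches the more expensive reference.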

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2605.16537.

Fig 1: Nori Bot at the two ends of its 600 mm Z-axis travel. Left: carriage at the bottom, arms at cart-shelf height for floor-level interaction (rubber duck

Fig 2: Tasks across the Z-axis envelope. Top-left: floor-level reach, picking

Fig 3: Modeled force signal during a successful grasp on a rigid object

Figs 4-6 (page 4).

Limitations

Quantitative task success metrics beyond single pick_motor skill are not yet measured or reported; benchmarking is ongoing.
No multi-task joint training of skills reported; per-skill training risks limited generalization and efficiency at small dataset size.
The force estimate under-resolves very compliant objects such as sponges or fabric, and it recovers only a scalar force, not the contact distribution.
Z-axis lift is deliberately slow due to servo choice; fluidity and speed improvements planned.
Reactive closed-loop control is not supported given system latency; proactive agent scheduler suitable only for periodic/scheduled tasks.
The hardware platform relies on proprietary Feetech servos, which are inherently fragile without software protections.

Open questions / follow-ons

How does the ACT policy performance and grasp success scale with larger training datasets (≥50 episodes) and with/without grip-force observation input?
How robust are skills trained on fixed Z-axis heights to deployment at different lift positions, i.e., generalization across vertical workspace?
Can latency in proactive execution be reduced to support more fluid or reactive robot behaviors beyond scheduled tasks?
Does multi-task co-training of multiple skills improve policy generalization and efficiency compared to per-skill training at scale on Nori Bot?

Why it matters for bot defense

For bot-defense and CAPTCHA practitioners, Nori Bot offers an instructive case study in deploying a low-cost mobile manipulator platform with embedded proactive autonomy and built-in hardware protections critical for long-term reliability. Its approach to embedding continuous grip-force feedback by sensorlessly exploiting motor current on compliant fingers enables richer state observations essential for robust autonomous manipulation policies, illustrating practical engineering trade-offs in resource-constrained robotics. The proactive agent runtime integration demonstrates how scheduling physical tasks autonomously through triggers can overcome the reactive-only control limitation of many low-cost platforms.

Nori Bot is a robotics platform paper, not a bot-defense or CAPTCHA solution. Even so, its layered software safety, sensorless force sensing, and scheduled autonomous task execution are design patterns that bots deployed in human environments might emulate, or need, to handle physical interaction states robustly and to make human-robot interaction (HRI) safer and more reliable. Bot-defense systems that rely on behavioral or interaction signals could draw parallels in sensor design and fault-prevention strategy. CAPTCHA developers might study how such low-cost physical proxies extend the attack surface and how continuous feedback channels from inexpensive hardware can augment detection and robustness.

Cite

bibtex

@article{arxiv2605_16537,
  title={Nori Bot: A Sub-\$1,000 Floor-to-Counter Mobile Manipulator},
  author={Antonio Li and Sungjoon Park and Wen Ni Chew},
  journal={arXiv preprint arXiv:2605.16537},
  year={2026},
  url={https://arxiv.org/abs/2605.16537}
}
