Driving a Real Robot from a Terminal: Safely, on Free and Open Tooling

May 29, 2026 · Craig Merry

The question wasn't "can an AI move a robot arm?" Of course it can. The question was: can it do that safely — where a tampered or unknown command is refused, not executed — using only free, open tooling a hobbyist already has?

The robot is a hobby-grade SO-ARM101: six Feetech STS3215 servos, an OAK-D-Lite depth camera overhead, the whole thing run by a Raspberry Pi. The operator is a plain Claude Code terminal — no bespoke app, no SDK glue. The goal: have it see a red LEGO brick and drop it in a bowl, and climb a ladder of trust while doing it, one rung at a time.

the escalation ladder
R0  raw          servos + camera, no protocol        — a plain Python loop
R1  perception   robot-md vision_find over depthai          — "what do you see?"
R2  authorized   signed RCAN INVOKE  →  robot-md-gateway    — "you may actuate"
R3  identity     signature verified LIVE against RRF         — "and we know who you are"
R4  cloud        OpenCastor fleet bridge                     — "from anywhere"

R0 — Raw control, no protocol

The rawest possible path: torque off the servos, hand-pose the arm through a pick-and-place by feel, record the encoder angles at each waypoint, then replay them. No inverse kinematics, no camera, no cloud — about a hundred lines of Python talking straight to the serial bus. It worked: the arm ran the full taught trajectory, approached the brick, closed, lifted, and swung to the bowl. Honest result: it almost made the pick — the jaws didn't quite secure the small brick and gravity sag crept into the carry, so the LEGO ended up beside the bowl rather than in it. Teach-and-replay proves the arm faithfully executes a trajectory; nailing the grasp is a tuning problem.
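A minimal sketch of the teach-and-replay idea, not the script that ran on the arm. The `ServoBus` interface, `FakeBus`, and the linear interpolation between waypoints are all assumptions made for illustration; a real rig would wrap the Feetech serial driver behind the same three methods.

```python
import time
from typing import Dict, List, Protocol

Pose = Dict[str, float]  # joint name -> angle, in whatever units the bus reports


class ServoBus(Protocol):
    """Hypothetical bus interface; a real rig would wrap the serial driver."""
    def set_torque(self, enabled: bool) -> None: ...
    def read_positions(self) -> Pose: ...
    def write_positions(self, pose: Pose) -> None: ...


def record_waypoint(bus: ServoBus) -> Pose:
    """Snapshot the encoder angles while the arm is hand-posed with torque off."""
    return dict(bus.read_positions())


def interpolate(a: Pose, b: Pose, steps: int) -> List[Pose]:
    """Linear blend from a to b, excluding a and including b."""
    return [{j: a[j] + (b[j] - a[j]) * k / steps for j in a}
            for k in range(1, steps + 1)]


def replay(bus: ServoBus, waypoints: List[Pose], steps: int = 20, dt: float = 0.02) -> None:
    """Re-enable torque and step through the taught waypoints in order."""
    if not waypoints:
        return
    bus.set_torque(True)
    current = bus.read_positions()
    for target in waypoints:
        for pose in interpolate(current, target, steps):
            bus.write_positions(pose)
            time.sleep(dt)
        current = target


class FakeBus:
    """In-memory stand-in so the sketch runs without hardware."""
    def __init__(self, pose: Pose) -> None:
        self.pose = dict(pose)
        self.torque = True
        self.log: List[Pose] = []

    def set_torque(self, enabled: bool) -> None:
        self.torque = enabled

    def read_positions(self) -> Pose:
        return dict(self.pose)

    def write_positions(self, pose: Pose) -> None:
        self.pose = dict(pose)
        self.log.append(dict(pose))
```

Nothing in this loop senses the brick or corrects for sag: it replays recorded angles open-loop, which is consistent with a trajectory that runs faithfully while the grasp still misses.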

R1 — Perception, and a fix worth upstreaming

The next rung asks the robot what it sees. robot-md exposes a vision_find tool backed by its feetech_depthai driver, and it returned no_backend on every call. The root cause turned out to be general: the perception module was written against the depthai v3 API, but this Pi runs depthai 2.x, so camera startup raised an exception and the backend never came up. Porting Perception.open() back to the v2 pipeline (a ColorCamera, two MonoCameras, and an aligned StereoDepth node) brought it to life on this rig: clean aligned depth and a solid lock on the red brick. That's a one-file fix that should unblock vision on other depthai-2.x rigs, and it's headed upstream.
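Why aligned depth matters for a color lock can be shown in a few lines. The sketch below is not robot-md's vision_find implementation; it is a standard-library illustration that assumes the depth map has already been aligned to the color frame, so the same [row][col] index names the same scene point in both images. The function names and the hue and saturation thresholds are hypothetical.

```python
import colorsys
from statistics import median
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


def is_red(pixel: RGB, min_sat: float = 0.5, min_val: float = 0.3) -> bool:
    """True for saturated pixels whose hue sits near 0 (red wraps around the hue circle)."""
    r, g, b = (c / 255.0 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return (h < 0.05 or h > 0.95) and s >= min_sat and v >= min_val


def find_red(color: List[List[RGB]],
             depth_mm: List[List[int]]) -> Optional[Tuple[float, float, float]]:
    """Return (row, col, depth_mm) for the red pixels, or None if nothing usable matches.

    Assumes depth_mm is aligned to the color frame. Zero depth is treated
    as "no measurement" and skipped.
    """
    hits = [(r, c) for r, row in enumerate(color)
            for c, px in enumerate(row) if is_red(px)]
    if not hits:
        return None
    depths = [depth_mm[r][c] for r, c in hits if depth_mm[r][c] > 0]
    if not depths:
        return None
    row = sum(r for r, _ in hits) / len(hits)
    col = sum(c for _, c in hits) / len(hits)
    return row, col, float(median(depths))
```

Without alignment, the depth looked up at a red pixel could belong to the table behind the brick, which is why the StereoDepth node in the port is configured as aligned.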

The honest limit. Closing the loop to a fully autonomous "see brick → solve IK → pick" did not work on this rig — and not because of a software bug. The arm's calibrated wrist range simply can't reach the gripper orientation the IK solver demands for most tabletop targets. That's mechanical geometry, not a calibration miss. On this body, teach-and-replay is the pragmatic path; chasing autonomy further would mean a different arm.
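The kind of check that exposes this limit can be sketched with a planar three-joint model. Everything numeric here is an assumption for illustration: the link lengths, the wrist limits, and the simplification to a vertical plane are not measurements of the SO-ARM101. The point is only that a fixed gripper pitch pins the wrist angle once the shoulder and elbow are solved, so a narrow wrist range rules out targets the first two joints can reach.

```python
import math
from typing import Optional, Tuple


def planar_ik(x: float, z: float, l1: float, l2: float) -> Optional[Tuple[float, float]]:
    """Elbow-up shoulder and elbow angles that place the wrist at (x, z), or None."""
    c = (x * x + z * z - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c) > 1.0:
        return None  # target lies outside the two-link workspace
    elbow = -math.acos(c)
    shoulder = math.atan2(z, x) - math.atan2(l2 * math.sin(elbow),
                                             l1 + l2 * math.cos(elbow))
    return shoulder, elbow


def wrist_for_pitch(shoulder: float, elbow: float, pitch: float) -> float:
    """Gripper pitch is the sum of the three joint angles, so the wrist is what is left."""
    return pitch - shoulder - elbow


def grasp_feasible(x: float, z: float, pitch: float, l1: float, l2: float,
                   wrist_limits: Tuple[float, float]) -> bool:
    """True only if the target is reachable and the required wrist angle is in range."""
    solution = planar_ik(x, z, l1, l2)
    if solution is None:
        return False
    wrist = wrist_for_pitch(solution[0], solution[1], pitch)
    low, high = wrist_limits
    return low <= wrist <= high
```

With the illustrative values `l1 = l2 = 0.12` and a straight-down pitch of `-math.pi / 2`, a target at `(0.20, 0.0)` needs a wrist angle of roughly -0.98 rad: feasible with limits of `(-1.6, 1.6)`, infeasible with `(-0.8, 0.8)`.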

R2 + R3 — A command the registry has to vouch for

This is the rung that matters. Raw control is easy; safe control is the hard part, and it's where the RCAN protocol earns its keep. Instead of the terminal poking the serial bus directly, every actuation request becomes a signed RCAN INVOKE envelope sent to the robot-md-gateway — the one process that owns the hardware. The gateway runs with REQUIRE_ENVELOPE_SIGNATURE=1, and before it will do anything it checks two signatures, resolving each signer's public key live from the Robot Registry Foundation (robotregistryfoundation.org):

  • Manifest provenance — the robot's ROBOT.md carries a signature footer, proving its declared capabilities and safety gates haven't been altered.
  • Envelope signature — the INVOKE itself is signed by a registered operator identity (here, bob-operator-2026), whose public key RRF publishes.

With a correctly signed envelope, a no-motion read_state call sails through — and the gateway hands back the arm's live joint angles and motor temperatures. The identity was verified against the public registry over the network, in real time:

authorized → executed
$ # operator-signed RCAN INVOKE  →  robot-md-gateway  →  read_state (no motion)
HTTP 200
{
  "ok": true,
  "manifest_kid": "bob-operator-2026",
  "scope": "READ",
  "tool_name": "read_state",
  "actuator_name": "so-arm101",
  "outcome_kind": "executed",
  "telemetry": {
    "positions": { "shoulder_pan": 0.589, "shoulder_lift": -0.568,
                   "elbow_flex": 1.157, "wrist_flex": 0.095,
                   "wrist_roll": 0.213, "gripper": 0.446 },
    "motor_temps_c": { "shoulder_pan": 34.0, "shoulder_lift": 40.0,
                       "elbow_flex": 33.0, "wrist_flex": 32.0,
                       "wrist_roll": 31.0, "gripper": 41.0 }
  }
}

The lock is only real if it stays shut for everyone else. So we tried to force it. Change a single field of the signed envelope in transit, sign with a key the registry has never seen, or strip the manifest's provenance footer — each one is refused, fail-closed, with a specific reason:

unverifiable → refused
# Same command, tampered in transit (one field changed after signing):
HTTP 403   { "deny": "envelope_signature", "reason": "signature did not verify" }

# Signed by a key the registry has never seen:
HTTP 403   { "deny": "envelope_signature", "reason": "kid rogue-operator-9999 not registered" }

# Manifest with no provenance signature footer:
HTTP 403   { "deny": "manifest_provenance", "reason": "no ROBOT-MD-SIG footer (signature absent)" }
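The order of checks behind those three refusals can be sketched as follows. This is not the gateway's code. Two substitutions are made so it runs on the standard library alone: an HMAC stands in for the asymmetric signature RCAN would use, and an in-memory dict stands in for live key resolution against RRF. The footer line format, the envelope field names, and the helper names are hypothetical; only the deny labels and reasons are taken from the output above.

```python
import hashlib
import hmac
import json
from typing import Dict, Optional, Tuple

# Stand-in for RRF lookup: kid -> key. The demo key is made up.
REGISTRY: Dict[str, bytes] = {"bob-operator-2026": b"demo-key-not-real"}
FOOTER = "ROBOT-MD-SIG"  # footer layout "<marker> <kid> <sig>" is an assumption


def sign(key: bytes, payload: bytes) -> str:
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def canonical(body: dict) -> bytes:
    """Stable byte encoding so signer and verifier hash the same thing."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def sign_manifest(text: str, kid: str, key: bytes) -> str:
    return f"{text}\n{FOOTER} {kid} {sign(key, text.encode())}"


def sign_envelope(body: dict, kid: str, key: bytes) -> dict:
    return {"body": body, "kid": kid, "sig": sign(key, canonical(body))}


def verify(kid: object, payload: bytes, sig: object) -> Optional[str]:
    """Return None when the signature checks out, else a refusal reason."""
    key = REGISTRY.get(kid) if isinstance(kid, str) else None
    if key is None:
        return f"kid {kid} not registered"
    expected = sign(key, payload).encode()
    if not isinstance(sig, str) or not hmac.compare_digest(expected, sig.encode()):
        return "signature did not verify"
    return None


def gateway_check(manifest: str, envelope: dict) -> Tuple[int, dict]:
    """Fail closed: any check that cannot be completed is a 403."""
    text, sep, footer = manifest.rpartition(f"\n{FOOTER} ")
    if not sep:
        return 403, {"deny": "manifest_provenance",
                     "reason": f"no {FOOTER} footer (signature absent)"}
    parts = footer.split()
    if len(parts) != 2:
        return 403, {"deny": "manifest_provenance", "reason": "malformed footer"}
    reason = verify(parts[0], text.encode(), parts[1])
    if reason:
        return 403, {"deny": "manifest_provenance", "reason": reason}
    try:
        kid, sig = envelope["kid"], envelope["sig"]
        payload = canonical(envelope["body"])
    except (KeyError, TypeError, ValueError):
        return 403, {"deny": "envelope_signature", "reason": "malformed envelope"}
    reason = verify(kid, payload, sig)
    if reason:
        return 403, {"deny": "envelope_signature", "reason": reason}
    return 200, {"ok": True, "outcome_kind": "executed"}
```

The design choice worth noting is that success is the last line, not the default: every early return is a refusal, so a forgotten branch or a malformed input ends in a 403 rather than in motion.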

That's the whole thesis in one screen: authorized → actuates; tampered or unknown → rejected. We proved it with read_state deliberately — a telemetry read moves nothing, so the trust machinery is exercised without spinning a motor. A motion command goes through the same identity check — both signatures, resolved live against RRF — but the authorization is deliberately stricter: actuation carries an ACTUATE scope, and the gateway flatly refuses an actuation scope from a read-tier caller (a read-only key can look, but it cannot move the arm). Same proof of who; a tighter gate on what.
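The split between "who" and "what" can be illustrated with a small scope check that would run only after identity has been verified. This is a sketch under assumptions: the tool table, the `move_joints` tool name, and the deny labels are hypothetical; only the READ and ACTUATE scope names and the read_state tool come from the text.

```python
from enum import IntEnum
from typing import Dict


class Scope(IntEnum):
    READ = 1
    ACTUATE = 2


# Hypothetical tool table; only read_state is named in the text.
REQUIRED_SCOPE: Dict[str, Scope] = {
    "read_state": Scope.READ,
    "move_joints": Scope.ACTUATE,
}


def authorize(tool_name: str, granted: Scope) -> dict:
    """Decide what an already-identified caller may do."""
    required = REQUIRED_SCOPE.get(tool_name)
    if required is None:
        # Fail closed on tools the table does not list.
        return {"ok": False, "deny": "unknown_tool"}
    if granted < required:
        return {"ok": False, "deny": "scope",
                "reason": f"{required.name} required, caller holds {granted.name}"}
    return {"ok": True, "scope": required.name}
```

For example, `authorize("read_state", Scope.READ)` is allowed, while `authorize("move_joints", Scope.READ)` is refused even though the caller's identity is valid.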

Every layer of this is free

Nothing in that ladder is behind a paywall. robot-md is pip install, Apache-2.0. RCAN is an open protocol — an independent standard for robot identity and authorization, the way DNS is for names. An RRF registration (your robot's permanent RRN) is free to mint. The OpenCastor runtime is Apache-2.0. The operator is Claude Code. A hobbyist with a Raspberry Pi and a low-cost arm can stand up this exact stack — signed, registry-anchored, fail-closed control — at zero software cost. That's the point of shipping opencastor.com free: the safe path shouldn't be the expensive one.

An honest scorecard

Rung                            Status
R0 raw teach-and-replay         ✓ ran end-to-end; grasp undershoots (tuning)
R1 perception (vision_find)     ✓ fixed (depthai v2 port); brick detected
Full-auto IK pick               ✗ blocked by the arm's mechanical envelope
R2 signed INVOKE → gateway      ✓ executed (HTTP 200, telemetry)
R3 RRF identity + fail-closed   ✓ verified live; tamper/rogue refused
R4 OpenCastor cloud bridge      - not exercised in this session

Two more honest notes: the signed actuation path was proven with a read, not a physical move (by choice — the arm's grasp still needs tuning, and the gating is identical either way); and the depthai-v2 perception fix is committed locally, with an upstream pull request still to come.

Reproduce it on your robot

The full path, from bare Pi to a registry that vouches for your commands:

  1. pip install robot-md, then robot-md init <name> --preset so-arm101 --register to mint a free RRN.
  2. robot-md install-gateway — the one process that owns the hardware.
  3. Register an operator identity with RRF and hold its signing key; that key is what authorizes your INVOKEs.
  4. Sign + send: a signed RCAN INVOKE to the gateway. Verified against the registry, it actuates; tampered, it doesn't.
  5. Point a Claude Code terminal at it. Ask "what do you see?" and "pick up the brick" — and watch the gateway insist on a signature first.

A simple terminal can drive a real robot. The interesting part is that a free, open registry can stand between the terminal and the motors and refuse anything it can't verify. That's the difference between a demo and a system you'd trust near your hands.