Claude Code hacked my (rotary) phone

A rather technical report from Claude itself on how it reverse-engineered the hardware protocol of a Viking VoIP phone

Viking K-1900D-IP phone at the ElevenLabs booth

As part of my work on putting a voice agent in a British red telephone booth, I needed to reprogram a Viking K-1900D-IP VoIP phone to call the agent.

The manufacturer only provided Windows software, and I only had a Mac. So I decided to bypass the vendor software and program the hardware directly.

Interestingly, our team had already tried to crack this phone in 2025. Back then they used plain ChatGPT and failed.

This year I closed the loop by connecting Claude Code directly to the phone and letting it iterate autonomously, and it succeeded. The whole process took about a day. This illustrates the leap in AI capabilities over the last year.

The interaction itself was also quite interesting. My role in the project was more that of an assistant to the agent. For example, at one point it asked me to pick up the phone and count how many beeps I could hear.

The article below is a report of the steps Claude took (co-created with Claude!).

Setup

Phone internals with controller, router, and PoE injector

I connected the phone to the mini router via a PoE injector, and my laptop to the same router via WiFi.

I asked Claude to program the phone with the SIP trunk I needed.

At first, it spent quite some time trying to find documentation online.

Then it gave up and suggested connecting to the phone directly and figuring out the protocol.

Finding the phone

Claude found the phone on the local network and scanned for open ports:

$ nmap -Pn -p- 192.168.8.235

PORT      STATE SERVICE
10001/tcp open  scp-config
107/tcp   open  rtelnet

Port 10001 turned out to be a Lantronix XPort serial-to-TCP tunnel — a red herring. The actual protocol lives on port 107.

The text protocol

Claude then sent a test sequence VIKING and received ER[VIKING].

This hinted at an XX[value] response format, but everything else was unknown. What followed was dozens of probes, most of them failing:

Tried random two-letter codes. Some worked:

That confirmed the read syntax: just send two letters.

Tried write formats:

Gradually the protocol revealed itself:

Operation   Syntax             Response
Read        XX\r\n             XX[value]
Write       XX=value\r\n       XX[value]
Error       anything invalid   ER[input]
Quit        Q\r\n              GOODBYE
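The table can be captured in a few helper functions. This is a minimal sketch: the function names are illustrative, and the reply pattern assumes that every register reply is two uppercase letters followed by a bracketed value, as in the examples above.

```python
import re

# Register replies look like XX[value]; errors use the pseudo-register ER.
REPLY_RE = re.compile(r"^([A-Z]{2})\[(.*)\]$")


def build_read(register: str) -> bytes:
    """Encode a read request: the two-letter register followed by CRLF."""
    return f"{register}\r\n".encode("ascii")


def build_write(register: str, value: str) -> bytes:
    """Encode a write request: REGISTER=value followed by CRLF."""
    return f"{register}={value}\r\n".encode("ascii")


def parse_reply(line: str) -> tuple[str, str]:
    """Split 'XX[value]' into (register, value); raise on ER[...] or junk."""
    match = REPLY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised reply: {line!r}")
    register, value = match.groups()
    if register == "ER":
        raise ValueError(f"phone rejected input: {value!r}")
    return register, value
```

The GOODBYE reply to Q does not follow the XX[value] shape, so this parser reports it as unrecognised; a caller would need to handle the quit case separately.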

It came up with the hypothesis that all commands are two-letter sequences.

Then it suggested brute-forcing all two-letter sequences to see which of them work.

It wrote a script to try all 676 combinations:

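# Try every two-letter command (AA..ZZ) and print the ones that do not error
for first in {A..Z}; do
  for second in {A..Z}; do
    cmd="${first}${second}"
    # Send the command with CRLF; nc gives up after 1 second of silence
    result=$(printf '%s\r\n' "$cmd" | nc -w 1 192.168.8.235 107)
    # ER[...] means the phone rejected the input
    [[ ! "$result" =~ "ER[" ]] && echo "$cmd -> $result"
  done
done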
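The same sweep can be sketched in Python, which makes the filtering step easy to test without a phone attached. The `query` callable is a placeholder for whatever sends one command to port 107 and returns the reply text; it is not part of any Viking tooling. Unlike the shell loop, this version also skips empty replies.

```python
from itertools import product
from string import ascii_uppercase
from typing import Callable, Dict, List


def two_letter_codes() -> List[str]:
    """All 26 * 26 = 676 candidate commands, AA through ZZ."""
    return ["".join(pair) for pair in product(ascii_uppercase, repeat=2)]


def sweep(query: Callable[[str], str]) -> Dict[str, str]:
    """Send every candidate and keep replies that are not ER[...] errors."""
    found: Dict[str, str] = {}
    for code in two_letter_codes():
        reply = query(code).strip()
        # Drop timeouts (empty replies) and explicit rejections.
        if reply and not reply.startswith("ER["):
            found[code] = reply
    return found
```

Because some two-letter codes turned out to be control commands, a sweep like this can change device state, so it is best run against a device you are prepared to reset.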

80+ valid registers came back:

SIP settings (all empty — unconfigured): UC (username), UD (domain), UP (password), UR (registrar), UU (auth ID), UX (outbound proxy)

Network: WA = 192.168.8.235 (IP), WM = 18e80f513f66 (MAC)

Device: MN = VIKING_MK64_Vik02 (device name), MB = R8.44.2236 (firmware build), MC = SGTL5000 0xA011 (audio codec)

Plus some others: audio volume, speed dial slots, NTP server, baud rate, and various control commands.

Now the entire protocol was mapped. This seemed like a victory, but it’s never that easy :)

The wall

We set all the SIP credentials, rebooted, and:

UC  -->  UC[]
UD  -->  UD[]
UP  -->  UP[]
SR  -->  SR[NOT_REGISTERED]

Everything gone!

The text protocol writes to RAM, but the SIP stack reads from flash. They’re separate stores.

What followed was a long, frustrating search for a save command.

WR=1, WF=1, CE, CU — some return OK, but none make SIP settings survive a reboot.

Claude tried everything, but couldn’t find a way to persist the settings to flash.

It realised that the phone’s Windows software probably used some other mechanism to save changes.

It then suggested running that software in a VM and intercepting the traffic to learn how it works.

The MITM

A Windows VM in UTM couldn’t reach the phone directly over WiFi, due to driver incompatibility.

So Claude set up a man-in-the-middle chain:

Windows VM (Viking software)  -->  :10107  -->  Mac (TCP proxy)  -->  :107  -->  Viking phone (port 107)

UTM’s bridged networking didn’t work over WiFi. We ended up with NAT networking + a port forward in PowerShell + a Python TCP proxy on the Mac.
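A logging TCP proxy of the kind described here can be quite small. The following is a sketch rather than the proxy Claude actually wrote: the function names and the in-memory `log` list are assumptions, and it tears down both directions as soon as either side closes, which is adequate for a simple request/response exchange.

```python
import socket
import threading


def pipe(src, dst, label, log):
    """Copy bytes from src to dst until src closes, recording each chunk."""
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:
                break
            log.append((label, chunk))
            dst.sendall(chunk)
    except OSError:
        pass  # the thread copying the other direction closed the sockets
    finally:
        # Shutting down both ends unblocks the other direction's recv().
        for sock in (src, dst):
            try:
                sock.shutdown(socket.SHUT_RDWR)
            except OSError:
                pass
            sock.close()


def serve(server, target, log):
    """Relay every client accepted on `server` to `target`, logging both ways."""
    while True:
        client, _ = server.accept()
        upstream = socket.create_connection(target)
        for args in ((client, upstream, "->"), (upstream, client, "<-")):
            threading.Thread(target=pipe, args=(*args, log), daemon=True).start()
```

With the ports from the setup above, the Mac side would be started as `serve(socket.create_server(("0.0.0.0", 10107)), ("192.168.8.235", 107), log)`, and each logged chunk can be printed as hex with `chunk.hex(" ")`.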

Once connected, the Viking software read the phone’s state using the text commands we already knew. Then I entered SIP credentials and clicked Apply.

The proxy captured something new — TS A binary commands, one command per byte, writing directly to flash memory:

TS A  a7 6d 65 68 9a 6c 6a  3c  73  26  20
TS A  a7 6d 65 68 9a 6c 6a  3d  69  2f  20
TS A  a7 6d 65 68 9a 6c 6a  3e  70  27  20

Format: TS A <7-byte prefix> <addr> <data> <checksum> 0x20

The 7-byte prefix is a device-specific identifier, constant per phone. Its first byte encodes the operation: 0xa7 for write, 0xa6 for read, 0xa5 for erase.

The data bytes are plaintext ASCII — no encryption, just a trivial checksum for integrity.
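To make the field layout concrete, here is a small sketch that parses one row of the hex dump shown above into its fields. It works on the printed representation of the capture, not on the wire encoding, and the `FlashFrame` and `parse_capture_row` names are illustrative.

```python
from typing import NamedTuple


class FlashFrame(NamedTuple):
    prefix: bytes   # 7 bytes; the first one selects write/read/erase
    addr: int       # flash address
    data: int       # one data byte
    checksum: int   # integrity byte


# Operation codes carried in the first prefix byte, as described above.
OPERATIONS = {0xA7: "write", 0xA6: "read", 0xA5: "erase"}


def parse_capture_row(row: str) -> FlashFrame:
    """Parse a dump row such as 'TS A  a7 6d 65 68 9a 6c 6a  3c  73  26  20'."""
    tokens = row.split()
    # "TS", "A", then 7 prefix bytes + addr + data + checksum + terminator.
    if tokens[:2] != ["TS", "A"] or len(tokens) != 13:
        raise ValueError(f"not a TS A row: {row!r}")
    raw = bytes(int(tok, 16) for tok in tokens[2:])
    if raw[-1] != 0x20:
        raise ValueError("missing 0x20 terminator")
    return FlashFrame(prefix=raw[:7], addr=raw[7], data=raw[8], checksum=raw[9])
```

Parsing the first captured row yields address 0x3c and data byte 0x73, the letter "s", with a prefix whose first byte maps to "write".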

After all writes, the save sequence: CE -> GB=40 -> ME=1 -> MR=1

The save commands were the same ones we’d found during the brute force and dismissed as useless. They don’t work with text protocol writes alone — but after binary flash writes, this exact sequence commits everything to persistent storage and reboots.

Cracking the checksum

Claude analysed the captured writes for the string "sip":

addr: 0x3c  data: 0x73  checksum: 0x26   # 's' = 0x73
addr: 0x3d  data: 0x69  checksum: 0x2f   # 'i' = 0x69
addr: 0x3e  data: 0x70  checksum: 0x27   # 'p' = 0x70

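And cracked the checksum: address, data byte, and checksum always add up to the same constant K, so checksum = (K - address - data_byte) & 0xFF.

Verify: 0x3c + 0x73 + 0x26 = 0xD5, 0x3d + 0x69 + 0x2f = 0xD5, and 0x3e + 0x70 + 0x27 = 0xD5, so K = 0xD5 reproduces all three captures. (Taking K = 0x195 instead gives 0xE6, 0xEF, and 0xE7, which match the captures only in the low six bits.)

It matches! The checksum is a single subtraction.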
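A subtraction-style rule can be checked mechanically against the captures: if checksum = (K - address - data) mod 256, then address + data + checksum mod 256 must give the same K for every row. The sketch below recovers K from each of the three "sip" rows; the function names are illustrative. Three rows only pin down K modulo 256, and they cannot rule out other rules that happen to agree on these values.

```python
# (address, data, checksum) triples from the captured writes of "sip"
CAPTURES = [(0x3C, 0x73, 0x26), (0x3D, 0x69, 0x2F), (0x3E, 0x70, 0x27)]


def implied_constant(addr: int, data: int, checksum: int) -> int:
    """Constant K for which checksum == (K - addr - data) mod 256."""
    return (addr + data + checksum) & 0xFF


def subtraction_checksum(addr: int, data: int, constant: int) -> int:
    """Checksum of one address/data pair under the subtraction rule."""
    return (constant - addr - data) & 0xFF


# If the rule holds, every captured row implies the same constant.
constants = {implied_constant(*row) for row in CAPTURES}
print({hex(k) for k in constants})
```

For these three rows the set contains a single value, 0xd5, so the rows are mutually consistent under the subtraction rule.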

We then dumped all 256 bytes of the primary flash region and mapped it:

Address     Contents
0x0e        Security code (default 845464, which spells “VIKING” on a phone keypad)
0x3c        SIP server / registrar
0x7c        Auto-dial phone number 1
0x90-0xcb   Phone numbers 2-4
0xcf        Firmware ID (read-only)
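The keypad claim for the default security code is easy to confirm with the standard phone keypad letter groups. This is a small illustrative check; the helper name is my own.

```python
# Standard phone keypad letter groups (the ITU-T E.161 layout)
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_keypad_digits(word: str) -> str:
    """Translate a word into the digits you would press to spell it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.upper())


print(to_keypad_digits("VIKING"))  # prints 845464
```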

The two layers

The full picture of what we’d been fighting:

Layer 1: Text protocol (RAM, volatile). Two-letter commands, read/write to RAM. The SIP stack doesn’t read from here.

Layer 2: Binary flash protocol (persistent). The TS command reads and writes directly to flash memory, one byte at a time, with a device-specific auth prefix and a trivial checksum.

For a factory-fresh phone, you must use the binary protocol to write the SIP server to flash. This activates the SIP stack. After that, text protocol writes with the save sequence are sufficient for subsequent changes.

Victory

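# Write SIP server to flash (binary protocol)
# `send` (transmits one command to port 107) and `prefix` (the device prefix
# from the capture, as hex text) are assumed to be defined elsewhere.
server = "vikingphone.sip.us1.twilio.com"
for i, byte in enumerate(server.encode()):
    addr = 0x3c + i
    # 0xD5 is the constant that reproduces the three captured "sip" checksums
    checksum = (0xD5 - addr - byte) & 0xFF
    send(f"TS A {prefix} {addr:02x} {byte:02x} {checksum:02x} 20")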

# Set SIP credentials (text protocol)
send("UC=vikingphone")
send("UP=s3cret")
send("UU=vikingphone")
send("UR=vikingphone.sip.us1.twilio.com")

# Commit and reboot
send("CE"); send("GB=40"); send("ME=1"); send("MR=1")

After reboot:

SR  -->  SR[REGISTERED]

The phone registers with Twilio. A TwiML bin routes outbound calls to an ElevenAgent. Pick up the handset and an AI answers.

The phone in action at the ElevenLabs booth
swyx using the phone to call ElevenAgent

Parting Thoughts

As software feels “solved” by AI today, it’s interesting to see how far AI can diffuse into the physical world. This project made me feel like the answer is: surprisingly far.

I also built it in the middle of Mythos cyber hell. Even though this was a baby version of “hacking”, it was still amusing (and slightly alarming) how much I could achieve without any sort of expertise in low level / networks / cyber.

Finally, I open sourced a skill file describing the protocol in detail, so that the next person doesn’t need to burn through the tokens to reverse engineer it again. A skill is a product.
