A first-pass on-call operator for Claude

The pager that stays quiet
unless it's a real fire.

Nightwatch reads an incoming alert and makes the call your on-call engineer would make in the first 90 seconds — then drafts the message. Built for the team without an AIOps budget, where the scarcest resource is uninterrupted sleep.

Get it free on GitHub →

Watch it make three calls

It decides — it never kicks the question back. PAGE · TICKET · SUPPRESS · FLAG, every time.

Watch it decide

Watch it make the call, in real time.

No narration, no edits. Three alerts in, three different decisions out — the way it would stand your watch tonight.

Three alerts · three calls · 9:16

The problem

Alerts all night. Most of them are nothing.

One to three people share the pager. There's no NOC, no SRE on staff, no AIOps platform — just a phone that buzzes at 3am for a disk-space warning that always clears itself.

So you do the math every on-call does: read the alert, guess the blast radius, decide in 90 seconds whether it's worth waking up for. The decision isn't hard. The volume is what burns you out — and the night it actually matters is the night you've learned to swipe the buzz away.

90 sec

the triage call a human makes per alert

of those calls should require waking that human

What it does

Every signal leaves with one decision.

No "here's what I found, what do you want to do?" An operator decides and routes the work. You come back to a correct page — or a quiet night.

● PAGE

A real fire. Wake a human now, with the blast radius and a first check already drafted.

● TICKET

Real, but it can wait for daylight. Drafted with severity, impact, and a suggested owner.

● SUPPRESS

Known noise or self-healing. Logged with the exact tripwire that would change its mind.

● FLAG

Genuinely your call — rare. It holds a safe default and asks one specific question.

See it decide

Three real alerts. Three different calls.

Severity is computed from business impact, not the alert's own label — so a global 0.2% blip can outrank a "CRITICAL," and a "CRITICAL" can be safely muted.

Real outputs — produced by the folder running in Claude, not mockups.

Prometheus · 03:10 UTC

Disk usage on prod-db-1 at 88%. Severity: warning.

↓

⚪ SUPPRESS

Known nightly log-rotation spike (normal ≤92%, clears by 03:30). Escalate-if: >92%, or not below 80% by 03:45, or free space <5%.

Known self-healing pattern within its window — suppress the human cost, keep the safety net.
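The escalate-if line above reads as three independent tripwires, any one of which ends the suppression. Here is a minimal Python sketch of that check, using the thresholds from this example card. Nightwatch itself is plain-Markdown rules read by Claude, so this is only an illustration of the logic: the names `DiskSample` and `should_escalate` are hypothetical, and the sketch assumes all readings fall on the same UTC night.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class DiskSample:
    """One reading behind the disk alert. All field names are hypothetical."""
    at: time          # UTC time of the reading
    used_pct: float   # disk usage, 0-100
    free_pct: float   # free space, 0-100


def should_escalate(s: DiskSample) -> bool:
    """Return True if any tripwire from the example card is crossed."""
    over_ceiling = s.used_pct > 92                             # above the known-normal spike
    not_recovered = s.at >= time(3, 45) and s.used_pct >= 80   # did not clear in its window
    nearly_full = s.free_pct < 5                               # free-space floor
    return over_ceiling or not_recovered or nearly_full


# The alert as received at 03:10: inside the known pattern, so it stays suppressed.
print(should_escalate(DiskSample(at=time(3, 10), used_pct=88, free_pct=12)))
```

The point of writing the tripwires down is that a suppression is never unconditional: the same alert at 03:50 with usage still at 85% would cross the recovery tripwire and escalate.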

Sentry · 14:30 local

500s on /settings/profile, ~3% of users, stable 20 min, workaround: refresh.

↓

🟡 TICKET · SEV3

Non-core page, degraded not down, workaround confirmed, stable. Queue this week. (Check: any deploy in the last 30 min?)

No reason to pull someone off focus — a fix-in-daylight bug, and daylight is now.

Datadog · 02:40 UTC

503s on the API write path, 1.5% globally, stable — all from org_2290.

↓

🟠 PAGE · SEV2 · SLA

1.5% globally = 100% of Acme Co.'s write path (Priority SLA, 4-hr clock running). Paging on-call + account owner.

Raw volume is small; business impact isn't. Severity is about who's down, not how many.
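The arithmetic behind this call is worth seeing once: a small global error rate can be the whole of one customer's traffic. The sketch below is an illustration only. The request counts are invented so that the global rate lands on the 1.5% from the alert, the other org identifiers are placeholders, and `impact_by_org` is a hypothetical helper, not part of Nightwatch.

```python
from collections import Counter


def impact_by_org(requests: Counter, failures: Counter) -> dict[str, float]:
    """Failure rate per org: failed requests divided by that org's total requests."""
    return {org: failures.get(org, 0) / total
            for org, total in requests.items() if total}


# Illustrative counts only, chosen so the global rate matches the alert's 1.5%.
requests = Counter({"org_2290": 150, "org_0001": 6000, "org_0002": 3850})
failures = Counter({"org_2290": 150})

global_rate = sum(failures.values()) / sum(requests.values())  # 150 / 10000
per_org = impact_by_org(requests, failures)
worst_org = max(per_org, key=per_org.get)

print(f"global: {global_rate:.1%}, worst org: {worst_org} at {per_org[worst_org]:.0%}")
```

Grouping failures by customer before judging severity is what turns a number that looks ignorable into one that pages.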

How it works

A minute to install. Then it stands the watch.

Drop in the folder

Put the Nightwatch folder into a Claude Project. Claude becomes the operator — identity, rules, examples, and reference tables all loaded.

Teach it your world, once

Fill the reference/ templates with your services, severity thresholds, SLA customers, and recurring false alarms. ~30 minutes.

Paste any alert

It returns one decision, the routed action or drafted message, and a one-line "why" you can audit or override in five seconds.
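If you want to log or post-process those replies in your own tooling, the three parts map onto a small record. This is a sketch under assumptions: Nightwatch returns text from Claude, not a Python object, and the `Decision` and `Call` names are hypothetical. The sample values are taken from the Sentry example above.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    """The four outcomes named on this page."""
    PAGE = "PAGE"
    TICKET = "TICKET"
    SUPPRESS = "SUPPRESS"
    FLAG = "FLAG"


@dataclass(frozen=True)
class Call:
    """One triage result: the decision, what was routed or drafted, and why."""
    decision: Decision
    action: str  # routed action or drafted message
    why: str     # one-line rationale a human can audit or override

    def summary(self) -> str:
        return f"{self.decision.value}: {self.action} ({self.why})"


call = Call(
    decision=Decision.TICKET,
    action="Queue this week",
    why="Non-core page, degraded not down, workaround confirmed, stable",
)
print(call.summary())
```

Keeping the "why" as its own field is what makes the five-second audit possible: you can scan a night's worth of calls without opening any of the drafted messages.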

The methodology

Folders as architecture. Each file does one job.

nightwatch/
├── identity.md      # who it is, what it refuses
├── rules.md         # the decision logic (the heart)
├── examples.md      # worked calls + edge cases
├── README.md        # how to use it
└── reference/
    ├── severity-matrix.md   # blast × impact × trend
    ├── routing-table.md     # sev → who, when
    ├── known-noise.md       # your false alarms
    ├── sla-customers.md     # who overrides the metrics
    └── response-templates.md # the drafted artifacts
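Before dropping the folder into a Claude Project, a quick check that nothing went missing in a fork or a copy can save a confusing first run. A minimal sketch, assuming the layout is exactly the tree above; `missing_files` is a hypothetical helper, not something that ships with Nightwatch.

```python
from pathlib import Path

# Relative paths taken from the tree above.
EXPECTED = [
    "identity.md",
    "rules.md",
    "examples.md",
    "README.md",
    "reference/severity-matrix.md",
    "reference/routing-table.md",
    "reference/known-noise.md",
    "reference/sla-customers.md",
    "reference/response-templates.md",
]


def missing_files(root: Path) -> list[str]:
    """Return the expected relative paths that do not exist under root."""
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


# Usage: point it at your local copy; an empty list means the folder is complete.
print(missing_files(Path("nightwatch")))
```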

The decision logic is a short-circuiting flow: security and data-loss gates fire first, then noise, then dedup, then severity from the matrix, then the SLA override. The first step that produces an outcome wins.

Nothing is a black box. The rules are plain English in a file you can read, edit, and trust — which is the whole point.
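The short-circuiting flow described above can be pictured as an ordered list of steps where the first one to return an outcome ends the evaluation. The sketch below is one possible reading, not the contents of rules.md. Several parts are assumptions made for illustration: the alert is a plain dict with hypothetical keys, duplicates are suppressed, SEV1 and SEV2 page while lower severities become tickets, the severity step defers to the SLA step for SLA customers, and an alert no step can classify falls through to FLAG.

```python
from typing import Callable, Optional

Alert = dict  # hypothetical shape: a plain dict of alert fields
Step = Callable[[Alert], Optional[str]]


def security_gate(a: Alert) -> Optional[str]:
    # Security and data-loss gates fire first.
    return "PAGE" if a.get("security") or a.get("data_loss") else None


def known_noise(a: Alert) -> Optional[str]:
    return "SUPPRESS" if a.get("known_noise") else None


def dedup(a: Alert) -> Optional[str]:
    # Assumption: a duplicate of an alert already handled is suppressed.
    return "SUPPRESS" if a.get("duplicate") else None


def severity(a: Alert) -> Optional[str]:
    sev = a.get("sev")
    # Assumption: unknown severity or an SLA customer defers to later steps.
    if sev is None or a.get("sla_customer"):
        return None
    return "PAGE" if sev <= 2 else "TICKET"


def sla_override(a: Alert) -> Optional[str]:
    return "PAGE" if a.get("sla_customer") else None


STEPS: list[Step] = [security_gate, known_noise, dedup, severity, sla_override]


def decide(a: Alert) -> str:
    """Run the steps in order; the first one that produces an outcome wins."""
    for step in STEPS:
        outcome = step(a)
        if outcome is not None:
            return outcome
    return "FLAG"  # assumption: nothing matched, so hold and ask a human


print(decide({"sev": 3, "sla_customer": True}))
```

Ordering is the design choice that matters here: because the security gate sits ahead of the noise check, an alert that matches both can never be muted by a known-noise entry.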

For builders

Want to build your own operator?

Nightwatch is one worked example. The Operator's Handbook is the method behind it — how to turn Claude into something that decides instead of chats.

✓ The 5-file structure, and how to write decision logic with real thresholds, not "use good judgment"

✓ How to handle the edge cases that break most tools, and keep the human gate where it matters

✓ New operators as they ship from The Quiet Ai, delivered with the handbook

No catch

It's a folder, not a SaaS.

Nightwatch is free. There's no account, no agent to install, no dashboard, no per-seat pricing. It's a folder of plain Markdown you drop into Claude — and it's running in a minute. Take it, fork it, make it yours.

Get Nightwatch on GitHub →