Compare commits: unit10-arc...unit1-read (1 commit, SHA1 96b5a0fa1b)
@@ -206,7 +206,7 @@ The management API (REST + gRPC) uses MCIAS bearer tokens:
| GET | `/v1/zones/{zone}` | Bearer | Get zone details |
| PUT | `/v1/zones/{zone}` | Admin | Update zone SOA parameters |
| DELETE | `/v1/zones/{zone}` | Admin | Delete zone and all its records |
| GET | `/v1/zones/{zone}/records` | Bearer | List records in a zone (optional `?name=`/`?type=` filters) |
| GET | `/v1/zones/{zone}/records` | Bearer | List records in a zone |
| POST | `/v1/zones/{zone}/records` | Admin | Create a record |
| GET | `/v1/zones/{zone}/records/{id}` | Bearer | Get a record |
| PUT | `/v1/zones/{zone}/records/{id}` | Admin | Update a record |
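
As a rough illustration of the record-listing row that documents `?name=`/`?type=` filters, the sketch below builds such a request in Go. The helper name `newListRecordsRequest`, the base URL, and the token value are assumptions made for the example; only the path, the query parameter names, and the bearer scheme come from the table above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newListRecordsRequest (hypothetical helper) builds a GET request for
// /v1/zones/{zone}/records. An empty name or recordType means "no filter",
// so the corresponding query parameter is omitted.
func newListRecordsRequest(baseURL, zone, name, recordType, token string) (*http.Request, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return nil, fmt.Errorf("parse base URL: %w", err)
	}
	u = u.JoinPath("v1", "zones", zone, "records")

	q := url.Values{}
	if name != "" {
		q.Set("name", name)
	}
	if recordType != "" {
		q.Set("type", recordType)
	}
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	// The management API expects an MCIAS bearer token.
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// Address and token are placeholders for illustration only.
	req, err := newListRecordsRequest("https://localhost:8443",
		"svc.mcp.metacircular.net", "metacrypt", "A", "example-token")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Leaving both filters empty yields the unfiltered listing, which matches the row without the filter note.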
@@ -229,8 +229,6 @@ service ZoneService {
}

service RecordService {
  // ListRecords returns records in a zone. The name and type fields in
  // ListRecordsRequest are optional filters; omit them to return all records.
  rpc ListRecords(ListRecordsRequest) returns (ListRecordsResponse);
  rpc CreateRecord(CreateRecordRequest) returns (Record);
  rpc GetRecord(GetRecordRequest) returns (Record);
@@ -300,31 +298,6 @@ Response 201:
}
```

### gRPC Usage Examples

**List zones with grpcurl:**
```bash
grpcurl -cacert ca.pem \
  -H "Authorization: Bearer $TOKEN" \
  mcns.svc.mcp.metacircular.net:9443 mcns.v1.ZoneService/ListZones
```

**Create a record with grpcurl:**
```bash
grpcurl -cacert ca.pem \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"zone":"svc.mcp.metacircular.net","name":"metacrypt","type":"A","value":"192.168.88.181","ttl":300}' \
  mcns.svc.mcp.metacircular.net:9443 mcns.v1.RecordService/CreateRecord
```

**List records with name filter:**
```bash
grpcurl -cacert ca.pem \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"zone":"svc.mcp.metacircular.net","name":"metacrypt"}' \
  mcns.svc.mcp.metacircular.net:9443 mcns.v1.RecordService/ListRecords
```

## Configuration

```toml
21
CLAUDE.md
@@ -56,27 +56,6 @@ deploy/ Docker, systemd, install scripts, examples
- `srv/` — local dev runtime data
- `gen/` — regenerated from proto, not hand-edited

## Shared Library

MCNS uses `mcdsl` (git.wntrmute.dev/kyle/mcdsl) for shared platform packages:
auth, db, config, httpserver, grpcserver. These provide MCIAS authentication,
SQLite database helpers, TOML config loading, and TLS-configured HTTP/gRPC
server scaffolding.

## Testing Patterns

- Use stdlib `testing` only. No third-party test frameworks.
- Tests use real SQLite databases in `t.TempDir()`. No mocks for databases.

## Key Invariants

- **SOA serial format**: YYYYMMDDNN, auto-incremented on every record mutation.
  If the date prefix matches today, NN is incremented. Otherwise the serial
  resets to today with NN=01.
- **CNAME exclusivity**: Enforced at the DB layer within transactions. A name
  cannot have both CNAME and A/AAAA records. Attempts to violate this return
  `ErrConflict`.

## Critical Rules

1. **REST/gRPC sync**: Every REST endpoint must have a corresponding gRPC
42
README.md
Normal file
@@ -0,0 +1,42 @@
# MCNS

Metacircular Networking Service -- an authoritative DNS server for the
Metacircular platform. MCNS serves DNS zones backed by SQLite, forwards
non-authoritative queries to upstream resolvers, and exposes a gRPC and
REST management API authenticated through MCIAS. Records are updated
dynamically via the API and visible to DNS immediately on commit.

## Quick Start

Build the binary:

```bash
make mcns
```

Copy and edit the example configuration:

```bash
cp deploy/examples/mcns.toml /srv/mcns/mcns.toml
# Edit TLS paths, database path, MCIAS URL, upstream resolvers
```

Run the server:

```bash
./mcns server --config /srv/mcns/mcns.toml
```

The server starts three listeners:

| Port | Protocol | Purpose |
|------|----------|---------|
| 53 | UDP + TCP | DNS (no auth) |
| 8443 | TCP | REST management API (TLS, MCIAS auth) |
| 9443 | TCP | gRPC management API (TLS, MCIAS auth) |

## Documentation

- [ARCHITECTURE.md](ARCHITECTURE.md) -- full technical specification, database schema, API surface, and security model.
- [RUNBOOK.md](RUNBOOK.md) -- operational procedures and incident response for operators.
- [CLAUDE.md](CLAUDE.md) -- context for AI-assisted development.
242
RUNBOOK.md
Normal file
@@ -0,0 +1,242 @@
# MCNS Runbook

## Service Overview

MCNS is an authoritative DNS server for the Metacircular platform. It
listens on port 53 (UDP+TCP) for DNS queries, port 8443 for the REST
management API, and port 9443 for the gRPC management API. Zone and
record data is stored in SQLite. All management operations require MCIAS
authentication; DNS queries are unauthenticated.

## Health Checks

### CLI

```bash
mcns status --addr https://localhost:8443
```

With a custom CA certificate:

```bash
mcns status --addr https://localhost:8443 --ca-cert /srv/mcns/certs/ca.pem
```

Expected output: `ok`

### REST

```bash
curl -k https://localhost:8443/v1/health
```

Expected: HTTP 200.
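
For monitoring scripts that should verify the certificate rather than skip verification as `curl -k` does, a minimal Go sketch of the same check might look like this. The helper names `newClient` and `checkHealth` and the 5-second timeout are assumptions; the URL path and CA location are taken from the examples above.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"os"
	"time"
)

// newClient returns an HTTP client that trusts the CA bundle at caPath.
// An empty caPath falls back to the system trust store.
func newClient(caPath string) (*http.Client, error) {
	tlsCfg := &tls.Config{MinVersion: tls.VersionTLS12}
	if caPath != "" {
		pemBytes, err := os.ReadFile(caPath)
		if err != nil {
			return nil, fmt.Errorf("read CA: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pemBytes) {
			return nil, errors.New("no certificates found in CA file")
		}
		tlsCfg.RootCAs = pool
	}
	return &http.Client{
		Timeout:   5 * time.Second, // assumed value, not from the runbook
		Transport: &http.Transport{TLSClientConfig: tlsCfg},
	}, nil
}

// checkHealth returns nil when GET {baseURL}/v1/health answers HTTP 200.
func checkHealth(c *http.Client, baseURL string) error {
	resp, err := c.Get(baseURL + "/v1/health")
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	c, err := newClient("/srv/mcns/certs/ca.pem")
	if err != nil {
		fmt.Println("client setup failed:", err)
		return
	}
	if err := checkHealth(c, "https://localhost:8443"); err != nil {
		fmt.Println("unhealthy:", err)
		return
	}
	fmt.Println("ok")
}
```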

### gRPC

Use the `AdminService.Health` RPC on port 9443. This method is public
(no auth required).

### DNS

```bash
dig @localhost svc.mcp.metacircular.net SOA +short
```

A valid SOA response confirms the DNS listener and database are working.

## Common Operations

### Start the Service

1. Verify config exists: `ls /srv/mcns/mcns.toml`
2. Start the container:
   ```bash
   docker compose -f deploy/docker/docker-compose-rift.yml up -d
   ```
3. Verify health:
   ```bash
   mcns status --addr https://localhost:8443
   ```

### Stop the Service

1. Stop the container:
   ```bash
   docker compose -f deploy/docker/docker-compose-rift.yml stop mcns
   ```
2. MCNS handles SIGTERM gracefully and drains in-flight requests (30s timeout).

### Restart the Service

1. Restart the container:
   ```bash
   docker compose -f deploy/docker/docker-compose-rift.yml restart mcns
   ```
2. Verify health:
   ```bash
   mcns status --addr https://localhost:8443
   ```

### Backup (Snapshot)

1. Run the snapshot command:
   ```bash
   mcns snapshot --config /srv/mcns/mcns.toml
   ```
2. The snapshot is saved to `/srv/mcns/backups/mcns-YYYYMMDD-HHMMSS.db`.
3. Verify the snapshot file exists and has a reasonable size:
   ```bash
   ls -lh /srv/mcns/backups/
   ```

### Restore from Snapshot

1. Stop the service (see above).
2. Back up the current database:
   ```bash
   cp /srv/mcns/mcns.db /srv/mcns/mcns.db.pre-restore
   ```
3. Copy the snapshot into place:
   ```bash
   cp /srv/mcns/backups/mcns-YYYYMMDD-HHMMSS.db /srv/mcns/mcns.db
   ```
4. Start the service (see above).
5. Verify the service is healthy:
   ```bash
   mcns status --addr https://localhost:8443
   ```
6. Verify zones are accessible by querying DNS:
   ```bash
   dig @localhost svc.mcp.metacircular.net SOA +short
   ```

### Log Inspection

Container logs:

```bash
docker compose -f deploy/docker/docker-compose-rift.yml logs --tail 100 mcns
```

Follow logs in real time:

```bash
docker compose -f deploy/docker/docker-compose-rift.yml logs -f mcns
```

MCNS logs to stderr as structured text (slog). Log level is configured
via `[log] level` in `mcns.toml` (debug, info, warn, error).

## Incident Procedures

### Database Corruption

Symptoms: server fails to start with SQLite errors, or queries return
unexpected errors.

1. Stop the service.
2. Check for WAL/SHM files alongside the database:
   ```bash
   ls -la /srv/mcns/mcns.db*
   ```
3. Attempt an integrity check:
   ```bash
   sqlite3 /srv/mcns/mcns.db "PRAGMA integrity_check;"
   ```
4. If integrity check fails, restore from the most recent snapshot:
   ```bash
   cp /srv/mcns/mcns.db /srv/mcns/mcns.db.corrupt
   cp /srv/mcns/backups/mcns-YYYYMMDD-HHMMSS.db /srv/mcns/mcns.db
   ```
5. Start the service and verify health.
6. Re-create any records added after the snapshot was taken.
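
Step 4 calls for the most recent snapshot. Because the timestamp in `mcns-YYYYMMDD-HHMMSS.db` is fixed-width with the most significant field first, sorting the names lexically also sorts them chronologically. The sketch below relies on that property; `latestSnapshot` is a hypothetical helper and the file names in `main` are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// latestSnapshot returns the newest name of the form mcns-YYYYMMDD-HHMMSS.db.
// Names that do not match the prefix, suffix, and length are ignored.
func latestSnapshot(names []string) (string, bool) {
	const wantLen = len("mcns-20060102-150405.db")
	var snaps []string
	for _, n := range names {
		if len(n) == wantLen && strings.HasPrefix(n, "mcns-") && strings.HasSuffix(n, ".db") {
			snaps = append(snaps, n)
		}
	}
	if len(snaps) == 0 {
		return "", false
	}
	sort.Strings(snaps) // lexical order equals chronological order here
	return snaps[len(snaps)-1], true
}

func main() {
	names := []string{
		"mcns-20250101-000000.db",
		"notes.txt",
		"mcns-20250314-091500.db",
	}
	if name, ok := latestSnapshot(names); ok {
		fmt.Println("restore from:", name)
	}
}
```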

### Certificate Expiry

Symptoms: health check fails with TLS errors, API clients get
certificate errors.

1. Check certificate expiry:
   ```bash
   openssl x509 -in /srv/mcns/certs/cert.pem -noout -enddate
   ```
2. Replace the certificate and key files at the paths in `mcns.toml`.
3. Restart the service to load the new certificate.
4. Verify health:
   ```bash
   mcns status --addr https://localhost:8443
   ```

### MCIAS Outage

Symptoms: management API returns 502 or authentication errors. DNS
continues to work normally (DNS has no auth dependency).

1. Confirm MCIAS is unreachable:
   ```bash
   curl -k https://svc.metacircular.net:8443/v1/health
   ```
2. DNS resolution is unaffected -- no immediate action needed for DNS.
3. Management operations (zone/record create/update/delete) will fail
   until MCIAS recovers.
4. Escalate to MCIAS (see Escalation below).

### DNS Not Resolving

Symptoms: `dig @<server> <name>` returns SERVFAIL or times out.

1. Verify the service is running:
   ```bash
   docker compose -f deploy/docker/docker-compose-rift.yml ps mcns
   ```
2. Check that port 53 is listening:
   ```bash
   ss -ulnp | grep ':53'
   ss -tlnp | grep ':53'
   ```
3. Test an authoritative query:
   ```bash
   dig @localhost svc.mcp.metacircular.net SOA
   ```
4. Test a forwarded query:
   ```bash
   dig @localhost example.com A
   ```
5. If authoritative queries fail but forwarding works, the database may
   be corrupt (see Database Corruption above).
6. If forwarding fails, check upstream connectivity:
   ```bash
   dig @1.1.1.1 example.com A
   ```
7. Check logs for errors:
   ```bash
   docker compose -f deploy/docker/docker-compose-rift.yml logs --tail 50 mcns
   ```

### Port 53 Already in Use

Symptoms: MCNS fails to start with "address already in use" on port 53.

1. Identify what is using the port:
   ```bash
   ss -ulnp | grep ':53'
   ss -tlnp | grep ':53'
   ```
2. Common culprit: `systemd-resolved` listening on `127.0.0.53:53`.
   - If on a system with systemd-resolved, either disable it or bind
     MCNS to a specific IP instead of `0.0.0.0:53`.
3. If another DNS server is running, stop it or change the MCNS
   `[dns] listen_addr` in `mcns.toml` to a different address.
4. Restart MCNS and verify DNS is responding.

## Escalation

Escalate when:

- Database corruption cannot be resolved by restoring a snapshot.
- MCIAS is down and management operations are urgently needed.
- DNS resolution failures persist after following the procedures above.
- Any issue not covered by this runbook.

Escalation path: Kyle (platform owner).