# CaddyHole

A Caddy module that blocks clients based on country and sends crawlers/bots a random amount of data from `/dev/random`.

## Features

- **Country-based blocking**: Block requests from specific countries using a GeoIP2 database
- **Bot/Crawler detection**: Automatically detect bots and crawlers based on the `User-Agent` header
- **Bot tarpit**: Send bots/crawlers random data from `/dev/random` to waste their resources
- **Configurable**: Customize the blocked countries and the amount of data sent to bots

## Installation

To use this module, you need to build Caddy with this plugin included. You can use [xcaddy](https://github.com/caddyserver/xcaddy):

```bash
xcaddy build --with git.wntrmute.dev/kyle/caddyhole
```

## Configuration

### Caddyfile

```caddyfile
example.com {
    caddyhole {
        # Path to GeoIP2 database (optional, required for country blocking)
        database /path/to/GeoLite2-Country.mmdb

        # List of country ISO codes to block (optional)
        block_countries CN RU KP IN IL

        # Minimum bytes to send to bots (optional, default: 1MB)
        min_bot_bytes 1048576

        # Maximum bytes to send to bots (optional, default: 100MB)
        max_bot_bytes 104857600
    }

    # Your other handlers here
    respond "Hello, World!"
}
```

### JSON Config

```json
{
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "listen": [":443"],
          "routes": [
            {
              "handle": [
                {
                  "handler": "caddyhole",
                  "database_path": "/path/to/GeoLite2-Country.mmdb",
                  "blocked_countries": ["CN", "RU", "KP"],
                  "min_bot_bytes": 1048576,
                  "max_bot_bytes": 104857600
                },
                {
                  "handler": "static_response",
                  "body": "Hello, World!"
                }
              ]
            }
          ]
        }
      }
    }
  }
}
```

## How It Works

### Bot Detection

The module detects bots/crawlers by examining the `User-Agent` header. It looks for common bot signatures including:

- bot, crawler, spider, scraper
- curl, wget
- python-requests, python-urllib
- go-http-client
- java, perl, ruby, php
- http_request

When a bot is detected, the module:

1. Picks a random response size between `min_bot_bytes` and `max_bot_bytes`
2. Streams that amount of random data from `/dev/random` (or `/dev/urandom` as fallback)
3. Returns HTTP 200 OK to make the bot think it succeeded
4. Never passes the request to downstream handlers

### Country Blocking

The module uses MaxMind's GeoIP2 database to determine the client's country from its IP address. If the country code matches any entry in the `blocked_countries` list, the module:

1. Returns HTTP 403 Forbidden
2. Sends an "Access denied" message
3. Never passes the request to downstream handlers

The module determines the client IP from the following sources (in order):

1. The `X-Forwarded-For` header (first IP)
2. The `X-Real-IP` header
3. The connection's `RemoteAddr`

### Execution Order

The module processes requests in this order:

1. Check if request is from a bot → If yes, send random data
2. Check if request is from a blocked country → If yes, return 403
3. Otherwise, pass to the next handler

## GeoIP2 Database

To use country blocking, you need a GeoIP2 database. You can download the free GeoLite2 Country database from MaxMind:

1. Sign up for a free account at [MaxMind](https://www.maxmind.com/en/geolite2/signup)
2. Download the GeoLite2 Country database in MMDB format
3. Extract the `.mmdb` file
4. Configure the `database` path in your Caddyfile

## Examples

### Block only specific countries (no bot handling)

```caddyfile
example.com {
    caddyhole {
        database /path/to/GeoLite2-Country.mmdb
        block_countries CN RU
    }
    respond "Hello, World!"
}
```

### Bot tarpit only (no country blocking)

```caddyfile
example.com {
    caddyhole {
        min_bot_bytes 10485760   # 10MB
        max_bot_bytes 1073741824 # 1GB
    }
    respond "Hello, World!"
}
```

### Full protection

```caddyfile
example.com {
    caddyhole {
        database /path/to/GeoLite2-Country.mmdb
        block_countries CN RU KP IR
        min_bot_bytes 52428800  # 50MB
        max_bot_bytes 524288000 # 500MB
    }
    respond "Hello, World!"
}
```

## Notes

- Bot detection happens before country blocking, so bots receive random data regardless of their country
- The random data is streamed directly from `/dev/random` (or `/dev/urandom`), which may impact system entropy on some systems
- The `Content-Type` is set to `application/octet-stream` and `Content-Length` is set to make the response appear legitimate
- Country blocking requires a GeoIP2 database; without it, no country blocking occurs
- All configuration parameters are optional, but country blocking only works when both `database` and `block_countries` are set

## License

This module is provided as-is for use with Caddy.
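
## Sketch of the Request Logic

The rules described under "How It Works" can be illustrated with a short Go sketch. This is not the module's actual source. The function names (`isBot`, `clientIP`, `tarpitSize`, `decide`) are hypothetical, and several details are assumptions: case-insensitive substring matching on the `User-Agent`, skipping header values that do not parse as an IP, and treating the size range as inclusive.

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
	"net/http"
	"strings"
)

// botSignatures mirrors the substrings listed under "Bot Detection".
var botSignatures = []string{
	"bot", "crawler", "spider", "scraper", "curl", "wget",
	"python-requests", "python-urllib", "go-http-client",
	"java", "perl", "ruby", "php", "http_request",
}

// isBot reports whether the User-Agent contains a known signature.
// Assumption: the comparison is a case-insensitive substring test.
func isBot(userAgent string) bool {
	ua := strings.ToLower(userAgent)
	for _, sig := range botSignatures {
		if strings.Contains(ua, sig) {
			return true
		}
	}
	return false
}

// clientIP tries X-Forwarded-For (first entry), then X-Real-IP,
// then the connection's remote address, in the documented order.
// Assumption: values that do not parse as an IP are skipped.
func clientIP(h http.Header, remoteAddr string) net.IP {
	if xff := h.Get("X-Forwarded-For"); xff != "" {
		first, _, _ := strings.Cut(xff, ",")
		if ip := net.ParseIP(strings.TrimSpace(first)); ip != nil {
			return ip
		}
	}
	if ip := net.ParseIP(strings.TrimSpace(h.Get("X-Real-IP"))); ip != nil {
		return ip
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present
	}
	return net.ParseIP(host)
}

// tarpitSize picks a response size in [min, max] (assumed inclusive).
func tarpitSize(min, max int64) int64 {
	if max <= min {
		return min
	}
	return min + rand.Int63n(max-min+1)
}

// decide applies the documented order: bot check, then country, then pass-through.
func decide(userAgent, country string, blocked map[string]bool) string {
	switch {
	case isBot(userAgent):
		return "tarpit"
	case blocked[country]:
		return "block"
	default:
		return "next"
	}
}

func main() {
	h := http.Header{}
	h.Set("X-Forwarded-For", "203.0.113.7, 10.0.0.1")
	fmt.Println(clientIP(h, "198.51.100.9:4711"))
	fmt.Println(decide("curl/8.5.0", "CN", map[string]bool{"CN": true}))
	fmt.Println(tarpitSize(1<<20, 100<<20))
}
```

Because the bot check runs first, a `curl` request from a blocked country is tarpitted rather than answered with 403, which matches the first item under "Notes".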