Discord Bot Operations & Scaling

Use this playbook to keep production bots healthy after the initial deploy. It covers monitoring, scheduling, cost controls, sharding, data management, compliance, and incident response.

Monitoring

  • Panel Metrics: Every container exposes real-time CPU, RAM, and network graphs. Watch for RAM > 75% or CPU spikes near 90%—scale up before watchdog restarts kick in.
  • Log Streaming: Download console logs or stream them live for aggregation into Logtail, Datadog, or Elastic. Include requestId, guildId, or shardId in logs to pinpoint issues quickly.
  • Status Webhooks: Subscribe to https://status.mambahost.com or request a dedicated webhook so incident notifications land in your staff Discord.
  • Heartbeat Commands: Run scheduled ping jobs that report uptime and latency to a monitoring channel.

Scheduling

  • Restarts: Schedule daily or weekly restarts during low-traffic windows to clear memory leaks.
  • Dependency Updates: Add a scheduler task that runs your update script (e.g., npm run update or pip install --upgrade -r requirements.txt) followed by a restart.
  • Cron Jobs: Use the built-in scheduler for tasks such as data exports, leaderboard resets, or timed announcements.
  • Backups: Automate mysqldump or SQLite copies to /backups with retention policies (default 7 days).

Scaling & Sharding

  • Move from Starter → Pro → Premium when RAM or CPU utilization consistently exceeds 75%.
  • Premium containers support PM2/multiprocess setups, multiple bots, and heavier AI inference workloads.
  • Divide workloads by shards or features. Example: one bot handles moderation (sharded), another handles music streaming.
  • Use Redis, PostgreSQL, or MySQL to store state shared across shards/containers.
  • Register slash commands once per application over the HTTP API rather than from every shard, and acknowledge each interaction within Discord's 3-second response window regardless of which shard receives it.

Environments

  • Dev/Staging: Mirror production env vars except for tokens/guild IDs; register slash commands in a staging guild to avoid polluting production.
  • Production: Keep env vars minimal and rotate tokens quarterly.
  • Automate promotions by syncing assets from staging to production once smoke tests pass.

Data Management

  • MySQL: Ideal for inventories, audit logs, or economy data. Request credentials via support; add them as env vars.
  • SQLite: Suitable for small bots—back up the .db file nightly.
  • Object Storage: Host large assets (images, audio, templates) on S3-compatible storage to keep the container lean.
  • Caching: External Redis/Upstash caches accelerate frequently accessed data and reduce database load.

Cost Controls

  • Monitor resource graphs and downgrade tiers if utilization stays below 30% for extended periods.
  • Consolidate lightweight bots into one Premium container using PM2 to reduce per-bot costs.
  • Offload CPU-heavy tasks (e.g., video transcoding) to serverless workers or queued jobs that run only when needed.

Security & Compliance

  • Rotate tokens when staff changes occur. Remove old tokens immediately in the Discord Dev Portal.
  • Use least-privilege OAuth scopes and enable privileged intents only when your bot actually needs them.
  • Enable MFA on all panel accounts; create sub-users with scoped permissions rather than sharing root credentials.
  • Log sensitive actions (bans, payouts, command escalations) and store them in immutable storage for auditability.
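
The logging advice in the list above (context fields such as requestId, guildId, and shardId, plus audit entries for sensitive actions) could be sketched with Python's standard `logging` module as follows. This is a minimal sketch, not the platform's own logging setup: `JsonFormatter`, `build_logger`, and the `CONTEXT_KEYS` tuple are hypothetical names chosen for illustration, and the one-JSON-object-per-line format is an assumption about what your aggregator accepts.

```python
import json
import logging
import sys

# Context keys named in the playbook; extend this tuple as needed (assumption).
CONTEXT_KEYS = ("requestId", "guildId", "shardId")


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in CONTEXT_KEYS:
            value = getattr(record, key, None)
            if value is not None:
                entry[key] = value
        return json.dumps(entry)


def build_logger(name: str = "bot", stream=None) -> logging.Logger:
    """Hypothetical helper: return a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


if __name__ == "__main__":
    log = build_logger()
    # Example audit entry for a sensitive action; the IDs are made up.
    log.info("member banned", extra={"guildId": "1234", "shardId": 0})
```

Because every line is a self-contained JSON object, the console output can be filtered by guild or shard after it is streamed or downloaded. Writing the same entries to immutable storage is left to whichever storage service you use.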

Incident Response

  1. Detect: Watchdog restarts, 5xx errors, or alerting webhooks indicate an incident.
  2. Stabilize: Scale up, disable problematic features via feature flags, or roll back to the previous deployment.
  3. Communicate: Post in your community status channel and update status.mambahost.com if needed.
  4. Postmortem: Capture timeline, root cause, follow-up tasks, and automation improvements.
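
The "disable problematic features via feature flags" step above could be implemented in many ways. One minimal sketch, assuming flags are stored as environment variables, is shown here. The `FEATURE_<NAME>` naming convention, the accepted on/off values, and the `feature_enabled` helper are all hypothetical, not something the panel or Discord defines.

```python
import os

# Hypothetical convention: FEATURE_<NAME>=on|off, for example FEATURE_MUSIC=off.
_TRUTHY = {"1", "true", "on", "yes"}
_FALSY = {"0", "false", "off", "no"}


def feature_enabled(name: str, default: bool = True, env=None) -> bool:
    """Return whether a feature is enabled according to an environment variable."""
    env = os.environ if env is None else env
    raw = env.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUTHY:
        return True
    if value in _FALSY:
        return False
    # Unrecognized values fall back to the default rather than guessing.
    return default


if __name__ == "__main__":
    # A command handler would check the flag before doing any work.
    if feature_enabled("music"):
        print("music commands are enabled")
    else:
        print("music commands are temporarily disabled")
```

With this approach, stabilizing an incident means setting one variable (for example `FEATURE_MUSIC=off`) and restarting the container, instead of shipping a code change. A flag stored in Redis or a database would avoid the restart at the cost of an extra lookup.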

Integrations with Creator Labs & Game Servers

  • Sync downtime alerts across Creator Labs websites, Discord bots, and managed game servers.
  • Pipe telemetry from game servers into Discord via webhooks hosted on the same infrastructure for consistent latency.
  • Use n8n automations (see docs/N8N_MARKETING_AUTOMATION.md) to route leads or support tickets into bots.

Operations Checklist

  • Monitor metrics weekly and adjust plan tiers accordingly.
  • Review env vars quarterly; remove unused secrets.
  • Test backup restores monthly.
  • Keep CI pipelines green before every deploy.
  • Maintain a staging container for smoke tests.
  • Document incident procedures and share with staff.
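
The weekly "monitor metrics and adjust plan tiers" item can be turned into a small script using the thresholds this playbook gives: move up a tier above 75% utilization, consider moving down below 30%. The sketch below is illustrative only. `recommend_tier` is a hypothetical helper, the samples are assumed to be percentages you exported from the panel graphs yourself, and using the mean over the review window as a stand-in for "consistently" is an assumption.

```python
from statistics import mean

SCALE_UP_THRESHOLD = 75.0    # percent, from the playbook
SCALE_DOWN_THRESHOLD = 30.0  # percent, from the playbook
TIERS = ["Starter", "Pro", "Premium"]  # tier order from the playbook


def recommend_tier(current: str, cpu_samples: list[float], ram_samples: list[float]) -> str:
    """Suggest a tier from utilization samples (percent) over the review window."""
    if not cpu_samples or not ram_samples:
        return current  # no data, so recommend no change
    index = TIERS.index(current)
    busiest = max(mean(cpu_samples), mean(ram_samples))
    if busiest > SCALE_UP_THRESHOLD:
        # Either resource running hot is enough to move up (capped at Premium).
        return TIERS[min(index + 1, len(TIERS) - 1)]
    if busiest < SCALE_DOWN_THRESHOLD:
        # Both resources must be idle before moving down (floored at Starter).
        return TIERS[max(index - 1, 0)]
    return current


if __name__ == "__main__":
    # Made-up samples: CPU averages 85%, so a Starter bot should move to Pro.
    print(recommend_tier("Starter", [80, 85, 90], [40, 45, 50]))
```

Averages hide short spikes, so a stricter version might also check the peak value against the 90% CPU figure mentioned under monitoring before recommending a downgrade.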