Troubleshooting Guide
This guide helps diagnose and resolve common issues with the RPi Generator Control system.
Table of Contents
- Quick Diagnostics
- GenMaster Issues
- GenSlave Issues
- Communication Issues
- Generator Control Issues
- Victron Integration Issues
- Notification Issues
- Database Issues
- Docker Issues
- Log Analysis
Quick Diagnostics
System Health Check
Run these commands to quickly assess system status:
# GenMaster health
curl -s https://your-genmaster/api/health | jq .
# GenSlave health (via GenMaster)
curl -s https://your-genmaster/api/genslave/health \
-H "Authorization: Bearer YOUR_TOKEN" | jq .
# Container status
docker compose ps
# Recent logs
docker compose logs --tail=50
Expected Healthy Response
{
  "status": "healthy",
  "generator_running": false,
  "slave_connected": true,
  "slave_armed": true,
  "victron_signal": "inactive",
  "database": "connected",
  "redis": "connected"
}
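A minimal sketch for comparing a live response against the expected values above. It assumes `jq` is installed (as in the commands earlier in this section); the function name `health_problems` is hypothetical and not part of the project, and only the field names come from this guide.

```shell
# health_problems: read a /api/health JSON document on stdin and print one
# line per field that differs from the healthy example.
# Hypothetical helper; prints nothing when all checked fields look healthy.
health_problems() {
  jq -r '
    [ (if .status != "healthy"     then "status=\(.status)" else empty end),
      (if .slave_connected != true then "slave_connected=\(.slave_connected)" else empty end),
      (if .database != "connected" then "database=\(.database)" else empty end),
      (if .redis != "connected"    then "redis=\(.redis)" else empty end)
    ] | .[]'
}

# Usage:
#   curl -s https://your-genmaster/api/health | health_problems
```

Fields such as `generator_running` and `victron_signal` are left out on purpose, since either value can be normal depending on what the system is doing.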
GenMaster Issues
Container Won't Start
Symptoms: Container exits immediately or keeps restarting.
Check logs: docker compose logs genmaster
Common causes:
- Database not ready. Fix: ensure the postgres container is healthy (docker compose ps).
- Missing environment variables. Fix: check that the .env file has all required variables.
- Port already in use. Fix: stop the conflicting service or change the port.
Database Migrations Failed
Symptoms: App crashes with database schema errors.
Fix:
# Run migrations manually
docker compose exec genmaster alembic upgrade head
# Check migration status
docker compose exec genmaster alembic current
API Returns 500 Errors
Check application logs: docker compose logs genmaster
Common causes:
- Database connection lost
- Redis connection lost
- GenSlave unreachable (for proxy endpoints)
High Memory Usage
Check the container's CPU and memory stats (for example with docker stats).
Fix:
- Restart container: docker compose restart genmaster
- Check for memory leaks in logs
- Increase container memory limit if needed
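A sketch of the container stats check, assuming the Compose service is named `genmaster` as in the restart command above. `docker compose ps -q` resolves the service to its container ID, so the command does not depend on the generated container name. The wrapper name `service_stats` is hypothetical.

```shell
# service_stats: one-shot CPU/memory snapshot for a Compose service.
service_stats() {
  local service="${1:-genmaster}"
  local cid
  cid="$(docker compose ps -q "$service")"
  if [ -z "$cid" ]; then
    echo "no running container for service '$service'" >&2
    return 1
  fi
  # --no-stream prints a single sample instead of a live view.
  docker stats --no-stream "$cid"
}

# Usage:
#   service_stats genmaster
```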
GenSlave Issues
Container Won't Start
Check the GenSlave container logs.
Common causes:
- GPIO access denied. Fix: ensure privileged: true is set in docker-compose.yaml.
- Automation Hat not detected. Fix:
  - Check that the HAT is properly seated
  - Enable SPI: sudo raspi-config → Interface Options → SPI
  - Reboot the Pi
Mock Mode When HAT is Present
Symptoms: Logs show "Mock HAT mode" even with hardware.
Check whether SPI is enabled on the Pi. If it is not, enable it (sudo raspi-config → Interface Options → SPI) and reboot.
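A sketch of both SPI steps, assuming Raspberry Pi OS. The SPI driver exposes `/dev/spidev*` device nodes when the interface is enabled, and `raspi-config nonint do_spi 0` is the non-interactive equivalent of the menu path used in this guide (0 means enable). The function name `spi_devices` and its directory argument are hypothetical conveniences.

```shell
# spi_devices: list SPI device nodes; returns non-zero if none exist.
# The optional argument (default /dev) only exists to make this testable.
spi_devices() {
  local dev_dir="${1:-/dev}"
  local node
  local found=1
  for node in "$dev_dir"/spidev*; do
    # An unmatched glob stays literal, so skip anything that does not exist.
    [ -e "$node" ] || continue
    echo "$node"
    found=0
  done
  return "$found"
}

# Usage on the GenSlave Pi:
#   spi_devices || { sudo raspi-config nonint do_spi 0 && sudo reboot; }
```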
Relay Not Clicking
- Check armed status: the response must show "armed": true.
- Check power supply:
  - Pi Zero needs a stable 5V 2.5A supply
  - The relay may not click with insufficient power
- Test the relay directly through the GenSlave API.
Communication Issues
GenSlave Not Reachable
From GenMaster, test connection:
# Via Tailscale hostname
ping genslave
# Test API
curl http://genslave:8001/api/health \
-H "X-API-Key: YOUR_SECRET"
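Common causes:
- Tailscale not connected: both devices should show as connected (tailscale status).
- Wrong IP/hostname in config: check GENSLAVE_HOST in GenMaster's .env.
- Firewall blocking: confirm that port 8001 on GenSlave is reachable from GenMaster.
- GenSlave container not running: check with docker compose ps on the GenSlave host.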
Heartbeat Failures
Symptoms: GenSlave shows "Failsafe triggered" or frequent disconnections.
Check the heartbeat status, then work through the common causes.
Common causes:
- Network latency:
  - Heartbeat timeout too short
  - Increase FAILSAFE_TIMEOUT_SECONDS
- GenMaster overloaded:
  - Check GenMaster CPU/memory
  - Check database performance
- Intermittent network:
  - Check WiFi signal strength
  - Consider a wired connection
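To tell network latency apart from the other causes above, one option is to time a few requests to the GenSlave health endpoint from the GenMaster host. This is a sketch: the URL and `X-API-Key` header are taken from the connection test earlier in this section, while the function name `sample_latency` and the use of a `GENSLAVE_API_SECRET` shell variable are assumptions.

```shell
# sample_latency: print the total request time (seconds) for N requests.
sample_latency() {
  local url="${1:-http://genslave:8001/api/health}"
  local count="${2:-5}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    # --max-time caps each attempt so a dead link does not hang the loop.
    curl -s -o /dev/null --max-time 10 \
      -H "X-API-Key: ${GENSLAVE_API_SECRET:-}" \
      -w '%{time_total}\n' "$url" || echo "failed"
    sleep 1
    i=$((i + 1))
  done
}

# Usage:
#   GENSLAVE_API_SECRET=YOUR_SECRET sample_latency
```

Request times that regularly approach the configured `FAILSAFE_TIMEOUT_SECONDS`, or repeated `failed` lines, point at the network rather than at GenMaster load.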
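API Authentication Errors
Symptoms: 401 Unauthorized responses.
Check:
1. API secret matches:
   - GenMaster: GENSLAVE_API_SECRET
   - GenSlave: GENSLAVE_API_SECRET
   - The two values must be identical
2. Header format: requests to GenSlave carry the secret in the X-API-Key header (X-API-Key: YOUR_SECRET).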
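One way to confirm that the two secrets match without printing them is to compare hashes of the values. A sketch, assuming each device keeps `GENSLAVE_API_SECRET` as an unquoted `KEY=value` line in its `.env` file; the helper name `secret_fingerprint` is hypothetical.

```shell
# secret_fingerprint: print a SHA-256 hash of GENSLAVE_API_SECRET from an
# env file, so values can be compared across hosts without revealing them.
secret_fingerprint() {
  local env_file="${1:-.env}"
  local value
  # Take the last matching assignment, strip the key and any trailing CR.
  value="$(grep -E '^GENSLAVE_API_SECRET=' "$env_file" | tail -n 1 | cut -d= -f2- | tr -d '\r')"
  if [ -z "$value" ]; then
    echo "GENSLAVE_API_SECRET not set in $env_file" >&2
    return 1
  fi
  printf '%s' "$value" | sha256sum | cut -d' ' -f1
}

# Usage: run on both devices and compare the two hashes by eye.
#   secret_fingerprint .env
```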
Generator Control Issues
Generator Won't Start
Run through this checklist:
- Is the relay armed? Must be true.
- Is there an active override? force_stop blocks automatic starts.
- Is runtime lockout active?
- Is GenSlave connected? Check slave_connected in the health endpoint.
- Is GenSlave armed? GenSlave must be armed to execute relay commands.
Generator Won't Stop
- Check whether a force_run override is active.
- Try a force stop via GenSlave (see Emergency Generator Stop).
State Mismatch Between Master and Slave
Symptoms: GenMaster shows running, GenSlave shows stopped (or vice versa).
This should self-heal via heartbeat. If not:
- Check that the heartbeat is working.
- Force reconciliation: restart GenMaster to trigger startup reconciliation.
- Manual sync.
Victron Integration Issues
Signal Not Detected
Check the GPIO status first.
Common causes:
- GPIO not accessible:
  - Pi 5 needs privileged: true and user: root
  - Check device mappings for gpiochip
- Wiring issue:
  - Verify connection to GPIO17 and GND
  - Test with a multimeter
- Mock mode enabled:
  - Check the MOCK_GPIO environment variable
Signal Stuck Active/Inactive
- Check Cerbo relay:
  - Look for the relay LED indicator
  - Listen for the relay click
- Test GPIO manually.
- Check for shorts:
  - Disconnect the wire and test again
Notification Issues
Notifications Not Sending
- Check that the channel is enabled.
- Check the event configuration.
- Test the channel.
- Check the logs.
GenSlave Failsafe Not Notifying
- Check that Apprise URLs are configured.
- Check that notifications are enabled: verify enabled: true in the response.
- Check cooldown:
  - A recent notification may have set a cooldown
  - Clear the cooldown to test again
Database Issues
PostgreSQL Won't Start
Check logs: docker compose logs postgres
Common causes:
- Disk full: check free space with df -h.
- Corrupt data:
  - Restore from backup
  - Or delete the volume (loses data)
- Permission issues on the PostgreSQL data volume.
Connection Pool Exhausted
Symptoms: "Too many connections" errors.
Fix:
# Restart to reset connections
docker compose restart genmaster
# Long-term: increase pool size in config
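Before restarting, it can help to confirm that connections are actually exhausted. A sketch, assuming the `postgres` service, `postgres` user, and `genmaster` database used in the restore commands elsewhere in this guide. `pg_stat_activity` and `max_connections` are standard PostgreSQL; the function name `pg_connection_usage` is hypothetical.

```shell
# pg_connection_usage: print "<open connections>/<max_connections>".
pg_connection_usage() {
  # -A (unaligned) and -t (tuples only) reduce the output to the bare value.
  docker compose exec -T postgres \
    psql -U postgres -d genmaster -At \
    -c "SELECT count(*)::text || '/' || current_setting('max_connections') FROM pg_stat_activity;"
}

# Usage:
#   pg_connection_usage
```

If the first number is close to the second, the pool (or a connection leak) is the likely cause; if it is far below, the "too many connections" error probably comes from the application-side pool limit instead.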
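Redis Connection Issues
Check that the Redis container is running and answering requests.
Fix: restart the Redis container if it is not responding.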
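A sketch of a basic Redis check, assuming the Compose service is named `redis` (the service name does not appear in this guide). `redis-cli ping` replies `PONG` when the server is reachable; the function name `redis_alive` is hypothetical.

```shell
# redis_alive: succeed only if the Redis server answers PING with PONG.
redis_alive() {
  local service="${1:-redis}"
  local reply
  reply="$(docker compose exec -T "$service" redis-cli ping 2>/dev/null | tr -d '\r')"
  [ "$reply" = "PONG" ]
}

# Usage:
#   redis_alive || docker compose restart redis
```

If Redis is configured with a password, an unauthenticated `ping` returns an error instead of `PONG`, so this check would report failure even though the server is up.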
Docker Issues
Container Keeps Restarting
Check the exit code of the container's most recent run.
Common exit codes:
- 0: Clean exit
- 1: Application error
- 137: Killed by SIGKILL, most often the OOM killer (out of memory)
- 139: Segfault (SIGSEGV)
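A sketch of reading and interpreting the exit code. `docker inspect` with the `.State.ExitCode` and `.State.OOMKilled` fields is standard Docker; the `explain_exit_code` helper is hypothetical and encodes the common exit codes listed in this section, plus the general rule that a code above 128 means the process was ended by a signal (code minus 128).

```shell
# explain_exit_code: translate a container exit code into a short description.
explain_exit_code() {
  local code="$1"
  case "$code" in
    0)   echo "clean exit" ;;
    1)   echo "application error" ;;
    137) echo "killed by SIGKILL (often out of memory)" ;;
    139) echo "segfault (SIGSEGV)" ;;
    *)
      if [ "$code" -gt 128 ] 2>/dev/null; then
        echo "terminated by signal $((code - 128))"
      else
        echo "exit code $code"
      fi
      ;;
  esac
}

# Usage (assumes the service is named genmaster):
#   cid="$(docker compose ps -aq genmaster)"
#   docker inspect --format '{{.State.ExitCode}} oom={{.State.OOMKilled}}' "$cid"
#   explain_exit_code 137
```

The `OOMKilled` field helps separate a genuine out-of-memory kill from other sources of exit code 137, such as a stop timeout.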
Out of Disk Space
# Check disk usage
df -h
# Remove stopped containers, unused networks, all unused images, and build cache
docker system prune -a
# Remove unused volumes (this deletes the data stored in them)
docker volume prune
Permission Denied
If you see "permission denied while trying to connect to the Docker daemon socket" (or "Got permission denied while trying to connect to the Docker daemon"), choose ONE of the following based on your situation:
- For one-off commands (recommended for occasional admin): prefix the command with sudo.
- For daily use on a trusted workstation: add your user to the docker group, which lets you run docker without sudo.
Do NOT chmod 666 /var/run/docker.sock
Making the Docker socket world-writable lets any local user, including any compromised low-privilege process, control Docker, which is effectively root on the host. This is a common piece of bad advice on Stack Overflow; ignore it.
Why is docker group membership 'effectively root'?
Anyone who can talk to the Docker daemon can start a privileged container that mounts the host's / and gives them a root shell. Only add yourself (or a service user) to the docker group on machines where you'd already be trusted as root.
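The two options in this section map to the commands sketched here. `usermod -aG` and the `docker` group name follow the standard Docker post-install convention rather than anything specific to this project, and the `in_group` helper is hypothetical.

```shell
# in_group: succeed if the space-separated group list ($2) contains group $1.
in_group() {
  local wanted="$1"
  local list="$2"
  local g
  for g in $list; do
    [ "$g" = "$wanted" ] && return 0
  done
  return 1
}

# One-off command:
#   sudo docker compose ps
#
# Daily use on a trusted machine (log out and back in afterwards so the
# new group membership takes effect):
#   in_group docker "$(id -nG)" || sudo usermod -aG docker "$USER"
```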
Log Analysis
Viewing Logs
# All containers
docker compose logs
# Specific container
docker compose logs genmaster
# Follow logs
docker compose logs -f genmaster
# Last N lines
docker compose logs --tail=100 genmaster
# With timestamps
docker compose logs -t genmaster
Filtering Logs
# Errors only
docker compose logs genmaster 2>&1 | grep -i error
# Specific component
docker compose logs genmaster | grep -i heartbeat
# Time range (requires timestamps)
docker compose logs -t genmaster | grep "2026-05"
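Grepping for a date prefix works, but `docker compose logs` also accepts `--since` and `--until` for time ranges. A sketch that counts recent error lines for a service; the function name `recent_error_count` is hypothetical.

```shell
# recent_error_count: count log lines containing "error" (case-insensitive)
# for one service over a recent window (default: genmaster, last 1h).
recent_error_count() {
  local service="${1:-genmaster}"
  local window="${2:-1h}"
  # --no-color keeps ANSI escape codes out of the text passed to grep.
  docker compose logs --no-color --since "$window" "$service" 2>&1 | grep -ci error
}

# Usage:
#   recent_error_count genmaster 30m
```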
Common Log Patterns
Healthy patterns:
Warning patterns:
Relay ON requested but relay not armed
Victron signal active but relay not armed
GenSlave connection timeout
Error patterns:
Getting Help
If you can't resolve an issue:
- Collect diagnostics.
- Check GitHub Issues: github.com/rjsears/pizero_generator_control/issues
- Open a new issue with:
  - Description of problem
  - Steps to reproduce
  - Relevant log excerpts
  - System information (Pi model, OS version)
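For the "Collect diagnostics" step, a sketch that gathers the outputs used throughout this guide into one directory to attach to an issue. The function name `collect_diagnostics` and the file layout are hypothetical; review the logs for secrets before sharing them.

```shell
# collect_diagnostics: save container status, recent logs, and basic system
# information into a directory (default: ./diagnostics-<timestamp>).
collect_diagnostics() {
  local out_dir="${1:-diagnostics-$(date +%Y%m%d-%H%M%S)}"
  mkdir -p "$out_dir" || return 1
  docker compose ps -a              > "$out_dir/containers.txt" 2>&1
  docker compose logs -t --tail=500 > "$out_dir/logs.txt"       2>&1
  df -h                             > "$out_dir/disk.txt"       2>&1
  {
    uname -a
    # OS version; the Pi model file exists on Raspberry Pi OS and may be
    # absent on other systems, so both reads are optional.
    cat /etc/os-release 2>/dev/null
    [ -r /proc/device-tree/model ] && tr -d '\0' < /proc/device-tree/model
    echo
  } > "$out_dir/system.txt"
  # Print the directory so the caller knows where the files went.
  echo "$out_dir"
}

# Usage (run from the directory containing docker-compose.yaml):
#   collect_diagnostics
```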
Recovery Procedures
Lost Network After WiFi Change
A bad static IP, gateway, or subnet on a saved WiFi profile can leave a device unreachable over the network. Recovery options (local console, nmcli, SD card edit, Ethernet fallback) are documented separately:
→ See Network Recovery.
Full System Reset
If all else fails:
# Stop everything
docker compose down
# Remove all data (WARNING: loses all history)
docker compose down -v
# Pull fresh images
docker compose pull
# Start fresh
docker compose up -d
# Run migrations
docker compose exec genmaster alembic upgrade head
Restore from Backup
# Stop services
docker compose stop genmaster
# Restore database
docker compose exec -T postgres pg_restore \
-U postgres -d genmaster < backup.dump
# Start services
docker compose start genmaster
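`pg_restore` only reads archives that `pg_dump` produced in a non-plain format, so the `backup.dump` file above needs to have been created accordingly. A sketch of a matching backup command, assuming the same `postgres` service, `postgres` user, and `genmaster` database as the restore; the function name `backup_database` is hypothetical.

```shell
# backup_database: write a custom-format (-Fc) dump that pg_restore can read.
backup_database() {
  local target="${1:-backup.dump}"
  # -T disables TTY allocation so the binary dump is not mangled.
  docker compose exec -T postgres \
    pg_dump -U postgres -Fc genmaster > "$target"
}

# Usage:
#   backup_database "genmaster-$(date +%Y%m%d).dump"
```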
Emergency Generator Stop
If automation isn't working, stop the generator manually:
# Direct to GenSlave (bypasses GenMaster)
curl -X POST http://genslave:8001/api/relay/off \
  -H "X-API-Key: YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{"force": true}'
Or physically disconnect power to the Automation Hat relay.