The AI-First Architecture
1. Structured Documentation for AI Context
The first major task is letting Claude create a `CLAUDE.md`: a dedicated context file specifically for AI assistants. This isn’t just documentation; it’s a structured knowledge transfer that gives the AI the complete picture:
```markdown
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Setup and Dependencies
make setup                    # Set up virtual environment and install all dependencies
make install-dev              # Install development dependencies only
make install-hiking-weather   # Install hiking weather app dependencies
make install-scripts          # Install script dependencies
make install-ci               # Install CI dependencies

### Ansible Deployment
make homelab          # Run the main Ansible playbook (all layers)
make infrastructure   # Run infrastructure layer only
make network          # Run network layer only
make security         # Run security layer only
make applications     # Run applications layer only
make monitoring       # Run monitoring layer only
make secret           # Edit encrypted vault file
make syntax-check     # Validate inventory and task enumeration

### Code Quality and Testing
make check       # Run all linting checks
make lint        # Run ansible-lint on all files
make yamllint    # Run yamllint on all YAML files
make pylint      # Run ruff linting on all Python files
make pyformat    # Format Python files with ruff
make pyfix       # Auto-fix Python linting issues
make test        # Run comprehensive Python test suite
make check-all   # Run all linting checks and tests

### Direct Commands
# Virtual environment setup
source .venv/bin/activate

# Ansible Galaxy collections
ansible-galaxy collection install community.general ansible.posix

# Direct Ansible execution
ansible-playbook -i inventory/hosts site.yml --limit homelab-server --vault-password-file scripts/get-vault-pass.sh

## Architecture Overview

This is an Ansible-based homelab infrastructure project that configures a Debian-based Proxmox server with a layered architecture:

### Core Architecture Layers
1. **Infrastructure Layer** (`infrastructure/`): Base system configuration, runtimes, and core services
   - Common system configuration (SSH, users, packages)
   - NTP time synchronization
   - Docker runtime environment
   - Go runtime for services
   - Rclone for cloud storage backups
2. **Network Layer** (`network/`): Connectivity, routing, and reverse proxy services
   - Cloudflare tunnels for secure remote access
   - Caddy reverse proxy with automatic TLS
3. **Security Layer** (`security/`): DNS filtering and security-related services
   - AdGuard Home for DNS filtering with Cloudflare upstream
4. **Application Layer** (`applications/`): User-facing applications and services
   - Mealie recipe management with R2 backups
   - Homepage dashboard
   - Open WebUI AI chat interface
   - Custom hiking weather forecasting Flask app
5. **Monitoring Layer** (`monitoring/`): Observability, metrics, and logging
   - Grafana Alloy for metrics and log collection
   - Synthetic monitoring with Grafana Cloud private probes

### Key Files and Structure
- `site.yml`: Main Ansible playbook orchestrating all layers
- `inventory/hosts`: Server inventory (homelab-server at 10.0.0.10)
- `group_vars/homelab.yml`: Non-sensitive configuration variables
- `group_vars/homelab_secrets.yml`: Encrypted vault file for secrets
- `playbooks/`: Layer-specific playbooks for targeted deployments
- `roles/`: Ansible roles organized by architectural layer

### Deployment Philosophy
- **Zero public attack surface**: No exposed ports or public DNS entries
- **Secure by default**: All external access via Cloudflare tunnels
- **Layered deployment**: Each layer can be deployed independently
- **Infrastructure as code**: All configuration managed via Ansible

### Python Applications
The project includes custom Python applications deployed via Ansible:
- **Hiking Weather App**: Flask-based weekend weather forecasting (`roles/applications/hiking_weather/files/hikestatus_nws.py`)
- **Grafana Logs Query Agent**: Utility script for log querying (`scripts/grafana-logs-query-agent.py`)

### Development Workflow
1. All changes should be tested with `make check-all` before deployment
2. Use layer-specific deployment commands for targeted updates
3. Secrets are managed via Ansible Vault (`make secret` to edit)
4. Python code follows Ruff linting standards (configured in `pyproject.toml`)
5. YAML files follow yamllint standards (configured in `config/yamllint.yml`)

### Testing
- Python tests are located in `roles/applications/hiking_weather/files/tests/`
- Run tests with `make test` (requires hiking-weather dependencies)
- CI dependencies can be installed with `make install-ci`

### Vault Management
- Vault password is managed via `scripts/get-vault-pass.sh`
- Encrypted secrets are stored in `group_vars/homelab_secrets.yml`
- Use `make secret` to edit vault contents safely
```
By front-loading this context, AI assistants can immediately understand the project’s structure and make informed decisions without repeatedly asking for clarification.
2. Cursor Rules: Context-Aware AI Assistance
I leveraged Cursor’s powerful rules system to create context-aware guidance that automatically applies based on file types and locations:
Always-Applied Rules (`.cursor/rules/homelab-workflows.mdc`)
```markdown
---
description: Always-present guidance for building, testing and troubleshooting this Proxmox/Ansible repo.
alwaysApply: true
---
- **Testing and Provisioning**
  - Run `make help` first and invoke the appropriate target (e.g. `make setup`, `make homelab`, `make check-all`).
  - Never bypass the Makefile targets when testing or provisioning changes (e.g. do not invoke `yamllint` or `ansible-lint` directly).
- **Troubleshooting on the node**
  - SSH into `homelab-user@10.0.0.10` (LAN-only); escalate with `sudo` when needed.
  - Prefer `journalctl` and service logs for diagnostics before changing configs.

@Makefile # Attaches the Makefile so the AI can see the targets
```
File-Specific Rules
- Ansible/YAML Rules: Enforces 2-space indentation, proper quoting, and Ansible best practices
- Markdown Rules: Ensures proper formatting with blank lines around code blocks
- Secrets Structure: Detailed documentation of the vault variable hierarchy
The `@Makefile` reference in the rules ensures the AI always has access to the available commands, making it incredibly efficient at suggesting the right automation targets.
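For a sense of what a file-specific rule looks like, here is a hypothetical sketch of an Ansible/YAML rule file (the filename, `globs` pattern, and wording are illustrative, not the repo's actual rule):

```markdown
---
description: Conventions for Ansible YAML files in this repo.
globs: ["roles/**/*.yml", "playbooks/**/*.yml"]
alwaysApply: false
---
- Use 2-space indentation and start every playbook with `---`.
- Quote strings that contain Jinja2 templating (e.g. "{{ some_var }}").
- Prefer fully qualified module names (e.g. `ansible.builtin.copy`).
```

Because the rule only attaches when a matching file is open, the guidance stays out of the way until it is actually relevant.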
3. ChatGPT Custom Project Configuration
For ChatGPT, I created a detailed project configuration that establishes:
Role & Scope
- You are my automation partner for a single-node Proxmox 8 homelab (Debian 12) running on a Ubiquiti UniFi home network.
- Primary goals: reliability, security, and zero exposed TCP/UDP ports.
- Manage all services through Ansible playbooks structured exactly like this repo (layered roles, group_vars, inventory).

Formatting Rules
- Playbooks / tasks / vars → fenced yaml blocks beginning with `---`.
- Shell / CLI commands → fenced bash blocks; include inline comments.
- Python snippets (for helper scripts) → fenced python blocks.
- Use bulleted lists for options or trade-offs; avoid tables unless they add real value.

Safety & Operational Guardrails
- Never output real secrets. Use `<VAULT_VAR>` placeholders.
- Flag any action that could cause downtime; suggest taking a Proxmox snapshot first.
- Maintain zero public attack surface: assume all ingress is via Cloudflare Tunnels + Caddy with automatic TLS.
- Confirm intent before destructive tasks (e.g., wiping ZFS snapshots, upgrading the Proxmox kernel).

Baseline Assumptions
- Core services already present:
  - Cloudflare Tunnel (`cloudflared`), Caddy reverse proxy
  - AdGuard Home with Cloudflare Secure Web Gateway upstreams
  - Mealie, Open WebUI, Homepage dashboard, the Flask “hiking weather” app
  - Grafana Alloy → Grafana Cloud (free tier)
- Home Assistant OS runs in a VM.
- Networking: single /24 LAN, AdGuard-filtered DNS handed out by the router.
- Project commands: `make setup`, `make help`, and layered `make <layer>` targets exist.
- Secrets live in `group_vars/homelab_secrets.yml` (Ansible Vault).

Troubleshooting Style
- Diagnose with `systemctl`, `journalctl`, or service-specific logs first.
- Provide one-liner commands to reproduce or test an issue.
- Map errors back to the layer (infrastructure → network → security → application → monitoring).

Resource Hints
- Prefer upstream docs: Proxmox, Ansible, Caddy, Cloudflare Tunnels.
- For metrics examples, cite Grafana Alloy or Grafana Cloud docs.
- When linking, include bare URLs (no markdown titles) so they’re easy to copy.
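With those rules in place, an answer about, say, scheduling a nightly Mealie backup comes back already shaped for this repo. A hypothetical example of such output (the task name, schedule, and bucket name are illustrative; `ansible.builtin.cron` is a standard Ansible module):

```yaml
---
# Hypothetical example of output following the project's formatting rules
- name: Schedule nightly Mealie backup to R2
  ansible.builtin.cron:
    name: mealie-backup
    hour: "3"
    minute: "0"
    job: "rclone sync /opt/docker/mealie/data <VAULT_VAR>:mealie-backups"
```

Note the `<VAULT_VAR>` placeholder standing in for the rclone remote, exactly as the guardrails require.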
This configuration ensures ChatGPT understands the operational context and constraints, preventing suggestions that could compromise security or stability.
4. Layered Architecture: AI-Friendly by Design
The entire project follows a layered architecture that mirrors how humans (and AI) think about infrastructure:
```text
roles/
├── infrastructure/   # Base system: Docker, NTP, common configs
├── network/          # Cloudflare tunnels, Caddy reverse proxy
├── security/         # AdGuard Home DNS filtering
├── applications/     # User-facing apps: Mealie, Homepage, Open WebUI
└── monitoring/       # Grafana Alloy, synthetic monitoring
```
Each layer can be deployed independently with dedicated Make targets:
```bash
make infrastructure   # Deploy base system only
make network          # Update network configuration
make applications     # Deploy user applications
```
This separation of concerns makes it trivial for AI to understand dependencies and suggest targeted fixes without affecting unrelated systems.
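Under the hood, those layer targets are thin wrappers around `ansible-playbook`. A minimal sketch of what one might look like (hypothetical; the repo's actual Makefile may differ, and recipe lines must be tab-indented):

```makefile
# Hypothetical sketch: each layer target runs only that layer's playbook
network:
	ansible-playbook -i inventory/hosts playbooks/network.yml \
		--vault-password-file scripts/get-vault-pass.sh
```

Keeping `ansible-playbook` invocations behind Make targets means both humans and AI assistants only ever need to remember `make <layer>`.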
5. Self-Documenting Ansible Roles
Every role includes comprehensive README files that explain:
- Prerequisites and dependencies
- Configuration variables (with examples)
- What the role does (step by step)
- Security considerations
- Troubleshooting guides
- File structures created
For example, the Open WebUI role documentation clearly states:
```markdown
## What This Role Does
1. **Directory Setup**: Creates the required directory structure under `/opt/docker/openwebui/`
2. **Docker Compose**: Deploys the docker-compose.yml template with your configuration
3. **Systemd Service**: Creates and enables a systemd service for lifecycle management
4. **Security**: Container runs with proper isolation and restart policies
5. **API Configuration**: Disables Ollama API integration for 3rd party provider use

## Security Considerations
- **Container Isolation**: The container runs as root within its own isolated namespace
- **Network Isolation**: Uses a dedicated Docker network (`openwebui_network`)
- **Data Persistence**: Application data is stored in `/opt/docker/openwebui/data/`
- **Authentication**: Enable `openwebui_auth_enabled: true` for user authentication
- **Secret Management**: Store the `openwebui_secret_key` in encrypted Ansible Vault
- **API Security**: Ollama API is disabled; only 3rd party APIs are enabled
```
This level of detail means AI can understand not just what the code does, but why decisions were made.
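The same philosophy extends to the role's variables: a `defaults/main.yml` can carry its documentation inline. A hypothetical sketch for the Open WebUI role (only variables mentioned above are used; the exact file layout is illustrative):

```yaml
# roles/applications/openwebui/defaults/main.yml (hypothetical sketch)
openwebui_auth_enabled: true                    # require user login (see Security Considerations)
openwebui_data_dir: /opt/docker/openwebui/data  # persisted application data
# openwebui_secret_key is deliberately absent here:
# it lives encrypted in group_vars/homelab_secrets.yml (Ansible Vault)
```

An AI reading the role then sees the variable, its default, and the reasoning in a single place.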
6. AI-Powered Code Review and Automation
I’ve integrated Claude directly into the GitHub workflow for automated code reviews:
```yaml
- name: Run Claude Code Review
  uses: anthropics/claude-code-action@beta
  with:
    direct_prompt: |
      Please review this pull request and provide feedback on:
      - Code quality and best practices
      - Potential bugs or issues
      - Security concerns
      - Test coverage
```
Additionally, Claude can be summoned via `@claude` mentions in issues and PR comments, with access to specific tools:
```yaml
allowed_tools: "Bash(make check),Bash(make test),Bash(make lint)"
```
This creates a collaborative environment where AI actively participates in the development process.
Real-World Benefits
1. Rapid Development Cycles
With AI understanding the full context, I can describe high-level goals and get working implementations immediately:
Me: “Add a new monitoring endpoint for checking if all Docker containers are healthy”
AI: Understands the monitoring layer structure, creates the appropriate Ansible task, adds it to the Grafana Alloy configuration, and suggests the correct deployment command
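The “are all containers healthy” part of such a request boils down to a small piece of logic. A hypothetical sketch of the core check (the function name and input shape are illustrative, modeled on the JSON that `docker inspect` emits):

```python
def unhealthy_containers(inspect_output):
    """Return names of containers whose Docker healthcheck is not 'healthy'.

    `inspect_output` is a list of dicts shaped like parsed `docker inspect`
    JSON; containers without a configured healthcheck are treated as healthy.
    """
    bad = []
    for container in inspect_output:
        health = container.get("State", {}).get("Health")
        if health is not None and health.get("Status") != "healthy":
            # Docker prefixes container names with "/"
            bad.append(container.get("Name", "").lstrip("/"))
    return bad
```

In practice the Ansible task would feed this the output of `docker inspect $(docker ps -q)` and alert when the returned list is non-empty.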
2. Intelligent Troubleshooting
When issues arise, AI can leverage the structured logs and clear service boundaries:
Me: “The Homepage dashboard isn’t loading”
AI: “Let me help troubleshoot. Based on your architecture:
1. First, check the Homepage service status: `systemctl status homepage`
2. If that’s running, check the Caddy reverse proxy: `journalctl -u caddy -f`
3. Finally, verify the Cloudflare tunnel: `systemctl status cloudflared`”
3. Security-First Suggestions
The AI’s understanding of the zero-trust architecture means it never suggests solutions that would expose services:
Me: “How can I access Mealie from outside my network?”
AI: “Since you maintain zero public attack surface, we’ll use your existing Cloudflare tunnel. Let me add the route to your Caddy configuration…”
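The resulting change is typically just a new site block in the Caddy configuration. A hypothetical sketch (the hostname and upstream port are illustrative; Mealie's actual port may differ):

```caddyfile
# Hypothetical: requests arrive via the Cloudflare tunnel, never an open port
mealie.example.com {
    reverse_proxy localhost:9000
}
```

Because ingress terminates at the tunnel, no firewall rule or port-forward is ever needed.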
4. Consistent Code Quality
With linting rules clearly defined and AI understanding the standards, every suggested change follows best practices:
- YAML files maintain proper indentation
- Ansible variables use correct naming conventions
- Python code follows Ruff standards
- Markdown documentation stays properly formatted
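Those standards live in version-controlled config, so the AI can read them directly. A hypothetical excerpt of the Ruff section in `pyproject.toml` (the specific settings are illustrative, not the repo's actual configuration):

```toml
# pyproject.toml (hypothetical excerpt)
[tool.ruff]
line-length = 100

[tool.ruff.lint]
select = ["E", "F", "I"]  # pycodestyle errors, pyflakes, import sorting
```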
Key Takeaways
1. **Context is King**: Investing time in comprehensive documentation specifically for AI pays massive dividends. My README files are now the most important pieces of documentation.
2. **Structure Enables Intelligence**: A well-organized codebase with clear separation of concerns allows AI to make surgical changes without breaking unrelated systems.
3. **Automate the Automation**: Using Make targets as the primary interface gives AI a consistent way to interact with the infrastructure while maintaining safety rails.
4. **Layer Your Rules**: Cursor’s file-specific rules combined with always-applied guidance create a context-aware development environment that prevents common mistakes.
5. **Integrate Deeply**: Having AI in the CI/CD pipeline through GitHub Actions creates a true collaborative development experience.
Looking Forward
This AI-first approach has fundamentally changed how I think about infrastructure as code. Instead of viewing AI as a tool for generating snippets, I now see it as a full development partner that understands my infrastructure as deeply as I do.
The time invested in making the codebase AI-friendly has paid for itself many times over through:
- Faster feature implementation
- More reliable deployments
- Better documentation (AI helps maintain it!)
- Reduced cognitive load
As AI capabilities continue to evolve, I’m excited to explore even deeper integrations, perhaps including:
- AI-driven anomaly detection in logs
- Automated performance optimization suggestions
- Predictive maintenance based on metrics trends
Get Started with Your Own AI-Powered Homelab
If you’re interested in building your own AI-optimized homelab, here are my recommendations:
- **Start with Structure**: Design your architecture with clear boundaries and separation of concerns
- **Document for AI**: Create comprehensive context files that explain not just what, but why
- **Use Declarative Tools**: Ansible, Terraform, and similar tools work beautifully with AI
- **Establish Safety Rails**: Define clear operational boundaries in your AI configurations
- **Iterate and Refine**: Your AI context will evolve as your infrastructure grows
And finally, definitely check out my code examples here!