Add infrastructure architecture documentation #2883
Reverse-engineered architecture doc covering the BOPS Rails app and bops-terraform infrastructure-as-code. Includes ASCII diagrams for system context, infrastructure overview, request/job/upload flows, CI/CD pipeline, and monitoring. Also covers component inventory, data stores, queues, scheduled jobs, and key architectural findings.
### B1.1) Rails Engine Inventory

| Engine | Mount Path | Purpose | Defined In |
|---|---|---|---|
| **bops_core** | (shared) | Routing helpers, middleware, base controllers | `engines/bops_core/` |
| **bops_admin** | `/admin` | LA admin: users, app types, consultees, policy | `engines/bops_admin/` |
| **bops_api** | `/api` | Public/authenticated REST API, Swagger docs | `engines/bops_api/` |
| **bops_applicants** | `/` (applicants subdomain) | Applicant responses, neighbour comments | `engines/bops_applicants/` |
| **bops_config** | `/` (config subdomain) | Global config, Sidekiq UI | `engines/bops_config/` |
| **bops_consultees** | `/consultees` | External consultee view/comment | `engines/bops_consultees/` |
| **bops_enforcements** | `/` | Enforcement case management | `engines/bops_enforcements/` |
| **bops_preapps** | `/preapps` | Pre-application advice workflow | `engines/bops_preapps/` |
| **bops_reports** | `/reports` | Report generation | `engines/bops_reports/` |
| **bops_submissions** | `/api` (submissions) | Incoming application submission handling | `engines/bops_submissions/` |
| **bops_uploads** | (uploads subdomain) | File uploads via Active Storage | `engines/bops_uploads/` |
This section seems to duplicate docs/architecture.md.
## C) Key Findings

### C1) Architectural Observations

| # | Observation | Severity | Evidence |
|---|---|---|---|
| 1 | **Single-AZ RDS by default** | MEDIUM | `modules/database/main.tf`: `multi_az = false` default. Verify if production overrides this. |
| 2 | **No S3 lifecycle policies on uploads** | LOW | `modules/ecs_bops/s3.tf`: no lifecycle rules on uploads bucket. Unbounded storage growth. |
| 3 | **No application-level caching** | LOW | `config/environments/production.rb`: cache_store commented out, despite having ElastiCache available. |
| 4 | **No auto-scaling policies** | LOW | No ECS auto-scaling found in Terraform. Fixed task counts (2 web + 2 worker + 1 console). Verify if this is intentional. |

### C2) Unknowns to Confirm

| # | Unknown | Impact | Where to Verify |
|---|---|---|---|
| 1 | **Is RDS multi-AZ enabled in production?** | HIGH: data availability | Check AWS console or Terraform state |
| 2 | **What RDS instance class runs in production?** | MEDIUM: performance | Default is `db.t3.small` in module; check if production overrides it |
| 3 | **Are there any Lambda@Edge functions on CloudFront?** | MEDIUM: architecture | Only a CloudFront Function (`remove-blobs-prefix`) found in Terraform |
| 4 | **Is there a backup/DR strategy beyond 7-day RDS snapshots?** | MEDIUM: resilience | No cross-region replication or separate backup strategy found |
| 5 | **Are there additional environments beyond staging/production/pentest/sandbox?** | LOW: architecture | Only 4 environments found in Terraform |

### C3) Next Steps

1. **Verify production overrides**: Check Terraform state to confirm RDS instance class, multi-AZ, and other module defaults
2. **Inspect AV scanning setup**: `monitoring/bucket_av.tf` exists; verify it covers the uploads bucket
3. **Review additional CloudFront distributions**: May exist for the applicants subdomain
4. **Check for auto-scaling policies**: Verify if fixed task counts are intentional or if scaling is needed
5. **Confirm tenant list**: Verify the list of production local authorities matches Terraform configuration
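
The first next step (checking Terraform state for RDS overrides) could be scripted rather than done by eye. The following is a minimal sketch, not part of the PR: it assumes the state has been exported with `terraform show -json`, and the helper names `collect_resources` and `rds_settings` are hypothetical. It only reads the `multi_az`, `instance_class`, and `backup_retention_period` attributes of `aws_db_instance` resources, and it reflects what Terraform last recorded, not changes made outside Terraform.

```ruby
require "json"

# Walk one module node from `terraform show -json` output and collect every
# resource, including those nested in child modules.
def collect_resources(mod)
  own = mod.fetch("resources", [])
  nested = mod.fetch("child_modules", []).flat_map { |child| collect_resources(child) }
  own + nested
end

# Return one summary hash per aws_db_instance found in the state JSON.
# (Hypothetical helper for illustration; not code from the BOPS repositories.)
def rds_settings(state_json)
  root = JSON.parse(state_json).dig("values", "root_module") || {}
  collect_resources(root)
    .select { |resource| resource["type"] == "aws_db_instance" }
    .map do |resource|
      values = resource.fetch("values", {})
      {
        address: resource["address"],
        multi_az: values["multi_az"],
        instance_class: values["instance_class"],
        backup_retention_period: values["backup_retention_period"]
      }
    end
end
```

To use it, export the state from the relevant environment with `terraform show -json > state.json` (the file name is arbitrary) and then run `rds_settings(File.read("state.json")).each { |row| puts row }`.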
Who is section C for? It looks like it shouldn't remain in the final version of this document. Is that correct?

### B2) Data Stores

**Note:** Versions below are from Terraform source files. Actual deployed versions may differ: RDS and ElastiCache can be upgraded independently of Terraform. Verify against live AWS resources.
This line makes reference to 'versions', but the table doesn't actually specify any, so we can remove this caveat.

## Important Caveats

- **Versions cited are from repository source files as of February 2026**, not verified against live infrastructure. Actual deployed versions may differ. Always verify against live AWS resources and `Gemfile.lock`.
This caveat refers to versions, but I can't see anywhere in this document where we actually specify one.
It would be better not to specify versions at all: then the documentation won't need updating as often, and we won't need a warning about versions being out of date.

- **Versions cited are from repository source files as of February 2026**, not verified against live infrastructure. Actual deployed versions may differ. Always verify against live AWS resources and `Gemfile.lock`.
- **Terraform defaults vs overrides**: Many module variables have defaults. Production may override these via `terraform.tfvars`, CLI variables, or post-apply manual changes not reflected in source.
- **This analysis is static code analysis only**. Infrastructure may have been modified outside of Terraform.
This line is irrelevant.
- **Terraform defaults vs overrides**: Many module variables have defaults. Production may override these via `terraform.tfvars`, CLI variables, or post-apply manual changes not reflected in source.
- **This analysis is static code analysis only**. Infrastructure may have been modified outside of Terraform.
- **Resource counts** (e.g., "2 tasks", "2 cache clusters") reflect Terraform source defaults. Auto-scaling or manual scaling may result in different actual counts.
- Items marked **[ASSUMPTION]** should be confirmed against live infrastructure.
There are no lines marked 'assumption', so we don't need this clarification.

| User | Subdomain | Engine |
|------------------|--------------------------------------|-----------------|
| Planning Officer | `{council}.bops.services` | bops_admin |
A planning officer is not limited to one engine:
Suggested change:
- | Planning Officer | `{council}.bops.services` | bops_admin |
+ | Planning Officer | `{council}.bops.services` | various |
┌─────────────────────────────┐
│        Route53 (DNS)        │
│       *.bops.services       │
│ *.applicants.bops.services  │
└─────────────┬───────────────┘
              │
As discussed, we can remove this section.
Suggested change:
- ┌─────────────────────────────┐
- │        Route53 (DNS)        │
- │       *.bops.services       │
- │ *.applicants.bops.services  │
- └─────────────┬───────────────┘
-               │
| **low_priority** (Sidekiq) | Redis queue | Web app (default queue) | Worker (Sidekiq) | `config/sidekiq.yml` |
| **high_priority** (Sidekiq) | Redis queue | Web app (urgent jobs) | Worker (Sidekiq) | `config/sidekiq.yml` |
It might be useful to document which jobs are considered urgent and which are default.
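
One way to gather that list is a static scan of the job classes. The sketch below is hedged: it assumes jobs declare their queue with Active Job's `queue_as` or Sidekiq's `sidekiq_options queue:`, and that each file defines one job class. Neither assumption has been checked against the BOPS codebase, and `jobs_by_queue` is a hypothetical helper name. Jobs with no explicit declaration are grouped separately, since they would fall through to whatever default the app configures.

```ruby
# Match the first class definition in a file.
CLASS_PATTERN = /^\s*class\s+([A-Z]\w*(?:::[A-Z]\w*)*)/

# Match either `queue_as :name` (Active Job) or
# `sidekiq_options ..., queue: :name` (Sidekiq), with symbol or string names.
QUEUE_PATTERN = /^\s*(?:queue_as\s+|sidekiq_options\s+.*?queue:\s*)[:"']?(\w+)/

# Given { path => source }, return { queue_name => [sorted job class names] }.
# Jobs without an explicit queue declaration are listed under "(undeclared)".
def jobs_by_queue(sources)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  sources.each_value do |source|
    class_name = source[CLASS_PATTERN, 1]
    next unless class_name # skip files that define no class

    queue = source[QUEUE_PATTERN, 1] || "(undeclared)"
    grouped[queue] << class_name
  end
  grouped.transform_values(&:sort)
end
```

Run from the Rails root, something like `jobs_by_queue(Dir.glob("app/jobs/**/*.rb").to_h { |path| [path, File.read(path)] })` would produce the grouping, assuming jobs live under the conventional `app/jobs` directory. Jobs defined inside engines would need their own glob.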