Add infrastructure architecture documentation #2883

Draft
benbaumann95 wants to merge 1 commit into main from add-infrastructure-architecture-docs

@benbaumann95
Member

Reverse-engineered architecture doc covering the BOPS Rails app and bops-terraform infrastructure-as-code. Includes ASCII diagrams for system context, infrastructure overview, request/job/upload flows, CI/CD pipeline, and monitoring. Also covers component inventory, data stores, queues, scheduled jobs, and key architectural findings.

Description of change

A brief description of the change, with enough context for the reviewer to be able to understand why we're making this change.

Story Link

https://link-to-issue

Screenshots

If you have made changes to the frontend please add screenshots

Decisions [OPTIONAL]

If you had to make any decisions between different ways of doing things, what were they and why?

Known issues [OPTIONAL]

Things you know need further follow on work but aren't in scope of this issue

  1. issue1
  2. issue2

Further testing or sign off required [OPTIONAL]

e.g.

  1. product manager to sign off view templates

@benbaumann95 force-pushed the add-infrastructure-architecture-docs branch from a615afe to 09e1051 on February 12, 2026 10:31
Comment on lines +276 to +290
### B1.1) Rails Engine Inventory

| Engine | Mount Path | Purpose | Defined In |
|---|---|---|---|
| **bops_core** | (shared) | Routing helpers, middleware, base controllers | `engines/bops_core/` |
| **bops_admin** | `/admin` | LA admin: users, app types, consultees, policy | `engines/bops_admin/` |
| **bops_api** | `/api` | Public/authenticated REST API, Swagger docs | `engines/bops_api/` |
| **bops_applicants** | `/` (applicants subdomain) | Applicant responses, neighbour comments | `engines/bops_applicants/` |
| **bops_config** | `/` (config subdomain) | Global config, Sidekiq UI | `engines/bops_config/` |
| **bops_consultees** | `/consultees` | External consultee view/comment | `engines/bops_consultees/` |
| **bops_enforcements** | `/` | Enforcement case management | `engines/bops_enforcements/` |
| **bops_preapps** | `/preapps` | Pre-application advice workflow | `engines/bops_preapps/` |
| **bops_reports** | `/reports` | Report generation | `engines/bops_reports/` |
| **bops_submissions** | `/api` (submissions) | Incoming application submission handling | `engines/bops_submissions/` |
| **bops_uploads** | (uploads subdomain) | File uploads via Active Storage | `engines/bops_uploads/` |
Contributor

This section seems to duplicate docs/architecture.md.

Comment on lines +327 to +354
## C) Key Findings

### C1) Architectural Observations

| # | Observation | Severity | Evidence |
|---|---|---|---|
| 1 | **Single-AZ RDS by default** | MEDIUM | `modules/database/main.tf` — `multi_az = false` default. Verify if production overrides this. |
| 2 | **No S3 lifecycle policies on uploads** | LOW | `modules/ecs_bops/s3.tf` — no lifecycle rules on uploads bucket. Unbounded storage growth. |
| 3 | **No application-level caching** | LOW | `config/environments/production.rb` — cache_store commented out, despite having ElastiCache available. |
| 4 | **No auto-scaling policies** | LOW | No ECS auto-scaling found in Terraform. Fixed task counts (2 web + 2 worker + 1 console). Verify if this is intentional. |

### C2) Unknowns to Confirm

| # | Unknown | Impact | Where to Verify |
|---|---|---|---|
| 1 | **Is RDS multi-AZ enabled in production?** | HIGH — data availability | Check AWS console or Terraform state |
| 2 | **What RDS instance class runs in production?** | MEDIUM — performance | Default is `db.t3.small` in module; check if production overrides it |
| 3 | **Are there any Lambda@Edge functions on CloudFront?** | MEDIUM — architecture | Only a CloudFront Function (`remove-blobs-prefix`) found in Terraform |
| 4 | **Is there a backup/DR strategy beyond 7-day RDS snapshots?** | MEDIUM — resilience | No cross-region replication or separate backup strategy found |
| 5 | **Are there additional environments beyond staging/production/pentest/sandbox?** | LOW — architecture | Only 4 environments found in Terraform |

### C3) Next Steps

1. **Verify production overrides**: Check Terraform state to confirm RDS instance class, multi-AZ, and other module defaults
2. **Inspect AV scanning setup**: `monitoring/bucket_av.tf` exists — verify it covers the uploads bucket
3. **Review additional CloudFront distributions**: May exist for the applicants subdomain
4. **Check for auto-scaling policies**: Verify if fixed task counts are intentional or if scaling is needed
5. **Confirm tenant list**: Verify the list of production local authorities matches Terraform configuration
Contributor

Who is this section C for? It looks like it shouldn't remain in the final version of this document, is that correct?


### B2) Data Stores

**Note:** Versions below are from Terraform source files. Actual deployed versions may differ — RDS and ElastiCache can be upgraded independently of Terraform. Verify against live AWS resources.
Contributor

This line makes reference to 'versions', but the table doesn't actually specify any, so we can remove this caveat.


## Important Caveats

- **Versions cited are from repository source files as of February 2026**, not verified against live infrastructure. Actual deployed versions may differ. Always verify against live AWS resources and `Gemfile.lock`.
Contributor

This refers to versions, but I can't see anywhere in this document where we actually specify one.

It would be better to avoid specifying versions at all. Then the documentation doesn't need to be updated so frequently, and we don't need a warning about versions being out of date.


- **Versions cited are from repository source files as of February 2026**, not verified against live infrastructure. Actual deployed versions may differ. Always verify against live AWS resources and `Gemfile.lock`.
- **Terraform defaults vs overrides**: Many module variables have defaults. Production may override these via `terraform.tfvars`, CLI variables, or post-apply manual changes not reflected in source.
- **This analysis is static code analysis only**. Infrastructure may have been modified outside of Terraform.
Contributor

This line is irrelevant.

- **Terraform defaults vs overrides**: Many module variables have defaults. Production may override these via `terraform.tfvars`, CLI variables, or post-apply manual changes not reflected in source.
- **This analysis is static code analysis only**. Infrastructure may have been modified outside of Terraform.
- **Resource counts** (e.g., "2 tasks", "2 cache clusters") reflect Terraform source defaults. Auto-scaling or manual scaling may result in different actual counts.
- Items marked **[ASSUMPTION]** should be confirmed against live infrastructure.
Contributor

There are no lines marked 'assumption', so we don't need this clarification.


| User | Subdomain | Engine |
|------------------|--------------------------------------|-----------------|
| Planning Officer | `{council}.bops.services` | bops_admin |
Contributor

A planning officer is not limited to one engine:

Suggested change
| Planning Officer | `{council}.bops.services` | bops_admin |
| Planning Officer | `{council}.bops.services` | various |

Comment on lines +66 to +71
┌─────────────────────────────┐
│        Route53 (DNS)        │
│       *.bops.services       │
│ *.applicants.bops.services  │
└─────────────┬───────────────┘
Contributor

As discussed, we can remove this section.

Suggested change
┌─────────────────────────────┐
│        Route53 (DNS)        │
│       *.bops.services       │
│ *.applicants.bops.services  │
└─────────────┬───────────────┘

Comment on lines +309 to +310
| **low_priority** (Sidekiq) | Redis queue | Web app (default queue) | Worker (Sidekiq) | `config/sidekiq.yml` |
| **high_priority** (Sidekiq) | Redis queue | Web app (urgent jobs) | Worker (Sidekiq) | `config/sidekiq.yml` |
Contributor

It might be useful to document which jobs are considered urgent and which are default.
