@Jongmassey Jongmassey commented Jan 20, 2026

For a request from @LFISHER7 routed to @opensafely-core/team-rsi

This request required some new features:

  • inclusion of additional fields in bulk-uploaded codelists
  • setting the codelist methodology in the bulk uploader (this also required an API change)
  • support for glob paths in the bulk uploader script

Also includes a couple of drive-by fixes for bugs encountered while developing the supporting features for this request.

@Jongmassey Jongmassey force-pushed the Jongmassey/ssi-codelists-bulk-import branch 2 times, most recently from 2c95ab1 to 493c39f Compare January 21, 2026 15:33
@Jongmassey Jongmassey force-pushed the Jongmassey/ssi-codelists-bulk-import branch from 493c39f to dbf2804 Compare January 28, 2026 10:07

Is this something we will load/update in the future? If so, we should probably add a note to one of the docs explaining where the data came from and the exact command we ran, especially as it looks for multiple CSVs instead of one.

@rw251 rw251 left a comment

Looks good. Minor comment on docs.

@Jongmassey Jongmassey enabled auto-merge January 28, 2026 15:21
Coding systems that extend the DummyCodingSystem class (OPCS4 & Readv2)
don't have databases.

The bulk upload script erroneously checked for the presence of a
database file for the requested coding system in all cases.

This commit skips the check for coding systems without databases.
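
A minimal sketch of the kind of guard this commit describes, not the actual change. The identifier strings, the `DATABASELESS_CODING_SYSTEMS` set, the function names and the one-directory-per-coding-system layout with `*.sqlite3` files are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: identifiers for the coding systems that have no database.
# The commit message names OPCS4 and Readv2; these id strings are guesses.
DATABASELESS_CODING_SYSTEMS = frozenset({"opcs4", "readv2"})


def needs_database(coding_system_id: str) -> bool:
    """Return True if a database file should exist for this coding system."""
    return coding_system_id not in DATABASELESS_CODING_SYSTEMS


def check_database(coding_system_id: str, database_dir: Path) -> None:
    """Raise FileNotFoundError if a required database file is missing."""
    if not needs_database(coding_system_id):
        return  # no database exists for this coding system, so skip the check
    # Assumption: one subdirectory per coding system holding *.sqlite3 files.
    system_dir = database_dir / coding_system_id
    if not system_dir.is_dir() or not any(system_dir.glob("*.sqlite3")):
        raise FileNotFoundError(
            f"No database found for {coding_system_id!r} in {system_dir}"
        )
```

With this shape, a request for a database-less coding system returns early, while any other coding system still fails loudly when its database file is absent.
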
When checking for the existence of coding system database files,
the bulk upload script made false assumptions about their path.

An env var, DATABASE_DIR, controls the database file path and should
be checked before assuming the default location.

Ideally the coding system database paths would be pulled directly out
of django.conf.settings, but this would require the full Django
machinery to be instantiated for this script to run.
This could be achieved by changing this ad-hoc script into a Django
script run via `runscript`, but that would require significant and
unergonomic changes to how command-line arguments are handled.
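
As a rough illustration of checking the env var before falling back to a default, without importing Django settings, something like the following could work. `DEFAULT_DATABASE_DIR` and `resolve_database_dir` are hypothetical names, and the default path shown is a placeholder rather than the project's real location.

```python
import os
from pathlib import Path

# Assumption: placeholder fallback used when DATABASE_DIR is unset.
# The project's real default location is not stated in the commit message.
DEFAULT_DATABASE_DIR = Path("coding_systems")


def resolve_database_dir(environ=None) -> Path:
    """Return the directory to search for coding system databases.

    DATABASE_DIR wins when it is set and non-empty; otherwise the default
    location is used. `environ` is injectable to keep the function testable.
    """
    env = os.environ if environ is None else environ
    value = env.get("DATABASE_DIR")
    return Path(value) if value else DEFAULT_DATABASE_DIR
```
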
CodelistVersions created from CSV can contain additional columns beyond
the default `code` and `description`.

This commit adds support for loading such additional columns via the
bulk upload script.
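
To illustrate the idea of carrying extra columns alongside `code` and `description`, here is a small sketch using the standard library's `csv.DictReader`. The function name, the `extra_columns` parameter and the example `category` column are assumptions, not the script's actual interface.

```python
import csv
import io

# The two default columns named in the commit message.
DEFAULT_COLUMNS = ("code", "description")


def read_codelist_rows(csv_text, extra_columns=()):
    """Parse CSV text, keeping the default columns plus any requested extras."""
    wanted = [*DEFAULT_COLUMNS, *extra_columns]
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [column for column in wanted if column not in header]
    if missing:
        raise ValueError(f"CSV is missing expected columns: {missing}")
    # Drop any columns that were not asked for, preserving the requested order.
    return [{column: row[column] for column in wanted} for row in reader]
```

For example, calling `read_codelist_rows(text, extra_columns=["category"])` on a file with a hypothetical `category` column keeps that column in each row, while unrequested columns are dropped.
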
Previously, bulk-uploaded codelist files contained many codelists
concatenated into a single file.

This change allows iteration over a set of files matched by a glob
pattern.
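
A sketch of expanding a glob pattern into a list of input files with the standard `glob` module. `expand_input_paths` is a hypothetical helper, and raising when nothing matches is a design choice assumed here rather than taken from the script.

```python
import glob


def expand_input_paths(pattern: str) -> list[str]:
    """Expand a glob pattern into a sorted list of matching file paths."""
    # Sorting makes the processing order deterministic across platforms.
    paths = sorted(glob.glob(pattern))
    if not paths:
        # Assumption: an empty match is more likely a typo than intended.
        raise FileNotFoundError(f"No files match pattern {pattern!r}")
    return paths
```

Note that the pattern usually needs quoting on the command line so that the shell passes it through unexpanded.
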
Fixes a bug where the optional "tag" config value was assumed to be present.
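
The usual shape of this kind of fix is to read the optional key with `dict.get` rather than by subscript. The sketch below assumes the config is a plain dict; `build_payload` and the `name` key are hypothetical and only `tag` comes from the commit message.

```python
def build_payload(config: dict) -> dict:
    """Build an upload payload, including "tag" only when it is configured."""
    # Assumption: "name" stands in for whatever required keys the config has.
    payload = {"name": config["name"]}
    # config["tag"] would raise KeyError when the optional key is absent;
    # .get returns None instead, so the key can simply be left out.
    tag = config.get("tag")
    if tag is not None:
        payload["tag"] = tag
    return payload
```
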
Replicate the codelist description's update behaviour for methodology.

Add tests for both.

Configure column alias to allow UKHSA surgical site infection codelists
to have codelist methodology set.
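
One way a column alias can work is as a mapping from the header used in the source files to the field name the uploader expects. The sketch below is illustrative only: the source header `Methodology notes` is invented, and `COLUMN_ALIASES` and `apply_column_aliases` are hypothetical names.

```python
# Assumption: an invented source header mapped to the canonical field name.
COLUMN_ALIASES = {"Methodology notes": "methodology"}


def apply_column_aliases(row: dict, aliases: dict) -> dict:
    """Rename aliased keys in a row; keys without an alias are kept as-is."""
    return {aliases.get(key, key): value for key, value in row.items()}
```

Applied to each parsed row, this lets a file that uses a different column name still populate the methodology field.
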
@Jongmassey Jongmassey force-pushed the Jongmassey/ssi-codelists-bulk-import branch from bf72db7 to 851ed31 Compare January 28, 2026 15:27
@Jongmassey Jongmassey merged commit 4228cef into main Jan 28, 2026
6 checks passed
@Jongmassey Jongmassey deleted the Jongmassey/ssi-codelists-bulk-import branch January 28, 2026 15:31