Skip to content

Conversation

@echicoine-icf
Copy link
Contributor

This change helps to establish dependency ordering for persisting resources to a FHIR server for a more successful ingestion of data. The change also distinguishes between a POST and PUT call based on transaction Bundle vs Resource.

  • Github Issue:
  • I've read the contribution guidelines
  • Code compiles without errors
  • Tests are created / updated
  • Documentation is created / updated

By creating this PR you acknowledge that your contribution will be licensed under Apache 2.0

@echicoine-icf echicoine-icf changed the title HttpClientUtils: Post and put handling HttpClientUtils: POST and PUT handling / Resource ingestion priority Mar 10, 2025
@raleigh-g-thompson raleigh-g-thompson self-requested a review April 17, 2025 14:50
@raleigh-g-thompson
Copy link
Contributor

@echicoine-icf Will need to remove ThreadUtils changes since those are now covered by PR #555. Might need to rebase, or merge master to address the tests issues that github is reporting.

@echicoine-icf
Copy link
Contributor Author

@echicoine-icf Will need to remove ThreadUtils changes since those are now covered by PR #555. Might need to rebase, or merge master to address the tests issues that github is reporting.

sorry about that, forgot I had it in this PR. It's been removed here now, and the latest master branch pulled in

@c-schuler c-schuler self-assigned this Apr 21, 2025
@c-schuler c-schuler self-requested a review April 23, 2025 18:09
@c-schuler
Copy link
Contributor

@echicoine-icf Looks like we are reinventing the wheel a bit here. I understand the need, but am suspicious of the proposed solution. The priority ranking seems unnecessary. Why not just merge all the resources into a single transaction bundle? That removes the need for priority and always results in a POST request. Additionally, the bundle can be deduped to avoid redundant requests.

Perhaps I am misunderstanding something?

@echicoine-icf
Copy link
Contributor Author

@echicoine-icf Looks like we are reinventing the wheel a bit here. I understand the need, but am suspicious of the proposed solution. The priority ranking seems unnecessary. Why not just merge all the resources into a single transaction bundle? That removes the need for priority and always results in a POST request. Additionally, the bundle can be deduped to avoid redundant requests.

Perhaps I am misunderstanding something?

This comes from testing the RefreshIG operation against our hapi-fhir endpoint for weeks on end. I can't explain why, but this approach had the highest "success rate" with regard to the number of resources accepted by hapi-fhir. When I only focused on uploading transaction bundles, there were failures due to references to missing resources. Then when I started sending individual resources before those transaction bundles the bundles were accepted. I also am inclined to think madie's own sort of data integrity enforcement being looser than hapi-fhir's could be a source of frustration here. I had to sanitize their data in many different ways just to get it accepted by the endpoint (no numerical-only ids, no references to missing resources to name the two biggest ones.)

@c-schuler
Copy link
Contributor

@echicoine-icf So it is the case then that the bundling/packaging process includes resources with references to resources not included in the bundle?

@echicoine-icf
Copy link
Contributor Author

@echicoine-icf So it is the case then that the bundling/packaging process includes resources with references to resources not included in the bundle?

When I get some time I'll retest for clarity, but the files being persisted should be the same as what's in the transaction bundle. It's been a few months since I last worked on this so I'll give updates when I have further information.

@codecov
Copy link

codecov bot commented May 5, 2025

Codecov Report

Attention: Patch coverage is 44.00000% with 168 lines in your changes missing coverage. Please review.

Project coverage is 21.39%. Comparing base (0bead88) to head (8ea95cc).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
...opencds/cqf/tooling/utilities/HttpClientUtils.java 50.20% 104 Missing and 16 partials ⚠️
...org/opencds/cqf/tooling/utilities/BundleUtils.java 0.00% 42 Missing ⚠️
...f/tooling/processor/PostBundlesInDirProcessor.java 0.00% 4 Missing ⚠️
...opencds/cqf/tooling/processor/AbstractBundler.java 80.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #548      +/-   ##
============================================
- Coverage     21.42%   21.39%   -0.04%     
+ Complexity     1703     1701       -2     
============================================
  Files           303      303              
  Lines         26379    26484     +105     
  Branches       4204     4246      +42     
============================================
+ Hits           5653     5667      +14     
- Misses        19787    19871      +84     
- Partials        939      946       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@echicoine-icf
Copy link
Contributor Author

echicoine-icf commented May 5, 2025

@echicoine-icf So it is the case then that the bundling/packaging process includes resources with references to resources not included in the bundle?

First off thank you for your input on this and guidance. I spent time today stepping through it and found a major oversight. In AbstractBundler the persistBundle call was coming after calls to add various resources to the bundle first. This would account for the missing resources.

I've adjusted the abstract class to no longer require persistFilesFolder, and removed this function from MeasureBundler. The process will focus on ensuring the bundle has all the resources required.

As to the changes to HttpClientUtils - bundles still require a different http method than individual resources, and in my tests I did manage to find an optimal order in which various individual resources should be persisted. For example, Encounter references Patient, or an Observation might reference an Encounter, etc. I think having HttpClientUtils handle sorting of resources and handling the different approach to sending them to a fhir server is a useful addition to the class to help keep it's capability robust.

@echicoine-icf
Copy link
Contributor Author

Closing this pull request as the defined goals are now out of scope from the initial request. Will open a new PR to address bug fixes with regard to persisting transaction bundles during Refresh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants