[RORDEV-1940] Fix ES 8.0.x/8.1.x container startup crash caused by JDK 17.0.2 cgroup v2 bug#73
[RORDEV-1940] Fix ES 8.0.x/8.1.x container startup crash caused by JDK 17.0.2 cgroup v2 bug#73
Conversation
This comment was marked as spam.
This comment was marked as spam.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@environments/eck-ror/start.sh`:
- Line 158: The curl command that downloads Corretto uses the insecure -k flag;
remove the -k option from the curl invocation so TLS certificate verification
remains enabled (the line containing curl -fsSLk
"https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz"
-o /tmp/jdk.tar.gz should be changed to drop -k), keeping the rest of the flags
(-fsSL) and output path (/tmp/jdk.tar.gz) unchanged.
In `@environments/elk-ror/images/es/Dockerfile`:
- Line 31: The curl invocation that downloads Corretto uses the insecure -k
flag; remove -k from the curl command in the Dockerfile line containing "curl
-fsSLk
\"https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz\""
and also remove -k from the corresponding curl in start.sh (around the download
at ~line 158) so TLS certificate verification is enforced; keep the other flags
(-fsSL) unchanged and verify the commands still work against the corretto.aws
endpoint.
🧹 Nitpick comments (1)
environments/eck-ror/start.sh (1)
130-168: Duplicated version-mapping logic between this script and the Dockerfile.The ES version → Corretto version mapping is duplicated in
environments/elk-ror/images/es/Dockerfile(lines 17-27). If the affected version ranges change, both places must be updated in lockstep. Consider extracting the mapping to a shared file (e.g., a sourced shell snippet or a simple lookup table) to keep it DRY.
| if [ -n "$CORRETTO_VERSION" ]; then \ | ||
| ARCH=$(uname -m | sed 's/x86_64/x64/' | sed 's/arm64/aarch64/') && \ | ||
| echo "Replacing buggy bundled JDK with Corretto $CORRETTO_VERSION for ES $ES_VERSION (arch: $ARCH)" && \ | ||
| curl -fsSLk "https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz" -o /tmp/jdk.tar.gz && \ |
There was a problem hiding this comment.
Remove -k (insecure) flag from curl.
The -k flag disables TLS certificate verification, allowing MITM attacks during the JDK download. The corretto.aws endpoint has valid certificates, so this flag is unnecessary. The same issue exists in start.sh line 158.
Proposed fix
- curl -fsSLk "https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz" -o /tmp/jdk.tar.gz && \
+ curl -fsSL "https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz" -o /tmp/jdk.tar.gz && \📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| curl -fsSLk "https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz" -o /tmp/jdk.tar.gz && \ | |
| curl -fsSL "https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz" -o /tmp/jdk.tar.gz && \ |
🤖 Prompt for AI Agents
In `@environments/elk-ror/images/es/Dockerfile` at line 31, The curl invocation
that downloads Corretto uses the insecure -k flag; remove -k from the curl
command in the Dockerfile line containing "curl -fsSLk
\"https://corretto.aws/downloads/resources/${CORRETTO_VERSION}/amazon-corretto-${CORRETTO_VERSION}-linux-${ARCH}.tar.gz\""
and also remove -k from the corresponding curl in start.sh (around the download
at ~line 158) so TLS certificate verification is enforced; keep the other flags
(-fsSL) unchanged and verify the commands still work against the corretto.aws
endpoint.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@environments/common/images/es-jdk-patch/patch-es-jdk.sh`:
- Around line 12-14: Validate ES_VERSION before using cut: check that ES_VERSION
is set and matches a semantic version pattern (e.g., digits.digits.digits) and
exit with a clear error if not; only after that parse MAJOR, MINOR, PATCH from
ES_VERSION and proceed with the existing numeric comparisons in the script
(referencing ES_VERSION, MAJOR, MINOR, PATCH). Ensure the guard triggers under
set -e (use an explicit if [[ -z "$ES_VERSION" || ! "$ES_VERSION" =~
^[0-9]+(\.[0-9]+){1,2}$ ]]; then ... fi) so downstream arithmetic won't run on
empty values.
In `@environments/elk-ror/images/es/Dockerfile`:
- Line 19: The Dockerfile ends with "USER root" which leaves the container
running as root; either switch back to the non-root Elasticsearch user after
privileged setup or document why root is required. Modify the Dockerfile so
privileged operations (package install, permission/chown for ReadonlyREST
patching) run as root, then add a final "USER elasticsearch" (or the appropriate
non-root username used in this image) to drop privileges, or if reverting to
non-root is not possible for E2E needs, add a clear comment next to the "USER
root" instruction explaining the ReadonlyREST patching workflow and why root is
required for this test image.
🧹 Nitpick comments (3)
environments/elk-ror/start.sh (1)
119-125: Consider skipping the patch build for unaffected ES versions.The patch script supports a
--checkmode to test whether patching is needed. Currently, this builds a new Docker image for every ES version, even when no JDK replacement occurs. You could gate this step to avoid unnecessary image builds:💡 Suggested optimization
-echo "Building JDK-patched ES base image ..." -export ES_PATCHED_IMAGE="es-ror-patched:${ES_VERSION}" -docker build \ - --build-arg BASE_IMAGE="${ROR_ES_REPO}:${ES_VERSION}-ror-${ROR_ES_VERSION}" \ - --build-arg ES_VERSION="$ES_VERSION" \ - -t "$ES_PATCHED_IMAGE" \ - ../common/images/es-jdk-patch/ +BASE_ES_IMAGE="${ROR_ES_REPO}:${ES_VERSION}-ror-${ROR_ES_VERSION}" +if ES_VERSION="$ES_VERSION" ../common/images/es-jdk-patch/patch-es-jdk.sh --check 2>/dev/null; then + echo "Building JDK-patched ES base image ..." + export ES_PATCHED_IMAGE="es-ror-patched:${ES_VERSION}" + docker build \ + --build-arg BASE_IMAGE="$BASE_ES_IMAGE" \ + --build-arg ES_VERSION="$ES_VERSION" \ + -t "$ES_PATCHED_IMAGE" \ + ../common/images/es-jdk-patch/ +else + echo "No JDK patch needed for ES $ES_VERSION, using base image directly." + export ES_PATCHED_IMAGE="$BASE_ES_IMAGE" +fienvironments/common/images/es-jdk-patch/Dockerfile (1)
1-7: Consider restoring the original user after patching.Trivy flags
USER rootas the last instruction (DS-0002). While this is an intermediate image and the downstreames/Dockerfilesets its ownUSER, restoringelasticsearchhere would be defensive and make this image safe to use directly as well.Suggested improvement
ARG ES_VERSION USER root COPY patch-es-jdk.sh /tmp/patch-es-jdk.sh RUN chmod +x /tmp/patch-es-jdk.sh && ES_VERSION=${ES_VERSION} /tmp/patch-es-jdk.sh && rm -f /tmp/patch-es-jdk.sh +USER elasticsearchenvironments/common/images/es-jdk-patch/patch-es-jdk.sh (1)
29-32:--checkmode: consider placing it before the version parsing to clarify intent, or document that it requiresES_VERSION.Minor: the
--checkargument is parsed after version logic, which is fine functionally, but could be confusing if someone passes--checkwithoutES_VERSION(it would error out on the arithmetic comparisons before reaching line 29).
| MAJOR=$(echo "$ES_VERSION" | cut -d. -f1) | ||
| MINOR=$(echo "$ES_VERSION" | cut -d. -f2) | ||
| PATCH=$(echo "$ES_VERSION" | cut -d. -f3) |
There was a problem hiding this comment.
Missing validation for ES_VERSION.
If ES_VERSION is unset or empty, cut will produce empty strings and the arithmetic comparisons on lines 17–26 will fail with errors under set -e. Add a guard:
Suggested fix
+if [ -z "$ES_VERSION" ]; then
+ echo "ES_VERSION is not set" >&2
+ exit 1
+fi
+
MAJOR=$(echo "$ES_VERSION" | cut -d. -f1)
MINOR=$(echo "$ES_VERSION" | cut -d. -f2)
PATCH=$(echo "$ES_VERSION" | cut -d. -f3)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| MAJOR=$(echo "$ES_VERSION" | cut -d. -f1) | |
| MINOR=$(echo "$ES_VERSION" | cut -d. -f2) | |
| PATCH=$(echo "$ES_VERSION" | cut -d. -f3) | |
| if [ -z "$ES_VERSION" ]; then | |
| echo "ES_VERSION is not set" >&2 | |
| exit 1 | |
| fi | |
| MAJOR=$(echo "$ES_VERSION" | cut -d. -f1) | |
| MINOR=$(echo "$ES_VERSION" | cut -d. -f2) | |
| PATCH=$(echo "$ES_VERSION" | cut -d. -f3) |
🤖 Prompt for AI Agents
In `@environments/common/images/es-jdk-patch/patch-es-jdk.sh` around lines 12 -
14, Validate ES_VERSION before using cut: check that ES_VERSION is set and
matches a semantic version pattern (e.g., digits.digits.digits) and exit with a
clear error if not; only after that parse MAJOR, MINOR, PATCH from ES_VERSION
and proceed with the existing numeric comparisons in the script (referencing
ES_VERSION, MAJOR, MINOR, PATCH). Ensure the guard triggers under set -e (use an
explicit if [[ -z "$ES_VERSION" || ! "$ES_VERSION" =~ ^[0-9]+(\.[0-9]+){1,2}$
]]; then ... fi) so downstream arithmetic won't run on empty values.
| # For ROR_ES_VERSION < 1.64.0 | ||
| ENV I_UNDERSTAND_IMPLICATION_OF_ES_PATCHING yes | ||
| USER root No newline at end of file | ||
| USER root |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Read the full Dockerfile to see context
cat -n environments/elk-ror/images/es/DockerfileRepository: beshu-tech/readonlyrest-e2e-tests
Length of output: 976
🏁 Script executed:
#!/bin/bash
# Check for entrypoint scripts in the same directory
ls -la environments/elk-ror/images/es/Repository: beshu-tech/readonlyrest-e2e-tests
Length of output: 244
🏁 Script executed:
#!/bin/bash
# Search for any entrypoint or startup scripts that might handle privilege dropping
find environments/elk-ror/images/es -type f \( -name "entrypoint*" -o -name "start*" -o -name "docker-entrypoint*" \) 2>/dev/null | head -20Repository: beshu-tech/readonlyrest-e2e-tests
Length of output: 59
🏁 Script executed:
#!/bin/bash
# Check other Dockerfiles in the repo to see common USER patterns
find . -name "Dockerfile*" -type f | xargs grep -l "USER" | head -10Repository: beshu-tech/readonlyrest-e2e-tests
Length of output: 305
🏁 Script executed:
#!/bin/bash
# Check USER patterns in other Dockerfiles
for file in ./docker-based-ror-dev-env/src/Dockerfile ./environments/common/images/es-jdk-patch/Dockerfile ./environments/elk-ror/images/apm/Dockerfile ./environments/elk-ror/images/kbn/Dockerfile; do
echo "=== $file ==="
grep -n "USER" "$file" | tail -5
doneRepository: beshu-tech/readonlyrest-e2e-tests
Length of output: 364
🏁 Script executed:
#!/bin/bash
# Check if there's any entrypoint or startup logic in the ES_PATCHED_IMAGE base image reference or documentation
grep -r "ES_PATCHED_IMAGE\|entrypoint" environments/elk-ror/images/es/ 2>/dev/null || echo "No additional files found"Repository: beshu-tech/readonlyrest-e2e-tests
Length of output: 206
🏁 Script executed:
#!/bin/bash
# Search for any build scripts or compose files that might show how the image is built/used
find . -name "docker-compose*" -o -name "*build*" | grep -E "(elk-ror|es)" | head -5Repository: beshu-tech/readonlyrest-e2e-tests
Length of output: 141
🏁 Script executed:
#!/bin/bash
# Read the kbn Dockerfile to see the pattern
cat -n environments/elk-ror/images/kbn/DockerfileRepository: beshu-tech/readonlyrest-e2e-tests
Length of output: 1263
🏁 Script executed:
#!/bin/bash
# Check the buildRorDevEnvImageIfNecessary.sh to understand context
head -30 ./docker-based-ror-dev-env/src/buildRorDevEnvImageIfNecessary.shRepository: beshu-tech/readonlyrest-e2e-tests
Length of output: 444
USER root as the final Dockerfile instruction — container defaults to running as root.
Trivy correctly flags this (DS-0002). However, examining the related Kibana Dockerfile in the same environment shows that root privileges are intentionally needed for ReadonlyREST patching operations (e.g., package installation and permission/ownership modifications). This pattern is consistent across the E2E test environment images.
For a production image, consider either:
- Dropping back to the
elasticsearchuser after any privileged initialization - Adding a comment explaining why root is required for the ROR patching workflow
For an E2E test environment, this may be acceptable as-is, but documenting the rationale would improve clarity.
🧰 Tools
🪛 Trivy (0.69.1)
[error] 19-19: Image user should not be 'root'
Last USER command in Dockerfile should not be 'root'
Rule: DS-0002
(IaC/Dockerfile)
🤖 Prompt for AI Agents
In `@environments/elk-ror/images/es/Dockerfile` at line 19, The Dockerfile ends
with "USER root" which leaves the container running as root; either switch back
to the non-root Elasticsearch user after privileged setup or document why root
is required. Modify the Dockerfile so privileged operations (package install,
permission/chown for ReadonlyREST patching) run as root, then add a final "USER
elasticsearch" (or the appropriate non-root username used in this image) to drop
privileges, or if reverting to non-root is not possible for E2E needs, add a
clear comment next to the "USER root" instruction explaining the ReadonlyREST
patching workflow and why root is required for this test image.
Summary by CodeRabbit
Bug Fixes
New Features