Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
175 changes: 173 additions & 2 deletions .github/workflows/diff-skills.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,5 +9,176 @@ permissions:

jobs:
diff-skills:
uses: cardstack/gh-actions/.github/workflows/diff-skills.yml@skills-matrix-cs-8727
secrets: inherit
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0

- name: Get changed JSON files
id: changed
run: |
FILES=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Skill/*.json')
if [ -z "$FILES" ]; then
echo "has_changes=false" >> "$GITHUB_OUTPUT"
else
echo "has_changes=true" >> "$GITHUB_OUTPUT"
echo "$FILES" > "$RUNNER_TEMP/changed_files.txt"
fi

- name: Generate diff
id: diff
if: steps.changed.outputs.has_changes == 'true'
run: |
diff_file="$RUNNER_TEMP/skills-diff.md"
echo "## Skills Diff" > "$diff_file"
echo "" >> "$diff_file"

# Write the Python diff script
cat > "$RUNNER_TEMP/smart_diff.py" << 'PYEOF'
import json, sys, difflib

def expand(text):
try:
obj = json.loads(text)
s = json.dumps(obj, indent=2, sort_keys=True, ensure_ascii=False)
return s.replace('\\n', '\n')
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replace('\\n', '\n') operation on line 45 appears to be attempting to handle escaped newlines in JSON strings, but this is unnecessary and potentially problematic. json.dumps() already properly handles newlines, and this replace will convert literal backslash-n sequences (which might be intentional in the JSON) into actual newlines. Consider removing this line unless there's a specific reason for it.

Suggested change
return s.replace('\\n', '\n')
return s

Copilot uses AI. Check for mistakes.
except:
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bare exception clause except: on line 46 catches all exceptions including system exits and keyboard interrupts, which can mask critical errors. Replace with except (ValueError, json.JSONDecodeError): to catch only JSON parsing errors specifically.

Suggested change
except:
except (ValueError, json.JSONDecodeError):

Copilot uses AI. Check for mistakes.
return text

def sentence_start(text, pos):
"""Find the start of the sentence containing pos."""
i = pos - 1
while i > 0:
if text[i] in '.!?' and i + 1 < len(text) and text[i + 1] == ' ':
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent sentence boundary detection logic. Line 53 uses and while line 62 correctly uses or for the same pattern. Line 53 should be changed to: if text[i] in '.!?' and (i + 1 >= len(text) or text[i + 1] == ' '): to properly handle sentence-ending punctuation at the end of a string or not followed by a space.

Suggested change
if text[i] in '.!?' and i + 1 < len(text) and text[i + 1] == ' ':
if text[i] in '.!?' and (i + 1 >= len(text) or text[i + 1] == ' '):

Copilot uses AI. Check for mistakes.
return i + 2
i -= 1
return 0

def sentence_end(text, pos):
"""Find the end of the sentence containing pos."""
i = pos
while i < len(text):
if text[i] in '.!?' and (i + 1 >= len(text) or text[i + 1] == ' '):
return i + 1
i += 1
return len(text)

def trim_pair(old_line, new_line):
"""Extract the sentence containing the change."""
prefix_len = 0
for i in range(min(len(old_line), len(new_line))):
if old_line[i] == new_line[i]:
prefix_len = i + 1
else:
break
start = sentence_start(old_line, prefix_len)
end_old = sentence_end(old_line, prefix_len)
end_new = sentence_end(new_line, prefix_len)
return (
old_line[start:end_old],
new_line[start:end_new],
)
Comment on lines +67 to +81
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trim_pair function has a potential issue when all characters match between old_line and new_line. In this case, prefix_len will equal the minimum length of both strings, but if the strings differ only in length (one is a prefix of the other), calling sentence_start and sentence_end with this position may not capture the intended change. Consider handling the case where one string is a complete prefix of the other.

Copilot uses AI. Check for mistakes.

def trim_long(text, max_len=120):
if len(text) > max_len:
h = max_len // 2 - 2
return f"{text[:h]}...{text[-h:]}"
return text

old_text = expand(open(sys.argv[1]).read())
new_text = expand(open(sys.argv[2]).read())
Comment on lines +89 to +90
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing error handling for file operations. The open() calls on lines 89-90 should include error handling (try/except) or use context managers to ensure files are properly closed even if errors occur. Additionally, there's no validation that sys.argv has the required 3 arguments before accessing them.

Suggested change
old_text = expand(open(sys.argv[1]).read())
new_text = expand(open(sys.argv[2]).read())
if len(sys.argv) < 4:
sys.stderr.write("Usage: smart_diff.py <old_file> <new_file> <filename>\n")
sys.exit(1)
try:
with open(sys.argv[1]) as f:
old_text = expand(f.read())
with open(sys.argv[2]) as f:
new_text = expand(f.read())
except OSError as e:
sys.stderr.write(f"Error reading input files: {e}\n")
sys.exit(1)

Copilot uses AI. Check for mistakes.
filename = sys.argv[3]

diff = list(difflib.unified_diff(
old_text.splitlines(), new_text.splitlines(),
fromfile=f"a/{filename}", tofile=f"b/{filename}",
lineterm='', n=1,
))
if not diff:
sys.exit(0)

out = []
i = 0
while i < len(diff):
line = diff[i]
if line.startswith('---') or line.startswith('+++') or line.startswith('@@'):
out.append(line)
i += 1
continue
if line.startswith('-') and not line.startswith('---'):
minus, plus = [], []
while i < len(diff) and diff[i].startswith('-') and not diff[i].startswith('---'):
minus.append(diff[i][1:])
i += 1
while i < len(diff) and diff[i].startswith('+') and not diff[i].startswith('+++'):
plus.append(diff[i][1:])
i += 1
if len(minus) == len(plus):
for m, p in zip(minus, plus):
if max(len(m), len(p)) > 120:
tm, tp = trim_pair(m, p)
out.append(f"-{tm}")
out.append(f"+{tp}")
else:
out.append(f"-{m}")
out.append(f"+{p}")
else:
for m in minus:
out.append(f"-{trim_long(m)}")
for p in plus:
out.append(f"+{trim_long(p)}")
continue
if line.startswith(' '):
out.append(f" {trim_long(line[1:])}")
else:
out.append(line)
i += 1
Comment on lines +109 to +136
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff processing logic only handles lines starting with '-' on line 109, but doesn't handle standalone additions (lines starting with '+' that aren't preceded by '-' lines). These will fall through to the else clause on line 134-135 and be output as-is without the trim_long optimization. Consider adding a separate condition to handle '+' lines: elif line.startswith('+') and not line.startswith('+++'): before the space-handling condition.

Copilot uses AI. Check for mistakes.

print('\n'.join(out))
PYEOF

while IFS= read -r file; do
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For new JSON files that don't exist in the base branch, the fallback || echo "{}" > "$RUNNER_TEMP/old.json" on line 142 creates an empty JSON object, which is appropriate. However, for deleted files, line 143 will also create an empty JSON object for new.json, which may not clearly indicate file deletion in the diff. Consider adding a comment to clarify this behavior or handle deletion explicitly.

Suggested change
while IFS= read -r file; do
while IFS= read -r file; do
# If the file does not exist in either the base or head commit (for example, new or deleted files),
# fall back to an empty JSON object so the diff tool can still run. For deleted files, this means
# new.json will be "{}", so the diff will show all keys removed rather than using a special deletion marker.

Copilot uses AI. Check for mistakes.
git show "${{ github.event.pull_request.base.sha }}:$file" > "$RUNNER_TEMP/old.json" 2>/dev/null || echo "{}" > "$RUNNER_TEMP/old.json"
git show "${{ github.event.pull_request.head.sha }}:$file" > "$RUNNER_TEMP/new.json" 2>/dev/null || echo "{}" > "$RUNNER_TEMP/new.json"

FILE_DIFF=$(python3 "$RUNNER_TEMP/smart_diff.py" "$RUNNER_TEMP/old.json" "$RUNNER_TEMP/new.json" "$file")
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shell script doesn't handle the case where the Python script fails. On line 145, if smart_diff.py exits with an error, the script continues executing and may produce incorrect or incomplete output. Consider adding error checking: FILE_DIFF=$(python3 "$RUNNER_TEMP/smart_diff.py" "$RUNNER_TEMP/old.json" "$RUNNER_TEMP/new.json" "$file") || { echo "Error processing $file" >&2; continue; }

Suggested change
FILE_DIFF=$(python3 "$RUNNER_TEMP/smart_diff.py" "$RUNNER_TEMP/old.json" "$RUNNER_TEMP/new.json" "$file")
FILE_DIFF=$(python3 "$RUNNER_TEMP/smart_diff.py" "$RUNNER_TEMP/old.json" "$RUNNER_TEMP/new.json" "$file") || { echo "Error processing $file" >&2; continue; }

Copilot uses AI. Check for mistakes.

if [ -n "$FILE_DIFF" ]; then
{
echo "### $file"
echo ""
echo '````diff'
echo "$FILE_DIFF"
echo '````'
Comment on lines +151 to +153
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The markdown code fence uses 4 backticks on line 151 and 153, but this is unusual. Standard markdown uses 3 backticks. While 4 backticks can work (to allow 3 backticks inside), if the diff content contains 4 backticks, it could break the fence. Consider using standard 3 backticks or ensuring the fence uses more backticks than any content it contains.

Suggested change
echo '````diff'
echo "$FILE_DIFF"
echo '````'
echo '```diff'
echo "$FILE_DIFF"
echo '```'

Copilot uses AI. Check for mistakes.
echo ""
} >> "$diff_file"
fi
done < "$RUNNER_TEMP/changed_files.txt"

# Check if any diffs were actually found
if [ "$(wc -l < "$diff_file")" -le 2 ]; then
echo "has_changes=false" >> "$GITHUB_OUTPUT"
exit 0
fi

# Truncate if too large for a PR comment
if [ $(wc -c < "$diff_file") -gt 60000 ]; then
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command substitution on line 166 is missing quotes around the variable expansion. Change if [ $(wc -c < "$diff_file") -gt 60000 ]; then to if [ "$(wc -c < "$diff_file")" -gt 60000 ]; then to prevent word splitting and ensure robust behavior.

Suggested change
if [ $(wc -c < "$diff_file") -gt 60000 ]; then
if [ "$(wc -c < "$diff_file")" -gt 60000 ]; then

Copilot uses AI. Check for mistakes.
head -c 60000 "$diff_file" > "$diff_file.tmp"
{
cat "$diff_file.tmp"
echo ""
echo ""
echo "> :warning: Diff truncated. See the full diff in the Files tab."
} > "$diff_file"
Comment on lines +166 to +173
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truncation logic may create invalid markdown. When truncating at exactly 60000 bytes on line 167, you might cut in the middle of a multi-byte UTF-8 character or in the middle of markdown syntax (like code fences). Consider truncating at line boundaries or using a safer method that respects UTF-8 character boundaries.

Copilot uses AI. Check for mistakes.
fi

echo "has_changes=true" >> "$GITHUB_OUTPUT"
echo "diff_file=$diff_file" >> "$GITHUB_OUTPUT"

- name: Comment on PR
if: steps.diff.outputs.has_changes == 'true'
uses: marocchino/sticky-pull-request-comment@v2
with:
header: skills-diff
path: ${{ steps.diff.outputs.diff_file }}