-
Notifications
You must be signed in to change notification settings - Fork 0
Add skills diff checker for PRs #49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -9,5 +9,176 @@ permissions: | |||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| jobs: | ||||||||||||||||||||||||||||||
| diff-skills: | ||||||||||||||||||||||||||||||
| uses: cardstack/gh-actions/.github/workflows/diff-skills.yml@skills-matrix-cs-8727 | ||||||||||||||||||||||||||||||
| secrets: inherit | ||||||||||||||||||||||||||||||
| runs-on: ubuntu-latest | ||||||||||||||||||||||||||||||
| steps: | ||||||||||||||||||||||||||||||
| - uses: actions/checkout@v4 | ||||||||||||||||||||||||||||||
| with: | ||||||||||||||||||||||||||||||
| fetch-depth: 0 | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| - name: Get changed JSON files | ||||||||||||||||||||||||||||||
| id: changed | ||||||||||||||||||||||||||||||
| run: | | ||||||||||||||||||||||||||||||
| FILES=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} -- 'Skill/*.json') | ||||||||||||||||||||||||||||||
| if [ -z "$FILES" ]; then | ||||||||||||||||||||||||||||||
| echo "has_changes=false" >> "$GITHUB_OUTPUT" | ||||||||||||||||||||||||||||||
| else | ||||||||||||||||||||||||||||||
| echo "has_changes=true" >> "$GITHUB_OUTPUT" | ||||||||||||||||||||||||||||||
| echo "$FILES" > "$RUNNER_TEMP/changed_files.txt" | ||||||||||||||||||||||||||||||
| fi | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| - name: Generate diff | ||||||||||||||||||||||||||||||
| id: diff | ||||||||||||||||||||||||||||||
| if: steps.changed.outputs.has_changes == 'true' | ||||||||||||||||||||||||||||||
| run: | | ||||||||||||||||||||||||||||||
| diff_file="$RUNNER_TEMP/skills-diff.md" | ||||||||||||||||||||||||||||||
| echo "## Skills Diff" > "$diff_file" | ||||||||||||||||||||||||||||||
| echo "" >> "$diff_file" | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| # Write the Python diff script | ||||||||||||||||||||||||||||||
| cat > "$RUNNER_TEMP/smart_diff.py" << 'PYEOF' | ||||||||||||||||||||||||||||||
| import json, sys, difflib | ||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||
| def expand(text): | ||||||||||||||||||||||||||||||
| try: | ||||||||||||||||||||||||||||||
| obj = json.loads(text) | ||||||||||||||||||||||||||||||
| s = json.dumps(obj, indent=2, sort_keys=True, ensure_ascii=False) | ||||||||||||||||||||||||||||||
| return s.replace('\\n', '\n') | ||||||||||||||||||||||||||||||
| except: | ||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
| except: | |
| except (ValueError, json.JSONDecodeError): |
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent sentence boundary detection logic. Line 53 uses and while line 62 correctly uses or for the same pattern. Line 53 should be changed to: if text[i] in '.!?' and (i + 1 >= len(text) or text[i + 1] == ' '): to properly handle sentence-ending punctuation at the end of a string or not followed by a space.
| if text[i] in '.!?' and i + 1 < len(text) and text[i + 1] == ' ': | |
| if text[i] in '.!?' and (i + 1 >= len(text) or text[i + 1] == ' '): |
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trim_pair function has a potential issue when all characters match between old_line and new_line. In this case, prefix_len will equal the minimum length of both strings, but if the strings differ only in length (one is a prefix of the other), calling sentence_start and sentence_end with this position may not capture the intended change. Consider handling the case where one string is a complete prefix of the other.
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing error handling for file operations. The open() calls on lines 89-90 should include error handling (try/except) or use context managers to ensure files are properly closed even if errors occur. Additionally, there's no validation that sys.argv has the required 3 arguments before accessing them.
| old_text = expand(open(sys.argv[1]).read()) | |
| new_text = expand(open(sys.argv[2]).read()) | |
| if len(sys.argv) < 4: | |
| sys.stderr.write("Usage: smart_diff.py <old_file> <new_file> <filename>\n") | |
| sys.exit(1) | |
| try: | |
| with open(sys.argv[1]) as f: | |
| old_text = expand(f.read()) | |
| with open(sys.argv[2]) as f: | |
| new_text = expand(f.read()) | |
| except OSError as e: | |
| sys.stderr.write(f"Error reading input files: {e}\n") | |
| sys.exit(1) |
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diff processing logic only handles lines starting with '-' on line 109, but doesn't handle standalone additions (lines starting with '+' that aren't preceded by '-' lines). These will fall through to the else clause on line 134-135 and be output as-is without the trim_long optimization. Consider adding a separate condition to handle '+' lines: elif line.startswith('+') and not line.startswith('+++'): before the space-handling condition.
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For new JSON files that don't exist in the base branch, the fallback || echo "{}" > "$RUNNER_TEMP/old.json" on line 142 creates an empty JSON object, which is appropriate. However, for deleted files, line 143 will also create an empty JSON object for new.json, which may not clearly indicate file deletion in the diff. Consider adding a comment to clarify this behavior or handle deletion explicitly.
| while IFS= read -r file; do | |
| while IFS= read -r file; do | |
| # If the file does not exist in either the base or head commit (for example, new or deleted files), | |
| # fall back to an empty JSON object so the diff tool can still run. For deleted files, this means | |
| # new.json will be "{}", so the diff will show all keys removed rather than using a special deletion marker. |
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shell script doesn't handle the case where the Python script fails. On line 145, if smart_diff.py exits with an error, the script continues executing and may produce incorrect or incomplete output. Consider adding error checking: FILE_DIFF=$(python3 "$RUNNER_TEMP/smart_diff.py" "$RUNNER_TEMP/old.json" "$RUNNER_TEMP/new.json" "$file") || { echo "Error processing $file" >&2; continue; }
| FILE_DIFF=$(python3 "$RUNNER_TEMP/smart_diff.py" "$RUNNER_TEMP/old.json" "$RUNNER_TEMP/new.json" "$file") | |
| FILE_DIFF=$(python3 "$RUNNER_TEMP/smart_diff.py" "$RUNNER_TEMP/old.json" "$RUNNER_TEMP/new.json" "$file") || { echo "Error processing $file" >&2; continue; } |
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The markdown code fence uses 4 backticks on line 151 and 153, but this is unusual. Standard markdown uses 3 backticks. While 4 backticks can work (to allow 3 backticks inside), if the diff content contains 4 backticks, it could break the fence. Consider using standard 3 backticks or ensuring the fence uses more backticks than any content it contains.
| echo '````diff' | |
| echo "$FILE_DIFF" | |
| echo '````' | |
| echo '```diff' | |
| echo "$FILE_DIFF" | |
| echo '```' |
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The command substitution on line 166 is missing quotes around the variable expansion. Change if [ $(wc -c < "$diff_file") -gt 60000 ]; then to if [ "$(wc -c < "$diff_file")" -gt 60000 ]; then to prevent word splitting and ensure robust behavior.
| if [ $(wc -c < "$diff_file") -gt 60000 ]; then | |
| if [ "$(wc -c < "$diff_file")" -gt 60000 ]; then |
Copilot
AI
Feb 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The truncation logic may create invalid markdown. When truncating at exactly 60000 bytes on line 167, you might cut in the middle of a multi-byte UTF-8 character or in the middle of markdown syntax (like code fences). Consider truncating at line boundaries or using a safer method that respects UTF-8 character boundaries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
replace('\\n', '\n')operation on line 45 appears to be attempting to handle escaped newlines in JSON strings, but this is unnecessary and potentially problematic.json.dumps()already properly handles newlines, and this replace will convert literal backslash-n sequences (which might be intentional in the JSON) into actual newlines. Consider removing this line unless there's a specific reason for it.