test #10
Pull request from main branch is restricted.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
      @staticmethod
      def _decode_body(body: str, headers: list[dict]) -> str:
-         return base64.urlsafe_b64decode(body).decode("utf-8")
+         return (base64.urlsafe_b64decode(body).decode("utf-8")
+                 .replace("\r", "").replace("\n", "").replace("\t", ""))
Preserve whitespace in decoded email bodies
The new _decode_body strips every carriage return, newline, and tab by replacing each with the empty string (lines 38‑41). For any multi‑line plain‑text email this merges adjacent tokens (for example, "Order number:\n1234" becomes "Order number:1234") and collapses paragraphs before the text is fed to the classifier. Because _encode_thread_messages forwards this string to both OpenAI/GPTOSS prompts, every classification request for an email containing line breaks now operates on corrupted content, which will noticeably hurt accuracy. The decoded body should preserve its whitespace, or at least replace those characters with spaces, instead of deleting them outright.
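One possible shape for the fix is sketched below. It is not the repository's code: the function name `decode_body_preserving_whitespace`, the `flatten` flag, and the defensive padding step are assumptions made for illustration, and the unused `headers` parameter from the original signature is omitted. By default it keeps line breaks (normalizing CRLF and bare CR to LF) and turns tabs into spaces; with `flatten=True` it collapses each run of whitespace into a single space, which corresponds to the "at least replace it with spaces" option.

```python
import base64
import re


def decode_body_preserving_whitespace(body: str, flatten: bool = False) -> str:
    """Decode a URL-safe base64 body without gluing adjacent tokens together.

    Hypothetical helper for illustration; not part of the reviewed code.
    """
    # Assumption: the payload may arrive without "=" padding, so restore it.
    padded = body + "=" * (-len(body) % 4)
    text = base64.urlsafe_b64decode(padded).decode("utf-8")

    if flatten:
        # Single-line variant: every whitespace run becomes one space,
        # so "Order number:\n1234" stays two separate tokens.
        return re.sub(r"\s+", " ", text).strip()

    # Default: keep paragraphs. Normalize CRLF and bare CR to LF,
    # and replace tabs with spaces instead of deleting them.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\t", " ")


if __name__ == "__main__":
    sample = base64.urlsafe_b64encode(b"Order number:\r\n1234").decode("ascii")
    print(repr(decode_body_preserving_whitespace(sample)))
    # prints 'Order number:\n1234'
    print(repr(decode_body_preserving_whitespace(sample, flatten=True)))
    # prints 'Order number: 1234'
```

Whether the classifier prompt should receive the line-preserving or the flattened form depends on how the prompts are built; either avoids the token merging described above.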
No description provided.