@adnanrhussain
This PR extends the Sentence Structure Evaluator (notebook) to support additional grades (5 to 12) beyond the existing grades (3 & 4).

Status: This PR is under internal review and testing and is not yet ready for release.

@adnanrhussain adnanrhussain requested a review from gary-mu January 10, 2026 00:52
@gary-mu gary-mu left a comment

Love this integration of grades 5-12.
Two comments:

  1. Can we also add Ariena as a reviewer? She's not a member of this repo, so I can't add her.
  2. Can we add test results? I think running the test passages and pasting the results would be sufficient.

@adnanrhussain adnanrhussain requested a review from aychi1 January 13, 2026 20:30
@aychi1
aychi1 commented Jan 16, 2026

Overall LGTM. Two small callouts:

  • Some fields are duplicated between the Grades 3-4 and Grades 5-12 evals (e.g., num_compound vs. num_compound_sentences, perc_simple_sentences vs. perc_simple).
  • The prompt definition of a simple sentence is not exactly the same in Noah's version and Wayne's version, and this difference isn't captured in the PR. Noah's version specifically mentions that simple sentences with relative clauses still count as simple. It likely won't make a big difference, but I'm documenting here that this is a known departure.
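
One possible way to handle the duplicated fields downstream, sketched here under assumptions: map each duplicate name onto a single canonical name before comparing results across grade bands. The alias table, the choice of the longer names as canonical, and the helper `normalize_fields` are hypothetical and are not part of this PR; only the two field pairs named above are listed.

```python
# Hypothetical alias table: duplicate field name -> canonical field name.
# Only the two pairs mentioned in this thread are included, and treating
# the longer names as canonical is an arbitrary choice for illustration.
FIELD_ALIASES = {
    "num_compound": "num_compound_sentences",
    "perc_simple": "perc_simple_sentences",
}


def normalize_fields(result: dict) -> dict:
    """Return a copy of `result` with aliased keys renamed to canonical names.

    Raises ValueError if an alias and its canonical name are both present
    with different values, since that would silently hide a disagreement.
    """
    normalized = {}
    for key, value in result.items():
        canonical = FIELD_ALIASES.get(key, key)
        if canonical in normalized and normalized[canonical] != value:
            raise ValueError(f"conflicting values for {canonical!r}")
        normalized[canonical] = value
    return normalized
```

With this in place, an evaluator output from either grade band could be passed through `normalize_fields` before aggregation, so that downstream code only needs to know one name per metric.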

@aychi1 aychi1 left a comment

Overall LGTM. Commented on some non-blocking nits.

" description=\"Max number of clauses (independent + subordinate) found in a single sentence.\"\n",
" )\n",
" # (Grades 5-12)\n",
" num_compound: int = Field(\n",
Just curious, is there a reason why we have both num_compound_sentences (line 173) and num_compound (line 259)? Asking because I noticed that the definition is the same across the Grades 3-4 and Grades 5-12 evals.

On a similar note, the definition of Simple Sentences is slightly different between Noah's prompt and Wayne's prompt. It likely does not make a big difference, but I just want to call out that we've never tested it.
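
If we do want to test it, a minimal sketch of the comparison could look like the following. It assumes each prompt has already been run over the same list of sentences and has produced one label per sentence; the function names and the label format are hypothetical, and nothing here calls the actual evaluator.

```python
def agreement_rate(labels_a: list, labels_b: list) -> float:
    """Fraction of positions where the two label lists agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if not labels_a:
        raise ValueError("label lists must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def disagreements(sentences: list, labels_a: list, labels_b: list) -> list:
    """Return (sentence, label_a, label_b) for every sentence labeled differently."""
    if not (len(sentences) == len(labels_a) == len(labels_b)):
        raise ValueError("all inputs must have the same length")
    return [
        (sentence, a, b)
        for sentence, a, b in zip(sentences, labels_a, labels_b)
        if a != b
    ]
```

The output of `disagreements` is probably the more useful one here, since the sentences with relative clauses are exactly where the two definitions would be expected to diverge.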

4 participants