Skip to content

Conversation

@YishaiGlasner
Copy link
Contributor

@YishaiGlasner YishaiGlasner commented Feb 8, 2026

Description

  • Until today Term has unique primary titles, and name equal to english primary title. That made it unflexible - we couldn't have terms with the same one language and different other language, and changing title meant change all over the library. Now the name is used as a slug, so titles can be changed and affect other objects with no need to change them.
  • That makes adding dependinncies for Term easier (though not easy) - we have to rebuild library (caching of terms, categories and indexes), and then, if the Term was used as a sharedTitle to cascade all objects with this ref.

Code Changes

  1. Term (model/shema/py):
  • add titles to pkeys as internal attributes (not saved in db). add _update_tracked_primary_titles function and run it before _set_pkeys and in _normalize.
  • change validation, for not having duplicate name, duplicate primary titles together and not changing name (as it's a slug)
  • generate a name (like a slug) as part of the validate function
  • remove load_by_title for 2 terms can have now same titles. rather add load_by_primary_titles for getting a Term by both primary titles together.
  1. Change uses of load_by_title and Term().loadto load by name (which for all existing terms has the same result as primary English title) orload_by_primary_titles` (views.py, terms_test.py (i think no tests needed for new function), calendars.py).
  2. Change tests to allow duplicate titles, but not of primary titles together.
  3. add dependenciesw for changing primary titles (cascade categories, indexes' collective_title and indexes' nodes shredTitle (dependencies.py, helper/schema.py).
  4. fix bugs in cascade function (add attr_name where needed, change ManuscriptSet to ManuscriptPageSet, change ManuscriptSet to WebpageSet (helper/schema.py).
  5. terms api (wasn't un use AFAIK):
  • remove redundant url (which is also is not caught because a former url) (urls_library.py).
  • rename param term to name (views.py).
  • remove condition if name is None for url doesn't allow this (views.py).
  1. category api - change check of existing term (views.py).
  2. imports clean (text.py)
  3. Remove of unused code - commented ou, conditioned by False or functions with no uses (model/shema/py, category.py)

…itor before being replaced with different logic).
1. add `load_by_primary_title`.
2. remove validations regarding titles. rather add validation for new term's name doesn't exist.
load terms by primary titles, which ensure the category will get the required titles.
remove term's name setting, leaving it to be determined by the term saving.
remove dead code (if False) and commented-out code.
add validation name is not changed
…ught by only `add`), and have the same view as edit/terms.
…titles` since different terms can have one equal primary title.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors the Term model to decouple stable identifiers (name as a slug) from user-facing titles, allowing terms to share titles across languages and enabling cascading rebuilds/updates when primary titles change.

Changes:

  • Updates Term uniqueness/validation and replaces load_by_title usage with lookups by name or (en_primary, he_primary) pairs.
  • Adds a dependency hook intended to rebuild library caches and cascade ref updates when term primary titles change.
  • Cleans up related APIs/routes and removes some unused helper code.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
sefaria/utils/calendars.py Updates parasha Term lookup to load by name.
sefaria/urls_library.py Adjusts term editor route parameter name and removes a redundant URL.
sefaria/model/text.py Import cleanups and minor internal refactors.
sefaria/model/tests/terms_test.py Removes an obsolete load_by_title test.
sefaria/model/schema.py Refactors Term keys/validation and adds load_by_primary_titles + name generation.
sefaria/model/dependencies.py Subscribes to term title changes to trigger rebuild/cascade behavior.
sefaria/helper/schema.py Adds cascading/rebuild logic for term primary title changes and fixes some cascade calls.
sefaria/helper/category.py Removes unused check_term helper.
reader/views.py Updates term editor + term/category APIs to use name and new primary-title lookup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if cause_ref_change: # Now new refs are available and can be cascaded
for index in library.all_index_records():
for node in [index.nodes] + index.nodes.all_children():
if getattr(node, "sharedTitle", None) == old_primary_en:
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks for affected schema nodes by comparing node.sharedTitle to old_primary_en, but sharedTitle is intended to reference Term.name (the stable slug), which may differ from the English primary title after this refactor (or for disambiguated names like Foo1). Use term.name to select nodes, while still cascading refs from old_primary_en to new_primary_en.

Suggested change
if getattr(node, "sharedTitle", None) == old_primary_en:
if getattr(node, "sharedTitle", None) == term.name:

Copilot uses AI. Check for mistakes.
Comment on lines 282 to 285
name = self.get_primary_title()
dupes = len(TermSet({'name': {'$regex': fr'^{name}\d*$'}}))
if dupes:
name = f'{name}{dupes}'
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The autogenerated name disambiguation is based on len(TermSet(regex)), which (1) loads all matching records into memory, (2) doesn’t escape regex metacharacters in name, and (3) can produce collisions when suffixes are non-contiguous (e.g., existing Foo and Foo2 => count=2 => generates Foo2). Prefer count_documents()/TermSet(...).count() and pick the next available suffix via a loop (similar to SluggedAbstractMongoRecord.normalize_slug_field), using re.escape(name) when building regex patterns.

Suggested change
name = self.get_primary_title()
dupes = len(TermSet({'name': {'$regex': fr'^{name}\d*$'}}))
if dupes:
name = f'{name}{dupes}'
base_name = self.get_primary_title()
name = base_name
# Ensure the autogenerated name is unique by finding the next available numeric suffix.
if Term().load({'name': base_name}):
suffix = 2
while True:
candidate = f"{base_name}{suffix}"
if not Term().load({'name': candidate}):
name = candidate
break
suffix += 1

Copilot uses AI. Check for mistakes.
reader/views.py Outdated
"content": "Please include the primary Term name in the URL to uses the Terms Editor."
})
existing_term = Term().load({'name': name})
data = existing_term.contents() if existing_term else {}
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the term doesn’t exist, data is set to {}. The term editor UI/JSON flow previously pre-populated name and an empty titles array; returning an empty object can break the editor and/or lead to a POST missing required fields. Consider initializing data to at least {"name": name, "titles": []} for new terms (while still preventing name edits server-side).

Suggested change
data = existing_term.contents() if existing_term else {}
if existing_term:
data = existing_term.contents()
else:
# Initialize minimal structure for new terms to support the editor UI.
data = {"name": name, "titles": []}

Copilot uses AI. Check for mistakes.
if self.is_new() and Term().load({'name': self.name}):
raise DuplicateRecordError(f"A Term with the name {self.name} already exists")
elif not self.is_new() and self.is_key_changed('name'):
raise ValidationError({"name": "This field cannot be changed."})
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raising django.core.exceptions.ValidationError here is inconsistent with the rest of the Mongo model layer (which generally raises InputError/DuplicateRecordError). This may not be caught by existing API error handling in non-Django contexts and can surface as a 500. Prefer raising an InputError (or a project-specific validation exception) for immutability violations.

Suggested change
raise ValidationError({"name": "This field cannot be changed."})
raise InputError("Field 'name' cannot be changed.")

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure about that

@mergify
Copy link

mergify bot commented Feb 8, 2026

🧪 CI Insights

Here's what we observed from your CI run for b111d42.

❌ Job Failures

Pipeline Job Health on master Retries 🔍 CI Insights 📄 Logs
Continuous Continuous Testing: PyTest Healthy 0 View View

…st in pkeys.

The titles list was stored by reference in pkeys_orig_values, so in-place mutations bypassed dependency tracking. Replace with `_primary_en`/`_primary_he` string attrs that reliably detect changes.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

sefaria/model/schema.py:238

  • The class 'Term' does not override 'eq', but adds the new attribute _primary_en.
    The class 'Term' does not override 'eq', but adds the new attribute _primary_he.
    The class 'Term' does not override 'eq', but adds the new attribute title_group.
    The class 'Term' does not override 'eq', but adds the new attribute name.
    The class 'Term' does not override 'eq', but adds the new attribute titles.
class Term(abst.AbstractMongoRecord, AbstractTitledObject):

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if self.is_new() and Term().load({'name': self.name}):
raise DuplicateRecordError(f"A Term with the name {self.name} already exists")
elif not self.is_new() and self.is_key_changed('name'):
raise ValidationError({"name": "This field cannot be changed."})
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raising django.core.exceptions.ValidationError here will bypass the project’s catch_error_as_json handling (which only catches InputError) and can turn a user-caused validation issue into a 500. Prefer raising an InputError/DuplicateRecordError (or another sefaria.system.exceptions type) for consistency with the rest of the Mongo model layer.

Suggested change
raise ValidationError({"name": "This field cannot be changed."})
raise InputError({"name": "This field cannot be changed."})

Copilot uses AI. Check for mistakes.
Comment on lines 263 to 314
def load_by_primary_titles(self, en_title, he_title):
query = {
'titles': {
'$all': [{'$elemMatch': {
'text': t, 'primary': True
}} for t in [en_title, he_title]]
}
}
return self.load(query=query)

def _update_tracked_primary_titles(self):
self._primary_en = self.get_primary_title("en")
self._primary_he = self.get_primary_title("he")

def _set_pkeys(self):
self.set_titles(getattr(self, "titles", None))
self._update_tracked_primary_titles()
super()._set_pkeys()

def _set_derived_attributes(self):
self.set_titles(getattr(self, "titles", None))

def set_titles(self, titles):
self.title_group = TitleGroup(titles)

def _set_name(self):
name = base_name = self.get_primary_title()
terms = TermSet({'name': {'$regex': f'^{re.escape(name)}\d*$'}})
existing_names = {t.name for t in terms}
i = 1
while name in existing_names:
name = base_name + i
i += 1
self.name = name

def _normalize(self):
self.titles = self.title_group.titles
self._update_tracked_primary_titles()
if not hasattr(self, 'name'):
self._set_name()

def _validate(self):
super(Term, self)._validate()
# do not allow duplicates:
for title in self.get_titles():
other_term = Term().load_by_title(title)
if other_term and not self.same_record(other_term):
raise InputError("A Term with the title {} in it already exists".format(title))
# ensue uniqueness of primary titles together
same_titles_term = Term().load_by_primary_titles(self.get_primary_title(), self.get_primary_title('he'))
if same_titles_term and not self.same_record(same_titles_term):
raise DuplicateRecordError(f"A Term with the primary titles {self.get_primary_title()} and {self.get_primary_title('he')} already exists")
# do not allow duplicate names
if self.is_new() and Term().load({'name': self.name}):
raise DuplicateRecordError(f"A Term with the name {self.name} already exists")
elif not self.is_new() and self.is_key_changed('name'):
raise ValidationError({"name": "This field cannot be changed."})
Copy link

Copilot AI Feb 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior around primary-title pair uniqueness and automatic name generation (including collision handling) doesn’t appear to have direct test coverage. Adding tests for load_by_primary_titles() (including language-specific matching) and for _set_name() when a collision exists would help ensure future refactors don’t break term creation/validation.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants