106 changes: 106 additions & 0 deletions docs/decisions/0005-migration-path-for-legacy-mechanisms.rst
@@ -0,0 +1,106 @@
0005. Migration Path for Legacy User Grouping Mechanisms
########################################################

Status
******

**Draft** - 2025-06-03

Context
Member

It'd be useful to add the following to this Context section to explain how we got here (these are only rough ideas that should be refined before being included):

  • We've proposed a unified model with X, Y, Z structure that differs from the legacy ones in W, ...
  • This model must be able to replace the legacy models... and to do so, this should happen... (support and behavior)
  • By unifying all these into a single model, we gain... so we propose migrating all legacy mechanisms gradually to this new unified model
  • The main difference in the approaches considered is how they relate to legacy mechanisms, which also affect the deprecation strategy and long-term maintainability...

Author

Thanks for your suggestions! I updated the section to provide a bit more context. What do you think?

*******

Open edX currently uses several user grouping mechanisms (cohorts, teams, course groups), each with its own data models, business logic, storage, and integration points. This fragmentation results in:

- Harder maintenance and evolution of each mechanism.
- Difficulty implementing new functionality.
- Limited interoperability and extensibility.

These legacy mechanisms were not designed for reuse across contexts such as messaging, analytics, or advanced segmentation, and lack support for dynamic grouping based on user attributes or behavior.

To address these limitations, we proposed a **unified user grouping model**, as described in `ADR 2 <0002-user-groups-model-foundations.rst>`_, with a standardized structure that supports both static and dynamic groups, scoped at the course, organization, or platform level. Unlike legacy mechanisms, this unified system allows flexible group definitions and enables modular extensibility. It decouples user groups from specific platform features and enables reuse across diverse contexts (content gating, discussions, analytics, messaging, etc.).
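
As a rough illustration of what "static and dynamic groups, scoped at the course, organization, or platform level" could look like, here is a minimal sketch. It is not the model defined in ADR 2: the names ``Scope``, ``User``, ``UserGroup``, ``scope_id``, and ``criterion``, as well as the example course key and organization, are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Scope(Enum):
    """Hypothetical scope levels, mirroring the three levels named in the text."""
    COURSE = "course"
    ORGANIZATION = "organization"
    PLATFORM = "platform"


@dataclass(frozen=True)
class User:
    """Hypothetical user record with one attribute usable by dynamic criteria."""
    username: str
    country: str = ""


@dataclass
class UserGroup:
    """Hypothetical group: static if it lists members, dynamic if it has a criterion."""
    name: str
    scope: Scope
    scope_id: str  # assumed identifier, e.g. a course key or organization name
    members: set = field(default_factory=set)
    criterion: Optional[Callable[[User], bool]] = None

    def contains(self, user: User) -> bool:
        # Dynamic groups evaluate the criterion against the user's attributes.
        if self.criterion is not None:
            return self.criterion(user)
        # Static groups fall back to the explicit membership list.
        return user.username in self.members


static_group = UserGroup(
    "Section A", Scope.COURSE, "course-v1:Demo+X+2025", members={"alice"}
)
dynamic_group = UserGroup(
    "Learners in CO", Scope.ORGANIZATION, "Demo",
    criterion=lambda user: user.country == "CO",
)
```

Because membership is answered by a single ``contains`` call, a consumer such as content gating or messaging would not need to know whether a group is static or dynamic, which is one way to read the decoupling described above.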

To migrate from the legacy mechanisms to this new model, two paths were evaluated:

- **Cross-System Synchronization**: Introduces an abstraction layer that keeps the new model and the legacy mechanisms synchronized, translating changes in the new model into updates to the legacy structures in real time and reflecting legacy changes back. This lets the new model act as the central source of truth while preserving backward compatibility.
- **Behavior Replication**: Builds the new unified and independent grouping system that directly replicates the observable behavior of the legacy mechanisms within its own logic. Instead of integrating with or updating legacy mechanisms, it reproduces their functionality internally and gradually replaces them without requiring active synchronization.

The key difference between these two strategies lies in how they relate to the legacy mechanisms, which in turn affects the complexity of the migration process, the technical debt incurred, and the long-term maintainability of the grouping architecture.

Decision
********

We select the behavior replication approach, eliminating direct dependencies on legacy mechanisms. This choice enables a simpler, cleaner architecture with:

- Full independence from legacy mechanisms from day one.
- Elimination of complex synchronization or integration layers.
- Reduced technical debt and maintenance costs during migration.

Existing user-facing functionality will be replicated in the new model, and the migration will be executed in clear, isolated phases to minimize risk. Activation will be controlled via feature flags configurable per course, organization, or platform.
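
The per-course, per-organization, and platform-wide activation described above could be resolved along these lines. This is a minimal sketch under stated assumptions: the flag store ``FLAG_OVERRIDES``, the function ``unified_groups_enabled``, the example identifiers, and the precedence order (course over organization over platform) are hypothetical and not specified by this ADR.

```python
# Hypothetical in-memory flag store keyed by (scope, identifier).
# A real deployment would read these values from the platform's flag backend.
FLAG_OVERRIDES = {
    ("platform", ""): False,
    ("organization", "DemoOrg"): True,
    ("course", "course-v1:DemoOrg+CS101+2025"): False,
}


def unified_groups_enabled(course_id, org, overrides=FLAG_OVERRIDES):
    """Resolve the flag from the most specific scope to the least specific.

    Assumed precedence: a course override wins over an organization
    override, which wins over the platform-wide value.
    """
    for key in (("course", course_id), ("organization", org), ("platform", "")):
        if key in overrides:
            return overrides[key]
    # With no value configured anywhere, stay on the legacy mechanism.
    return False
```

In this example the organization has opted in, but the ``CS101`` course run is explicitly kept on the legacy mechanism, so other ``DemoOrg`` courses resolve to ``True`` while ``CS101`` resolves to ``False``.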

See `ADR 6 <0006-replication-of-legacy-mechanisms-behavior.rst>`_ for detailed rationale.

Consequences
************

- The new system can evolve independently, allowing greater flexibility.
- The responsibility for replicating legacy behavior lies entirely within the new model, which must be thoroughly validated.
- The transition can be carried out gradually, implementing one functionality at a time, allowing individual behavior validation and more targeted testing.
- Both new and legacy mechanisms can coexist during rollout, avoiding user disruption.
- Legacy mechanisms will be fully deprecated and removed post-transition, improving maintainability and extensibility. Courses that still rely on legacy grouping systems at the time of removal will not be automatically migrated. It will be the responsibility of course authors or site operators to manually transition their configurations to the new system before deprecation occurs. Failure to do so may result in the loss of grouping data or functionality associated with cohorts, teams, or enrollment track groups.
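
The per-functionality validation mentioned in the consequences above could take the form of a parity check that runs the legacy lookup and its replicated counterpart on the same inputs and reports any disagreement. The sketch below uses stand-in functions and made-up data; ``legacy_cohort_for``, ``unified_group_for``, and ``parity_mismatches`` are hypothetical names, not existing platform APIs.

```python
def legacy_cohort_for(username):
    """Stand-in for a legacy cohort lookup (hypothetical data)."""
    return {"alice": "cohort-a", "bob": "cohort-b"}.get(username, "default")


def unified_group_for(username):
    """Stand-in for the replicated lookup in the new model (hypothetical data)."""
    memberships = {"cohort-a": {"alice"}, "cohort-b": {"bob"}}
    for group, members in memberships.items():
        if username in members:
            return group
    return "default"


def parity_mismatches(usernames):
    """Return (user, legacy result, new result) for every disagreement."""
    mismatches = []
    for name in usernames:
        legacy, unified = legacy_cohort_for(name), unified_group_for(name)
        if legacy != unified:
            mismatches.append((name, legacy, unified))
    return mismatches
```

An empty result for a representative set of users would be one signal that a replicated behavior is ready to be enabled for more courses.
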
Contributor

It may be useful to say that feature flags can allow running courses to finish with the legacy mechanism in the upgrade window, potentially cutting down on the problems of gated content grading and related issues. We may want to have an extended (2 release?) window of co-existence and a command to automatically exempt running courses that are using the legacy features at the feature flag level. What do you think? @BryanttV @mariajgrimaldi

Member

I think it would be a mistake to break backwards compatibility with existing content without an automated migration. We have historically taken backwards compatibility of OLX very seriously, even while making major architectural changes like the XModule -> XBlock migration, the switch from the legacy courseware view to the Learning MFE, and the ongoing ModuleStore -> Learning Core migration. In each of those projects, we automatically migrated as much as we could, and then did our best to clearly announce each breaking change that we couldn't handle in the migration. I think the User Groups project should take the same approach.

Member

@kdmccormick Aug 12, 2025

Stepping back: the lack of automatic migration jumped out at me in my first pass through these ADRs, but I think I focused too much on that. Let me reframe.

I'm currently trying to understand which legacy grouping mechanisms this will replace, and which ones it'll coexist alongside in the long term. These are the grouping mechanisms I'm familiar with:

  • Content Groups a.k.a. UserPartition Groups -- This allows XBlocks to restrict their access to a set of numerically identified Groups, where each Group is a division of a numerically identified UserPartition, where a UserPartition represents the abstract notion of dividing learners by some pluggable criteria (e.g. cohorting, teaming, enrollment track, etc.). In order to remain compatible with re-running and import/export, these Groups are not mapped to users or any other LMS-specific concept; instead, the XBlock-Group associations are published without any user information, and then the courseware runtime maps the Groups to concrete LMS things like Cohorts, Teams, Enrollment Tracks, etc.
  • Cohorts a.k.a. CourseGroups -- These are configured in the LMS. When enabled in a course, a UserPartition is created for the cohorting feature, and each Cohort (CourseGroup) is mapped to one or more Content Groups (UserPartitionGroups).
  • Teams -- Teamsets are configured in course settings (and persisted in OLX), and each teamset's teams are populated in the LMS by instructors or the users themselves, depending on the type of teamset. There is a beta flag (CONTENT_GROUPS_FOR_TEAMS) which, when enabled, allows each Teamset to function as a UserPartition and each Team to function as a UserPartitionGroup (but this violates the "no-LMS-data-in-UserPartitionGroups" principle I mentioned earlier, and I think we'll need to retroactively add some indirection there before globally enabling that flag).
  • Course Access Roles (instructor, TA, etc.) -- This is more about RBAC than about user grouping, but I bring it up because it sounds like UserGroups could include staff groups.

My main compatibility concern is around UserPartitionGroups. This system is baked into the modulestore records and OLX of every existing course, so I don't see breaking it as an option. That said, I'm eager to improve it and help make sure it works with UserGroups, and based on the POC Ty shared with me, it seems like you all are already aware of this system and have ideas of how to integrate it with UserPartitionGroups.

I'll keep studying these ADRs in the POC. In the meantime, I'll ask: are you thinking that the UserGroups system will be an umbrella which includes the UserPartitioning feature, or are you thinking of UserGroups as the LMS system which is a counterpart to UserPartitioning? Or something else entirely?


Rejected Alternatives
Member

Another rejected decision was maintaining both the new model and legacy, I think.

Author

That alternative wasn’t documented. Do you think we should include it as a rejected option as well? I can add a brief explanation of why it was not chosen.

*********************

Cross-System Synchronization
Member

Author

I agree! I included the comparative table.

============================

This approach, like the selected one, builds on top of the new unified grouping system. However, it differs in that it maintains indirect synchronization with the legacy mechanisms through an abstraction layer.

The synchronization strategy involves monitoring changes in either system (new or legacy), interpreting those changes through registered evaluators, and propagating updates to maintain alignment. This ensures both systems reflect a consistent state, at the cost of added runtime logic and maintenance overhead.

This layer would be responsible for:

- **Translating the logic of the new system to legacy mechanisms**: Establishing a bi-directional synchronization layer that ensures both systems remain consistent. This abstraction layer would monitor changes in the unified model, such as group creation, updates to membership, or criteria changes. It would then propagate these changes to the corresponding legacy mechanisms.

  Likewise, any modifications in the legacy mechanisms would also need to be captured and reflected back in the new model to maintain alignment. This translation mechanism would allow legacy features (e.g., content gating, discussions, ORA assignments) to continue operating using their existing infrastructure. They would be effectively controlled by the unified model behind the scenes.

- **Ensuring backward compatibility during the entire transition**: The platform must preserve full functional integrity of the legacy grouping mechanisms (cohorts, teams, course groups) while the new model is introduced. The abstraction layer would need to convert the unified model's definitions into updates to legacy models and APIs. This ensures that existing behaviors remain unchanged for instructors, learners, and third-party integrations.

- **Enabling gradual adoption while maintaining functional consistency**: Migrating to the new grouping model incrementally, activating it course-by-course or organization-wide using feature flags. During this phased adoption, the abstraction layer ensures both models can operate in parallel without conflict. This allows selective rollout, targeted validation, and fallback to legacy behavior if needed, all while maintaining a consistent user experience and platform behavior.
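
To make the moving parts of such a layer concrete, here is a minimal sketch of bi-directional propagation through registered evaluators. It is an illustration under stated assumptions, not the proposed design: ``SyncLayer``, its methods, the event shape (``type``, ``group``, ``user``), and the ``cohort:`` naming scheme are all hypothetical.

```python
class SyncLayer:
    """Hypothetical layer that mirrors membership changes between two systems."""

    def __init__(self):
        # Evaluators translate an event from one system into the other's terms.
        self.evaluators = {}
        # In-memory stand-ins for the unified and legacy membership stores.
        self.state = {"unified": {}, "legacy": {}}
        self.unhandled = []

    def register(self, source, event_type, evaluator):
        self.evaluators[(source, event_type)] = evaluator

    def _apply(self, system, event):
        self.state[system].setdefault(event["group"], set()).add(event["user"])

    def on_change(self, source, event, propagated=False):
        self._apply(source, event)
        if propagated:
            # Guard: do not echo a propagated change back to its origin.
            return
        target = "legacy" if source == "unified" else "unified"
        evaluator = self.evaluators.get((source, event["type"]))
        if evaluator is None:
            # Without an evaluator the two systems silently drift apart.
            self.unhandled.append((source, event["type"]))
            return
        self.on_change(target, evaluator(event), propagated=True)


layer = SyncLayer()
layer.register(
    "unified", "member_added",
    lambda e: {**e, "group": "cohort:" + e["group"]},
)
layer.register(
    "legacy", "member_added",
    lambda e: {**e, "group": e["group"].split(":", 1)[1]},
)
layer.on_change("unified", {"type": "member_added", "group": "section-a", "user": "alice"})
layer.on_change("legacy", {"type": "member_added", "group": "cohort:section-a", "user": "bob"})
```

Even this toy version needs an echo guard and a pair of evaluators per event type, and any event without an evaluator leaves the two stores inconsistent, which hints at the complexity and maintenance concerns that weigh against this approach.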

Reasons for rejection:

- **Significant increase in technical complexity**: Maintaining bi-directional synchronization between two systems introduces risk of errors, logic duplication, and hard-to-debug issues.
- **Higher maintenance cost**: Any change in the platform or legacy models would also require updating the synchronization layer.
- **Interference with the evolution of the new model**: Depending on legacy mechanisms limits the ability of the new system to introduce more flexible criteria or rules.
- **Greater difficulty in isolating and testing the new system**: Requiring the presence of legacy mechanisms makes independent validation of the new model more complex.
- **Legacy cleanup becomes harder**: As long as active synchronization exists, legacy code cannot be removed without breaking dependencies.

Comparison Summary
------------------

The following table summarizes the key differences between the two migration strategies:

+----------------------------+---------------------------------------------+-----------------------------------------------+
| Aspect                     | Cross-System Synchronization                | Behavior Replication                          |
+============================+=============================================+===============================================+
| Legacy Dependency          | Requires maintaining legacy systems         | No dependency on legacy systems               |
+----------------------------+---------------------------------------------+-----------------------------------------------+
| Synchronization Complexity | High: requires bi-directional sync layer    | None: new system operates independently       |
+----------------------------+---------------------------------------------+-----------------------------------------------+
| Backward Compatibility     | Full, via real-time updates to legacy state | Achieved by replicating observable behaviors  |
+----------------------------+---------------------------------------------+-----------------------------------------------+
| Testing & Validation       | Difficult: both systems must stay in sync   | Easier: new model can be tested in isolation  |
+----------------------------+---------------------------------------------+-----------------------------------------------+
| Migration Strategy         | Gradual, but tightly coupled with legacy    | Gradual, with clean separation                |
+----------------------------+---------------------------------------------+-----------------------------------------------+
| Long-Term Maintenance      | Higher effort due to dual-system complexity | Lower effort after transition is complete     |
+----------------------------+---------------------------------------------+-----------------------------------------------+
| Time to Legacy Removal     | Longer: active sync delays removal          | Shorter: legacy can be phased out per feature |
+----------------------------+---------------------------------------------+-----------------------------------------------+

References
**********

- `Cross-System Synchronization Proposal <https://openedx.atlassian.net/wiki/x/AoBhJwE>`_
- `Behavior Replication Proposal <https://openedx.atlassian.net/wiki/x/AgDiKgE>`_