
Bug: clean installs with legacy course data can lead to incorrect ElasticSearch initialization #198

@lpm0073

Description


The LMS assumes that the ElasticSearch _mapping for the 'org' field of the course_info index will be:
{ "course_info" : { "mappings" : { "properties" : { "org" : { "type" : "keyword" } } } } }

However, depending on the data in your legacy course catalogue, the mapping can instead resolve to the following, which is the shape ElasticSearch's default dynamic mapping gives to string fields:
{ "course_info" : { "mappings" : { "properties" : { "org" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } } } }

in which case, the following exception will be raised when you navigate to the 'Discover New' tab:
2025-04-21 17:31:02,921 ERROR 11 [search.elastic] [user 4] [ip 192.168.4.199] elastic.py:664 - error while searching index - RequestError(400, 'search_phase_execution_exception', {'error': {'root_cause': [{'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.'}], 'type': 'search_phase_execution_exception', 'reason': 'all shards failed', 'phase': 'query', 'grouped': True, 'failed_shards': [{'shard': 0, 'index': 'course_info', 'node': '4Y9AzFxmREqlE4tnWBVIPQ', 'reason': {'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.'}}], 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. 
Note that this can use significant memory.'}}}, 'status': 400}) Traceback (most recent call last): File "/openedx/venv/lib/python3.11/site-packages/search/elastic.py", line 662, in search es_response = self._es.search(index=self._prefixed_index_name, body=body, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 168, in _wrapped return func(*args, params=params, headers=headers, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 1670, in search return self.transport.perform_request( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 415, in perform_request raise e File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 381, in perform_request status, headers_response, data = connection.perform_request( ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 277, in perform_request self._raise_error(response.status, raw_data) File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/connection/base.py", line 330, in _raise_error raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. 
Note that this can use significant memory.') 2025-04-21 17:31:02,922 ERROR 11 [search.views] [user 4] [ip 192.168.4.199] views.py:214 - Search view exception when searching for for user 4: RequestError(400, 'search_phase_execution_exception', {'error': {'root_cause': [{'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.'}], 'type': 'search_phase_execution_exception', 'reason': 'all shards failed', 'phase': 'query', 'grouped': True, 'failed_shards': [{'shard': 0, 'index': 'course_info', 'node': '4Y9AzFxmREqlE4tnWBVIPQ', 'reason': {'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.'}}], 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. 
Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.'}}}, 'status': 400}) Traceback (most recent call last): File "/openedx/venv/lib/python3.11/site-packages/search/views.py", line 184, in course_discovery results = course_discovery_search( ^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/search/api.py", line 148, in course_discovery_search results = searcher.search( ^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/search/elastic.py", line 662, in search es_response = self._es.search(index=self._prefixed_index_name, body=body, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/client/utils.py", line 168, in _wrapped return func(*args, params=params, headers=headers, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/client/__init__.py", line 1670, in search return self.transport.perform_request( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 415, in perform_request raise e File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/transport.py", line 381, in perform_request status, headers_response, data = connection.perform_request( ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/connection/http_urllib3.py", line 277, in perform_request self._raise_error(response.status, raw_data) File "/openedx/venv/lib/python3.11/site-packages/elasticsearch/connection/base.py", line 330, in _raise_error raise HTTP_EXCEPTIONS.get(status_code, TransportError)( elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so 
these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [org] in order to load field data by uninverting the inverted index. Note that this can use significant memory.') 2025-04-21 17:31:02,933 ERROR 11 [django.request] [user None] [ip None] log.py:241 - Internal Server Error: /search/course_discovery/ [pid: 11|app: 0|req: 8/25] 192.168.5.113 () {76 vars in 3094 bytes} [Mon Apr 21 17:31:02 2025] POST /search/course_discovery/ => generated 54 bytes in 135 msecs (HTTP/1.1 500) 9 headers in 460 bytes (1 switches on core 0)
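
To tell which of the two mappings above an installation ended up with, the mapping response can be inspected directly. The following is a minimal sketch, not part of the LMS or of edx-search: the helper names `org_field_type` and `supports_aggregation` are hypothetical, and the two sample dictionaries are copied from the mappings quoted in this issue.

```python
# Hypothetical helpers for inspecting a mapping response; not an edx-search API.

# The mapping the LMS expects (copied from this issue).
EXPECTED = {
    "course_info": {"mappings": {"properties": {"org": {"type": "keyword"}}}}
}

# The mapping observed on the affected install (copied from this issue).
LEGACY = {
    "course_info": {
        "mappings": {
            "properties": {
                "org": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                }
            }
        }
    }
}


def org_field_type(mapping_response, index="course_info", field="org"):
    """Return the mapped type of `field` in `index`, or None if it is unmapped."""
    properties = (
        mapping_response.get(index, {})
        .get("mappings", {})
        .get("properties", {})
    )
    return properties.get(field, {}).get("type")


def supports_aggregation(mapping_response, index="course_info", field="org"):
    """True only when the field itself is mapped as a keyword.

    A "text" field with a "keyword" sub-field still fails when the query
    aggregates on the bare field name, which is the error logged above.
    """
    return org_field_type(mapping_response, index, field) == "keyword"


if __name__ == "__main__":
    for name, mapping in (("expected", EXPECTED), ("legacy", LEGACY)):
        print(name, org_field_type(mapping), supports_aggregation(mapping))
```

Against a live cluster, the dictionary would come from the Python client's `indices.get_mapping(index=...)` call, assuming the response behaves like a dict. The index name may carry a prefix in some deployments (the traceback refers to `_prefixed_index_name`), so the `index` argument may need adjusting.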
