Issue with PyMongo and UUIDs #34

@jotelha

In an up-to-date environment with the following packages installed

$ pip freeze
alembic==1.8.1
apispec==6.0.2
attrs==22.1.0
boto3==1.26.8
botocore==1.29.8
cffi==1.15.1
click==8.1.3
click-plugins==1.1.1
coverage==6.5.0
cryptography==38.0.3
dnspython==2.2.1
dtool-cli==0.7.1
dtool-ecs==0.5.0
dtool-irods==0.10.1
-e git+ssh://git@github.com/jotelha/dtool-lookup-server.git@c348aefc77d6d872616be2c22492b4190d51fec9#egg=dtool_lookup_server
-e git+ssh://git@github.com/jotelha/dtool-lookup-server-retrieve-plugin-mongo.git@bc3eefee5c9d62546a23929d9e9a0a1f37735161#egg=dtool_lookup_server_retrieve_plugin_mongo
-e git+ssh://git@github.com/jotelha/dtool-lookup-server-search-plugin-mongo.git@4c456fd58556544742a108b656389e2cb51f5076#egg=dtool_lookup_server_search_plugin_mongo
dtool-s3==0.14.1
dtoolcore==3.18.2
exceptiongroup==1.0.1
Flask==2.2.2
Flask-Cors==3.0.10
Flask-JWT-Extended==4.4.4
flask-marshmallow==0.14.0
Flask-Migrate==3.1.0
Flask-PyMongo==2.3.0
flask-smorest==0.40.0
Flask-SQLAlchemy==3.0.2
greenlet==2.0.1
importlib-metadata==5.0.0
importlib-resources==5.10.0
iniconfig==1.1.1
itsdangerous==2.1.2
Jinja2==3.1.2
jmespath==1.0.1
Mako==1.2.3
MarkupSafe==2.1.1
marshmallow==3.19.0
marshmallow-sqlalchemy==0.28.1
packaging==21.3
pipdeptree==2.3.3
pkg_resources==0.0.0
pluggy==1.0.0
pycparser==2.21
PyJWT==2.6.0
pymongo==4.3.2
pyparsing==3.0.9
pytest==7.2.0
pytest-cov==4.0.0
python-dateutil==2.8.2
PyYAML==6.0
s3transfer==0.6.0
six==1.16.0
SQLAlchemy==1.4.44
tomli==2.0.1
urllib3==1.26.12
webargs==8.2.0
Werkzeug==2.2.2
zipp==3.10.0

two tests fail with a ValueError raised while PyMongo encodes a native uuid.UUID:

$ pytest
...
==================================================================================================== FAILURES =====================================================================================================
___________________________________________________________________________________________ test_dataset_register_route ___________________________________________________________________________________________

tmp_app_with_users = <FlaskClient <Flask 'dtool_lookup_server'>>

    def test_dataset_register_route(tmp_app_with_users):  # NOQA
    
        from dtool_lookup_server.utils import (
            get_admin_metadata_from_uri,
            get_readme_from_uri_by_user,
            lookup_datasets_by_user_and_uuid,
        )
    
        base_uri = "s3://snow-white"
        uuid = "af6727bf-29c7-43dd-b42f-a5d7ede28337"
        uri = "{}/{}".format(base_uri, uuid)
        dataset_info = {
            "base_uri": base_uri,
            "uuid": uuid,
            "uri": uri,
            "name": "my-dataset",
            "type": "dataset",
            "readme": "---\ndescription: test dataset",
            "manifest": {
                "dtoolcore_version": "3.7.0",
                "hash_function": "md5sum_hexdigest",
                "items": {
                    "e4cc3a7dc281c3d89ed4553293c4b4b110dc9bf3": {
                        "hash": "d89117c9da2cc34586e183017cb14851",
                        "relpath": "U00096.3.rev.1.bt2",
                        "size_in_bytes": 5741810,
                        "utc_timestamp": 1536832115.0
                    }
                }
            },
            "creator_username": "olssont",
            "frozen_at": "1536238185.881941",
            "annotations": {"software": "bowtie2"},
            "tags": ["rnaseq"],
            "number_of_items": 1,
            "size_in_bytes": 5741810,
        }
    
        for token in [dopey_token, sleepy_token]:
            headers = dict(Authorization="Bearer " + sleepy_token)
            r = tmp_app_with_users.post(
                "/dataset/register",
                headers=headers,
                data=json.dumps(dataset_info),
                content_type="application/json"
            )
            assert r.status_code == 401
    
        headers = dict(Authorization="Bearer " + grumpy_token)
        r = tmp_app_with_users.post(
            "/dataset/register",
            headers=headers,
            data=json.dumps(dataset_info),
            content_type="application/json"
        )
>       assert r.status_code == 201
E       assert 500 == 201
E        +  where 500 = <WrapperTestResponse streamed [500 INTERNAL SERVER ERROR]>.status_code

tests/test_dataset_routes.py:294: AssertionError
------------------------------------------------------------------------------------------------ Captured log call ------------------------------------------------------------------------------------------------
ERROR    dtool_lookup_server:app.py:1741 Exception on /dataset/register [POST]
Traceback (most recent call last):
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 1822, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_cors/extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/webargs/core.py", line 594, in wrapper
    return func(*args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_smorest/arguments.py", line 82, in wrapper
    return func(*f_args, **f_kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_smorest/response.py", line 90, in wrapper
    func(*args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_jwt_extended/view_decorators.py", line 154, in decorator
    return current_app.ensure_sync(fn)(*args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/dtool_lookup_server/dataset_routes.py", line 137, in register
    dataset_uri = register_dataset(dataset)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/dtool_lookup_server/utils.py", line 551, in register_dataset
    search.register_dataset(dataset_info.copy())
  File "/mnt/dat2/git/dtool/dtool-lookup-server-search-plugin-mongo/dtool_lookup_server_search_plugin_mongo/utils_search.py", line 155, in register_dataset
    return _register_dataset_descriptive_metadata(self.collection, dataset_info)
  File "/mnt/dat2/git/dtool/dtool-lookup-server-search-plugin-mongo/dtool_lookup_server_search_plugin_mongo/utils_search.py", line 58, in _register_dataset_descriptive_metadata
    exists = collection.find_one(query)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/collection.py", line 1452, in find_one
    for result in cursor.limit(-1):
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1248, in next
    if len(self.__data) or self._refresh():
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1165, in _refresh
    self.__send_message(q)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1052, in __send_message
    response = client._run_operation(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1330, in _run_operation
    return self._retryable_read(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1448, in _retryable_read
    return func(session, server, sock_info, read_pref)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1326, in _cmd
    return server.run_operation(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/server.py", line 100, in run_operation
    message = operation.get_message(read_preference, sock_info, use_cmd)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/message.py", line 388, in get_message
    request_id, msg, size, _ = _op_msg(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/message.py", line 692, in _op_msg
    return _op_msg_uncompressed(flags, command, identifier, docs, opts)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/bson/binary.py", line 267, in from_uuid
    raise ValueError(
ValueError: cannot encode native uuid.UUID with UuidRepresentation.UNSPECIFIED. UUIDs can be manually converted to bson.Binary instances using bson.Binary.from_uuid() or a different UuidRepresentation can be configured. See the documentation for UuidRepresentation for more information.
______________________________________________________________________________ test_dataset_register_route_when_created_at_is_string ______________________________________________________________________________

tmp_app_with_users = <FlaskClient <Flask 'dtool_lookup_server'>>

    def test_dataset_register_route_when_created_at_is_string(tmp_app_with_users):  # NOQA
    
        from dtool_lookup_server.utils import (
            get_admin_metadata_from_uri,
            lookup_datasets_by_user_and_uuid,
        )
    
        base_uri = "s3://snow-white"
        uuid = "af6727bf-29c7-43dd-b42f-a5d7ede28337"
        uri = "{}/{}".format(base_uri, uuid)
        dataset_info = {
            "base_uri": base_uri,
            "uuid": uuid,
            "uri": uri,
            "name": "my-dataset",
            "type": "dataset",
            "readme": "---\ndescription: test dataset",
            "manifest": {
                "dtoolcore_version": "3.7.0",
                "hash_function": "md5sum_hexdigest",
                "items": {
                    "e4cc3a7dc281c3d89ed4553293c4b4b110dc9bf3": {
                        "hash": "d89117c9da2cc34586e183017cb14851",
                        "relpath": "U00096.3.rev.1.bt2",
                        "size_in_bytes": 5741810,
                        "utc_timestamp": 1536832115.0
                    }
                }
            },
            "creator_username": "olssont",
            "frozen_at": "1536238185.881941",
            "created_at": "1536238185.881941",
            "number_of_items": 1,
            "size_in_bytes": 5741810,
            "annotations": {"software": "bowtie2"},
            "tags": ["rnaseq"],
        }
    
        headers = dict(Authorization="Bearer " + grumpy_token)
        r = tmp_app_with_users.post(
            "/dataset/register",
            headers=headers,
            data=json.dumps(dataset_info),
            content_type="application/json"
        )
>       assert r.status_code == 201
E       assert 500 == 201
E        +  where 500 = <WrapperTestResponse streamed [500 INTERNAL SERVER ERROR]>.status_code

tests/test_dataset_routes.py:432: AssertionError
------------------------------------------------------------------------------------------------ Captured log call ------------------------------------------------------------------------------------------------
ERROR    dtool_lookup_server:app.py:1741 Exception on /dataset/register [POST]
Traceback (most recent call last):
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 1822, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_cors/extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/webargs/core.py", line 594, in wrapper
    return func(*args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_smorest/arguments.py", line 82, in wrapper
    return func(*f_args, **f_kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_smorest/response.py", line 90, in wrapper
    func(*args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_jwt_extended/view_decorators.py", line 154, in decorator
    return current_app.ensure_sync(fn)(*args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/dtool_lookup_server/dataset_routes.py", line 137, in register
    dataset_uri = register_dataset(dataset)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/dtool_lookup_server/utils.py", line 551, in register_dataset
    search.register_dataset(dataset_info.copy())
  File "/mnt/dat2/git/dtool/dtool-lookup-server-search-plugin-mongo/dtool_lookup_server_search_plugin_mongo/utils_search.py", line 155, in register_dataset
    return _register_dataset_descriptive_metadata(self.collection, dataset_info)
  File "/mnt/dat2/git/dtool/dtool-lookup-server-search-plugin-mongo/dtool_lookup_server_search_plugin_mongo/utils_search.py", line 58, in _register_dataset_descriptive_metadata
    exists = collection.find_one(query)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/collection.py", line 1452, in find_one
    for result in cursor.limit(-1):
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1248, in next
    if len(self.__data) or self._refresh():
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1165, in _refresh
    self.__send_message(q)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1052, in __send_message
    response = client._run_operation(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1330, in _run_operation
    return self._retryable_read(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/_csot.py", line 105, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1448, in _retryable_read
    return func(session, server, sock_info, read_pref)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1326, in _cmd
    return server.run_operation(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/server.py", line 100, in run_operation
    message = operation.get_message(read_preference, sock_info, use_cmd)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/message.py", line 388, in get_message
    request_id, msg, size, _ = _op_msg(
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/pymongo/message.py", line 692, in _op_msg
    return _op_msg_uncompressed(flags, command, identifier, docs, opts)
  File "/mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/bson/binary.py", line 267, in from_uuid
    raise ValueError(
ValueError: cannot encode native uuid.UUID with UuidRepresentation.UNSPECIFIED. UUIDs can be manually converted to bson.Binary instances using bson.Binary.from_uuid() or a different UuidRepresentation can be configured. See the documentation for UuidRepresentation for more information.
================================================================================================ warnings summary =================================================================================================
venv/lib/python3.8/site-packages/flask_marshmallow/__init__.py:34
  /mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_marshmallow/__init__.py:34: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
    __version_info__ = tuple(LooseVersion(__version__).version)

tests/test_admin_user_routes.py: 2 warnings
tests/test_base_uri_routes.py: 2 warnings
tests/test_cli.py: 6 warnings
tests/test_config.py: 1 warning
tests/test_config_routes.py: 2 warnings
tests/test_dataset_routes.py: 11 warnings
tests/test_lookup_datasets_by_user_and_uuid.py: 1 warning
tests/test_permission_routes.py: 2 warnings
tests/test_sql_dataset_utils.py: 1 warning
tests/test_sql_list_datasets_by_user.py: 1 warning
tests/test_summary_of_datasets_by_user.py: 1 warning
tests/test_timestamp_consistency.py: 1 warning
tests/test_user_routes.py: 1 warning
tests/test_utils_auth.py: 6 warnings
tests/test_utils_base_uri_management.py: 1 warning
tests/test_utils_get_annotations_from_uri_by_user.py: 1 warning
tests/test_utils_get_manifest_from_uri_by_user.py: 1 warning
tests/test_utils_get_readme_from_uri_by_user.py: 1 warning
tests/test_utils_permission_management.py: 1 warning
tests/test_utils_preprocess_query_base_uris.py: 1 warning
tests/test_utils_register_dataset.py: 4 warnings
tests/test_utils_user_management.py: 1 warning
  /mnt/dat2/git/dtool/dtool-lookup-server/venv/lib/python3.8/site-packages/flask_marshmallow/__init__.py:115: DeprecationWarning: The 'db' attribute is deprecated and will be removed in Flask-SQLAlchemy 3.1. The extension is registered directly as 'app.extensions["sqlalchemy"]'.
    db = app.extensions["sqlalchemy"].db

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================================================================= short test summary info =============================================================================================
FAILED tests/test_dataset_routes.py::test_dataset_register_route - assert 500 == 201
FAILED tests/test_dataset_routes.py::test_dataset_register_route_when_created_at_is_string - assert 500 == 201
==================================================================================== 2 failed, 52 passed, 50 warnings in 4.88s ====================================================================================

This UUID/pymongo-related failure was mentioned in #27 and #29.

Commit 31e71b6 illustrates a simple workaround: storing UUIDs as strings. This, of course, is not a sustainable solution.
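
For reference, the string workaround can be sketched roughly as follows. This is only an illustration of the idea and not necessarily what 31e71b6 does; `stringify_uuids` is a hypothetical helper name, and the sample document reuses the UUID from the failing tests.

```python
import uuid
from typing import Any


def stringify_uuids(value: Any) -> Any:
    """Return a copy of value with every uuid.UUID replaced by its string form."""
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, dict):
        return {key: stringify_uuids(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples become lists, which is also how BSON would store them.
        return [stringify_uuids(item) for item in value]
    return value


# Hypothetical document resembling what the register route receives
# once the "uuid" field has been deserialized into a native uuid.UUID.
dataset_info = {
    "uuid": uuid.UUID("af6727bf-29c7-43dd-b42f-a5d7ede28337"),
    "name": "my-dataset",
    "tags": ["rnaseq"],
}

# A query built this way contains only plain strings, so PyMongo never
# has to pick a UUID representation.
query = stringify_uuids({"uuid": dataset_info["uuid"]})
```

The drawback is that every UUID ends up in MongoDB as a 36-character string rather than a 16-byte binary, and all queries have to apply the same conversion.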

We need to look into the PyMongo guide on handling UUID data: https://pymongo.readthedocs.io/en/stable/examples/uuid.html#handling-uuid-data.
