Database versions after 2010 are not supported and do not fail cleanly #35

@GPHemsley-RELX

# Versions
VERSION_3 = 0x00
VERSION_4 = 0x01
VERSION_5 = 0x02
VERSION_2010 = 0x03
ALL_VERSIONS = {VERSION_3: 3, VERSION_4: 4, VERSION_5: 5, VERSION_2010: 2010}
NEW_VERSIONS = [VERSION_4, VERSION_5, VERSION_2010]

version = head.jet_version
if version in NEW_VERSIONS:
    if version == VERSION_4:
        self.version = ALL_VERSIONS[VERSION_4]
    elif version == VERSION_5:
        self.version = ALL_VERSIONS[VERSION_5]
    elif version == VERSION_2010:
        self.version = ALL_VERSIONS[VERSION_2010]
    self.page_size = PAGE_SIZE_V4
else:
    if not version == VERSION_3:
        LOGGER.error(f"Unknown database version {version} Trying to parse database as version 3")
    self.version = ALL_VERSIONS[VERSION_3]
    self.page_size = PAGE_SIZE_V3
LOGGER.info(f"DataBase version {version}")

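Versions after "2010" (0x03) are not explicitly supported. Any version byte outside NEW_VERSIONS falls through to the else branch, so the file is parsed as version 3 (0x00) with the version 3 page size, and parsing then fails with an uncaught exception instead of a clean error message.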

According to https://github.com/mdbtools/mdbtools/blob/master/HACKING.md#database-definition-page, 0x04 indicates Access 2013, 0x05 indicates Access 2016, and 0x06 indicates Access 2019. Presumably future versions of Access will eventually increase that number further.

Those versions should probably be supported better (perhaps process as the latest version supported instead of the earliest?) or at least fail cleanly.
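To make the two options concrete, here is a rough sketch of what the version lookup could look like. It is not the library's actual code: `resolve_version`, `UnsupportedVersionError`, `KNOWN_VERSIONS`, and the `strict` flag are hypothetical names introduced for illustration, and the page sizes (2048 bytes for version 3, 4096 bytes for later versions) are an assumption about what PAGE_SIZE_V3 and PAGE_SIZE_V4 hold. Whether files newer than 2010 really parse correctly with the 2010 layout is untested here.

```python
# Version bytes as quoted above from access_parser.
VERSION_3 = 0x00
VERSION_4 = 0x01
VERSION_5 = 0x02
VERSION_2010 = 0x03
LATEST_SUPPORTED = VERSION_2010

# Assumed values: 2048-byte pages for version 3, 4096-byte pages afterwards.
PAGE_SIZE_V3 = 0x800
PAGE_SIZE_V4 = 0x1000

# Version byte -> (reported version, page size).
KNOWN_VERSIONS = {
    VERSION_3: (3, PAGE_SIZE_V3),
    VERSION_4: (4, PAGE_SIZE_V4),
    VERSION_5: (5, PAGE_SIZE_V4),
    VERSION_2010: (2010, PAGE_SIZE_V4),
}


class UnsupportedVersionError(ValueError):
    """Hypothetical exception for version bytes the parser cannot handle."""


def resolve_version(jet_version, strict=False):
    """Return (version, page_size) for a version byte (hypothetical helper).

    Known bytes map directly. Bytes newer than the latest supported one
    fall back to the latest supported layout, unless strict is True, in
    which case they raise a clean, catchable error.
    """
    if jet_version in KNOWN_VERSIONS:
        return KNOWN_VERSIONS[jet_version]
    if jet_version > LATEST_SUPPORTED and not strict:
        # Option 1: process as the latest supported version, not the earliest.
        return KNOWN_VERSIONS[LATEST_SUPPORTED]
    # Option 2: fail cleanly instead of guessing a layout.
    raise UnsupportedVersionError(
        f"Unsupported database version byte 0x{jet_version:02x}"
    )
```

With this sketch, `resolve_version(0x06)` returns the 2010 settings, while `resolve_version(0x06, strict=True)` raises `UnsupportedVersionError`, which a caller can catch and report without a traceback.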

As it stands now, I get a KeyError when attempting to read a 0x06 file:

  File "/.../access_parser/access_parser.py", line 44, in __init__
    self.catalog = self._parse_catalog()
                   ^^^^^^^^^^^^^^^^^^^^^
  File "/.../access_parser/access_parser.py", line 112, in _parse_catalog
    catalog_page = self._tables_with_data[2 * self.page_size]
                   ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
KeyError: 4096

I also get errors that say "Unknown database version 6 Trying to parse database as version 3" and "Failed to parse data page".
