
Parse the data provided by the data source simulation into the DB #13

@cchwala

Description

Data upload from a simulated CML data source is now working. Next step is to parse the data to the DB.

Main questions:

  1. We already have a container for parsing, but it does not do anything yet. Should we implement the parser there, or would it be better to integrate it into one of the other services (the flask webserver)? Probably not, because the parser could cause CPU load spikes that should not affect the webserver. This needs to be decided first.
  2. Parsing and handling of metadata and raw data have to be done separately. That should be easy to do, but one aspect has to be taken into account: metadata and raw data need to be linked via a CML ID and a sublink_id. This linkage needs to be documented somewhere, but it might also be enough to only parse data into the DB that has a clear linkage via CML IDs (see the linkage sketch after this list).
  3. Raw data and metadata files need to be moved to an archive directory once they have been parsed successfully; files that fail parsing should go to an intermediate (quarantine?) directory instead (see the file-handling sketch below).
  4. There must be a way to test the parsers and the resulting DB contents, separate from a running production system (see the test sketch below).
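
A minimal sketch of the linkage check from point 2. The file format (CSV), the column names `cml_id` and `sublink_id`, and the use of pandas are assumptions about the simulated data, not the actual format the data source produces:

```python
# Sketch: keep only raw data rows whose (cml_id, sublink_id) pair is known
# from the metadata; everything else is set aside instead of being written
# to the DB. Column names and CSV layout are assumptions.
import pandas as pd

def split_linked_and_orphaned(metadata_csv, raw_data_csv):
    """Return (linked, orphaned) raw data rows, keyed by (cml_id, sublink_id)."""
    metadata = pd.read_csv(metadata_csv)
    raw_data = pd.read_csv(raw_data_csv)

    known_links = set(zip(metadata["cml_id"], metadata["sublink_id"]))
    is_linked = raw_data.apply(
        lambda row: (row["cml_id"], row["sublink_id"]) in known_links, axis=1
    )
    return raw_data[is_linked], raw_data[~is_linked]
```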
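
A sketch of the archive/quarantine handling from point 3. The directory paths and the idea of calling this once per file after a parse attempt are assumptions about how the parser service could be structured:

```python
# Sketch: move a file to the archive after a successful parse, otherwise to
# a quarantine directory so it can be inspected and re-parsed later.
import shutil
from pathlib import Path

def move_after_parsing(file_path, success,
                       archive_dir=Path("/data/archive"),
                       quarantine_dir=Path("/data/quarantine")):
    target_dir = archive_dir if success else quarantine_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(file_path).name
    shutil.move(str(file_path), str(target))
    return target
```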
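
A sketch of point 4, running a parser test against a throwaway DB instead of the production instance. SQLite and pytest are used here only for illustration; the actual DB backend, table layout, and parser entry point are assumptions:

```python
# Sketch: an isolated, disposable DB per test run so parser tests never
# touch the production system.
import sqlite3
import pytest

@pytest.fixture
def test_db(tmp_path):
    conn = sqlite3.connect(tmp_path / "test.sqlite")
    conn.execute(
        "CREATE TABLE raw_data (cml_id TEXT, sublink_id TEXT, time TEXT, rsl REAL)"
    )
    yield conn
    conn.close()

def test_parser_writes_linked_rows(test_db):
    # Hypothetical parser output; replace with a call to the real parser.
    rows = [("cml_001", "sublink_1", "2020-01-01T00:00:00Z", -45.3)]
    test_db.executemany("INSERT INTO raw_data VALUES (?, ?, ?, ?)", rows)
    assert test_db.execute("SELECT COUNT(*) FROM raw_data").fetchone()[0] == len(rows)
```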

Implementation plan:
