Generic class definitions for handling upload of data from table specs #24

@azimov

Currently, the uploadResults function offers limited extensibility, which makes it hard for packages to implement customizable upload functions.

The idea is that these classes would let users modify data before upload, validate data, or perform other tasks in the upload pipeline. This should work well in the generic case and allow customisable complexity in a consistent way.
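To make the idea concrete, here is a minimal sketch of what such classes could look like. It is written in Python purely for illustration and is not the package's API: `TableSpec`, `TableUploadHandler`, the `modify`/`validate`/`run` hooks, and the in-memory `sink` standing in for a database are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass
class TableSpec:
    """Hypothetical table specification: a table name and its expected columns."""
    table_name: str
    columns: List[str]


class TableUploadHandler:
    """Default handler: no modification, column check, then append to the sink."""

    def __init__(self, spec: TableSpec):
        self.spec = spec

    def modify(self, rows: List[Row]) -> List[Row]:
        # Hook for subclasses: change the data before it is validated and uploaded.
        return rows

    def validate(self, rows: List[Row]) -> None:
        # Default validation: every row must have exactly the columns in the spec.
        expected = set(self.spec.columns)
        for i, row in enumerate(rows):
            if set(row) != expected:
                raise ValueError(
                    f"{self.spec.table_name}: row {i} has columns {sorted(row)}, "
                    f"expected {sorted(expected)}"
                )

    def run(self, rows: List[Row], sink: Dict[str, List[Row]]) -> int:
        # The pipeline is fixed (modify -> validate -> upload); only the hooks vary.
        rows = self.modify(list(rows))
        self.validate(rows)
        sink.setdefault(self.spec.table_name, []).extend(rows)
        return len(rows)


class LowercaseNameHandler(TableUploadHandler):
    """Example of a package-specific handler that overrides a single hook."""

    def modify(self, rows: List[Row]) -> List[Row]:
        return [{**row, "name": row["name"].lower()} for row in rows]
```

A package with unusual requirements would subclass the handler and override only the hooks it needs, while the generic case keeps using the default handler unchanged.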

The initial use case is to support a complicated example from the requirements of PLP here, which works but could be implemented in a more consistent way.

Some requirements to gradually implement:

  • Define what generic classes are needed
  • Implement default behaviour that matches the current one (upload data and validate or overwrite according to specifications)
  • Support complex, user-defined modifications of the data being uploaded
  • Improve upon the current implementation by supporting better loading concepts (load tables, table partitioning for large tables, cross-platform indexes)
  • Allow loading from object storage like AWS S3
  • Fast loading with multi-process operations
  • Support transactional loading of data: on error, roll back certain tables or entire result sets if desired
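The transactional requirement in the last item could be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module rather than the database layer this package actually targets, and `upload_result_set` and its argument shape are hypothetical.

```python
import sqlite3
from typing import Dict, List, Sequence, Tuple

# Hypothetical shape of a result set: table name -> (column names, rows).
ResultSet = Dict[str, Tuple[List[str], List[Sequence]]]


def upload_result_set(conn: sqlite3.Connection, tables: ResultSet) -> bool:
    """Insert every table of a result set in one transaction (all or nothing).

    Returns True if everything was committed, False if an error caused a rollback.
    Table and column names are interpolated into the SQL, so they are assumed to
    come from a trusted table specification, never from user input.
    """
    try:
        # As a context manager, the connection commits on success and rolls
        # back the whole transaction if an exception escapes the block.
        with conn:
            for name, (columns, rows) in tables.items():
                placeholders = ", ".join("?" for _ in columns)
                sql = (
                    f"INSERT INTO {name} ({', '.join(columns)}) "
                    f"VALUES ({placeholders})"
                )
                conn.executemany(sql, rows)
    except sqlite3.Error:
        return False
    return True
```

Rolling back only certain tables, rather than the entire result set, would need one transaction (or savepoint) per table; that variant is not shown here.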
