David Ansa edited this page Jan 15, 2024 · 1 revision

Welcome to the dataDisk Wiki!

Overview

dataDisk is a Python package for building and running data processing pipelines. It provides a framework for defining sequential tasks, applying transformations, and validating data. It also includes a ParallelProcessor for running work in parallel.
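
To make the idea of sequential tasks, validation, and parallel execution concrete, here is a minimal sketch in plain Python. It deliberately does not use dataDisk's own API, which this page does not document; every name in it (`run_pipeline`, `run_parallel`, `validate_non_empty`, `drop_negatives`, `double`) is hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence

# A step is any callable that takes data and returns (possibly new) data.
# NOTE: these helpers are illustrative only, not part of dataDisk.
Step = Callable[[Any], Any]


def run_pipeline(value: Any, steps: Sequence[Step]) -> Any:
    """Apply each step in order, feeding each output into the next step."""
    for step in steps:
        value = step(value)
    return value


def validate_non_empty(rows: List[int]) -> List[int]:
    """Validation step: reject empty input, otherwise pass it through."""
    if not rows:
        raise ValueError("pipeline input must not be empty")
    return rows


def drop_negatives(rows: List[int]) -> List[int]:
    """Transformation step: keep only non-negative values."""
    return [r for r in rows if r >= 0]


def double(rows: List[int]) -> List[int]:
    """Transformation step: multiply every value by two."""
    return [r * 2 for r in rows]


def run_parallel(batches: Sequence[Any], steps: Sequence[Step],
                 max_workers: int = 4) -> List[Any]:
    """Run the same pipeline over several batches using a thread pool.

    Results come back in the same order as the input batches.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda batch: run_pipeline(batch, steps), batches))


steps = [validate_non_empty, drop_negatives, double]
result = run_pipeline([3, -1, 4], steps)
parallel_results = run_parallel([[1, 2], [-5, 6]], steps)
print(result)
print(parallel_results)
```

Treating each step as a plain callable keeps validation and transformation interchangeable, and the same step list can be reused for both the sequential and the parallel run.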

Table of Contents

  1. Getting Started
  2. Key Components
  3. Installation
  4. Usage
  5. Examples
  6. Contributing
  7. License

Getting Started

Installation

To use dataDisk in your project, install it with pip:

pip install dataDisk
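
One way to confirm that the install succeeded is to look up the installed distribution's version using the Python standard library. This is a small sketch, assuming Python 3.8 or newer; the helper name `installed_version` is hypothetical, and the distribution name `dataDisk` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("dataDisk"))
```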
