Data imputation with collaborative filtering and latent factor models for wind farms time series data

giobbu/collaborative-data-imputation

Collaborative-Data-Imputation

Wind Power Data Reconstruction

In power system operations and electricity markets, missing data is a pervasive challenge. Missing observations can arise from sensor faults, communication failures, or maintenance outages. The issue becomes particularly critical when large-scale, data-driven approaches are applied to point and probabilistic wind power forecasting, where data quality directly affects model performance and, in turn, decision making.

To address this, data imputation techniques such as k-nearest neighbors (k-NN) and factor models are commonly used to reconstruct incomplete datasets before training forecasting models. Effective imputation yields complete and consistent data, on which the reliability and accuracy of modern machine-learning-based forecasting methods depend.
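To make the factor-model idea concrete, here is a minimal sketch of latent factor imputation on a matrix of time steps by wind farms. It is not taken from this repository: the function name `factor_impute`, its parameters, and the choice of alternating least squares are assumptions made for illustration only.

```python
import numpy as np


def factor_impute(X, rank=2, n_iters=50, reg=0.01):
    """Fill NaNs in X (time steps x wind farms) with a low-rank reconstruction.

    Hypothetical helper: fits X ~ U @ V.T by alternating ridge regressions
    that use only the observed entries, then fills the gaps from U @ V.T.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    n_rows, n_cols = X.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(n_rows, rank))  # per-time-step factors
    V = rng.normal(scale=0.1, size=(n_cols, rank))  # per-farm factors
    ridge = reg * np.eye(rank)

    for _ in range(n_iters):
        # Update each time-step factor from the farms observed at that step.
        for i in range(n_rows):
            cols = observed[i]
            if cols.any():
                Vi = V[cols]
                U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ X[i, cols])
        # Update each farm factor from the time steps where it was observed.
        for j in range(n_cols):
            rows = observed[:, j]
            if rows.any():
                Uj = U[rows]
                V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ X[rows, j])

    # Keep observed values untouched; use the reconstruction only for gaps.
    return np.where(observed, X, U @ V.T)
```

The sketch relies on the farms sharing a small number of common patterns, so a gap at one farm can be estimated from what the other farms recorded at the same time step. The values of `rank` and `reg` would normally be tuned on held-out entries.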

MLflow Experiments

MLflow is used to systematically compare and evaluate missing-data imputation algorithms, making it easier to identify the best-performing approach for a given dataset.

  1. Install UV (Dependency Manager)

    pip install uv
  2. Install Project Dependencies

    Install all required dependencies, including MLflow:

    uv sync
  3. Start the MLflow Server

    Launch the MLflow UI locally and leave it running in its own terminal:

    uv run mlflow ui
  4. Run the Experiments

    Set the parameters to test in config.py:

    nano config.py

    Execute the experiment pipeline:

    uv run main.py
  5. View Experiment Results

    Open your browser and navigate to http://127.0.0.1:5000.

    From the MLflow UI, you can explore:

    • Experiment runs
    • Model parameters and hyperparameters
    • Evaluation metrics
    • Logged artifacts (e.g., reconstructed datasets and plots)
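For readers new to MLflow, the following sketch shows one way an imputation run could be scored and recorded so that it appears in the UI. It is an illustration only and may differ from what main.py does: the helper names `masked_rmse` and `log_imputation_run`, the experiment name, and the metric name are all assumptions. The tracking URI matches the local address used above.

```python
import math


def masked_rmse(truth, imputed, hidden):
    """RMSE over the entries that were deliberately hidden before imputation."""
    errors = [(t - p) ** 2 for t, p, h in zip(truth, imputed, hidden) if h]
    if not errors:
        raise ValueError("no hidden entries to score")
    return math.sqrt(sum(errors) / len(errors))


def log_imputation_run(name, params, rmse, tracking_uri="http://127.0.0.1:5000"):
    """Record one imputation run (hypothetical helper, not part of the repo)."""
    import mlflow  # imported lazily so the scoring helper works without MLflow

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment("imputation-comparison")  # assumed experiment name
    with mlflow.start_run(run_name=name):
        mlflow.log_params(params)        # e.g. {"method": "knn", "k": 5}
        mlflow.log_metric("rmse", rmse)  # shown under evaluation metrics
```

With the MLflow server running, a call such as `log_imputation_run("knn-k5", {"method": "knn", "k": 5}, rmse)` would add one run to the experiment list. Hiding a subset of known values and scoring only those entries makes different imputation methods comparable on the same dataset.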
