Why is my custom inference result wrong? #28

@swchoi9075

I downloaded one month of data for June 2025 and used 13 days of it,
but the RMS error is huge.

[What I've done]
I created an h5 file with
f.create_dataset('fields', shape=(52, 20, 721, 1440), dtype='f')
and used that h5 file with the output of parallel_copy_small_set.py.
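
Before running inference, the dataset shape can be compared against what the config expects. The sketch below is only an illustration: `describe_fields` and `inspect_h5` are hypothetical helper names, not part of the repository. The (time, channel, lat, lon) layout is taken from the `create_dataset` call above, and the default of 20 channels comes from `in_channels` in the config log. `inspect_h5` assumes `h5py` is installed.

```python
def describe_fields(shape, expected_channels=20, expected_grid=(721, 1440)):
    """List mismatches between a (time, channel, lat, lon) shape and expectations."""
    if len(shape) != 4:
        return [f"expected 4 dims (time, channel, lat, lon), got {len(shape)}"]
    _, n_chan, n_lat, n_lon = shape
    problems = []
    if n_chan != expected_channels:
        problems.append(f"channels: got {n_chan}, expected {expected_channels}")
    if (n_lat, n_lon) != tuple(expected_grid):
        problems.append(f"grid: got {(n_lat, n_lon)}, expected {tuple(expected_grid)}")
    return problems


def inspect_h5(path, dataset="fields", expected_channels=20):
    """Open an h5 file read-only and report shape, dtype, and any mismatches."""
    import h5py  # lazy import so describe_fields works without h5py

    with h5py.File(path, "r") as f:
        dset = f[dataset]
        return dset.shape, dset.dtype, describe_fields(dset.shape, expected_channels)


if __name__ == "__main__":
    print(describe_fields((52, 20, 721, 1440)))  # -> []
    print(describe_fields((52, 21, 721, 1440)))  # -> ['channels: got 21, expected 20']
```

Calling `inspect_h5("/workspace/h5/june_2025_01_30.h5")` would report the shape actually stored in the file used in the log below. A correct shape does not prove that the channels are in the order the statistics files assume.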

The suspected cause
(At inference time, the image is converted to 720 x 1440 and channels 0-19 of stats_v0 are used.)

  1. The stats_v0 .npy files have 21 channels.
  2. The h5 shape is ( _, 21, 721, 1440), not ( _, 20, 721, 1440).
  3. The comment on line 127 of inference.py: # needed to standardize wind data
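
One way to test whether the h5 channels line up with the stats_v0 statistics is to z-score the data per channel and look at the result. With matching order and units, each normalized channel should have a mean near 0 and a spread near 1. The sketch below uses synthetic arrays and makes two assumptions: the statistics have shape (1, C, 1, 1), and the fields have shape (time, channel, lat, lon). The helper names and the threshold of 3 are hypothetical choices for illustration.

```python
import numpy as np


def normalized_channel_summary(fields, means, stds, channels):
    """Z-score fields with the selected statistics; return per-channel mean and std."""
    m = means[:, channels]  # assumed shape (1, C, 1, 1) -> (1, len(channels), 1, 1)
    s = stds[:, channels]
    z = (fields - m) / s  # broadcasts over time, lat, lon
    return z.mean(axis=(0, 2, 3)), z.std(axis=(0, 2, 3))


def suspicious_channels(fields, means, stds, channels, max_abs_mean=3.0):
    """Return channel indices whose normalized mean is far from zero."""
    z_mean, _ = normalized_channel_summary(fields, means, stds, channels)
    return [int(c) for c, zm in zip(channels, z_mean) if abs(zm) > max_abs_mean]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic statistics for 3 channels; only channels 0 and 1 are used.
    means = np.array([0.0, 50000.0, 280.0]).reshape(1, 3, 1, 1)
    stds = np.array([5.0, 3000.0, 20.0]).reshape(1, 3, 1, 1)
    good = np.stack(
        [rng.normal(0.0, 5.0, (4, 8, 16)), rng.normal(50000.0, 3000.0, (4, 8, 16))],
        axis=1,
    )
    swapped = good[:, ::-1]  # same data, channels in the wrong order
    print(suspicious_channels(good, means, stds, [0, 1]))     # -> []
    print(suspicious_channels(swapped, means, stds, [0, 1]))  # -> [0, 1]
```

To apply this to real data, the synthetic arrays would be replaced by the arrays loaded from global_means.npy and global_stds.npy and a few time steps read from the h5 file.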

/opt/conda/envs/fcnet-mpi/lib/python3.10/site-packages/timm/models/layers/init.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {name} is deprecated, please import via timm.layers", FutureWarning)
2025-07-24 08:14:07,318 - root - INFO - --------------- Versions ---------------
2025-07-24 08:14:07,322 - root - INFO - git branch: b'* master'
2025-07-24 08:14:07,325 - root - INFO - git hash: b'93360c1720a9f97aabf970689f21c9fad8737788'
2025-07-24 08:14:07,325 - root - INFO - Torch: 2.5.1
2025-07-24 08:14:07,325 - root - INFO - ----------------------------------------
2025-07-24 08:14:07,325 - root - INFO - ------------------ Configuration ------------------
2025-07-24 08:14:07,325 - root - INFO - Configuration file: /workspace/config/AFNO.yaml
2025-07-24 08:14:07,325 - root - INFO - Configuration name: afno_backbone
2025-07-24 08:14:07,326 - root - INFO - log_to_wandb True
2025-07-24 08:14:07,326 - root - INFO - lr 0.0005
2025-07-24 08:14:07,326 - root - INFO - batch_size 64
2025-07-24 08:14:07,326 - root - INFO - max_epochs 150
2025-07-24 08:14:07,326 - root - INFO - scheduler CosineAnnealingLR
2025-07-24 08:14:07,326 - root - INFO - in_channels [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
2025-07-24 08:14:07,326 - root - INFO - out_channels [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
2025-07-24 08:14:07,326 - root - INFO - orography False
2025-07-24 08:14:07,326 - root - INFO - orography_path None
2025-07-24 08:14:07,326 - root - INFO - exp_dir /pscratch/sd/s/shas1693/results/era5_wind
2025-07-24 08:14:07,326 - root - INFO - train_data_path /pscratch/sd/s/shas1693/data/era5/train
2025-07-24 08:14:07,326 - root - INFO - valid_data_path /pscratch/sd/s/shas1693/data/era5/test
2025-07-24 08:14:07,326 - root - INFO - inf_data_path /workspace/h5
2025-07-24 08:14:07,326 - root - INFO - time_means_path /workspace/stats_v0/time_means.npy
2025-07-24 08:14:07,326 - root - INFO - global_means_path /workspace/stats_v0/global_means.npy
2025-07-24 08:14:07,326 - root - INFO - global_stds_path /workspace/stats_v0/global_stds.npy
2025-07-24 08:14:07,326 - root - INFO - loss l2
2025-07-24 08:14:07,326 - root - INFO - num_data_workers 4
2025-07-24 08:14:07,326 - root - INFO - dt 1
2025-07-24 08:14:07,326 - root - INFO - n_history 0
2025-07-24 08:14:07,326 - root - INFO - prediction_type iterative
2025-07-24 08:14:07,326 - root - INFO - prediction_length 41
2025-07-24 08:14:07,326 - root - INFO - n_initial_conditions 5
2025-07-24 08:14:07,326 - root - INFO - ics_type default
2025-07-24 08:14:07,326 - root - INFO - save_raw_forecasts True
2025-07-24 08:14:07,326 - root - INFO - save_channel False
2025-07-24 08:14:07,326 - root - INFO - masked_acc False
2025-07-24 08:14:07,326 - root - INFO - maskpath None
2025-07-24 08:14:07,326 - root - INFO - perturb False
2025-07-24 08:14:07,326 - root - INFO - add_grid False
2025-07-24 08:14:07,326 - root - INFO - N_grid_channels 0
2025-07-24 08:14:07,326 - root - INFO - gridtype sinusoidal
2025-07-24 08:14:07,326 - root - INFO - roll False
2025-07-24 08:14:07,326 - root - INFO - num_blocks 8
2025-07-24 08:14:07,326 - root - INFO - nettype afno
2025-07-24 08:14:07,326 - root - INFO - patch_size 8
2025-07-24 08:14:07,326 - root - INFO - width 56
2025-07-24 08:14:07,326 - root - INFO - modes 32
2025-07-24 08:14:07,326 - root - INFO - target default
2025-07-24 08:14:07,326 - root - INFO - normalization zscore
2025-07-24 08:14:07,326 - root - INFO - log_to_screen True
2025-07-24 08:14:07,326 - root - INFO - save_checkpoint True
2025-07-24 08:14:07,326 - root - INFO - enable_nhwc False
2025-07-24 08:14:07,326 - root - INFO - optimizer_type FusedAdam
2025-07-24 08:14:07,326 - root - INFO - crop_size_x None
2025-07-24 08:14:07,326 - root - INFO - crop_size_y None
2025-07-24 08:14:07,326 - root - INFO - two_step_training False
2025-07-24 08:14:07,326 - root - INFO - plot_animations False
2025-07-24 08:14:07,326 - root - INFO - add_noise False
2025-07-24 08:14:07,326 - root - INFO - noise_std 0
2025-07-24 08:14:07,327 - root - INFO - world_size 1
2025-07-24 08:14:07,327 - root - INFO - interp 0
2025-07-24 08:14:07,327 - root - INFO - use_daily_climatology False
2025-07-24 08:14:07,327 - root - INFO - global_batch_size 64
2025-07-24 08:14:07,327 - root - INFO - experiment_dir /workspace/vis
2025-07-24 08:14:07,327 - root - INFO - best_checkpoint_path ./pretrained/backbone.ckpt
2025-07-24 08:14:07,327 - root - INFO - resuming False
2025-07-24 08:14:07,327 - root - INFO - local_rank 0
2025-07-24 08:14:07,327 - root - INFO - ---------------------------------------------------
2025-07-24 08:14:07,327 - root - INFO - Inference for 1 initial conditions
2025-07-24 08:14:07,328 - root - INFO - Getting file stats from /workspace/h5/june_2025_01_30.h5
2025-07-24 08:14:07,329 - root - INFO - Number of samples per year: 52
2025-07-24 08:14:07,329 - root - INFO - Found data at path /workspace/h5. Number of examples: 52. Image Shape: 720 x 1440 x 20
2025-07-24 08:14:07,329 - root - INFO - Delta t: 6 hours
2025-07-24 08:14:07,329 - root - INFO - Including 0 hours of past history in training at a frequency of 6 hours
2025-07-24 08:14:07,329 - root - INFO - Loading trained model checkpoint from ./pretrained/backbone.ckpt
/workspace/inference/inference.py:90: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
checkpoint = torch.load(checkpoint_fname)
2025-07-24 08:14:08,320 - root - INFO - Loading inference data
2025-07-24 08:14:08,320 - root - INFO - Inference data from /workspace/h5/june_2025_01_30.h5
2025-07-24 08:14:08,320 - root - INFO - Initial condition 1 of 1
2025-07-24 08:14:14,257 - root - INFO - Begin autoregressive inference
2025-07-24 08:14:14,635 - root - INFO - Predicted timestep 0 of 41. z500 RMS Error: 0.0, ACC: 1.0
2025-07-24 08:14:14,732 - root - INFO - Predicted timestep 1 of 41. z500 RMS Error: 23958.98828125, ACC: 0.9773921966552734
2025-07-24 08:14:14,758 - root - INFO - Predicted timestep 2 of 41. z500 RMS Error: 27976.2890625, ACC: 0.9453849792480469
2025-07-24 08:14:14,784 - root - INFO - Predicted timestep 3 of 41. z500 RMS Error: 26665.060546875, ACC: 0.9637321829795837
2025-07-24 08:14:14,811 - root - INFO - Predicted timestep 4 of 41. z500 RMS Error: 26789.9453125, ACC: 0.9708621501922607
2025-07-24 08:14:14,837 - root - INFO - Predicted timestep 5 of 41. z500 RMS Error: 28808.201171875, ACC: 0.9635037183761597
2025-07-24 08:14:14,864 - root - INFO - Predicted timestep 6 of 41. z500 RMS Error: 31491.37890625, ACC: 0.9400643706321716
2025-07-24 08:14:14,890 - root - INFO - Predicted timestep 7 of 41. z500 RMS Error: 34021.41796875, ACC: 0.8926259875297546
2025-07-24 08:14:14,916 - root - INFO - Predicted timestep 8 of 41. z500 RMS Error: 36068.20703125, ACC: 0.8125800490379333
2025-07-24 08:14:14,943 - root - INFO - Predicted timestep 9 of 41. z500 RMS Error: 37589.2890625, ACC: 0.7005799412727356
2025-07-24 08:14:14,969 - root - INFO - Predicted timestep 10 of 41. z500 RMS Error: 38651.22265625, ACC: 0.5751280188560486
2025-07-24 08:14:14,995 - root - INFO - Predicted timestep 11 of 41. z500 RMS Error: 39385.0625, ACC: 0.4583129286766052
2025-07-24 08:14:15,022 - root - INFO - Predicted timestep 12 of 41. z500 RMS Error: 39882.0078125, ACC: 0.36526814103126526
2025-07-24 08:14:15,048 - root - INFO - Predicted timestep 13 of 41. z500 RMS Error: 40219.43359375, ACC: 0.2955637574195862
2025-07-24 08:14:15,074 - root - INFO - Predicted timestep 14 of 41. z500 RMS Error: 40436.6875, ACC: 0.24684610962867737
2025-07-24 08:14:15,101 - root - INFO - Predicted timestep 15 of 41. z500 RMS Error: 40581.66796875, ACC: 0.21359123289585114
2025-07-24 08:14:15,127 - root - INFO - Predicted timestep 16 of 41. z500 RMS Error: 40691.6171875, ACC: 0.18940924108028412
2025-07-24 08:14:15,154 - root - INFO - Predicted timestep 17 of 41. z500 RMS Error: 40749.03125, ACC: 0.17652156949043274
2025-07-24 08:14:15,180 - root - INFO - Predicted timestep 18 of 41. z500 RMS Error: 40788.625, ACC: 0.16609759628772736
2025-07-24 08:14:15,206 - root - INFO - Predicted timestep 19 of 41. z500 RMS Error: 40819.84765625, ACC: 0.15827195346355438
2025-07-24 08:14:15,233 - root - INFO - Predicted timestep 20 of 41. z500 RMS Error: 40846.30078125, ACC: 0.15329009294509888
2025-07-24 08:14:15,259 - root - INFO - Predicted timestep 21 of 41. z500 RMS Error: 40861.32421875, ACC: 0.1500459611415863
2025-07-24 08:14:15,285 - root - INFO - Predicted timestep 22 of 41. z500 RMS Error: 40861.8046875, ACC: 0.14828626811504364
2025-07-24 08:14:15,312 - root - INFO - Predicted timestep 23 of 41. z500 RMS Error: 40863.82421875, ACC: 0.1472945660352707
2025-07-24 08:14:15,338 - root - INFO - Predicted timestep 24 of 41. z500 RMS Error: 40869.00390625, ACC: 0.14692695438861847
2025-07-24 08:14:15,365 - root - INFO - Predicted timestep 25 of 41. z500 RMS Error: 40872.6015625, ACC: 0.14648151397705078
2025-07-24 08:14:15,391 - root - INFO - Predicted timestep 26 of 41. z500 RMS Error: 40869.30078125, ACC: 0.14627744257450104
2025-07-24 08:14:15,417 - root - INFO - Predicted timestep 27 of 41. z500 RMS Error: 40868.77734375, ACC: 0.14612910151481628
2025-07-24 08:14:15,444 - root - INFO - Predicted timestep 28 of 41. z500 RMS Error: 40872.546875, ACC: 0.1461002081632614
2025-07-24 08:14:15,470 - root - INFO - Predicted timestep 29 of 41. z500 RMS Error: 40874.8984375, ACC: 0.14584366977214813
2025-07-24 08:14:15,496 - root - INFO - Predicted timestep 30 of 41. z500 RMS Error: 40871.1796875, ACC: 0.1457022875547409
2025-07-24 08:14:15,523 - root - INFO - Predicted timestep 31 of 41. z500 RMS Error: 40869.74609375, ACC: 0.14553768932819366
2025-07-24 08:14:15,549 - root - INFO - Predicted timestep 32 of 41. z500 RMS Error: 40873.7890625, ACC: 0.14558562636375427
2025-07-24 08:14:15,576 - root - INFO - Predicted timestep 33 of 41. z500 RMS Error: 40875.8515625, ACC: 0.14545489847660065
2025-07-24 08:14:15,602 - root - INFO - Predicted timestep 34 of 41. z500 RMS Error: 40870.30078125, ACC: 0.1454624980688095
2025-07-24 08:14:15,628 - root - INFO - Predicted timestep 35 of 41. z500 RMS Error: 40868.55078125, ACC: 0.14540629088878632
2025-07-24 08:14:15,655 - root - INFO - Predicted timestep 36 of 41. z500 RMS Error: 40873.26171875, ACC: 0.14550819993019104
2025-07-24 08:14:15,681 - root - INFO - Predicted timestep 37 of 41. z500 RMS Error: 40876.38671875, ACC: 0.14536383748054504
2025-07-24 08:14:15,707 - root - INFO - Predicted timestep 38 of 41. z500 RMS Error: 40873.484375, ACC: 0.14535737037658691
2025-07-24 08:14:15,734 - root - INFO - Predicted timestep 39 of 41. z500 RMS Error: 40872.5078125, ACC: 0.14526529610157013
2025-07-24 08:14:15,760 - root - INFO - Predicted timestep 40 of 41. z500 RMS Error: 40874.73828125, ACC: 0.14521168172359467
2025-07-24 08:14:18,579 - root - INFO - Saving files at /workspace/vis/autoregressive_predictions_z500_vis.h5
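
The RMS values in the log can be cross-checked offline against the saved forecasts with a latitude-weighted RMSE. The sketch below uses one common definition, with cos(latitude) weights normalized to mean 1. Whether inference.py computes the metric exactly this way is an assumption, and the function names are hypothetical. The grid is assumed to start at 90N with evenly spaced rows and the last row dropped, which matches the 720 x 1440 shape in the log.

```python
import numpy as np


def latitude_weights(n_lat=720):
    """cos(latitude) weights, normalized so that they average to 1."""
    # Assumption: rows start at 90N and step southward by 180/n_lat degrees.
    lat_deg = 90.0 - (180.0 / n_lat) * np.arange(n_lat)
    w = np.cos(np.deg2rad(lat_deg))
    return w / w.mean()


def weighted_rmse(pred, target):
    """Latitude-weighted RMSE for two (lat, lon) arrays in the same units."""
    w = latitude_weights(pred.shape[0])[:, None]  # broadcast over longitude
    return float(np.sqrt(np.mean(w * (pred - target) ** 2)))


if __name__ == "__main__":
    target = np.zeros((720, 1440))
    print(weighted_rmse(target, target))          # identical fields give 0.0
    print(weighted_rmse(target + 100.0, target))  # a constant offset of 100 gives about 100
```

Comparing the resulting number with the typical magnitude of the field itself helps separate a units or channel mismatch from ordinary forecast error.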

If anyone knows the cause, please let me know.
