rex is >7x slower than hsds #169

@ssolson

I wrote the script at the bottom of this issue to spot-check the performance of hsds (via h5pyd) against rex, and noticed that rex takes significantly longer than hsds for the same call.

This issue is really just a question: do you have an idea why this happens, or is it expected for some reason?

Comparison of Execution Times (in seconds):

On average, the HSDS method is faster by a factor of 7.62.

HSDS Method:

  • Minimum Time: 0.681
  • Maximum Time: 0.762
  • Average Time: 0.722

Rex Method:

  • Minimum Time: 4.958
  • Maximum Time: 6.380
  • Average Time: 5.498

from rex import WindX
import h5pyd
import pandas as pd
import time

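def measure_hsds_execution_time():
    start_time = time.time()

    # Open via a context manager so the file is closed inside the timed
    # region, matching what the rex version does with its `with` block
    with h5pyd.File("/nrel/wtk/conus/wtk_conus_2014.h5", 'r') as f:
        time_index = pd.to_datetime(f['time_index'][...].astype(str))
        print(time_index)

    return time.time() - start_time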

def measure_rex_execution_time():
    start_time = time.time()
    wtk_file = '/nrel/wtk/conus/wtk_conus_2014.h5'
    with WindX(wtk_file, hsds=True) as f:
        time_index = f.time_index
        print(time_index)

    return time.time() - start_time

# Function to calculate min, max, and average times
def calculate_stats(times):
    min_time = min(times)
    max_time = max(times)
    avg_time = sum(times) / len(times)
    return min_time, max_time, avg_time

# Pause for 5 seconds between calls
def wait():
    time.sleep(5)

# Running the script 5 times and recording execution times
hsds_execution_times = []
rex_execution_times = []

for _ in range(5):
    hsds_execution_times.append(measure_hsds_execution_time())
    wait()
    rex_execution_times.append(measure_rex_execution_time())
    wait()

# Calculating stats for each method
hsds_min, hsds_max, hsds_avg = calculate_stats(hsds_execution_times)
rex_min, rex_max, rex_avg = calculate_stats(rex_execution_times)

# Printing comparison
print("\nComparison of Execution Times (in seconds):\n")
print("HSDS Method:")
print(f"  Minimum Time: {hsds_min:.3f}")
print(f"  Maximum Time: {hsds_max:.3f}")
print(f"  Average Time: {hsds_avg:.3f}")

print("\nRex Method:")
print(f"  Minimum Time: {rex_min:.3f}")
print(f"  Maximum Time: {rex_max:.3f}")
print(f"  Average Time: {rex_avg:.3f}")

# Comparing the average times and calculating the speed difference
if hsds_avg < rex_avg:
    speed_difference = rex_avg / hsds_avg
    print(f"\nOn average, the HSDS method is faster by a factor of {speed_difference:.2f}.")
else:
    speed_difference = hsds_avg / rex_avg
    print(f"\nOn average, the Rex method is faster by a factor of {speed_difference:.2f}.")
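
The script above times each method end to end, so it cannot show which step accounts for the gap (opening the file, reading `time_index`, or converting it to datetimes). Below is a minimal sketch of a per-stage timer that could be wrapped around those steps. `StageTimer` and `demo` are hypothetical names introduced here, and the stage bodies are stand-ins rather than real h5pyd or rex calls, so the example runs without HSDS access.

```python
import statistics
import time
from contextlib import contextmanager


class StageTimer:
    """Collect wall-clock durations per named stage across repeated runs."""

    def __init__(self):
        self.samples = {}

    @contextmanager
    def stage(self, name):
        # perf_counter is a monotonic, high-resolution clock meant for timing
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.samples.setdefault(name, []).append(elapsed)

    def summary(self):
        # Same statistics as the script above, reported per stage
        return {
            name: {
                "min": min(values),
                "max": max(values),
                "mean": statistics.mean(values),
                "n": len(values),
            }
            for name, values in self.samples.items()
        }


def demo(runs=3):
    """Time three stand-in stages; replace the bodies with the real calls."""
    timer = StageTimer()
    for _ in range(runs):
        with timer.stage("open"):
            time.sleep(0.01)  # stand-in for opening the remote file
        with timer.stage("read"):
            raw = [b"2014-01-01 00:00:00"] * 1000  # stand-in for reading bytes
        with timer.stage("convert"):
            decoded = [item.decode() for item in raw]  # stand-in for parsing
    return timer.summary()


if __name__ == "__main__":
    for name, stats in demo().items():
        print(f"{name}: min={stats['min']:.4f} max={stats['max']:.4f} "
              f"mean={stats['mean']:.4f} (n={stats['n']})")
```

Applied to the two functions above, each stand-in body would be replaced by the corresponding real step, for example the `h5pyd.File(...)` open, the `f['time_index'][...]` read, and the `pd.to_datetime(...)` conversion on the HSDS side, with the `WindX` open and the `f.time_index` access as separate stages on the rex side.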

Labels: topic-hsds (issues/pull requests related to hsds integration)
