Not enough RAM for (md.Data(name=system.name)[-1]).to_xarray() #90

@DivineMassacre

Hello,

I encountered a problem with RAM overflow when using something like:

import micromagneticdata as md

system = func(parameters)  # func() is my own helper, see below
drive = md.Data(name=system.name)[-1]  # most recent drive
xarray = drive.to_xarray()

where func() initialises the system and runs a TimeDrive with the given parameters.

For a small system, converting the drive data to an xarray.DataArray for further analysis or export to other formats works without problems. For a large system, however, RAM overflows and the conversion to an xarray.DataArray is impossible, even with chunking and Dask.

For example, I have 64 GB of RAM. My system measures 30x30 µm laterally (XY) and is 20 nm thick (Z), with a cell size of 25x25x20 nm (XYZ). The TimeDrive is 5 ns long with a 2 ps time step (2500 files).
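A rough estimate of the full array size helps to see the scale of the problem. The sketch below assumes that every value is stored as a 64-bit float and that the array holds three magnetisation components per cell; drive_size_gb is a hypothetical helper written for this estimate, not part of any library.

```python
# Assumption: each value is stored as a 64-bit float (8 bytes).
BYTES_PER_VALUE = 8


def drive_size_gb(lateral_nm, cell_nm, n_z, n_components, n_steps):
    """Approximate size in GB of a dense (t, x, y, z, vdims) array."""
    cells_per_side = round(lateral_nm / cell_nm)
    n_values = cells_per_side**2 * n_z * n_components * n_steps
    return n_values * BYTES_PER_VALUE / 1e9


# 30 um x 30 um, one layer in z, 3 components, 2500 time steps
fine = drive_size_gb(30_000, 25, 1, 3, 2500)     # 25 nm cells
coarse = drive_size_gb(30_000, 250, 1, 3, 2500)  # 250 nm cells

print(f"25 nm cells:  {fine:.1f} GB")    # 86.4 GB
print(f"250 nm cells: {coarse:.2f} GB")  # 0.86 GB
```

Under these assumptions the dense array for 25 nm cells (1200x1200 cells per step) needs about 86 GB, which is more than the 64 GB available, while the 250 nm case needs less than 1 GB.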

My main reason for using xarray is to compute a spatially weighted mean of the magnetisation components over the lateral dimensions, instead of the spatially uniform mean provided by drive.table.data[].values. In particular, I tried xarray chunking, with a Dask virtual cluster client initialised, to obtain a spatially weighted mean with a 2D Gaussian weight function:

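import numpy as np
import micromagneticdata as md

# Chunk sizes per dimension of the converted array
CHUNK_SIZES = {
    't': 20,
    'x': 750,
    'y': 750,
    'z': 1,
    'vdims': 3
}

system = func(parameters)
ds = (md.Data(name=system.name)[-1]).to_xarray().chunk(CHUNK_SIZES)

# x0, y0: centre of the Gaussian; sigmax, sigmay: its widths (defined elsewhere)
weights_2d = np.exp(
    -0.5 * (
        ((ds.x - x0) / sigmax)**2 +
        ((ds.y - y0) / sigmay)**2
    )
).chunk({'x': 750, 'y': 750})
result = ds.weighted(weights_2d).mean(dim=['x', 'y'])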

I checked that my code works when the cell size is increased to 250x250x20 nm. With 25x25x20 nm, however, memory overflows, most likely at the step ds = (md.Data(name=system.name)[-1]).to_xarray().chunk(CHUNK_SIZES), presumably because to_xarray() builds the full array in memory before .chunk() is applied. So I think chunking after the conversion cannot solve this problem.
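As a possible workaround, the weighted mean could be accumulated one time step at a time, so that only a single frame is in memory at once. The following is a minimal sketch with plain NumPy on synthetic data; gaussian_weights and streaming_weighted_mean are hypothetical helpers, and the frame layout (nx, ny, nz, 3) is an assumption.

```python
import numpy as np


def gaussian_weights(x, y, x0, y0, sigmax, sigmay):
    """2D Gaussian weights on the lateral grid, shape (nx, ny)."""
    gx = np.exp(-0.5 * ((x - x0) / sigmax) ** 2)
    gy = np.exp(-0.5 * ((y - y0) / sigmay) ** 2)
    return gx[:, None] * gy[None, :]


def streaming_weighted_mean(frames, weights):
    """Weighted mean over x and y, computed frame by frame.

    frames:  iterable of arrays shaped (nx, ny, nz, 3), one per time step
    weights: array shaped (nx, ny)
    Returns an array shaped (nt, nz, 3).
    """
    norm = weights.sum()
    means = []
    for frame in frames:
        # Contract the x and y axes of the frame against the weights
        means.append(np.tensordot(weights, frame, axes=([0, 1], [0, 1])) / norm)
    return np.stack(means)


# Synthetic stand-in for the drive: 5 time steps on a 12 x 10 x 1 grid
rng = np.random.default_rng(0)
nt, nx, ny, nz = 5, 12, 10, 1
x = np.linspace(0.0, 1.0, nx)
y = np.linspace(0.0, 1.0, ny)
weights = gaussian_weights(x, y, x0=0.5, y0=0.5, sigmax=0.2, sigmay=0.3)
data = rng.normal(size=(nt, nx, ny, nz, 3))

result = streaming_weighted_mean(iter(data), weights)
print(result.shape)  # (5, 1, 3)
```

With real data, the frames would have to come from the drive itself, for example by iterating over it and taking one field's values at a time. Whether the drive supports this without loading everything is an assumption I have not verified.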
