feat: New loader for XDI files #203

@pbeaucage

While working on something else, I whipped up the function below, which loads XDI-format files. These particular files come from BMM at NSLS2, but XDI seems to be a real standard. It would be cool to get this into a new loader, roughly mirroring the ESRF ID2 loader. The gist is that there is a large, semantically meaningful header that can point to external HDF5 data sources.


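import pathlib

import h5py
import pandas as pd


def loadSingleScan(filepath):
    """Load one XDI scan and its linked Pilatus HDF5 image stack into an xarray Dataset."""
    filepath = pathlib.Path(filepath)
    basepath = filepath.parent  # the external HDF5 path is resolved relative to the XDI file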
    
    metadata = {}
    header = None
    data_lines = []
    
    with open(filepath, 'r') as f:
        for line in f:
            if line.startswith('#'):
                # Extract metadata lines like "# Key: Value"
                if ':' in line:
                    key_value = line[1:].strip().split(':', 1)
                    if len(key_value) == 2:
                        key, value = key_value
                        metadata[key.strip()] = value.strip()
                header = line  # Keeps updating, so the last '#' line (the column labels) wins
            elif line.strip() and not line.startswith('//') and not line.startswith('--'):
                data_lines.append(line)
    
    # The last '#' line before the data holds the column labels
    if header is None:
        raise ValueError(f"No '#' header lines found in {filepath}")
    column_names = header[1:].strip().split()
    
    # Read data into a DataFrame
    from io import StringIO
    df = pd.read_csv(StringIO(''.join(data_lines)), sep=r'\s+', names=column_names)
    
    # Now `df` contains your data and `metadata` has all the XDI metadata
    # Step 1: Promote DataFrame to xarray Dataset
    ds = df.set_index('energy').to_xarray()
    
    # Step 2: Load HDF5 image stack
    with h5py.File(basepath/metadata['Scan.pilatus100k_hdf5_file'], 'r') as f:
        image_data = f['entry/data/data'][()]  # shape: (293, 195, 487)
    # Confirm dimensions match DataFrame
    assert image_data.shape[0] == len(df), "Image stack and dataframe do not align in length!"
    # Step 3: Insert image as new variable
    ds['pilatus100k'] = (('energy', 'pix_y', 'pix_x'), image_data)
    ds.attrs.update(metadata)
    ds.energy.attrs['unit'] = 'eV'
    return ds
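
For a standalone loader, the header handling could probably be split out from the file and HDF5 I/O so it can be tested on its own. Here is a minimal sketch of that idea, not a finished implementation. It assumes header keys follow the dotted `Family.name` pattern seen in `Scan.pilatus100k_hdf5_file`, that separator rows consist only of `-` or `/` characters, and that the last remaining `#` line carries the column labels. The function name `parse_xdi_header` and the sample text (including the file name `images_0001.h5`) are made up for illustration.

```python
from collections import defaultdict


def parse_xdi_header(text):
    """Split XDI-style header text into grouped metadata and column labels.

    Hypothetical helper: groups "# Family.name: value" lines by family and
    treats the last non-separator '#' line as the column-label row.
    """
    metadata = defaultdict(dict)
    last_comment = None
    for line in text.splitlines():
        if not line.startswith('#'):
            break  # the header ends at the first non-comment line
        body = line[1:].strip()
        if not body or set(body) <= {'-', '/'}:
            continue  # skip empty comments and separator rows
        key, sep, value = body.partition(':')
        if sep and '.' in key:
            family, _, name = key.strip().partition('.')
            metadata[family][name] = value.strip()
        last_comment = body
    columns = last_comment.split() if last_comment else []
    return dict(metadata), columns


# Made-up sample header, only to exercise the function
sample = """\
# XDI/1.0
# Element.symbol: Fe
# Scan.pilatus100k_hdf5_file: images_0001.h5
# ///
# -----
# energy i0 it
7100.0 1000.0 500.0
"""
meta, cols = parse_xdi_header(sample)
print(cols)
print(meta['Scan']['pilatus100k_hdf5_file'])
```

Grouping by family keeps keys such as the `Scan` entries together, which might make it easier to find every external file reference in one place before opening anything with h5py.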
