Skip to content

Conversation

@kstppd
Copy link
Collaborator

@kstppd kstppd commented Sep 6, 2024

So in this PR there is some pruning in the SplitVector code to enable the use of external allocators such as Umpire. This PR enables Hashinator as well to propagate the allocators to SplitVector. A citation methods has also been added and as well as some changes to the README.

  • The split::split_host_allocator is not needed anymore an replace with std::allocator
  • The metadata size and capacity are now allocated by the template allocator in splitvector. This enables us to use a standardized allocators but comes with the hacky part of some extra type casting that needs to take place. In cases where sizeof(T) is enormous this approach is not efficient, but since Hashinator and SplitVectors are meant to be used with trivial types this is rather unlikely.

@kstppd kstppd marked this pull request as ready for review September 27, 2024 07:47
Copy link

@markusbattarbee markusbattarbee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't look through tests yet, but some commentary and fix suggestions.

Copy link

@markusbattarbee markusbattarbee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

}
return elements.size();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up - what I mean is that with Allocator support, will the old versions (which do not template allocators) ever be called anymore? Or could they be safely deleted?

@markusbattarbee
Copy link

What do you think, how much work would it be to test this also with MallocMC? That would be nice to report. :)

Also, in case you didn't notice, there's some merge conflicts that still need resolving.

Copy link

@markusbattarbee markusbattarbee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few major things still, please.

}
return elements.size();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot of code duplication :( Couldn't you put the default allocator as the default argument to this templated function? (same thing as in split_tools.h line 104)

HOSTONLY void copyMetadata(SplitInfo* dst, split_gpuStream_t s = 0) {
SPLIT_CHECK_ERR(split_gpuMemcpyAsync(&dst->size, _size, sizeof(size_t), split_gpuMemcpyDeviceToHost, s));
SPLIT_CHECK_ERR(split_gpuMemcpyAsync(&dst->capacity, _capacity, sizeof(size_t), split_gpuMemcpyDeviceToHost, s));
SPLIT_CHECK_ERR(split_gpuMemcpyAsync(&dst->size, _info, sizeof(SplitInfo), split_gpuMemcpyDeviceToHost, s));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still needs changing to point to just dst

assert(0 && "Splitvector has a catastrophic failure trying to resize on device.");
}
if (construct) {
if constexpr (!std::is_trivially_copy_constructible_v<T> ){

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in that case there needs to be some other workaround - by default, a vector of a given size has all elements set to zero. This must still hold true for SplitVector. If necessary, you can add:

if (_construct) {
   for (size_t i = size(); i < newSize; ++i) {
      data[i] = T();
   }
}

or something similar, I guess. Ugly, but potentially needs to be done.
Or can you check somehow if the allocator has a construct method? Make an issue about it on the Umpire repo?


// Allocate with Mempool
const size_t memory_for_pool = 8 * nBlocks * sizeof(uint32_t);
splitStackArena mPool(memory_for_pool, s);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

@kstppd
Copy link
Collaborator Author

kstppd commented Apr 12, 2025

What do you think, how much work would it be to test this also with MallocMC? That would be nice to report. :)
I am now having a look on how that can be interfaces in one of our unit tests


// Allocate with Mempool
const size_t memory_for_pool = 8 * nBlocks * sizeof(uint32_t);
splitStackArena mPool(memory_for_pool, s);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the allocator should be the one handling the memory pools.

}
#if defined(__HIP__) && !defined(AMD_COHERENCE_FINE)
int device;
SPLIT_CHECK_ERR(split_gpuGetDevice(&device));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

@markusbattarbee
Copy link

In general I would like to see less code duplication as that leads to potential issues in future development, having to fix things in several places, but I guess it's out of my hands now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants