Description
PR #1683 (I think) breaks example nvexec.maxwell_gpu_m, causing a seg fault at runtime (or a libstdc++ assertion failure if using the hardened standard library in GCC 15).
The buggy code is in classes stream_scheduler and multi_gpu_stream_scheduler and in their common base class stream_scheduler_env. The base class has this code:
    struct stream_scheduler_env {
      // ...
      auto query(get_completion_scheduler_t<set_value_t>) const noexcept -> stream_scheduler;
      // ...
    };
    // ...
    inline auto stream_scheduler_env::query(get_completion_scheduler_t<set_value_t>) const noexcept
      -> stream_scheduler {
      return (const stream_scheduler&) *this;
    }

The query function assumes without checking that the derived class is a stream_scheduler.
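For illustration only, here is a minimal sketch of one way a shared base class could avoid hard-coding the derived type: a CRTP parameter that names the scheduler actually inheriting from it. Everything below is a simplified stand-in (the `_sketch` names, the tag types, and the one-field `context_state_t` are assumptions, not the real nvexec definitions), and it uses public inheritance so that `static_cast` is accessible. It is not a proposed patch for stdexec.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins; these are NOT the real stdexec/nvexec types.
struct set_value_t {};
template <class Tag>
struct get_completion_scheduler_t {};
struct context_state_t {
  int device_id = 0;
};

// CRTP base: Derived names the scheduler that actually inherits from this env,
// so query() never has to guess which derived class it lives inside.
template <class Derived>
struct scheduler_env_sketch {
  auto query(get_completion_scheduler_t<set_value_t>) const noexcept -> Derived {
    // Well-defined downcast: *this really is the base subobject of a Derived.
    return static_cast<const Derived&>(*this);
  }
};

struct stream_scheduler_sketch : scheduler_env_sketch<stream_scheduler_sketch> {
  context_state_t context_state_{};
};

struct multi_gpu_stream_scheduler_sketch
  : scheduler_env_sketch<multi_gpu_stream_scheduler_sketch> {
  int num_devices_{};
  context_state_t context_state_{};
};

// The query on the multi-GPU stand-in returns the multi-GPU type itself.
static_assert(
  std::is_same<
    decltype(std::declval<const multi_gpu_stream_scheduler_sketch&>().query(
      get_completion_scheduler_t<set_value_t>{})),
    multi_gpu_stream_scheduler_sketch>::value,
  "query() should return the scheduler type it was called on");

// Copies a populated multi-GPU stand-in through query() and checks its members.
inline void check_query_round_trip() {
  multi_gpu_stream_scheduler_sketch sched;
  sched.num_devices_ = 4;
  sched.context_state_.device_id = 7;
  auto copy = sched.query(get_completion_scheduler_t<set_value_t>{});
  assert(copy.num_devices_ == 4);
  assert(copy.context_state_.device_id == 7);
}
```

With private inheritance, as in the real classes, the base would additionally need to be a friend of the derived class (or keep a C-style cast) for the downcast to compile.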
The two derived classes contain this code:
    struct stream_scheduler : private stream_scheduler_env {
      // ...
      using stream_scheduler_env::query;
      // ...
      // non-static data members:
      context_state_t context_state_;
    };

    struct multi_gpu_stream_scheduler : private stream_scheduler_env {
      // ...
      using stream_scheduler_env::query;
      // ...
      // non-static data members:
      int num_devices_{};
      context_state_t context_state_;
    };

__read_query_t::operator() has return __attrs.query(_GetComplSch{});, which can call the query function in question with a multi_gpu_stream_scheduler. That essentially does a bit cast of a multi_gpu_stream_scheduler to a stream_scheduler, which doesn't work because the data members aren't compatible.
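The layout mismatch can be made concrete with a small sketch. The types below are simplified stand-ins (the `_sketch` names and the one-field `context_state_t` are assumptions, not the real nvexec definitions). The sketch only compares member offsets and deliberately does not perform the invalid cast, since that would be undefined behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; the real context_state_t holds more than one int.
struct context_state_t {
  int device_id = 0;
};

struct env_sketch {};  // empty base, like an env with no data members

struct single_sketch : env_sketch {
  context_state_t context_state_;
};

struct multi_sketch : env_sketch {
  int num_devices_{};
  context_state_t context_state_;
};

// In the single-GPU stand-in, context_state_ is the first member (offset 0).
constexpr std::size_t single_context_offset() {
  return offsetof(single_sketch, context_state_);
}

// In the multi-GPU stand-in, num_devices_ comes first, so context_state_ is
// pushed to a later offset.
constexpr std::size_t multi_context_offset() {
  return offsetof(multi_sketch, context_state_);
}

static_assert(single_context_offset() == 0,
              "context_state_ is the first member of single_sketch");
static_assert(multi_context_offset() >= sizeof(int),
              "context_state_ follows num_devices_ in multi_sketch");
static_assert(single_context_offset() != multi_context_offset(),
              "the two layouts place context_state_ at different offsets");
```

Under these assumptions, reading `context_state_` through a reference that pretends a `multi_sketch` is a `single_sketch` would start at the bytes of `num_devices_`, which is consistent with the crash described above.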
This was found with NVC++ testing, where example nvexec.maxwell_gpu_m fails consistently with a runtime seg fault. The problem wasn't noticed earlier because other problems with our stdexec tests (on our side, not in stdexec) were masking the failures. NVHPC tracking: FS#38096