Replies: 15 comments 3 replies
-
Subsystems Required: Nice to have
-
Related to: …
-
Responsible developers: @marcpaterno and @sabasehrish.
-
Here are some types of chunking needed for DUNE and what implications chunking may have. It has a focus on FD charge data and Wire-Cell implementations so is definitely not comprehensive.

Terms

Wire-Cell charge waveform simulation

Basic transformation: The …

There are several types of chunking relevant to this simulation: …

Wire-Cell charge waveform signal processing

Basic transformation: Signal waveforms represent a reconstruction of the distribution of drifted ionization charge in (transverse) space vs time dimensions of each tomographic wire-plane view. The samples of a signal waveform are in units of number of (drifted) ionization electrons per tick per channel. The signal waveforms are highly sparse and can be represented in a space-efficient way either with sparse arrays or as compressed dense arrays (zero padding the sparse regions). There are two types of chunking that are relevant: …

Wire-Cell charge sim+sigproc

As a special case, when both simulation and signal processing are needed, it is desired (at least for large scale production) to NOT expose …

Wire-Cell 3D charge imaging

This process reconstructs, with coarse resolution, locations in space/time likely to contain ionization electron signal. It is a per-APA transformation and essentially a streaming algorithm. Thus, it is robust against space-chunking at the APA level and any reasonable time-chunk.

Wire-Cell charge cluster stitching

WC (and other) reconstruction chains form "clusters" of some type that represent a high resolution reconstruction of ionization locations. In WC, and for the case of compact (nominal, not extended) data, clusters are constructed first on a per-TPC basis. They are then "stitched" across the two TPCs of one APA and then across neighboring APAs. Each type of stitching requires assembly of any chunk-level clusters such that the boundaries are spanned. This can be pair-wise at the 2TPC->APA stitching, and then all APA-level clusters can be assembled for the cross-APA stitching. Finding clusters from extended data poses a problem in the face of chunking due to a given set of blobs that should become a single cluster landing on a chunk boundary. Some possible solutions: …

Wire-Cell Charge-Light matching

Charge clusters and "flashes" reconstructed from the optical detection system must be matched in space and time in order to absolutely locate the cluster. The DUNE FD design does not include optical boundaries at the TPC or APA level and so the matching is done with whole-detector charge and light information. Any prior chunking of these data must be such to allow the required assembly. As with clustering, chunking in time may be required for input clusters and/or flashes and similar solutions can be considered ("chunk and hope" vs "streaming alg").

Cross-chain merging

DUNE has multiple, independent reco chains. Eg Wire-Cell and Pandora both split off after signal processing in order to implement different strategies. It is necessary to allow data products from one chain to "cross over" to another. This is needed for performing comparisons and so that one chain can simply input results from the other to form a subsequent hybrid chain. Each consumer at the merge will impose some requirements related to the chunk boundaries of the data products from each stream. Even in the unlikely case that identical chunk boundaries existed on both streams, the node consuming the two streams may have special needs. Eg, it may require consuming a FIFO queue of some depth of data products from each stream.
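To make the chunk-boundary issue concrete, here is a minimal sketch of time-chunking with an overlap margin, the "chunk and hope" flavor of solution mentioned above. This is not WCT code; Sample, the chunk span, and the overlap are invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// A hypothetical time-ordered sample; only the fields needed for this sketch.
struct Sample {
    double time;   // arbitrary time units
    double charge; // arbitrary charge units
};

// Split a time-ordered sequence into chunks covering [lo, lo + chunk_span),
// each extended by `overlap` on the trailing edge.  A blob or cluster that
// straddles a chunk boundary then appears (at least partially) in two chunks
// and can be stitched or de-duplicated downstream.
std::vector<std::vector<Sample>>
chunk_with_overlap(const std::vector<Sample>& samples,
                   double chunk_span, double overlap)
{
    std::vector<std::vector<Sample>> chunks;
    if (samples.empty()) return chunks;

    double lo = samples.front().time;
    std::size_t begin = 0;
    while (begin < samples.size()) {
        const double hi = lo + chunk_span + overlap;
        std::vector<Sample> chunk;
        for (std::size_t i = begin; i < samples.size() && samples[i].time < hi; ++i)
            chunk.push_back(samples[i]);
        if (!chunk.empty()) chunks.push_back(std::move(chunk));
        lo += chunk_span;
        // Advance past samples that fall entirely before the next window.
        while (begin < samples.size() && samples[begin].time < lo) ++begin;
    }
    return chunks;
}
```

The overlap is what lets a downstream stitcher span chunk boundaries; the alternative discussed above is a genuinely streaming algorithm that carries state across chunks instead.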
-
To provide some context, we discussed these slides at yesterday's meeting to start the discussion.
-
Hi @brettviren. Saba and I are just getting back to looking at the …
-
Hi @brettviren and @absolution1, we're trying to move forward on the Phlex design, and we'd like to hear from you whether @marcpaterno and @sabasehrish's updated wire-cell workflow document accurately captures one of the use cases—i.e. the use case that Brett included above with the title "Wire-Cell charge waveform simulation". Could you take a look please and give us your thoughts? Thanks very much.
-
Hi @knoepfel et al and sorry for the slow response.

The granularity of this example is smaller than the current simulation implemented with the Wire-Cell toolkit running inside art/larsoft. It would "bust apart" the WCT graph execution for no perceived benefit. As a thought exercise to draw out some patterns, the example is perhaps okay, but I'd not want to see this particular graph implemented in phlex. It is not useful to reinvent WCT. Rather, the focus should be on how phlex can run WCT and other "payloads".

Also, the note describes a "time-binned depos" being converted into a "drift-binned depos". That's not sufficient for how the physics works. There is no clean one-to-one mapping between any possible pre- and post-drift binnings because the drift distance/time swap causes the order to change.

The note assumes "all SNB depos in memory". I don't know if that is even feasible. It certainly is if one only considered SN interactions, but 100 s of radiologicals and cosmics will be a lot of data.

Instead I suggest we consider ways to operate in a "chunked streaming" mode throughout. This might start with a stream of G4 final state kinematics for individual interactions over an extended time. Each interaction can be fed through Geant4 one at a time to produce a set of depos. The sets of depos may overlap and so need some kind of windowed/streamed sort. The content of the window is bound and determined by the data (a similar windowed sort is done in the WCT drifter). The time-ordered depos can be streamed into WCT sim, and WCT sim would output a time-ordered stream of chunks of ADC waveforms. In this serial-streamed mode, the "leakage" can be handled inside of WCT and not exposed to phlex.

OTOH, this sequential stream will lead to rather long-running jobs. Simulating 100 s of 1 APA is order a CPU-week of compute, not counting G4 time. If monopolizing a single CPU for order a week is not feasible then we may consider a scatter-gather approach. In this case we may segment the ordered depo stream and scatter those segments to many WCT sim jobs. The output of these must not yet be "leakage free" ADC, but rather signal+noise waveform chunks at voltage level and with "leakage" tails. The tails must be snipped, transferred, and added to the subsequent chunk in time, and then the result (again, a voltage-level waveform) sent into the WCT digitizer to produce (leakage-free) ADC waveform chunks. A final gather of the resulting ADC waveform chunks would/could be done in order to assemble contiguous runs of chunks, eg 1 APA-second per file (approx 10 GB).

Though, actually, DUNE already generally does not want to stop sim at ADC but immediately follow on with signal processing to avoid pointlessly storing large ADC data. So the above scatter-gather gets a bit more complicated. In any case, these are more practical granularities and data flow patterns to consider.
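As a rough illustration of the scatter-gather tail handling described above (snip each chunk's leakage tail and add it onto the next chunk in time before digitization), here is a minimal sketch; VoltageChunk and the fixed-layout tail are assumptions for illustration, not WCT types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One voltage-level waveform chunk from a scatter job: `nominal` in-window
// samples followed by leakage samples that spill past the chunk's time window.
struct VoltageChunk {
    std::vector<double> samples; // size == nominal + tail length
    std::size_t nominal;         // number of in-window samples
};

// Snip each chunk's leakage tail and add it onto the head of the next chunk
// in time order, returning leakage-free, nominal-length chunks that can then
// be digitized independently.
std::vector<std::vector<double>>
absorb_leakage(std::vector<VoltageChunk> chunks)
{
    std::vector<std::vector<double>> out;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        auto& c = chunks[i];
        assert(c.samples.size() >= c.nominal);
        if (i + 1 < chunks.size()) {
            auto& next = chunks[i + 1].samples;
            const std::size_t tail = c.samples.size() - c.nominal;
            // Add this chunk's tail onto the start of the following chunk.
            for (std::size_t j = 0; j < tail && j < next.size(); ++j)
                next[j] += c.samples[c.nominal + j];
        }
        // Keep only the in-window part of this chunk.
        c.samples.resize(c.nominal);
        out.push_back(std::move(c.samples));
    }
    return out;
}
```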
-
Thanks for the response, @brettviren.
It is neither our desire nor intent to reinvent WCT. We are trying to understand the details of various WCT workflows that Phlex may need to support. It is certainly possible that a WCT workflow could be represented as a "black box" to the framework—this is (more-or-less) how things work with art/LArSoft jobs. You're well aware of the awkward back-and-forth between art and WCT through …

We're going to chat a bit on our end, and then I'd like to propose a joint Zoom meeting between the WCT folks and the Phlex developers. It's good to have a written record of our conversations (via these GitHub discussions), but we are likely getting to the point where a face-to-face meeting will be more efficient.

Thanks for your patience, and stay tuned.
-
Hi @knoepfel

A zoom chat sounds good.

Let me try to explain my thinking about "it's not useful to reinvent WCT" as that statement may have come off wrong. My concerns are all technical (and driven to minimize future effort).

I think you hit on a core issue, which is indeed the nature of the interface layer between the framework and any given "toolkit". Toolkit here means WCT but also Pandora or any other implementation that has its own data model and possibly also its own execution model (eg, the optional use of the WCT's graph execution engines).

The data model interface is more fundamental in my mind. Each side of that interface has different requirements on its own data model. The requirements for existing toolkits are already baked into their implementations and we must keep those as immutable. It then becomes a challenge for PHLEX to provide its own data model that first works with all the existing toolkits' models and second makes passing data through the interface between framework and toolkit model fast (for some definition of "fast").

I think the "works" part poses no serious challenge. We can always come up with something. I think "fast" must be measured by comparing the time for data to traverse framework->toolkit->framework and toolkit->framework->toolkit paths. That is, the time spent in the ->'s, in just "framework" and in just "toolkit" must all be compared.

What we have today in art/larsoft is a data interface that requires serializing data between the framework data model (meaning larsoft) and the toolkit data model. Eg, in the larwirecell WCT nodes we convert every bit of data between WCT IFrame and larsoft vector<raw::RawDigit> or vector<recob::Wire>. This satisfies the "works" but satisfying the "fast" depends on the nature of the job.

And here is where my concern about the granularity of WCT graphs in PHLEX comes into play. For many individual WCT graph nodes, even the expensive serialization interface poses insignificant overhead. But for some nodes, this overhead would dominate. And the number of these "too fast" nodes is increasing as we develop more small-scope GPU-dominated nodes. These fast nodes implicitly rely on fast data transfer to be relevant. With WCT's TbbFlow, message passing by TBB's flow_graph is measured in MHz, which makes data transfer times insignificant even for our fastest GPU nodes. If we inserted larwirecell-style serialization as a fine-grained transfer method, it would largely degrade the benefits of developing these fast GPU nodes. Indeed, we take pains to even avoid a tensor leaving the GPU as the WCT data object that "holds" the torch::Tensor is transferred between flow graph nodes.

So, with PHLEX, if we want a fine-grained toolkit, we should think of a different framework data model that does not require data serialization. The only model I can think of that has a chance to be "fast" is one based on data encapsulation.

Happily, a PHLEX encapsulation data model is a good fit to WCT's data model, which is based on an abstract base class hierarchy of interface classes rooted in a single IData. Furthermore, WCT data objects are passed between WCT nodes as shared_ptr and even type erased all the way down to boost::any inside the WCT graph execution engines.

This then enables a PHLEX data model that encapsulates WCT (and other toolkit?) data objects via lightweight pointer-like instances. We might think of a PhlexToolkitData<T> PHLEX data type (hopefully with some nicer name). The T might be as type-free as boost::any or expose some level of toolkit-specific typing, eg base shared_ptr<IData> for WCT on up to leaf data interface types like shared_ptr<IFrame> or shared_ptr<ITorchTensorSet> (the main data type used by the GPU-heavy nodes).

A really nice side effect of exposing the WCT leaf data interface types is that it enables people to develop new "PHLEX-native" nodes that can directly consume WCT data model types without writing full-blown-and-phlex-wrapped WCT flow graph node types. This would "steal" developer "mind share" from the WCT world, but whatever lowers the bar for people developing what they need is usually a good thing.

Another benefit of this "encapsulation" data model interface is that we can probably get away with developing a single, general-purpose, templated PHLEX equivalent to the many type-specific data converters in larwirecell that are required for that "serialization" data model interface.
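To sketch what such an encapsulating data type could look like: PhlexToolkitData is only a placeholder name from the comment above, the code uses std::any rather than boost::any, and nothing here reflects an actual Phlex or WCT interface.

```cpp
#include <any>
#include <stdexcept>
#include <utility>

// Sketch of a framework-side wrapper that encapsulates a toolkit data object
// without serializing it.  T might be a toolkit interface pointer such as
// std::shared_ptr<SomeToolkitInterface>, or something fully type erased.
template <typename T>
class PhlexToolkitData {
public:
    explicit PhlexToolkitData(T payload) : payload_(std::move(payload)) {}
    const T& get() const { return payload_; }

private:
    T payload_;  // held as a (shared) pointer or handle; never copied out wholesale
};

// Fully type-erased variant: hold anything, recover it by asking for its type.
class ErasedToolkitData {
public:
    template <typename T>
    explicit ErasedToolkitData(T payload) : payload_(std::move(payload)) {}

    template <typename T>
    const T& as() const {
        const T* p = std::any_cast<T>(&payload_);
        if (!p) throw std::runtime_error("ErasedToolkitData: wrong type requested");
        return *p;
    }

private:
    std::any payload_;
};
```

In this picture a payload held as shared_ptr<IFrame> would move through the framework as PhlexToolkitData<std::shared_ptr<IFrame>>, so only a pointer is copied between nodes.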
-
Hi all,
I’m just returning from holiday and catching up.
If there’s a zoom call please loop me in.
Cheers,
Dom
-
@knoepfel From listening to your nice presentation at the DUNE collab meeting, I think the "data encapsulation" approach I described above is well aligned with the PHLEX data model. It makes me think about the possibility of (re)interpreting existing WCT graphs defined in Jsonnet to be executed as a PHLEX (sub)graph. IOW, I can start to see the shape of a new, 3rd WCT graph execution engine which looks like the current Pgraph and TbbFlow to the WCT side but looks like a "subgraph definer" to PHLEX (which would provide the graph execution engine). This could allow for a more fine-grained, but still generic, interface layer compared to the pattern in …
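A bare-bones sketch of the kind of adapter such a third engine might need: something that looks like an ordinary node callable to the host framework but forwards a type-erased payload to a strongly typed toolkit callable. All names are invented; neither the WCT node interfaces nor a real PHLEX scheduling API is reproduced here.

```cpp
#include <any>
#include <functional>
#include <utility>

// A framework-facing node: consumes and produces type-erased payloads.
using ErasedNode = std::function<std::any(const std::any&)>;

// Adapt a strongly typed toolkit callable into an ErasedNode.  The host
// framework schedules ErasedNode instances; the toolkit's concrete types
// never leak into the framework's own data model.
template <typename In, typename Out, typename F>
ErasedNode adapt_toolkit_node(F toolkit_call)
{
    return [toolkit_call](const std::any& in) -> std::any {
        const In& typed_in = std::any_cast<const In&>(in);
        Out typed_out = toolkit_call(typed_in);
        return std::any{std::move(typed_out)};
    };
}

// Example usage (types are stand-ins):
//   ErasedNode n = adapt_toolkit_node<int, double>([](int x) { return x * 0.5; });
//   std::any out = n(std::any{4});   // holds double{2.0}
```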
-
We agree that the core issue in question is the nature of the interface between the framework and the algorithms.

Our understanding is that WCT commonly uses data that are essentially vectors of shared pointers to some interface type. The pointed-to objects, of course, must be of some concrete type. If this is correct, then the current early version of Phlex can likely already use the style of data products from WCT. We have a test in Phlex that demonstrates it will work correctly when passing std::vector<std::unique_ptr<Abstract>> objects between algorithms (where Abstract is an abstract class). More specifically, an algorithm that expects a const reference to a vector of shared pointers to Base can be declared to Phlex, and Phlex would handle propagating the data type correctly. An algorithm can also return such a type.

If a Phlex workflow is going to use algorithms that expect LArSoft-style types (e.g., vector<raw::RawDigit>) and other algorithms that expect WCT-style types (e.g., WireCell::ITrace::vector), then translation from one to the other is needed. We are imagining at least two ways of doing this, one of which might be automatically scheduled by Phlex if the LArSoft and WCT types model the same data-product concept (we can say more about that another time).

A Phlex workflow that closely matches those done by larwirecell could then consist of two or more components. The first is a translator that turns a LArSoft type into the input needed by the WCT-based algorithms to follow. Next would be one (or more) Phlex nodes that deal only with WCT-defined data types. The entire WCT workflow could be expressed as one Phlex node (following the LArSoft->WCT translation), or it may be desirable to factorize the WCT workflow further to allow for data-product provenance tracking, taking advantage of framework scheduling, etc. If translation back to LArSoft types is needed for follow-up algorithms, then last would come another translator from WCT types to LArSoft types.
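Here is a small sketch of the data shapes described above: an algorithm consuming a const reference to a vector of smart pointers to an abstract interface, and a translator between a LArSoft-style and a WCT-style container. Abstract, LArSoftStyleDigit, and WctStyleTrace are stand-ins, and the actual Phlex registration API is not shown.

```cpp
#include <memory>
#include <vector>

// Stand-in for a toolkit interface type (e.g. a WCT IData-like base).
struct Abstract {
    virtual ~Abstract() = default;
    virtual double value() const = 0;
};

// An algorithm as the framework might see it: a free function taking a const
// reference to a vector of shared pointers to the abstract interface.
double total(const std::vector<std::shared_ptr<Abstract>>& input)
{
    double sum = 0;
    for (const auto& p : input) sum += p->value();
    return sum;
}

// Stand-ins for the two data models on either side of a translator node.
struct LArSoftStyleDigit { std::vector<short> adcs; };
struct WctStyleTrace     { std::vector<float> charge; };

// A translator from one data model to the other; in a real workflow this is
// where larwirecell-style conversion code would live.
std::vector<WctStyleTrace>
to_wct(const std::vector<LArSoftStyleDigit>& digits)
{
    std::vector<WctStyleTrace> out;
    out.reserve(digits.size());
    for (const auto& d : digits) {
        WctStyleTrace t;
        t.charge.assign(d.adcs.begin(), d.adcs.end());
        out.push_back(std::move(t));
    }
    return out;
}
```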
-
Hi @marcpaterno. Yes, this is essentially correct. Though the unit of data passed between WCT data flow graph nodes is always a …

I think your view of sometimes applying WC/LS data translation and sometimes passing WC data as-is is good. And, indeed, it is basically the …

WC vs LS data models and their inter-conversion could be an entire subgroup topic. There is some intersection and quite a lot of difference between them, and writing converters is always non-trivial and non-fun (though doable).

One approach that might be fruitful going forward is to define a generic data model that is flexible enough to faithfully represent the info in the WC and LS (and other) data models. WCT actually does this with its own data model in the "WCT tensor data model" (link below). It consists of a generic "low level" spec which is HDF5-inspired and is comprised of dense multi-dimensional tensors and, for each tensor, a JSON-like metadata object. The "high level" TDM spec then maps each complex, structured WC data type to the low-level TDM. Something like this, perhaps better thought out by more people than just me, could become a sort of "schema bus" that allows many disparate toolkits to share data most easily.

https://github.com/WireCell/wire-cell-toolkit/blob/master/aux/docs/tensor-data-model.org
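The low-level half of such a "schema bus" could be as simple as the sketch below: a dense N-dimensional array paired with free-form metadata. This is only inspired by the WCT tensor data model linked above; it does not reproduce that spec, and the field and key names are invented.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Low-level unit: a dense N-dimensional array plus free-form metadata.  A real
// implementation would use typed storage and a proper JSON object; a
// string->string map and double storage keep the sketch self-contained.
struct Tensor {
    std::vector<std::size_t> shape;              // e.g. {nchannels, nticks}
    std::vector<double> data;                    // row-major dense storage
    std::map<std::string, std::string> metadata; // e.g. {"datatype", "frame"}
};

// High-level unit: a set of tensors that together represent one structured
// data object (a frame, a cluster set, ...), keyed by role.
using TensorSet = std::map<std::string, Tensor>;

// Example: pack a per-channel waveform block as one tensor of a set.
TensorSet pack_frame(std::size_t nchannels, std::size_t nticks,
                     const std::vector<double>& samples)
{
    Tensor t;
    t.shape = {nchannels, nticks};
    t.data = samples;                 // assumed already row-major
    t.metadata["datatype"] = "frame";
    TensorSet ts;
    ts["frame"] = std::move(t);
    return ts;
}
```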
-
We are closing this discussion as it was intended to cover data-chunking explorations for FY25. That effort is now complete, but we want to continue the WireCell discussion in the context of early adoption (see discussion #14). |
-
DUNE US S&C R&D item 101
Data chunking is intended to process a logical data product that is too large to fit in memory at once. This demonstrator requires several things:

- std::span<T> vs. std::vector<T> could imply that some data can be chunked for an algorithm and some cannot (see the sketch below).

To produce a demonstrator we are introducing a concept of a chunk-able data product (e.g. a sequence of waveforms); in general a chunk-able data product will be a sequence of something.
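A small sketch of the distinction the std::span<T> vs. std::vector<T> point is drawing (Waveform is a placeholder element type; std::span requires C++20): an algorithm written against a span can be handed one chunk of a chunk-able sequence at a time, while one that demands a whole vector forces the entire logical data product into memory.

```cpp
#include <span>
#include <vector>

struct Waveform { std::vector<float> samples; };  // placeholder element type

// Chunk-friendly: accepts any contiguous view, so the framework may hand it
// one chunk of a larger (possibly never-fully-in-memory) sequence at a time.
double summed_charge(std::span<const Waveform> chunk)
{
    double total = 0;
    for (const auto& w : chunk)
        for (float s : w.samples) total += s;
    return total;
}

// Chunk-hostile: requires the entire sequence to exist as one vector, so the
// whole logical data product must fit in memory before the call.
double summed_charge_whole(const std::vector<Waveform>& all)
{
    return summed_charge(std::span<const Waveform>{all});
}
```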