S3Source and Outputer #47
Missing tail of events if number to be processed (-n) is not divisible by eventFlush parameter
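As an illustration of the failure mode named in this commit message, here is a minimal sketch with hypothetical names (`BatchingWriter`, `finish`, and `countWritten` are not taken from the PR). It assumes a writer that buffers events and writes them in batches of `eventFlush`; if nothing flushes the remainder after the last event, the tail is lost whenever the event count is not a multiple of `eventFlush`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the PR's Outputer: buffers event ids and
// writes them out in batches of `eventFlush`.
class BatchingWriter {
public:
  explicit BatchingWriter(std::size_t eventFlush) : eventFlush_(eventFlush) {}

  void add(int eventId) {
    pending_.push_back(eventId);
    if (pending_.size() >= eventFlush_) flush();
  }

  // Call once after the last event. When the total is not a multiple of
  // eventFlush, the remainder is still sitting in pending_.
  void finish() {
    if (!pending_.empty()) flush();
  }

  // Total number of events that have actually been written out.
  std::size_t nWritten() const {
    std::size_t n = 0;
    for (const auto& batch : written_) n += batch.size();
    return n;
  }

private:
  void flush() {
    written_.push_back(pending_);
    pending_.clear();
  }

  std::size_t eventFlush_;
  std::vector<int> pending_;
  std::vector<std::vector<int>> written_;
};

// Feeds nEvents events through the writer, optionally calling finish().
inline std::size_t countWritten(std::size_t nEvents, std::size_t eventFlush,
                                bool callFinish) {
  BatchingWriter w(eventFlush);
  for (std::size_t i = 0; i < nEvents; ++i) w.add(static_cast<int>(i));
  if (callFinish) w.finish();
  return w.nWritten();
}
```

With 10 events and a batch size of 4, skipping the final flush writes only 8 events; calling `finish()` writes all 10.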
Dr15Jones
left a comment
I got up to the Source. Will stop until next week.
I think a Markdown file describing the organization of the data within Ceph would be helpful.
Here is a slightly radical idea for you. You do not actually require each data product stripe to contain the events in the same order. You only care that within a given block of a stripe, the blocks for all data products hold the same set of events across all the stripes. With that in mind, you would not need to wait until all data products have been serialized for an event before writing them out to the stripes. You just need to know, for each event, which block the event belongs to, and then, for a data product's block within that stripe, you also have to record where each event begins. That record could be as simple as an ever-increasing counter that matches the counter stored in the stripe containing the event's metadata.
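To make the suggestion concrete, here is a rough sketch under stated assumptions. Every name in it (`ProductBlock`, `Entry`, `append`, `find`, `blockFor`) is hypothetical and not from the PR, and mapping events to blocks by integer division is only one possible choice. Each data product's block stores, next to the serialized bytes, the event counter and the offset where that event begins, so two products can hold the same set of events in different orders.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration only: none of these types come from the PR.
// One block of one data product's stripe. Events may be appended in any
// order, so each entry records the event counter alongside the position
// where that event's serialized payload begins.
struct ProductBlock {
  struct Entry {
    std::uint64_t counter;  // matches the counter kept with the event metadata
    std::size_t offset;     // start of this event's bytes within `bytes`
    std::size_t size;       // length of this event's payload
  };

  std::vector<Entry> index;
  std::vector<char> bytes;

  // Append one event's serialized payload as soon as it is ready.
  void append(std::uint64_t counter, const std::vector<char>& payload) {
    index.push_back({counter, bytes.size(), payload.size()});
    bytes.insert(bytes.end(), payload.begin(), payload.end());
  }

  // Look up an event by counter, independent of the order it was written in.
  bool find(std::uint64_t counter, std::vector<char>& out) const {
    for (const auto& e : index) {
      if (e.counter == counter) {
        out.assign(bytes.data() + e.offset, bytes.data() + e.offset + e.size);
        return true;
      }
    }
    return false;
  }
};

// Assumption for this sketch: an event's block is fixed by its counter,
// so every data product agrees on which block the event goes into.
inline std::uint64_t blockFor(std::uint64_t counter,
                              std::uint64_t eventsPerBlock) {
  return counter / eventsPerBlock;
}
```

The linear search in `find` keeps the sketch short; a reader could instead sort the index by counter once the block is complete.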
Order of magnitude speedup in serial section because compiler didn't figure out this was a memcpy with the old invocation
Some race is still around
…s3io Conflicts: Lane.cc
also const-ify some stuff
Ready to go