@leventov
Contributor

@leventov leventov commented Nov 28, 2016

This change makes Concise bitset iteration 1.5 to 3 times faster, but it requires Integer.numberOfTrailingZeros() to be a 1-tick intrinsic, which is the case on Intel Haswell and later. Apparently we are running Druid historicals on AWS R3 instances, which are Ivy Bridge.
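
For readers unfamiliar with the technique, here is a minimal sketch of iterating over the set bits of a 32-bit word with Integer.numberOfTrailingZeros(). It is an illustration only, not the code in this PR: the class TrailingZerosIteration, the method collectSetBits, and its parameters are hypothetical names chosen for the example.

```java
final class TrailingZerosIteration {
    /**
     * Hypothetical helper (not from the PR): writes the positions of all set
     * bits of {@code word}, shifted by {@code baseOffset}, into {@code buffer}
     * and returns how many positions were written.
     */
    static int collectSetBits(int word, int baseOffset, int[] buffer) {
        int len = 0;
        int offset = baseOffset;
        while (word != 0) {
            // Jump straight to the lowest set bit instead of testing bit by bit.
            int trailingZeros = Integer.numberOfTrailingZeros(word);
            offset += trailingZeros;
            buffer[len++] = offset;
            // Drop the skipped zeros and the bit just reported. Two shifts are
            // used so that the total shift distance never reaches 32.
            word = (word >>> trailingZeros) >>> 1;
            offset++;
        }
        return len;
    }
}
```

With this sketch, calling collectSetBits(0b10110, 0, buffer) fills the buffer with the positions 1, 2 and 4. The loop body runs once per set bit, so its cost is dominated by numberOfTrailingZeros(), which is why the speed of that intrinsic on the target CPU matters.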

int trailingZeros = Integer.numberOfTrailingZeros(word);
offset += trailingZeros;
buffer[len++] = offset;
word >>>= trailingZeros;
Contributor

Since we're shaving operations: you've already cleared the high bit here, so do you really need the unsigned shift?

Contributor Author

fixed
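
The reviewer's point can be checked with a small sketch: when the sign bit of an int is clear, the signed shift (>>) and the unsigned shift (>>>) give the same result for every shift distance. The class ShiftEquivalence and the method shiftsAgree are hypothetical names used only for this illustration; they are not part of the PR.

```java
final class ShiftEquivalence {
    /**
     * Hypothetical check (not from the PR): returns true if {@code word >> s}
     * equals {@code word >>> s} for every shift distance s in [0, 31].
     */
    static boolean shiftsAgree(int word) {
        for (int s = 0; s < 32; s++) {
            if ((word >> s) != (word >>> s)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, shiftsAgree(0x7FFFFFFF) holds, while shiftsAgree(0x80000000) does not, because the signed shift copies the set sign bit into the vacated high positions.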

@leventov leventov changed the title from "Optimize LiteralAndZeroFillExpander.resetLiteral()" to "Optimize LiteralAndZeroFillExpander.resetLiteral() [don't merge]" Dec 9, 2016
@leventov
Contributor Author

leventov commented Dec 9, 2016

Added [don't merge] to the title: this is not intended to be merged right now, because "the hardware is not ready".

@leventov leventov force-pushed the iteration-haswell-optimization branch from b7caac4 to 54d0e90 Compare December 9, 2016 01:23
@leventov leventov changed the base branch from concise-intersection-union-optimizations to master December 9, 2016 01:25
@drcrallen
Contributor

@leventov does it SLOW down if that hardware intrinsic isn't present?

@leventov
Contributor Author

leventov commented Feb 7, 2017

@drcrallen yes
