Added AVX1 support for salsa and chacha rounds #1
kangaderoo wants to merge 7 commits into ghostlander:master
The code is written in C for better maintainability. Assembly derived from these files might increase speed slightly. The current speed increase over the SSE routines is about 10%.
Make the config work with the new files
Thanks, I plan to add the AVX/XOP assembly code in the future and may use your inline assembly as a reference. SSE2 4-way is also going to be improved.
I was kind of wondering where your speed increase from the 4-way is ...
The original CpuMiner had a scrypt 3-way and a SHA256 4-way, resulting ...
Due to the mixing behavior (4 times a 4x4 matrix) of neo-scrypt it looks ...
Unfortunately my development environment doesn't have AVX2, but the ...
John Doering wrote on 2/17/2015 at 5:14 PM:
Increase hashing speed by running 3 calculations in parallel. Eliminate SIMD latency through smart instruction sequencing. ~25% speed increase observed.
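To illustrate the sequencing idea, here is a minimal sketch. It is not the code from this pull request. It uses plain scalar `uint32_t` words instead of AVX registers, and the names `chacha_qr` and `chacha_qr_x3` are made up for the example. Each step of a ChaCha quarter-round depends on the previous step, so a single state leaves the execution units waiting. Issuing the same step for three independent states back to back gives the CPU unrelated work to overlap. The same layout would presumably apply with one SIMD register per state row.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

/* Reference: ChaCha quarter-round on the four words (a, b, c, d) of one state. */
static void chacha_qr(uint32_t s[4]) {
    s[0] += s[1]; s[3] ^= s[0]; s[3] = rotl32(s[3], 16);
    s[2] += s[3]; s[1] ^= s[2]; s[1] = rotl32(s[1], 12);
    s[0] += s[1]; s[3] ^= s[0]; s[3] = rotl32(s[3], 8);
    s[2] += s[3]; s[1] ^= s[2]; s[1] = rotl32(s[1], 7);
}

/* One add/xor/rotate step, issued for states x, y and z in turn.
 * The three copies of each operation do not depend on each other,
 * so they can execute while the previous result is still in flight. */
#define QR_STEP3(a, b, d, r)                                   \
    do {                                                       \
        x[a] += x[b];  y[a] += y[b];  z[a] += z[b];            \
        x[d] ^= x[a];  y[d] ^= y[a];  z[d] ^= z[a];            \
        x[d] = rotl32(x[d], r);                                \
        y[d] = rotl32(y[d], r);                                \
        z[d] = rotl32(z[d], r);                                \
    } while (0)

/* Hypothetical 3-way variant: same result as calling chacha_qr()
 * on x, y and z separately, but with the steps interleaved. */
static void chacha_qr_x3(uint32_t x[4], uint32_t y[4], uint32_t z[4]) {
    QR_STEP3(0, 1, 3, 16);
    QR_STEP3(2, 3, 1, 12);
    QR_STEP3(0, 1, 3, 8);
    QR_STEP3(2, 3, 1, 7);
}
```

Calling `chacha_qr_x3(x, y, z)` leaves each array in the same state as three separate `chacha_qr` calls would. Only the order of the instructions changes. Whether this yields a gain like the ~25% reported above depends on the compiler and the target CPU.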