path: root/src/libstrongswan/plugins/chapoly/chapoly_drv_ssse3.c
Commit message (author, date, files changed, lines -removed/+added)
* Use standard unsigned integer types (Andreas Steffen, 2016-03-24, 1 file, -33/+33)
* chapoly: Process two Poly1305 blocks in parallel in SSSE3 driver (Martin Willi, 2015-07-12, 1 file, -85/+291)
  By using a derived key r^2, we can unroll the loop and make slightly better use of SIMD instructions, which improves performance. Overall ChaCha20-Poly1305 performance increases by ~12%. Converting integers to/from our 5-word representation in SSE does not seem to pay off, so we work on individual words.
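  The r^2 idea can be illustrated with a small sketch. This is not the driver's code: it assumes a toy modulus (2^31 - 1) instead of Poly1305's 2^130 - 5 so that plain 64-bit arithmetic suffices, and the names TOY_P, mulmod, eval_serial and eval_pairs are hypothetical. It shows that two serial steps h = (h + m) * r collapse into one step h = (h + m0) * r^2 + m1 * r, whose two multiplications are independent of each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy modulus (assumption): 2^31 - 1, so every product fits in 64 bits.
 * Real Poly1305 works modulo 2^130 - 5 on multi-word integers. */
#define TOY_P 2147483647ULL

static uint64_t mulmod(uint64_t a, uint64_t b)
{
	return (a % TOY_P) * (b % TOY_P) % TOY_P;
}

/* One block per iteration: h = (h + m[i]) * r */
static uint64_t eval_serial(const uint32_t *m, size_t n, uint64_t r)
{
	uint64_t h = 0;
	size_t i;

	for (i = 0; i < n; i++)
	{
		h = mulmod(h + m[i], r);
	}
	return h;
}

/* Two blocks per iteration: h = (h + m[i]) * r^2 + m[i+1] * r.
 * The two multiplications do not depend on each other. */
static uint64_t eval_pairs(const uint32_t *m, size_t n, uint64_t r)
{
	uint64_t r2 = mulmod(r, r), h = 0;
	size_t i;

	assert(n % 2 == 0);	/* this sketch ignores a trailing odd block */
	for (i = 0; i < n; i += 2)
	{
		h = (mulmod(h + m[i], r2) + mulmod(m[i + 1], r)) % TOY_P;
	}
	return h;
}
```

  Both functions return the same value for any even number of blocks; the paired form simply exposes two multiplications per iteration that can be scheduled side by side.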
* chapoly: Process four ChaCha20 blocks in parallel in SSSE3 driver (Martin Willi, 2015-07-12, 1 file, -16/+207)
  As we don't have to shuffle the state in each ChaCha round, overall performance for ChaCha20-Poly1305 increases by ~40%.
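  Why the shuffle disappears can be sketched in portable C. This is an illustration under assumptions, not the driver's SSSE3 code: the vec4 type and all function names are hypothetical, and the input words are arbitrary rather than a real ChaCha20 state. Each "vector" holds the same state word from four blocks, so a quarter round only ever combines whole vectors and the diagonal rounds become a matter of picking different indices.

```c
#include <assert.h>
#include <stdint.h>

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

/* Scalar quarter round on four words of a single state */
static void qr(uint32_t *x, int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] ^= x[a]; x[d] = ROTL32(x[d], 16);
	x[c] += x[d]; x[b] ^= x[c]; x[b] = ROTL32(x[b], 12);
	x[a] += x[b]; x[d] ^= x[a]; x[d] = ROTL32(x[d], 8);
	x[c] += x[d]; x[b] ^= x[c]; x[b] = ROTL32(x[b], 7);
}

/* Reference: column and diagonal rounds on one state (no final add) */
static void permute_one(uint32_t x[16], int rounds)
{
	int i;

	assert(rounds % 2 == 0);
	for (i = 0; i < rounds; i += 2)
	{
		qr(x, 0, 4, 8, 12); qr(x, 1, 5, 9, 13);
		qr(x, 2, 6, 10, 14); qr(x, 3, 7, 11, 15);
		qr(x, 0, 5, 10, 15); qr(x, 1, 6, 11, 12);
		qr(x, 2, 7, 8, 13); qr(x, 3, 4, 9, 14);
	}
}

/* Stand-in for a SIMD register: one state word from four blocks */
typedef struct { uint32_t lane[4]; } vec4;

static vec4 add4(vec4 a, vec4 b)
{
	int l;

	for (l = 0; l < 4; l++) a.lane[l] += b.lane[l];
	return a;
}

static vec4 xor_rotl4(vec4 a, vec4 b, int n)
{
	int l;

	for (l = 0; l < 4; l++)
	{
		uint32_t v = a.lane[l] ^ b.lane[l];
		a.lane[l] = ROTL32(v, n);
	}
	return a;
}

/* Quarter round on four blocks at once; only indices change, no shuffles */
static void qr4(vec4 *x, int a, int b, int c, int d)
{
	x[a] = add4(x[a], x[b]); x[d] = xor_rotl4(x[d], x[a], 16);
	x[c] = add4(x[c], x[d]); x[b] = xor_rotl4(x[b], x[c], 12);
	x[a] = add4(x[a], x[b]); x[d] = xor_rotl4(x[d], x[a], 8);
	x[c] = add4(x[c], x[d]); x[b] = xor_rotl4(x[b], x[c], 7);
}

static void permute_four(vec4 x[16], int rounds)
{
	int i;

	assert(rounds % 2 == 0);
	for (i = 0; i < rounds; i += 2)
	{
		qr4(x, 0, 4, 8, 12); qr4(x, 1, 5, 9, 13);
		qr4(x, 2, 6, 10, 14); qr4(x, 3, 7, 11, 15);
		qr4(x, 0, 5, 10, 15); qr4(x, 1, 6, 11, 12);
		qr4(x, 2, 7, 8, 13); qr4(x, 3, 4, 9, 14);
	}
}

/* Returns 1 if four scalar runs that differ only in word 12 (the block
 * counter position) match the lane-wise run */
static int lanes_match_scalar(void)
{
	uint32_t ref[4][16];
	vec4 x[16];
	int l, i;

	for (l = 0; l < 4; l++)
	{
		for (i = 0; i < 16; i++)
		{
			ref[l][i] = 0x9e3779b9u * (uint32_t)(i + 1); /* arbitrary input */
		}
		ref[l][12] = (uint32_t)l;
		for (i = 0; i < 16; i++) x[i].lane[l] = ref[l][i];
		permute_one(ref[l], 20);
	}
	permute_four(x, 20);
	for (l = 0; l < 4; l++)
	{
		for (i = 0; i < 16; i++)
		{
			if (x[i].lane[l] != ref[l][i]) return 0;
		}
	}
	return 1;
}
```

  In a single-block layout, where one register holds a row of the 4x4 state, the diagonal rounds need the rows rotated against each other; the word-per-register layout above avoids that at the cost of transposing data on input and output.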
* chapoly: Add an SSSE3 based driver (Martin Willi, 2015-06-29, 1 file, -0/+470)
  We always build the driver on x86/x64, but enable it only if SSSE3 support is detected at runtime. Poly1305 uses multiplications of 32-bit operands yielding a 64-bit result, two of which can be done in parallel in SSE. This is minimally faster than multiplication with 64-bit operands, and also works on 32-bit builds that lack an __int128 result type. On a 32-bit architecture, this is more than twice as fast as the portable driver, and on 64-bit it is ~30% faster.
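  The 32-bit multiplication scheme can be sketched in portable C. This is a hedged illustration, not the driver's code: it assumes the common representation of a 130-bit value as five 26-bit words (least significant first), and the names MASK26, mul32x32_64 and poly_mul are hypothetical. Every product takes two 32-bit operands and yields a 64-bit result, which is the operation SSE2's PMULUDQ (intrinsic _mm_mul_epu32) performs twice per instruction; no 128-bit type is needed.

```c
#include <assert.h>
#include <stdint.h>

#define MASK26 0x3ffffffu

/* 32x32 -> 64 bit product, the only multiplication this scheme needs */
static uint64_t mul32x32_64(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* h = h * r mod 2^130 - 5, result only partially reduced.
 * Words above 2^130 wrap around multiplied by 5, hence s[i] = 5 * r[i]. */
static void poly_mul(uint32_t h[5], const uint32_t r[5])
{
	uint32_t s[5];
	uint64_t d[5], c = 0;
	int i;

	for (i = 0; i < 5; i++)
	{
		assert(r[i] <= MASK26);
		s[i] = r[i] * 5;	/* fits: 5 * 2^26 < 2^32 */
	}
	d[0] = mul32x32_64(h[0], r[0]) + mul32x32_64(h[1], s[4]) +
	       mul32x32_64(h[2], s[3]) + mul32x32_64(h[3], s[2]) +
	       mul32x32_64(h[4], s[1]);
	d[1] = mul32x32_64(h[0], r[1]) + mul32x32_64(h[1], r[0]) +
	       mul32x32_64(h[2], s[4]) + mul32x32_64(h[3], s[3]) +
	       mul32x32_64(h[4], s[2]);
	d[2] = mul32x32_64(h[0], r[2]) + mul32x32_64(h[1], r[1]) +
	       mul32x32_64(h[2], r[0]) + mul32x32_64(h[3], s[4]) +
	       mul32x32_64(h[4], s[3]);
	d[3] = mul32x32_64(h[0], r[3]) + mul32x32_64(h[1], r[2]) +
	       mul32x32_64(h[2], r[1]) + mul32x32_64(h[3], r[0]) +
	       mul32x32_64(h[4], s[4]);
	d[4] = mul32x32_64(h[0], r[4]) + mul32x32_64(h[1], r[3]) +
	       mul32x32_64(h[2], r[2]) + mul32x32_64(h[3], r[1]) +
	       mul32x32_64(h[4], r[0]);

	/* carry propagation back into 26-bit words */
	for (i = 0; i < 5; i++)
	{
		d[i] += c;
		c = d[i] >> 26;
		h[i] = (uint32_t)(d[i] & MASK26);
	}
	/* the carry out of the top word wraps around as 5 * carry */
	c = c * 5 + h[0];
	h[0] = (uint32_t)(c & MASK26);
	h[1] += (uint32_t)(c >> 26);
}
```

  For example, multiplying 2^26 by 2^104 gives 2^130, which this routine reduces to 5. The products within each d[i] are independent of each other, which is what allows pairing them up in SSE registers.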