| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
This allows us to work with deterministic values for testing purposes.
|
|
|
|
|
|
|
|
|
|
| |
The performance impact is not measurable, as the compiler loads these variables
into xmm registers in unrolled loops anyway.
However, we avoid loading these sensitive keys onto the stack. This happens for
larger key schedules, where the register count is insufficient. If that key
material is not on the stack, we can avoid wiping it explicitly after
crypto operations.
|
|
|
|
|
|
| |
While the members are aligned within the struct as required, on 32-bit
platforms the allocator aligns the structures themselves to 8 bytes only. This
results in misaligned struct members and invalid memory accesses.
|
| |
|
|
|
|
|
|
| |
If the assertion contains a modulo (%) operation, test_fail_msg() interprets
it as a printf() format specifier. Instead, pass the assertion string as an
argument for an explicit "%s" in the format string.
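The pitfall can be sketched as follows. The helper fail_msg() and the macro
SKETCH_ASSERT_MSG() are hypothetical stand-ins, as the real test_fail_msg()
signature is not shown here; only the "%s" technique is taken from the message
above.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for test_fail_msg(): formats a failure message
 * into buf using printf() semantics and returns the formatted length. */
static int fail_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int written;

	va_start(args, fmt);
	written = vsnprintf(buf, len, fmt, args);
	va_end(args);
	return written;
}

/* Unsafe variant (not used): fail_msg(buf, len, #x) would treat a '%' in
 * the stringified assertion as the start of a conversion specifier.
 *
 * Safe variant: the stringified assertion is passed as data for "%s", so
 * its contents are never interpreted as a format string. */
#define SKETCH_ASSERT_MSG(buf, len, x) fail_msg(buf, len, "%s", #x)
```

With this, an assertion such as `len % 16 == 0` is reported literally instead
of being parsed as a format.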
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While we could use posix_memalign(3), that is not fully portable. Further, it
might be difficult on some platforms to properly catch it in leak-detective,
which results in invalid free()s when releasing such memory.
Instead, we use a simple wrapper that allocates a larger chunk and saves the
padding size in the padding preceding the returned pointer. This requires that
such memory is released using a dedicated function.
To reduce the risk of invalid free() when working on corrupted data, we fill up
all the padding with the padding length, and verify it during free_align().
|
|
|
|
|
| |
While associated data is usually not that large, in some specific cases
this can bring a significant performance boost.
|
|
|
|
| |
Increases performance by another ~30%.
|
|
|
|
| |
Increases overall performance by ~25%.
|
|
|
|
|
| |
This yields only about a 5% increase in performance, but allows us to
improve further.
|
| |
|
|
|
|
|
| |
Compared to the cmac plugin using AESNI-CBC as backend, this improves
performance of AES-CMAC by ~45%.
|
|
|
|
|
| |
Compared to the xcbc plugin using AESNI-CBC as backend, this improves
performance of AES-XCBC by ~45%.
|
|
|
|
| |
Due to the serial nature of the CBC mac, this brings only a marginal speedup.
|
| |
|
|
|
|
|
|
|
| |
CTR can be parallelized, and we do so by queueing instructions to the processor
pipeline. While we have enough registers for 128-bit decryption, the register
count is insufficient to hold all variables with larger key sizes. Nonetheless,
4-way parallelism is faster, by ~10% to ~25% depending on key size.
|
|
|
|
|
| |
This allows us to unroll loops and hold the key schedule in local (register)
variables. This brings an impressive speedup of ~45%.
|
| |
|
|
|
|
|
|
|
| |
CBC decryption can be parallelized, and we do so by queueing instructions
to the processor pipeline. While we have enough registers for 128-bit
decryption, the register count is insufficient to hold all variables with
larger key sizes. Nonetheless, 4-way parallelism is faster, by roughly 8%.
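Why decryption parallelizes while encryption does not can be sketched with a
toy block transform. toy_encrypt() and toy_decrypt() are placeholders and NOT
a real cipher; all names here are hypothetical, and an actual implementation
would issue AES-NI instructions instead.

```c
#include <stddef.h>
#include <stdint.h>

#define BLK 16

/* Toy invertible block transform, for illustration only; NOT a cipher. */
static void toy_encrypt(const uint8_t *k, const uint8_t *in, uint8_t *out)
{
	for (int i = 0; i < BLK; i++)
	{
		out[i] = (uint8_t)((in[i] ^ k[i]) + i);
	}
}

static void toy_decrypt(const uint8_t *k, const uint8_t *in, uint8_t *out)
{
	for (int i = 0; i < BLK; i++)
	{
		out[i] = (uint8_t)((uint8_t)(in[i] - i) ^ k[i]);
	}
}

static void xor_block(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
	for (int i = 0; i < BLK; i++)
	{
		dst[i] = a[i] ^ b[i];
	}
}

/* Encryption is serial: each block needs the previous ciphertext block. */
static void cbc_encrypt(const uint8_t *k, const uint8_t *iv,
						const uint8_t *in, uint8_t *out, size_t blocks)
{
	const uint8_t *prev = iv;
	uint8_t tmp[BLK];

	for (size_t i = 0; i < blocks; i++)
	{
		xor_block(tmp, in + i * BLK, prev);
		toy_encrypt(k, tmp, out + i * BLK);
		prev = out + i * BLK;
	}
}

/* Decryption, 4 blocks per iteration; in and out must not overlap. */
static void cbc_decrypt4(const uint8_t *k, const uint8_t *iv,
						 const uint8_t *in, uint8_t *out, size_t blocks)
{
	uint8_t d[4][BLK];
	size_t i = 0, j;

	for (; i + 4 <= blocks; i += 4)
	{
		/* four block decryptions without data dependencies between them */
		toy_decrypt(k, in + (i + 0) * BLK, d[0]);
		toy_decrypt(k, in + (i + 1) * BLK, d[1]);
		toy_decrypt(k, in + (i + 2) * BLK, d[2]);
		toy_decrypt(k, in + (i + 3) * BLK, d[3]);
		for (j = 0; j < 4; j++)
		{
			xor_block(out + (i + j) * BLK, d[j],
					  (i + j) ? in + (i + j - 1) * BLK : iv);
		}
	}
	for (; i < blocks; i++)
	{	/* remaining blocks, one at a time */
		toy_decrypt(k, in + i * BLK, d[0]);
		xor_block(out + i * BLK, d[0], i ? in + (i - 1) * BLK : iv);
	}
}
```

The four independent calls in the main loop are what a pipelined processor can
overlap; the chained `prev` in cbc_encrypt() rules that out for encryption.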
|
|
|
|
|
|
| |
This allows us to unroll loops, and use local (register) variables for the
key schedule. This improves performance slightly for encryption, but a lot
for reorderable decryption (>30%).
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
We were missing test vectors with 192/256-bit keys for ICV8/12, and should
also have some with larger associated data chunks.
|
|
|
|
|
|
| |
We don't have any vectors where plain or associated data is not a multiple of
the block size, although bugs are likely to hide there. Also, we are missing
some ICV12 test vectors using 128- and 192-bit key sizes.
|
|
|
|
|
|
| |
We previously didn't pass the key size during algorithm registration, but this
resulted in benchmarking with the "default" key size the crypter uses when
passing 0 as key size.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Currently, x86/x64 is supported via cpuid() for some common features.
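A minimal sketch of such a detection, assuming GCC or Clang and their
<cpuid.h> helper __get_cpuid(). The flag names and sketch_cpu_features() are
hypothetical; on other compilers or architectures the function simply reports
no features.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
# include <cpuid.h>
# define SKETCH_HAVE_CPUID 1
#endif

/* Hypothetical feature flags, names are illustrative only */
enum {
	SKETCH_CPU_SSE2   = 1 << 0,
	SKETCH_CPU_PCLMUL = 1 << 1,
	SKETCH_CPU_AESNI  = 1 << 2,
};

/* Query a few common x86 features; returns 0 where cpuid is unavailable */
static unsigned sketch_cpu_features(void)
{
	unsigned features = 0;
#ifdef SKETCH_HAVE_CPUID
	unsigned a, b, c, d;

	/* leaf 1 reports feature bits in ECX and EDX */
	if (__get_cpuid(1, &a, &b, &c, &d))
	{
		if (d & (1u << 26))
		{
			features |= SKETCH_CPU_SSE2;	/* EDX bit 26 */
		}
		if (c & (1u << 1))
		{
			features |= SKETCH_CPU_PCLMUL;	/* ECX bit 1 */
		}
		if (c & (1u << 25))
		{
			features |= SKETCH_CPU_AESNI;	/* ECX bit 25 */
		}
	}
#endif
	return features;
}
```

A caller could then guard an accelerated code path with a check such as
`sketch_cpu_features() & SKETCH_CPU_AESNI`.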
|
|
|
|
|
|
| |
We previously checked for older library versions without locking support at
all. But newer libraries can be built in single-threading mode as well, where
we have to care about the locking.
|
| |
|
|
|
|
|
|
| |
Real AEADs directly provide a suitable IV generator, but traditional crypters
do not. For some (stream) ciphers, we should use sequential IVs, for which
we pass an appropriate generator to the AEAD wrapper.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
With OpenSSL commit 929b0d70c19f60227f89fac63f22a21f21950823 setting an empty
key fails if no previous key has been set on that HMAC.
In 9138f49e we explicitly added the check we remove now, as HMAC_Update()
might crash if HMAC_Init_ex() has not been called yet. To avoid that, we
set and check a flag locally to let any get_mac() call fail if set_key() has
not yet been called.
|
|
|
|
|
|
|
|
|
|
|
| |
sem_init() is deprecated on OS X, and it actually fails with ENOSYS. Using our
wrapped semaphore object is not an option, as it depends on thread cleanup
that is not available at this stage.
It is unclear why startup synchronization is required, as we can allocate the
thread ID just before creating the pthread. There is a chance that we allocate
a thread ID for a thread that fails to create, but the risk and consequences
are negligible.
|
|
|
|
|
| |
As we make no use of htonl() and friends, this is not only unneeded, but
actually prevents a Windows build.
|
|
|
|
|
| |
When building with C11 support, TIME_UTC is used for timespec_get() and
defined in <time.h>. Undefine TIME_UTC for our own internal use in asn1.c.
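The clash can be sketched as follows. The enum and its values are hypothetical
placeholders for the internal constants, which are not shown here; only the
undefine step is taken from the message above.

```c
#include <time.h>

/* C11 <time.h> may define TIME_UTC as a macro for timespec_get(); drop
 * it so the identifier can be reused for internal purposes. */
#ifdef TIME_UTC
# undef TIME_UTC
#endif

/* Hypothetical internal constants; names and values are illustrative */
enum {
	TIME_UTC = 0,
	TIME_GENERALIZED = 2,
};
```

Without the #undef, the macro would be expanded inside the enum and the
declaration would fail to compile on C11 toolchains.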
|
| |
|
|
|
|
|
| |
This was implicitly done by the seed length check before 58dda5d6, but we
now require an explicit check to avoid that unsupported use.
|