author | Tobias Brunner <tobias@strongswan.org> | 2011-07-21 16:17:08 +0200
committer | Tobias Brunner <tobias@strongswan.org> | 2011-07-21 16:17:08 +0200
commit | 4f3ca916c50b0e0cddc170cc80012c71497f368c (patch)
tree | c813c05f76dbb15ef4749afda4b8e0df6bcb11f7 /man
parent | d33f6f7dba24a0cf9d34f93d0d79543d41abb72a (diff)
download | strongswan-4f3ca916c50b0e0cddc170cc80012c71497f368c.tar.bz2, strongswan-4f3ca916c50b0e0cddc170cc80012c71497f368c.tar.xz
Documentation about job priorities added to man page.
Also includes docs about IKE_SA_INIT dropping.
Diffstat (limited to 'man')
-rw-r--r-- | man/strongswan.conf.5.in | 160
1 file changed, 159 insertions(+), 1 deletion(-)
diff --git a/man/strongswan.conf.5.in b/man/strongswan.conf.5.in
index da05eb1af..9fd4f6dbc 100644
--- a/man/strongswan.conf.5.in
+++ b/man/strongswan.conf.5.in
@@ -151,6 +151,9 @@ Section to define file loggers, see LOGGER CONFIGURATION
 .BR charon.flush_auth_cfg " [no]"
 .TP
+.BR charon.half_open_timeout " [30]"
+Timeout in seconds for connecting IKE_SAs (also see IKE_SA_INIT DROPPING).
+.TP
 .BR charon.hash_and_url " [no]"
 Enable hash and URL support
 .TP
@@ -166,6 +169,14 @@ Size of the IKE_SA hash table
 .BR charon.inactivity_close_ike " [no]"
 Whether to close IKE_SA if the only CHILD_SA closed due to inactivity
 .TP
+.BR charon.init_limit_half_open " [0]"
+Limit new connections based on the current number of half open IKE_SAs (see
+IKE_SA_INIT DROPPING).
+.TP
+.BR charon.init_limit_job_load " [0]"
+Limit new connections based on the number of jobs currently queued for
+processing (see IKE_SA_INIT DROPPING).
+.TP
 .BR charon.install_routes " [yes]"
 Install routes into a separate routing table for established IPsec tunnels
 .TP
@@ -502,6 +513,10 @@ Check daemon, libstrongswan and plugin integrity at startup
 .BR libstrongswan.leak_detective.detailed " [yes]"
 Includes source file names and line numbers in leak detective output
 .TP
+.BR libstrongswan.processor.priority_threads
+Subsection to configure the number of reserved threads per priority class,
+see JOB PRIORITY MANAGEMENT
+.TP
 .BR libstrongswan.x509.enforce_critical " [yes]"
 Discard certificates with unsupported or unknown critical extensions
 .SS libstrongswan.plugins subsection
@@ -538,7 +553,7 @@ Command to be sent to the Test IMV
 .BR libimcv.plugins.imc_test.retry " [no]"
 Do a handshake retry
 .TP
-.BR libimcv.plugins.imc_test.retry_command 
+.BR libimcv.plugins.imc_test.retry_command
 Command to be sent to the Test IMV in the handshake retry
 .TP
 .BR libimcv.plugins.imv_test.rounds " [0]"
@@ -814,6 +829,149 @@ Also include sensitive material in dumps, e.g.
 keys
 }
 .EE
+.SH JOB PRIORITY MANAGEMENT
+Some operations in the IKEv2 daemon charon are currently implemented
+synchronously and blocking. Two examples of such operations are communication
+with a RADIUS server via EAP-RADIUS, or fetching CRL/OCSP information during
+certificate chain verification. Under high load conditions, the thread pool
+may run out of available threads, and some more important jobs, such as
+liveness checking, may not get executed in time.
+.PP
+To prevent thread starvation in such situations, job priorities were
+introduced. The job processor reserves some threads for higher priority jobs;
+these threads are not available for lower priority, blocking jobs.
+.SS Implementation
+Currently four priorities have been defined, and they are used in charon as
+follows:
+.TP
+.B CRITICAL
+Priority for long-running dispatcher jobs.
+.TP
+.B HIGH
+INFORMATIONAL exchanges, as used by liveness checking (DPD).
+.TP
+.B MEDIUM
+Everything not HIGH/LOW, including IKE_SA_INIT processing.
+.TP
+.B LOW
+IKE_AUTH message processing. RADIUS communication and CRL fetching block here.
+.PP
+Although IKE_SA_INIT processing is computationally expensive, it is explicitly
+assigned to the MEDIUM class. This allows charon to do the DH exchange while
+other threads are blocked in IKE_AUTH. To prevent the daemon from accepting
+more IKE_SA_INIT requests than it can handle, use IKE_SA_INIT DROPPING.
+.PP
+The thread pool processes jobs strictly by priority, meaning it will consume
+all higher priority jobs before looking for ones with lower priority. Further,
+it reserves threads for certain priorities. A priority class having reserved
+.I n
+threads will always have
+.I n
+threads available for this class (either currently processing a job, or
+waiting for one).
+.SS Configuration
+To ensure that there are always enough threads available for higher priority
+tasks, threads must be reserved for each priority class.
+.TP
+.BR libstrongswan.processor.priority_threads.critical " [0]"
+Threads reserved for CRITICAL priority class jobs
+.TP
+.BR libstrongswan.processor.priority_threads.high " [0]"
+Threads reserved for HIGH priority class jobs
+.TP
+.BR libstrongswan.processor.priority_threads.medium " [0]"
+Threads reserved for MEDIUM priority class jobs
+.TP
+.BR libstrongswan.processor.priority_threads.low " [0]"
+Threads reserved for LOW priority class jobs
+.PP
+Let's consider the following configuration:
+.PP
+.EX
+  libstrongswan {
+    processor {
+      priority_threads {
+        high = 1
+        medium = 4
+      }
+    }
+  }
+.EE
+.PP
+With this configuration, one thread is reserved for HIGH priority tasks. As
+currently only liveness checking and stroke message processing are done with
+high priority, one or two threads should be sufficient.
+.PP
+The MEDIUM class mostly processes non-blocking jobs. Unless your setup
+experiences many blocks in locks while accessing shared resources, one or two
+threads per CPU core is fine.
+.PP
+It is usually not required to reserve threads for CRITICAL jobs. Jobs in this
+class rarely return and do not release their thread to the pool.
+.PP
+The remaining threads are available for LOW priority jobs. Reserving threads
+for this class does not make sense (until we have an even lower priority).
+.SS Monitoring
+To see what the threads are actually doing, invoke
+.IR "ipsec statusall" .
+Under high load, something like this will show up:
+.PP
+.EX
+  worker threads: 2 of 32 idle, 5/1/2/22 working,
+  job queue: 0/0/1/149, scheduled: 198
+.EE
+.PP
+Of the 32 worker threads,
+.IP 2
+are currently idle.
+.IP 5
+are running CRITICAL priority jobs (dispatching from sockets, etc.).
+.IP 1
+is currently handling a HIGH priority job. This is actually the thread
+currently providing this information via stroke.
+.IP 2
+are handling MEDIUM priority jobs, likely IKE_SA_INIT or CREATE_CHILD_SA
+messages.
+.IP 22
+are handling LOW priority jobs, probably waiting for an EAP-RADIUS response
+while processing IKE_AUTH messages.
+.PP
+The job queue load shows how many jobs are queued for each priority, ready
+for execution. The single MEDIUM priority job will get executed immediately,
+as we have two spare threads reserved for MEDIUM class jobs.
+
+.SH IKE_SA_INIT DROPPING
+If a responder receives more connection requests per second than it can
+handle, it does not make sense to accept more IKE_SA_INIT messages. If they
+are queued but can't get processed in time, an answer might be sent after the
+client has already given up and restarted its connection setup. This
+additionally increases the load on the responder.
+.PP
+To limit the responder load resulting from new connection attempts, the
+daemon can drop IKE_SA_INIT messages just after reception. There are two
+mechanisms to decide if this should happen, configured with the following
+options:
+.TP
+.BR charon.init_limit_half_open " [0]"
+Limit based on the number of half open IKE_SAs. Half open IKE_SAs are SAs in
+connecting state that are not yet established.
+.TP
+.BR charon.init_limit_job_load " [0]"
+Limit based on the number of jobs currently queued for processing (sum over
+all job priorities).
+.PP
+The second limit includes load from other jobs, such as rekeying. Choosing a
+good value is difficult and depends on the hardware and expected load.
+.PP
+The first limit is simpler to calculate, but includes the load from new
+connections only. If your responder is capable of negotiating 100 tunnels/s,
+you might set this limit to 1000. The daemon will then drop new connection
+attempts if generating a response would require more than 10 seconds. If you
+are allowing for a maximum response time of more than 30 seconds, consider
+adjusting the timeout for connecting IKE_SAs
+.RB ( charon.half_open_timeout ).
+A responder, by default, deletes an IKE_SA if the initiator does not establish
+it within 30 seconds. Under high load, a higher value might be required.
+
 .SH LOAD TESTS
 To do stability testing and performance optimizations, the IKEv2 daemon
 charon provides the load-tester plugin. This plugin allows to setup thousands of
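
As a worked example of the IKE_SA_INIT DROPPING sizing described in this
commit (this fragment is not part of the commit itself), a strongswan.conf
sketch for a responder assumed to negotiate roughly 100 tunnels/s might look
like this; all numeric values are illustrative assumptions, only the option
names come from the documentation above:

```
# Sketch, not from the commit: values assume a responder that can
# establish roughly 100 tunnels/s; tune to your hardware and load.
charon {
    # Drop new IKE_SA_INIT messages once 1000 IKE_SAs are half open,
    # i.e. once the backlog would take about 10 s to work off (1000 / 100).
    init_limit_half_open = 1000

    # Alternatively/additionally, drop when the total job queue (all
    # priorities) exceeds this size; 0 disables the limit.
    init_limit_job_load = 0

    # If clients are allowed more than the default 30 s to complete the
    # handshake, raise the timeout for connecting IKE_SAs accordingly.
    half_open_timeout = 60
}
```

With init_limit_half_open as the only active limit, the drop decision depends
solely on new connection attempts, which (per the text above) makes the value
easier to reason about than the job-load limit.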