Solaris TCP/IP Kernel Settings That Affect Failover Transitions

This section discusses Solaris TCP/IP kernel settings that affect how long a Solaris client takes to fail over from one server to the next in an application cluster when the server is forcibly removed from the network.

Background

After implementing application server clustering on Solaris, some users have experienced long delays (as much as 60 minutes) before the client application fails over from one server to the next. The delay appears when someone tests failover by pulling the active server's network cable. At first, the client application appears frozen and unusable, which may even convince administrators that the application suite itself causes the delay.

Although the application is single-threaded and blocks while waiting for a response from the server, the actual culprit is the TCP/IP kernel tuning. The default number of retries and the wait time between retries are the significant contributors to the duration of this wait. For an explanation of TCP window size and acknowledgements, see Douglas Comer's work [1].
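As a rough illustration of how the retry settings bound the wait, the sketch below models retransmission as a timer that starts at tcp_rexmit_interval_initial, doubles after each unanswered retransmit up to tcp_rexmit_interval_max, and gives up once the accumulated wait reaches tcp_ip_abort_interval. This is a simplified model and an assumption, not the exact Solaris algorithm, which also depends on measured round-trip times. The function name rexmit_schedule is hypothetical, and the arithmetic syntax assumes a POSIX shell such as ksh or bash rather than the older Bourne /sbin/sh. The input values are taken from the "default" and "start" cases of the tuning script later in this section.

```shell
#!/bin/sh
# rexmit_schedule INITIAL MAX ABORT   (all values in milliseconds)
# Simplified model (assumption): the retransmit timer doubles after each
# unanswered attempt, is capped at MAX, and the connection is aborted once
# the accumulated wait reaches ABORT.
rexmit_schedule() {
    interval=$1
    max=$2
    abort=$3
    elapsed=0
    attempts=0
    while [ "$elapsed" -lt "$abort" ]; do
        elapsed=$((elapsed + interval))
        attempts=$((attempts + 1))
        interval=$((interval * 2))
        if [ "$interval" -gt "$max" ]; then
            interval=$max
        fi
    done
    echo "$attempts retransmits, about $((elapsed / 1000)) s before abort"
}

rexmit_schedule 3000 60000 480000   # values from the script's "default" case
rexmit_schedule 500 5000 20000      # values from the script's "start" case
```

Under this model the kernel defaults keep a blocked client waiting for several minutes per connection, while the tuned values give up in well under half a minute.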

In contrast to the cable-pulling experiment, this application’s clients fail over to the next server in the cluster without much delay when testers kill the server's application server process. This is because the TCP/IP stack is still active on the “failed” server and tells the client that the ports the application server had open and listening are now closed. The client's TCP/IP stack therefore skips the TCP retry logic and reports back to the client application that the server is no longer responding. The Oware client application then attempts to connect to the application server cluster, and another server in the cluster responds to the connection request.

Sample Implementation

The following sample implementation is provided as a guideline. Changes to the Solaris TCP kernel settings must be reviewed by the Solaris administrators and network engineers responsible for the production network to ensure that these sample changes do not negatively impact their existing network design criteria. Many of these attributes were updated based on the reference material from Jens Vöckler [2].
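To support that review, it can help to record the values currently in effect before changing anything. The sketch below is one way to do that, assuming the same /usr/sbin/ndd read syntax that the tuning script uses in its "show" case. The function name snapshot_tcp_params, the NDD override variable, and the output path in the usage comment are hypothetical.

```shell
#!/bin/sh
# NDD can be overridden (for example, for a dry run on a non-Solaris host);
# /usr/sbin/ndd is the path used by the tuning script.
NDD=${NDD:-/usr/sbin/ndd}

# snapshot_tcp_params PARAM...
# Prints one "name value" line for each /dev/tcp parameter named.
snapshot_tcp_params() {
    for param in "$@"; do
        value=`$NDD /dev/tcp "$param"`
        echo "$param $value"
    done
}

# Example (run as root on the Solaris host before applying any changes;
# the output file name is only an illustration):
# snapshot_tcp_params tcp_ip_abort_interval tcp_ip_abort_cinterval \
#     tcp_rexmit_interval_initial tcp_rexmit_interval_min \
#     tcp_rexmit_interval_max > /var/tmp/tcp-params.before
```

A saved snapshot gives reviewers the actual starting values for the host, which may differ from the hard-coded values in the script's "default" case.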

The following script tunes the Solaris TCP kernel:

#!/sbin/sh
#
# Tuning Script for TCP Applications
# Version 1.4
# Originally sourced from Colin Bitterfield
# Excerpts from various sources
#
# Additional updates to support Dorado Software Redcell Client reconnect
#
# To install this script:
#   Copy the file to /etc/rc2.d as S99tcp-performance
#   chown root:sys S99tcp-performance
#   chmod 744 S99tcp-performance
#
VERSION="1.4"
######################################################################

case "$1" in
start)
    echo "TCP Performance script ${VERSION}"
    /usr/sbin/ndd -set /dev/ip ip_ire_arp_interval 60000
    # /usr/sbin/ndd -set /dev/arp arp_cleanup_interval 6000
    /usr/sbin/ndd -set /dev/arp arp_cleanup_interval 30000
    /usr/sbin/ndd -set /dev/tcp ip_ignore_redirect 1
    /usr/sbin/ndd -set /dev/tcp tcp_deferred_ack_interval 1
    /usr/sbin/ndd -set /dev/tcp tcp_conn_grace_period 500
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q 8096
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q0 8096
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_min 1
    /usr/sbin/ndd -set /dev/tcp tcp_cwnd_max 65534
    /usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 16000
    # /usr/sbin/ndd -set /dev/tcp tcp_ip_abort_cinterval 60000
    /usr/sbin/ndd -set /dev/tcp tcp_ip_abort_cinterval 10000
    # /usr/sbin/ndd -set /dev/tcp tcp_ip_abort_interval 60000
    /usr/sbin/ndd -set /dev/tcp tcp_ip_abort_interval 20000
    /usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 90000
    /usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 32768
    # /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_initial 3000
    /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_initial 500
    # /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_max 10000
    # note Sun recommends making tcp_ip_abort_interval 4x this
    /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_max 5000
    # /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_min 3000
    # note Sun recommends making tcp_rexmit_interval_max at least 8x this
    /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_min 500
    /usr/sbin/ndd -set /dev/tcp tcp_slow_start_initial 2
    /usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 20000
    /usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 32768
    # /usr/sbin/ndd -set /dev/tcp tcp_sack_permitted 2

    # echo "NOTICE: Check the following parameters value in /etc/system"
    # echo " * set tcp:tcp_conn_hash_size=8192"
    # echo " * set rlim_fd_max=4096"
    # echo " * set rlim_fd_cur=2000"

    cmdtext="setting"
    ;;

default)
    # Restore the kernel default values.
    # Note: tcp_deferred_ack_interval, which "start" changes, is not reset here.
    /usr/sbin/ndd -set /dev/ip ip_ire_arp_interval 1200000
    /usr/sbin/ndd -set /dev/arp arp_cleanup_interval 300000
    /usr/sbin/ndd -set /dev/tcp ip_ignore_redirect 0
    /usr/sbin/ndd -set /dev/tcp tcp_conn_grace_period 0
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q 128
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q0 1024
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_min 1
    /usr/sbin/ndd -set /dev/tcp tcp_cwnd_max 1048576
    /usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 675000
    /usr/sbin/ndd -set /dev/tcp tcp_ip_abort_cinterval 180000
    /usr/sbin/ndd -set /dev/tcp tcp_ip_abort_interval 480000
    /usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 7200000
    /usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 24576
    /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_initial 3000
    /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_max 60000
    /usr/sbin/ndd -set /dev/tcp tcp_rexmit_interval_min 400
    /usr/sbin/ndd -set /dev/tcp tcp_slow_start_initial 4
    /usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 240000
    /usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 16384

    cmdtext="resetting (kernel default)"
    ;;

show)
    # Print the current value of each parameter this script manages.
    echo "ip_ire_arp_interval `/usr/sbin/ndd /dev/ip ip_ire_arp_interval`"
    echo "arp_cleanup_interval `/usr/sbin/ndd /dev/arp arp_cleanup_interval`"
    echo "ip_ignore_redirect `/usr/sbin/ndd /dev/tcp ip_ignore_redirect`"
    echo "tcp_deferred_ack_interval `/usr/sbin/ndd /dev/tcp tcp_deferred_ack_interval`"
    echo "tcp_conn_grace_period `/usr/sbin/ndd /dev/tcp tcp_conn_grace_period`"
    echo "tcp_conn_req_max_q `/usr/sbin/ndd /dev/tcp tcp_conn_req_max_q`"
    echo "tcp_conn_req_max_q0 `/usr/sbin/ndd /dev/tcp tcp_conn_req_max_q0`"
    echo "tcp_conn_req_min `/usr/sbin/ndd /dev/tcp tcp_conn_req_min`"
    echo "tcp_cwnd_max `/usr/sbin/ndd /dev/tcp tcp_cwnd_max`"
    echo "tcp_fin_wait_2_flush_interval `/usr/sbin/ndd /dev/tcp tcp_fin_wait_2_flush_interval`"
    echo "tcp_ip_abort_cinterval `/usr/sbin/ndd /dev/tcp tcp_ip_abort_cinterval`"
    echo "tcp_ip_abort_interval `/usr/sbin/ndd /dev/tcp tcp_ip_abort_interval`"
    echo "tcp_keepalive_interval `/usr/sbin/ndd /dev/tcp tcp_keepalive_interval`"
    echo "tcp_recv_hiwat `/usr/sbin/ndd /dev/tcp tcp_recv_hiwat`"
    echo "tcp_rexmit_interval_initial `/usr/sbin/ndd /dev/tcp tcp_rexmit_interval_initial`"
    echo "tcp_rexmit_interval_max `/usr/sbin/ndd /dev/tcp tcp_rexmit_interval_max`"
    echo "tcp_rexmit_interval_min `/usr/sbin/ndd /dev/tcp tcp_rexmit_interval_min`"
    echo "tcp_slow_start_initial `/usr/sbin/ndd /dev/tcp tcp_slow_start_initial`"
    echo "tcp_time_wait_interval `/usr/sbin/ndd /dev/tcp tcp_time_wait_interval`"
    echo "tcp_xmit_hiwat `/usr/sbin/ndd /dev/tcp tcp_xmit_hiwat`"

    cmdtext="status"
    ;;

stop)
    # Nothing is undone on stop; use "default" to restore the kernel defaults.
    cmdtext="stopping"
    ;;
*)
    echo "Usage: $0 {start|stop|show|default}"
    exit 1
    ;;
esac

exit 0
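
The comments in the script quote two rules of thumb: tcp_ip_abort_interval should be 4x tcp_rexmit_interval_max, and tcp_rexmit_interval_max should be at least 8x tcp_rexmit_interval_min. When adjusting the sample values, a small check such as the sketch below can confirm that a candidate set still respects both rules. The helper name check_rexmit_ratios is hypothetical, treating the 4x rule as a minimum is an assumption, and the arithmetic syntax assumes a POSIX shell such as ksh or bash.

```shell
#!/bin/sh
# check_rexmit_ratios MIN MAX ABORT   (all values in milliseconds)
# Hypothetical helper: returns 0 when both rules of thumb from the script
# comments hold, 1 otherwise, and prints a message for each violated rule.
check_rexmit_ratios() {
    min=$1
    max=$2
    abort=$3
    status=0
    if [ "$max" -lt $((min * 8)) ]; then
        echo "tcp_rexmit_interval_max ($max) is less than 8 x tcp_rexmit_interval_min ($min)"
        status=1
    fi
    # Assumption: the "4x" guidance is treated as a lower bound.
    if [ "$abort" -lt $((max * 4)) ]; then
        echo "tcp_ip_abort_interval ($abort) is less than 4 x tcp_rexmit_interval_max ($max)"
        status=1
    fi
    return $status
}

# Values from the "start" case of the script
check_rexmit_ratios 500 5000 20000 && echo "ratios OK"
```

The sample "start" values pass both checks: 5000 is ten times 500, and 20000 is exactly four times 5000.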

 




  1. Comer, Douglas. Internetworking with TCP/IP: Principles, Protocols, and Architectures, Volume 1. Prentice Hall, 2000.
  2. Vöckler, Jens. Solaris(TM) 2.x - Tuning Your TCP/IP Stack and More. www.sean.de/Solaris/, 24.05.2002.