comparison xml/en/docs/freebsd_tuning.xml @ 0:61e04fc01027

Initial import of the nginx.org website.
author Ruslan Ermilov <ru@nginx.com>
date Thu, 11 Aug 2011 12:19:13 +0000
parents
children 9d544687d02c
1 <!DOCTYPE digest SYSTEM "../../../dtd/article.dtd">
2
3 <article title="Tuning FreeBSD for high loads"
4 link="/en/docs/tuning_freebsd.html"
5 lang="en">
6
7
8 <section title="Syncache and syncookies">
9
10 <para>
11 We will look at how various kernel settings affect the ability of the kernel
12 to process requests. Let&rsquo;s start with TCP/IP connection establishment.
13 </para>
14
15 <para>
16 [ syncache, syncookies ]
17 </para>
18
19 </section>
20
21
22 <section name="listen_queues"
23 title="Listen queues">
24
25 <para>
26 After a connection has been established, it is placed in the listen queue
27 of the listen socket.
28 To see the current state of the listen queues, you may run the command
29 <path>netstat -Lan</path>:
30
31 <programlisting>
32 Current listen queue sizes (qlen/incqlen/maxqlen)
33 Proto Listen Local Address
34 tcp4 <b>10</b>/0/128 *.80
35 tcp4 0/0/128 *.22
36 </programlisting>
37
38 This is a normal case: the listen queue of port *:80 contains
39 just 10 unaccepted connections.
40 If the web server is not able to handle the load, you may see
41 something like this:
42
43 <programlisting>
44 Current listen queue sizes (qlen/incqlen/maxqlen)
45 Proto Listen Local Address
46 tcp4 <b>192/</b>0/<b>128</b> *.80
47 tcp4 0/0/128 *.22
48 </programlisting>
49
50 Here there are 192 unaccepted connections, and new incoming connections
51 are most likely being dropped. Although the limit is 128 connections,
52 FreeBSD allows receiving 1.5 times as many connections as the limit before
53 it starts to drop the new ones. You may increase the limit using
54
55 <programlisting>
56 sysctl kern.ipc.somaxconn=4096
57 </programlisting>
58
59 However, note that the queue is only a damper to absorb bursts of connections.
60 If it is constantly overflowing, this means that you need to improve the web server,
61 and not just keep increasing the limit.
62 You may also change the maximum listen queue size in the nginx configuration:
63
64 <programlisting>
65 listen 80 backlog=1024;
66 </programlisting>
67
68 However, you may not set it higher than the current
69 <path>kern.ipc.somaxconn</path> value.
70 By default nginx uses the maximum value allowed by the FreeBSD kernel.
71 </para>
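
<para>
To preserve this limit across reboots, the same setting may also be put into
<path>/etc/sysctl.conf</path>, for example:

<programlisting>
kern.ipc.somaxconn=4096
</programlisting>
</para>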
72
73 <para>
74 <programlisting>
75 </programlisting>
76 </para>
77
78 <para>
79 <programlisting>
80 </programlisting>
81 </para>
82
83 </section>
84
85
86 <section name="sockets_and_files"
87 title="Sockets and files">
88
89 <para>
90 [ sockets, files ]
91 </para>
92
93 </section>
94
95
96 <section name="socket_buffers"
97 title="Socket buffers">
98
99 <para>
100 When a client sends data, the data is first received by the kernel,
101 which places it in the socket receive buffer.
102 Then an application such as a web server
103 may call the <code>recv()</code> or <code>read()</code> system call
104 to get the data from the buffer.
105 When the application wants to send data, it calls
106 the <code>send()</code> or <code>write()</code>
107 system call to place the data in the socket send buffer.
108 The kernel then takes care of sending the data from the buffer to the client.
109 In modern FreeBSD versions the default sizes of the socket receive
110 and send buffers are 64K and 32K respectively.
111 You may change them on the fly using the sysctls
112 <path>net.inet.tcp.recvspace</path> and
113 <path>net.inet.tcp.sendspace</path>.
114 Of course, bigger buffer sizes may increase throughput,
115 because connections may use larger TCP sliding window sizes.
116 On the Internet you may see recommendations to increase
117 the buffer sizes to one or even several megabytes.
118 However, such large buffer sizes are suitable for local networks
119 or for networks under your control.
120 On the Internet, a slow modem client may request a large file
121 and then download it over several minutes if not hours.
122 All this time the megabyte buffer will be bound to the slow client,
123 although we could devote just several kilobytes to it.
124 </para>
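
<para>
As a quick reference, the buffer sizes may be inspected and changed on the fly
like this (the values below simply correspond to the defaults mentioned above):

<programlisting>
sysctl net.inet.tcp.recvspace net.inet.tcp.sendspace
sysctl net.inet.tcp.recvspace=65536
sysctl net.inet.tcp.sendspace=32768
</programlisting>
</para>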
125
126 <para>
127 There is one more advantage of large send buffers for
128 web servers such as Apache that use blocking I/O system calls.
129 The server may place a whole large response in the send buffer, then
130 close the connection, and let the kernel send the response to a slow client,
131 while the server itself is ready to serve other requests.
132 You should decide which is better to bind to a client in your case:
133 an Apache/mod_perl process of tens of megabytes,
134 or a socket send buffer of hundreds of kilobytes.
135 Note that nginx uses non-blocking I/O system calls
136 and devotes just tens of kilobytes to each connection,
137 therefore it does not require large buffer sizes.
138 </para>
139
140 <para>
141 [ dynamic buffers ]
142 </para>
143
144 </section>
145
146
147 <section name="mbufs"
148 title="mbufs, mbuf clusters, etc.">
149
150 <para>
151 Inside the kernel the buffers are stored as chains of
152 memory chunks linked using the <i>mbuf</i> structures.
153 The mbuf size is 256 bytes and it can be used to store a small amount
154 of data, for example, a TCP/IP header. However, the mbufs mostly point
155 to other data stored in the <i>mbuf clusters</i> or <i>jumbo clusters</i>,
156 and in that case they are used as chain links only.
157 The mbuf cluster size is 2K.
158 The jumbo cluster size can be equal to the CPU page size (4K for i386 and amd64),
159 9K, or 16K.
160 The 9K and 16K jumbo clusters are used mainly in local networks with Ethernet
161 frames larger than the usual 1500 bytes, and they are beyond the scope of
162 this article.
163 The page size jumbo clusters are usually used for sending only,
164 while the mbuf clusters are used for both sending and receiving.
165
166 To see the current usage of the mbufs and clusters and their limits,
167 you may run the command <nobr><path>netstat -m</path>.</nobr>
168 Here is a sample from FreeBSD 7.2/amd64 with the default settings:
169
170 <programlisting>
171 1477/<b>3773/5250 mbufs</b> in use (current/cache/total)
172 771/2203/<b>2974/25600 mbuf clusters</b> in use (current/cache/total/max)
173 771/1969 mbuf+clusters out of packet secondary zone in use
174 (current/cache)
175 296/863/<b>1159/12800 4k (page size) jumbo clusters</b> in use
176 (current/cache/total/max)
177 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
178 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
179 3095K/8801K/11896K bytes allocated to network(current/cache/total)
180 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
181 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
182 0/0/0 sfbufs in use (current/peak/max)
183 0 requests for sfbufs denied
184 0 requests for sfbufs delayed
185 523590 requests for I/O initiated by sendfile
186 0 calls to protocol drain routines
187 </programlisting>
188
189 The page size jumbo clusters limit is 12800,
190 therefore they can store only 50M of data.
191 If you set <path>net.inet.tcp.sendspace</path> to 1M,
192 then merely 50 slow clients requesting large files
193 will take all of the jumbo clusters.
194 </para>
195
196 <para>
197 You may increase the cluster limits on the fly using:
198
199 <programlisting>
200 sysctl kern.ipc.nmbclusters=200000
201 sysctl kern.ipc.nmbjumbop=100000
202 </programlisting>
203
204 The former command increases the mbuf clusters limit
205 and the latter increases the page size jumbo clusters limit.
206 Note that all allocated mbuf clusters will take about 440M of physical memory:
207 200000 &times; (2048 + 256), because each mbuf cluster also requires an mbuf.
208 All allocated page size jumbo clusters will take yet another 415M of physical
209 memory: 100000 &times; (4096 + 256).
210 Together they may take about 855M.
211
212 <note>
213 The page size jumbo clusters were introduced in FreeBSD 7.0.
214 In earlier versions you should tune only the 2K mbuf clusters.
215 Prior to FreeBSD 6.2, the <path>kern.ipc.nmbclusters</path> value can be
216 set only at boot time via a loader tunable.
217 </note>
218 </para>
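
<para>
On such versions, and also to preserve the limit across reboots, the value
may be set in <path>/boot/loader.conf</path> instead, for example:

<programlisting>
kern.ipc.nmbclusters=200000
</programlisting>
</para>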
219
220 <para>
221 On the amd64 architecture the FreeBSD kernel can use almost all physical
222 memory for socket buffers,
223 while on the i386 architecture no more than 2G of memory can be used,
224 regardless of the available physical memory.
225 We will discuss the i386-specific tuning later.
226 </para>
227
228 <para>
229 There is a way to avoid using the jumbo clusters while serving static files:
230 the <i>sendfile()</i> system call.
231 sendfile() allows sending a file or a part of it to a socket directly,
232 without reading the parts into an application buffer.
233 It creates an mbuf chain where the mbufs point to the file pages that are
234 already present in the FreeBSD cache memory, and passes the chain to
235 the TCP/IP stack.
236 Thus, sendfile decreases both CPU usage, by omitting two memory copy operations,
237 and memory usage, by using the cached file pages.
238 </para>
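
<para>
In nginx, the use of sendfile() is controlled by the <code>sendfile</code>
directive, for example:

<programlisting>
sendfile  on;
</programlisting>
</para>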
239
240 <para>
241 And again, the amd64 sendfile implementation is the best one:
242 the zeros in the <nobr><path>netstat -m</path></nobr> output
243 <programlisting>
244 ...
245 <b>0/0/0</b> sfbufs in use (current/peak/max)
246 ...
247 </programlisting>
248 mean that there is no <i>sfbufs</i> limit,
249 while on the i386 architecture you have to tune them.
250 </para>
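
<para>
On i386 the sfbufs limit is controlled by the <path>kern.ipc.nsfbufs</path>
loader tunable, which may be set in <path>/boot/loader.conf</path>
(the value below is purely illustrative):

<programlisting>
kern.ipc.nsfbufs=10240
</programlisting>
</para>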
251
252 <!--
253
254 <para>
255
256 <programlisting>
257 vm.pmap.pg_ps_enabled=1
258
259 vm.kmem_size=3G
260
261 net.inet.tcp.tcbhashsize=32768
262
263 net.inet.tcp.hostcache.cachelimit=40960
264 net.inet.tcp.hostcache.hashsize=4096
265 net.inet.tcp.hostcache.bucketlimit=10
266
267 net.inet.tcp.syncache.hashsize=1024
268 net.inet.tcp.syncache.bucketlimit=100
269 </programlisting>
270
271 <programlisting>
272
273 net.inet.tcp.syncookies=0
274 net.inet.tcp.rfc1323=0
275 net.inet.tcp.sack.enable=1
276 net.inet.tcp.fast_finwait2_recycle=1
277
278 net.inet.tcp.rfc3390=0
279 net.inet.tcp.slowstart_flightsize=2
280
281 net.inet.tcp.recvspace=8192
282 net.inet.tcp.recvbuf_auto=0
283
284 net.inet.tcp.sendspace=16384
285 net.inet.tcp.sendbuf_auto=1
286 net.inet.tcp.sendbuf_inc=8192
287 net.inet.tcp.sendbuf_max=131072
288
289 # 797M
290 kern.ipc.nmbjumbop=192000
291 # 504M
292 kern.ipc.nmbclusters=229376
293 # 334M
294 kern.ipc.maxsockets=204800
295 # 8M
296 net.inet.tcp.maxtcptw=163840
297 # 24M
298 kern.maxfiles=204800
299 </programlisting>
300
301 </para>
302
303 <para>
304
305 <programlisting>
306 sysctl net.isr.direct=0
307 </programlisting>
308
309 <programlisting>
310 sysctl net.inet.ip.intr_queue_maxlen=2048
311 </programlisting>
312
313 </para>
314
315 -->
316
317 </section>
318
319
320 <section name="proxying"
321 title="Proxying">
322
323 <para>
When nginx proxies requests, each outgoing connection to a backend takes an
ephemeral local port, so it may help to widen the local port range and
to disable port randomization:
324 <programlisting>
325 net.inet.ip.portrange.randomized=0
326 net.inet.ip.portrange.first=1024
327 net.inet.ip.portrange.last=65535
328 </programlisting>
</para>
329
330 </section>
331
332
333 <section name="finalizing_connection"
334 title="Finalizing connection">
335 <para>
To recycle closed connections lingering in the FIN_WAIT_2 state faster,
you may enable:
336 <programlisting>
337 net.inet.tcp.fast_finwait2_recycle=1
338 </programlisting>
</para>
339
340 </section>
341
342
343 <section name="i386_specific_tuning"
344 title="i386 specific tuning">
345
346 <para>
347 [ KVA, KVM, nsfbufs ]
348 </para>
349
350 </section>
351
352
353 <section name="minor_optmizations"
354 title="Minor optimizations">
355
356 <para>
357 Harvesting entropy from network traffic takes some CPU time on busy
network interfaces; you may disable the Ethernet entropy source using:
358 <programlisting>
359 sysctl kern.random.sys.harvest.ethernet=0
360 </programlisting>
361
362 </para>
363
364 </section>
365
366 </article>