\documentclass{howto}

\title{Socket Programming HOWTO}

\release{0.00}

\author{Gordon McMillan}
\authoraddress{\email{gmcm@hypernet.com}}

\begin{document}
\maketitle

\begin{abstract}
\noindent
Sockets are used nearly everywhere, but are one of the most severely
misunderstood technologies around. This is a 10,000 foot overview of
sockets. It's not really a tutorial - you'll still have work to do in
getting things operational. It doesn't cover the fine points (and there
are a lot of them), but I hope it will give you enough background to
begin using them decently.

This document is available from the Python HOWTO page at
\url{http://www.python.org/doc/howto}.

\end{abstract}

\tableofcontents

\section{Sockets}

Sockets are used nearly everywhere, but are one of the most severely
misunderstood technologies around. This is a 10,000 foot overview of
sockets. It's not really a tutorial - you'll still have work to do in
getting things working. It doesn't cover the fine points (and there
are a lot of them), but I hope it will give you enough background to
begin using them decently.

I'm only going to talk about INET sockets, but they account for at
least 99\% of the sockets in use. And I'll only talk about STREAM
sockets - unless you really know what you're doing (in which case this
HOWTO isn't for you!), you'll get better behavior and performance from
a STREAM socket than anything else. I will try to clear up the mystery
of what a socket is, and give some hints on how to work with
blocking and non-blocking sockets. But I'll start by talking about
blocking sockets. You'll need to know how they work before dealing
with non-blocking sockets.

Part of the trouble with understanding these things is that "socket"
can mean a number of subtly different things, depending on context. So
first, let's make a distinction between a "client" socket, which is an
endpoint of a conversation, and a "server" socket, which is more like
a switchboard operator. The client application (your browser, for
example) uses "client" sockets exclusively; the web server it's
talking to uses both "server" sockets and "client" sockets.


\subsection{History}

Of the various forms of IPC (\emph{Inter Process Communication}),
sockets are by far the most popular. On any given platform, there are
likely to be other forms of IPC that are faster, but for
cross-platform communication, sockets are about the only game in town.

They were invented in Berkeley as part of the BSD flavor of Unix. They
spread like wildfire with the Internet. With good reason --- the
combination of sockets with INET makes talking to arbitrary machines
around the world unbelievably easy (at least compared to other
schemes).

\section{Creating a Socket}

Roughly speaking, when you clicked on the link that brought you to
this page, your browser did something like the following:

\begin{verbatim}
import socket

# create an INET, STREAMing socket
s = socket.socket(
    socket.AF_INET, socket.SOCK_STREAM)
# now connect to the web server on port 80
# - the normal http port
s.connect(("www.mcmillan-inc.com", 80))
\end{verbatim}

When the \code{connect} completes, the socket \code{s} can
now be used to send in a request for the text of this page. The same
socket will read the reply, and then be destroyed. That's right -
destroyed. Client sockets are normally only used for one exchange (or
a small set of sequential exchanges).

What happens in the web server is a bit more complex. First, the web
server creates a "server socket":

\begin{verbatim}
# create an INET, STREAMing socket
serversocket = socket.socket(
    socket.AF_INET, socket.SOCK_STREAM)
# bind the socket to a public host,
# and a well-known port
serversocket.bind((socket.gethostname(), 80))
# become a server socket
serversocket.listen(5)
\end{verbatim}

A couple of things to notice: we used \code{socket.gethostname()}
so that the socket would be visible to the outside world. If we had
used \code{serversocket.bind(('localhost', 80))} or
\code{serversocket.bind(('127.0.0.1', 80))} we would still
have a "server" socket, but one that was only visible within the same
machine. \code{serversocket.bind(('', 80))} is different again: it
makes the socket reachable through any address the machine happens to
have.

A second thing to note: low number ports are usually reserved for
"well known" services (HTTP, SNMP etc). If you're playing around, use
a nice high number (4 digits).

Finally, the argument to \code{listen} tells the socket library that
we want it to queue up as many as 5 connect requests (the normal max)
before refusing outside connections. If the rest of the code is
written properly, that should be plenty.

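To make the choice of bind address concrete, here is a minimal sketch.
The helper \code{listening_socket} is a name I made up for this
example; it is not part of the socket library. It binds to port 0,
which asks the operating system for any free port, so it stays clear
of the reserved low numbers. Binding to \code{'127.0.0.1'} gives a
socket reachable only from the same machine, while binding to
\code{''} listens on every address the machine has.

```python
import socket

def listening_socket(host, port=0, backlog=5):
    """Hypothetical helper: create, bind and listen in one step."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, port))   # port 0 lets the OS choose a free port
    s.listen(backlog)
    return s

# Reachable only from this machine.
local_only = listening_socket("127.0.0.1")
# Reachable through any address this machine has.
any_address = listening_socket("")

# getsockname() reports the address and port actually bound.
print(local_only.getsockname())
print(any_address.getsockname())
```
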
OK, now we have a "server" socket, listening on port 80. Now we enter
the mainloop of the web server:

\begin{verbatim}
while True:
    # accept connections from outside
    (clientsocket, address) = serversocket.accept()
    # now do something with the clientsocket
    # in this case, we'll pretend this is a threaded server
    ct = client_thread(clientsocket)
    ct.start()
\end{verbatim}

There are actually 3 general ways in which this loop could work:
dispatching a thread to handle \code{clientsocket}, creating a new
process to handle \code{clientsocket}, or restructuring this app
to use non-blocking sockets and multiplex between our "server" socket
and any active \code{clientsocket}s using
\code{select}. More about that later. The important thing to
understand now is this: this is \emph{all} a "server" socket
does. It doesn't send any data. It doesn't receive any data. It just
produces "client" sockets. Each \code{clientsocket} is created
in response to some \emph{other} "client" socket doing a
\code{connect()} to the host and port we're bound to. As soon as
we've created that \code{clientsocket}, we go back to listening
for more connections. The two "clients" are free to chat it up - they
are using some dynamically allocated port which will be recycled when
the conversation ends.

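The thread-per-connection variant of this loop could look roughly like
the following. This is only a sketch under assumptions of mine: the
names \code{handle_client}, \code{serve} and \code{demo} are
invented for the example, the "protocol" is a plain echo, and the
server stops after a fixed number of connections so that the demo can
finish. Note that the "server" socket does nothing but
\code{accept}; all the talking happens on the sockets it produces.

```python
import socket
import threading

def handle_client(clientsocket):
    """Echo whatever the peer sends until it closes its side."""
    with clientsocket:
        while True:
            chunk = clientsocket.recv(4096)
            if not chunk:              # zero bytes: the peer has closed
                break
            clientsocket.sendall(chunk)

def serve(serversocket, max_clients):
    """Accept max_clients connections, one thread per connection."""
    for _ in range(max_clients):
        clientsocket, address = serversocket.accept()
        threading.Thread(target=handle_client, args=(clientsocket,),
                         daemon=True).start()

def demo():
    serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    serversocket.bind(("127.0.0.1", 0))    # port 0: any free port
    serversocket.listen(5)
    port = serversocket.getsockname()[1]
    server = threading.Thread(target=serve, args=(serversocket, 1),
                              daemon=True)
    server.start()

    # The other "client" socket, doing the connect().
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    client.sendall(b"hello")
    reply = b""
    while len(reply) < 5:
        chunk = client.recv(4096)
        if not chunk:
            break
        reply += chunk
    client.close()
    server.join()
    serversocket.close()
    return reply
```
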
\subsection{IPC}

If you need fast IPC between two processes
on one machine, you should look into whatever form of shared memory
the platform offers. A simple protocol based around shared memory and
locks or semaphores is by far the fastest technique.

If you do decide to use sockets, bind the "server" socket to
\code{'localhost'}. On most platforms, this will take a shortcut
around a couple of layers of network code and be quite a bit faster.


\section{Using a Socket}

The first thing to note is that the web browser's "client" socket and
the web server's "client" socket are identical beasts. That is, this
is a "peer to peer" conversation. Or to put it another way, \emph{as the
designer, you will have to decide what the rules of etiquette are for
a conversation}. Normally, the \code{connect}ing socket
starts the conversation, by sending in a request, or perhaps a
signon. But that's a design decision - it's not a rule of sockets.

Now there are two sets of verbs to use for communication. You can use
\code{send} and \code{recv}, or you can transform your
client socket into a file-like beast and use \code{read} and
\code{write}. The latter is the way Java presents its
sockets. I'm not going to talk about it here, except to warn you that
you need to use \code{flush} on sockets. These are buffered
"files", and a common mistake is to \code{write} something, and
then \code{read} for a reply. Without a \code{flush} in
there, you may wait forever for the reply, because the request may
still be in your output buffer.

Now we come to the major stumbling block of sockets - \code{send}
and \code{recv} operate on the network buffers. They do not
necessarily handle all the bytes you hand them (or expect from them),
because their major focus is handling the network buffers. In general,
they return when the associated network buffers have been filled
(\code{send}) or emptied (\code{recv}). They then tell you
how many bytes they handled. It is \emph{your} responsibility to call
them again until your message has been completely dealt with.

When a \code{recv} returns 0 bytes, it means the other side has
closed (or is in the process of closing) the connection. You will not
receive any more data on this connection. Ever. You may be able to
send data successfully; I'll talk more about that later.

A protocol like HTTP uses a socket for only one transfer. The client
sends a request, then reads a reply. That's it. The socket is
discarded. This means that a client can detect the end of the reply by
receiving 0 bytes.

But if you plan to reuse your socket for further transfers, you need
to realize that \emph{there is no "EOT" (End of Transfer) on a
socket.} I repeat: if a socket \code{send} or
\code{recv} returns after handling 0 bytes, the connection has
been broken. If the connection has \emph{not} been broken, you may
wait on a \code{recv} forever, because the socket will
\emph{not} tell you that there's nothing more to read (for now). Now
if you think about that a bit, you'll come to realize a fundamental
truth of sockets: \emph{messages must either be fixed length} (yuck),
\emph{or be delimited} (shrug), \emph{or indicate how long they are}
(much better), \emph{or end by shutting down the connection}. The
choice is entirely yours (but some ways are righter than others).

Assuming you don't want to end the connection, the simplest solution
is a fixed length message:

\begin{verbatim}
import socket

# fixed message length in bytes; the value is chosen for illustration
MSGLEN = 2048

class mysocket:
    '''demonstration class only
      - coded for clarity, not efficiency
    '''

    def __init__(self, sock=None):
        if sock is None:
            self.sock = socket.socket(
                socket.AF_INET, socket.SOCK_STREAM)
        else:
            self.sock = sock

    def connect(self, host, port):
        self.sock.connect((host, port))

    def mysend(self, msg):
        # msg is a bytes object exactly MSGLEN long
        totalsent = 0
        while totalsent < MSGLEN:
            sent = self.sock.send(msg[totalsent:])
            if sent == 0:
                raise RuntimeError("socket connection broken")
            totalsent = totalsent + sent

    def myreceive(self):
        msg = b''
        while len(msg) < MSGLEN:
            chunk = self.sock.recv(MSGLEN - len(msg))
            if chunk == b'':
                raise RuntimeError("socket connection broken")
            msg = msg + chunk
        return msg
\end{verbatim}

The sending code here is usable for almost any messaging scheme - in
Python you send byte strings, and you can use \code{len()} to
determine their length (even if they have embedded \code{\e 0}
characters). It's mostly the receiving code that gets more
complex. (And in C, it's not much worse, except you can't use
\code{strlen} if the message has embedded \code{\e 0}s.)

The easiest enhancement is to make the first character of the message
an indicator of message type, and have the type determine the
length. Now you have two \code{recv}s - the first to get (at
least) that first character so you can look up the length, and the
second in a loop to get the rest. If you decide to go the delimited
route, you'll be receiving in some arbitrary chunk size (4096 or 8192
is frequently a good match for network buffer sizes), and scanning
what you've received for a delimiter.

One complication to be aware of: if your conversational protocol
allows multiple messages to be sent back to back (without some kind of
reply), and you pass \code{recv} an arbitrary chunk size, you
may end up reading the start of a following message. You'll need to
put that aside and hold onto it, until it's needed.

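One way to handle the delimited route, including the "put that aside"
part, is sketched below. Everything named here is an assumption of
mine for the sake of illustration: the class \code{MessageReader},
the newline delimiter, and the \code{demo} function, which uses
\code{socket.socketpair()} only so that both ends live in one
process. The point to notice is the buffer that survives between
calls and holds the start of the next message.

```python
import socket

class MessageReader:
    """Hypothetical helper: split a byte stream into delimited messages."""

    def __init__(self, sock, delimiter=b"\n", chunk_size=4096):
        self.sock = sock
        self.delimiter = delimiter
        self.chunk_size = chunk_size
        self.buffer = b""   # bytes received but not yet handed out

    def next_message(self):
        # Keep receiving until the buffer holds one complete message.
        while self.delimiter not in self.buffer:
            chunk = self.sock.recv(self.chunk_size)
            if not chunk:
                raise RuntimeError("socket connection broken")
            self.buffer += chunk
        # Hand out one message; whatever follows stays in the buffer.
        message, _, self.buffer = self.buffer.partition(self.delimiter)
        return message

def demo():
    a, b = socket.socketpair()
    with a, b:
        # Two complete messages and the start of a third, back to back.
        a.sendall(b"first\nsecond\nthi")
        reader = MessageReader(b)
        first = reader.next_message()
        second = reader.next_message()
        a.sendall(b"rd\n")
        third = reader.next_message()
    return first, second, third
```
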
Prefixing the message with its length (say, as 5 numeric characters)
gets more complex, because (believe it or not), you may not get all 5
characters in one \code{recv}. In playing around, you'll get
away with it; but in high network loads, your code will very quickly
break unless you use two \code{recv} loops - the first to
determine the length, the second to get the data part of the
message. Nasty. This is also when you'll discover that
\code{send} does not always manage to get rid of everything in
one pass. And despite having read this, you will eventually get bitten
by it!

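For what it's worth, the two \code{recv} loops can share one helper.
The following is a sketch, not a finished protocol: the function names
(\code{recv_exactly}, \code{send_message}, \code{recv_message})
are mine, and the header is assumed to be exactly 5 ASCII digits as
suggested above. On the sending side it leans on
\code{sendall}, which repeats \code{send} until everything has
gone out.

```python
import socket

HEADER_LEN = 5   # length prefix: 5 ASCII digits (an assumption)

def recv_exactly(sock, count):
    """Loop on recv until exactly count bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise RuntimeError("socket connection broken")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def send_message(sock, payload):
    """Send payload (bytes) preceded by its zero-padded length."""
    if len(payload) > 99999:
        raise ValueError("payload too long for a 5-digit length prefix")
    header = f"{len(payload):05d}".encode("ascii")
    sock.sendall(header + payload)

def recv_message(sock):
    """First loop reads the length, second loop reads the data."""
    length = int(recv_exactly(sock, HEADER_LEN).decode("ascii"))
    return recv_exactly(sock, length)
```
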
In the interests of space, building your character (and preserving my
competitive position), these enhancements are left as an exercise for
the reader. Let's move on to cleaning up.

\subsection{Binary Data}

It is perfectly possible to send binary data over a socket. The major
problem is that not all machines use the same formats for binary
data. For example, a Motorola chip will represent a 16 bit integer
with the value 1 as the two hex bytes 00 01. Intel and DEC, however,
are byte-reversed - that same 1 is 01 00. Socket libraries have calls
for converting 16 and 32 bit integers - \code{ntohl, htonl, ntohs,
htons} where "n" means \emph{network} and "h" means \emph{host},
"s" means \emph{short} and "l" means \emph{long}. Where network order
is host order, these do nothing, but where the machine is
byte-reversed, these swap the bytes around appropriately.

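In Python, the byte-order conversions are available both as
\code{socket.htons} and friends and, often more conveniently,
through the \code{struct} module, where a leading \code{!} in the
format string selects network byte order. The snippet below is just an
illustration of that; the variable names are arbitrary.

```python
import socket
import struct
import sys

# "!" = network (big-endian) byte order,
# "H" = unsigned 16 bit integer, "I" = unsigned 32 bit integer.
packed_short = struct.pack("!H", 1)   # b'\x00\x01' on every machine
packed_long = struct.pack("!I", 1)    # b'\x00\x00\x00\x01'

# unpack reverses the conversion on the receiving side.
(value,) = struct.unpack("!H", packed_short)

# htons works on plain integers: it swaps the two bytes on a
# byte-reversed (little-endian) host and does nothing otherwise.
swapped = socket.htons(1)
round_trip = socket.ntohs(swapped)

print(sys.byteorder, packed_short, packed_long, value, swapped)
```
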
In these days of 32 bit machines, the ASCII representation of binary
data is frequently smaller than the binary representation. That's
because a surprising amount of the time, all those longs have the
value 0, or maybe 1. The string "0" would be two bytes, while binary
is four. Of course, this doesn't fit well with fixed-length
messages. Decisions, decisions.

\section{Disconnecting}

Strictly speaking, you're supposed to use \code{shutdown} on a
socket before you \code{close} it. The \code{shutdown} is
an advisory to the socket at the other end. Depending on the argument
you pass it, it can mean "I'm not going to send any more, but I'll
still listen", or "I'm not listening, good riddance!". Most socket
libraries, however, are so used to programmers neglecting to use this
piece of etiquette that normally a \code{close} is the same as
\code{shutdown(); close()}. So in most situations, an explicit
\code{shutdown} is not needed.

One way to use \code{shutdown} effectively is in an HTTP-like
exchange. The client sends a request and then does a
\code{shutdown(1)}. This tells the server "This client is done
sending, but can still receive." The server can detect "EOF" by a
receive of 0 bytes. It can assume it has the complete request. The
server sends a reply. If the \code{send} completes successfully
then, indeed, the client was still receiving.

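Here is a small sketch of that exchange. The function names are
invented for the example, and both ends run in one process over
\code{socket.socketpair()} purely for convenience; with a real client
and server the calls are the same. \code{socket.SHUT_WR} is the
named constant for the value 1 used above.

```python
import socket
import threading

def recv_until_eof(sock):
    """Collect data until recv returns 0 bytes."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

def server_side(sock):
    with sock:
        # Complete as soon as the client shuts down its sending side.
        request = recv_until_eof(sock)
        sock.sendall(b"reply to " + request)

def demo():
    client, server = socket.socketpair()
    worker = threading.Thread(target=server_side, args=(server,))
    worker.start()
    with client:
        client.sendall(b"request")
        # Done sending, but still able to receive.
        client.shutdown(socket.SHUT_WR)
        reply = recv_until_eof(client)
    worker.join()
    return reply
```
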
Python takes the automatic shutdown a step further, and says that when
a socket is garbage collected, it will automatically do a
\code{close} if it's needed. But relying on this is a very bad
habit. If your socket just disappears without doing a
\code{close}, the socket at the other end may hang indefinitely,
thinking you're just being slow. \emph{Please} \code{close} your
sockets when you're done.


\subsection{When Sockets Die}

Probably the worst thing about using blocking sockets is what happens
when the other side comes down hard (without doing a
\code{close}). Your socket is likely to hang. \code{SOCK_STREAM} is a
reliable protocol, and it will wait a long, long time before giving up
on a connection. If you're using threads, the entire thread is
essentially dead. There's not much you can do about it. As long as you
aren't doing something dumb, like holding a lock while doing a
blocking read, the thread isn't really consuming much in the way of
resources. Do \emph{not} try to kill the thread - part of the reason
that threads are more efficient than processes is that they avoid the
overhead associated with the automatic recycling of resources. In
other words, if you do manage to kill the thread, your whole process
is likely to be screwed up.

\section{Non-blocking Sockets}

If you've understood the preceding, you already know most of what you
need to know about the mechanics of using sockets. You'll still use
the same calls, in much the same ways. It's just that, if you do it
right, your app will be almost inside-out.

In Python, you use \code{socket.setblocking(0)} to make it
non-blocking. In C, it's more complex (for one thing, you'll need to
choose between the POSIX flavor \code{O_NONBLOCK} and the older,
almost indistinguishable \code{O_NDELAY}, which is
completely different from \code{TCP_NODELAY}), but it's the
exact same idea. You do this after creating the socket, but before
using it. (Actually, if you're nuts, you can switch back and forth.)

The major mechanical difference is that \code{send},
\code{recv}, \code{connect} and \code{accept} can
return without having done anything. You have (of course) a number of
choices. You can check return codes and error codes and generally drive
yourself crazy. If you don't believe me, try it sometime. Your app
will grow large, buggy and suck CPU. So let's skip the brain-dead
solutions and do it right.

Use \code{select}.

In C, coding \code{select} is fairly complex. In Python, it's a
piece of cake, but it's close enough to the C version that if you
understand \code{select} in Python, you'll have little trouble
with it in C.

\begin{verbatim}
ready_to_read, ready_to_write, in_error = \
               select.select(
                  potential_readers,
                  potential_writers,
                  potential_errs,
                  timeout)
\end{verbatim}

You pass \code{select} three lists: the first contains all
sockets that you might want to try reading; the second all the sockets
you might want to try writing to, and the last (normally left empty)
those that you want to check for errors. You should note that a
socket can go into more than one list. The \code{select} call is
blocking, but you can give it a timeout. This is generally a sensible
thing to do - give it a nice long timeout (say a minute) unless you
have good reason to do otherwise.

In return, you will get three lists. They have the sockets that are
actually readable, writable and in error. Each of these lists is a
subset (possibly empty) of the corresponding list you passed in. A
socket that you put in more than one input list can likewise turn up
in more than one output list, for instance when it is both readable
and writable.

If a socket is in the output readable list, you can be
as-close-to-certain-as-we-ever-get-in-this-business that a
\code{recv} on that socket will return \emph{something}. Same
idea for the writable list. You'll be able to send
\emph{something}. Maybe not all you want to, but \emph{something} is
better than nothing. (Actually, any reasonably healthy socket will
return as writable - it just means outbound network buffer space is
available.)

If you have a "server" socket, put it in the \code{potential_readers}
list. If it comes out in the readable list, your \code{accept}
will (almost certainly) work. If you have created a new socket to
\code{connect} to someone else, put it in the \code{potential_writers}
list. If it shows up in the writable list, you have a decent chance
that it has connected.

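Put together, a \code{select} loop for a server might be sketched as
follows. Treat it as an illustration under my own assumptions rather
than a recipe: the function names, the echo "protocol", the
\code{pending} dictionary of unsent bytes, and the rule that the
server stops after \code{stop_after} clients have closed are all
invented for the example, and error handling is left out. A socket
only goes into the writers list while there is something waiting to be
sent on it.

```python
import select
import socket
import threading

def run_echo_server(serversocket, stop_after):
    """Echo data on every connection until stop_after clients close."""
    serversocket.setblocking(False)
    pending = {}   # client socket -> bytes still waiting to be sent
    closed = 0
    while closed < stop_after:
        potential_readers = [serversocket] + list(pending)
        potential_writers = [s for s, data in pending.items() if data]
        ready_to_read, ready_to_write, in_error = select.select(
            potential_readers, potential_writers, [], 60)
        for sock in ready_to_read:
            if sock is serversocket:
                clientsocket, address = serversocket.accept()
                clientsocket.setblocking(False)
                pending[clientsocket] = b""
            else:
                chunk = sock.recv(4096)
                if chunk:
                    pending[sock] += chunk
                else:                       # 0 bytes: the peer closed
                    del pending[sock]
                    sock.close()
                    closed += 1
        for sock in ready_to_write:
            if sock in pending and pending[sock]:
                # send may take only part of the data; keep the rest.
                sent = sock.send(pending[sock])
                pending[sock] = pending[sock][sent:]

def demo():
    serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    serversocket.bind(("127.0.0.1", 0))     # port 0: any free port
    serversocket.listen(5)
    port = serversocket.getsockname()[1]
    server = threading.Thread(target=run_echo_server,
                              args=(serversocket, 1))
    server.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"ping")
        reply = b""
        while len(reply) < 4:
            chunk = client.recv(4096)
            if not chunk:
                break
            reply += chunk
    server.join()
    serversocket.close()
    return reply
```
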
One very nasty problem with \code{select}: if somewhere in those
input lists of sockets is one which has died a nasty death, the
\code{select} will fail. You then need to loop through every
single damn socket in all those lists and do a
\code{select([sock],[],[],0)} until you find the bad one. That
timeout of 0 means it won't take long, but it's ugly.

Actually, \code{select} can be handy even with blocking sockets.
It's one way of determining whether you will block - the socket
returns as readable when there's something in the buffers. However,
this still doesn't help with the problem of determining whether the
other end is done, or just busy with something else.

\textbf{Portability alert}: On Unix, \code{select} works with both
sockets and files. Don't try this on Windows. On Windows,
\code{select} works with sockets only. Also note that in C, many
of the more advanced socket options are done differently on
Windows. In fact, on Windows I usually use threads (which work very,
very well) with my sockets. Face it, if you want any kind of
performance, your code will look very different on Windows than on
Unix. (I haven't the foggiest how you do this stuff on a Mac.)

\subsection{Performance}

There's no question that the fastest sockets code uses non-blocking
sockets and \code{select} to multiplex them. You can put together
something that will saturate a LAN connection without putting any
strain on the CPU. The trouble is that an app written this way can't
do much of anything else - it needs to be ready to shuffle bytes
around at all times.

Assuming that your app is actually supposed to do something more than
that, threading is the optimal solution (and using non-blocking
sockets will be faster than using blocking sockets). Unfortunately,
threading support in Unixes varies both in API and quality. So the
normal Unix solution is to fork a subprocess to deal with each
connection. The overhead for this is significant (and don't do this on
Windows - the overhead of process creation is enormous there). It also
means that unless each subprocess is completely independent, you'll
need to use another form of IPC, say a pipe, or shared memory and
semaphores, to communicate between the parent and child processes.

Finally, remember that even though blocking sockets are somewhat
slower than non-blocking, in many cases they are the "right"
solution. After all, if your app is driven by the data it receives
over a socket, there's not much sense in complicating the logic just
so your app can wait on \code{select} instead of
\code{recv}.

\end{document}