.. _socket-howto:

****************************
  Socket Programming HOWTO
****************************

:Author: Gordon McMillan


.. topic:: Abstract

   Sockets are used nearly everywhere, but are one of the most severely
   misunderstood technologies around. This is a 10,000 foot overview of sockets.
   It's not really a tutorial - you'll still have work to do in getting things
   operational. It doesn't cover the fine points (and there are a lot of them), but
   I hope it will give you enough background to begin using them decently.


Sockets
=======

I'm only going to talk about INET sockets, but they account for at least 99% of
the sockets in use. And I'll only talk about STREAM sockets - unless you really
know what you're doing (in which case this HOWTO isn't for you!), you'll get
better behavior and performance from a STREAM socket than anything else. I will
try to clear up the mystery of what a socket is, and give some hints on how to
work with blocking and non-blocking sockets. But I'll start by talking about
blocking sockets. You'll need to know how they work before dealing with
non-blocking sockets.

Part of the trouble with understanding these things is that "socket" can mean a
number of subtly different things, depending on context. So first, let's make a
distinction between a "client" socket - an endpoint of a conversation, and a
"server" socket, which is more like a switchboard operator. The client
application (your browser, for example) uses "client" sockets exclusively; the
web server it's talking to uses both "server" sockets and "client" sockets.


History
-------

Of the various forms of :abbr:`IPC (Inter Process Communication)`,
sockets are by far the most popular. On any given platform, there are
likely to be other forms of IPC that are faster, but for
cross-platform communication, sockets are about the only game in town.

They were invented in Berkeley as part of the BSD flavor of Unix. They spread
like wildfire with the Internet. With good reason --- the combination of sockets
with INET makes talking to arbitrary machines around the world unbelievably easy
(at least compared to other schemes).


Creating a Socket
=================

Roughly speaking, when you clicked on the link that brought you to this page,
your browser did something like the following::

   import socket

   # create an INET, STREAMing socket
   s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   # now connect to the web server on port 80 - the normal http port
   s.connect(("www.mcmillan-inc.com", 80))

When the ``connect`` completes, the socket ``s`` can be used to send
in a request for the text of the page. The same socket will read the
reply, and then be destroyed. That's right, destroyed. Client sockets
are normally only used for one exchange (or a small set of sequential
exchanges).

What happens in the web server is a bit more complex. First, the web server
creates a "server socket"::

   # create an INET, STREAMing socket
   serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   # bind the socket to a public host, and a well-known port
   serversocket.bind((socket.gethostname(), 80))
   # become a server socket
   serversocket.listen(5)

A couple of things to notice: we used ``socket.gethostname()`` so that the socket
would be visible to the outside world. If we had used ``s.bind(('localhost',
80))`` or ``s.bind(('127.0.0.1', 80))`` we would still have a "server" socket,
but one that was only visible within the same machine. ``s.bind(('', 80))``
specifies that the socket is reachable by any address the machine happens to
have.

A second thing to note: low-numbered ports are usually reserved for "well known"
services (HTTP, SNMP, etc.). If you're playing around, use a nice high number (4
digits).

Finally, the argument to ``listen`` tells the socket library that we want it to
queue up as many as 5 connect requests (the normal max) before refusing outside
connections. If the rest of the code is written properly, that should be plenty.

Now that we have a "server" socket, listening on port 80, we can enter the
mainloop of the web server::

   while True:
       # accept connections from outside
       (clientsocket, address) = serversocket.accept()
       # now do something with the clientsocket
       # in this case, we'll pretend this is a threaded server
       ct = client_thread(clientsocket)
       ct.run()

There are actually three general ways in which this loop could work: dispatching
a thread to handle ``clientsocket``, creating a new process to handle
``clientsocket``, or restructuring this app to use non-blocking sockets and
multiplexing between our "server" socket and any active ``clientsocket``\ s using
``select``. More about that later. The important thing to understand now is
this: this is *all* a "server" socket does. It doesn't send any data. It doesn't
receive any data. It just produces "client" sockets. Each ``clientsocket`` is
created in response to some *other* "client" socket doing a ``connect()`` to the
host and port we're bound to. As soon as we've created that ``clientsocket``, we
go back to listening for more connections. The two "clients" are free to chat it
up - they are using some dynamically allocated port which will be recycled when
the conversation ends.


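As a rough illustration of the first option, here is a minimal sketch of a
thread-per-connection server. The ``handle_client`` and ``serve`` functions, the
echo behavior, and the choice of ``'localhost'`` with port 0 (which asks the
operating system for any free port) are assumptions made for this example; they
are not part of the server shown above::

   import socket
   import threading

   def handle_client(clientsocket):
       # Hypothetical handler: echo whatever arrives until the peer closes.
       with clientsocket:
           while True:
               data = clientsocket.recv(4096)
               if not data:
                   break
               clientsocket.sendall(data)

   def serve(serversocket, max_connections):
       # Accept a fixed number of connections so the sketch terminates;
       # a real server would loop forever.
       for _ in range(max_connections):
           clientsocket, address = serversocket.accept()
           threading.Thread(target=handle_client, args=(clientsocket,),
                            daemon=True).start()

   serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   serversocket.bind(("localhost", 0))
   serversocket.listen(5)
   port = serversocket.getsockname()[1]
   threading.Thread(target=serve, args=(serversocket, 1), daemon=True).start()

   # The "server" socket only produced a "client" socket; the
   # conversation itself happens between the two "client" sockets.
   client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   client.connect(("localhost", port))
   client.sendall(b"hello")
   reply = b""
   while len(reply) < 5:
       chunk = client.recv(4096)
       if not chunk:
           break
       reply += chunk
   client.close()
   serversocket.close()
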
IPC
---

If you need fast IPC between two processes on one machine, you should look into
whatever form of shared memory the platform offers. A simple protocol based
around shared memory and locks or semaphores is by far the fastest technique.

If you do decide to use sockets, bind the "server" socket to ``'localhost'``. On
most platforms, this will take a shortcut around a couple of layers of network
code and be quite a bit faster.


Using a Socket
==============

The first thing to note is that the web browser's "client" socket and the web
server's "client" socket are identical beasts. That is, this is a "peer to peer"
conversation. Or to put it another way, *as the designer, you will have to
decide what the rules of etiquette are for a conversation*. Normally, the
``connect``\ ing socket starts the conversation, by sending in a request, or
perhaps a signon. But that's a design decision - it's not a rule of sockets.

Now there are two sets of verbs to use for communication. You can use ``send``
and ``recv``, or you can transform your client socket into a file-like beast and
use ``read`` and ``write``. The latter is the way Java presents its sockets.
I'm not going to talk about it here, except to warn you that you need to use
``flush`` on sockets. These are buffered "files", and a common mistake is to
``write`` something, and then ``read`` for a reply. Without a ``flush`` in
there, you may wait forever for the reply, because the request may still be in
your output buffer.

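A minimal sketch of the file-like route, assuming Python's
``socket.makefile()`` as the way to get the file-like object and
``socket.socketpair()`` as a stand-in for a real client/server connection
(neither is named in the text above)::

   import socket

   # Two already-connected sockets; a stand-in for a real connection.
   a, b = socket.socketpair()
   writer = a.makefile("wb")
   reader = b.makefile("rb")

   writer.write(b"request\n")
   writer.flush()    # without this, the request may sit in the output buffer
   line = reader.readline()

   writer.close()
   reader.close()
   a.close()
   b.close()
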
Now we come to the major stumbling block of sockets - ``send`` and ``recv`` operate
on the network buffers. They do not necessarily handle all the bytes you hand
them (or expect from them), because their major focus is handling the network
buffers. In general, they return when the associated network buffers have been
filled (``send``) or emptied (``recv``). They then tell you how many bytes they
handled. It is *your* responsibility to call them again until your message has
been completely dealt with.

When a ``recv`` returns 0 bytes, it means the other side has closed (or is in
the process of closing) the connection. You will not receive any more data on
this connection. Ever. You may be able to send data successfully; I'll talk
more about this later.

A protocol like HTTP uses a socket for only one transfer. The client sends a
request, then reads a reply. That's it. The socket is discarded. This means that
a client can detect the end of the reply by receiving 0 bytes.

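Reading a reply that ends when the other side closes might look something like
the sketch below. The ``read_until_closed`` helper is a hypothetical name, and
``socket.socketpair()`` stands in for a real connection to a server::

   import socket

   def read_until_closed(sock):
       # Collect chunks until recv() returns b"", which signals that the
       # other side has closed the connection.
       chunks = []
       while True:
           chunk = sock.recv(4096)
           if chunk == b"":
               break
           chunks.append(chunk)
       return b"".join(chunks)

   server_side, client_side = socket.socketpair()
   server_side.sendall(b"the whole reply")
   server_side.close()                  # closing marks the end of the reply
   reply = read_until_closed(client_side)
   client_side.close()
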
But if you plan to reuse your socket for further transfers, you need to realize
that *there is no* :abbr:`EOT (End of Transfer)` *on a socket.* I repeat: if a socket
``send`` or ``recv`` returns after handling 0 bytes, the connection has been
broken. If the connection has *not* been broken, you may wait on a ``recv``
forever, because the socket will *not* tell you that there's nothing more to
read (for now). Now if you think about that a bit, you'll come to realize a
fundamental truth of sockets: *messages must either be fixed length* (yuck), *or
be delimited* (shrug), *or indicate how long they are* (much better), *or end by
shutting down the connection*. The choice is entirely yours (but some ways are
righter than others).

Assuming you don't want to end the connection, the simplest solution is a fixed
length message::

   import socket

   MSGLEN = 1024  # assumed fixed message length; use whatever your protocol needs

   class mysocket:
       '''demonstration class only
         - coded for clarity, not efficiency
       '''

       def __init__(self, sock=None):
           if sock is None:
               self.sock = socket.socket(
                   socket.AF_INET, socket.SOCK_STREAM)
           else:
               self.sock = sock

       def connect(self, host, port):
           self.sock.connect((host, port))

       def mysend(self, msg):
           # msg is expected to be a bytes object of exactly MSGLEN bytes
           totalsent = 0
           while totalsent < MSGLEN:
               sent = self.sock.send(msg[totalsent:])
               if sent == 0:
                   raise RuntimeError("socket connection broken")
               totalsent = totalsent + sent

       def myreceive(self):
           msg = b''
           while len(msg) < MSGLEN:
               chunk = self.sock.recv(MSGLEN-len(msg))
               if chunk == b'':
                   raise RuntimeError("socket connection broken")
               msg = msg + chunk
           return msg

The sending code here is usable for almost any messaging scheme - in Python you
send bytes objects, and you can use ``len()`` to determine their length (even if
they have embedded ``\0`` bytes). It's mostly the receiving code that gets more
complex. (And in C, it's not much worse, except you can't use ``strlen`` if the
message has embedded ``\0``\ s.)

The easiest enhancement is to make the first character of the message an
indicator of message type, and have the type determine the length. Now you have
two ``recv``\ s - the first to get (at least) that first character so you can
look up the length, and the second in a loop to get the rest. If you decide to
go the delimited route, you'll be receiving in some arbitrary chunk size (4096
or 8192 is frequently a good match for network buffer sizes), and scanning what
you've received for a delimiter.

One complication to be aware of: if your conversational protocol allows multiple
messages to be sent back to back (without some kind of reply), and you pass
``recv`` an arbitrary chunk size, you may end up reading the start of a
following message. You'll need to put that aside and hold onto it, until it's
needed.

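A minimal sketch of the delimited route, including the "put it aside" step. The
``DelimitedReader`` class and the newline delimiter are assumptions chosen for
illustration, and ``socket.socketpair()`` stands in for a real connection::

   import socket

   class DelimitedReader:
       """Hypothetical helper: split a byte stream into delimited messages."""

       def __init__(self, sock, delimiter=b"\n"):
           self.sock = sock
           self.delimiter = delimiter
           self.buffer = b""    # bytes received but not yet handed out

       def next_message(self):
           # Keep receiving until the buffer holds at least one full message.
           while self.delimiter not in self.buffer:
               chunk = self.sock.recv(4096)
               if chunk == b"":
                   raise RuntimeError("socket connection broken")
               self.buffer += chunk
           # Hand out one message; anything after the delimiter is the
           # start of a following message and stays in the buffer.
           message, _, self.buffer = self.buffer.partition(self.delimiter)
           return message

   a, b = socket.socketpair()
   a.sendall(b"first\nsecond\n")    # two messages sent back to back
   reader = DelimitedReader(b)
   first = reader.next_message()
   second = reader.next_message()
   a.close()
   b.close()
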
Prefixing the message with its length (say, as 5 numeric characters) gets more
complex, because (believe it or not), you may not get all 5 characters in one
``recv``. In playing around, you'll get away with it; but in high network loads,
your code will very quickly break unless you use two ``recv`` loops - the first
to determine the length, the second to get the data part of the message. Nasty.
This is also when you'll discover that ``send`` does not always manage to get
rid of everything in one pass. And despite having read this, you will eventually
get bitten by it!

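One possible shape for the two ``recv`` loops is sketched below, using the
5-digit length prefix from the paragraph above. The helper names
(``recv_exactly``, ``send_message``, ``recv_message``) are hypothetical, and
``sendall`` is used on the sending side so that the send loop is handled by the
socket library::

   import socket

   HEADER_LEN = 5    # length prefix: 5 ASCII digits, as in the text

   def recv_exactly(sock, count):
       # Loop until exactly `count` bytes have arrived.
       chunks = []
       remaining = count
       while remaining > 0:
           chunk = sock.recv(remaining)
           if chunk == b"":
               raise RuntimeError("socket connection broken")
           chunks.append(chunk)
           remaining -= len(chunk)
       return b"".join(chunks)

   def send_message(sock, payload):
       if len(payload) > 99999:
           raise ValueError("payload too long for a 5-digit length prefix")
       header = str(len(payload)).zfill(HEADER_LEN).encode("ascii")
       sock.sendall(header + payload)

   def recv_message(sock):
       # first loop: the length; second loop: the data
       length = int(recv_exactly(sock, HEADER_LEN).decode("ascii"))
       return recv_exactly(sock, length)

   a, b = socket.socketpair()    # stand-in for a real connection
   send_message(a, b"hello, world")
   received = recv_message(b)
   a.close()
   b.close()
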
In the interests of space, building your character, (and preserving my
competitive position), these enhancements are left as an exercise for the
reader. Let's move on to cleaning up.


Binary Data
-----------

It is perfectly possible to send binary data over a socket. The major problem is
that not all machines use the same formats for binary data. For example, a
Motorola chip will represent a 16 bit integer with the value 1 as the two hex
bytes 00 01. Intel and DEC, however, are byte-reversed - that same 1 is 01 00.
Socket libraries have calls for converting 16 and 32 bit integers - ``ntohl,
htonl, ntohs, htons`` where "n" means *network* and "h" means *host*, "s" means
*short* and "l" means *long*. Where network order is host order, these do
nothing, but where the machine is byte-reversed, these swap the bytes around
appropriately.

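In Python the conversion calls live in the ``socket`` module. The sketch below
also uses the ``struct`` module, which is not mentioned above but is a common
way to produce the bytes in network order directly::

   import socket
   import struct

   # "!" selects network (big-endian) byte order; "H" is an unsigned
   # 16 bit integer and "I" an unsigned 32 bit integer.
   packed_short = struct.pack("!H", 1)    # b'\x00\x01'
   packed_long = struct.pack("!I", 1)     # b'\x00\x00\x00\x01'

   (value,) = struct.unpack("!H", packed_short)

   # htons and ntohs undo each other, whatever the host's byte order
   round_trip = socket.ntohs(socket.htons(1))
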
In these days of 32 bit machines, the ASCII representation of binary data is
frequently smaller than the binary representation. That's because a surprising
amount of the time, all those longs have the value 0, or maybe 1. The string "0"
would be two bytes, while binary is four. Of course, this doesn't fit well with
fixed-length messages. Decisions, decisions.


Disconnecting
=============

Strictly speaking, you're supposed to use ``shutdown`` on a socket before you
``close`` it. The ``shutdown`` is an advisory to the socket at the other end.
Depending on the argument you pass it, it can mean "I'm not going to send
anymore, but I'll still listen", or "I'm not listening, good riddance!". Most
socket libraries, however, are so used to programmers neglecting to use this
piece of etiquette that normally a ``close`` is the same as ``shutdown();
close()``. So in most situations, an explicit ``shutdown`` is not needed.

One way to use ``shutdown`` effectively is in an HTTP-like exchange. The client
sends a request and then does a ``shutdown(1)`` (in Python, the constant
``socket.SHUT_WR``). This tells the server "This
client is done sending, but can still receive." The server can detect "EOF" by
a receive of 0 bytes. It can assume it has the complete request. The server
sends a reply. If the ``send`` completes successfully then, indeed, the client
was still receiving.

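The exchange just described can be sketched as follows. The ``read_all`` and
``server`` functions are hypothetical names, and ``socket.socketpair()`` plus a
thread stand in for a real client and server::

   import socket
   import threading

   def read_all(sock):
       # Receive until the other side signals "EOF" with 0 bytes.
       chunks = []
       while True:
           chunk = sock.recv(4096)
           if chunk == b"":
               return b"".join(chunks)
           chunks.append(chunk)

   def server(conn):
       with conn:
           request = read_all(conn)    # returns once the client shuts down sending
           conn.sendall(b"reply to " + request)

   server_side, client_side = socket.socketpair()
   t = threading.Thread(target=server, args=(server_side,))
   t.start()

   client_side.sendall(b"request")
   client_side.shutdown(socket.SHUT_WR)    # done sending, still receiving
   reply = read_all(client_side)
   client_side.close()
   t.join()
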
Python takes the automatic shutdown a step further, and says that when a socket
is garbage collected, it will automatically do a ``close`` if it's needed. But
relying on this is a very bad habit. If your socket just disappears without
doing a ``close``, the socket at the other end may hang indefinitely, thinking
you're just being slow. *Please* ``close`` your sockets when you're done.


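One way to make sure the ``close`` happens, sketched here on the assumption that
you are using Python 3, where socket objects can be used as context managers::

   import socket

   with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
       fd_while_open = s.fileno()    # a valid descriptor inside the block

   # Leaving the block closed the socket, even if an exception was raised.
   fd_after_close = s.fileno()       # -1 once the socket is closed
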
When Sockets Die
----------------

Probably the worst thing about using blocking sockets is what happens when the
other side comes down hard (without doing a ``close``). Your socket is likely to
hang. TCP is a reliable protocol, and it will wait a long, long time
before giving up on a connection. If you're using threads, the entire thread is
essentially dead. There's not much you can do about it. As long as you aren't
doing something dumb, like holding a lock while doing a blocking read, the
thread isn't really consuming much in the way of resources. Do *not* try to kill
the thread - part of the reason that threads are more efficient than processes
is that they avoid the overhead associated with the automatic recycling of
resources. In other words, if you do manage to kill the thread, your whole
process is likely to be screwed up.


Non-blocking Sockets
====================

If you've understood the preceding, you already know most of what you need to
know about the mechanics of using sockets. You'll still use the same calls, in
much the same ways. It's just that, if you do it right, your app will be almost
inside-out.

In Python, you use ``socket.setblocking(False)`` to make it non-blocking. In C, it's
more complex, (for one thing, you'll need to choose between the POSIX flavor
``O_NONBLOCK`` and the older, almost indistinguishable ``O_NDELAY``, which
is completely different from ``TCP_NODELAY``), but it's the exact same idea. You
do this after creating the socket, but before using it. (Actually, if you're
nuts, you can switch back and forth.)

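A minimal sketch of what this looks like from Python, using
``socket.socketpair()`` as a stand-in for a real connection (an assumption of
this example). In Python 3, a call on a non-blocking socket that cannot proceed
right away raises ``BlockingIOError`` instead of waiting::

   import socket

   a, b = socket.socketpair()
   b.setblocking(False)

   try:
       b.recv(4096)             # nothing has been sent yet
       would_block = False
   except BlockingIOError:
       # the call came back at once, without having received anything
       would_block = True

   a.close()
   b.close()
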
The major mechanical difference is that ``send``, ``recv``, ``connect`` and
``accept`` can return without having done anything. You have (of course) a
number of choices. You can check return codes and error codes and generally drive
yourself crazy. If you don't believe me, try it sometime. Your app will grow
large, buggy and suck CPU. So let's skip the brain-dead solutions and do it
right.

Use ``select``.

In C, coding ``select`` is fairly complex. In Python, it's a piece of cake, but
it's close enough to the C version that if you understand ``select`` in Python,
you'll have little trouble with it in C::

   ready_to_read, ready_to_write, in_error = select.select(
       potential_readers,
       potential_writers,
       potential_errs,
       timeout)

You pass ``select`` three lists: the first contains all sockets that you might
want to try reading; the second all the sockets you might want to try writing
to, and the last (normally left empty) those that you want to check for errors.
You should note that a socket can go into more than one list. The ``select``
call is blocking, but you can give it a timeout. This is generally a sensible
thing to do - give it a nice long timeout (say a minute) unless you have good
reason to do otherwise.

In return, you will get three lists. They contain the sockets that are actually
readable, writable and in error. Each of these lists is a subset (possibly
empty) of the corresponding list you passed in.

If a socket is in the output readable list, you can be
as-close-to-certain-as-we-ever-get-in-this-business that a ``recv`` on that
socket will return *something*. Same idea for the writable list. You'll be able
to send *something*. Maybe not all you want to, but *something* is better than
nothing. (Actually, any reasonably healthy socket will return as writable - it
just means outbound network buffer space is available.)

If you have a "server" socket, put it in the ``potential_readers`` list. If it comes
out in the readable list, your ``accept`` will (almost certainly) work. If you
have created a new socket to ``connect`` to someone else, put it in the
``potential_writers`` list. If it shows up in the writable list, you have a decent
chance that it has connected.

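Putting the "server" socket in the readers list might be sketched like this. The
``select_pass`` function, the echo behavior, and the use of ``'localhost'`` with
port 0 (any free port) are assumptions made for illustration. For brevity the
sockets are left in blocking mode and only the readable list is used::

   import select
   import socket

   def select_pass(serversocket, clients, timeout=60):
       # One pass of the loop: wait until something is readable, then
       # either accept a new connection or echo data back.
       potential_readers = [serversocket] + clients
       ready_to_read, _, _ = select.select(potential_readers, [], [], timeout)
       for sock in ready_to_read:
           if sock is serversocket:
               clientsocket, address = serversocket.accept()
               clients.append(clientsocket)
           else:
               data = sock.recv(4096)
               if data:
                   sock.sendall(data)      # echo what was received
               else:
                   clients.remove(sock)    # 0 bytes: the peer closed
                   sock.close()

   serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   serversocket.bind(("localhost", 0))
   serversocket.listen(5)
   clients = []

   peer = socket.create_connection(serversocket.getsockname())
   select_pass(serversocket, clients, timeout=5)    # accepts the connection
   peer.sendall(b"ping")
   select_pass(serversocket, clients, timeout=5)    # echoes the data
   echoed = peer.recv(4096)

   peer.close()
   for c in clients:
       c.close()
   serversocket.close()
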
One very nasty problem with ``select``: if somewhere in those input lists of
sockets is one which has died a nasty death, the ``select`` will fail. You then
need to loop through every single damn socket in all those lists and do a
``select([sock],[],[],0)`` until you find the bad one. That timeout of 0 means
it won't take long, but it's ugly.

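The hunt for the bad socket could be sketched as below. The
``find_bad_sockets`` name is hypothetical, and the sketch assumes that the
failure shows up as ``OSError`` or ``ValueError``, which is what Python raises
for an invalid or already closed descriptor::

   import select
   import socket

   def find_bad_sockets(socks):
       # Probe each socket on its own with a timeout of 0 and collect
       # the ones that make select() fail.
       bad = []
       for sock in socks:
           try:
               select.select([sock], [], [], 0)
           except (OSError, ValueError):
               bad.append(sock)
       return bad

   a, b = socket.socketpair()
   b.close()                     # simulate a socket that has gone bad
   bad = find_bad_sockets([a, b])
   a.close()
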
Actually, ``select`` can be handy even with blocking sockets. It's one way of
determining whether you will block - the socket returns as readable when there's
something in the buffers. However, this still doesn't help with the problem of
determining whether the other end is done, or just busy with something else.

**Portability alert**: On Unix, ``select`` works with both sockets and
files. Don't try this on Windows. On Windows, ``select`` works with sockets
only. Also note that in C, many of the more advanced socket options are done
differently on Windows. In fact, on Windows I usually use threads (which work
very, very well) with my sockets. Face it, if you want any kind of performance,
your code will look very different on Windows than on Unix.


Performance
-----------

There's no question that the fastest sockets code uses non-blocking sockets and
``select`` to multiplex them. You can put together something that will saturate a
LAN connection without putting any strain on the CPU. The trouble is that an app
written this way can't do much of anything else - it needs to be ready to
shuffle bytes around at all times.

Assuming that your app is actually supposed to do something more than that,
threading is the optimal solution, (and using non-blocking sockets will be
faster than using blocking sockets). Unfortunately, threading support in Unixes
varies both in API and quality. So the normal Unix solution is to fork a
subprocess to deal with each connection. The overhead for this is significant
(and don't do this on Windows - the overhead of process creation is enormous
there). It also means that unless each subprocess is completely independent,
you'll need to use another form of IPC, say a pipe, or shared memory and
semaphores, to communicate between the parent and child processes.

Finally, remember that even though blocking sockets are somewhat slower than
non-blocking, in many cases they are the "right" solution. After all, if your
app is driven by the data it receives over a socket, there's not much sense in
complicating the logic just so your app can wait on ``select`` instead of
``recv``.