Preview: Reliable UDP implementation, lockstep, LAN, and parity bit checking

Published October 08, 2015 by Denis Ivanov, posted by polyfrag

Introduction
Whether you're interested in making an FPS or an RTS, you've probably heard that you should use UDP, mostly because of its speed. Using TCP with the TCP_NODELAY option (which disables Nagle's algorithm, so small writes are sent immediately instead of waiting for more data to be buffered) may not be enough, as TCP also does congestion control, and you may want to do voice chat, which is better done over a lossy protocol like UDP. TCP gives you little control over its "sliding window", which means you might not reach the full speed of the communication channel if it has a high delay (search for "bandwidth-delay product" for more information). If a packet is lost in TCP, all the data behind it is held back until the lost packet is retransmitted and arrives, resulting in pauses. The TCP header is also at least 20 bytes, as opposed to 8 bytes in UDP plus a few for reliability. Combining TCP and UDP is not a good option either, as one can induce packet loss in the other.
So you should use UDP, but how do you guarantee that packets are delivered, and in the order they were sent? Also, if you're making an RTS, how do you make sure that all clients are running exactly the same simulation? A small difference can cause a butterfly effect and decide whether a player wins or loses. For this you need reliable UDP (RUDP) and lockstep. In addition, you probably want to use the latest client-server methodology, which keeps the game from being held back by the slowest player, as happens in the older peer-to-peer model. Along the way we'll cover parity bit checking to ensure packet correctness and integrity. We'll also cover LAN networking and setting up a matchmaking server.

This free article version covers only the UDP implementation. If you are interested in the full-breadth of the topic discussed above, you can purchase the complete 70-page PDF document on the GDNet Marketplace.
Library
The library that I use for networking is SDL2_net. It is just an abstraction of networking that lets us deploy to different platforms like Windows, Linux, iPhone, and Android. If you want, you can use WinSock or the native networking functions of Linux. They're all pretty much the same, with a few differences in initialization and function names. But I would recommend SDL2_net, as it is cross-platform. Watch for a possible article on setting up and compiling libraries for details on how to set up SDL2_net for use in your projects if you aren't able to do this yourself. By the way, if you're doing voice chat, try PortAudio for getting microphone input on Windows, Mac, and Linux. For transmitting, you'd need Speex or the newer Opus codec to encode the data.
Packet Header
So let's jump in. The way you keep track of the order packets are sent in, and also guarantee delivery, is by using an ack or sequence number. We will make a header that sits at the beginning of each packet we send. We also need to know what kind of packet it is.
    struct PacketHeader
    {
        unsigned short type;
        unsigned short ack;
    };
To make sure the compiler packs the packet data as tightly as possible, to reduce network data usage, we can remove byte padding by putting this around all of our packet definitions.
    // byte-align structures
    #pragma pack(push, 1)
    // ... packets go here ...
    // default alignment
    #pragma pack(pop)
By the way, strictly speaking they're not packets; a unit of data sent over UDP is called a datagram. UDP stands for "User Datagram Protocol". But I call them packets.
Control/Protocol Packets
Let's define some control/protocol packets that will be part of our reliable UDP protocol.
#define PACKET_NULL 0 #define PACKET_DISCONNECT 1 #define PACKET_CONNECT 2 #define PACKET_ACKNOWLEDGMENT 3 #define PACKET_NOCONN 4 #define PACKET_KEEPALIVE 5 #define PACKET_NACK 6 struct BasePacket { PacketHeader header; }; typedef BasePacket NoConnectionPacket; typedef BasePacket AckPacket; typedef BasePacket KeepAlivePacket; struct ConnectPacket { PacketHeader header; bool reconnect; unsigned short yourlastrecvack; unsigned short yournextrecvack; unsigned short yourlastsendack; }; struct DisconnectPacket { PacketHeader header; }; We will send a ConnectPacket to establish a new connection to a computer that is listening on a certain port. If you're behind a router and have several computers behind it, your router doesn't know which computer to forward an incoming connection to unless that computer has already sent data from that port outside. This is why a matchmaking server is needed, unless playing on LAN. More will be covered later. We will send a DisconnectPacket to end a connection. We need to send an AckPacket (acknowledgment) whenever we receive a packet type that is supposed to be reliable (that needs to arrive and be processed in order and cannot be lost). If the other side doesn't receive the ack, it will keep resending until it gets one or times out and assumes the connection to be lost. This is called "selective repeat" reliable UDP. Occasionally, we might not need to send anything for a long time but tell the other side to keep our connection alive, for example if we've connected to a matchmaking server and want to tell it to keep our game in the list. When this happens, we need to send a KeepAlivePacket. Almost all of these packets are reliable, as defined in the second paragraph on this page, except for the AckPacket and NoConnectionPacket. When we're first establishing a connection, we will use the ack number to set as the start of our sequence. 
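The "keep resending until acked or timed out" behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: PendingPacket, DueForResend, and the two timing constants are hypothetical names and values, not part of the article's code, and times are plain millisecond counters.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record for one reliable packet that is waiting for its ack.
struct PendingPacket {
    std::uint16_t ack;        // sequence number carried in the packet header
    std::uint64_t firstSent;  // time of the first transmission (ms)
    std::uint64_t lastSent;   // time of the most recent transmission (ms)
    bool acked;               // set once the matching AckPacket arrives
};

// Assumed tuning values, chosen only for illustration.
const std::uint64_t RESEND_DELAY_MS = 200;
const std::uint64_t TIMEOUT_MS = 30000;

// Returns the sequence numbers that should be resent at time `now` and
// stamps them as resent. Sets `timedOut` if any unacked packet has been
// waiting longer than TIMEOUT_MS, meaning the connection should be dropped.
inline std::vector<std::uint16_t> DueForResend(std::vector<PendingPacket>& pending,
                                               std::uint64_t now, bool& timedOut)
{
    std::vector<std::uint16_t> due;
    timedOut = false;
    for (PendingPacket& p : pending) {
        assert(now >= p.lastSent);  // the clock must not run backwards
        if (p.acked) continue;      // already delivered
        if (now - p.firstSent > TIMEOUT_MS) { timedOut = true; continue; }
        if (now - p.lastSent >= RESEND_DELAY_MS) {
            due.push_back(p.ack);
            p.lastSent = now;
        }
    }
    return due;
}
```

Each packet is tracked and resent on its own, which is what makes the scheme "selective": one lost packet does not force later ones to be sent again.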
When there's an interruption and the other side has dropped us due to timeout, but we still have a connection to them, they will send a NoConnectionPacket, which is not reliable, but is sent every time we receive a packet from an unknown source (whose address we don't recognize as having established a connection to). Whenever this happens, we have to do a full reconnect as the sequence/ack numbers can't be recovered. This is important because we will have a list of packets that we didn't get a reply to that we will resend that need to be acknowledged. You will understand more as we talk about how ack/sequence numbers work. Lastly, the AckPacket's ack member is used to tell the other side which packet we're acknowledging. So an ack packet is not reliable, and is sent on an as-needed basis, when we receive a packet from the other side. User Control/Protocol Packets The rest of the packet types are user-defined (that's you) and depend on the type of application needed. If you're making a strategy game, you'll have a packet type to place a building, or to order units around. But there are also some control/protocol packets needed that are not part of the core protocol. These are packets for joining game sessions, getting game host information, getting a list of game hosts, informing a game client that the game room is full or that his version is older. Here are some suggested packet types for a multiplayer RTS or any kind of strategy game, but may also apply to FPS. These are used after a connection has been established. 
    #define PACKET_JOIN 7
    #define PACKET_ADDSV 8
    #define PACKET_ADDEDSV 9
    #define PACKET_GETSVLIST 10
    #define PACKET_SVADDR 11
    #define PACKET_SVINFO 12
    #define PACKET_GETSVINFO 13
    #define PACKET_SENDNEXTHOST 14
    #define PACKET_NOMOREHOSTS 15
    #define PACKET_ADDCLIENT 16
    #define PACKET_SELFCLIENT 17
    #define PACKET_SETCLNAME 18
    #define PACKET_CLIENTLEFT 19
    #define PACKET_CLIENTROLE 20
    #define PACKET_DONEJOIN 21
    #define PACKET_TOOMANYCL 22
    #define PACKET_MAPCHANGE 23
    #define PACKET_CLDISCONNECTED 24
    #define PACKET_CLSTATE 25
    #define PACKET_CHAT 26
    #define PACKET_MAPSTART 27
    #define PACKET_GAMESTARTED 28
    #define PACKET_WRONGVERSION 29
    #define PACKET_LANCALL 30
    #define PACKET_LANANSWER 31
Lockstep Control Packets
Once a game has been joined and started, RTS games also need to control the simulation to keep it in sync on all sides. We will call the class of packet types that control the lockstep protocol the lockstep control packets.
    // every packet type shares one number space, so these values
    // must not collide with any of the types defined above or below
    #define PACKET_NETTURN 37
    #define PACKET_DONETURN 38
User Command / In-Game Packets
Any user command or input that affects the simulation needs to be bundled up inside a NetTurnPacket by the host and sent to all the clients as part of lockstep. But first it is sent to the host on its own. More on this later. Here are some example packets I use.
    #define PACKET_PLACEBL 32
    #define PACKET_CHVAL 33
    #define PACKET_ORDERMAN 34
    #define PACKET_MOVEORDER 35
    #define PACKET_PLACECD 36
Initialization
You should really read some C/C++ UDP or SDL2_net UDP example code before you proceed with this; the basics are not what I'm covering here. Nevertheless, I will mention that you need to initialize SDL2_net or WinSock before you use it.
    if(SDLNet_Init() == -1)
    {
        char msg[1280];
        sprintf(msg, "SDLNet_Init: %s\n", SDLNet_GetError());
        SDL_ShowSimpleMessageBox(SDL_MESSAGEBOX_ERROR, "Error", msg, NULL);
    }
Here are some examples for low-level (not using SDL) UDP networking:
https://www.cs.rutgers.edu/~pxk/417/notes/sockets/udp.html
http://www.binarytides.com/programming-udp-sockets-c-linux/
http://www.codeproject.com/Articles/11740/A-simple-UDP-time-server-and-client-for-beginners
They all cover the same material. You just need to know the basics of how to initialize, send data, and receive data using UDP (not TCP).
Net Update Loop
Somewhere in your frame loop, where you update your game state and render everything, you will add a call to UpdNet, which processes received packets. The skeleton of it will look roughly like this.
    //Net input
    void UpdNet()
    {
        int recvd;
        UDPpacket *in;
        UDPsocket* sock = &g_sock;
        if(!sock) return;
        in = SDLNet_AllocPacket(65535);
        if(!in) return;
        do
        {
            in->data[0] = 0;
            //SDLNet_UDP_Recv returns 1 if a packet was received,
            //0 if there was none and -1 on error; the size is in in->len
            recvd = SDLNet_UDP_Recv(*sock, in);
            if(recvd > 0)
            {
                IPaddress ip;
                memcpy(&ip, &in->address, sizeof(IPaddress));
                TranslatePacket((char*)in->data, in->len, true, &g_sock, &ip);
            }
        } while(recvd > 0);
        SDLNet_FreePacket(in);
    }
Here is the equivalent using regular Berkeley sockets and recvfrom, in case you're not using SDL2_net:
    //Net input
    //g_sock is a non-blocking UDP socket descriptor here, and TranslatePacket
    //would take an int* and a sockaddr_in* instead of the SDL2_net types
    void UpdNet()
    {
        int bytes;
        int* sock = &g_sock;
        if(!sock) return;
        do
        {
            struct sockaddr_in from;
            socklen_t fromlen = sizeof(struct sockaddr_in);
            char buffer[65535];
            bytes = recvfrom(g_sock, buffer, sizeof(buffer), 0, (struct sockaddr *)&from, &fromlen);
            if(bytes > 0)
                TranslatePacket(buffer, bytes, true, &g_sock, &from);
        } while(bytes > 0);
    }
This basically loops while we still have data to process in the buffer.
If we do, we send it to the TranslatePacket function, which takes as parameters:
- the buffer of data ("buffer")
- the number of bytes received ("bytes")
- whether to check the acknowledgement/sequence number and process the packet in order, or to process it right away ("checkprev"); we pass false when we feed a stored dummy packet back through TranslatePacket to reuse our command switch functionality for lockstep batch packets
- the socket ("sock")
- the IP address and port it came from ("from")
    void TranslatePacket(char* buffer, int bytes, bool checkprev, UDPsocket* sock, IPaddress* from)
We will get to TranslatePacket next, but first take a look at these function calls that we will also have at the end of the UpdNet function.
    KeepAlive();
    CheckConns();
    ResendPacks();
    #ifndef MATCHMAKER
    CheckAddSv();
    CheckGetSvs();
    #else
    SendSvs();
    #endif
KeepAlive() tries to keep alive connections that are about to time out. CheckConns() checks for connections that we've closed or that have timed out from unresponsiveness, and recycles them. ResendPacks() resends packets that we haven't received an acknowledgement for, once they've waited long enough. Then we have a preprocessor check to see whether we're compiling the matchmaking server or the client program/app. If we're a game client or host, we only care about CheckAddSv() and CheckGetSvs(). CheckAddSv() checks whether our game host address is up on the matchmaker list, and gets the information for each of the hosts we've received. It also removes hosts from our list that we have lost a connection to (due to timeout). CheckGetSvs() makes sure the matchmaker will send us the next host from its list if we've requested a host list. If we're the matchmaker, however, we only care about SendSvs(), which sends the next host in the list to each requesting client once the last one has been ack'd.
Connection Class
Because there's no concept of a connection in UDP, we need to make it ourselves.
    class NetConn
    {
    public:
        unsigned short nextsendack;
        unsigned short lastrecvack;
        bool handshook;
        IPaddress addr;
        //TODO change these to flags
        bool isclient;  //is this a hosted game's client? or for MATCHMAKER, is this somebody requesting sv list?
        bool isourhost; //is this the currently joined game's host? cannot be a host from a server list or something. for MATCHMAKER, it can be a host getting added to sv list.
        bool ismatch;   //matchmaker?
        bool ishostinfo; //is this a host we're just getting info from for our sv list?
        //bool isunresponsive;
        unsigned long long lastsent;
        unsigned long long lastrecv;
        short client;
        float ping;
        bool closed;
        bool disconnecting;
        void expirein(int millis);
    #ifdef MATCHMAKER
        int svlistoff; //offset in server list, sending a few at a time
        SendSvInfo svinfo;
    #endif
        //void (*chcallback)(NetConn* nc, bool success); //connection state change callback - did we connect successfully or time out?
        NetConn()
        {
            client = -1;
            handshook = false;
            nextsendack = 0;
            //important - reply ConnectPacket with ack=0 will be
            //ignored as copy (even though it is original) if new NetConn's lastrecvack=0.
            lastrecvack = USHRT_MAX;
            isclient = false;
            isourhost = false;
            ismatch = false;
            ishostinfo = false;
            //isunresponsive = false;
            lastrecv = GetTicks();
            lastsent = GetTicks();
            //chcallback = NULL;
    #ifdef MATCHMAKER
            svlistoff = -1;
    #endif
            ping = 1;
            closed = false;
            disconnecting = false;
        }
    };
"nextsendack" is the outgoing sequence number that the next reliable packet will have. We increment it by one each time, and it wraps around to 0 when it maxes out. "lastrecvack" is the inbound sequence number of the last received reliable packet (the next one will be greater until it wraps around). Because we can send and receive independently, we keep two acks/sequences. When we start a connection, "nextsendack" is 0 (for the first packet sent, the ConnectPacket), and "lastrecvack" is 65535 (USHRT_MAX), which is the maximum unsigned short value, before it wraps around to 0.
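The code in this article calls NextAck() and PrevAck() without showing them. A minimal sketch of how they could be written, assuming they do nothing more than step a sequence number forward or backward with wrap-around (the bodies below are an assumption, not the author's code):

```cpp
#include <cassert>
#include <climits>

// Assumed definition: the sequence number after `ack`, wrapping to 0.
inline unsigned short NextAck(unsigned short ack)
{
    return (ack == USHRT_MAX) ? 0 : (unsigned short)(ack + 1);
}

// Assumed definition: the sequence number before `ack`, wrapping to USHRT_MAX.
inline unsigned short PrevAck(unsigned short ack)
{
    return (ack == 0) ? USHRT_MAX : (unsigned short)(ack - 1);
}

// Usage: a fresh NetConn has lastrecvack = USHRT_MAX, so the next expected
// inbound sequence number is 0, matching the ack of the first ConnectPacket.
inline void CheckAckWrap()
{
    assert(NextAck(USHRT_MAX) == 0);
    assert(PrevAck(0) == USHRT_MAX);
}
```

Keeping the wrap-around in one place avoids scattering casts through the code, since `ack + 1` on its own is promoted to int and does not wrap.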
nextsendack can (and arguably should) start at any value, as long as the ack of the first ConnectPacket is set to it. A starting value that is hard to predict is probably more secure and protects the data better (but I haven't tried it). "handshook" tells us whether the other side has acknowledged the ConnectPacket (and therefore created an accompanying NetConn connection instance for us). When we receive a ConnectPacket, we set "handshook" to true and acknowledge it, recording the inbound ack and setting "nextsendack" to 0. If "handshook" is true, we can send reliable packets on the connection to this address, as the sequence numbers are in place. IPaddress is the SDL2_net IP address and port structure, the equivalent of sockaddr_in in regular Berkeley sockets.
Translating Packets
This is what TranslatePacket does. When we "translate" or process a packet that is meant to be reliable, as defined previously, we want to know whether it's an old packet that we've already processed or whether it's ahead of the next expected packet. We also acknowledge any packets that are reliable. And then finally we execute them. The game host does an extra part at the beginning of the function: it checks whether any client connections are unresponsive or have become responsive again and relays that information to the other clients, so the blame can be pinned on the lagger. The TranslatePacket() function has this basic outline:
1. Match address to a connection
2. Update last received timestamp if match found
3. Check packet type if we need to check sequence number or if we process it right away
4. Check sequence number for one of three cases (behind, current, or future)
5. Acknowledge packet if needed
6. If we don't recognize the connection and it's supposed to be reliable, tell the other side that we don't have a connection with them
7. Execute the packet
8. Execute any buffered packets after the current one (in order)
9. And update the last received sequence number to the last packet executed
Step by step, the function is:
    void TranslatePacket(char* buffer, int bytes, bool checkprev, UDPsocket* sock, IPaddress* from)
    {
        //1. Match address to a connection
        PacketHeader* header = (PacketHeader*)buffer;
        NetConn* nc = Match(from);
We pass an IPaddress struct pointer to Match, which returns the matching connection, or NULL on failure. Then, if we got a match, we update the last received time for the connection. If the connection is associated with a client in the game room, and that client was previously unresponsive, we can mark it as responsive again and tell the other clients.
        //If we recognize this connection...
        if(nc)
        {
            //2. Update the timestamp of the last received packet
            nc->lastrecv = GetTicks();
    #ifndef MATCHMAKER
            //check if was previously unresponsive
            //and if (s)he was, tell others that (s)he
            //is now responsive.
            if(nc->client >= 0)
            {
                Client* c = &g_client[nc->client];
                //was this client unresponsive?
                if(c->unresp)
                {
                    //it's now responsive again.
                    c->unresp = false;
                    //if we're the game host
                    if(g_netmode == NETM_HOST)
                    {
                        //inform others
                        ClStatePacket csp;
                        csp.header.type = PACKET_CLSTATE;
                        csp.chtype = CLCH_RESP;
                        csp.client = nc->client;
                        //send to all except the original client (nc->addr)
                        SendAll((char*)&csp, sizeof(ClStatePacket), true, false, &nc->addr);
                    }
                }
            }
    #endif
        }
Certain packets are meant to be processed right away, without checking that they arrive in sequence. For example, acknowledgement packets are non-reliable and don't need to be processed in a specific order. Connect packets are executed as soon as they are received, because no other packet is supposed to be sent with them. The same goes for disconnect packets, "no connection" packets, LAN call, and LAN answer.
        //3. Check packet type if we need to check sequence number or if we process it right away
        //control packets
        //don't check sequence for these ones and process them straight away
        //but acknowledge CONNECT and DISCONNECT
        switch(header->type)
        {
        case PACKET_ACKNOWLEDGMENT:
        case PACKET_CONNECT: //need to send back ack
        case PACKET_DISCONNECT: //need to send back ack
        case PACKET_NOCONN:
        case PACKET_NACK:
        case PACKET_LANCALL:
        case PACKET_LANANSWER:
            checkprev = false;
            break;
        default:
            break;
        }
If it's not one of those packet types, checkprev stays true, and we check the sequence number. These are reliable packets that must be processed in the order they were sent in. If we're missing a packet in the sequence, we buffer the packets after it while we wait for the missing packet to arrive. "next" will be the next expected sequence number (the one after lastrecvack in NetConn). "last" will be updated each time we execute a packet, so that we can update lastrecvack with the last one.
        unsigned short next; //next expected packet ack
        unsigned short last = PrevAck(header->ack); //last packet ack to be executed
        //4. Check sequence number for one of three cases (behind, current, or future)
        //If checkprev was set (directly above), we need to check the sequence.
        //It must be a recognized NetConn; otherwise we don't have any sequence numbers.
        if(checkprev && nc != NULL)
        {
            // ... check sequence number (see snippet further down) ...
        }
Then we acknowledge the packet if it's meant to be reliable. Acknowledgement packets don't need acknowledgements themselves. A "no connection" packet tells us the other side doesn't even have sequence numbers for us, so there's no point acknowledging it. Usually, if checkprev=false, we don't check the packet sequence, so we don't care about acknowledging it, but connect and disconnect packets must be acknowledged because the other side expects a success signal back.
        //5.
Acknowledge packet if needed procpack: //We might disconnect further down in PacketSwitch() //So acknowledge packets while we still have the sequence numbers nc = Match(from); //Don't acknowledge NoConn packets as they are non-reliable, //and ack'ing them would cause a non-ending ack loop. if(header->type != PACKET_ACKNOWLEDGMENT && header->type != PACKET_NOCONN && sock && nc) { Acknowledge(header->ack, nc, from, sock, buffer, bytes); } //Always acknowledge ConnectPacket's else if( header->type == PACKET_CONNECT && sock ) { Acknowledge(header->ack, NULL, from, sock, buffer, bytes); } //And acknowledge DisconnectPacket's else if(header->type == PACKET_DISCONNECT && sock) { Acknowledge(header->ack, NULL, from, sock, buffer, bytes); } If we got disconnected from the other side and for some reason they retained the connection, we'll get packets that we have to tell the other side we can't process. They can then show an error to the user or try to reconnect. //6. If we don't recognize the connection and it's supposed to be reliable, tell the other side that we don't have a connection with them //We're getting an anonymous packet. //Maybe we've timed out and they still have a connection. //Tell them we don't have a connection. //We check if sock is set to make sure this isn't a local //command packet being executed. if(!nc && header->type != PACKET_CONNECT && header->type != PACKET_NOCONN && header->type != PACKET_LANCALL && header->type != PACKET_LANANSWER && sock) { NoConnectionPacket ncp; ncp.header.type = PACKET_NOCONN; SendData((char*)&ncp, sizeof(NoConnectionPacket), from, false, true, NULL, &g_sock, 0, NULL); return; } Then we execute packets. First, any packets before the current received one. Then the one we just received. And then any that we buffered that come after it. 
The reason we execute packets that came BEFORE is because we may have a case like this: packet 1 received packet 2 received packet 5 received We'll be able to execute packets 1 and 2 even though the current is 5. updinack: //7. Execute the packet //8. Execute any buffered packets after the current one (in order) // Translate in order if(checkprev && nc) { last = PrevAck(header->ack); last = ParseRecieved(next, last, nc); } // Translate in order if(NextAck(last) == header->ack || !checkprev) { PacketSwitch(header->type, buffer, bytes, nc, from, sock); last = header->ack; } // Translate in order if(checkprev && nc && last == header->ack) { while(true) { if(!Recieved(last+1, last+1, nc)) break; last++; ParseRecieved(last, last, nc); } } Finally, we update the received sequence number. We have to match up the connection pointer with the address again, because the instance it was pointing to might have been erased, or it might have appeared when it wasn't previously, as a connection is erased or created respectively. For non-reliable packets we don't update the sequence number. For connect or disconnect packets, we only set the sequence number inside the PacketSwitch read function call when we create a connection. //9. And update the last received sequence number to the last packet executed //have to do this again because PacketSwitch might //read a ConnectPacket, which adds new connections. //also connection might have //been Disconnected(); and erased. nc = Match(from); //ack Connect packets after new NetConn added... //Don't acknowledge NoConn packets as they are non-reliable if(header->type != PACKET_ACKNOWLEDGMENT && header->type != PACKET_NOCONN && sock && nc && checkprev) { if(header->type != PACKET_CONNECT && header->type != PACKET_DISCONNECT) nc->lastrecvack = last; } } The PacketSwitch() at the end is what executes the packet. It might better be called ExecPacket(). 
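Step 8 of the outline ("execute any buffered packets after the current one, in order") can be illustrated with a small stand-alone sketch. It swaps the article's global g_recv list for a hypothetical per-connection std::map keyed by sequence number; SeqBuffer, NextSeq, and DrainInOrder are names made up for this example, and `exec` stands in for whatever PacketSwitch() would do.

```cpp
#include <cassert>
#include <climits>
#include <map>
#include <vector>

typedef std::vector<char> Bytes;
typedef std::map<unsigned short, Bytes> SeqBuffer; // hypothetical receive buffer

// Sequence number after `s`, wrapping from USHRT_MAX to 0.
inline unsigned short NextSeq(unsigned short s)
{
    return (s == USHRT_MAX) ? 0 : (unsigned short)(s + 1);
}

// Executes every buffered packet that directly follows `last`, in order,
// removing each one from the buffer. Stops at the first gap.
// Returns the sequence number of the last packet executed, or `last`
// itself if nothing was executed.
template <typename Exec>
unsigned short DrainInOrder(SeqBuffer& buffered, unsigned short last, Exec exec)
{
    for (;;) {
        SeqBuffer::iterator it = buffered.find(NextSeq(last));
        if (it == buffered.end()) break;   // gap: wait for the missing packet
        exec(it->first, it->second);
        last = it->first;
        buffered.erase(it);
    }
    assert(buffered.find(NextSeq(last)) == buffered.end());
    return last;
}
```

With packets 3, 4, and 6 buffered and `last` equal to 2, this executes 3 and 4, returns 4, and leaves 6 waiting for packet 5.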
The Match() function at the top compares the "addr" port and IP address integers to every known connection and returns the match, or NULL on failure. NetConn* Match(IPaddress* addr) { if(!addr) return NULL; for(auto ci=g_conn.begin(); ci!=g_conn.end(); ci++) if(Same(&ci->addr, addr)) return &*ci; return NULL; } bool Same(IPaddress* a, IPaddress* b) { if(a->host != b->host) return false; if(a->port != b->port) return false; return true; } The packet is "old" if we've already buffered it (but it's ahead of the next expected ack/sequence number that we processed), or if its ack/sequence is behind our connection class's "lastrecvack". We use an unsigned short for the sequence number, which holds a maximum value of 65535. Because we might exceed this value after 36 minutes if we send 30 packets a second, we wrap around and thus, there's a "sliding window" of values that are considered to be in the past (don't confuse this with the "sliding window" packet range that might be being reliably resent at any given moment). We can check if an ack is in the past (behind what is already executed) using PastAck(): bool PastAck(unsigned short test, unsigned short current) { return ((current >= test) && (current - test <= USHRT_MAX/2)) || ((test > current) && (test - current > USHRT_MAX/2)); } Where PastAck tests whether "test" is behind or at "current". Let's look in more detail at the part in the middle of TranslatePacket that checks the sequence number. We define some variables. "next" will hold the current expected ack (lastrecvack+1) inside the following code block. "last" will hold the last packet to have been executed. For now, it's set to something, but it doesn't matter, as we update it at the end. unsigned short next; //next expected packet ack unsigned short last = PrevAck(header->ack); //last packet ack to be executed We only check the sequence numbers if it's a packet that makes checkprev=true and if it's from a recognized connection. 
    if(checkprev && nc != NULL)
    {
We set the "next" expected packet number.
        next = NextAck(nc->lastrecvack); //next expected packet ack
        last = next; //last packet ack to be executed
Next, we check how the received packet's sequence number compares to the next expected one.
        //CASE #1: "old" packet
        if(PastAck(header->ack, nc->lastrecvack) || Recieved(header->ack, header->ack, nc))
        {
            Acknowledge(header->ack, nc, from, sock, buffer, bytes);
            return;
        }
        //CASE #2: current packet (the next expected packet)
        if(header->ack == next)
        {
            // Translate packet
            last = next;
        }
        //CASE #3: an unbuffered, future packet
        else // More than +1 after lastrecvack?
        {
            /* last will be updated to the last executed packet at the end.
            for now it will hold the last buffered packet to be executed. */
            unsigned short checklast = PrevAck(header->ack);
            if(Recieved(next, checklast, nc))
            {
                // Translate in order
                last = checklast;
                goto procpack;
            }
            else
            {
                AddRecieved(buffer, bytes, nc);
                if(Recieved(next, checklast, nc))
                {
                    // Translate in order
                    last = checklast;
                    goto procpack;
                }
                else
                {
                    //TODO
                    //how to find which ack was missed, have to go through all buffered
                    //this is something somebody smart can do in the future
                    //NAckPacket nap;
                    //nap.header.type = PACKET_NACK;
                    //nap.header.ack =
                }
            }
        }
    }
As can be seen, there are three possible cases for the inbound packet's sequence number: it is either 1.) behind or buffered, 2.) current expected, or 3.) future unbuffered.
Case 1: behind or buffered received packets
If we've already dealt with (executed) the packet, we simply acknowledge it again and return from TranslatePacket() with no further action.
    if(PastAck(header->ack, nc->lastrecvack) || Recieved(header->ack, header->ack, nc))
    {
        Acknowledge(header->ack, nc, from, sock, buffer, bytes);
        return;
    }
In the second test of the if statement (the packet has been received and buffered), we check whether we've already buffered it, using Recieved():
    //check whether we've received a packet range [first,last] inclusive
    bool Recieved(unsigned short first, unsigned short last, NetConn* nc)
    {
        OldPacket* p;
        PacketHeader* header;
        unsigned short current = first;
        unsigned short afterlast = NextAck(last);
        bool missed;
        //go through all the received packets and check if we have the complete range [first,last]
        do
        {
            //for each number in the sequence...
            missed = true;
            //look through each packet from that address
            for(auto i=g_recv.begin(); i!=g_recv.end(); i++)
            {
                p = &*i;
                header = (PacketHeader*)p->buffer;
                //is this the sequence number we're looking for?
                if(header->ack != current)
                    continue;
                //is this the correct address?
                if(!Same(&p->addr, &nc->addr))
                    continue;
                //go to next number in the sequence now that we know we have the previous one
                current = NextAck(current);
                missed = false;
                break;
            }
            //if we finished the inner loop and "missed" is still true, we missed a number in the sequence, so return false
            if(missed && current != afterlast)
                return false;
            //continue looping until we've arrived at the number after the "last" number
        } while(current != afterlast);
        //if we got here, we got all the numbers
        return true;
    }
"g_recv" is a linked list of OldPacket's. We go through each sequence number between "first" and "last" and check that we have each and every one. Because we pass the received packet's ack number for both parameters in Case 1, we only check whether we've buffered that one packet. Because g_recv holds inbound packets from every address we're connected to, we have to match the address as well when comparing ack numbers. You could store the received packets in each NetConn instead, which might be more efficient.
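The suggestion of keeping the buffer inside each connection could look something like the sketch below. RecvBuffer and its members are hypothetical names; the idea is only that a container keyed by sequence number removes the need to compare addresses and replaces the inner linear search with a lookup.

```cpp
#include <cassert>
#include <climits>
#include <map>
#include <vector>

// Hypothetical per-connection receive buffer, keyed by sequence number.
struct RecvBuffer {
    std::map<unsigned short, std::vector<char> > packets;

    // Stores a copy of the packet bytes under its sequence number.
    void Add(unsigned short ack, const char* data, int len)
    {
        assert(data != 0 && len >= 0);
        packets[ack] = std::vector<char>(data, data + len);
    }

    // True if every sequence number in [first,last] is buffered.
    // The range is inclusive and may wrap around past USHRT_MAX.
    bool HasRange(unsigned short first, unsigned short last) const
    {
        unsigned short current = first;
        for (;;) {
            if (packets.find(current) == packets.end()) return false;
            if (current == last) return true;
            current = (current == USHRT_MAX) ? 0 : (unsigned short)(current + 1);
        }
    }
};
```

A NetConn would then own one RecvBuffer, and a call such as Recieved(next, checklast, nc) would turn into a HasRange(next, checklast) call on that connection's buffer.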
Buffered Packets
The OldPacket class holds the byte array for the packet and the address and port of the sender (or the outbound port and address for outgoing buffered packets).
    class OldPacket
    {
    public:
        char* buffer;
        int len;
        unsigned long long last; //last time resent
        unsigned long long first; //first time sent
        bool expires;
        bool acked; //used for outgoing packets
        //sender/receiver
        IPaddress addr;
        void (*onackfunc)(OldPacket* op, NetConn* nc);
        void freemem()
        {
            if(len <= 0)
                return;
            if(buffer != NULL)
                delete [] buffer;
            buffer = NULL;
        }
        OldPacket()
        {
            len = 0;
            buffer = NULL;
            onackfunc = NULL;
            acked = false;
            //also give the remaining fields defined values
            expires = false;
            last = 0;
            first = 0;
        }
        ~OldPacket()
        {
            freemem();
        }
        OldPacket(const OldPacket& original)
        {
            len = 0;
            buffer = NULL;
            *this = original;
        }
        OldPacket& operator=(const OldPacket &original)
        {
            //guard against self-assignment, which would free the buffer we're about to copy
            if(this == &original)
                return *this;
            freemem();
            if(original.buffer && original.len > 0)
            {
                len = original.len;
                buffer = new char[len];
                memcpy((void*)buffer, (void*)original.buffer, len);
                last = original.last;
                first = original.first;
                expires = original.expires;
                acked = original.acked;
                addr = original.addr;
                onackfunc = original.onackfunc;
            }
            else
            {
                buffer = NULL;
                len = 0;
                onackfunc = NULL;
            }
            return *this;
        }
    };
It has some extra fields for outbound packets.
Case 2: current expected received packets
The second case is when the received packet is the next expected one, which means we received it in the correct order without repeats. The next expected (current) packet is the one after the "last received" one (lastrecvack). The variable "next" here holds that ack. It is equal to nc->lastrecvack + 1, wrapped back to 0 after 65535, so if you use plain arithmetic instead of the function "NextAck", make sure the result is converted to unsigned short before you compare it.
    next = NextAck(nc->lastrecvack); //next expected packet ack
    last = next; //last packet ack to be executed
    //CASE #2: current packet (the next expected packet)
    if(header->ack == next)
    {
        // Translate packet
        last = next;
    }
If it matches "next", we will process the packet and acknowledge it further down. We record the "last" packet executed, to update the sequence number.
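To see how the three cases fall out of the wrap-around comparison, here is a small stand-alone sketch. PastAck is copied from the article; SeqCase and ClassifySeq are hypothetical names added for illustration, and the "already buffered" half of Case 1 (the Recieved() check) is left out to keep the example self-contained.

```cpp
#include <cassert>
#include <climits>

enum SeqCase { SEQ_OLD, SEQ_CURRENT, SEQ_FUTURE };

// As given in the article: is `test` at or behind `current`?
inline bool PastAck(unsigned short test, unsigned short current)
{
    return ((current >= test) && (current - test <= USHRT_MAX/2))
        || ((test > current) && (test - current > USHRT_MAX/2));
}

// Classifies an inbound sequence number against the last executed one.
inline SeqCase ClassifySeq(unsigned short ack, unsigned short lastrecvack)
{
    // next expected sequence number, wrapping from USHRT_MAX to 0
    unsigned short next = (lastrecvack == USHRT_MAX) ? 0 : (unsigned short)(lastrecvack + 1);
    if (PastAck(ack, lastrecvack)) return SEQ_OLD;  // Case 1 (behind)
    if (ack == next) return SEQ_CURRENT;            // Case 2
    return SEQ_FUTURE;                              // Case 3
}

// Usage: on a fresh connection (lastrecvack = USHRT_MAX) the first packet,
// with ack 0, is the current one rather than an old one.
inline void CheckClassifySeq()
{
    assert(ClassifySeq(0, USHRT_MAX) == SEQ_CURRENT);
    assert(ClassifySeq(USHRT_MAX, 0) == SEQ_OLD);
}
```

The second assertion shows why a plain `ack <= lastrecvack` test would not be enough: just after a wrap, the largest sequence number is the one that lies in the past.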
Case 3: future, unbuffered received packets
If we reach "else", it means we have an unbuffered, future packet.
    //CASE #3: an unbuffered, future packet
    else // More than +1 after lastrecvack?
    {
        /* last will be updated to the last executed packet at the end.
        for now it will hold the last buffered packet to be executed. */
        unsigned short checklast = PrevAck(header->ack);
        if(Recieved(next, checklast, nc))
        {
            // Translate in order
            last = checklast;
            goto procpack;
        }
        else
        {
            AddRecieved(buffer, bytes, nc);
            if(Recieved(next, checklast, nc))
            {
                // Translate in order
                last = checklast;
                goto procpack;
            }
            else
            {
                //TODO
                //how to find which ack was missed, have to go through all buffered
                //this is something somebody smart can do in the future
                //NAckPacket nap;
                //nap.header.type = PACKET_NACK;
                //nap.header.ack =
            }
        }
    }
We check whether we have a range of buffered packets up to this one. If we have a complete range, starting from the current (expected next) packet, we can execute them (because we only run them in the order they were sent in) and increase lastrecvack to equal "last". We move up lastrecvack at the end of TranslatePacket. We might have more buffered packets after the received one. That is why we check for any extra packets and store the last executed one's ack number in "last". If we don't have a complete set of packets up to the received one, we call AddRecieved (buffer it).
    void AddRecieved(char* buffer, int len, NetConn* nc)
    {
        OldPacket* p;
        //add an empty entry to the list, then fill it in place
        g_recv.push_back(OldPacket());
        p = &*g_recv.rbegin();
        p->freemem();
        p->addr = nc->addr;
        p->buffer = new char[ len ];
        p->len = len;
        memcpy((void*)p->buffer, (void*)buffer, len);
    }
If we have to buffer it, it means it's ahead of the last executed packet, and there's at least one missing before it.
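The TODO in the code above asks how to find which ack was missed. One possible approach, offered only as a sketch and not as the author's method, is to walk the expected range and report the first sequence number that isn't buffered. FirstMissingAck is a hypothetical helper, and the std::set stands in for the buffered sequence numbers of one connection.

```cpp
#include <cassert>
#include <climits>
#include <set>

// Walks [first,last] (inclusive, wrapping past USHRT_MAX) and looks for the
// first sequence number that is not in `buffered`. Returns true and stores
// it in `missing` if there is a gap; returns false if the range is complete.
inline bool FirstMissingAck(const std::set<unsigned short>& buffered,
                            unsigned short first, unsigned short last,
                            unsigned short& missing)
{
    unsigned short current = first;
    for (;;) {
        if (buffered.count(current) == 0) {
            missing = current;
            return true;
        }
        if (current == last) return false;
        current = (current == USHRT_MAX) ? 0 : (unsigned short)(current + 1);
    }
}

// Usage: with 5, 6, and 8 buffered, the gap in [5,8] is 7.
inline void CheckFirstMissingAck()
{
    std::set<unsigned short> buffered;
    buffered.insert(5);
    buffered.insert(6);
    buffered.insert(8);
    unsigned short missing = 0;
    assert(FirstMissingAck(buffered, 5, 8, missing) && missing == 7);
}
```

The number found this way is what would go into the ack field of a PACKET_NACK, if that packet type were implemented.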
If we wanted to send selective repeats only every second or so, we could use NAcks (negative acks) to tell us when we've missed a packet. That would make sense if a second was the delay on the channel, we didn't want to send some three copies of a packet before receiving an ack back, we were sure that packet loss was minimal, and we'd rather leave the "sliding window" huge. But selective repeat works pretty well. (Using NAcks is a different kind of reliable UDP implementation.)

Acknowledgements

Further on we send acks.

    void Acknowledge(unsigned short ack, NetConn* nc, IPaddress* addr, UDPsocket* sock, char* buffer, int bytes)
    {
        AckPacket p;
        p.header.type = PACKET_ACKNOWLEDGMENT;
        p.header.ack = ack;

        SendData((char*)&p, sizeof(AckPacket), addr, false, true, nc, sock, 0, NULL);
    }

We use a SendData function for our RUDP implementation, shown and explained further down. Whenever we send data, we have to fill out a packet struct for that type of packet. At minimum, we have to set header.type so that the receiving end can tell what packet type it is by reading the first 2 bytes of the packet.

Executing Packet and Updating Sequence Number

If we get to this point in TranslatePacket, we'll execute the packets in order. If we checked sequence numbers, and we have a connection, we'll execute the buffered previous packets, then the current received packet, then check for any future buffered packets. If we don't check the sequence, or don't have a connection, we just execute the one packet we received.
    updinack:

        // Translate in order
        if(checkprev && nc)
        {
            last = header->ack;
            last = ParseRecieved(next, last, nc);
        }

        // Translate in order
        if(NextAck(last) == header->ack || !checkprev)
        {
            PacketSwitch(header->type, buffer, bytes, nc, from, sock);
            last = header->ack;
        }

        // Translate in order
        if(checkprev && nc && last == header->ack)
        {
            while(true)
            {
                if(!Recieved(last+1, last+1, nc))
                    break;

                last++;
                ParseRecieved(last, last, nc);
            }
        }

        //have to do this again because PacketSwitch might
        //read a ConnectPacket, which adds new connections.
        //but also the connection might have
        //been Disconnected(); and erased.
        nc = Match(from);

        //ack Connect packets after new NetConn added...
        //Don't acknowledge NoConn packets as they are non-reliable
        if(header->type != PACKET_ACKNOWLEDGMENT && header->type != PACKET_NOCONN && sock && nc && checkprev)
        {
            if(header->type != PACKET_CONNECT && header->type != PACKET_DISCONNECT)
                nc->lastrecvack = last;
        }

At the end we update the connection's "lastrecvack" to the "last" one executed. If it's a ConnectPacket, we set the lastrecvack when reading the packet.

Executing a buffered packet range

We need to execute a packet range when we know we've got a complete sequence up to a certain ack. We return the last executed packet number here, in case it's behind "last".
    unsigned short ParseRecieved(unsigned short first, unsigned short last, NetConn* nc)
    {
        OldPacket* p;
        PacketHeader* header;

        unsigned short current = first;
        unsigned short afterlast = NextAck(last);

        do
        {
            bool execd = false;

            //find the buffered packet with the ack we need next
            for(auto i=g_recv.begin(); i!=g_recv.end(); i++)
            {
                p = &*i;
                header = (PacketHeader*)p->buffer;

                if(header->ack != current)
                    continue;

                if(!Same(&p->addr, &nc->addr))
                    continue;

                PacketSwitch(header->type, p->buffer, p->len, nc, &p->addr, &g_sock);
                execd = true;
                current = NextAck(current);

                i = g_recv.erase(i);
                break;
            }

            if(execd)
                continue;

            //the next ack is not buffered, so stop here
            break;
        } while(current != afterlast);

        return PrevAck(current);
    }

SendData

We send data by passing the data bytes, the size, the address, whether the packet is meant to be reliable, and whether we want it to expire after a certain time of resending (like a ConnectPacket that needs to fail sooner than the default timeout). We also pass the NetConn connection (which mustn't be NULL if we're sending a reliable packet), the socket, and the millisecond delay if we want to queue the packet to send a few moments from now. The last parameter is a callback function to be called when the packet is acknowledged, so we can take further action (like setting "handshook" to true for ConnectPackets, or destroying the NetConn when a DisconnectPacket is acknowledged).

    void SendData(char* data, int size, IPaddress * paddr, bool reliable, bool expires, NetConn* nc, UDPsocket* sock, int msdelay, void (*onackfunc)(OldPacket* p, NetConn* nc))
    {
        //is this packet supposed to be reliable?
if(reliable) { //if so, set the ack number ((PacketHeader*)data)->ack = nc->nextsendack; //and add an OldPacket to the g_outgo list OldPacket* p; g_outgo.push_back(OldPacket()); p = &*g_outgo.rbegin(); p->freemem(); p->buffer = new char[ size ]; p->len = size; memcpy(p->buffer, data, size); memcpy((void*)&p->addr, (void*)paddr, sizeof(IPaddress)); //in msdelay milliseconds, p.last will be RESEND_DELAY millisecs behind GetTicks() p->last = GetTicks() + msdelay - RESEND_DELAY; p->first = p->last; p->expires = expires; p->onackfunc = onackfunc; //update outbound ack for this connection nc->nextsendack = NextAck(nc->nextsendack); } if(reliable && msdelay > 0) return; PacketHeader* ph = (PacketHeader*)data; if(reliable && (!nc || !nc->handshook) && (ph->type != PACKET_CONNECT && ph->type != PACKET_DISCONNECT && ph->type != PACKET_ACKNOWLEDGMENT && ph->type != PACKET_NOCONN) ) { Connect(paddr, false, false, false, false); return; } memcpy(out->data, data, size); out->len = size; out->data[size] = 0; SDLNet_UDP_Unbind(*sock, 0); if(SDLNet_UDP_Bind(*sock, 0, (const IPaddress*)paddr) == -1) { char msg[1280]; sprintf(msg, "SDLNet_UDP_Bind: %s\n",SDLNet_GetError()); ErrMess("Error", msg); //printf("SDLNet_UDP_Bind: %s\n",SDLNet_GetError()); //exit(7); } //sendto(g_socket, data, size, 0, (struct addr *)paddr, sizeof(struct sockaddr_in)); SDLNet_UDP_Send(*sock, 0, out); g_transmitted += size; SDLNet_FreePacket(out); } If it's reliable, we add an entry to the outbound OldPacket list. We set the "last" member variable of the OldPacket entry such that it is resent in a certain amount of time depending on when we delayed it to and the usual resend delay. If it's reliable and the delay is greater than 0, we don't take any action in this function after buffering it in the outbound list because we will send it after ResendPacks() is called. If it's reliable and we don't have a connection specified, we call Connect() to connect first, and return. 
It is also called if the connection hasn't finished the handshake (in which case Connect() will check to make sure that we have an outgoing ConnectPacket). The only case in which we don't need a handshook connection and send reliably is if we're sending a ConnectPacket or DisconnectPacket. The SendData function is called itself with "reliable" set to false when resending a reliable packet from a buffered outbound OldPacket container. The SendData function automatically sets the outbound ack for the reliable packets. Keeping Connections Alive As mentioned, there are three more functions in the UpdNet loop function: KeepAlive(); CheckConns(); ResendPacks(); The KeepAlive() function sends KeepAlive packets to connections that are expiring. It prevents the other side from closing the connection, and also triggers an ack packet back, preventing from the connection being closed locally. The default is to keep connections alive until the user decides to Disconnect them. //keep expiring connections alive (try to) void KeepAlive() { unsigned long long nowt = GetTicks(); auto ci = g_conn.begin(); //loop while we still have more connections to process... while(g_conn.size() > 0 && ci != g_conn.end()) { //if we haven't received a handshake back, or if it's closed, we don't need to be keep it alive if(!ci->handshook || ci->closed) { ci++; continue; } //otherwise, if it's reached a certain percent of the timeout period, send a KeepAlivePacket... if(nowt - ci->lastrecv > NETCONN_TIMEOUT/4) { //check if we're already trying to send a packet to get a reply bool outgoing = false; //check all outgoing packets for a packet to this address for(auto pi=g_outgo.begin(); pi!=g_outgo.end(); pi++) { //if(memcmp(&pi->addr, &ci->addr, sizeof(IPaddress)) != 0) if(!Same(&pi->addr, &ci->addr)) { continue; } outgoing = true; break; } //if we have an outgoing packet, we don't have to send a KeepAlivePacket if(outgoing) { ci++; continue; } //otherwise, send a KeepAlivePacket... 
KeepAlivePacket kap; kap.header.type = PACKET_KEEPALIVE; SendData((char*)&kap, sizeof(KeepAlivePacket), &ci->addr, true, false, &*ci, &g_sock, 0, NULL); } //check next connection next ci++; } } GetTicks() is our 64-bit timestamp function in milliseconds: unsigned long long GetTicks() { #ifdef PLATFORM_WIN SYSTEMTIME st; GetSystemTime (&st); _FILETIME ft; SystemTimeToFileTime(&st, &ft); //convert from 100-nanosecond intervals to milliseconds return (*(unsigned long long*)&ft)/(10*1000); #else struct timeval tv; gettimeofday(&tv, NULL); return (unsigned long long)(tv.tv_sec) * 1000 + (unsigned long long)(tv.tv_usec) / 1000; #endif } Checking and Pruning Connections Two more functions in UpdNet: CheckConns(); ResendPacks(); In CheckConns we do several things: 1. Send out periodic pings for all the players in the room for all the clients using Cl(ient)StatePacket's 2. Handle and close any connections that are not yet closed but have timed out because the last received message has been longer than NETCONN_TIMEOUT milliseconds ago 3. For closed connections, flush any buffered inbound or outbound OldPacket's, and erase the NetConn from the list 4. 
For unresponsive clients, inform other players of the lagger void CheckConns() { unsigned long long now = GetTicks(); // If we're not compiling for the matchmaker (the game app itself) #ifndef MATCHMAKER static unsigned long long pingsend = GetTicks(); //send out client pings if(g_netmode == NETM_HOST && now - pingsend > (NETCONN_UNRESP/2) ) { pingsend = now; for(int i=0; i; if(!c->on) continue; if(i == g_localC) continue; //clients will have their own ping for the host NetConn* nc = c->nc; if(!nc) continue; ClStatePacket csp; csp.header.type = PACKET_CLSTATE; csp.chtype = CLCH_PING; csp.ping = nc->ping; csp.client = i; SendAll((char*)&csp, sizeof(ClStatePacket), true, false, NULL); } } #endif auto ci = g_conn.begin(); while(g_conn.size() > 0 && ci != g_conn.end()) { //get rid of timed out connections if(!ci->closed && now - ci->lastrecv > NETCONN_TIMEOUT) { //TO DO any special condition handling, inform user about sv timeout, etc. #ifndef MATCHMAKER if(ci->ismatch) { g_sentsvinfo = false; } else if(ci->isourhost) { EndSess(); RichText mess = RichText("ERROR: Connection to host timed out."); Mess(&mess); } else if(ci->ishostinfo) ; //ErrMess("Error", "Connection to prospective game host timed out."); else if(ci->isclient) { //ErrMess("Error", "Connection to client timed out."); /* TODO combine ClDisconnectedPacket and ClientLeftPacket. use params to specify conditions of leaving: - of own accord - timed out - kicked by host */ //TODO inform other clients? 
ClDisconnectedPacket cdp; cdp.header.type = PACKET_CLDISCONNECTED; cdp.client = ci->client; cdp.timeout = true; SendAll((char*)&cdp, sizeof(ClDisconnectedPacket), true, false, &ci->addr); Client* c = &g_client[ci->client]; RichText msg = c->name + RichText(" timed out."); AddChat(&msg); } #else g_log<closed = true; //Close it using code below } //get rid of closed connections if(ci->closed) { if(&*ci == g_mmconn) { g_sentsvinfo = false; g_mmconn = NULL; } if(&*ci == g_svconn) g_svconn = NULL; #ifndef MATCHMAKER for(int cli=0; clion) continue; if(c->nc == &*ci) { if(g_netmode == NETM_HOST) { } if(c->player >= 0) { Player* py = &g_player[c->player]; py->on = false; py->client = -1; } c->player = -1; c->on = false; } } #endif //necessary to flush? already done in ReadDisconnectPacket(); //might be needed if connection can become ->closed another way. FlushPrev(&ci->addr); ci = g_conn.erase(ci); continue; } //inform other clients of unresponsive clients //or inform local player or unresponsive host if(now - ci->lastrecv > NETCONN_UNRESP && ci->isclient) //make sure this is not us or a matchmaker { #ifndef MATCHMAKER NetConn* nc = &*ci; Client* c = NULL; if(nc->client >= 0) c = &g_client[nc->client]; if(g_netmode == NETM_CLIENT && nc->isourhost) { //inform local player TODO c->unresp = true; } else if(g_netmode == NETM_HOST && nc->isclient && c) { //inform others if(c->unresp) { ci++; continue; //already informed } c->unresp = true; ClStatePacket csp; csp.header.type = PACKET_CLSTATE; csp.chtype = CLCH_UNRESP; csp.client = c - g_client; SendAll((char*)&csp, sizeof(ClStatePacket), true, false, &nc->addr); } #endif } ci++; } } Resending Packets Finally, ResendPacks(): void ResendPacks() { OldPacket* p; unsigned long long now = GetTicks(); //remove expired ack'd packets auto i=g_outgo.begin(); while(i!=g_outgo.end()) { p = &*i; if(!p->acked) { i++; continue; } //p->last and first might be in the future due to delayed sends, //which would cause an overflow for unsigned long 
long. unsigned long long safelast = enmin(p->last, now); unsigned long long passed = now - safelast; unsigned long long safefirst = enmin(p->first, now); if(passed < RESEND_EXPIRE) { i++; continue; } i = g_outgo.erase(i); } //resend due packets within sliding window i=g_outgo.begin(); while(i!=g_outgo.end()) { p = &*i; //kept just in case it needs to be recalled by other side if(p->acked) { i++; continue; } unsigned long long safelast = enmin(p->last, now); unsigned long long passed = now - safelast; unsigned long long safefirst = enmin(p->first, now); NetConn* nc = Match(&p->addr); //increasing resend delay for the same outgoing packet unsigned int nextdelay = RESEND_DELAY; unsigned long long firstpassed = now - safefirst; if(nc && firstpassed >= RESEND_DELAY) { unsigned long long sincelast = safelast - safefirst; //30, 60, 90, 120, 150, 180, 210, 240, 270 nextdelay = ((sincelast / RESEND_DELAY) + 1) * RESEND_DELAY; } if(passed < nextdelay) { i++; continue; } PacketHeader* ph = (PacketHeader*)p->buffer; /* If we don't have a connection to them and it's not a control packet, we need to connect to them to send reliably. Send it when we get a handshake back. */ if((!nc || !nc->handshook) && ph->type != PACKET_CONNECT && ph->type != PACKET_DISCONNECT && ph->type != PACKET_ACKNOWLEDGMENT && ph->type != PACKET_NOCONN) { Connect(&p->addr, false, false, false, false); i++; continue; } //do we want a sliding window? //edit: this is not correct, don't use this, it will cause a blockage #if 0 if(nc) { unsigned short lastack = nc->nextsendack + SLIDING_WIN - 1; if(PastAck(lastack, ph->ack) && ph->ack != lastack) { i++; continue; //don't resend more than SLIDING_WIN packets ahead } } #endif if(p->expires && now - safefirst > RESEND_EXPIRE) { i = g_outgo.erase(i); continue; } SendData(p->buffer, p->len, &p->addr, false, p->expires, nc, &g_sock, 0, NULL); p->last = now; i++; } } We 1.) erase OldPacket's that have been acknowledged (acked = true), 2.) 
check if the OldPacket in question is within the sliding window (this check is currently compiled out with #if 0 because it caused a blockage), 3.) resend those OldPacket's whose resend delay has elapsed, 4.) and erase OldPacket's that are set to expire.

"enmin" and "enmax" are just the min and max macros:

    #define enmax(a,b) (((a)>(b))?(a):(b))
    #define enmin(a,b) (((a)<(b))?(a):(b))

We don't want the "firstpassed" value (the amount of time that has passed since the OldPacket was first sent) to be negative. For an unsigned 64-bit long long, a negative result wraps around to a giant positive number, so we clamp "safefirst", which is used in the calculation, to be no later than "now", from which it is subtracted. If we didn't do this, delayed packets would behave unpredictably, with some getting resent and some getting erased.

    unsigned long long safelast = enmin(p->last, now);
    unsigned long long passed = now - safelast;
    unsigned long long safefirst = enmin(p->first, now);

    NetConn* nc = Match(&p->addr);

    //increasing resend delay for the same outgoing packet
    unsigned int nextdelay = RESEND_DELAY;
    unsigned long long firstpassed = now - safefirst;

Reading Acknowledgements

Whenever we receive an AckPacket, we call ReadAckPacket on it in PacketSwitch:

    void ReadAckPacket(AckPacket* ap, NetConn* nc, IPaddress* from, UDPsocket* sock)
    {
        OldPacket* p;
        PacketHeader* header;

        for(auto i=g_outgo.begin(); i!=g_outgo.end(); i++)
        {
            p = &*i;
            header = (PacketHeader*)p->buffer;

            if(header->ack == ap->header.ack && Same(&p->addr, from))
            {
                if(!nc)
                    nc = Match(from);

                if(nc)
                {
                    nc->ping = (float)(GetTicks() - i->first);
                }

                if(p->onackfunc)
                    p->onackfunc(p, nc);

                i = g_outgo.erase(i);
                return;
            }
        }
    }

In it, we check for the matching buffered outbound OldPacket, and erase it from the list if found. But before that, we call the registered callback function that was set up when the packet was sent. Subtracting the "first" time the packet was sent from the current time gives the round-trip latency for that connection, which we record in the NetConn class.
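The clamping trick and the growing resend delay from ResendPacks() can be isolated into two small helpers, sketched below. `ElapsedSince` and `NextResendDelay` are hypothetical names that do not appear in the article. The formula mirrors the `nextdelay` computation shown in ResendPacks(), minus the check for a valid NetConn, and the value 30 used in the usage note is only inferred from the "30, 60, 90..." comment in the code.

```cpp
#include <algorithm>
#include <cstdint>

namespace sketch {

// Milliseconds elapsed since `then`. If `then` lies in the future (for
// example a delayed send), the result is 0 instead of a huge wrapped value.
inline std::uint64_t ElapsedSince(std::uint64_t now, std::uint64_t then) {
    return now - std::min(then, now);
}

// Delay to wait before the next resend of a packet that was first sent
// at `first` and last sent at `last`. The delay grows by one
// `resendDelay` step per step already spent resending.
inline std::uint64_t NextResendDelay(std::uint64_t first, std::uint64_t last,
                                     std::uint64_t now,
                                     std::uint64_t resendDelay) {
    const std::uint64_t safefirst = std::min(first, now);
    const std::uint64_t safelast = std::min(last, now);

    // Still within the first interval: use the base delay.
    if (now - safefirst < resendDelay)
        return resendDelay;

    // ElapsedSince also guards against last < first.
    const std::uint64_t sincelast = ElapsedSince(safelast, safefirst);
    return (sincelast / resendDelay + 1) * resendDelay;
}

} // namespace sketch
```

With a base delay of 30 ms, a packet whose last resend was 90 ms after its first send would wait 120 ms before going out again, while a packet scheduled in the future simply reports zero elapsed time and the base delay.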
Callbacks on Acknowledgement Whenever we send a DisconnectPacket, we set the callback function to: void OnAck_Disconnect(OldPacket* p, NetConn* nc) { if(!nc) return; nc->closed = true; //to be cleaned up this or next frame } Which will clean up the connection and stop resending the DisconnectPacket once it's acknowledged. It's best to encapsulate the needed functionality so we can safely Disconnect. void Disconnect(NetConn* nc) { nc->disconnecting = true; //check if we already called Disconnect on this connection //and have an outgoing DisconnectPacket bool out = false; for(auto pit=g_outgo.begin(); pit!=g_outgo.end(); pit++) { if(!Same(&pit->addr, &nc->addr)) continue; PacketHeader* ph = (PacketHeader*)pit->buffer; if(ph->type != PACKET_DISCONNECT) continue; out = true; break; } if(!out) { DisconnectPacket dp; dp.header.type = PACKET_DISCONNECT; SendData((char*)&dp, sizeof(DisconnectPacket), &nc->addr, true, false, nc, &g_sock, 0, OnAck_Disconnect); } } When we receive an acknowledgement of a ConnectPacket that we sent out, we also need to set "handshook" to true. You can set user callbacks for certain special connections, like matchmakers or game hosts, to carry out certain functions, like immediately polling for servers, or getting server info, or joining the game room. 
//on connect packed ack'd void OnAck_Connect(OldPacket* p, NetConn* nc) { if(!nc) nc = Match(&p->addr); if(!nc) return; nc->handshook = true; ConnectPacket* scp = (ConnectPacket*)p->buffer; //if(!scp->reconnect) { #ifndef MATCHMAKER GUI* gui = &g_gui; if(nc->isourhost) { g_svconn = nc; //TO DO request data, get ping, whatever, server info JoinPacket jp; jp.header.type = PACKET_JOIN; std::string name = g_name.rawstr(); if(name.length() >= PYNAME_LEN) name[PYNAME_LEN] = 0; strcpy(jp.name, name.c_str()); jp.version = VERSION; SendData((char*)&jp, sizeof(JoinPacket), &nc->addr, true, false, nc, &g_sock, 0, NULL); } #endif if(nc->ishostinfo) { //TO DO request data, get ping, whatever, server info GetSvInfoPacket gsip; gsip.header.type = PACKET_GETSVINFO; SendData((char*)&gsip, sizeof(GetSvInfoPacket), &nc->addr, true, false, nc, &g_sock, 0, NULL); } #ifndef MATCHMAKER if(nc->ismatch) { g_mmconn = nc; g_sentsvinfo = false; if(g_reqsvlist && !g_reqdnexthost) { g_reqdnexthost = true; GetSvListPacket gslp; gslp.header.type = PACKET_GETSVLIST; SendData((char*)&gslp, sizeof(GetSvListPacket), &nc->addr, true, false, nc, &g_sock, 0, NULL); } } #endif } } You can see there's a commented out function pointer called "chcallback" in the NetConn class, which might be given a function to call when the connection is handshook, instead of hard-coding several cases for the connection type ("ismatch", "isourhost", etc.) Connecting Before we host a server or connect to the matchmaker, we must open a socket. 
void OpenSock() { unsigned short startport = PORT; if(g_sock) { IPaddress* ip = SDLNet_UDP_GetPeerAddress(g_sock, -1); if(!ip) g_log<<"SDLNet_UDP_GetPeerAddress: "<port); SDLNet_UDP_Close(g_sock); g_sock = NULL; } if(g_sock = SDLNet_UDP_Open(startport)) return; //try 10 ports #ifndef MATCHMAKER for(int i=0; i<10; i++) { if(!(g_sock = SDLNet_UDP_Open(PORT+i))) continue; return; } #endif char msg[1280]; sprintf(msg, "SDLNet_UDP_Open: %s\n", SDLNet_GetError()); g_log< This OpenSock method will try 10 different port numbers if the first one fails. If it still doesn't work, it will log a message from SDLNet. After we open a port, we can send packets. The OpenSock method is encapsulated in the Connect() function. We can call the first Connect method, which takes an IP string or domain address, or the second, which accepts an IPaddress struct. They also accept some parameters to describe their use, like whether the connection is the matchmaker, the host being joined, a client of our room, or a random server we're getting info on. NetConn* Connect(const char* addrstr, unsigned short port, bool ismatch, bool isourhost, bool isclient, bool ishostinfo) { IPaddress ip; //translate the web address string to an IP and port number if(SDLNet_ResolveHost(&ip, addrstr, port) == -1) { return NULL; } //call the following function... return Connect(&ip, ismatch, isourhost, isclient, ishostinfo); } //Safe to call more than once, if connection already established, this will just //update NetConn booleans. 
NetConn* Connect(IPaddress* ip, bool ismatch, bool isourhost, bool isclient, bool ishostinfo) { if(!g_sock) OpenSock(); NetConn* nc = Match(ip); NetConn newnc; bool isnew = false; //if we don't recognize this address as having a connection to, make a new NetConn instance for the list if(!nc) { isnew = true; newnc.addr = *ip; newnc.handshook = false; newnc.lastrecv = GetTicks(); newnc.lastsent = newnc.lastrecv; //important - reply ConnectPacket with ack=0 will be //ignored as copy (even though it is original) if new NetConn's lastrecvack=0. newnc.lastrecvack = USHRT_MAX; newnc.nextsendack = 0; newnc.closed = false; g_conn.push_back(newnc); nc = &*g_conn.rbegin(); } else { //force reconnect (sending ConnectPacket). //also important for Click_SL_Join to know that we //can't send a JoinPacket immediately after this function, //but must wait for a reply ConnectPacket. if(nc->closed) nc->handshook = false; } bool disconnecting = false; //if we have an outgoing DisconnectPacket, set disconnecting=true for(auto pit=g_outgo.begin(); pit!=g_outgo.end(); pit++) { OldPacket* op = &*pit; if(!Same(&op->addr, &nc->addr)) continue; PacketHeader* ph = (PacketHeader*)op->buffer; if(ph->type != PACKET_DISCONNECT) continue; disconnecting = true; break; } //if we're closing this connection, don't send any other reliable packets on it except DisconnectPacket and clear any outbound or inbound OldPacket's if(disconnecting) { nc->handshook = false; FlushPrev(&nc->addr); } //different connection purposes //only "true" it, or retain current state of nc->... nc->isclient = isclient ? true : nc->isclient; nc->isourhost = isourhost ? true : nc->isourhost; nc->ismatch = ismatch ? true : nc->ismatch; nc->ishostinfo = ishostinfo ? true : nc->ishostinfo; if(isourhost) g_svconn = nc; if(ismatch) g_mmconn = nc; //see if we need to connect for realsies (send a ConnectPacket). //i.e., send a connect packet and clean previous packets (OldPacket's list). 
if(!nc->handshook) { bool sending = false; //sending ConnectPacket? unsigned short yourlastrecvack = PrevAck(nc->nextsendack); //check if we have an outgoing ConnectPacket for(auto pi=g_outgo.begin(); pi!=g_outgo.end(); pi++) { if(!Same(&pi->addr, &nc->addr)) continue; PacketHeader* ph = (PacketHeader*)pi->buffer; if(PastAck(PrevAck(ph->ack), yourlastrecvack)) yourlastrecvack = PrevAck(ph->ack); if(ph->type != PACKET_CONNECT) continue; sending = true; break; } if(!sending) { ConnectPacket cp; cp.header.type = PACKET_CONNECT; cp.reconnect = false; cp.yourlastrecvack = yourlastrecvack; cp.yournextrecvack = nc->nextsendack; cp.yourlastsendack = nc->lastrecvack; SendData((char*)&cp, sizeof(ConnectPacket), ip, isnew, false, nc, &g_sock, 0, OnAck_Connect); } } nc->closed = false; return nc; } When closing a connection, or connecting again after a connection had been disconnected, we flush any buffered in- or out-bound OldPacket's. //flush all previous incoming and outgoing packets from this addr void FlushPrev(IPaddress* from) { auto it = g_outgo.begin(); while(it!=g_outgo.end()) { if(!Same(&it->addr, from)) { it++; continue; } it = g_outgo.erase(it); } it = g_recv.begin(); while(it!=g_recv.end()) { if(!Same(&it->addr, from)) { it++; continue; } it = g_recv.erase(it); } } We read a ConnectPacket like this: void ReadConnectPacket(ConnectPacket* cp, NetConn* nc, IPaddress* from, UDPsocket* sock) { bool isnew = false; if(!nc) { nc = Match(from, cp->header.senddock); if(!nc) { nc = Match(from, 0); if(nc) nc->dock = cp->header.senddock; } if(!nc) { isnew = true; NetConn newnc; newnc.addr = *from; newnc.handshook = true; newnc.lastrecvack = cp->header.ack; newnc.nextsendack = 0; newnc.lastrecv = GetTicks(); newnc.closed = false; g_conn.push_back(newnc); nc = &*g_conn.rbegin(); } } if( nc && ( nc->ismatch || nc == g_mmconn ) ) { nc->ismatch = true; g_mmconn = nc; g_sentsvinfo = false; } nc->handshook = true; nc->closed = false; if(isnew) { nc->lastrecvack = cp->header.ack; 
nc->nextsendack = 0; } else { FlushPrev(&nc->addr); nc->lastrecvack = cp->header.ack; nc->nextsendack = 0; } #endif } And the DisconnectPacket: void ReadDisconnectPacket(DisconnectPacket* dp, NetConn* nc, IPaddress* from, UDPsocket* sock) { if(!nc) nc = Match(from, dp->header.senddock); if(!nc) return; if(nc->isourhost) { g_svconn = NULL; #ifndef MATCHMAKER EndSess(); RichText mess = STRTABLE[STR_HOSTDISC]; //host disconnected Mess(&mess); #endif //TODO message box to inform that host left the game and that game is over } if(nc->ismatch) { g_mmconn = NULL; g_sentsvinfo = false; } for(auto ci=g_conn.begin(); ci!=g_conn.end(); ci++) if(&*ci == nc) { ci->closed = true; FlushPrev(&ci->addr, ci->dock); break; } #ifndef MATCHMAKER //get rid of client if(nc->client >= 0) { Client* c = &g_client[nc->client]; if(g_netmode == NETM_HOST) { //inform other clients ClientLeftPacket clp; clp.header.type = PACKET_CLIENTLEFT; clp.client = nc->client; SendAll((char*)&clp, sizeof(ClientLeftPacket), true, false, &nc->addr, nc->dock); RichText msg = c->name + STRTABLE[STR_LEFT]; AddChat(&msg); } ResetCl(c); } #endif nc->client = -1; } Conclusion That is all for this article. If you want to see the rest of the article covering parity bit checking, lockstep, and LAN networking, purchase the full article here: http://www.gamedev.net/files/file/223-reliable-udp-implementation-lockstep-lan-and-parity-bit-checking/ Article Update Log 4 Apr 2016: Forgot to tell you how to respond to a ConnectPacket and DisconnectPacket. 29 Dec 2015: Fixed some bugs. The first line here // Translate in order if(checkprev && nc) { while(true) { if(!Recieved(last+1, last+1, nc)) break; last++; ParseRecieved(last, last, nc); } } Should be if(checkprev && nc && last == header->ack) Also, commented out the sliding window because it is an incorrect implementation and causes a blockage. 
And in Recieved() the line

    if(missed)

should be

    if(missed && current != afterlast)

I changed how OldPacket's are added to g_recv and g_outgo in AddRecieved and SendData. And this

    header = (PacketHeader*)&p->buffer;

is now this

    header = (PacketHeader*)p->buffer;

6 Oct 2015: Initial release


Comments

jbadams

Just a quick staff note:

We've never had "premium" content before, and we didn't actively seek this out; Denis produced and submitted all of this by himself and we thought we may as well give it a go and see the community reaction -- we wouldn't want our article section filled with small low-quality snippets, but as Denis seems to have put quite a bit of effort into producing a detailed and valuable "free sample" we thought it might be received positively.

In addition to feedback on the article itself, we would love if you stopped by our Comments, Suggestions & Ideas forum to give your feedback on this free intro, premium content model. Note also that it's not something we'll be actively encouraging either way, but if it allows us to attract authors of high quality content we'll continue to allow it -- if it's very poorly received we'll take that on board and disallow future submissions of this type.

October 08, 2015 10:15 PM
Krohm

This article has many small issues I could let go. A few issues really need to be addressed, but when I read about SDLNet_UDP_Recv / recvfrom polling I lost most of my positivity. I do not consider socket polling an adequate pattern in any application.

This article tries to attack the problem on all fronts at once and eventually makes a blur of everything. I cannot really enumerate all the problems I've matched, after all, they just add up to something to be reworked but I'll try.

  • The statement about RUDP being a requirement for predictable evolution is at most partially true. We know - and we can even observe - games are allowed to diverge state across machines within limited bounds. This might be not particularly evident in modern games but it was very easily observable. Different approaches have been tried and while most of them require at least some level of reliability, "delta frames" are a thing and yes, they can be dropped. Even the article later agrees with that as not all packets are required to be confirmed.
  • It is suggested to use #pragma pack to align structures so the packets are as tight as possible. No. Just No. You don't send structs over the wire. You don't send live structs over the wire. This pragma might have performance implications (we cannot say for sure given the code).
  • Macros gone wild, not a good practice. There's no reason to use a macro here, the PACKET_* macros are broken: they enumerate packet types from different abstraction levels. Consider PACKET_CONNECT (communication level handshake), PACKET_GETSVINFO (application level, post "connect" information), PACKET_MAPSTART (gameplay-level event), PACKET_MOVEORDER (high-frequency, higher abstraction gameplay event).
  • Do not allocate buffers on stack so easily. This is possibly personal opinion but I am uncomfortable in seeing 64KiB allocated on stack.
  • It is unclear why at a certain point we do SDLNet_AllocPacket(65535), then we make char buffer[65535]. We can give an excuse to the thing as UDP guarantees an upper bound but this is an antipattern and a well known source of issues longterm. I think 0xFFFF, ushort(~0), 64*1024-1 would all be more convenient but most importantly, do not keep writing it over and over. Have a const somewhere or (for this value specifically) look at std::numeric_limits (oh! It turns out this is C++!).
  • Socket polling should be event driven. Polling sockets, even just once is inefficient and does not scale. Besides, if you poll continuously on another thread (which you should be) it'll burn you a whole CPU core or introduce a minimum but totally unnecessary latency.
  • I would argue that calling your network pump UpdNet is not very sensible.

  • There's just not enough emphasis on the rationale behind decisions. "KeepAlive() tries to keep connections alive that are about to time out", thank you, we had a hint about that. What about some initial drawing to show an example of communication (those things with two peers on opposite sides, time flowing "down" and oblique lines).

  • The NetConn class manages different abstraction levels: nextsendack, lastrecv, disconnecting is clearly "network level logic", isclient, ismatch seem to be state for much higher level reasoning. As a side note no, you don't change those to flags!

  • You don't just PacketHeader* header = (PacketHeader*)buffer. Granted, we "took care of packing". I don't think SDL recv manages endianness for you. Do not use C-style casts.

  • I still have difficulty seeing how the matchmaker can possibly fit in the same architecture as the application (assuming the matchmaker is the server making the matches).

  • Besides, NULL is something you can leave in the past. Good compilers have supported nullptr for at least a couple of years by now. It has some nice advantages.

  • Using goto procpack is an antipattern. Goto is harmful.

  • You don't just SendData like this. Or perhaps you do in some restricted environment but this is stuff other people will read. This is cutting corners.

  • Match(IPaddress* addr) and Same(IPaddress* a, IPaddress* b) are antipatterns. Both can be better implemented by using C++ and/or its library properly, resolving to 1LOC. This is a major hint the author has spent too much time cobbling stuff together and too little thinking about it. I am against extensive use of operator overloading but this is too much even for me, mostly because there's no need to overload anything here.
  • It turns out delete (ptr = nullptr); is NOP so you don't need an if(ptr != nullptr) to guard it. I won't stress that as I have only figured that out the other day.
  • You don't want void (*onackfunc)(OldPacket* op, NetConn* nc); what you want is std::function<void(OldPacket *op, NetConn *nc)> onAckFunc or most likely std::function<void(OldPacket &op, NetConn &nc)> onAckFunc
  • Don't use raw pointers! 99% of the time you really want some smart pointer OR a reference.
  • No idea why you couldn't write NextAck(last) == header->ack || !checkprev on a single line as two lines later you go much much longer. But let's say this is "just" layout at this point!
  • for(auto i=g_recv.begin(); i!=g_recv.end(); i++). It's 2015 man. Every compiler having auto will most likely have some range-for. Consider: for(auto &i : g_recv) { ... }
  • I would argue that SystemTimeToFileTime is overkill here but that's no longer relevant: you want to use std::chrono::system_clock or something like it. If that fails, cobble something together using the performance counter (but I don't think you need such resolution).
  • You don't "try" 10 ports when starting a server dude. If a port isn't available for you that's in real world a major failure at configuration, don't try to fix it.
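
As a minimal sketch of the points above about magic numbers, fixed-width types and pointer casts: the header layout below (a 16-bit type followed by a 16-bit ack, both big-endian on the wire) is an assumption made for illustration, not the article's actual format, and the names kMaxDatagramSize, decodeHeader and encodeHeader are hypothetical. It shows one way to name the 65535 limit once and to read header fields byte by byte, so the result does not depend on host byte order or struct padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// The largest value a 16-bit length can express, named once instead of
// repeating the literal 65535 throughout the code.
constexpr std::size_t kMaxDatagramSize = std::numeric_limits<std::uint16_t>::max();

// Hypothetical header: field names and sizes are assumptions for illustration.
struct PacketHeader {
    std::uint16_t type;
    std::uint16_t ack;
};
constexpr std::size_t kHeaderWireSize = 4;  // two 16-bit fields on the wire

// Read a 16-bit big-endian value from two bytes, whatever the host byte order.
inline std::uint16_t readU16BE(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Write a 16-bit value as two big-endian bytes.
inline void writeU16BE(std::uint8_t* p, std::uint16_t v) {
    p[0] = static_cast<std::uint8_t>(v >> 8);
    p[1] = static_cast<std::uint8_t>(v & 0xFF);
}

// Decode a header from a received buffer. Returns false if the buffer is
// missing or too short, instead of reading past its end.
inline bool decodeHeader(const std::uint8_t* data, std::size_t len, PacketHeader& out) {
    if (data == nullptr || len < kHeaderWireSize) return false;
    out.type = readU16BE(data);
    out.ack = readU16BE(data + 2);
    return true;
}

// Encode a header into a caller-provided buffer of at least kHeaderWireSize bytes.
inline void encodeHeader(const PacketHeader& h, std::uint8_t* out, std::size_t cap) {
    assert(out != nullptr && cap >= kHeaderWireSize);
    writeU16BE(out, h.type);
    writeU16BE(out + 2, h.ack);
}
```

Because the fields are assembled from individual bytes, the same code should behave identically on little-endian and big-endian targets, and the struct can gain padding or change field order without altering the wire format.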

Somehow I forced myself to the end.

$20 for the full article? u mad bro?

Note to staff: this makes use of the "ul" bullet points. For some reason, they don't show up for me.

October 09, 2015 01:56 PM
SeanMiddleditch

Don't use raw pointers! 99% of the time you really want some smart pointer OR a reference.


Please, no. You should _not_ be using so-called "smart" pointers 99% of the time.

Smart pointers manage ownership and lifetimes. If you're not managing lifetimes, you should be using a good ol' raw pointer. References are also not just replacements for raw pointers, though certainly a lot of uses of raw pointers in old C-style code are better served by references. We're actually going through a round of replacing smart pointers with raw pointers in a lot of our engine code right now (the engine was originally written back when Boost smart pointers were the new hot thing, so far as I can tell, and it suffers for it).

Smart pointer overuse is an anti-pattern.
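
A small sketch of that ownership split, under assumptions: NetConn here is a stand-in with made-up fields, and ConnTable and notePacketSent are hypothetical names, not code from the article. The container that owns the connections holds std::unique_ptr; code that merely looks at a connection gets a raw pointer (when "not found" is possible) or a reference (when the object must exist).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in connection record; the fields are placeholders for illustration.
struct NetConn {
    int id;
    unsigned packetsSent;
};

// The table owns every connection and controls its lifetime,
// so this is the one place that holds a smart pointer.
class ConnTable {
public:
    // Create a connection owned by the table; the returned pointer is non-owning.
    NetConn* add(int id) {
        assert(find(id) == nullptr);  // ids are expected to be unique
        conns_.push_back(std::unique_ptr<NetConn>(new NetConn{id, 0}));
        return conns_.back().get();
    }

    // Non-owning lookup: a raw pointer, where nullptr means "not found".
    NetConn* find(int id) const {
        for (const auto& c : conns_)
            if (c->id == id) return c.get();
        return nullptr;
    }

    std::size_t size() const { return conns_.size(); }

private:
    std::vector<std::unique_ptr<NetConn>> conns_;
};

// Borrowing a connection that must exist: take a reference, not a smart pointer.
inline void notePacketSent(NetConn& conn) { ++conn.packetsSent; }
```

Nothing outside ConnTable can accidentally extend or end a connection's lifetime, which is the property the smart pointer is there to enforce.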

That said, I agree with most of the rest of what you said. I think this article (based on this sample) needs a thorough round of professional editing and code refinement before it's something I'd recommend for free, much less for money.

My recommendation to start would be to avoid littering the article with actual code samples. First explain the concepts and reasonings, with pseudo-code where needed to illustrate points, and then provide a separate well-commented sample app to download and dissect. The people who need raw code want a complete sample, and the people who need to understand the "why" here don't need 50% of the space dedicated to boilerplate code, and the article as it stands serves neither demographic well.

Krohm's points about layering are especially important, too. Yes, there's a reason to simplify abstractions for educational purposes, but I feel this article goes too far in that direction. It also doesn't really exemplify the important bits of structure of a good networking engine. The article's code needs at least a little more abstraction; that bloats the samples a bit, yes, but that feeds back into the need for separating the article's important content from the sample code.

There's also a misdesign present in the idea that you'll have a "packet" for placing a building or whatnot at the game level. That's a high-level message; multiple messages should be communicated per packet. There's too much overhead per-packet to send one for each game-level message. Even things like keep-alive and whatnot should not be whole separate packets; just send a packet with no messages in it if all you want is to let the other side know that you're still there.

You want to pack as much data into a single UDP packet as you can (and that's far less than 64k, btw, because of MTU sizes). Most UDP packets that are actually 64k in size will be dropped, even though that is technically a legal IP packet size. Transmission logic really should work hard to keep packets smaller. Of course, there's a trade-off in packet size between packet fragmentation with resulting packet loss and the overhead in dispatching many packets vs fewer packets.

Also remember that bandwidth is not free. For client-server games, bandwidth alone can cost a company hundreds of thousands of dollars per month if it's poorly utilized.
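
The batching idea can be sketched roughly as follows. Everything here is an assumption for illustration: the PacketBuilder name, the [type][length][payload] framing, and the 1200-byte budget (picked to sit comfortably under a typical 1500-byte Ethernet MTU once IP and UDP headers are added). A real implementation would also prepend its reliability header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed per-datagram payload budget; tune it for the networks you target.
constexpr std::size_t kPacketBudget = 1200;

// Collects several game-level messages into one outgoing datagram.
class PacketBuilder {
public:
    // Append one message framed as [1-byte type][2-byte big-endian length][payload].
    // Returns false when the message does not fit; the caller can then
    // flush() the current packet and try again with an empty one.
    bool tryAdd(std::uint8_t type, const std::uint8_t* payload, std::size_t len) {
        assert(payload != nullptr || len == 0);
        const std::size_t needed = 3 + len;
        if (len > 0xFFFF || bytes_.size() + needed > kPacketBudget) return false;
        bytes_.push_back(type);
        bytes_.push_back(static_cast<std::uint8_t>(len >> 8));
        bytes_.push_back(static_cast<std::uint8_t>(len & 0xFF));
        bytes_.insert(bytes_.end(), payload, payload + len);
        ++count_;
        return true;
    }

    // Hand the accumulated bytes to the caller and start a fresh packet.
    // Flushing with no messages yields an empty body, which can serve as
    // the "I'm still here" packet described above.
    std::vector<std::uint8_t> flush() {
        std::vector<std::uint8_t> out;
        out.swap(bytes_);
        count_ = 0;
        return out;
    }

    std::size_t messageCount() const { return count_; }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t count_ = 0;
};
```

Gameplay code would call tryAdd for each message it wants to send during a tick, and the networking layer would call flush once per send interval, so the per-packet overhead is paid once for the whole batch.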

Gameplay code doesn't want to have to deal with those complexities, though, so the low-level networking layer needs to intelligently deal with those complexities while abstracting them from the gameplay programmers.
October 09, 2015 04:37 PM
polyfrag
Krohm, there's no event driven sockets in SDL Net. There's only SDLNet_CheckSockets and that isn't an event. I'm not deploying to any big endian platform so I don't have to worry about endianness. I don't think anybody else is either. It's 2015 man. Yes I mix C and C++. Casts etc. And I mix different application level packets. If I have to rewrite all this, you may as well reject the article. I thought it would give valuable information and stir thought of how to implement reliable UDP so somebody else doesn't have to struggle for months like I did.
October 10, 2015 02:13 AM
Krohm

You have struggled for months because you have applied antipatterns, did not separate concerns, had to work with non-minimal state etc etc...

While the event system is not exposed by SDLNet_CheckSockets, it is absolutely event-based internally (likely a wrapper for select), which at the end of the day is the fairly convenient and totally standard legacy way to avoid socket polling.
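
To illustrate what "event-based internally" means, here is a minimal sketch using POSIX select() directly. It is not SDL_net code, and whether SDL_net wraps select on a given platform is, as said, a guess. The helper names waitReadable and readWhenReady are hypothetical. The thread sleeps inside the kernel until data arrives or the timeout expires, so it neither spins on a CPU core nor adds a fixed polling delay.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Block until fd has data to read or timeoutMs milliseconds have passed.
// Returns true only if the descriptor is readable.
inline bool waitReadable(int fd, int timeoutMs) {
    assert(fd >= 0 && fd < FD_SETSIZE && timeoutMs >= 0);
    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(fd, &readSet);
    timeval tv;
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;
    const int ready = select(fd + 1, &readSet, nullptr, nullptr, &tv);
    return ready > 0 && FD_ISSET(fd, &readSet);
}

// Wait for data, then read what is available into buf.
// Returns the number of bytes read, 0 on timeout (or end of stream), -1 on error.
// read() is used so the sketch works on any descriptor; a UDP socket
// would normally use recvfrom() to also learn the sender's address.
inline long readWhenReady(int fd, int timeoutMs, char* buf, std::size_t cap) {
    if (!waitReadable(fd, timeoutMs)) return 0;
    return static_cast<long>(read(fd, buf, cap));
}
```

SDLNet_CheckSockets also takes a timeout argument, so a network thread built on SDL_net can block in much the same way instead of calling it with a zero timeout in a tight loop.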

ARM is usually BE AFAIK. Last gen consoles were BE.

What about a target with the same endianness but where unsigned long is 64 bit?

I can reset my vote and I will as soon as the issues are addressed. I do like your idea of proposing freemium content but keep in mind this has consequences - such as requiring it to be top notch.

October 12, 2015 06:17 AM
alnite

Not really a networking expert here, but what do you think about QUIC?

https://github.com/devsisters/libquic

October 12, 2015 06:39 AM
jsaade

One of the very few complete articles for UDP. I wish I had access to such resource when I was starting out with networking.

It was mainly about a lot of trial and error. Definitely worth the 20$.

October 13, 2015 08:43 AM
__0BZEN__

Gaffer has a very good, clear and concise section on networking. Your article should be structured in a similar way.

I know, it's not always easy, and I've fallen into the trap many times, but pictures are worth 1,000 words. The article should be about concepts, algorithms, not code. That can come later on.

My two cents.

October 29, 2015 11:52 AM
Brain

I've voted to not peer review for now based on two issues:

Firstly, there are layout issues with the header tags; something is up and they do not display correctly. Secondly, as a massive wall of text and code, I find this to be off-putting. Perhaps this should be broken down into separate articles, and/or, as others have stated, a download of the complete code should be provided, replacing the snippets with pseudocode and diagrams.

With generic enough pseudocode, the concepts here could be re-implemented in other libraries than SDL quite easily by someone with experience in network programming.

I will be happy to change my vote to peer reviewed if these issues can be rectified!

January 18, 2016 01:23 PM
samoth

I've admittedly only read the first 1/3 of the article, and some lines around the middle, and can obviously only talk about what I've actually read.

There are, in my opinion, problems on several layers, including what I believe to be fundamental misconceptions about UDP and TCP, and in total, as it stands, I perceive the whole thing as, in some way, just-not-right.

That doesn't mean it couldn't be overworked and turned into a really good article. But as it stands, I'm not sold on it.

For example, I find it somewhat funny to worry about TCP headers being a few bytes larger on the one hand, but then send out one UDP datagram per message. That doesn't make sense. (Ironically, using vanilla TCP would solve exactly this problem: it would batch several of your messages together.)

The musings on packet loss and congestion control are somewhat odd in my opinion, too. Maybe that's not what you are intending to say, but it's what it reads like to me.

It sounds like congestion control is something optional that you can do, or can skip, and would rather not do. Packet loss is a normal, regularly occurring condition in IP networks. It's neither optional, nor exceptional, nor rare. It really happens all the time, every second. Also I'm not sure about what's with TCP not using the maximum capacity. TCP will, except in the worst contrived, artificial cases, fill the line capacity almost perfectly, upwards of 99% (what makes you think it doesn't?). Yes, TCP will introduce packet loss in UDP, but so will UDP. It's not like you use UDP and the problem doesn't exist. Again, packet loss is normal.

Congestion control (which limits the amount of packet loss) and a strategy of dealing with the packet loss that will (not can) happen is mandatory -- whether your strategy is just "fire and forget" or "ack and resend" or something else.

The one selling argument for UDP in my opinion is that if this works for your application, you can indeed just say "Oh fuck's sake..." and ignore data that hasn't been delivered in time, or altogether. That, and maybe that you don't absolutely need to do in-order if the order doesn't really bother you (with TCP, you do not have a choice). But certainly it's not about congestion or packet loss, or maxing out the available physical bandwidth.

You lost me somewhere in a large-ish code wall between wondering about that timer you're using which has a 15.6ms resolution, and still pondering about why there needs to be a special protocol when a client has been disconnected due to time out (what's wrong with simply dropping that one?). There's a lot of stuff in there that is very specific to your implementation, and much of it makes me scratch my head.

Less can be more, maybe in this case that may help, too.

March 22, 2016 09:50 AM
polyfrag
I was sold on the idea of UDP being better by somebody else, and really I couldn't do some things I do now if I didn't use UDP, like telling when somebody's laggy or getting ping. You can probably do all sorts of stuff.

The sliding window in TCP can only be up to 2^16 = 64 kB. With a 7 Gbps connection and 60 ms lag you will get 1.04 MBps. That's not a concern here. But with UDP you get a tighter interop, so you know when a packet is acked and you can just avoid sending back a reply. With TCP you don't know when something finally arrives and is processed or if there's a problem. With TCP you'll just get disconnected from timeout or delayed response.

With a scheduled delay of 1 ms between packets I'm able to send the whole live 286 kB+ simstate in 2-3 seconds for an instant join on LAN. Also, if you're polling 1000 servers and you expect some responses to take 3 s, you would have to poll 1000 non-blocking connects and have a 3+ s timeout. If you want to do voice chat or video streaming you'll want UDP anyway and you'll need them in order. So I could do this with TCP and UDP, but now that I have RUDP I can't think how it's worse, and it's probably better with 10/20ths the header size, which is a good saving for average 12 B payloads. And the most common packets (lockstep done, lockstep next turn empty, and ack) are only 10 B or less for me.

Also, how do you check in TCP if a connection to a matchmaker has timed out? With UDP I definitely know when the matchmaker is responsive even if I wait on the last packet to respond with something. In TCP that would be 4 datagrams; in UDP that's just an ack to a keepalive, which might even be doubled in TCP by resending on jammed wifi. It seems inefficient to keep sending/receiving 2+ times more data for a matchmaker to keep track of clients, which you have to do to know if they've timed out or not. You could get away with only doing this occasionally and hope they always send a disconnect packet when the TCP socket is closed. In TCP you have to enable a keepalive option and even then you don't know unless you send something and wait for a timeout. What if there are people on public wifi? You always have to keep track of the last received time and send keepalive packets when the received time exceeds something, and you have to make it 2x to account for 2 round trips.

Congestion seems like an issue though, because even on LAN/localhost I can't send the 286 kB in one go, and it's slower if I don't space out packets, so much that I can barely get anything through.
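
The "keep track of last received time" bookkeeping described above could look something like this sketch. It is an assumption-laden illustration, not the article's KeepAlive(): the KeepAliveTracker and LinkState names and the two thresholds are hypothetical, and std::chrono::steady_clock is used because it never jumps backwards the way a wall clock can.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

enum class LinkState { Healthy, SendKeepAlive, TimedOut };

// Tracks how long a peer has been silent and says what to do about it.
class KeepAliveTracker {
public:
    // pingAfter: silence after which a keepalive should be sent.
    // timeoutAfter: silence after which the peer is considered gone.
    KeepAliveTracker(Clock::duration pingAfter, Clock::duration timeoutAfter,
                     Clock::time_point now)
        : pingAfter_(pingAfter), timeoutAfter_(timeoutAfter), lastRecv_(now) {
        assert(pingAfter < timeoutAfter);
    }

    // Call whenever any packet (data, ack or keepalive) arrives from the peer.
    void onReceive(Clock::time_point now) { lastRecv_ = now; }

    // Decide what the connection needs at time 'now'.
    LinkState poll(Clock::time_point now) const {
        const Clock::duration silent = now - lastRecv_;
        if (silent >= timeoutAfter_) return LinkState::TimedOut;
        if (silent >= pingAfter_) return LinkState::SendKeepAlive;
        return LinkState::Healthy;
    }

private:
    Clock::duration pingAfter_;
    Clock::duration timeoutAfter_;
    Clock::time_point lastRecv_;
};
```

In a running game the caller would pass Clock::now(); taking the time as a parameter keeps the logic easy to test with made-up timestamps. The timeout would be set to at least twice the keepalive interval plus the expected round trips, in line with the 2x rule of thumb mentioned above.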
April 05, 2016 06:51 AM