Game crash while hosting - 100% upload utilised - all desync

Jack
Posts: 32
Joined: 15 Jul 2006, 00:35

Post by Jack »

Twice, while hosting Spring, the game has started using 100% of my upload bandwidth. During normal play there is plenty of bandwidth to spare, but it seems that a rare condition can cause Spring to start flooding the network. When this happens, everyone desyncs (possibly as no real messages can get through) and reports the same effects found if the host actually crashed. However, the game continues for the host.

I have a game recording of this, and a tcpdump listing of the data being sent by TA Spring during the failure condition. If either of these files would be of interest to the Spring developers, I can upload them somewhere. Let me know.

This happened today, using the latest versions of TA Spring and AA, during an 8 player game on Small Supreme Battlefield, hosted by my Win2K box. It also happened a few weeks ago - I don't remember which versions were being used then.
LordMatt
Posts: 3393
Joined: 15 May 2005, 04:26

Post by LordMatt »

I think this happened in two games I was a part of yesterday. For those who aren't the host, the simulation just stops.
Jack
Posts: 32
Joined: 15 Jul 2006, 00:35

Post by Jack »

Yes - you were in my game. Sorry about the crash.
clericvash
Posts: 1394
Joined: 05 Oct 2004, 01:05

Post by clericvash »

It has happened to me too, no idea why.
Jack
Posts: 32
Joined: 15 Jul 2006, 00:35

Post by Jack »

Hey, it happened again. This time I have a tcpdump recording of the entire game: any dev is welcome to it. I figure that it will be possible to work out what causes the game to break by looking at the network traffic, as the bug manifests itself as 100% uploading.

This time, it was Crossings 4, latest AA, latest Spring.
Jack
Posts: 32
Joined: 15 Jul 2006, 00:35

Post by Jack »

Ok, couldn't resist having a crack at this problem myself.

There are two bugs here. The serious one is that the server can be crashed by a malformed NETMSG_SELECT message. The less serious one is that such messages are ever sent.

It looks like the server receives a malformed NETMSG_SELECT message from one of the clients. The message is truncated by the limit on the size of a UDP packet. Thus, it cannot be decoded correctly. For some reason, this causes the server to repeat the packet to all clients forever, instead of only once. Here is an overview of the sequence.

22:56:53.526751 IP 84.58.248.48.4968 > 172.31.2.9.26000: UDP, length 4240 <-- initial bad packet
22:56:53.539934 IP 172.31.2.9.26000 > 82.10.51.182.2153: UDP, length 4230 <-- repeats..
22:56:53.540282 IP 172.31.2.9.26000 > 88.104.145.187.4169: UDP, length 4230
22:56:53.540634 IP 172.31.2.9.26000 > 69.194.156.74.2702: UDP, length 4230
22:56:53.540989 IP 172.31.2.9.26000 > 84.208.118.119.1052: UDP, length 4230
22:56:54.164597 IP 172.31.2.9.26000 > 82.10.51.182.2153: UDP, length 4230
22:56:54.164947 IP 172.31.2.9.26000 > 88.104.145.187.4169: UDP, length 4230
22:56:54.175325 IP 172.31.2.9.26000 > 84.58.248.48.4968: UDP, length 4230
22:56:54.186069 IP 172.31.2.9.26000 > 82.152.178.152.4628: UDP, length 4230
22:56:54.196805 IP 172.31.2.9.26000 > 84.208.118.119.1052: UDP, length 4230
22:56:54.229038 IP 172.31.2.9.26000 > 71.162.115.14.62418: UDP, length 4230
22:56:54.250567 IP 172.31.2.9.26000 > 69.194.156.74.2702: UDP, length 4230
22:56:54.325714 IP 172.31.2.9.26000 > 81.56.123.90.2113: UDP, length 4230
...continues until quit

Here is the bad packet:

22:56:53.526751 IP 84.58.248.48.4968 > 172.31.2.9.26000: UDP, length 4240
0x0000:  4500 05dc 4b38 2000 7711 d845 543a f830  E...K8..w..ET:.0
0x0010:  ac1f 0209 1368 6590 1098 c7df 5381 0000  .....he.....S...
0x0020:  1290 0000 0003 0b00 0100 030c 0001 000b  ................
0x0030:  1500 09c9 ffff ffa0 0000 d345 56e2 1743  ...........EV..C
0x0040:  0000 5345 0b15 0009 c9ff ffff a000 00d4  ..SE............
0x0050:  451a 8017 4300 0053 450b 1500 09c9 ffff  E...C..SE.......
...edit...

0x0570:  1500 09c9 ffff ffa0 0000 d745 72e8 1643  ...........Er..C
0x0580:  0000 5f45 0b15 0009 c9ff ffff a000 00d8  .._E............
0x0590:  4520 db16 4300 005f 450b 1500 09c9 ffff  E...C.._E.......
0x05a0:  ffa0 0000 d945 defa 1743 0000 5f45 0b15  .....E...C.._E..
0x05b0:  0009 c9ff ffff a000 00da 455e fe17 4300  ..........E^..C.
0x05c0:  005f 450b 1500 09c9 ffff ffa0 0000 db45  ._E............E
0x05d0:  b40f 1843 0000 5f45 0b15 0009            ...C.._E....
Edit: Note that the final message is incomplete. It starts with the message code (0x0b), the size (0x0015), and the player number (0x09), but the rest of the payload is missing, cut off by the MTU of the network. The captured data ends at offset 0x5dc (1500 bytes), which matches the IP total length field (0x05dc) in the header.
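To illustrate the kind of check that would catch the incomplete message, here is a minimal Python sketch. It is not Spring's code. The layout (1-byte message code, 2-byte little-endian size covering the whole message, 1-byte player number) is only inferred from the dump above, and split_messages is a hypothetical helper name. It walks a buffer message by message and stops at the first one whose declared size runs past the end of the data.

```python
import struct

# Assumed header layout, inferred from the hex dump (not from the engine source):
# 1-byte message code, 2-byte little-endian total size, 1-byte player number.
HEADER_SIZE = 4


def split_messages(payload: bytes):
    """Return (messages, leftover).

    messages is a list of (code, player, body) tuples for every complete
    message; leftover holds trailing bytes that do not form a complete one.
    """
    messages = []
    offset = 0
    while offset < len(payload):
        remaining = len(payload) - offset
        if remaining < HEADER_SIZE:
            break  # not even a full header left
        code, size, player = struct.unpack_from("<BHB", payload, offset)
        if size < HEADER_SIZE or size > remaining:
            break  # declared size is impossible or runs past the buffer
        body = payload[offset + HEADER_SIZE:offset + size]
        messages.append((code, player, body))
        offset += size
    return messages, payload[offset:]


# One complete 21-byte message copied from the dump, followed by the
# 4-byte fragment that ends the captured packet.
complete = bytes.fromhex("0b150009c9ffffffa00000d34556e2174300005345")
truncated_tail = bytes.fromhex("0b150009")
messages, leftover = split_messages(complete + truncated_tail)
```

With this input, messages holds the single complete message and leftover holds the four stray bytes. A receiver could drop or log a non-empty leftover instead of passing it on to other clients.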
Last edited by Jack on 21 Jul 2006, 14:32, edited 1 time in total.
Acidd_UK
Posts: 963
Joined: 23 Apr 2006, 02:15

Post by Acidd_UK »

1337 - you should post this bug & info up on Mantis:

http://taspring.clan-sy.com/mantis/main_page.php

Well done for finding the cause of the problem!
Jack
Posts: 32
Joined: 15 Jul 2006, 00:35

Post by Jack »

Ooh, the issue is already open. I'll add a comment with more information.

http://taspring.clan-sy.com/mantis/view.php?id=201
Jack
Posts: 32
Joined: 15 Jul 2006, 00:35

Post by Jack »

This bug may be related to the known issue about queueing large numbers of buildings in one go. I have updated Mantis with some new information about this.

Is the cause of the queueing bug known? If not, this may be it!