Assignment managers broken?


#1

I noticed the assignment manager I’ve been running started logging more rapidly - the problem was indicated by lots of errors (warnings?) of the form:

[DEBUG] [2014-06-20 14:33:40 -0400] [13236:1] [agent] Packet hash mismatch on 46 - Sender QUuid("{ef1ccee4-d431-4817-ba79-eced61320555}") 
[DEBUG] [2014-06-20 14:33:40 -0400] [13236:1] [agent] Packet of type 46 received from unknown node with UUID QUuid("{15218853-ec77-4794-93b0-2e1176748567}") 

So I restarted the manager to get new downloads and the problem is now much worse, spewing errors like:

[DEBUG] [2014-06-20 14:34:51 -0400] [11194:1] [agent] Unexpected sequence number passed to SentPacketHistory::packetSent()! Expected: 2774 Actual: 2778 
[DEBUG] [2014-06-20 14:34:51 -0400] [11194:1] [agent] Unexpected sequence number passed to SentPacketHistory::packetSent()! Expected: 2775 Actual: 2779 

I tried firing up the manager code on another linux box on a different network and on an OSX machine, both showing the same issue.

So:

  1. there’s a problem here that needs fixing.
  2. if you are running an assignment server, you might want to check to be sure you aren’t about to fill up your disk with assignment-manager.log

#2

My Assignment Manager was spewing out tons of the same kind of [DEBUG] lines. It seems to have stopped in the last 24 hours and is now showing Game of Life stuff.


#3

Still spewing sequence number mismatch debug messages when I run… @chris: is it still useful to be running these? I’m happy to send the logs to /dev/null to avoid filling my disk, but if these errors indicate that it things are broken to the point of being a waste of bandwidth and energy, I’d rather not.

This morning’s trace of the first 5 seconds:

[DEBUG] [Sun Jun 22 12:03:59 EDT 2014] Using curl
[DEBUG] [Sun Jun 22 12:03:59 EDT 2014] Dependencies already downloaded, skipping
[DEBUG] [Sun Jun 22 12:04:00 EDT 2014] Already running latest assignment-client, not downloading
[DEBUG] [2014-06-22 12:04:00 -0400] [16062:1] [assignment-client] The destination wallet UUID for credits is "61f2c343-4bb8-46fb-925b-3b7af4d7a24b" 
[DEBUG] [2014-06-22 12:04:00 -0400] [16062:1] [assignment-client] NodeList socket is listening on 39003 
[DEBUG] [2014-06-22 12:04:00 -0400] [16062:1] [assignment-client] Changed socket send buffer size from 212992 to 425984 bytes 
[DEBUG] [2014-06-22 12:04:00 -0400] [16062:1] [assignment-client] Waiting for assignment - UUID: {00000000-0000-0000-0000-000000000000}, Type: 7, Pool: "credits" 
[DEBUG] [2014-06-22 12:04:01 -0400] [16062:1] [assignment-client] Received an assignment - UUID: {9c56f588-37f5-4907-9b74-373f40bdf816}, Type: 2, Pool: "credits" 
[DEBUG] [2014-06-22 12:04:01 -0400] [16062:1] [assignment-client] Destination IP for assignment is "184.169.232.74" 
[DEBUG] [2014-06-22 12:04:01 -0400] [16062:1] [agent] Downloading script at "http://highfidelity-public.s3.amazonaws.com/scripts/backGOL.js" 
[DEBUG] [2014-06-22 12:04:01 -0400] [16062:1] [agent] Downloaded script: "// Add your JavaScript for assignment below this line 

// The following is an example of Conway's Game of Life (http://en.wikipedia.org/wiki/Conway's_Game_of_Life)
// ....
Script.update.connect(step);
Voxels.setPacketsPerSecond(200);" 
[DEBUG] [2014-06-22 12:04:01 -0400] [16062:1] [agent] Changing face model for avatar to "" 
[DEBUG] [2014-06-22 12:04:01 -0400] [16062:1] [agent] Changing skeleton model for avatar to "http://public.highfidelity.io/meshes/defaultAvatar_body.fst" 
[DEBUG] [2014-06-22 12:04:01 -0400] [16062:1] [agent] script:print()<< "Corner of board will be at 16383, 15053.82141721691, 8035.785492488649" 
[DEBUG] [2014-06-22 12:04:02 -0400] [16062:1] [agent] Sending intial stun request to 54.215.251.124
[DEBUG] [2014-06-22 12:04:02 -0400] [16062:1] [agent] New public socket received from STUN server is 108.20.67.254:39003
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Node's local address is 192.168.1.138
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] NodeList UUID changed from "00000000-0000-0000-0000-000000000000" to "d377141c-f8d2-4670-9ae7-3ddcd18b939d" 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Voxel Server" (V) {caeb27b2-a59e-488a-932f-d284a933f6eb} 50.18.8.78:35277 / 10.178.51.78:35277 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Voxel Server" (V) {a49ac7cc-6de8-4529-a63e-044f996ffd1d} 50.18.8.78:55815 / 10.178.51.78:55815 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Audio Mixer" (M) {53deebb4-9f7c-4ef2-a75d-441bccfacfb7} 54.193.6.78:38804 / 10.0.30.203:38804 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Voxel Server" (V) {818dba23-2505-46d5-aed4-844d1d844282} 54.241.59.86:33821 / 10.168.122.113:33821 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Particle Server" (P) {f3c60966-8abc-4b49-9a98-f63b58728a5a} 184.169.232.74:60728 / 10.176.33.252:60728 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Voxel Server" (V) {1cbf39c1-1920-42da-a9f9-2e0571531307} 54.241.59.86:59218 / 10.168.122.113:59218 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Voxel Server" (V) {bebafbde-5022-4880-89bc-aac7f21780de} 184.169.232.74:44674 / 10.176.33.252:44674 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Avatar Mixer" (W) {76e868ae-d1df-44cc-af77-749ec24fcb80} 184.169.232.74:43411 / 10.176.33.252:43411 
[DEBUG] [2014-06-22 12:04:03 -0400] [16062:1] [agent] Added "Model Server" (o) {c3e70cdd-2c73-4ba7-a281-22af3a7999aa} 184.169.232.74:58927 / 10.176.33.252:58927 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Activating public socket for node "Voxel Server" (V) {caeb27b2-a59e-488a-932f-d284a933f6eb} 50.18.8.78:35277 / 10.178.51.78:35277 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Activating public socket for node "Model Server" (o) {c3e70cdd-2c73-4ba7-a281-22af3a7999aa} 184.169.232.74:58927 / 10.176.33.252:58927 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Activating public socket for node "Voxel Server" (V) {bebafbde-5022-4880-89bc-aac7f21780de} 184.169.232.74:44674 / 10.176.33.252:44674 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Activating public socket for node "Audio Mixer" (M) {53deebb4-9f7c-4ef2-a75d-441bccfacfb7} 54.193.6.78:38804 / 10.0.30.203:38804 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Activating public socket for node "Avatar Mixer" (W) {76e868ae-d1df-44cc-af77-749ec24fcb80} 184.169.232.74:43411 / 10.176.33.252:43411 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Activating public socket for node "Particle Server" (P) {f3c60966-8abc-4b49-9a98-f63b58728a5a} 184.169.232.74:60728 / 10.176.33.252:60728 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Activating public socket for node "Voxel Server" (V) {1cbf39c1-1920-42da-a9f9-2e0571531307} 54.241.59.86:59218 / 10.168.122.113:59218 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Unexpected sequence number passed to SentPacketHistory::packetSent()! Expected: 0 Actual: 2 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Unexpected sequence number passed to SentPacketHistory::packetSent()! Expected: 0 Actual: 3 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Unexpected sequence number passed to SentPacketHistory::packetSent()! Expected: 2 Actual: 4 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Unexpected sequence number passed to SentPacketHistory::packetSent()! Expected: 3 Actual: 5 
[DEBUG] [2014-06-22 12:04:04 -0400] [16062:1] [agent] Unexpected sequence number passed to SentPacketHistory::packetSent()! Expected: 4 Actual: 6 

To be clear - I’m getting more than 200 “unexpected sequence number” lines per second: it looks like one dropped (UDP, presumably) packet leads to one debug line for every subsequent packet.


#4

I also get the same with the assignment client, how can we fix this ?


#5

I’ve created https://worklist.net/19807 to track. It looks to me like there is a basic design issue with the way the sequence numbers are used and this debug message is exposing the problem. This sufficiently deep in the guts of the network protocol that I’m not comfortable bidding on the fix myself, so for my own purposes I’ve commented out the offending line of code (in one case) or logged to /dev/null (in another case). If you are using any of the packaged builds, I think your only viable option is to occasionally stop the service, delete the logfiles, and restart.