Timers stop responding

Started by aXXo, Sep 08, 2016, 09:37 PM

aXXo

I'm using a lot of timers for various things on my server. Only a couple of them run indefinitely; most are created dynamically and expire after a certain duration or number of runs.
After the server has been running for some hours (how long depends on the number of players that have connected), all the timers stop responding.

They act as if they are permanently paused and no longer call their functions. I have no idea what could cause this behavior, and I can't reproduce it on demand. Maybe there is a limit on the number of timers that can exist on the server (counting expired ones too)? A server restart is required to get them working again.

Some investigation I did to find the root cause:
  • VM clock is fucked? Tried printing time() and GetTickCount(), they both work as they should.
  • print() the instance of a timer. It prints the instance as it should, no nulls.
  • Create a new timer. NewTimer doesn't work in this server state. The newly created timer exists, but it does nothing, just like the other timers.
  • No Squirrel errors.

Running the Squirrel 04rel003 plugin (though I expect this to happen on rel004 too) on 32-bit Linux.

#1
This isn't really a surprise. There's a plethora of issues with the timers implementation. And yes, there is a limit. That limit is 255.

Where do I begin with the issues?

1) The timer instance gets deleted even if there might still be references to it inside the VM. This should produce some nice segmentation faults on Linux when the VM tries to delete it as well, because once an instance has been given to the VM, the VM takes control and deletes it when the reference count reaches 0. Deleting instances manually before that is begging for a crash.

2) The destructor leaks memory by not deleting the parameter values allocated on the heap. Depending on how many timers you create and destroy, and how much memory you have, you will run out of memory sooner or later.

3) The timer keeps a weak pointer to a string object owned by the VM, which will eventually be released when it is removed from the VM stack. That string is then used as the name of the function to call, so the timer may end up reading a junk string from freed memory that the VM has reused. The VM then can't find that function, and as a fallback the timer decides it has no purpose to live. It should be no surprise if this is your issue. Basically, the VM ran out of memory and reused the memory that was freed when the function name was popped from the stack.

Should I continue? No offence, but this is a total clusterf*.

aXXo

Thanks! The 255 limit was the cause.

Apparently, the plugin automatically cleans up timers after they have run their course, so they don't count towards the 255 limit after expiring.
But the timers which are set to run infinitely never get cleaned up. Even if they are .Stop()'ed, .Delete()'ed and = null'ed, they still count towards the limit. Once the number of these timers reached 255, all timers were fucked.
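
For anyone who hits the same thing, here is a rough model of the slot accounting as I understand it from the behavior above. It is a Python sketch, not the plugin's code and not Squirrel: `TimerPool`, `ReusableTicker` and `churn_infinite` are made-up names, and the stall rule (everything stops once the count reaches 255) is an assumption based on what I observed.

```python
MAX_TIMERS = 255  # the limit quoted in reply #1


class TimerPool:
    """Toy model of the slot accounting described above (hypothetical)."""

    def __init__(self, limit=MAX_TIMERS):
        self.limit = limit
        self.used = 0  # slots currently counted against the limit

    @property
    def stalled(self):
        # Assumption: every timer stops firing once the count reaches the limit.
        return self.used >= self.limit

    def new_timer(self, infinite):
        # Every new timer takes a slot, whether it is finite or infinite.
        self.used += 1
        return {"infinite": infinite}

    def expire(self, timer):
        # Finite timers are cleaned up after their last run and free their slot.
        if not timer["infinite"]:
            self.used -= 1

    def delete(self, timer):
        # Observed behavior for infinite timers: stopping, deleting and
        # nulling them does not free the slot, so this is a no-op.
        pass


class ReusableTicker:
    """Possible workaround: create one infinite timer once, then toggle it."""

    def __init__(self, pool):
        self.timer = pool.new_timer(infinite=True)  # takes one slot for good
        self.enabled = False

    def start(self):
        self.enabled = True

    def stop(self):
        self.enabled = False


def churn_infinite(pool):
    """Create and immediately delete infinite timers until the pool stalls."""
    created = 0
    while not pool.stalled:
        pool.delete(pool.new_timer(infinite=True))
        created += 1
    return created


if __name__ == "__main__":
    print(churn_infinite(TimerPool()))  # 255
```

In this model, finite timers can be created and expired forever without trouble, while every create/delete cycle of an infinite timer costs one slot for good. The practical takeaway would be to create the infinite timers once at startup and switch them on and off with a flag instead of deleting and recreating them.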