Author Topic: Long polling adaptive frequency & time shift  (Read 648 times)

Offline Nox

  • Level 35
  • **
  • Posts: 768
  • Reputation: +12/-2
    • View Profile
Long polling adaptive frequency & time shift
« on: May 16, 2010, 08:24:08 AM »
Hi there,
I'm pretty sure many of you already figure out something like this, but I just got an idea so I want to share it
Long thread, but I try hard to keep it easily readable
Of course Apes are nice animals, but it's kinda non-standard way and useless for webhostings etc.
Quote from: lolninja
ape is like apache, so you'll need to be able to install it on your server, thus requiring something more than a standard hosting account, so a VPS would world, though you can install it locally with wamp/mamp/lamp for development.
It's of course (this general matter) bypassing of something that should be done in a totally different way, but so far (unless using flash), there aren't many choices

Intro (can skip if you know long polling)
Example: we want to check if the opponent has already used his turn in real time
(note: "db" means db or another storage or check or whatever, just an example)

Naive implementation
We will fire requests and receive YES/NO answer on every request

Problems:
* heavy server<-->client request traffic
* heavy server-->db query traffic
* lots of whole script runs

Long polling
We send request to server and will receive YES if the player has used turn
If he hasn't, server waits until YES is acquired or until TIMEOUT (in PHP often max_execution_time*x, x e (0,1>) is reached, in that case returns NO

Problems:
* heavy server-->db query traffic


Dealing with problems - adapting frequency
Okey, so we make checks less frequent.
But by which number?
Using some random number might lead to either almost no performance gain or the system having noticable lag

Solution imho is (if appliable) to determine average response time. This of course introduces overhead I will talk about later.
Following description mentiones particular way of storing, e.g. storing values for every player, which is not compulsory and among things I'll mention afterwards.

Response time means how it takes for a opponent to respond -> generally how long it takes to make the action which causes the polling script to finish and return result.

To the core
To determine frequency of checking ~ waiting(sleeping) times between checks, for each player we will have tables:

user {..., lp_response_avg, lp_reponse_sd }
user_combat_response { id, user_id, response_time }

After each reponse (/combat) we will store (/average) reponse time in a time unit of our liking
We will store total of N rows for each player, so we'll delete the oldest value for sake of performance AND adapting users evolving behaviour

After that we update
user.lp_response_avg with arithmetic average of response times
user.lp_response_sd with standard deviation of response times

Average tells us after what time can we usually expect the player to make action
Standard deviation tells us how (un)reliable the average is

----

In a code that handles the request we than use these two values.

Basic sleep time will be (okey, this will really need polishing....SD is kinda hard to work with for me)
<?php
$safety_value = 1.05; // ... as we'd rather wait a little bit longer than exactly the expected time
$minimal_wait_time = X;
$sleep = max( min($average * $safety_value, max_execution_time / $safety_value) / $standard_deviation, $minimal_wait_time);
?>

This will of course not guarantee the max waiting time will be the expected response time, but it will be more efficient


Improving further...shifting times
Now we have intelligently and not ad hoc setup of check intervals / for 1 script run
Okey, but.... if we expect user's action to take place around cca $AVG in certain radius
why do we check all the time before that?

Of course player can act surprisingly fast, but it's not probable as we determined.

So we'd like to shift checks the way that they are more dense around expected time and more scarce in not probable time

Downside of this shifting is that while we make only a few checks outside expected range, the frequency is high around expected value
The check cycle of different requests starts in different times, so it's not that painful, but the chance is still there, so it's wise to setup a variable that will control an intensity of this shifting

<?php
$rescale = $total_final_sleep_before_shift / $total_final_sleep;
$sleepFinal = ($sleep + pow(abs( min($total_wait_time-$AVG, $limit) ), (float)$shift_intensity))/$rescale;
?>
(I hope this is ok, done via experimenting in Excel)
I think we should not incorporate deviation into this formula as we already used it in previous section which controls density too (in a certain way)

Unfortunately I still haven't figured out how to calculate $someVariable. It's purpose is to bring the absolute shift into relative, correcting also moving the expected moment, so it's very important number

$limit - using this formula without min(x, $limit) would mean that the checking will (possibly exponentialy) grow infinitely, which we don't want


Side-effect of AF (data gather):
Possibly detection of stupid bots

Performance

This improvement comes with costs I mentioned, however I don't consider them higher than will be benefits

a) Data storing

By default I now plan to store average response times of like 10 last fights, only 1 number per fight (not sure how I'll do that yet, but yea).
This number will then be stored only ONCE per whole combat (and combat consists in my case of many script executions), again Standard deviation is recalculated only ONCE upon fight finish.

There are some database capacity requirements, but it's just max 2*N+2 of ints per player

You can modify this as you want, you can set N as 1 (and remove SD) if needed, or you can store duration of every request (not just average for fight) if you find out it's not a performance concern


b) Data acquiring

Okey, we need to get AVG and SD to the script. However as I advised - we can put data into the table we want to work with anyway and fetch it in a necessary initial script phase, so number of queries will be +0 and number of transferred data +2 ints...almost zero change


c) Calculation
$sleep variable can be pre-calculated, so will be determined only ONCE
using the more extensive technique - with $sleepFinal we'll need calculation per cycle, however with reduced number of cycles by first technique and the fact that it's just one calculation, no cycles etc., it will be fast

Finall finished ;)

Feedback time :)

Of course I welcome and will be really glad for any feedback.

I'm also unfortunately not a mathematician, formulas are rather empiric and not really extremely thought through and I wasn't able to determine $rescale, so if anyone will know fix or some polish them, I'll be grateful

Hope it will help at least some of you
« Last Edit: May 16, 2010, 08:56:46 AM by Nox »
Meet us at an IRC irc.freenode.net #bbg as well
https://vimeo.com/36579366 (a must-watch) | Join BOINC - no longer a hype, but you can help never the less

Offline gnoh

  • Game Owner
  • Level 15
  • *
  • Posts: 127
  • Reputation: +2/-0
    • View Profile
Re: Long polling adaptive frequency & time shift
« Reply #1 on: May 18, 2010, 04:45:08 AM »
A little bit more complicated than what I've set up,  I have a single point of call for all commands to pass through on route to the server so all I do is increase the sleep time if it's a command that implements an interface for polling(system generated), reset the sleep time if it doesn't(user generated command).

I've set a limit on how high the sleep can go which is just a final field within the command itself, part of the output of a polling command is how long you should wait until it sends again.

I should also say that this is for my strategy game, the next game I'm developing is using comet.

 


SimplePortal 2.3.3 © 2008-2010, SimplePortal