New Rating Algorithms for AoE, AoK, AoC and AoM!

 +[UB]Elusive


Group: Server Administrator
Join Date: 18 June 2007
Posts:7072
Edited 1 January 2014 - 10:23 pm by +[UB]Elusive
I have coded up some new algorithms that need testing and review before we enable them. Let me know your thoughts.

Supported algorithms:
  • Elo - Elo as it exists today; points are distributed evenly among the winning team.
  • ESO - Elo variant used by ESO for AOM and AoEIIO
  • Elo Fair - Elo + Fair team balancer (lower rated players get more points when the game is won) + points are also scaled up Nx if there are N players on a team. So instead of 16 pts split across the winners, on 2v2 it's 32 split across the winners.
  • ESO Fair - ESO + Fair team balancer
  • Elo Fair 2 - Elo + Fair team balancer + no point scaling

The ESO algorithm is intended, by request, for AOM and AOE III games. Elo Fair and Elo Fair 2 are intended for AoE through AoC; they stick with Elo because that's what people are familiar with. The only change is how points are distributed.

To understand what the fair team balancer does, take a look at the examples below, and look closely at the '2100' player under these algorithms.

ELO today:
Code:
+--------------+---------------+---------------+-------------+
| Team 1 (Won) | Team 2 (Lose) | Team 1 (Lose) | Team 2(Win) |
+--------------+---------------+---------------+-------------+
| 2100 (+3)    | 1600 (-3)     | 2100 (-5)     | 1600 (+5)   |
+--------------+---------------+---------------+-------------+
| 1650 (+3)    | 1655 (-3)     | 1650 (-5)     | 1655 (+5)   |
+--------------+---------------+---------------+-------------+
| 1500 (+3)    | 1500 (-3)     | 1500 (-5)     | 1500 (+5)   |
+--------------+---------------+---------------+-------------+
| 1400 (+3)    | 1400 (-3)     | 1400 (-5)     | 1400 (+5)   |
+--------------+---------------+---------------+-------------+
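
As a rough cross-check, the +3/-5 figures in the table above are consistent with plain Elo applied to team averages. The sketch below is not the server's code. It assumes (hypothetically) that a team's rating is the plain average of its players, that K=32 covers the whole team, and that the result is split evenly and rounded per player. The function names are made up for illustration.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of side A against side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def even_split_deltas(team1, team2, team1_won, k=32.0):
    """Return (team1_deltas, team2_deltas), one rounded delta per player.

    Assumptions, not confirmed by the thread: team rating = average of
    its players, and k is the pool for the whole team, split evenly.
    """
    avg1 = sum(team1) / len(team1)
    avg2 = sum(team2) / len(team2)
    score1 = 1.0 if team1_won else 0.0
    # Signed pool from team 1's point of view.
    pool = k * (score1 - expected_score(avg1, avg2))
    delta1 = round(pool / len(team1))
    delta2 = round(-pool / len(team2))
    return [delta1] * len(team1), [delta2] * len(team2)


team1 = [2100, 1650, 1500, 1400]
team2 = [1600, 1655, 1500, 1400]
print(even_split_deltas(team1, team2, team1_won=True))
print(even_split_deltas(team1, team2, team1_won=False))
```

Under these assumptions the sketch gives +3/-3 when team 1 wins and -5/+5 when it loses, which lines up with the table. That agreement suggests the assumptions are plausible, but it does not confirm how the real implementation works.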

ELO Fair:
Code:
+--------------+---------------+---------------+-------------+
| Team 1 (Won) | Team 2 (Lose) | Team 1 (Lose) | Team 2(Win) |
+--------------+---------------+---------------+-------------+
| 2100 (+1)    | 1600 (-13)    | 2100 (-35)    | 1600 (+18)  |
+--------------+---------------+---------------+-------------+
| 1650 (+9)    | 1655 (-15)    | 1650 (-22)    | 1655 (+16)  |
+--------------+---------------+---------------+-------------+
| 1500 (+14)   | 1500 (-9)     | 1500 (-15)    | 1500 (+22)  |
+--------------+---------------+---------------+-------------+
| 1400 (+18)   | 1400 (-5)     | 1400 (-11)    | 1400 (+26)  |
+--------------+---------------+---------------+-------------+


ELO Fair 2:
Code:
+--------------+---------------+---------------+-------------+
| Team 1 (Won) | Team 2 (Lose) | Team 1 (Lose) | Team 2(Win) |
+--------------+---------------+---------------+-------------+
| 2100 (+1)    | 1600 (-3)     | 2100 (-8)     | 1600 (+4)   |
+--------------+---------------+---------------+-------------+
| 1650 (+2)    | 1655 (-3)     | 1650 (-5)     | 1655 (+4)   |
+--------------+---------------+---------------+-------------+
| 1500 (+3)    | 1500 (-2)     | 1500 (-3)     | 1500 (+5)   |
+--------------+---------------+---------------+-------------+
| 1400 (+4)    | 1400 (-1)     | 1400 (-2)     | 1400 (+6)   |
+--------------+---------------+---------------+-------------+
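
The thread does not give the formula behind the fair team balancer, only its effect: on a win the lower rated players take a larger share, and on a loss the higher rated players give up more. One way such a split could work is sketched below. The weighting (each player's own Elo odds against the opposing team average) is purely an assumption for illustration, and `fair_split` is a hypothetical name, not something from the test tool.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score of `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def fair_split(pool, team, opponent_avg, won):
    """Split a team's point pool unevenly between its players.

    Hypothetical weighting (an assumption, not the thread's formula):
    - winners are weighted by (1 - expected score), so underdogs gain more;
    - losers are weighted by expected score, so favourites lose more.
    Returns signed, unrounded deltas that sum to +pool or -pool.
    """
    if won:
        weights = [1.0 - expected_score(r, opponent_avg) for r in team]
    else:
        weights = [expected_score(r, opponent_avg) for r in team]
    total = sum(weights)
    sign = 1.0 if won else -1.0
    return [sign * pool * w / total for w in weights]


team1 = [2100, 1650, 1500, 1400]
team2_avg = sum([1600, 1655, 1500, 1400]) / 4
print(fair_split(10.5, team1, team2_avg, won=True))
print(fair_split(21.5, team1, team2_avg, won=False))
```

The pool sizes 10.5 and 21.5 are example inputs only. The output shows the same qualitative pattern as the tables (the 2100 player gains least on a win and loses most on a loss), but the exact numbers are not expected to match them.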

Instructions for testing



The test application is attached as algorithmtesting.zip. To run it, use test.exe followed by the algorithm id. The ids are:
  • RATING_ALGO_ELO 0 // Standard Elo
  • RATING_ALGO_ESO 3 // ESO -- no fair team balance
  • RATING_ALGO_ELO_FAIR 4 // Elo + Fair team balancer + points are also scaled up Nx if there are N players on a team. So instead of 16 pts split across the winners, on 2v2 it's 32 split across winners.
  • RATING_ALGO_ESO_FAIR 5 // ESO + Fair team balancer
  • RATING_ALGO_ELO_FAIR2 6 // Elo + Fair team balancer + no point scaling

So, for example, to test RATING_ALGO_ESO_FAIR you would run 'test.exe 5' in cmd.

I've attached the results of all the runs other than RATING_ALGO_ESO (I ran out of slots to attach the files).

I created very simplistic test cases -- I would like you guys to expand upon what I have in testcases.ini

Test cases are simple:
Code:
[7]
Team1=2100,2200
Team2=2050,2250

[7] = test case id (just a unique string).
Team1 = list of ratings on team 1 (Team2 is the same for team 2)

Results show the scenario of team 1 winning on the left and the scenario of team 2 winning on the right. The format of a rating cell is: original rate (delta after rating algo).
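
For anyone who wants to generate or sanity-check entries for testcases.ini outside the tool, the format shown above is plain INI, so Python's standard `configparser` can read it. This is a minimal sketch that assumes every section has exactly the `Team1` and `Team2` keys shown in the example; the real file may contain more than that, and `load_test_cases` is a name invented here.

```python
import configparser

# Same shape as the example test case in the post.
SAMPLE = """\
[7]
Team1=2100,2200
Team2=2050,2250
"""


def load_test_cases(text):
    """Parse test cases into {case_id: {"team1": [...], "team2": [...]}}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    cases = {}
    for case_id in parser.sections():
        section = parser[case_id]
        cases[case_id] = {
            "team1": [int(r) for r in section["Team1"].split(",")],
            "team2": [int(r) for r in section["Team2"].split(",")],
        }
    return cases


for case_id, teams in load_test_cases(SAMPLE).items():
    avg1 = sum(teams["team1"]) / len(teams["team1"])
    avg2 = sum(teams["team2"]) / len(teams["team2"])
    print(case_id, teams, "averages:", avg1, avg2)
```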

To dump all the algorithm results to files use:
Code:
test.exe 0 > elo.txt
test.exe 3 > eso.txt
test.exe 4 > elo_fair.txt
test.exe 5 > eso_fair.txt
test.exe 6 > elo_fair2.txt
Attachments:
algorithmtesting.zip (file size: 1.13 MB)
elo.txt (file size: 3.25 KB)
eso_fair.txt (file size: 3.25 KB)
elo_fair.txt (file size: 3.25 KB)
elo_fair2.txt (file size: 3.25 KB)
 [I3acI]_Taff


Group: Standard Membership
Join Date: 18 April 2008
Posts:12435
Posted 1 January 2014 - 10:09 pm
ELO Fair and #2 both have merits for different scenarios.

ELO Fair would be great for CS, but not so good for RM.

ELO Fair#2 would work well for RM games.

 +[UB]Elusive


Group: Server Administrator
Join Date: 18 June 2007
Posts:7072
Posted 1 January 2014 - 10:12 pm
[I3acI]_Taff wrote:

ELO Fair and #2 both have merits for different scenarios.

ELO Fair would be great for CS, but not so good for RM.

ELO Fair#2 would work well for RM games.

Why doesn't ELO Fair work for RM? Didn't we get complaints about RM TG taking a long time for players to reach their 'fair' rate?

At the moment, you get more points in a 2v2 than in a 4v4. Should it stay like that for the RM TG ladder?
 +[UB]Elusive


Group: Server Administrator
Join Date: 18 June 2007
Posts:7072
Edited 1 January 2014 - 10:24 pm by +[UB]Elusive
I updated original post with more context.
 [I3acI]_Taff


Group: Standard Membership
Join Date: 18 April 2008
Posts:12435
Edited 1 January 2014 - 10:29 pm by [I3acI]_Taff
In RM TG even a 50 point difference can be a big skill gap.

The higher rated player will have done far more work to earn any win, in a team game.

That would just mean even less choice for players to join games. Why should a player 100 points higher do more for a win and then have less reward, when the lesser skilled player may have almost been dead and yet gains the most. Rated TG would be FAR more picky on who they allow to join.

So in theory for RM, we would just limit the choice of games even more.

Maybe all we end up with is more equally skilled games, but does the lack of choice outweigh that?

In RM I think it does.

The reason a 2v2 carries more points, is in effect it's 2 x 1v1's with very little chance to hide or be carried.


 [BH_]lnIghTzl


Group: Standard Membership
Join Date: 9 June 2010
Posts:9106
Posted 1 January 2014 - 10:39 pm
I think ELO Fair would be best for the RM/DM lounge, and ELO Fair 2 would be best for CS.

The point distribution and skill level in the RM/DM lounge matter a lot more than in the CS lobby.

 +[UB]Elusive


Group: Server Administrator
Join Date: 18 June 2007
Posts:7072
Posted 1 January 2014 - 10:44 pm
Any concerns on fair vs. regular Elo?

lnIghTzl and Taff you both said the opposite thing for Elo Fair vs Elo Fair #2. I implemented both because I figured it would be controversial :)

Can you run through some test scenarios in the test tool?
 [xCs]IYIastA


Group: Standard Membership
Join Date: 23 August 2011
Posts:3372
Edited 1 January 2014 - 10:53 pm by [xCs]IYIastA
Well since you are working in this area I'm going to throw out one of my oldest ideas for cs rates.

Make it so every major rated cs map has its own hidden ladder that only shows up in the room. Get rid of visible cs ratings even on the profiles. It does nothing for the community but cause dissension.

So for example there are these ladders:

CBA TG
CBA 1v1/2v2
CBA FFA
CBA Hero TG
CBA Hero 1v1/2v2
CBA Hero FFA
RCB
MCB
Blood TG (excludes the major maps)
Blood 1v1 (excludes the major maps)
Archer Blood
Path Blood
Smosh Blood
Dodgeball
Europe
 [I3acI]_Taff


Group: Standard Membership
Join Date: 18 April 2008
Posts:12435
Edited 1 January 2014 - 11:28 pm by [I3acI]_Taff
+[UB]Elusive wrote:

lnIghTzl and Taff you both said the opposite thing for Elo Fair vs Elo Fair #2. I implemented both because I figured it would be controversial :)

No idea why anyone would think the opposite to what I suggested.

ELO Fair:
Code:
+--------------+---------------+---------------+-------------+
| Team 1 (Won) | Team 2 (Lose) | Team 1 (Lose) | Team 2(Win) |
+--------------+---------------+---------------+-------------+
| 2100 (+1)    | 1600 (-13)    | 2100 (-35)    | 1600 (+18)  |
+--------------+---------------+---------------+-------------+
| 1650 (+9)    | 1655 (-15)    | 1650 (-22)    | 1655 (+16)  |
+--------------+---------------+---------------+-------------+
| 1500 (+14)   | 1500 (-9)     | 1500 (-15)    | 1500 (+22)  |
+--------------+---------------+---------------+-------------+
| 1400 (+18)   | 1400 (-5)     | 1400 (-11)    | 1400 (+26)  |
+--------------+---------------+---------------+-------------+

The above is what you suggested lnIghTzl

For a 4v4 RM Team game? I genuinely think that is not the way to go.

I am just one though
 2394823


Group: Gold Membership
Join Date: 26 November 2012
Posts:2397
Posted 1 January 2014 - 11:45 pm
[I3acI]_Taff wrote:

In RM TG even a 50 point difference can be a big skill gap.

The higher rated player will have done far more work to earn any win, in a team game.

That would just mean even less choice for players to join games. Why should a player 100 points higher do more for a win and then have less reward, when the lesser skilled player may have almost been dead and yet gains the most. Rated TG would be FAR more picky on who they allow to join.

+1 Strongly Agree


Imo, ELO Fair 2 would be the best for RM. Also for ELO Fair, it simply has no chance to work in RM.
 [TheJedi]Genette


Group: Standard Membership
Join Date: 8 July 2012
Posts:2448
Edited 3 January 2014 - 4:25 am by [TheJedi]Genette
With the given example, team 1 has a rating average of ~1662 & team 2 ~1539 -> rating disparity 123

This is huge for a 4v4, so I think 6 pts for the lowest rated player on team 2 (instead of 4 pts in standard Elo), is not that much. Can't we find some middle ground between "Elo Fair" (which is obviously over the top for RM) and "Elo Fair2"?

Total points to gain for team 2 would be 19. Given this rating disparity (100+) between the teams, I'd make it 27 (6+6+7+8).
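
The arithmetic in this post can be checked with a few lines of Python. The ratings and the per-player gains for team 2 are copied from the "ELO Fair 2" example table earlier in the thread; the variable names are only for illustration.

```python
# Ratings from the example tables in the first post.
team1 = [2100, 1650, 1500, 1400]
team2 = [1600, 1655, 1500, 1400]

avg1 = sum(team1) / len(team1)
avg2 = sum(team2) / len(team2)
disparity = avg1 - avg2

# Team 2's gains on a win under "ELO Fair 2", and the proposed alternative.
elo_fair2_gains = [4, 4, 5, 6]
proposed_gains = [6, 6, 7, 8]

print("team averages:", avg1, avg2)
print("rating disparity:", disparity)
print("Elo Fair 2 total:", sum(elo_fair2_gains))
print("proposed total:", sum(proposed_gains))
```

The averages come out at 1662.5 and 1538.75, a gap of 123.75, and the two totals are 19 and 27, in line with the figures quoted in the post.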
 [TheJedi]Genette


Group: Standard Membership
Join Date: 8 July 2012
Posts:2448
Edited 3 January 2014 - 5:19 am by [TheJedi]Genette
Closely related to this topic:

What about matching the teams based on "fairness" -> matching them in order to achieve similar rating averages? Or are there technical restrictions here (disabling the team option in the waiting window showing "?" until the game starts)
 +[UB]Elusive


Group: Server Administrator
Join Date: 18 June 2007
Posts:7072
Edited 3 January 2014 - 6:12 am by +[UB]Elusive
[TheJedi]Genette wrote:

Can't we find some middle ground between "Elo Fair" (which is obviously over the top for RM) and "Elo Fair2"?

I've been thinking about the same thing, splitting it up and having the various knobs tunable.

My current thinking is below, I will upload a new algorithm test tool today or tomorrow where we can play with various permutations.
  • Base Algorithm: Elo or ESO
  • Team Fairness:

    Motivation:
    - To discourage noob bashing (high rated players continually playing with new/low rated players). If all someone does is play with low rated players their rate should go down to match because they've only shown they can beat low ranked players -- so they shouldn't have a high rate.
    - If someone is playing with higher rated players -- they should catch up to those players quickly

    Solution:
    - Split points won taking into account each player's rate. The lower rated players get a larger share of the pot.

    Knobs:
    - 0..100% to control the split, from 0% (off, as today) to 100% (Elo Fair / Elo Fair 2). A split based 100% on "fair" would likely make it harder to find games during off hours.
  • Point Pool Size:

    Motivation:
    - It takes much longer to arrive at one's fair rating with TGs. We shouldn't require players to play games they don't find as enjoyable just to arrive at their rate faster.
    - Keeping higher skilled players at lower rates makes ratings less accurate.

    Solution:
    - Grow point pool as the # of players in the match increases.

    Knobs:
    - List to control point pool size in 1v1, 2v2, 3v3, 4v4, ...
  • New Player Boost:

    TBD

[TheJedi]Genette wrote:

Closely related to this topic:

What about matching the teams based on "fairness" -> matching them in order to achieve similar rating averages? Or are there technical restrictions here (disabling the team option in the waiting window showing "?" until the game starts)

Last time I brought this up (if I understand what you're suggesting) there wasn't much interest, because similarly rated players already play together.

Though, I do like the idea of more random games instead of stacked games -- to encourage randomized team games we could have those games provide points.
 +[UB]Elusive


Group: Server Administrator
Join Date: 18 June 2007
Posts:7072
Edited 3 January 2014 - 8:22 pm by +[UB]Elusive
The upgraded tool is ready! Results are below.
  • team_fairness=0 means split points as we do today, team_fairness=100 is splitting 100% using the new team fair algorithm.
  • point_pool_size is a list for 1v1,2v2,3v3,4v4 -- it's specified as an individual player's points in an evenly matched game. For example, point_pool_size=16,12,9,7 means that for 4v4 there are 7*4=28 points split among the winning team.
  • tool output now has a top line above player #0 for team average rate (total team points gained).
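
To make the two knobs concrete, here is a small sketch of how they could be interpreted. The pool lookup follows the description above (per-player points for an evenly matched game, multiplied by team size). The linear interpolation in `blend_shares` is an assumption about what "50%" means, and all function names are hypothetical rather than taken from ratecalc.

```python
def parse_point_pool_size(value):
    """Parse e.g. "16,12,9,7" into {team_size: per-player points}."""
    return {size: int(p) for size, p in enumerate(value.split(","), start=1)}


def even_match_pool(point_pool_size, team_size):
    """Total points split among the winners of an evenly matched game."""
    return point_pool_size[team_size] * team_size


def blend_shares(even_shares, fair_shares, team_fairness):
    """Mix an even split with a "fair" split.

    team_fairness is 0..100; 0 returns the even split, 100 the fair split.
    Linear interpolation in between is an assumption, not documented.
    """
    f = team_fairness / 100.0
    return [(1.0 - f) * e + f * g for e, g in zip(even_shares, fair_shares)]


pools = parse_point_pool_size("16,12,9,7")
print(even_match_pool(pools, 4))  # 7 * 4 points for a 4v4

even = [7, 7, 7, 7]
fair = [2, 5, 9, 12]  # made-up "fair" shares that also sum to 28
for fairness in (0, 50, 100):
    print(fairness, blend_shares(even, fair, fairness))
```

The `fair` list is invented purely to show the blend; the real per-player shares would come from whatever weighting the team fair algorithm uses.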

I like the 2nd set of results the best, though others are free to disagree.
  1. ELO today: ratecalc.exe --team_fairness=0
  2. ratecalc.exe --team_fairness=100 --point_pool_size=16,12,9,7
  3. ratecalc.exe --team_fairness=50 --point_pool_size=16,12,9,7
  4. ratecalc.exe --team_fairness=0 --point_pool_size=16,12,9,7
  5. ratecalc.exe --team_fairness=100 --point_pool_size=16,11,8,6
  6. ratecalc.exe --team_fairness=100
  7. ratecalc.exe --team_fairness=50 --point_pool_size=16,11,8,6
  8. ratecalc.exe --team_fairness=50

Excerpts from #2 are below:
Code:
Case: 3v3-Bash
+-----+--------------+---------------+---------------+--------------+
|     | Team 1 (Won) | Team 2 (Lose) | Team 1 (Lose) | Team 2 (Win) |
+-----+--------------+---------------+---------------+--------------+
| A/T | 1686 (+20)   | 1585 (-20)    | 1686 (-34)    | 1585 (+34)   |
+-----+--------------+---------------+---------------+--------------+
| P0  | 1840 (+3)    | 1600 (-7)     | 1840 (-15)    | 1600 (+11)   |
+-----+--------------+---------------+---------------+--------------+
| P1  | 1650 (+8)    | 1655 (-8)     | 1650 (-10)    | 1655 (+10)   |
+-----+--------------+---------------+---------------+--------------+
| P2  | 1570 (+9)    | 1500 (-5)     | 1570 (-9)     | 1500 (+13)   |
+-----+--------------+---------------+---------------+--------------+

Case: 4v4-Bash
+-----+--------------+---------------+---------------+--------------+
|     | Team 1 (Won) | Team 2 (Lose) | Team 1 (Lose) | Team 2 (Win) |
+-----+--------------+---------------+---------------+--------------+
| A/T | 2000 (+18)   | 1850 (-18)    | 2000 (-39)    | 1850 (+39)   |
+-----+--------------+---------------+---------------+--------------+
| P0  | 2500 (+1)    | 1920 (-6)     | 2500 (-16)    | 1920 (+8)    |
+-----+--------------+---------------+---------------+--------------+
| P1  | 1850 (+5)    | 1830 (-4)     | 1850 (-8)     | 1830 (+10)   |
+-----+--------------+---------------+---------------+--------------+
| P2  | 1950 (+4)    | 1800 (-4)     | 1950 (-10)    | 1800 (+11)   |
+-----+--------------+---------------+---------------+--------------+
| P3  | 1700 (+8)    | 1850 (-4)     | 1700 (-5)     | 1850 (+10)   |
+-----+--------------+---------------+---------------+--------------+

You can download the new version of the rating calculator at http://www.voobly.com/updates/elusive/ratecalc.zip .

Files:
  • testcases.ini - customizable list of scenarios
  • ratecalc.exe - main exe, reuse / edit command lines I listed above
  • gentestcases.bat - batch script that runs ratecalc with parameters already filled out for #1 - #8. You don't need to use this but if you add new scenarios you can quickly regen the lists I link here.
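
For those not on Windows cmd, or who want to script extra permutations, the eight command lines listed above can also be built from a small table. This is only a sketch: it uses the two flags named in this post, the output file names (`run1.txt` and so on) are invented, and it makes no claim about how gentestcases.bat itself is written.

```python
import subprocess

# The eight parameter sets listed in the post, in the same order.
RUNS = [
    {"team_fairness": 0},
    {"team_fairness": 100, "point_pool_size": "16,12,9,7"},
    {"team_fairness": 50, "point_pool_size": "16,12,9,7"},
    {"team_fairness": 0, "point_pool_size": "16,12,9,7"},
    {"team_fairness": 100, "point_pool_size": "16,11,8,6"},
    {"team_fairness": 100},
    {"team_fairness": 50, "point_pool_size": "16,11,8,6"},
    {"team_fairness": 50},
]


def build_command(options, exe="ratecalc.exe"):
    """Turn one parameter set into an argument list for ratecalc."""
    return [exe] + [f"--{name}={value}" for name, value in options.items()]


def run_all(exe="ratecalc.exe"):
    """Run every parameter set, writing output to run1.txt .. run8.txt.

    The output file names are hypothetical. Call this only where
    ratecalc.exe is actually available.
    """
    for number, options in enumerate(RUNS, start=1):
        with open(f"run{number}.txt", "w") as out:
            subprocess.run(build_command(options, exe), stdout=out, check=True)


# Print the command lines without executing anything.
for number, options in enumerate(RUNS, start=1):
    print(number, " ".join(build_command(options)))
```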
 [Foo]fobbix


Group: Standard Membership
Join Date: 29 July 2007
Posts:391
Posted 3 January 2014 - 11:44 am
This is just for NPL???

If not, then it's a bit unfair for players like myself who go into DM random team/civs games (michi, 20min cut - both of which give lesser players much more of a chance) and put 1000% effort in, only to lose the game because of a noob. Not to mention people might just smurf so they can ruin higher rated players' ratings. Also, I don't really like playing AR type maps since they're too open and you can just camp and eco raid, whereas with BF or closed maps like michi you have to fight your way through and, as I've said, it gives lesser players a chance.

But meh I already don't have a 2k rating which with being one of the best in DM you'd think I would have. I'm not a 1v1'er and I never team stack, and play lots of 3v3's and 4v4's. Therefore I only have a 18** rate and not the best win/loss ratio.

