Ok. After a month of researching, I've slightly modified the DWZ system to fit Shoddy Battle's needs. I've also added some additional elements.
This should be used as the CRE only.
The range of ratings.
With ELO and a K factor of 32, ratings on a chess server called "Internet Chess Club" range from roughly 400 to 3400, which is far too wide for Shoddy Battle's needs.
Therefore, we include an acceleration factor, which is much higher than the K factor of 32, but slows down as more games are played. This is the DWZ's version of Volatility.
Our Old friend from ELO
So, we are going to take apart this system.
Rn = Ro + K(r - Re), where K = acceleration factor, Ro = old rating, Rn = new rating, r = result, Re = expected result.
I wasn't really able to mess with Re, as that requires an in-depth knowledge of statistics which I don't have. So we will keep it as is.
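As a rough illustration of the ELO-style update described above, here is a minimal Python sketch. The update Rn = Ro + K(r - Re) follows the variable definitions given; the expected-result function is the standard ELO logistic curve with a 400-point scale, which is an assumption on my part since the exact Re formula isn't written out here. The function names are just for illustration.

```python
def expected_result(rating: float, opponent_rating: float) -> float:
    """Expected result Re for one game.

    ASSUMPTION: standard ELO logistic curve with a 400-point scale;
    the post does not spell out its own Re formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def new_rating(old_rating: float, k: float, result: float, expected: float) -> float:
    """Rn = Ro + K * (r - Re)."""
    return old_rating + k * (result - expected)


# Two equally rated players: Re is 0.5, so a win (r = 1) gains K / 2.
re = expected_result(1500.0, 1500.0)
print(new_rating(1500.0, 32.0, 1.0, re))
```

So performing better than expected (r > Re) raises the rating, and performing worse lowers it, with K setting the size of the step.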
The Acceleration Factor
Now, this is interesting: K = Kb / (E + n), where Kb is the base K, E is the volatility factor, and n is the number of games played.
Kb should be somewhere around 600-800; the exact value should vary with the number of players actively playing Shoddy Battle.
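To see how the acceleration factor slows down, here is a small Python sketch using K = Kb / (E + n), the relation this post uses for K. The values Kb = 800 and E = 10 are only picked for illustration (E is normally computed from the volatility formula, which is not reproduced in this sketch).

```python
def acceleration_factor(kb: float, e: float, n: int) -> float:
    """K = Kb / (E + n): shrinks as the number of games played (n) grows."""
    return kb / (e + n)


# Illustrative values only: Kb = 800, and E = 10 held fixed for simplicity.
for games_played in (0, 10, 30, 90):
    print(games_played, acceleration_factor(800.0, 10.0, games_played))
```

With these numbers K starts at 80 for a brand-new player and is down to 20 after 30 games, so early results move the rating much more than later ones.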
Volatility Factor
This is where things get complicated...
a is the Ratings Factor (starting to run out of usable variables), Eo is the volatility, and B is the Breaking Point.
This calculation raises Eo for higher-rated players and therefore lowers K: remember that K is Kb / (E+n), so a higher Eo gives a higher E, which means Kb is divided by a larger number. Of course, this rests on the assumption that higher-rated players are more stable (which is not true in all cases).
S is the Slowdown Constant, which controls how much you want to slow the rating down. I haven't played around with this value yet (I've used 7.5 in all of my calculations), but it can be anywhere from 5 to 15. Of course, the higher the slowdown constant, the less weight the volatility has on the rating.
The Ratings Factor is to create some sort of floor. First we establish:
Now, optimally we want ratings to run from about 1000 to 2000, so we establish a as
Now the final part. B, the Breaking Point.
B is tricky; it exists so that lower-rated players can accelerate faster.
This has the effect of reducing the K factor so less rating is lost when r is smaller than Re.
Also, we must note, to curb the ratings,
if B = 0, and 5 < E < 150 if B > 0.
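The bound on E can be sketched as a simple clamp. Only the B > 0 range (5 to 150) is taken from the text above; the range for B = 0 is not given here, so the sketch leaves it as an optional parameter (zero_b_bounds is a hypothetical name) and does nothing when it is not supplied. I've also treated the limits as inclusive, which is an assumption.

```python
from typing import Optional, Tuple


def clamp_volatility(
    e: float,
    b: float,
    zero_b_bounds: Optional[Tuple[float, float]] = None,
) -> float:
    """Keep E inside its allowed range so K cannot run away.

    For B > 0 the range 5..150 comes from the post (treated as inclusive here).
    For B = 0 the range is not stated, so it must be passed in explicitly
    via the hypothetical zero_b_bounds parameter; otherwise E is unchanged.
    """
    if b > 0:
        lower, upper = 5.0, 150.0
    elif zero_b_bounds is not None:
        lower, upper = zero_b_bounds
    else:
        return e
    return min(max(e, lower), upper)


print(clamp_volatility(200.0, 1.0))  # above the B > 0 range, pulled back in
print(clamp_volatility(2.0, 0.5))    # below the B > 0 range, pulled back in
```

Since K = Kb / (E + n), a floor on E caps how large K can get, and a ceiling on E stops K from collapsing to almost nothing.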
Notes
The current function of the CRE, as Colin has said, is to be a statistic to rank players. But the point of ratings is to rank players, is it not? In the current system, you can lose rating due to inactivity, or even by winning games.
In this system, if you perform better than expected, you gain rating; if you perform worse than expected, you lose rating. It fixes you to some specific ranking and adjusts it once a match is played. Is this not what Shoddy Battle needs?
This system allows you to start with a high K-factor, but it decreases as we go up. So, if we were to take statistics of everyone's rating, the curve would have few players ranked low (i.e. 800-1400), many players ranked from 1500 to 1700, and very few players ranked 1700+**. This allows us to differentiate the better players from the worse ones, and is "a statistic introduced to simply rank players".
This system is still missing RD from the Glicko-2 system, but it doesn't matter, as the CRE is again "a statistic introduced to simply rank players". The slight rating difference is negligible since many players will be between 1500 and 1700. This system mainly differentiates very good players (1700+) who belong on the ladder board from players who are here just to have fun.
**You can also see this curve on tests, e.g. the SAT, where there will be ~3000 people with 600+ on a specific section, but 30,000 with 400-590 on another specific section.