Tuesday, December 22, 2009

Self-Balancing Systems?

I’ve been thinking about meta-games recently. No, not the metagame that I talked about earlier, but systems that produce sets of game rules—literally, software that procedurally generates entire games.

I’ve written posts abstracting the various parts of RPGs into neat definitions. The writing was dry and probably didn’t earn me much cred in blogging circles, but it needed to be done before I could even conceive of game rule generators for the RPG genre. Without some basic vocabulary we have difficulty even discussing simple abstract systems.

Instead of actually generating entire game systems from scratch, let’s try to solve an easier problem. How could we automate balancing of RPG combat mechanics? This requires algorithmically tweaking rules conceived by a designer—the designer gives a framework for abilities and the algorithm comes up with the exact numbers that should balance those abilities against one another to hopefully make all of them parts of viable strategies.

The processes used to self-balance combat mechanics can probably be modified and applied successfully to other aspects of the game, but being too abstract at this stage of the discussion could lead us into severe language trouble as I fail to find feasible broad terms.

I will summarize two ideas I have for self-balancing combat systems in this post. In later posts, I will discuss each in significantly more depth.

Zero-Sum Ability Balance

Assign utility values to each of the effects and costs that abilities can have (positive utilities for the effects and negative utilities for the costs). Pick out the effects each ability will have and make a weighted list of costs. The zero-sum ability balancer would go through and value the set of effects for an ability, then assign costs preferentially until the ability reaches a total utility of 0. If you value effects and costs properly, this will automatically balance the abilities in your game.
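To make the idea concrete, here is a minimal Python sketch of one possible zero-sum balancer. Everything in it is an assumption for illustration: the valuation tables, the effect and cost names, and especially the rule for "assigning costs preferentially", which I read here as splitting the utility to be paid off in proportion to each cost's weight. Other rules (for example, filling the highest-weighted cost first up to a cap) would fit the description just as well.

```python
# Hypothetical per-unit utility valuations (illustrative numbers only).
EFFECT_VALUE = {"fire_damage": 1.0, "stun_seconds": 4.0}
COST_VALUE = {"mana": -0.5, "cast_seconds": -3.0, "cooldown_seconds": -1.0}


def balance_ability(effects, cost_weights):
    """Return cost magnitudes that bring the ability's total utility to 0.

    effects: {effect_name: magnitude}
    cost_weights: {cost_name: weight}; a higher weight means that cost
    absorbs a larger share of the utility that has to be paid off.
    """
    effect_utility = sum(EFFECT_VALUE[name] * mag for name, mag in effects.items())
    total_weight = sum(cost_weights.values())
    costs = {}
    for name, weight in cost_weights.items():
        share = effect_utility * weight / total_weight  # utility this cost offsets
        costs[name] = share / -COST_VALUE[name]         # convert utility to magnitude
    return costs


def total_utility(effects, costs):
    """Sum of effect utilities (positive) and cost utilities (negative)."""
    gained = sum(EFFECT_VALUE[name] * mag for name, mag in effects.items())
    paid = sum(COST_VALUE[name] * mag for name, mag in costs.items())
    return gained + paid


if __name__ == "__main__":
    fireball = {"fire_damage": 20}
    costs = balance_ability(fireball, {"mana": 3, "cast_seconds": 1})
    print(costs, total_utility(fireball, costs))
```

With these made-up valuations, a 20-damage fireball whose designer prefers mana cost three to one over cast time comes out at 30 mana and a cast time of about 1.67 seconds. The hard part, as discussed next, is choosing the valuation tables, not the arithmetic.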

The primary difficulty is coming up with the proper valuations for different costs and effects. Genetic algorithms could be used to generate and cull valuation systems. The problem is highly multivariate, so writing an algorithm that attempts a direct solution process would be significantly trickier.

This method will work for both class- and skill-based systems.

Popularity-Based Ability Balance

Let players vote with their feet. If they think a certain ability is overpowered, let them choose the ability, but make that ability less powerful for everyone in the game world by some increment. The increment would be smaller for abilities that are staples for certain kinds of builds so that everyone using the level 1 firebolt won’t nerf it into being useless. Basically, the more people choose a power, the less powerful it is. Let the playerbase create characters and play the game in this way for a month (a shorter interval of time may work), then reset the base values for the effects on abilities and adjust the penalty-per-player. Run like this throughout the life of the game and watch it balance itself. If you have 10,000+ players, you have a reasonable chance of generating a balanced system, and that balanced system is actually self-adjusting when new patches come out!
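A rough Python sketch of that loop follows. The class name, the linear penalty, the staple scaling factor, and the choice to carry the penalized power forward as the next period's base value are all assumptions made for illustration; the description above leaves each of those open.

```python
from collections import Counter


class PopularityBalancer:
    """Hypothetical sketch: each pick of an ability weakens it for everyone."""

    def __init__(self, base_power, penalty_per_pick, staples=(), staple_scale=0.25):
        self.base_power = dict(base_power)        # ability -> base effect magnitude
        self.penalty_per_pick = penalty_per_pick  # fraction of power lost per pick
        self.staples = set(staples)               # abilities nerfed more gently
        self.staple_scale = staple_scale
        self.picks = Counter()

    def choose(self, ability):
        """Record that one more player picked this ability."""
        self.picks[ability] += 1

    def power(self, ability):
        """Current effective power after the popularity penalty."""
        penalty = self.penalty_per_pick
        if ability in self.staples:
            penalty *= self.staple_scale
        factor = max(0.0, 1.0 - penalty * self.picks[ability])
        return self.base_power[ability] * factor

    def end_period(self, new_penalty_per_pick):
        """Reset base values to the current powers and start counting again."""
        self.base_power = {a: self.power(a) for a in self.base_power}
        self.picks.clear()
        self.penalty_per_pick = new_penalty_per_pick


if __name__ == "__main__":
    game = PopularityBalancer(
        {"firebolt": 10.0, "meteor": 50.0}, penalty_per_pick=0.001, staples={"firebolt"}
    )
    for _ in range(100):
        game.choose("firebolt")
        game.choose("meteor")
    print(game.power("firebolt"), game.power("meteor"))
    game.end_period(new_penalty_per_pick=0.0005)
```

In this sketch 100 picks cost the meteor 10% of its power but the staple firebolt only 2.5%, which is the "smaller increment for staples" idea in its simplest form.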

If all players behave rationally and build characters based on information available about their abilities and popularity (the interface should provide some idea of how popular abilities are), this system would converge on balance faster than the zero-sum ability idea. This system would not counteract poor ability design in terms of putting useless or bugged effects onto expensive abilities, but it will balance the non-bugged abilities and it would make clear through the popularity numbers what abilities are bugged or need to be reworked. The game would need to make clear to the players that choosing the most popular ability is usually not the best idea—confusion in this area could lead to problems.

This approach probably wouldn’t work on a class-based system because it relies on the selection of individual abilities. If done at the class-level there’d be no way to isolate the abilities to buff or nerf because people don’t choose their class’ abilities, only the class itself.

OK, But…

These ideas are very experimental and very raw at the moment. I just wanted them to get some air and perhaps a few comments pointing out big holes that I missed.

Has anyone tried to implement a self-balancing system before?

If so, how was it done?

What ways can you think of to allow games to self-balance?

28 comments:

Stropp said...

Part of what you described, the self (de)scaling of abilities was already attempted in the original Asheron's Call magic system.

IIRC, the more a particular spell was used in a set period of time, the weaker it became.

I'm not sure why it ended up being dumped, but from what I remember it wasn't effective. Of course that could just be that particular implementation and doesn't mean the idea is worthless.

AC did try and attempt a few innovations with their magic system, but ultimately they didn't work as above, or players found their way around the game limitations (usually with external tools, or websites.)

evizaer said...

AC's magic system reduced ability power based on how many people had unlocked (or used) the ability? Or was it just that if you spammed an ability it'd have diminished effects for you alone?

Copperbird said...

I'm curious what you think about the way Guild Wars does things, where there are loads of abilities to choose from but players have to pick their builds carefully to deal with different encounters, and they can swap their builds around very easily.

It doesn't guarantee all abilities are equal but it leaves room for some very niche builds.

Stropp said...

There wasn't any unlocking per se. Players had to figure out what components had to be combined in order to cast a particular spell. These could be harvested or bought for the most part. Some of the higher level comps were drops iirc.

The reduction occurred globally for all users of the spell, not just for the player. But come to think of it, there might have been some scaling down just for the player if the spell was spammed. Can't quite remember though, it's been a while.

Borror0 said...

Some friends of mine have actually developed software to help game designers with balance. Google "+7 balance" for more info.

I also mailed them the blog. Maybe they'll drop by and comment.

Argent said...

Ryzom actually has the first (zero-sum) system built into the ability system in-game. Instead of learning e.g. "level 1 fireball", you actually learn ability components and then combine them for an effect. An example, with made-up weights:

fire damage level 2: 13 points
short range (24 yards): -2 points
extended casting time (2s): -7 points
moderate mp cost(10mp): -4 points

Basically, you can choose a base effect (such as fire damage), and then apply positive or negative modifiers to the base effect to build the exact effect you're looking for. So if you've learned the right spell components, you can choose between an expensive, fast fire-2 effect and a cheap, slow fire-2 effect.

There are actually 4 different "schools", each of which has a couple of base effect types:

Physical Combat: swords, bows, shields, etc.
Magic: direct damage, channeled effects, healing, buffs
Gathering: skinning, herbalism, mining, I think
Crafting: various weapons, armor, magic items

You actually split the XP for defeating a foe between the physical and magic school depending on how much you used each to defeat the monster.

Argent said...

That is, Ryzom has a self-balancing system of the first type, not that they are necessarily the first game to do so.

Nils said...

For me to enjoy such a system I'd need an in-game explanation.

For e.g. magic spells that is even easy:
Just introduce magic like an often used spell becomes less effective.

It's harder for the typical skills of e.g. typical warriors.


However, my guess is that the real problem is identification of the important properties in the databanks. Once you did this you can also use a human to change the game rules slightly every few days.

Unknown said...

Are you talking about balancing for PVE or PVP? Or in a more general sense?

Also, are you balancing individually or for a group? I think balancing things individually will be MUCH easier than for the group.

Anonymous said...

I think I would not play a game called "Zero-Sum Ability Game".

Unless, the game mechanics was actually designed with specific paradigms in mind, to complement it all. Sitting on a horse back, enjoying some kind of bonus, but being vulnerable or limited in some other way.

I am confident I couldn't stand a pure number-balancing game. No matter how fancy the graphics/effects were.

I never liked WoW and have issues with Eve-Online (pacing) that I hope will be improved before I grow old and die.

Nathaniel Bogan said...

As Borror0 mentioned, we at +7 Systems have been working on the 2nd method for several years, and we've successfully proved the concept in Hordes of Orcs. See our web site (linked from my name) for tons more on this.
So basically, I think it's a GREAT idea. :)

Nathaniel Bogan said...

As you might suspect, though, it has to be done VERY, VERY carefully.

evizaer said...

I'm glad that I'm not crazy. If you are looking to hire another developer, Nathaniel, I'm currently looking for work.

@Anon: The game obviously wouldn't be called "zero sum ability game"... It's just a balancing system that can be fit into any game.

Also, balancing abilities doesn't mean making everything have a negative. Look at games like Settlers of Catan that involve very little in the way of penalties but are still balanced. Sometimes it's simply a matter of opportunity costs.

The horse example is interesting, though, I'll consider that further...

@Argent:
I'm aware of Ryzom's system. They have inspired a lot of my thinking on ability design. I beta-tested that game five years ago and it has had a significant impact on how I think about balancing RPGs.

Ryzom's system isn't really self-balancing, it just relies on a zero-sum balance. A self-balancing system needs to actually observe player choices and adjust the values of various effects and costs in order to automatically rebalance abilities based on how players behave.

Matt said...

Very interesting topic. I've always perceived balance as an intractable problem in game design, especially when you introduce open worlds and the idea of RPG progression into the mix. My approach would have started with ideas found in fighting game design, and specifically some thoughts that Sirlin broached on his blog.

A few concerns come to mind, though I've not thought through them much yet.

1) Given enough time, do both of these methods tend to homogenize the effects of each ability. In other words, will a basic set of abilities emerge (direct ranged damage, ranged damage over time, direct touch damage, etc.) only differentiated by animation and flavor text?

2) How would one account for variables that are not easily quantified? For example, how does a crowd control ability fit into an equation with a direct damage ability? Is duration a descriptive enough attribute of crowd control to balance the equation? How about advantages conferred onto specific abilities by terrain or surroundings?

3) Do interactions among abilities play a role in the balancing? I'm envisioning a scenario where the interplay between two abilities makes for an especially potent combination. Which ability would be the target for balancing?

In thinking about it while I wrote the above points, I can see some potential for genetic algorithms to alleviate problems associated with points 2 and 3. A varied set of mutations would seem to zero in on the most effective balance given enough generations. However, I'm still unsure about the first point. Maybe it's not a problem. Maybe there are enough variations to make for an interesting game.

evizaer said...

Good criticisms, Matt. I was actually thinking about these points as I was writing this article. Including my analysis for each system would have led to a monstrous post, so I decided to summarize a couple of ideas and the basis for what I'm trying to do with self-balancing systems. Later I will go into more depth on the differences between these systems and their assumptions, pitfalls, and benefits.

"1) Given enough time, do both of these methods tend to homogenize the effects of each ability."

Neither would homogenize on effects because a human game designer needs to actually tell the balancing system what abilities have what kinds of effects and costs. We're not trying to generate abilities from whole cloth; the balancing system only adjusts the magnitudes of an ability's effects and costs. A balancing system isn't going to save you from bad design, but it will do a better job than you will at making the magnitudes of effects and costs balance between abilities.

"2) How would one account for variables that are not easily quantified?"

It's not much of a problem, actually. In a zero-sum system the crucial process is tuning the valuations for different kinds of effects and costs. A debuff would have a different valuation method than a DD effect would. Same thing with CC. Over time you can tweak the valuations to a point where balance would be significantly improved over simply assigning whatever magnitudes you want.

"3) Do interactions among abilities play a role in the balancing?"

The zero-sum system would have some trouble with this, though you could base the valuations for some abilities off of the values of others (like a fire resist debuff, for instance, could be based off of the cost per damage for your fire spells).

The popularity system would have this kind of interaction balancing built-in. People who chose a specific combo would drive down the power of that combo and give rise to different combos coming into primacy. Those, in turn, will become popular and become weakened until people find something else. The migration of the player through the strategy-space would be facilitated by a popularity-based system--there'd be an interesting metagame there.

Sven said...

I do have one nagging concern with self-balancing games. Imagine (for the sake of argument) that it is possible to build a perfect system where there are no overpowered or underpowered abilities.

In that case, all character design choices would be effectively equal. The trouble with this is that there is no longer any skill involved in character building: you may as well pull your spec out of a hat. You've effectively de-skilled a large part of the game.

Kenny said...

Your zero sum algorithm will probably never work properly as assigning numerical values to components is just as arbitrary as assigning abilities to classes. Your other idea is basically a computer controlled Flavor of the Month system with all of the inherent weakness of such systems. Sven hit the nail on the head: eventually all possible combinations will balance out and it will not matter too much how you build your character as any and all strategies are viable. This is if your players are still around by that time.

Then there's the problem with immersion ("Gandalf was throwing around fire magic for all his life but all of a sudden he realized, for reasons beyond his understanding, that with a well placed ice cone he could achieve more than with an enraging Inferno"), not to mention that you penalize all those who initially found the best builds. The latter can, with careful storytelling, work in a single player game (reducing effectiveness of frequently used stuff) but not in an MMO, especially not for reasons beyond the players' control. It might keep the most active players jumping between builds but the masses will be just annoyed out of their minds (especially because they always follow others and it will seem to them they will never be able to catch up or be "uber").

But given all classes/builds are equally viable you still have the problem with players themselves, because of their personality they _will_ congregate around certain builds. Is it fair to nerf a build compared to others simply because of player preference? But if you don't how do you know if they are "powerplaying" (avoiding metagaming here :] ) or not?


Personally I think the answer to this is periodic (think lots of months here) server resets with changes in between. But this would be a tough pill, not unlike permadeath. ;)

evizaer said...

@Sven: It wouldn't be strategically upsetting and boring to play a perfectly balanced game because the viability of a strategy does not need to be universal. When you fight different kinds of enemies, you'll adopt different strategies even though you have the same abilities in each fight.

Character build choices wouldn't be effectively equal, because choosing a fire resist debuff and a fireball will never be worse than choosing an ice resist debuff and a fireball. The intercombination of abilities would produce the uniqueness of characters. Some people will choose to be full-on nukers, some people might choose to nuke but also heal. Yet more people might choose to wear heavy armor and have moderate healing abilities and good fighting abilities. In different situations, different abilities shine. The idea of self-balance is to not allow some abilities to outshine the rest on a consistent basis.

@Kenny:

I don't think gradual balance changes would be any more immersion-breaking than balance patches.

You have a good point about zero-sum systems. I think that the arbitrariness is sourced not at the level of assigning abilities to classes, but at the level of assigning effects and costs to abilities. Clearly a zero-sum system cannot fix big problems, but it can ensure that all abilities are conformant with certain standards of the usefulness of different effects and the costliness of different costs. The work shifts towards getting the valuations (standards of effectiveness/costliness) correct and appropriately spreading different effects and costs among different abilities.

The reason I want to systematize the valuation of different costs and effects is that I want to have a central place where I can manipulate specific mechanics without having to cascade my changes manually everywhere else they're found. If I decide that debuffing enemies is too powerful, I could simply intensify its valuation function and the increased costs would perhaps bring that entire class of abilities closer to balance.

A popularity system would automatically nerf FOTM builds before they can gather steam. Such a system is at its best when everyone picks the most overpowered abilities when they make their character. This assumption can be dangerous, though, because sometimes people will choose abilities because of their utility beyond the mere fact that they're overpowered. I'll discuss this more in a future post.

Dblade said...

Popularity-based systems would be hard on us players. I can think of a few reasons:

One is that it can only effect a small set of abilities. You know what the most popular and widely used ability in an MMO is? A cure spell. You'd have to exempt healing, and probably hate management generating abilities otherwise you get the nasty situation where the more tanks and healers, the weaker they become in their jobs.

So that leaves it mostly to things like DPS and crowd control. I can tell you right now that DPS will hate you, since they out of all classes need to feel the best. Weaker DPS do not get invites: weaker tanks and healers can still muddle along.

This relates to another point. Sometimes people actually prefer the weaker class. Blue Mages in FFXI were very popular, because it was an iconic job, and it was a lot of fun to play. However, as a pure DPS class, it fell short compared to other heavy DD like warrior, due to its reliance on MP. What would happen under a popularity based system is you'd nerf a weaker class and make them even weaker.

I can't even comment on a zero-sum system, because I can't see how you can quantify combat in an MMO.

evizaer said...

"One is that it can only effect a small set of abilities."

No. It can affect all abilities, just to different extents depending on the inherent necessary popularity of the ability.

"You know what the most popular and widely used ability in an MMO is? A cure spell."

The most popular and widely used ability is to hit something with whatever you have in your hand, e.g. a basic melee attack.

Your conception of the popularity of certain abilities is very off, as well. Hate management abilities get used much less in general than abilities useful in soloing. Healing gets used much less than DD. The average ability available to a DPS class is more popular than the average ability of a tank or healer class because there are usually more people playing DPS classes (soloability) at any time than playing tanks and healers combined. In WoW, the ratio is usually 3 DPS to 2 non-DPS.

The goal of the system isn't to nerf certain abilities into oblivion. The effects of increased popularity could scale logarithmically based on the percentage of the population using the ability. In general, you wouldn't see abilities vary widely depending on popularity. The changes wouldn't be more than 10-20% of the magnitude of the ability's effects. It's a tweaking automation system, not a fundamental rebalancer.
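As a rough sketch of what logarithmic scaling with a cap could look like, here is one candidate curve in Python. The function name, the 20% default cap, and the `steepness` parameter are assumptions picked for illustration, not a settled formula; any concave curve that starts at zero and tops out at the cap would do the same job.

```python
import math


def popularity_penalty(share, max_penalty=0.20, steepness=50.0):
    """Fraction of an ability's magnitude removed at a given popularity.

    share: fraction of the population using the ability, from 0.0 to 1.0.
    max_penalty: penalty reached when everyone uses the ability (assumed cap).
    steepness: hypothetical knob; higher values front-load the penalty.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return max_penalty * math.log1p(steepness * share) / math.log1p(steepness)


if __name__ == "__main__":
    for share in (0.0, 0.01, 0.1, 0.5, 1.0):
        print(f"{share:>5.0%} of players -> {popularity_penalty(share):.1%} weaker")
```

The curve rises quickly for the first few percent of adopters and then flattens, so an ability that half the population uses is only slightly weaker than one that a quarter uses, and nothing is ever reduced by more than the cap.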

"This relates to another point. Sometimes people actually prefer the weaker class. Blue Mages in FFXI were very popular, because it was an iconic job, and it was a lot of fun to play. However, as a pure DPS class, it fell short compared to other heavy DD like warrior, due to its reliance on MP. What would happen under a popularity based system is you'd nerf a weaker class and make them even weaker."

The number of people who do this may be statistically insignificant. Though you do point out an important assumption in the popularity-based system: players are assumed to pick overpowered abilities over others. This isn't necessarily the case, but is the variance so caused significant enough to wreck an automatic balancer that relies on this assumption? I don't know yet, but I lean towards "no."

"I can't even comment on a zero-sum system, because I can't see how you can quantify combat in an MMO."

It's really easy to quantify combat in an MMO. I'll be doing it in a future post.

Logan said...

Most everything i had in mind has already been said.

but there is 1 thing i can think of that hasn't really been mentioned... my thinking is that you should balance the whole CHARACTER, not the individual SKILLS.

If i were designing a system like this, i would make it so that the whole character operated on a Zero-Sum basis. so basically the player chooses their skills, and then they have attribute points that they can split up among their different skills. these attribute points can be applied to:

- Reduce Cast Time
- Reduce Cooldown
- Reduce Mana/Stamina/Energy Cost
- Increase Damage/Healing
- Increase Duration
Etc...

you'd have to limit the number of points that can be put into each individual skill, and each individual attribute, and each individual attribute within a certain skill... i'd also add that each point you add should give less of an increase to the selected attribute than the previous point added. so like the first point adds 15%, the next adds an additional 10%, and the third point only adds 5%.
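The diminishing returns described here (15%, then 10%, then 5%) can be sketched in a few lines of Python. The three-point cap per attribute and the function names are assumptions for illustration; the percentages are the ones quoted above, added together.

```python
# Bonus granted by the 1st, 2nd, and 3rd point put into one attribute of one skill.
POINT_BONUSES = (0.15, 0.10, 0.05)


def total_bonus(points):
    """Cumulative bonus for a number of points (assumed cap: 3 per attribute)."""
    if not 0 <= points <= len(POINT_BONUSES):
        raise ValueError("points must be between 0 and 3")
    return sum(POINT_BONUSES[:points])


def reduced_cast_time(base_seconds, points):
    """Example attribute: each point shaves a shrinking share off the cast time."""
    return base_seconds * (1.0 - total_bonus(points))


if __name__ == "__main__":
    for points in range(4):
        print(points, round(reduced_cast_time(2.0, points), 3))
```

Under these numbers a 2-second cast drops to 1.7, 1.5, and 1.4 seconds as points are added, so the third point is worth a third of the first.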

i've actually already designed a system like this and was just going to copy and paste it here but it would have been very long and detailed and i don't want to hijack this thread too much.

basically i'd just remember that it's the WHOLE player that you really need to balance, not just the individual skills... and quite frankly you WANT skills that are more or less powerful, if all the skills are the same strength, the game is going to be pretty boring... it's the niche skills that are only useful in certain situations that really add something interesting to the game.

overall i really enjoyed this post, much more than any other on here so far. looking forward to more discussions like this in the future.

Dblade said...

Evi:

"Your conception of the popularity of certain abilities is very off, as well."

Here is a link:

http://www.playonline.com/ff11eu/guide/development/census/09/3.html

This is the Vanadiel Census, linking to the page of job distribution. The top ranking jobs happen to be the two healer classes, red and white mage. If you go by real numbers playing per job, you'd be nerfing the cure spells, as well as black magic. Irony is black mage is so undesired in parties that they solo to cap now. So it's not always as open and shut as you think.

"The number of people who do this may be statistically insignificant. Though you do point out an important assumption in the popularity-based system: players are assumed to pick overpowered abilities over others. This isn't necessarily the case, but is the variance so caused significant enough to wreck an automatic balancer that relies on this assumption? I don't know yet, but I lean towards "no.""

Actually it would. Look at the very last job on the list, corsair. That job is actually incredibly powerful.

The thing is, the powerful jobs in there you can't tell just from numbers. The fourth most popular job is thief, but in play it is not very useful-it is popular because it is a starting job at low levels. Very few play corsair, but the job is in high demand, and you can level it within a month to cap. It is really potent even if played poorly.

Also, these jobs all vary wildly in each setting of the game. A black mage is useless in xp partying, but vital in endgame. A bard is vital everywhere. Corsairs are much less useful in endgame because of their skill caps for their guns: they lose a lot of utility as a dd and become pure support.

The situation will differ based on the MMO you create, but even in the most themeparkiest of themepark games, the situation may be a lot more counterintuitive and dynamic than you think.

evizaer said...

http://www.warcraftrealms.com/census.php

Notice how the healing classes are far far outnumbered by the DPS classes (even if you assume hybrids are HALF pure healer and half non-healer, which definitely isn't close to reality). It's much safer to assume that WoW is a more representative sample of potential players for an MMO than a low population and nearly obsolete game like FFXI is.

FFXI does not represent the general patterns of modern MMOs because it is a game of forced grouping primarily. Modern MMOs focus on soloability, which strongly biases the kind of classes people will play.

Also, in your FFXI numbers, the healing classes ARE outnumbered by non-healers by at least 10% if you add up the non-healers and compare them against red and white mages.

The census is also voluntary. This leads to participation bias. Clearly only the more hardcore players will participate because only those that really care about the game will know about it.

Even if the healers did outnumber the non-healers, you can't reasonably assume that "cure" is the most used ability in the game. As I said before, basic melee attack is the most commonly used ability.

evizaer said...

A further note on the WoW census: It looks like 61% of people play classes that have no healing abilities. Even with a +-10% margin of error, the plurality of players are playing non-healers. If you remove tanks from non-healers, you have a 51% plurality of players playing DPS classes. If we count the people who are playing DPS warriors, shaman, and druids, that number would be at least 60% DPS builds at any given moment.

Kenny said...

"The reason I want to systematize the valuation of different costs and effects is that I want to have a central place where I can manipulate specific mechanics..."

How do you plan to assign numerical values to synergies and (hopefully) emergent uses of character abilities?

Brian 'Psychochild' Green said...

One problem I see is that you don't define balance. This is vital for determining the design goals of your system and measuring success. I worry, especially for the popularity-based system, that you're simply trying to get people not to play flavor-of-the-month combos; this usually is more a reflection of fashion rather than raw power.

evizaer wrote:
As I said before, basic melee attack is the most commonly use ability.

Are you going to use the system to balance basic attacks? If not, you're splitting hairs.

Some thoughts about zero sum:

1. If I understand the description correctly, you're still relying on the designer to adjust values. Imbalances in current games are mostly due to designer error.

2. Designers may not notice interaction. Say fire spells are too powerful for one class because they get a bonus to critical effects. Making the "fire damage" element more costly may harm other classes where these spells were more balanced out.

3. It ignores situational issues. A taunt function is more useful in group and raid situations than in solo play or PvP. How do you rate the taunt element of an ability used in different situations?

Some thoughts about popularity-based advancement:

1. Asheron's Call did indeed have a popularity-based system which reduced the effectiveness of a spell if it was used frequently. It was removed, but I don't quite remember the situation. Might want to ask around and find some old AC1 players to see if they remember.

2. This system makes it hard to plan out your character if the power variance is too wide. I might find that I want a melee character, but melee is the current fashion so my character is hindered.

3. Related to that, it kind of sucks if your character setup happens to become the flavor of the month.

4. It's open to player manipulation. If we're PvP enemies and I know you use fire spells, I might get my guild to start using fire spells to nerf your ability to fight back.

5. It hinders social aspects. If people learning my build hurts me, then I'm going to be hesitant to share that information with others, especially in a PvP environment. Eventually information gets passed around, though, which can cause hurt feelings.

6. The popular spells may not be the most powerful in need of adjustment. In WoW, Minor Beastslayer enchantments were common near launch not because it was potent, but because it was the lowest level enchantment that gave a weapon glow effect. Nerfing the enchantment would have been completely unnecessary despite the popularity.

Dblade said...

Evi:

Yeah, WoW and FFXI have radically different approaches. Of course when you make your own game you'll have your own issues. My point was linking this to give you some more hard numbers on how distribution may not follow theory as much as possible. That link should have shown that it was both voluntary and done with SE's own numbers. The players tend to err a bit, as you can see, but aren't too far off.

Don't get too hung up on the argument itself though, the specifics just show that it's different from game to game what abilities may be affected, and that sometimes what you expect may not be quite what the reality is. You could make a great autobalancing system, but your players bitch because it looks at data you didn't expect and you just nerfed healing magic by 25%.

Verilazic said...

I think you could use this method to balance the whole world of an MMO. Especially in PvP, but in PvE as well. If one boss in an instance goes down more often, let him be adjusted to be a little stronger, while a boss that doesn't die very often is adjusted to be a little weaker.

In PvP, if one faction is starting to dominate, adjust a number of factors to give the other side a small advantage over time (especially if one side is much more populous). Looking at WoW, I think something a little more than what Wintergrasp had.