Introduction


The dataset I chose for my project is called “Pokémon for Data Mining and Machine Learning”, which I found on Kaggle. I originally picked it because it contains many variables, both categorical and quantitative, that I could relate to one another in a number of ways. In the end, however, I decided to compare the numerous Pokémon types. Each Pokémon has one or two “types” which, in part, determine its stats, the moves it can use, and its appearance. There are currently eighteen types of Pokémon: Bug, Dark, Dragon, Electric, Fairy, Fighting, Fire, Flying, Ghost, Grass, Ground, Ice, Normal, Poison, Psychic, Rock, Steel, and Water.

In most Pokémon games, the player can only carry six Pokémon at one time, so it is always difficult to decide which Pokémon to bring along and which to leave behind; oftentimes this decision is influenced by the Pokémon’s type. It is always a good idea to maintain a diverse team of Pokémon, and this analysis of the Pokémon types is designed to help players with just that.

Pride And Prestige


In the Pokémon universe there are a number of Legendary Pokémon that are significantly rarer and more powerful than the average Pokémon. In my dataset of 721 Pokémon, 46 are legendary.

In order to find out which type of Pokémon yields the most legendaries, I created a bar chart comparing the number of legendary Pokémon that belong to each type. Note that the total number of Pokémon shown in this bar chart is 73, not 46, because I am looking at both primary and secondary types. If a legendary is of the types Psychic and Flying, for example, it is counted once for the Psychic type and once for the Flying type, which means that every Pokémon with two types is counted twice. I chose to do this because I set out to compare which Pokémon you would have available should you commit yourself to a particular type. Including secondary types inflates the overall count of legendaries, but for each individual type it fairly and completely represents all of the Pokémon that have that type, even those that have it only secondarily.
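The double counting described above can be sketched in a few lines of Python. This is only an illustration on a handful of hand-typed rows, not the code behind the chart; the field names (`type_1`, `type_2`, `legendary`) are assumptions and may not match the column names in the Kaggle file.

```python
from collections import Counter

# A tiny hand-typed sample standing in for the full dataset.
# The field names are hypothetical, chosen for this sketch.
pokemon = [
    {"name": "Mewtwo", "type_1": "Psychic", "type_2": None, "legendary": True},
    {"name": "Lugia", "type_1": "Psychic", "type_2": "Flying", "legendary": True},
    {"name": "Moltres", "type_1": "Fire", "type_2": "Flying", "legendary": True},
    {"name": "Charizard", "type_1": "Fire", "type_2": "Flying", "legendary": False},
]


def legendary_counts(rows):
    """Count legendaries per type, crediting both primary and secondary types."""
    counts = Counter()
    for row in rows:
        if not row["legendary"]:
            continue
        for type_name in (row["type_1"], row["type_2"]):
            if type_name is not None:
                counts[type_name] += 1
    return counts


counts = legendary_counts(pokemon)
n_legendary = sum(1 for row in pokemon if row["legendary"])

# The bars add up to more than the number of legendaries,
# because every dual-type legendary contributes to two bars.
print(counts, sum(counts.values()), n_legendary)
```

In this sample there are three legendaries but the per-type counts sum to five, which is the same effect that turns 46 legendaries into 73 bar-chart entries.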

From the data we can see that the Psychic type is the most legendary type, followed closely by Dragon, Flying, and Fire. Curiously, there are no legendary Pokémon of either the Bug or Poison type. The three legendary Pokémon that have the Fighting type possess it only as a secondary type.

If you are seeking prestige of legendary proportions, then the Psychic, Dragon, Flying, and Fire types may be for you. If, on the other hand, you are a fan of the Bug or Poison types, you will have to settle for a team without legendary Pokémon.

Cold, Hard Stats


If you couldn’t care less about fame and legendary Pokémon and instead want only the strongest for your team, then you will want to know which Pokémon can deal the most damage, take the most hits, and respond the quickest. A Pokémon’s combat abilities can be summarized by six numbers: HP, Attack, Special Attack (Sp_Atk), Defense, Special Defense (Sp_Def), and Speed. A Pokémon’s overall combat rating, the Total stat, is calculated by summing these six stats. I plotted the Total stat for all 721 Pokémon in a histogram and showed the type distributions using different colors. This histogram shows the overall distribution of the Total stat before the Pokémon are divided by type.
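As a quick worked example of how the Total stat is built, here is a minimal sketch that sums the six stats for the weakest and strongest Pokémon by Total. The dictionary keys are assumptions modeled on the abbreviations used above, and the helper `total_stat` is hypothetical.

```python
# Assumed stat names, following the abbreviations used in the text.
STAT_NAMES = ("HP", "Attack", "Defense", "Sp_Atk", "Sp_Def", "Speed")


def total_stat(stats):
    """Return the Total stat: the sum of the six individual stats."""
    return sum(stats[name] for name in STAT_NAMES)


# Sunkern has 30 in every stat and Arceus has 120 in every stat.
sunkern = dict.fromkeys(STAT_NAMES, 30)
arceus = dict.fromkeys(STAT_NAMES, 120)

print(total_stat(sunkern))  # 180
print(total_stat(arceus))   # 720
```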

The distribution is very clearly multimodal, with modes appearing roughly every 100 points along the x-axis. This could indicate that Pokémon are designed to fit loosely into different classes of strength according to their Total stat. Also, for the curious among you, the strongest Pokémon by Total stat is Arceus and the weakest is Sunkern. Here is a summary table showing numerical information about the spread and the center of the Total stat.

Minimum  Q1   Median  Q3   Maximum  Mean  SD
180      320  424     499  720      418   110
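A table like this one can be produced with Python's standard `statistics` module. The sketch below is one possible way to do it, not necessarily how the numbers above were computed: the input list is made up for illustration, the `"inclusive"` quartile method is an assumption (different tools interpolate quartiles differently), and the SD shown is the sample standard deviation.

```python
import statistics


def summarize(values):
    """Return the seven summary numbers used in the table."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Minimum": min(values),
        "Q1": q1,
        "Median": median,
        "Q3": q3,
        "Maximum": max(values),
        "Mean": statistics.mean(values),
        "SD": statistics.stdev(values),  # sample SD; an assumption
    }


# Made-up Total values, purely for illustration.
toy_totals = [180, 300, 320, 405, 495, 600, 720]
summary = summarize(toy_totals)

for label, value in summary.items():
    print(f"{label}: {value:.1f}")
```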

As stated before, the histogram, regardless of the colors, was mainly intended to show the overall distribution of the Total stat, not the individual type distributions. For a clearer picture of the Total stat for each of the types individually, we can turn to a box-and-whisker plot.

Here we can see that the Dragon type is by far the strongest overall type. While the single strongest Pokémon is a Normal type, the median Total stat for the Dragon type exceeds even the Q3 of most other Pokémon types. To see exactly how all of the other types stack up we can turn to another summary table.

Type      Minimum  Q1      Median  Q3      Maximum  Mean  SD
Bug       194      266.75  387.5   470.25  600      366   110
Dark      220      346     466.5   501.25  680      438   101
Dragon    245      410     520.5   600     680      499   136
Electric  205      330     435.5   495.25  680      426   106
Fairy     190      290     405     480     680      392   124
Fighting  210      391.25  462.5   510     580      441   98
Fire      250      357.5   465     531     680      447   109
Flying    244      348.25  436     508.75  680      434   114
Ghost     236      330     448     489.5   680      422   102
Grass     180      318     405     490     600      408   98
Ground    210      326     427.5   505.75  670      421   105
Ice       250      334     480     525     660      445   104
Normal    190      290     411     475     720      393   110
Poison    195      314     390     482     535      389   93
Psychic   198      325.75  463     520     680      444   118
Rock      280      355     440     500     600      437   88
Steel     300      420     490     525     680      470   98
Water     200      325     425     495     680      416   103
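For readers who want to rebuild a per-type table themselves, here is a minimal sketch of the grouping step. It assumes the same convention as the legendary counts, where a Pokémon contributes to both its primary and its secondary type; whether the table above was built that way is an assumption. The five rows are a hand-typed sample, so the ranking they produce says nothing about the full dataset.

```python
from collections import defaultdict
from statistics import median

# (name, primary type, secondary type, Total) for a small hand-typed sample.
rows = [
    ("Dragonite", "Dragon", "Flying", 600),
    ("Dratini", "Dragon", None, 300),
    ("Charizard", "Fire", "Flying", 534),
    ("Sunkern", "Grass", None, 180),
    ("Arceus", "Normal", None, 720),
]


def totals_by_type(records):
    """Group Total values by type, crediting both types of a dual-type Pokémon."""
    groups = defaultdict(list)
    for _name, type_1, type_2, total in records:
        for type_name in (type_1, type_2):
            if type_name is not None:
                groups[type_name].append(total)
    return groups


groups = totals_by_type(rows)
medians = {type_name: median(values) for type_name, values in groups.items()}

# Print types from highest to lowest median Total.
for type_name in sorted(medians, key=medians.get, reverse=True):
    print(type_name, medians[type_name])
```

The same `groups` dictionary could be fed into any summary function to fill in the remaining columns.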

But maybe you have a very particular fighting style and a couple of stats in particular are important to you. Well, fear not! Here are six more boxplots, one for each of the six individual Pokémon stats. First up is HP, the stat that determines how much damage your Pokémon can take before falling out of battle.

As far as comparing Pokémon types goes, this graph isn’t the most fascinating. Most of the types are rather similar in terms of HP, though it is reasonable to say that Ground type Pokémon have somewhat higher HP than average and Ghost types somewhat lower. The Pokémon with the greatest amount of HP is Blissey with 255 HP and the Pokémon with the least amount of HP is Shedinja with 1 HP.

Next up is the Attack stat which determines the amount of physical damage that a Pokémon can deal out. Think Tackle, Scratch, or other moves that involve physical contact.

Predictably, the Fighting type has the highest median Attack stat but what is surprising is how closely the Fighting type is matched by the Dragon type. Both the Dragon and Fighting types are capable of dealing out major physical damage. The Pokémon with the highest Attack stat is Rampardos with 165 Attack and the two Pokémon with the lowest Attack stats are Chansey and Happiny with 5 Attack each.

The Defense stat determines how much of an enemy’s attack is resisted and how much of it drains HP. The higher the Defense stat, the more damage is resisted and the less HP is lost.

Here the Rock and Steel types come out on top, their solid bodies able to resist the most damage. Steel type Pokémon seem to vary less in the Defense stat than the Rock type. The Pokémon with the highest Defense stat is Shuckle with 230 Defense and the two Pokémon with the lowest Defense stats are Chansey and Happiny with 5 Defense each.

While Attack and Defense pertain only to physical damage, Special Attack and Special Defense deal with non-physical damage, such as that dealt by moves like Inferno, Water Gun, or other elemental attacks. First we will look at the Special Attack stat.

Special Attack is nearly a 4-way tie, but looking strictly at medians, the Psychic type comes out on top. The Psychic type is closely followed, however, by the Dragon, Electric, and Fire types. What these types may lack in the regular Attack stat, they make up for in Special Attack — particularly the Psychic type. The Pokémon with the highest Special Attack stat is Mewtwo with 154 Special Attack and the three Pokémon with the lowest Special Attack stats are Shuckle, Feebas, and Bonsly with 10 Special Attack each.

The ability to resist these non-physical, elemental attacks is determined by the Special Defense stat. It works the same way as the Defense stat, but applies to non-physical attacks.

When it comes to Special Defense the Dragon type comes out on top; it just barely beats out the Psychic and Steel types. Amazingly there is a 5-way tie for the Pokémon with the lowest Special Defense. The Pokémon with the highest Special Defense stat is Shuckle with 230 Special Defense and the five Pokémon with the lowest Special Defense stats are Caterpie, Weedle, Magikarp, Igglybuff, and Carvanha with 20 Special Defense each.

Finally there is the Speed stat, which is used to determine which Pokémon moves first in battle. The Pokémon with the higher Speed generally moves first.

The fastest Pokémon type is the Electric type, handily beating all of the other types. The slowest overall type is the Rock type. The Pokémon with the highest Speed stat is Ninjask with 160 Speed and the two Pokémon with the lowest Speed stats are Shuckle and Munchlax with 5 Speed each.

As can be seen from the data, different types are stronger in different stats. Depending on your personal battle strategy you may want to choose a type other than Dragon. Or maybe you aren’t looking for the best of the best in a particular stat and are just looking for a type that is well rounded and has as few shortcomings as possible.

It should also be noted that Shuckle comes up a ridiculous number of times as either the strongest or weakest Pokémon in a particular stat. Ties also seem far more common for the lowest value of a stat than for the highest. Finally, the Normal type seems to contain a particularly large number of outliers.
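Spotting these ties by eye is tedious, so here is a minimal sketch of a helper that returns every Pokémon sharing the highest or lowest value of a stat. The function name `extremes` is hypothetical, and the input is a small hand-typed sample of Attack values rather than the full dataset.

```python
def extremes(stat_by_name):
    """Return ((max value, names), (min value, names)), keeping all ties."""
    highest_value = max(stat_by_name.values())
    lowest_value = min(stat_by_name.values())
    highest = sorted(n for n, v in stat_by_name.items() if v == highest_value)
    lowest = sorted(n for n, v in stat_by_name.items() if v == lowest_value)
    return (highest_value, highest), (lowest_value, lowest)


# Attack values for a few Pokémon; only a sample, not the whole dataset.
attack = {"Rampardos": 165, "Machamp": 130, "Chansey": 5, "Happiny": 5}

(top_value, top_names), (bottom_value, bottom_names) = extremes(attack)
print(top_value, top_names)
print(bottom_value, bottom_names)
```

Because the helper keeps every name that matches the extreme value, a two-way tie like Chansey and Happiny shows up as a list of two names instead of silently dropping one of them.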

So You’ve Fallen In Love With A Type, How Many New Friends Do You Have?


Now that you have decided what type of Pokémon will always be at your side, how many options do you have? Out of 721 Pokémon, how many are your type? The following waffle chart shows how common each type of Pokémon is: every square represents one percent of the total Pokémon population.

As can be seen from the waffle chart above, the most common type of Pokémon, by a significant margin, is the Water type, representing eleven percent of all Pokémon. Types such as Fairy, Ice, and Ghost are much rarer at only three percent of the population each. Does rarity add to the value of a Pokémon? Or only add to the frustration involved in finding it? That is for you to decide!
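A waffle chart needs whole squares, and simply rounding each percentage can leave you with 99 or 101 of them. Below is a minimal sketch of one way around that, using largest-remainder rounding. This is an assumption for illustration, not necessarily how the chart above was produced, and the counts are made up.

```python
def waffle_squares(counts, total_squares=100):
    """Split total_squares among categories in proportion to their counts."""
    total = sum(counts.values())
    exact = {key: value * total_squares / total for key, value in counts.items()}
    # Start by giving every category the whole-number part of its share.
    squares = {key: int(share) for key, share in exact.items()}
    leftover = total_squares - sum(squares.values())
    # Hand the remaining squares to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - squares[k], reverse=True)
    for key in by_remainder[:leftover]:
        squares[key] += 1
    return squares


# Made-up counts, purely for illustration.
toy_counts = {"Water": 5, "Fire": 3, "Grass": 3, "Ice": 1}
squares = waffle_squares(toy_counts)
print(squares, sum(squares.values()))
```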

Conclusion (Sorta)

With a dataset this big, there is far more to be explored! I do feel, however, that I have covered a good portion of the type-related variables. Overall the stats of Pokémon vary wildly, and the distribution of the Total stat, for example, is far from normal. The histogram most clearly shows the shape of this distribution: multimodal, with several outliers on the high end.

Overall, it seems like the Dragon type is the best type. It has a high number of legendary Pokémon associated with it, it comes out on top in the Total stat, and it is a consistent leader in the six individual stats as well. Its only drawback would be its relative rarity at only four percent of all Pokémon.

In the end, however, the choice of Pokémon is far more complicated than just looking at the types. In my analysis I failed to test for the most important Pokémon choosing factor: the cuteness factor. Oh well, perhaps someday statisticians will find a way to quantify “cuteness”, but until then, I’ll have to stick to my numbers.