I am very hyped about Pokémon Sword and Shield, and hopefully I get to play it soon. In anticipation, I have been binge-watching YouTube videos about the game, the new Pokémon, strategies, and history. I want to build the best team without actually catching every Pokémon. I tried catching them all in Sapphire and Ultra Sun, and it is just too tiring. I want a game plan to become the champion, so I decided to find out which Pokémon types have the best stats.
Analysis
I spent one night collecting data about the new Galarian Pokémon and combined it with a dataset I found on Kaggle. Then I used Tableau to visualize the data, and the naked eye and common sense to analyze it. I made some tables and found the best type combination for each stat (Attack, Defense, Health, Special Attack, Special Defense, and Speed).
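The "best type per stat" step can be sketched in a few lines of Python. This is only a rough illustration, not the Tableau workflow itself: the four rows are typed in by hand as sample data, the column names (`type1`, `type2`, `attack`, and so on) are assumptions, and averaging per type combination is just one reasonable way to rank them.

```python
from collections import defaultdict
from statistics import mean

# Hand-typed sample rows for illustration only; not the dataset used in this post.
ROWS = [
    {"name": "Steelix", "type1": "Steel", "type2": "Ground", "attack": 85, "defense": 200, "speed": 30},
    {"name": "Gallade", "type1": "Psychic", "type2": "Fighting", "attack": 125, "defense": 65, "speed": 80},
    {"name": "Weavile", "type1": "Dark", "type2": "Ice", "attack": 120, "defense": 65, "speed": 125},
    {"name": "Sneasel", "type1": "Dark", "type2": "Ice", "attack": 95, "defense": 55, "speed": 115},
]

def type_key(row):
    """Build an order-independent label such as 'Dark + Ice'."""
    types = [t for t in (row["type1"], row.get("type2")) if t]
    return " + ".join(sorted(types))

def best_type_per_stat(rows, stats):
    """For each stat, return the type combination with the highest average."""
    groups = defaultdict(list)
    for row in rows:
        groups[type_key(row)].append(row)
    best = {}
    for stat in stats:
        averages = {key: mean(r[stat] for r in members) for key, members in groups.items()}
        best[stat] = max(averages, key=averages.get)
    return best

result = best_type_per_stat(ROWS, ["attack", "defense", "speed"])
for stat, combo in result.items():
    print(f"{stat}: {combo}")
```

Sorting the two types means "Steel + Ground" and "Ground + Steel" land in the same group, which matters when the source data is not consistent about type order.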
The best team I can build (without restriction):
1. Highest Attack: Ground + Fire: Groudon
2. Highest Defense: Steel + Ground: Steelix
3. Highest Special Attack: Psychic + Dark or Psychic + Dragon: Hoopa or Necrozma
4. Highest Special Defense: Poison + Dragon: Naganadel or Eternatus
5. Highest Speed: Fairy + Steel: Zacian
6. Highest Health: Ghost + Dragon type: Giratina
Obviously, this team is impossible. I need a decent team to get through 80% of the game before I can catch those legendaries, and some of them are not even available in the game.
A more realistic team I can build in Pokémon Sword and Shield:
1. Highest Attack: Psychic + Fighting: Gallade
2. Highest Defense: Steel + Ground: Steelix
3. Highest Special Attack: Normal + Dragon: Drampa. Gengar (Ghost + Poison) and Lucario (Fighting + Steel) are not bad either.
4. Highest Special Defense: Rock + Fairy: Carbink. Dragalge (Poison + Dragon) is the next best.
5. Highest Speed: Dark + Ice type: Weavile
6. Highest Health: Ground + Electric type: Stunfisk. Drifblim, which is Ghost + Flying, comes next.
This team is better and more realistic. Steelix is apparently a must-have. Gallade and Lucario are longtime favorites of mine. Carbink is apparently useless other than as a tank, so Dragalge should be the better option because of its moveset and its better stats elsewhere. Weavile is cool, I guess? And according to the internet, Stunfisk is a good option. It is not that cute though.
Ultimately, I need to play the game, and my starter Pokémon will be part of the team too. Some of these Pokémon are (probably) not obtainable in Pokémon Sw/Sh. So this is just for reference and for my own entertainment.
Some fun facts:
Water is the most common Pokémon type.
47.4% of Pokémon have a single type.
Dragon-type Pokémon have the highest average base Attack.
I also ran a t-test (using R) to see whether legendary Pokémon have higher base stats. Obviously, they do. They are taller and heavier too.
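For anyone curious what such a test computes, here is a minimal Python sketch of Welch's t statistic. The post does not say which t-test variant was run (Welch is the default of R's `t.test`, so it is assumed here), and the numbers are a handful of base stat totals typed in by hand for illustration, not the full dataset.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the mean of sample a
    vb = variance(b) / len(b)  # squared standard error of the mean of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hand-typed base stat totals, for illustration only.
legendary = [680, 680, 670, 680]       # Mewtwo, Lugia, Groudon, Giratina
non_legendary = [200, 270, 360, 300, 500]  # Magikarp, Wooloo, Spinda, Geodude, Gengar

t, df = welch_t(legendary, non_legendary)
print(f"t = {t:.2f}, df = {df:.2f}")
```

A large positive t means the legendary group has the higher mean. Turning t and df into a p-value needs the t distribution, which R's `t.test` handles but the Python standard library does not provide.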
Side note: when building a team, I have no real strategy. I have certain types I like. I think the best defense is a good offense. I prefer Pokémon that look cool, elegant, or cute. And obviously it is important to consider the strengths and weaknesses of each Pokémon. It is fun to test out different ideas.
Modeling
I wanted to build a model in RapidMiner that can predict whether a Pokémon is Legendary (including Ultra Beasts and Mythicals), given a set of data about the Pokémon. I cleaned up the dataset from Kaggle and used it as my training dataset. I compiled the new Galar region Pokémon and used that as my testing dataset. When preparing the data, I wanted to exclude a few attributes that I think are not important or that do not have complete information (the Galarian Pokémon have incomplete data), so I used the “select attribute” operator to drop them. Feature selection was performed using the “weight by...” operators. I played around with a few classifiers (those that accept polynominal input), such as decision tree and logistic regression. The decision tree worked better than the other classifiers, so I settled on Random Forest, since it is an ensemble of multiple decision trees.
To decide which model is best at predicting the legendary label, I used the recall product (the recall for true non-legendary multiplied by the recall for true legendary). The final model has a 97.5% recall for true non-legendary Pokémon and a 100% recall for true legendary Pokémon. Every Pokémon it predicts as non-legendary really is non-legendary (100% precision), while its precision for legendary predictions is 71.43%. In other words, for the Galarian Pokémon this model does its job fairly well, with an accuracy of 97.65%. The cases where a non-legendary Pokémon was predicted to be legendary can perhaps be explained by pseudo-legendary Pokémon, which behave like legendary Pokémon but are not legendary.
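To make the metrics concrete, here is a small Python sketch that computes them from a confusion matrix. The counts are an assumption: 5 true positives, 2 false positives, 0 false negatives, and 78 true negatives is one confusion matrix that reproduces the percentages above, but it is not copied from the RapidMiner output.

```python
def metrics(tp, fp, fn, tn):
    """Compute per-class recall and precision, accuracy, and the recall product."""
    recall_legendary = tp / (tp + fn)
    recall_non_legendary = tn / (tn + fp)
    return {
        "recall_legendary": recall_legendary,
        "recall_non_legendary": recall_non_legendary,
        "precision_legendary": tp / (tp + fp),
        "precision_non_legendary": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall_product": recall_legendary * recall_non_legendary,
    }

# Assumed counts ("legendary" is the positive class); chosen to match the reported rates.
m = metrics(tp=5, fp=2, fn=0, tn=78)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

The recall product drops toward zero if either class is recalled poorly, which makes it a stricter yardstick than accuracy when one class (here, legendary) is rare.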
Cluster analysis
I also wanted to see how Pokémon group naturally. Using cluster analysis (k-means) with k=5, I got 5 clusters.
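For readers who have not met k-means before, the idea can be sketched in plain Python. This is a bare-bones version of the standard algorithm (Lloyd's iteration), not what RapidMiner does internally. The `(weight_kg, catch_rate)` pairs are made-up values for illustration, and standardizing the columns first is an assumption about preprocessing that may not match the original setup.

```python
import random
from math import dist
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]

def kmeans(points, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points

    def nearest(p):
        return min(range(k), key=lambda i: dist(p, centroids[i]))

    for _ in range(max_iter):
        labels = [nearest(p) for p in points]
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    labels = [nearest(p) for p in points]
    return centroids, labels

# Made-up (weight_kg, catch_rate) pairs, for illustration only.
raw = [(950.0, 3), (999.9, 3), (10.0, 255), (6.0, 255), (122.0, 3),
       (40.5, 45), (100.0, 45), (202.0, 45), (235.0, 45), (20.0, 120)]
points = standardize(raw)
centroids, labels = kmeans(points, k=5)
print(labels)
```

The result depends on the random starting centroids, so a different `seed` can give a different grouping, and the cluster numbers themselves carry no meaning.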
Cluster 2 has the heaviest Pokémon (>= 610 kg), with decent experience yield and the highest attack, defense, and health stats. Most of them are Legendary/Mythical/Ultra Beast, like Groudon, Nihilego, and Zacian, plus a few non-legendary ones like Metagross (but it is heavy).
Cluster 4 has the highest catch rate (and the lowest weight), so it consists mostly of commonly seen Pokémon like Magikarp, Spheal, Geodude, Sandshrew, Trapinch, Wooloo, Yungoos, and Spinda.
Cluster 0 has the lowest catch rate and the highest special attack stats, so it includes some legendary Pokémon like Mewtwo, Mew, Latias, and Latios (they are so hard to catch NGL), and many Pokémon at their final evolution stage, like Venusaur, Cinderace, Gengar, Greninja, and Garchomp.
Cluster 3 is the second heaviest and has the second highest attack stats after cluster 2. Legendary Pokémon with lighter weight and higher speed stats, like Lugia, Necrozma, and Solgaleo, are in this cluster, along with some pretty good non-legendary Pokémon like Tyranitar, Gyarados, and Drampa.
Cluster 1 has the remaining Pokémon, with stats slightly better than cluster 4. Some of them are evolved forms of cluster 4 Pokémon, such as Gloom and Dugtrio; some are pre-evolutions of cluster 0 Pokémon, like Raboot and Bulbasaur.
This is perhaps not the best categorization for Pokémon. There is some overlap, and weight is probably not the best indicator. It is still interesting to see that the heaviest Pokémon are mostly Legendary, however.
Another side note: I am a newbie to predictive modeling (or modeling in general). Feel free to let me know if I have done anything wrong or inappropriate, and how I can improve.
All information and images about Pokémon were obtained from Serebii, Bulbapedia, and Pokemon.com.