On Characteristics of Machine Learning

In my previous article, Do Neural Networks Dream of Pokémon?, I concluded that the problem was not a very effective use of machine learning, and I wanted to follow up on that point.

First of all, machine learning can be defined as technology that allows a machine to determine an outcome without being explicitly programmed for every case. In an ordinary application program, the programmer has already decided what should happen when a certain event occurs. For instance, when you type a URL into your web browser, it displays the page at that URL because the programmers of the browser explicitly wrote it to do so.

Therefore, for Pokémon's elements, where every combination is already known, it is more reliable to simply program all the cases. (And that is in fact how the game is implemented.)
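As a minimal sketch of what "programming all the cases" means, a plain lookup table is enough; the entries below are only a tiny illustrative excerpt, not the game's full chart.

```python
# A hand-written (partial) effectiveness chart: no learning involved.
# Keys are (attacking element, defending element); values are damage multipliers.
# Only a few illustrative entries are shown; the real chart covers all 18 elements.
EFFECTIVENESS = {
    ("Fire", "Grass"): 2.0,       # Super Effective
    ("Fire", "Bug"): 2.0,         # Super Effective
    ("Fire", "Water"): 0.5,       # Not Very Effective
    ("Electric", "Ground"): 0.0,  # No Effect
}

def effectiveness(attack, defend_1, defend_2=None):
    """Multiply the per-element factors; anything not listed defaults to Normal (1.0)."""
    result = EFFECTIVENESS.get((attack, defend_1), 1.0)
    if defend_2 is not None:
        result *= EFFECTIVENESS.get((attack, defend_2), 1.0)
    return result

print(effectiveness("Fire", "Bug", "Grass"))  # 4.0 -> Super Effective
```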

Unlike the case above, machine learning is effective in cases like these:

  • The input parameters and their combinations are so numerous that it is not practical to program for all of them.
  • There are unknowns and ambiguities in the expected input.
  • The variations and characteristics of the input are very subtle. (This is related to the characteristics of the input parameters: it takes a lot more data to reliably determine an outcome.)

For example, a self-driving car is a very difficult problem because it is not possible to explicitly program the virtually infinite number of cases it has to deal with. (Say the car has to determine whether someone is standing on the road: it is not realistic to program for every situation, since the weather, the lighting, where the person is standing, and how they are moving can all vary.)

However, there are ways such ambiguities can be reduced, for example through traffic infrastructure such as traffic signals and car-to-car communication. Current self-driving technology focuses on coexisting with today's cars and environment, but the opposite approach of revising the traffic infrastructure itself should also be taking place.

Google's CEO once said, "It's a bug that cars were invented before computers." If he meant that, had computers come first, the traffic infrastructure would have been designed with them in mind, I think he was right on.

Back to Pokémon: there are 18 elements, and a Pokémon can carry up to two of them, so the amount of information to process is small and can be programmed without much effort; a rough count is sketched below. However, if this grew to hundreds or thousands of elements with varying overlaps, it would become a very difficult problem to program. (Even then, you would program the patterns rather than each case one by one.)
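For a rough sense of scale (my own back-of-the-envelope count, not anything taken from the game's implementation), the number of distinct matchups is small enough to enumerate outright:

```python
from math import comb

ELEMENTS = 18
# A defender carries either one element or an unordered pair of two distinct elements.
defending_combinations = ELEMENTS + comb(ELEMENTS, 2)  # 18 + 153 = 171
# Each of the 18 attacking elements can face any of those combinations.
total_matchups = ELEMENTS * defending_combinations     # 3,078

print(defending_combinations, total_matchups)  # 171 3078
```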

The Pokémon neural network I experimented with the other day is a very simple case, but image recognition and other advanced recognition tasks are essentially extensions of the same idea.

The democratization of machine learning will certainly be a big topic in the near future, and I intend to continue experimenting with it.

Do Neural Networks Dream of Pokémon?

One thing I wanted to try with TensorFlow (and neural networks) was some simple numerical processing, so I decided to put it to the test.

I was looking for a subject when I noticed something in Pokémon Sun & Moon. The game has a Festival Plaza feature where players can play a set of minigames. In one of them, the player tries to pick the "most effective" element against a given element or pair of elements. For example, given Bug and Grass, the answer is Fire, which is Super Effective and earns you points. There are four classes of effectiveness, from least to most effective: No Effect, Not Very Effective, Normal, and Super Effective.

There is a truth table laying all of this out, but I wanted to see whether a learn-from-mistakes approach could predict the outcome of this minigame.

Using a dozen festival tickets, I collected around 80 log entries. The next quest was to figure out how this data could be normalized for testing. I am not yet very familiar with how TensorFlow works in this regard, and with so much focus on image recognition it is surprisingly hard to find information about processing plain numerical data. Then I stumbled upon an existing effort by mtitg, who used TensorFlow to see whether it could predict the survivors of the Titanic based on multiple parameters.
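To illustrate the kind of normalization I mean (the row layout and index mappings here are hypothetical, not my actual log format), each log entry can be turned into integer element indices plus an integer class label:

```python
# Hypothetical log rows: attacking element, defending element(s), observed outcome.
LOG = [
    ("Fire", "Bug", "Grass", "Super Effective"),
    ("Water", "Fire", None, "Super Effective"),
    ("Electric", "Ground", None, "No Effect"),
]

ELEMENT_INDEX = {"Fire": 0, "Water": 1, "Electric": 2, "Grass": 3, "Bug": 4, "Ground": 5, "None": 6}
OUTCOME_INDEX = {"No Effect": 0, "Not Very Effective": 1, "Normal": 2, "Super Effective": 3}

def encode(row):
    """Map one log row to three integer features and one integer class label."""
    attack, defend_1, defend_2, outcome = row
    features = [ELEMENT_INDEX[attack],
                ELEMENT_INDEX[defend_1],
                ELEMENT_INDEX[defend_2 or "None"]]
    return features, OUTCOME_INDEX[outcome]

print([encode(r) for r in LOG])
```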

mtitg's example uses many parameters, eight to be exact, but in my case I use three. mtitg's code is a great starting point, as it covers how textual data can be converted into numerical values for TensorFlow to process.

I have adapted mtitg's code to work with my test data.
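I won't reproduce the adapted code here, but the shape of the model is roughly the following. This is a minimal sketch written against the present-day Keras API rather than mtitg's original code, and the layer sizes and training settings are arbitrary assumptions for illustration.

```python
import numpy as np
import tensorflow as tf

NUM_ELEMENTS = 19  # 18 elements plus a "None" placeholder for single-element defenders
NUM_CLASSES = 4    # No Effect, Not Very Effective, Normal, Super Effective

def one_hot_features(rows):
    """Turn an (n, 3) array of integer element indices into concatenated one-hot vectors."""
    eye = np.eye(NUM_ELEMENTS, dtype="float32")
    return np.concatenate([eye[rows[:, i]] for i in range(3)], axis=1)

# A small feed-forward classifier over the three one-hot encoded elements.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# x: (n, 3) integer features from an encoding step like the one above; y: class labels 0-3.
# model.fit(one_hot_features(x), y, epochs=200, validation_split=0.2)
```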

Long story short, with my limited set of about 80 samples, the best accuracy I could get was around 50%. I think the reasons are as follows:

  • Not enough variety in the data. Since it was collected through a manual process, and considering the fair complexity of Pokémon's elements, the data is certainly not an exhaustive set.
  • Too many "Normal" outcomes. Normal is the default for any matchup that is not specifically strong or weak (a Water move against a Bug element, for example), so the data skews heavily toward Normal and provides far more examples of that class than of the other three.
  • Perhaps a neural network is not a very good approach for this problem, and simple logistic regression and/or Bayesian methods would work better (a rough sketch of such a baseline follows this list).
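For reference, here is what such a baseline might look like, using scikit-learn rather than TensorFlow and weighting classes to push back against the "Normal" skew. This is a hypothetical sketch of a follow-up experiment, not something I have actually run on my data.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

def baseline_accuracy(x, y):
    """x: (n, 3) integer element indices, y: effectiveness classes 0-3,
    produced by an encoding step like the one shown earlier."""
    encoder = OneHotEncoder(handle_unknown="ignore")
    features = encoder.fit_transform(x)
    x_train, x_test, y_train, y_test = train_test_split(
        features, y, test_size=0.25, random_state=0)
    # class_weight="balanced" counteracts the overrepresented "Normal" class.
    model = LogisticRegression(max_iter=1000, class_weight="balanced")
    model.fit(x_train, y_train)
    return model.score(x_test, y_test)
```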

Conclusion

As I wrote earlier, since the actual element-matchup table already exists, there is no practical reason for this attempt; it is really for learning and fun, after all.

With further optimization and research, TensorFlow and machine learning methods have great potential for making sense of data and for extracting added value from datasets we may already have. Machine vision and self-driving cars are very cool applications of machine learning, but we shouldn't forget to apply this technology on the personal computers we already own.