A ground breaking result in data mining!
Be hold, the most ground breaking results in data mining is finally revealed! Today, I finally got the results of a decision tree that I was making for classification on a dataset and the results were amazing! After using a data set of many samples (let’s say 175738 observations) and many variables (let’s say 69) and having the code running for the past four days the incredible decision tree tells me that only one variable was used and that was the SYMBOL in the dataset!! Assume you have thousands of samples from cow, sheep, horse, mouse, and dog and the task is building a model for classification of proteins. Then your model tells you that it is ignoring all the variables but the name of these animals (this would be SYMBOL variable in my dataset)! how amazing :)) It is funnier to see that in the comments I wrote “the SYMBOL variable should probably be removed” :)))