Machine Learning 2.0 

Not All Data is Created Equal 

[Figure labels: Mislabeled Data, Low Quality Data, Irrelevant Data, ROT Data, Corrupted Data, Catastrophic Forgetting-Inducing Data, High Quality Data, Highly-Informational Data]




The assumption that more data necessarily leads to a better model comes with a second belief: that every record carries a similar quantity and quality of information. In reality, data spans a wide spectrum of informational value, from useful observations to useless and even harmful ones.

