Introducing Machine Learning 2.0
Not All Data is Created Equal

[Figure: categories of data by informational value. Labels: Mislabeled Data, Redundant Information, Low Quality Data, Irrelevant Data, ROT Data, Corrupted Data, Catastrophic Forgetting-Inducing Data, High Quality Data, Highly-Informational Data, Adversarial Example.]
The assumption that more data necessarily leads to a better model comes with a second belief: that every record contains a similar quantity and quality of information. In reality, data falls along a wide spectrum of informational value, from useful observations to useless and even harmful ones.