March 25, 2022 11:53 am GMT

Why Data Cleaning?

~ "Garbage In, Garbage Out": bad data leads to bad results, plain and simple.
~ It's hard for computers to judge whether data makes sense or not.
~ To get accurate results, you need to remove the errors in your data that confuse the algorithms.
~ It's a time-consuming process, but an important one.

What are the causes?

  • Input Errors
  • Duplicates
  • Mangled Data
  • Malfunctioning Sensors
  • Lack of Standardization

Identifying Problems

  • Range Constraints
  • Data-Type Constraints
  • Compulsory Constraints
  • Unique Constraints
  • Cross Field Constraints
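
The five checks above can be sketched in plain Python. This is only an illustration under assumptions: the field names (`id`, `age`, `start`, `end`), the 0 to 120 age range, and the `find_problems` helper are all made up for the example and do not come from the original post.

```python
from datetime import date


def find_problems(records):
    """Return (row_index, problem) tuples for a list of dict records."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(records):
        # Compulsory constraint: required fields must be present and non-empty.
        for field in ("id", "age", "start", "end"):
            if row.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))

        age = row.get("age")
        if age not in (None, ""):
            # Data-type constraint: age must be an integer.
            if not isinstance(age, int):
                problems.append((i, "age is not an integer"))
            # Range constraint: assumed plausible range of 0 to 120.
            elif not 0 <= age <= 120:
                problems.append((i, "age out of range"))

        # Unique constraint: no two rows may share an id.
        rid = row.get("id")
        if rid not in (None, ""):
            if rid in seen_ids:
                problems.append((i, "duplicate id"))
            seen_ids.add(rid)

        # Cross-field constraint: end date must not precede start date.
        start, end = row.get("start"), row.get("end")
        if isinstance(start, date) and isinstance(end, date) and end < start:
            problems.append((i, "end before start"))
    return problems


# Hypothetical sample data: row 0 is clean, the others each break a rule.
records = [
    {"id": 1, "age": 34, "start": date(2022, 1, 1), "end": date(2022, 2, 1)},
    {"id": 1, "age": "34", "start": date(2022, 1, 1), "end": date(2022, 2, 1)},
    {"id": 2, "age": 150, "start": date(2022, 3, 1), "end": date(2022, 2, 1)},
    {"id": 3, "age": None, "start": date(2022, 1, 1), "end": date(2022, 1, 5)},
]

for index, problem in find_problems(records):
    print(index, problem)
```

Reporting problems instead of fixing them right away keeps detection separate from cleaning, so you can decide per problem whether to correct, impute, or drop the row.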

Data Cleaning Techniques

  • Removing missing data
  • Direct correction
  • Normalization
  • Syntax errors
  • Data Imputation
  • Spell Check
  • Filter Unwanted Outliers
  • Remove Irrelevant Values
  • Fix structural errors
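
A few of these techniques can be combined in a small sketch. Everything specific here is an assumption for illustration: the `city` and `temp_c` fields, the `CORRECTIONS` lookup table, the -50 to 60 plausible range, and the choice of the median for imputation. Duplicate removal is included because duplicates are one of the causes listed earlier.

```python
from statistics import median

# Hypothetical lookup of known misspellings (direct correction / spell check).
CORRECTIONS = {"Nwe York": "New York"}


def clean(rows, low=-50.0, high=60.0):
    """Return a cleaned copy of rows; the input list is left untouched."""
    cleaned = []
    for row in rows:
        city = row.get("city")
        # Removing missing data: drop rows without a usable city.
        if not isinstance(city, str) or not city.strip():
            continue
        # Normalization / structural fixes: trim whitespace, unify casing.
        city = city.strip().title()
        city = CORRECTIONS.get(city, city)

        temp = row.get("temp_c")
        # Filter unwanted outliers: drop values outside the assumed range.
        if temp is not None and not low <= temp <= high:
            continue
        cleaned.append({"city": city, "temp_c": temp})

    # Data imputation: fill missing temperatures with the median of known ones.
    known = [r["temp_c"] for r in cleaned if r["temp_c"] is not None]
    fill = median(known) if known else None
    for r in cleaned:
        if r["temp_c"] is None:
            r["temp_c"] = fill

    # Remove exact duplicates while preserving the original order.
    unique, seen = [], set()
    for r in cleaned:
        key = (r["city"], r["temp_c"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Hypothetical raw input showing each kind of problem once.
raw = [
    {"city": " new york ", "temp_c": 21.0},
    {"city": "Nwe York", "temp_c": 21.0},
    {"city": "Boston", "temp_c": None},
    {"city": "Boston", "temp_c": 999.0},
    {"city": None, "temp_c": 18.0},
    {"city": "Chicago", "temp_c": 17.0},
]

for row in clean(raw):
    print(row)
```

The order of steps matters: outliers are filtered before imputation so that an implausible reading such as 999.0 cannot influence the value used to fill the gaps.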

Original Link: https://dev.to/codewithsom/why-data-cleaning-2eof
