March 25, 2022 11:53 am GMT
Original Link: https://dev.to/codewithsom/why-data-cleaning-2eof
Why Data Cleaning?
~ "Garbage In, Garbage Out": bad data leads to bad results, plain and simple.
~ Computers cannot easily judge whether the data makes sense.
~ To get accurate results, you need to remove the errors in your data that confuse the algorithms.
~ It's a time-consuming process, but an important one.
What Causes Bad Data?
- Input Errors
- Duplicates
- Mangled Data
- Malfunctioning Sensors
- Lack of Standardization
Identifying Problems
- Range Constraints
- Data-Type Constraints
- Compulsory Constraints
- Unique Constraints
- Cross Field Constraints
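
As a rough illustration, the five checks above could be sketched in plain Python as follows. The records, the field names (`id`, `age`, `email`, `start`, `end`) and the 0 to 120 age limit are all assumptions made up for this example, not part of the original article.

```python
from datetime import date

# Hypothetical records; field names and limits are illustrative assumptions.
records = [
    {"id": 1, "age": 34, "email": "a@example.com",
     "start": date(2024, 1, 1), "end": date(2024, 2, 1)},
    {"id": 2, "age": -5, "email": "b@example.com",
     "start": date(2024, 1, 1), "end": date(2024, 2, 1)},
    {"id": 2, "age": 41, "email": None,
     "start": date(2024, 3, 1), "end": date(2024, 2, 1)},
    {"id": 3, "age": "forty", "email": "c@example.com",
     "start": date(2024, 1, 1), "end": date(2024, 1, 2)},
]


def find_problems(rows):
    """Return (row_index, constraint_name) pairs for every violated check."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        age = row.get("age")
        # Data-type constraint: age must be an integer (bool is excluded).
        if not isinstance(age, int) or isinstance(age, bool):
            problems.append((index, "data-type"))
        # Range constraint: only checked once the type is known to be valid.
        elif not 0 <= age <= 120:
            problems.append((index, "range"))
        # Compulsory constraint: email may not be missing.
        if row.get("email") is None:
            problems.append((index, "compulsory"))
        # Unique constraint: an id may appear only once.
        if row["id"] in seen_ids:
            problems.append((index, "unique"))
        seen_ids.add(row["id"])
        # Cross-field constraint: start must not be later than end.
        if row["start"] > row["end"]:
            problems.append((index, "cross-field"))
    return problems


problems = find_problems(records)
for index, name in problems:
    print(f"row {index}: {name} constraint violated")
```

Reporting problems rather than fixing them on the spot keeps detection separate from cleaning, so you can review what is wrong before deciding how to correct it.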
Data Cleaning Techniques
- Remove missing data
- Direct correction
- Normalization
- Fix syntax errors
- Data Imputation
- Spell Check
- Filter Unwanted Outliers
- Remove Irrelevant Values
- Fix structural errors
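
A few of these techniques can be combined into one small pass. The sketch below is a minimal example under stated assumptions: the rows, the `city` and `temp_c` fields, and the plausible temperature range of -90 to 60 are hypothetical, and median imputation is just one of several reasonable choices.

```python
from statistics import median

# Hypothetical raw rows; field names and values are illustrative assumptions.
raw = [
    {"city": " new york ", "temp_c": 21.0},
    {"city": "New York", "temp_c": None},
    {"city": "BOSTON", "temp_c": 19.5},
    {"city": None, "temp_c": 20.0},
    {"city": "boston", "temp_c": 950.0},  # implausible, likely a sensor glitch
    {"city": "Chicago", "temp_c": 18.0},
]


def clean(rows, low=-90.0, high=60.0):
    """Return a cleaned copy of rows; the input list is left untouched."""
    # Remove missing data: drop rows that lack the compulsory city field.
    kept = [dict(row) for row in rows if row["city"] is not None]

    # Normalization / structural fixes: trim whitespace, unify letter case.
    for row in kept:
        row["city"] = row["city"].strip().title()

    # Filter unwanted outliers: keep readings inside the assumed range.
    kept = [
        row for row in kept
        if row["temp_c"] is None or low <= row["temp_c"] <= high
    ]

    # Data imputation: fill missing readings with the median of known ones.
    known = [row["temp_c"] for row in kept if row["temp_c"] is not None]
    fill = median(known) if known else None
    for row in kept:
        if row["temp_c"] is None:
            row["temp_c"] = fill
    return kept


cleaned = clean(raw)
for row in cleaned:
    print(row)
```

The order of the steps matters here: outliers are filtered before imputation so that the 950.0 reading does not distort the median used to fill the gaps.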