Clean the Excel data in the attached file named BUHI Customer Data. The U.S. demographics data will help you fill in missing cells in the customer data.
- Fill in (impute) missing data using the estimated values from the U.S. demographics dataset, matched on each customer’s zip code
- Remove extreme outliers
- Correct incorrect value formats (e.g., age = “twelve” instead of 12)
- Remove impossible values (e.g., age = 203)
- Identify categories from granular data (e.g., determine the state to which a zip code belongs)
- Identify records with complex inconsistencies (e.g., the state is Montana, but the country is Canada; the purchase was $100, but the item is a new car; one column says the gender is female, but another says it is male)
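
The steps above could be sketched in Python roughly as follows. This is a minimal illustration, not a prescribed solution. It assumes the spreadsheet rows have already been loaded as dictionaries. The column names (`age`, `zip`, `state`, `country`), the `ZIP_INFO` lookup, the 0 to 120 age range, and the 3 x IQR outlier rule are all assumptions made for the example; none of them come from the BUHI file or the demographics dataset.

```python
import statistics

# Hypothetical lookup derived from the U.S. demographics dataset (values are made up).
ZIP_INFO = {
    "59601": {"state": "Montana", "median_age": 41.0},
    "10001": {"state": "New York", "median_age": 37.0},
}
US_STATES = {info["state"] for info in ZIP_INFO.values()}

# Illustrative word-to-number map; extend it to cover the words found in the data.
WORD_NUMBERS = {"twelve": 12, "thirty": 30, "forty": 40}


def parse_age(raw):
    """Return the age as an int, or None if it is missing, unparseable, or impossible."""
    if raw is None:
        return None
    text = str(raw).strip().lower()
    if text in WORD_NUMBERS:
        age = WORD_NUMBERS[text]
    else:
        try:
            age = int(float(text))
        except (ValueError, OverflowError):
            return None
    # Assumed plausible range; values such as 203 are treated as impossible.
    return age if 0 <= age <= 120 else None


def clean_record(record):
    """Return a cleaned copy of one record and a list of notes about what was found."""
    row = dict(record)
    notes = []
    info = ZIP_INFO.get(str(row.get("zip", "")).strip())

    # Fix the format, drop impossible values, then impute from the zip code.
    age = parse_age(row.get("age"))
    if age is None and info is not None:
        age = round(info["median_age"])
        notes.append("age imputed from zip code")
    row["age"] = age

    # Derive the state from the zip code, and flag disagreements instead of overwriting.
    if info is not None:
        if row.get("state") and row["state"] != info["state"]:
            notes.append("state does not match zip code")
        row["state"] = row.get("state") or info["state"]

    # Cross-column check: a U.S. state paired with a non-U.S. country.
    if row.get("state") in US_STATES and row.get("country") not in (None, "", "USA"):
        notes.append("U.S. state with non-U.S. country")
    return row, notes


def extreme_outlier_bounds(values):
    """Return (low, high) fences at 3 x IQR beyond the quartiles (an assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 3 * iqr, q3 + 3 * iqr
```

For example, `clean_record({"age": "twelve", "zip": "59601", "state": "", "country": "Canada"})` would convert the age to 12, fill in Montana as the state, and flag the state and country mismatch for review. Flagging such records, instead of silently correcting them, leaves the decision about which column is wrong to whoever reviews the data.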