@Balapradeepck
Contributor

This change completes the following items:

  1. Treats the strings ["NA","N/A","na","n/a"] as missing values (NaN) when reading the data.
  2. Converts all of the data to float.
  3. If method is "drop" or is not provided, every NaN is replaced with zero.
  4. If method is "avg", every NaN is replaced with the average of its column.
  5. Tested against all the data scenarios we identified.
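
The items above can be sketched roughly as follows. This is only an illustration, not the code in this PR: it assumes pandas, and the names `load_and_fill` and `NA_MARKERS` are hypothetical.

```python
import io

import pandas as pd

# Strings treated as missing values (taken from item 1 above).
NA_MARKERS = ["NA", "N/A", "na", "n/a"]


def load_and_fill(source, method=None):
    """Hypothetical helper: read a CSV, force floats, then fill NaNs."""
    df = pd.read_csv(source, na_values=NA_MARKERS)
    # Anything that still is not numeric becomes NaN, then cast to float.
    df = df.apply(pd.to_numeric, errors="coerce").astype(float)
    if method == "avg":
        # Replace each NaN with the mean of its own column.
        return df.fillna(df.mean())
    # method == "drop" or not provided: replace each NaN with zero.
    return df.fillna(0.0)


sample = io.StringIO("x,y\n1,2\nNA,4\n3,n/a\n")
filled = load_and_fill(sample, method="avg")
print(filled)
```

With method "avg", the missing x value in this sample becomes 2.0 (the mean of 1 and 3) and the missing y value becomes 3.0 (the mean of 2 and 4).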

This change completes the following items:
1. Treats the strings ["NA","N/A","na","n/a"] as missing values (NaN) when reading the data.
2. Converts all of the data to float.
3. If method is "drop" or is not provided, every NaN is replaced with zero.
4. If method is "avg", every NaN is replaced with the average of its column.
Assumptions:
1. The data has one independent variable and at most 3 dependent variables.
2. The function takes two parameters:
       a. location b. method
3. The output is two data frames:
       a. clean data b. outliers
4. The input file is assumed to be a CSV.
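
The first assumption could be checked with something like the sketch below. It is not taken from this PR: the name `split_variables` is hypothetical, and treating the first column as the independent variable is my own assumption, since the column order is not stated here.

```python
import pandas as pd

# Assumption 1 above: at most 3 dependent variables.
MAX_DEPENDENT = 3


def split_variables(df):
    """Hypothetical helper: validate the column count and split the frame.

    Assumes (not stated in this PR) that the first column is the
    independent variable and the remaining columns are dependent.
    """
    n_dependent = df.shape[1] - 1
    if not 1 <= n_dependent <= MAX_DEPENDENT:
        raise ValueError(
            f"expected 1 to {MAX_DEPENDENT} dependent columns, got {n_dependent}"
        )
    return df.iloc[:, 0], df.iloc[:, 1:]


frame = pd.DataFrame({"x": [1.0, 2.0], "y1": [3.0, 4.0], "y2": [5.0, 6.0]})
independent, dependent = split_variables(frame)
print(independent.name, list(dependent.columns))
```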

This change completes the following items:
1. Treats the strings ["NA","N/A","na","n/a"] as missing values (NaN) when reading the data.
2. Converts all of the data to float.
3. If method is "drop" or is not provided, rows containing a NaN are dropped.
4. If method is "avg", every NaN is replaced with the average of its column.
5. Outliers are identified and removed from the clean data frame.
6. Outliers are returned in a separate data frame.
7. Tested against all the data scenarios we identified.
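
A rough end-to-end sketch of the behaviour listed above, for readers skimming the thread. It is not the code in this PR: it assumes pandas, the name `clean_data` is hypothetical, and the outlier rule (1.5 times the interquartile range, per column) is my own assumption because the rule is not stated here.

```python
import io

import pandas as pd

NA_MARKERS = ["NA", "N/A", "na", "n/a"]


def clean_data(location, method=None):
    """Hypothetical sketch: return (clean, outliers) data frames."""
    df = pd.read_csv(location, na_values=NA_MARKERS)
    df = df.apply(pd.to_numeric, errors="coerce").astype(float)
    if method == "avg":
        # Replace each NaN with the mean of its column.
        df = df.fillna(df.mean())
    else:
        # method == "drop" or not provided: drop rows containing a NaN.
        df = df.dropna()
    # Assumed outlier rule (not from the PR): 1.5 * IQR per column.
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    is_outlier = ((df < q1 - 1.5 * iqr) | (df > q3 + 1.5 * iqr)).any(axis=1)
    clean = df[~is_outlier].reset_index(drop=True)
    outliers = df[is_outlier].reset_index(drop=True)
    return clean, outliers


csv_text = "x,y\n1,10\n2,11\n3,NA\n4,12\n5,13\n6,500\n"
clean, outliers = clean_data(io.StringIO(csv_text))
print(clean)
print(outliers)
```

In this sample the row with the missing y value is dropped first, and the row with y = 500 then ends up in the outliers frame instead of the clean one.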