
mmendan/Test

 
 


*DO NOT MERGE

Unit-4-Challenge

pandas challenge

Springboard Data Science Career Track Unit 4 Challenge - Tier 3 Complete

Objectives

Hey! Great job getting through those challenging DataCamp courses. You're learning a lot in a short span of time.

In this notebook, you're going to apply the skills you've been learning, bridging the gap between the controlled environment of DataCamp and the slightly messier work that data scientists do with actual datasets!

Here’s the mystery we’re going to solve: which boroughs of London have seen the greatest increase in housing prices, on average, over the last two decades?

A borough is just a fancy word for district. You may be familiar with the five boroughs of New York… well, there are 32 boroughs within Greater London (here's some info for the curious). Some of them are more desirable areas to live in, and the data will reflect that with a greater rise in housing prices.
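One way to make "greatest increase" concrete is to compare each borough's average price in a late year against an early year. The sketch below is only an illustration under assumptions: the column names (`borough`, `year`, `average_price`), the helper `price_ratio`, and the numbers are all made up for the example and are not taken from the challenge dataset.

```python
import pandas as pd

# Made-up illustrative numbers, NOT real London prices.
prices = pd.DataFrame({
    "borough": ["Borough A", "Borough A", "Borough B", "Borough B"],
    "year": [1998, 2018, 1998, 2018],
    "average_price": [100_000.0, 400_000.0, 150_000.0, 450_000.0],
})

def price_ratio(df, start_year, end_year):
    """Hypothetical helper: end-year mean price divided by start-year mean price, per borough."""
    # Rows are boroughs, columns are years, values are mean prices.
    yearly = df.groupby(["borough", "year"])["average_price"].mean().unstack("year")
    return (yearly[end_year] / yearly[start_year]).sort_values(ascending=False)

ratios = price_ratio(prices, 1998, 2018)
print(ratios)
```

A ratio is used here rather than an absolute difference so that boroughs starting from very different price levels can be compared; other choices, such as percentage change, would be just as reasonable.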

This is the Tier 3 notebook, which means it's not filled in at all: we'll just give you the skeleton of a project, the brief and the data. It's up to you to play around with it and see what you can find out! Good luck! If you struggle, feel free to look at easier tiers for help; but try to dip in and out of them, as the more independent work you do, the better it is for your learning!

This challenge will make use of only what you learned in the following DataCamp courses:

  • Prework courses (Introduction to Python for Data Science, Intermediate Python for Data Science)
  • Data Types for Data Science
  • Python Data Science Toolbox (Part One)
  • pandas Foundations
  • Manipulating DataFrames with pandas
  • Merging DataFrames with pandas

Of the tools, techniques and concepts in the above DataCamp courses, this challenge should require the application of the following:

  • pandas
      • data ingestion and inspection (pandas Foundations, Module One)
      • exploratory data analysis (pandas Foundations, Module Two)
      • tidying and cleaning (Manipulating DataFrames with pandas, Module Three)
      • transforming DataFrames (Manipulating DataFrames with pandas, Module One)
      • subsetting DataFrames with lists (Manipulating DataFrames with pandas, Module One)
      • filtering DataFrames (Manipulating DataFrames with pandas, Module One)
      • grouping data (Manipulating DataFrames with pandas, Module Four)
      • melting data (Manipulating DataFrames with pandas, Module Three)
      • advanced indexing (Manipulating DataFrames with pandas, Module Four)
  • matplotlib (Intermediate Python for Data Science, Module One)
  • fundamental data types (Data Types for Data Science, Module One)
  • dictionaries (Intermediate Python for Data Science, Module Two)
  • handling dates and times (Data Types for Data Science, Module Four)
  • function definition (Python Data Science Toolbox - Part One, Module One)
  • default arguments, variable length, and scope (Python Data Science Toolbox - Part One, Module Two)
  • lambda functions and error handling (Python Data Science Toolbox - Part One, Module Four)

The Data Science Pipeline

This is Tier Three, so we'll get you started. But after that, it's all in your hands! When you feel done with your investigations, look back over what you've accomplished, and prepare a quick presentation of your findings for the next mentor meeting.

Data Science is magical. In this case study, you'll get to apply some complex machine learning algorithms. But as David Spiegelhalter reminds us, there is no substitute for simply taking a really, really good look at the data. Sometimes, this is all we need to answer our question.

Data Science projects generally adhere to the four stages of the Data Science Pipeline:

1. Sourcing and loading
2. Cleaning, transforming, and visualizing
3. Modeling
4. Evaluating and concluding
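The four stages can be sketched as one small function per stage. This is a minimal skeleton under stated assumptions, not the notebook's actual solution: the inline CSV, its wide layout (one column per borough), the made-up prices, and every function name are hypothetical, and visualization is left out to keep the example short.

```python
import io
import pandas as pd

# Hypothetical wide-format extract with made-up prices: one row per month,
# one column per borough. The real dataset's layout may differ.
RAW_CSV = """month,Borough A,Borough B
1998-01-01,100000,150000
1998-02-01,110000,150000
2018-01-01,400000,440000
2018-02-01,420000,460000
"""

def source_and_load(buffer):
    """Stage 1: read the raw table into a DataFrame."""
    return pd.read_csv(buffer)

def clean_and_transform(wide):
    """Stage 2: melt to long format, then fix dtypes and drop unusable rows."""
    long = wide.melt(id_vars="month", var_name="borough", value_name="average_price")
    long["month"] = pd.to_datetime(long["month"])
    long["average_price"] = pd.to_numeric(long["average_price"], errors="coerce")
    long = long.dropna(subset=["average_price"])
    long["year"] = long["month"].dt.year
    return long

def model(long):
    """Stage 3: mean price per borough and year (boroughs as rows, years as columns)."""
    return long.groupby(["borough", "year"])["average_price"].mean().unstack("year")

def evaluate_and_conclude(yearly):
    """Stage 4: rank boroughs by last-year mean divided by first-year mean."""
    first, last = yearly.columns.min(), yearly.columns.max()
    return (yearly[last] / yearly[first]).sort_values(ascending=False)

tidy = clean_and_transform(source_and_load(io.StringIO(RAW_CSV)))
growth = evaluate_and_conclude(model(tidy))
print(growth)
```

Keeping each stage in its own function makes it easy to inspect the intermediate DataFrame after every step, which is where most surprises in a messy dataset tend to show up.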
