16 changes: 8 additions & 8 deletions README.md
@@ -1,26 +1,26 @@
## Preface

This GitBook was written by David Backus, Sarah Beckett-Hile, Chase Coleman, and Spencer Lyon for [a course](http://databootcamp.nyuecon.com/) at [NYU](http://www.nyu.edu/)'s [Stern School of Business](http://www.stern.nyu.edu/). We plan to give students experience with economic and financial data and introduce programming newbies to the benefits of moving beyond Excel. We use the Python programming language, specifically Python's data management and graphics tools. If that doesn't whet your appetite, we have a [more elaborate sales pitch](http://databootcamp.nyuecon.com/bootcamp_faq/).

We designed the book to accompany a live class. We've tried to make it self-contained, but the written word is a poor substitute for the interaction you get in a classroom.

The book comes in multiple formats. You can read it on the internet, or you can download (and print) a PDF file. The online version comes with links, which we think is a huge advantage, and can be updated quickly, but if you like paper, by all means try the PDF. All formats are available at

https://www.gitbook.com/book/davebackus/test/details

We welcome suggestions. Send them to Dave Backus at [db3@nyu.edu](mailto:db3@nyu.edu). Or, even better, post an issue on our [GitHub repository](https://github.com/DaveBackus/Data_Bootcamp_Book/issues).


## Warning

This is a **work in progress**. We've written seven chapters so far, and more are on the way.


## Acknowledgements

This project was Glenn Okun's idea. He really should have done it himself, but we thank him for the idea and his ongoing support. Paul Backus, Hersh Iyer (MBA17), Matt McKay, Kim Ruhl, and Itamar Snir (MBA17) contributed technical support and applications. Ian Stewart provided his usual expert advice on teaching methods. You may also notice a family resemblance to Tom Sargent and John Stachurski's [Quantitative Economics](http://quant-econ.net/), a Python- and Julia-based course in dynamic macroeconomic theory. We thank them for their advice and encouragement.

## License

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/

8 changes: 4 additions & 4 deletions SUMMARY.md
@@ -2,10 +2,10 @@

* [Preface](README.md)
* [Where are we headed?](intro.md)
* [The data mentality](data-mentality.md)
* [Installing Python](installing-python.md)
* [Python fundamentals 1](py-fun1.md)
* [Python fundamentals 2](py-fun2.md)
* [Data input: Packages and Pandas](pandas-input.md)
* [Python graphics: Matplotlib fundamentals](graphs1.md)

@@ -19,6 +19,6 @@
* [Business cycle indicators](indicators.md)
* [Describing data 1: Distributions of things](random.md)
* [Other cool stuff](other.md)
-->
-->

* [Glossary](glossary.md)
38 changes: 19 additions & 19 deletions conda-pip.md
@@ -1,64 +1,64 @@
# Updating Python: Conda and Pip


---
**Overview.** We describe the tools used to update Anaconda and Python and to add new packages.

**Python tools.** conda, pip.

**Buzzwords.**

**Applications.**

---


## Conda


Conda is Anaconda's package and environment manager, useful for updating and extending Anaconda.

Link: http://conda.pydata.org/docs/
Cheat sheet: http://conda.pydata.org/docs/_downloads/conda-cheatsheet.pdf

Command line...

conda info
conda update conda
conda update anaconda
conda install [package]
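
Before running `conda install [package]`, it can help to check which packages are already available. The sketch below is one way to do that from Python, using only the standard library. The function name `missing_packages` and the wish list are our own illustrations, not part of conda. Note also that the name you import is not always the name you install.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names in `names` that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


# Hypothetical wish list: these package names are illustrative only.
wanted = ["pandas", "matplotlib", "seaborn", "tqdm"]

# Print a suggested command for each package that is not installed yet.
for name in missing_packages(wanted):
    print(f"conda install {name}")
```

If nothing is printed, everything on the list is already installed. Otherwise, the printed lines are commands you could type at the command line.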


## Pip

Link: https://pip.readthedocs.org/en/stable/



## Quandl

Access to lots of economic and financial data...


## Seaborn

A terrific interface for Matplotlib...



## tqdm

Progress bar for data loads...


## Pyopendata

https://pypi.python.org/pypi/pyopendata/0.0.2

Use to get OECD data?


## Flappy bird

https://www.youtube.com/watch?v=h2Uhla6nLDU
https://github.com/Max00355/FlappyBird/blob/master/flappybird.py
76 changes: 38 additions & 38 deletions data-mentality.md
@@ -1,90 +1,90 @@
# The data mentality

---
**Overview.** Thinking about data, ideas for projects. Things to remember: (1) Ideas aren't discovered, they're developed. (2) Ideas have friends: when you find one, there are others nearby.

**Buzzwords.** Questions, data, idea machines.

**Code.** Related [examples](https://github.com/DaveBackus/Data_Bootcamp/blob/master/Code/IPython/bootcamp_examples.ipynb).

---

Data analysis starts with a question. Generally, we want to learn something. In our world, we might ask:

* How is the US economy doing?
* What emerging market countries offer the best business opportunities?
* How do returns on US and European stocks compare?

You'll notice that the starting point is a question, something we'd like to know more about. We provide a toolkit for working effectively with data to find answers. Most of our examples are about economics and finance -- that's what we know -- but the same tools can be used to address any data we like.


<!--
Once we have a question, we can start looking for data that might help us come up with an answer. This leads to more questions:

* What data would be helpful in answering our question?
* Where can we find it?
* What should we do with it once we have it?
-->


## Thinking about data

It's not that we have no lives or anything, but we think about data all the time. If we see an interesting graphic in *The Economist* -- or the *Wall Street Journal*, or the *New York Times* -- it triggers a series of questions.

* What did we learn from the graph?
* What else would we like to know?
* Where does the data come from?

Following up on these questions often leads to interesting insights. And it's fun.

Let's give it a try:

**Exercise.** The 538 blog has a nice summary of [salaries of recent college graduates](http://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/). Skip to the bottom to sort by major and play around. Answer these questions:

* What did you learn from their table?
* What else would you like to know?
* Where did they get their data?

**Exercise.** What kinds of things would you like to know more about? Think of this as improv: there are no bad answers.


## Generating project ideas

One of our goals is for you to produce a piece of work -- data and graphics -- that you can show potential employers. There's nothing like a concrete example (regardless of the topic) to show off your skill set. We still have lots of time, but it can't hurt to start thinking about it now.


**Idea machines.** The question is how to find a good project idea. That's not something you run across a lot in modern education, where our job is typically to absorb what's taught rather than come up with our own ideas. So how would we get started?

We're looking for a topic that covers two bases: (1) we find it interesting and (2) we have access to data related to it. We can start with either one, or with an existing example we would like to reproduce and extend:

* **Start with what interests you.** Economics, finance, marketing, emerging markets, movies, sports. You be the judge. Be specific: You want a topic, not a category.

* **Start with data.** Take a dataset you find interesting, ask what you might do with it. If you're not sure where to look, try our list of [data sources](http://databootcamp.nyuecon.com/bootcamp_data/).

* **Start with an example.** Find an analyst report, blog post, or graphic you like. Ask where the data comes from and think about whether you can replicate and/or extend it.

If you're not sure how this works, watch Steve Levitt's [video](https://youtu.be/r5jATFtKtI8?t=5m10s) about working with company data. It's an entertaining and informative 50 minutes.

Keep in mind: we're not looking for a perfect idea. Perfection takes time, and we may never get there. Long experience has shown us:

<!--
* **Start small.** Small ideas often grow into bigger ones.
-->

* **Ideas have friends.** If you have an idea, even a not very good one, it often triggers thoughts of other ideas, sometimes even better ones.

* **Ideas aren't discovered, they're developed.** Allow your ideas to mature, to evolve and improve. Like kimchi and red wine, they often get better with time.

**Common mistakes -- and how to fix them.** We mean this in a good way, but in our experience there are a number of things students do that make this harder than it should be. Here's a list, with suggestions for overcoming them:

* **Reject an idea too soon,** before you’ve given it enough thought. Solution: Don't be critical too early, you don't want to inhibit your creativity. Collect ideas first, whittle them down later.

* **Choose a project that’s too large**. Solution: Keep it simple. Think it over for a while, and choose a small part of a larger project that is interesting on its own. You can always do more later.

* **Your dataset doesn't have everything you want.** To be honest, that's pretty much every dataset we've ever seen. Solution: Make do with what you have.

* **Pick a dataset that's not available.** Solution: Start with what you have, ask what you can do with it. We call this the [Jeopardy](https://en.wikipedia.org/wiki/Jeopardy!) approach: start with the answer, come up with the question. If that fails, find another dataset.

**Bottom line.** Projects are less structured than most things you'll run across in the academic world. It's challenging, at first, to work with so little structure, but most students find that the freedom to develop their own projects is one of the most rewarding things they can do.

**Exercise.** Write down three project ideas. Don't overthink this, one or two lines each will do.
6 changes: 3 additions & 3 deletions emerging.md
@@ -12,11 +12,11 @@

Include cases from Global....

Data from

* Penn World Table
* World Bank
* Doing Business
* Maddison

## Assessing the business climate
14 changes: 7 additions & 7 deletions glossary.md
@@ -1,9 +1,9 @@
# Glossary


function
list
package
slicing
string
variable
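
The glossary entries are still bare terms, so here is a rough sketch that uses each of them once in a few lines of Python. The names (`shout`, `course`, `numbers`) and the values are made up for illustration; they don't come from the chapters.

```python
import math  # package: a collection of ready-made tools we import by name


def shout(text):
    """function: a named, reusable piece of code."""
    return text.upper() + "!"


course = "Data Bootcamp"   # variable: the name `course` refers to a string
numbers = [1, 2, 3, 5, 8]  # list: an ordered collection of items

first_word = course[:4]    # slicing a string: the first four characters
last_two = numbers[-2:]    # slicing a list: the final two items

print(shout(first_word), last_two, math.sqrt(16))
```

Try changing the slice boundaries or the contents of the list to see how the output changes.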