Apollo1840/Data-Analysis-Tools


Data-Analysis-Tools

Overview

This repo is named Data-Analysis-Tools, but it actually covers the whole data analysis pipeline, including:

| Folder | Meaning |
| --- | --- |
| data_wrangling | From very raw data to usable, easy-to-understand raw data. |
| data_analysis | From raw data to easy-to-visualize or machine-learning-ready data. |
| data_visualization | Visualize the data. |
| datasets | Datasets to play with in the folders above. |
| projects | Very domain-specific data analysis projects. |





Theory

0. How to do data analysis

Preparation -> Preprocessing -> Analysis -> Postprocessing

As I understand it, the steps are:

1) Be clear about the needs.

This is the most valuable step, and it is always underestimated.

2) Get data.

This covers not only databases, but also the activities needed to actually collect data from the real world and to generate features.

We need to know how to work with databases and how to use pandas to generate features (after step 3).
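As a rough sketch of this step, the snippet below loads rows from a database into pandas and derives one feature from existing columns. The in-memory SQLite database, the `orders` table, and its columns are made up for illustration and are not part of this repo; in practice the feature would be generated after cleaning.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, price REAL, quantity INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 2.5, 4), (2, 10.0, 1), (3, 4.0, 3)],
)
conn.commit()

# Load the query result straight into a DataFrame.
orders = pd.read_sql_query("SELECT * FROM orders ORDER BY order_id", conn)
conn.close()

# Generate a feature from existing columns.
orders["total"] = orders["price"] * orders["quantity"]
print(orders)
```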

3) Clean data.

This is also called data wrangling.
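A minimal sketch of what cleaning can look like in pandas, assuming a small made-up table with inconsistent text, numbers stored as strings, missing values, and duplicates. The column names and the choice to drop (rather than fill) incomplete rows are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical raw data with typical problems: duplicates, text numbers, gaps.
raw = pd.DataFrame(
    {
        "city": [" Berlin", "berlin", "Munich", "Munich", None],
        "temp_c": ["21.5", "21.5", "n/a", "19.0", "18.0"],
    }
)

clean = raw.copy()
# Normalize text so that equal values compare equal.
clean["city"] = clean["city"].str.strip().str.title()
# Convert to numbers; unparseable entries become NaN.
clean["temp_c"] = pd.to_numeric(clean["temp_c"], errors="coerce")
# Drop rows missing a required field, then remove exact duplicates.
clean = (
    clean.dropna(subset=["city", "temp_c"])
    .drop_duplicates()
    .reset_index(drop=True)
)
print(clean)
```

Whether to drop or fill incomplete rows depends on the needs defined in step 1.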

4) Analysis.

This is where pandas and ML techniques come into play.
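For the pandas side of this step, a simple example is a grouped summary. The sketch below uses a made-up `sales` table; the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical cleaned data.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south", "south"],
        "revenue": [100.0, 140.0, 80.0, 90.0, 100.0],
    }
)

# Summarize revenue per region: row count, mean, and total.
summary = sales.groupby("region")["revenue"].agg(["count", "mean", "sum"])
print(summary)
```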

5) Report.

For this we need a variety of data visualization skills.
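As one possible illustration of the reporting step, the sketch below draws a bar chart with matplotlib and saves it to a file that could be embedded in a report. The values, labels, and output path are hypothetical.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # Render to files; no display needed.
import matplotlib.pyplot as plt

# Hypothetical summary values to report.
regions = ["north", "south"]
mean_revenue = [120.0, 90.0]

fig, ax = plt.subplots()
ax.bar(regions, mean_revenue)
ax.set_xlabel("Region")
ax.set_ylabel("Mean revenue")
ax.set_title("Mean revenue per region")

# Save the chart so it can be embedded in a report.
chart_path = os.path.join(tempfile.mkdtemp(), "mean_revenue.png")
fig.savefig(chart_path)
plt.close(fig)
```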






Some experience

  • Research the topic, know the data by heart, and use as much human knowledge as possible.
  • Watch for outliers.
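One common way to spot outliers is the interquartile range (IQR) rule. The sketch below is only an illustration using the Python standard library; the helper name `iqr_outliers`, the sample readings, and the conventional factor of 1.5 are assumptions, not something defined in this repo.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(readings))  # prints [95]
```

A flagged value is not necessarily an error; domain knowledge decides whether to keep, correct, or drop it.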





Related Project

Project Athena: an AI-based automotive data analysis tool that acts like an experienced data scientist. It tells you important facts about the dataframe and interacts with you to draw conclusions and make predictions.

About

These are data-analysis tools built on existing tools such as pandas, matplotlib, pyecharts...
