LeTourDataSet

TL;DR

If you use pandas, load the data with:

import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/TDF_Riders_History.csv")

If you use R instead of Python, run:

library(readr)
df <- read_csv("https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/TDF_Riders_History.csv")

Disclaimer

For known problems with this data set, see the Issues tab. Some entries are incorrect. So far, the mistakes appear to stem from wrong data on the letour.fr website itself. In hindsight, I should probably have scraped a different website.
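
Because some entries are known to be wrong, it can help to screen the table before relying on it. The following is only a minimal sketch: it flags rows that are exact duplicates or that contain missing values, which is a generic check and not the author's validation method. The helper name flag_suspect_rows is hypothetical, and the check makes no assumption about column names.

```python
import pandas as pd


def flag_suspect_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows that are exact duplicates or contain a missing value.

    Hypothetical helper: a generic screen, not tied to any column
    of the actual data set.
    """
    # keep=False marks every member of a duplicate group, not just the repeats.
    is_duplicate = df.duplicated(keep=False)
    # True for rows with at least one NaN / empty cell.
    has_missing = df.isna().any(axis=1)
    flagged = df.assign(is_duplicate=is_duplicate, has_missing=has_missing)
    return flagged[flagged["is_duplicate"] | flagged["has_missing"]]
```

Applied to the df loaded in the TL;DR, flag_suspect_rows(df) returns only the rows worth a second look. A flagged row is not necessarily wrong, and an unflagged row is not necessarily right.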

Data

The file data/TDF_Riders_History.csv lists every cyclist of the Tour de France in a single CSV file. Data on every stage is in data/TDF_Stages_History.csv.
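
Both files can be loaded the same way. The sketch below assumes the stages file is served under the same raw.githubusercontent.com prefix as the riders file from the TL;DR; that URL is inferred from the repository layout, not stated in this README. The names BASE_URL, FILES, data_url, load_table and describe_table are hypothetical.

```python
import pandas as pd

# Assumption: both CSV files share the prefix of the riders URL in the TL;DR.
BASE_URL = "https://raw.githubusercontent.com/camminady/LeTourDataSet/master/data/"
FILES = {
    "riders": "TDF_Riders_History.csv",
    "stages": "TDF_Stages_History.csv",
}


def data_url(name: str) -> str:
    """Build the raw download URL for one of the two tables."""
    return BASE_URL + FILES[name]


def load_table(name: str) -> pd.DataFrame:
    """Download one table; needs network access."""
    return pd.read_csv(data_url(name))


def describe_table(df: pd.DataFrame) -> dict:
    """Row count and column names, without assuming any schema."""
    return {"rows": len(df), "columns": list(df.columns)}
```

For example, describe_table(load_table("stages")) shows which columns the stages file actually provides before any analysis is written against them.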

How to run

To regenerate data/TDF_Riders_History.csv, execute all cells of src/main.py. This might take a couple of minutes.
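
If you prefer not to step through the cells in an editor, the script can probably be run in one go. This is a sketch under two assumptions that the README does not confirm: that src/main.py also works as a plain, non-interactive script, and that it resolves paths relative to the repository root. The function name regenerate is hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def regenerate(
    repo_root: str,
    script: str = "src/main.py",
    output: str = "data/TDF_Riders_History.csv",
) -> Path:
    """Run the script from the repository root and return the output path."""
    root = Path(repo_root).resolve()
    # Assumption: the script uses paths relative to the repository root,
    # so the root is used as the working directory.
    subprocess.run([sys.executable, str(root / script)], cwd=root, check=True)
    out = root / output
    if not out.exists():
        raise FileNotFoundError(f"expected output was not written: {out}")
    return out
```

Calling regenerate(".") from a checkout of the repository raises an error if the script fails or if the CSV file is not there afterwards.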

Analysis

src/analysis.py contains some basic analysis and visualizations of the data. For example, the plots above show the distance and the winner's pace.
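
A pace figure of this kind boils down to distance divided by time. The sketch below is not taken from src/analysis.py. It assumes that "pace" means average speed, and that the table holds a distance in kilometres and a finishing time in hours. The column names are passed in as parameters because this README does not list the real ones, and the function name add_average_speed is hypothetical.

```python
import pandas as pd


def add_average_speed(
    df: pd.DataFrame,
    distance_col: str,
    hours_col: str,
    out_col: str = "avg_speed_kmh",
) -> pd.DataFrame:
    """Return a copy of df with average speed = distance / time.

    distance_col and hours_col are supplied by the caller; the real
    column names of the data set are not assumed here.
    """
    result = df.copy()
    result[out_col] = result[distance_col] / result[hours_col]
    return result
```

Check the actual column names with df.columns first, and convert the time column to hours if it is stored in another unit.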

Legacy code

This code has been completely rewritten. The previous code, including its output, is in the legacy repository. See legacy/README.txt in particular.
