Scrapes used car information from KSL and analyzes for best car deals using Linear Regression.

Simon-Cheek/KSLCarScraper

KSL Car Scraper

Welcome to the KSL Car Scraper! This program is designed to gather data from KSL on used cars for sale to find those with the best price/mileage ratio.

How it Works:

KSL Car Scraper begins by making a request to KSL's backend for information on a specified car make and model. Once the cars are gathered, the JSON response is processed and the cars are fed through a simple linear regression model to create a line of best fit for prices and mileages. The cars are then ranked by comparing each car's actual mileage against the mileage the model predicts for its price. Cars that rank higher than average are logged in a file called 'good_deals.txt', and those that rank below average are logged in 'bad_deals.txt'.
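As a rough illustration of the regression step, the sketch below fits a least-squares line that predicts mileage from price. It is not taken from the project's source: the function names fit_line and predict_mileage are hypothetical, and the actual program may use a library model instead of hand-written math.

```python
from statistics import mean

def fit_line(prices, mileages):
    """Least-squares fit of mileage = slope * price + intercept."""
    mean_price, mean_mileage = mean(prices), mean(mileages)
    # Spread of the prices and how price and mileage vary together.
    sxx = sum((p - mean_price) ** 2 for p in prices)
    sxy = sum((p - mean_price) * (m - mean_mileage)
              for p, m in zip(prices, mileages))
    slope = sxy / sxx
    intercept = mean_mileage - slope * mean_price
    return slope, intercept

def predict_mileage(price, slope, intercept):
    """Mileage the fitted line expects for a car at this price."""
    return slope * price + intercept

# Made-up listings: cheaper cars tend to have more miles on them.
prices = [5000, 10000, 15000]
mileages = [150000, 100000, 50000]
slope, intercept = fit_line(prices, mileages)
expected = predict_mileage(12000, slope, intercept)
```

A car whose actual mileage sits below the line's prediction for its price is a candidate for a good deal.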

Run Command

python3 main.py {car_make} {car_model}

KSLCarScraper's fetch works by sending a GET request for each 'page' of cars that the backend would normally supply to the website. Each 'page' contains 24 cars. The program defaults to querying 6 'pages', for a maximum of 144 cars parsed. To override this, pass an optional third argument to set the number of pages parsed.

For Example:

Running the command python3 main.py Toyota Corolla 10 will query KSL for up to 240 Toyota Corollas (10 pages of 24 cars each).
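The argument handling and page arithmetic described above could look something like the following. This is only a sketch: parse_args, max_cars, and the constant names are hypothetical and are not confirmed to match main.py.

```python
CARS_PER_PAGE = 24  # cars per 'page', as stated in this README
DEFAULT_PAGES = 6   # default number of pages queried

def parse_args(argv):
    """Read make, model, and an optional page count from the command line."""
    if len(argv) < 3:
        raise SystemExit("usage: python3 main.py {car_make} {car_model} [pages]")
    make, model = argv[1], argv[2]
    # The optional third argument overrides the default page count.
    pages = int(argv[3]) if len(argv) > 3 else DEFAULT_PAGES
    return make, model, pages

def max_cars(pages):
    """Upper bound on the number of cars parsed for a given page count."""
    return pages * CARS_PER_PAGE

make, model, pages = parse_args(["main.py", "Toyota", "Corolla", "10"])
limit = max_cars(pages)  # 10 pages * 24 cars = 240
```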

Scoring and Output

Each car is given a predicted mileage based on the line of best fit created by the model. The score is calculated as follows:

score = 1 - (predictive_mileage / mileage)

This means that cars with less mileage than predicted receive a negative score, which is a good thing. The score approaches 1 as a car's mileage grows far beyond the prediction, so 1 is the worst score, and the best score can technically stretch toward negative infinity as mileage approaches zero. A score below -1 means the car's mileage is less than half of what the model predicts, which usually indicates very low mileage or a very low price.

Each output file contains the following information for each car:
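To make the formula concrete, here is a small worked example. The function name score is illustrative, and the mileage figures are made up.

```python
def score(predictive_mileage, mileage):
    """Score from the formula above: negative is good, values near 1 are bad."""
    return 1 - (predictive_mileage / mileage)

# The model predicts 100000 miles for the price of each car below.
half_the_miles = score(100000, 50000)    # 1 - 2.0 = -1.0 (good deal)
as_predicted = score(100000, 100000)     # 1 - 1.0 = 0.0 (average)
double_the_miles = score(100000, 200000) # 1 - 0.5 = 0.5 (bad deal)
```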

  • Score
  • Price
  • Mileage
  • Location (Utah)
  • Year
  • Link (to KSL)

Max Price

A constant at the top of send_request.py sets the maximum car price to 15000; this can be altered if desired. The algorithm also filters out cars that aren't in Utah (KSL is based in Utah) and those with mileage over 200000. To override either of these filters, edit the filter_cars method in main.py.
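For orientation, a filter along these lines might look like the sketch below. The dictionary keys ("state", "mileage"), the constant names, and the function signature are assumptions made for illustration. The real filter_cars in main.py may be structured differently.

```python
MAX_MILEAGE = 200000  # cars above this mileage are dropped (value from this README)
ALLOWED_STATE = "UT"  # assumed representation of "in Utah"

def filter_cars(cars):
    """Keep only cars located in Utah with mileage at or below the limit."""
    return [
        car for car in cars
        if car["state"] == ALLOWED_STATE and car["mileage"] <= MAX_MILEAGE
    ]

# Hypothetical listings to show the effect of the filter.
listings = [
    {"state": "UT", "mileage": 90000},
    {"state": "ID", "mileage": 60000},   # dropped: not in Utah
    {"state": "UT", "mileage": 250000},  # dropped: over the mileage limit
]
kept = filter_cars(listings)
```

Changing MAX_MILEAGE or ALLOWED_STATE in a sketch like this corresponds to the overrides described above.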
