From be3251a0f4eab0bea63215b30f0a1587c50d1975 Mon Sep 17 00:00:00 2001
From: MericGit <41242144+MericGit@users.noreply.github.com>
Date: Sun, 15 May 2022 18:15:40 -0500
Subject: [PATCH] Create readme.md

---
 readme.md | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 readme.md

diff --git a/readme.md b/readme.md
new file mode 100644
index 0000000..e90d546
--- /dev/null
+++ b/readme.md
@@ -0,0 +1,11 @@
+# Data Parse
+Tools used to convert AP Exam PDFs into individual data components, which were then uploaded to a Google Sheet. The code here is rudimentary and messy, and is only intended to be run from an IDE. It relies on hard-coded file paths and was not designed for general use in any way.
+
+
+# Resources Used
+- Tesseract (open-source, machine-learning-powered Optical Character Recognition engine, long sponsored by Google)
+- pdf2image (Python library to convert PDFs to images)
+- imageio (Python library to crop and process images)
+- pygsheets (Python library to upload data to Google Sheets)
+- Google Sheets API (API provided by Google to allow automation of various Google Drive tasks)
+- pandas (Python data analysis library)