
Wechale Shisia


I do not just write code and build models: I think adaptively, and I curate and implement solutions to your problems, both the ones you know about and the ones you do not know you have!

  • My data processing pipelines improved the efficiency of document verification for the Jiinue Youth Program, a Mastercard program run by the Kenya Private Sector Alliance (KEPSA) for financing MSMEs. In this project, I used advanced Excel querying techniques to enhance data ETL, which cut the verification period for more than 60,000 records by over 70%, saving time and costs.

  • At ShopOkoa, I demonstrated how Generative Adversarial Networks (GANs) can be used to train reliable AI models for credit scoring and customer classification. This finding has helped the fintech startup roll out an AI-powered digital lending platform that has now onboarded more than 500 merchants in Kenya, enabling customers to shop on credit and pay later.

As a freelance data analyst, I have created numerous well-documented end-to-end projects for various clients, as highlighted below.

I am open to corporate and freelance work opportunities.

Essentials:


Python, R, Azure SQL Database, Excel, Power BI, Tableau

Projects:


1. Credit Scoring Models for ShopOkoa

Overview:

The project utilized synthetic data to develop various predictive models that informed potential customer trends and credit repayment behaviors.

Insights:
  • Previous history of loan repayment was a more reliable metric of future repayment than other demographic variables.
  • Segmenting customers into demographic clusters before model training produced better credit scores than generalized linear models.
Tools & Methods: XGBoost, CatBoost, Random Forests

2. ARIMA Time Series Models for Farm Machinery Income Generation over a period of 12 months.

Overview:

The project aimed to track and predict ROI for a farm tractor used for various activities, including transport services and ploughing. The farmer wanted to understand which of the two services was more profitable and at what times of the year each service was most sought after.

Insights:
  • High-income seasons repeatedly preceded the two rainy seasons for this region, an indication that in the period before the rains the tractor owner should expect higher demand for ploughing services than for transport and towing services.
Tools & Methods: ARIMA, Auto-ARIMA, LSTM, BiLSTM, XGB, LGBM, GRU, CatBoost
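
As a rough illustration of the ARIMA idea named above, the sketch below differences a monthly income series once and fits a single autoregressive coefficient by least squares, which amounts to a bare-bones ARIMA(1,1,0) without an intercept. The function name `forecast_next` and the income figures are hypothetical and not taken from the project; a real analysis would normally rely on a dedicated time series library rather than this hand-rolled version.

```python
def forecast_next(series):
    """One-step forecast from an ARIMA(1,1,0)-style model without intercept."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    # Differencing (the "I" in ARIMA) removes a linear trend.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # Least-squares AR(1) coefficient on the differenced series.
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    phi = num / den if den else 0.0
    # Predict the next difference, then undo the differencing.
    return series[-1] + phi * diffs[-1]


# Hypothetical monthly tractor income (illustrative numbers only).
monthly_income = [120.0, 135.0, 150.0, 170.0, 160.0, 140.0]
print(forecast_next(monthly_income))
```

On a perfectly linear series the fitted coefficient is 1, so the forecast simply continues the trend; on noisy data the coefficient shrinks and the forecast stays closer to the last observed value.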

3. An Analysis of Customers' Response to a Sales Survey

Overview:

This capstone project aimed to assess the best-performing classification models on a generic dataset. I tested Random Forests, Logistic Regression, K-Nearest Neighbours, and Decision Trees.

Insights:
  • Random Forest classifiers outperformed all the other base models in both accuracy and runtime.
  • On fine-tuning, Random Forest classifiers were less likely to overfit than the other control models.
  • Accuracy improved across all classifiers when cross-validation folds were used for model training and testing.
Tools & Methods: Python, Scikit-learn, plotly, KNN, Random Forest, Logistic Regression.
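
To make the cross-validation point above concrete, here is a minimal sketch of k-fold evaluation written in plain Python. It is not the project's code: the helper names (`k_fold_indices`, `nearest_neighbour_predict`, `cross_val_accuracy`), the 1-nearest-neighbour classifier, and the toy dataset are all assumptions chosen to keep the example self-contained. A Scikit-learn workflow would use the library's own cross-validation utilities instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    if not 1 < k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def nearest_neighbour_predict(train_x, train_y, x):
    """Return the label of the closest training point (1-nearest-neighbour)."""
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]


def cross_val_accuracy(xs, ys, k=3):
    """Mean accuracy of the 1-NN classifier over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        hits = sum(nearest_neighbour_predict(tx, ty, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical one-feature dataset with two well-separated classes.
xs = [0.1, 5.0, 0.3, 5.2, 0.2, 5.1]
ys = [0, 1, 0, 1, 0, 1]
print(cross_val_accuracy(xs, ys, k=3))
```

Every sample is used for testing exactly once and for training in the remaining folds, which is why the averaged score is a steadier estimate than a single train/test split.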

4. Traffic flow prediction

Overview:

Traffic snarls are a major challenge for modern cities. This project utilized deep learning neural networks to predict traffic flow at different road junctions over a period of years. The intention was to create a predictive AI system that helps road users plan better and save time and resources.

Insights:
  • Combined with Time Series Analysis, Gated Recurrent Units demonstrated a high efficiency and accuracy in predicting traffic flow for different road sections.
  • The efficiency and reliability of GRU neural networks for predicting traffic flow rely heavily on the size of the dataset (the period of observation).
Tools & Methods: TensorFlow, GRU, Time Series.
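
For readers unfamiliar with Gated Recurrent Units, the sketch below runs the forward pass of a single scalar GRU cell over a short sequence. It is an illustration only: the weights in `params`, the input values, and the function name `gru_step` are hypothetical, and the blending convention used here (the update gate weights the candidate state) is one of two common ones. The project itself used TensorFlow, whose GRU layer handles vectors, batching, and training.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One forward step of a scalar GRU cell.

    `p` is a dict of hypothetical weights: w_* act on the input,
    u_* on the previous hidden state, b_* are biases.
    """
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])  # update gate
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])  # reset gate
    # The candidate state sees the previous state only through the reset gate.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Blend old state and candidate (convention: z weights the candidate).
    return (1.0 - z) * h_prev + z * h_cand


# Hypothetical weights and a short, scaled traffic-count sequence.
params = {"w_z": 0.5, "u_z": 0.1, "b_z": 0.0,
          "w_r": 0.4, "u_r": 0.2, "b_r": 0.0,
          "w_h": 0.9, "u_h": 0.3, "b_h": 0.0}
h = 0.0
for count in [0.2, 0.4, 0.7, 0.5]:
    h = gru_step(count, h, params)
print(h)
```

The hidden state `h` carries a summary of earlier observations forward, which is what lets the network relate the current traffic count to the recent history at a junction.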

5. A deep dive into a Visualization of Loan Data using R.

Overview:

This project explored the robustness of R in data analysis and data visualization on a generic dataset.

Insights:
  • I discovered that, like Python, R is an effective and robust language for Exploratory Data Analysis (EDA) and data visualization.
  • R comes with additional libraries, packages, and methods that allow for tweaks, better visuals, and better insights compared with Python.
Tools & Methods: R, Jupyter Notebooks, Mean Matching, Kaggle.

Publications:


Gated Recurrent Unit (GRU) TensorFlow Neural Networks: Time Series Analysis for traffic flow analysis & prediction

Contact:


Gmail, WhatsApp, LinkedIn, Telegram

Pinned:

  1. traffic_prediction_tensorflow (Public, Jupyter Notebook)
  2. sentiment_analysis_on_masculinity_saturday (Public, Jupyter Notebook)
  3. ARIMA_TIME_SERIES_FORECASTING (Public, Jupyter Notebook)
  4. missing_marks_prediction_analysis (Public, Jupyter Notebook)
  5. Boosting_Algorithms (Public, Jupyter Notebook)
  6. clustering_algorithms (Public, Jupyter Notebook)