MarkJ-DC5/README.md

👋 Hi there, I'm Mark Jayson 👋

About me:

  • πŸ‘¨β€πŸ’» Data Scientist by profession, leveraging data to drive insights and solutions.
  • πŸ“œ Holds a Bachelors Degree in Computer Engineering from MapΓΊa University 🐦,
  • 🌟 Passionate about all things programming and technology.
  • 🧠 Currently immersed in the fascinating world of Machine Learning and Deep Learning.
  • πŸ‘¨β€πŸ’» My hobby is coding, which became my job and now I don't have any other hobby πŸ˜…

Connect with me:

LinkedIn

🔨 Tools and Frameworks

The following are some of the tools and frameworks I have used in building my projects:

Python Java JavaScript Numpy Pandas Matplotlib Scikit-Learn TensorFlow JSON MySQL HTML5 Anaconda Colab

🚧 📘 Portfolio Overview 🚧

This section is currently under construction, and the source code has not yet been transferred to Git. If you require further information, please don't hesitate to reach out to me. Thank you for your understanding. ❤️

📷 Image Classification: Skin Lesion Recognition for Cancer Detection

Keywords: Deep Learning | Computer Vision | Convolutional Neural Network (CNN) | HAM10000 | Pre-Augmentation | Transfer learning | EfficientNetV2 | Soft-Attention | View Source

  • Skin cancer is the most widespread form of cancer, with melanoma being its deadliest variant, responsible for 75% of skin cancer-related deaths. Early detection is paramount for effective treatment and positive outcomes.

  • In this project, I developed a Deep Convolutional Neural Network (CNN) model capable of classifying 7 types of skin lesions, achieving 87% accuracy and demonstrating strong class discrimination with an AUC of 0.97.

  • The HAM10000 dataset by Tschandl, P., Rosendahl, C., & Kittler, H., consisting of 10,015 dermoscopic images, was used for model training and validation. Downsampling and Pre-Augmentation were performed to address the dataset's significant imbalance.

  • Transfer learning techniques were employed to achieve optimal performance while minimizing training time. EfficientNetV2 was chosen as the pre-trained CNN architecture because, in testing, it struck the best balance between model performance and hardware usage.

  • Additionally, the concept of Soft-Attention, based on the study of Datta et al., was implemented to visualize the area the model focuses on when identifying a lesion's class.
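
The downsampling step mentioned above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the helper name `downsample_indices`, the per-class cap, and the toy labels are all assumptions, and only the Python standard library is used.

```python
import random
from collections import defaultdict


def downsample_indices(labels, max_per_class, seed=0):
    """Return sorted sample indices with at most max_per_class per label.

    Hypothetical helper: classes larger than the cap are randomly
    undersampled, while smaller classes are kept whole.
    """
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    kept = []
    for indices in by_class.values():
        if len(indices) > max_per_class:
            # Randomly pick max_per_class distinct samples from this class.
            indices = rng.sample(indices, max_per_class)
        kept.extend(indices)
    return sorted(kept)


# Toy labels standing in for an imbalanced dataset (not real HAM10000 counts).
toy_labels = ["nv"] * 10 + ["mel"] * 3 + ["bcc"] * 2
kept = downsample_indices(toy_labels, max_per_class=3)
print(len(kept), "samples kept out of", len(toy_labels))
```

Capping only the majority classes narrows the imbalance without discarding any of the rare examples; augmentation can then be applied to the classes that are still underrepresented.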

πŸ—¨οΈ Text Classification: Amazon Product Review Spam Detection

Keywords: Deep Learning | Natural Language Processing | Text Classification | Recurrent Neural Network | Long Short-Term Memory | Spam Reviews | Batch Processing | View Source

  • Online shopping has become commonplace, reshaping global commerce, and is expected to generate $3.2 trillion in revenue by 2024. Because 93% of consumers state that reviews heavily influence their purchasing decisions, there is a clear need for a system that detects false or spam reviews to safeguard genuine feedback and enable informed choices.

  • A Text Classification model was built using Natural Language Processing (NLP) techniques. It is based on a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) cells and attains 89% accuracy in determining whether a review is spam or genuine.

  • I used the Amazon Product Review (Spam and Non-Spam) dataset from Naveed Hussain et al., with a total size of 18.4GB comprising 26.7 million reviews distributed across six product categories. Each category is represented by a JSON file containing its reviews.

  • Due to RAM constraints and the dataset's JSON format, extensive preparation was necessary. I focused on reviews for fashion and clothing products, a category that ranked second in sales distribution on Amazon in 2022 with 24.7% of total sales; its JSON file is 3.21GB. The file was then converted through batch processing into a CSV file containing 5.7 million reviews, which prevented memory overload and simplified further processing and analysis.
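
The batch conversion described above might look something like the sketch below. It assumes the input is in JSON Lines form (one review object per line), which the description does not confirm; the function name `json_lines_to_csv` and the field names `text` and `label` are hypothetical placeholders rather than the dataset's real schema.

```python
import csv
import json
import os
import tempfile
from itertools import islice


def json_lines_to_csv(json_path, csv_path, fields, batch_size=10_000):
    """Stream a JSON Lines file into a CSV file one batch at a time.

    At most batch_size lines are held in memory at once. Returns the
    number of rows written. Keys not listed in fields are dropped.
    """
    total = 0
    with open(json_path, encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        while True:
            lines = list(islice(src, batch_size))  # read the next batch
            if not lines:
                break  # end of file reached
            rows = [json.loads(line) for line in lines if line.strip()]
            writer.writerows(rows)
            total += len(rows)
    return total


# Tiny demonstration with made-up records in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "reviews.json")
    dst_path = os.path.join(tmp, "reviews.csv")
    with open(src_path, "w", encoding="utf-8") as f:
        f.write('{"text": "great shoes", "label": 0, "extra": 1}\n')
        f.write('{"text": "buy now!!!", "label": 1}\n')
        f.write('{"text": "fits well", "label": 0}\n')
    rows_written = json_lines_to_csv(
        src_path, dst_path, ["text", "label"], batch_size=2
    )
    with open(dst_path, encoding="utf-8", newline="") as f:
        csv_rows = list(csv.reader(f))

print(rows_written, "rows written")
```

Because the file is consumed through an iterator, memory use is bounded by `batch_size` rather than by the size of the source file, which is the property that matters for a multi-gigabyte input.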

Other Projects

  • Image Classification: Laboratory Apparatus Identification
  • Image Classification: One-Class Convolutional Neural Network based on a study
  • Regression Model: Algae Count Prediction
  • Clustering Model: Customer Segmentation using both Numerical and Categorical Data
  • NLP Sentiment Analysis: Twitter Comment Polarity Prediction

Popular repositories

  1. HilaLiTech (Public)

    Forked from Blue-Hacks-2021/HilaLiTech

    Repository for the BAGYO! web app.

    HTML

  2. Mall-QCT-QR-Code-Contact-Tracinng- (Public)

    Python

  3. MarkJ-DC5 (Public)

  4. Skin-Cancer-Classification (Public)

  5. rental-management-api (Public)

    Java