dhecloud/README.md

👋 Hi, I'm Andrew

I'm a Computer Science PhD specializing in audio processing, multimodal retrieval, and machine learning systems.
These days, I enjoy building AI-powered Telegram bots, working on quantitative trading tools, and exploring computer vision + LLM agent workflows.

I also love photography (Sony A6700), gaming, and flying my drone.
This GitHub is a mix of research code, practical ML pipelines, and personal side projects.

📫 Contact


🧠 Research Background

Ph.D. Computer Science – Nanyang Technological University (2020–2024)
Thesis: Audio Captioning and Retrieval with Improved Cross-Modal Objectives

Published research in:

  • Automated Audio Captioning
  • Language-based Audio Retrieval
  • Acoustic Event Detection
  • Word Sense Disambiguation (BERT)

See publications below 👇


🚀 Personal Projects

🗣️ SakuraSensei: A Japanese Conversational AI Tutor

A context-aware Telegram bot built with LangChain, RAG pipelines, and multilingual embeddings.
Features include:

  • JLPT grammar/vocab scraping + JMDICT + Tatoeba + JaQuAD
  • Automated Japanese news explanations via multi-agent workflows
  • Cloze-question generation from YouTube transcripts
  • Memory persistence + metadata filtering improvements
    👉 Bot: https://t.me/SakuraSenseiNoBot

😆 FaceChangerGIFBot: Real-Time Face Swap Bot

A production-grade Telegram bot that swaps user faces into GIFs, short videos, and “Featured Clips”.
Highlights:

  • ONNX-based inference for fast face swapping
  • Usage tracking, quotas, and Stripe premium tier
  • Watermarking + file size/duration validation
  • Migrated webhooks from Ngrok → Cloudflare Tunnel
    👉 Bot: https://t.me/FaceChangerGIFBot

📚 Publications

  • Language-based Audio Retrieval with Converging Tied Layers and Contrastive Loss (APSIPA 2022)
  • Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning (APSIPA 2022)
  • Audio Captioning with Reconstruction Latent Space Regularization (ICASSP 2022)
  • Sound Event Detection with Weakified Strong Labels & Frequency Dynamic Convolution (arXiv 2023)
  • Adapting BERT for Word Sense Disambiguation with Gloss Selection (EMNLP Findings)

🛠️ Skills

Machine Learning: PyTorch, TensorFlow, ONNX, transformers, LangChain
Domains: Audio Processing, Computer Vision, NLP, Cross-Modal Retrieval
Tools: Docker, Cloudflare Tunnel, Whisper, VAD, BeautifulSoup
Languages: Python, English (Native), Chinese (Conversational), Japanese (Conversational)


Thanks for visiting!
Feel free to explore my projects or reach out if you'd like to collaborate.

Popular repositories

  1. Hand-Pose-for-Rheumatoid-Arthritis Public

    Python 4 1

  2. st_AED_deliverables Public

    Python 4 2

  3. Question_Answering_MSF_Adoption Public

    Python 3 1

  4. NEFCLASS Public

    Implementation of NEFCLASS in Python

    Python 3 1

  5. platypus_compounder Public

    This repository helps you compound $PTP from the liquidity pools on platypus.finance into a stablecoin of your choice.

    Python 2

  6. SCSE_QE_Presentation Public

    TeX 1