Hand landmark data is extracted with MediaPipe, fed into a deep learning model, and run in real time.

phamvoquoclong/RockPaperScissors



🎯 Demo Objectives

This demo presents a complete, practical computer vision pipeline, covering:

  • Multi-object fashion detection
  • Improved recall for small objects (rings, watches, accessories)
  • Multi-label attribute prediction per detected object
  • Clean, standardized output format for downstream usage

This demo is intended as a proof-of-concept system.


🚀 Features

✅ Object Detection

  • YOLOv11-based detection
  • Supports apparel and accessory categories
  • GPU acceleration via CUDA when available
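Detections can be carried through the rest of the pipeline as a small record type. A minimal sketch of such a record and a confidence filter (the `Detection` class, field names, and threshold are illustrative assumptions, not the repository's actual code):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One detected fashion item, as produced by the detector."""
    label: str         # class name, e.g. "Cardigan"
    confidence: float  # detector score in [0, 1]
    box: List[int]     # [x1, y1, x2, y2] in pixel coordinates


def filter_detections(dets: List[Detection],
                      conf_thresh: float = 0.5) -> List[Detection]:
    """Keep only detections whose score clears the confidence threshold."""
    return [d for d in dets if d.confidence >= conf_thresh]
```

Keeping the raw detector output in one typed record makes the later crop-and-classify and serialization steps straightforward.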

✅ SAHI Integration

  • Optional SAHI inference for small objects
  • Toggleable directly from the UI
  • Helps recover tiny fashion items often missed by standard inference
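The idea behind sliced inference is plain tiling geometry: split the image into overlapping windows, run the detector on each window, and merge the per-window detections back into image coordinates. A stdlib-only sketch of the window computation (the function name and default sizes are assumptions for illustration, not SAHI's actual API):

```python
from typing import List, Tuple


def slice_boxes(img_w: int, img_h: int,
                slice_w: int = 512, slice_h: int = 512,
                overlap: float = 0.2) -> List[Tuple[int, int, int, int]]:
    """Return (x1, y1, x2, y2) windows covering the image, with adjacent
    windows overlapping by the given ratio so objects on tile borders are
    still seen whole in at least one tile."""
    step_x = max(1, int(slice_w * (1 - overlap)))
    step_y = max(1, int(slice_h * (1 - overlap)))
    boxes = []
    y = 0
    while True:
        y2 = min(y + slice_h, img_h)
        x = 0
        while True:
            x2 = min(x + slice_w, img_w)
            boxes.append((x, y, x2, y2))
            if x2 >= img_w:
                break  # last column for this row
            x += step_x
        if y2 >= img_h:
            break  # last row
        y += step_y
    return boxes
```

Because each tile is small relative to the objects inside it, tiny items like rings and watches occupy proportionally more pixels per inference pass, which is what improves small-object recall.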

✅ Attribute Recognition

  • Each detected object is cropped and passed to an attribute head
  • Multi-label prediction (one object → multiple attributes)
  • Attribute IDs are mapped to human-readable names
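Multi-label decoding differs from ordinary classification: each attribute gets an independent sigmoid rather than one softmax over all attributes, so any number of attributes can fire per object. A minimal sketch (the attribute vocabulary and threshold here are made up for illustration; the real ID-to-name mapping ships with the model):

```python
import math
from typing import Dict, List

# Hypothetical attribute vocabulary for illustration only.
ATTRIBUTE_NAMES: Dict[int, str] = {
    0: "Plain pattern",
    1: "Short length",
    2: "Single breasted",
}


def decode_attributes(logits: List[float],
                      threshold: float = 0.5) -> List[str]:
    """Apply an independent sigmoid to each logit (multi-label, not
    softmax) and keep the names of attributes whose probability clears
    the threshold."""
    names = []
    for idx, logit in enumerate(logits):
        prob = 1.0 / (1.0 + math.exp(-logit))
        if prob >= threshold:
            names.append(ATTRIBUTE_NAMES[idx])
    return names
```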

✅ Output Formats

  • Visual output with bounding boxes
  • Tabular summary (class, confidence, attributes)
  • JSON output matching submission format

Example JSON output:

[
  {
    "label": "Cardigan",
    "confidence": 0.97,
    "box": [100, 175, 715, 971],
    "attributes": ["Plain pattern", "Short length", "Single breasted"]
  }
]
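Producing that structure is a plain serialization step over the per-object results. A hedged sketch (the helper name and two-decimal confidence rounding are assumptions, not the repository's actual code):

```python
import json
from typing import Dict, List


def to_submission_json(detections: List[Dict]) -> str:
    """Serialize per-object results (dicts with label, confidence, box,
    and attributes) into the submission-format JSON shown above."""
    records = [
        {
            "label": d["label"],
            "confidence": round(float(d["confidence"]), 2),
            "box": [int(v) for v in d["box"]],
            "attributes": list(d["attributes"]),
        }
        for d in detections
    ]
    return json.dumps(records, indent=2)
```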
