PrathamX595/Kortex-AdobeSubmission

Advanced Android photo editor powered by on-device AI models (LaMa, EdgeSAM, MobileViT) with GPU-accelerated adjustments, voice commands, generative fill, and professional editing tools. Built with Jetpack Compose and ONNX Runtime for fast, offline-first image processing.

Table of Contents

  1. Overview
  2. Installation from APK
  3. Project Architecture
  4. Technology Stack
  5. On-Device AI Models
  6. Application Features
  7. Workflow and Data Flow
  8. Project Structure
  9. Setup and Installation
  10. Building and Running
  11. System Requirements

Overview

Kortex is a sophisticated Android photo editing app that delivers professional-level image editing by fusing cloud-based AI services with on-device machine learning models. The application leverages ONNX Runtime for efficient on-device inference, GPU acceleration for real-time adjustments, and offline speech recognition for voice-controlled editing.

The application is built with contemporary Android development techniques: Jetpack Compose for the UI layer, Kotlin Coroutines for asynchronous operations, and the MVVM (Model-View-ViewModel) architectural pattern for clear separation of concerns.



Installation from APK

A pre-built release APK is included in this repository so you can try Kortex quickly without building from source.

Installation Steps

For Android Device:

Step 1: Enable Unknown Sources

Before installing the APK, you need to allow installation from unknown sources:

  1. Open Settings on your Android device
  2. Navigate to Security or Privacy (location varies by Android version)
  3. Find and enable Install unknown apps or Unknown sources
  4. Select your file manager or browser and allow installation from that source

Note: On Android 8.0 (Oreo) and above, you grant permission per app. On older versions, there's a global setting.

Step 2: Transfer the APK to Your Device

Choose one of these methods:

Method A: Direct Download (if shared online)

  • Download the APK directly on your device from the shared link
  • The APK will be saved to your Downloads folder

Method B: USB Transfer

  1. Connect your Android device to your computer via USB
  2. Enable File Transfer mode when prompted on your device
  3. Navigate to the project folder: Kortex-app/apk/
  4. Copy app-release.apk to your device's Downloads folder or internal storage

Method C: Cloud Transfer

  1. Upload app-release.apk from Kortex-app/apk/ folder to Google Drive, Dropbox, etc.
  2. Download it on your Android device from the cloud service
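
Method D: ADB Install (optional)

If you have Android's platform-tools installed and USB debugging enabled on your device, you can skip the manual transfer and install directly from your computer:

adb install Kortex-app/apk/app-release.apk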

Step 3: Install the APK

  1. Open your device's File Manager app
  2. Navigate to the folder where you saved the APK (usually Downloads)
  3. Tap on app-release.apk
  4. Review the permissions requested by the app:
    • Storage (for saving/loading images)
    • Camera (for taking photos)
    • Microphone (for voice commands)
    • Internet (for cloud AI features)
  5. Tap Install
  6. Wait for installation to complete (may take 30-60 seconds due to ML models)
  7. Tap Open to launch Kortex, or find it in your app drawer

Step 4: Grant Runtime Permissions

When you first use certain features, Android will ask for permissions:

  • Storage/Photos: Required to edit images from your gallery
  • Camera: Required to take new photos (optional)
  • Microphone: Required for voice commands (optional)
  • Internet: Required for cloud AI features (optional - app works offline)

Project Architecture

Architectural Pattern

The application combines the MVVM (Model-View-ViewModel) architecture pattern with the repository pattern for data management:

┌─────────────────────────────────────────────────────────────┐
│                        View Layer                           │
│  (Jetpack Compose UI Components)                            │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │PhotoEditor   │  │ Adjust       │  │ Background   │       │
│  │Screen        │  │ Screen       │  │ Removal      │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└────────────┬────────────────────────────────────────────────┘
             │
             │ User Actions / State Observation
             ▼
┌─────────────────────────────────────────────────────────────┐
│                   ViewModel Layer                           │
│                                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │        PhotoEditorViewModel                          │   │
│  │  • Manages UI state                                  │   │
│  │  • Handles user interactions                         │   │
│  │  • Coordinates with repositories and executors       │   │
│  └──────────────────────────────────────────────────────┘   │
└────────────┬────────────────────────────────────────────────┘
             │
             │ Data Requests / Commands
             ▼
┌─────────────────────────────────────────────────────────────┐
│                   Data/Model Layer                          │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ Repositories │  │ ML Executors │  │ Utilities    │       │
│  │              │  │              │  │              │       │
│  │ • Retouch    │  │ • LaMa       │  │ • Image      │       │
│  │ • CloudEdit  │  │ • SAM        │  │ • Watermark  │       │
│  │              │  │ • AutoEnhance│  │ • Font       │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘

Component Interaction Flow

User Input
    │
    ▼
┌───────────────────┐
│  Compose UI       │
│  Components       │
└────────┬──────────┘
         │ Events
         ▼
┌───────────────────────────┐
│  PhotoEditorViewModel     │
│  • State Management       │
│  • Business Logic         │
└─────┬─────────────────────┘
      │
      ├──────────────────┬──────────────────┬────────────────┐
      ▼                  ▼                  ▼                ▼
┌─────────────┐   ┌─────────────┐   ┌────────────┐   ┌────────────┐
│ On-Device   │   │  Cloud      │   │ GPU Image  │   │  Local     │
│ ML Models   │   │  APIs       │   │ Processing │   │  Storage   │
│             │   │             │   │            │   │            │
│ • LaMa      │   │ • Retouch   │   │ • Adjust   │   │ • Files    │
│ • EdgeSAM   │   │ • SmartFill │   │ • Filters  │   │ • Cache    │
│ • AutoEnhc. │   │ • Sticker   │   │            │   │            │
│ • Vosk      │   │   Harmonize │   │            │   │            │
└─────────────┘   └─────────────┘   └────────────┘   └────────────┘

Technology Stack

Core Technologies

Language and Framework

  • Kotlin 2.0.21
  • Android SDK (Min SDK 24, Target SDK 34, Compile SDK 36)
  • Jetpack Compose (Material3)

Build System

  • Gradle 8.13.1 with Kotlin DSL
  • Android Gradle Plugin 8.13.1

Architecture Components

  • Lifecycle ViewModel Compose 2.7.0
  • Kotlin Coroutines with Dispatchers
  • StateFlow for reactive state management

Networking

  • Retrofit 2.11.0 (REST API communication)
  • OkHttp 4.12.0 (HTTP client with logging interceptor)
  • Gson Converter 2.11.0 (JSON serialization)

Machine Learning

  • ONNX Runtime Android 1.17.0 (On-device inference)
  • Vosk Android 0.3.32 (Offline speech recognition)
  • JNA 5.13.0 (Java Native Access)

Image Processing

  • Coil 2.5.0 (Async image loading)
  • GPUImage 2.1.0 (GPU-accelerated filters)
  • ExifInterface 1.3.7 (Image metadata handling)

UI and Permissions

  • Material Icons Extended
  • Accompanist Permissions 0.32.0

On-Device AI Models

The application uses multiple ONNX format neural network models that run entirely on the device without requiring internet connectivity. These models are stored in the assets folder and loaded at runtime.

Model Details

1. LaMa (Image Inpainting)

File: lama_fp32.onnx

Purpose: Advanced image inpainting for object removal and content-aware fill

Architecture:

Input: image (1x3x512x512) + mask (1x1x512x512)
    │
    ▼
┌──────────────────────┐
│  Fast Fourier Conv   │
│  Encoder-Decoder     │
└──────────┬───────────┘
           ▼
Output: inpainted_image (1x3x512x512)

Technical Specifications:

  • Input Image Shape: [1, 3, 512, 512]
  • Input Mask Shape: [1, 1, 512, 512]
  • Output Shape: [1, 3, 512, 512]
  • Precision: FP32
  • Acceleration: NNAPI hardware acceleration when available (Android 8.1+)

Processing Pipeline:

Original Image → Resize to 512x512 → Normalize to [0,1]
                                            │
Mask Image → Resize to 512x512 → Dilate (10px) → Binary threshold
                                            │
                    ┌───────────────────────┴─────────────────┐
                    ▼                                         ▼
            Image Tensor (CHW format)               Mask Tensor
                    │                                         │
                    └──────────────┬──────────────────────────┘
                                   ▼
                          LaMa ONNX Inference
                                   │
                                   ▼
                          Inpainted Result
                                   │
                                   ▼
                    Denormalize → Resize to original size
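
A minimal Kotlin sketch of the inference step in this pipeline, using the ONNX Runtime Java API and assuming the tensor names and shapes listed above (pre/post-processing and error handling omitted):

// Hedged sketch: LaMa inference via ONNX Runtime, not the app's exact code.
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import java.nio.FloatBuffer

fun runLama(imageChw: FloatArray, maskChw: FloatArray, modelBytes: ByteArray): FloatArray {
    val env = OrtEnvironment.getEnvironment()
    env.createSession(modelBytes).use { session ->
        val image = OnnxTensor.createTensor(env, FloatBuffer.wrap(imageChw), longArrayOf(1, 3, 512, 512))
        val mask = OnnxTensor.createTensor(env, FloatBuffer.wrap(maskChw), longArrayOf(1, 1, 512, 512))
        session.run(mapOf("image" to image, "mask" to mask)).use { results ->
            @Suppress("UNCHECKED_CAST")
            val out = results[0].value as Array<Array<Array<FloatArray>>>  // [1][3][512][512]
            return out[0].flatMap { channel -> channel.flatMap { row -> row.asList() } }.toFloatArray()
        }
    }
}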

Use Cases:

  • Removing objects from pictures
  • Cleaning up unwanted elements
  • Manual mask-based inpainting
  • Background object removal

2. EdgeSAM (Interactive Segmentation)

Files: sam_encoder.onnx and sam_decoder.onnx

Purpose: Interactive image segmentation with point-based selection

Two-Stage Architecture:

Stage 1: Encoder (Heavy, Run Once)
──────────────────────────────────
Input Image (1x3x1024x1024)
        │
        ▼
┌───────────────────┐
│  Vision Transform │
│  Encoder          │
└────────┬──────────┘
         ▼
Image Embeddings (1x256x64x64)
         │
         └──────► Cache for reuse

- User taps a point or draws a bounding box

Stage 2: Decoder (Lightweight, Interactive)
───────────────────────────────────────────
Cached Embeddings + Point Coords + Labels
                │
                ▼
    ┌───────────────────────┐
    │  Mask Decoder         │
    │  + Prompt Encoder     │
    └───────────┬───────────┘
                ▼
        Segmentation Masks
    (1x1x256x256) × 4 variants
                │
                ▼
        IoU Scores (1x4)

Technical Specifications:

Encoder:

  • Input Shape: [1, 3, 1024, 1024]
  • Output Shape: [1, 256, 64, 64]
  • Execution: Once per image

Decoder:

  • Inputs:
    • Image Embeddings: [1, 256, 64, 64]
    • Point Coordinates: [1, N, 2]
    • Point Labels: [1, N] (1=foreground, 0=background, -1=padding)
    • Mask Input: [1, 1, 256, 256] (optional)
    • Has Mask Input: [1] (boolean)
    • Original Image Size: [2] (height, width)
  • Outputs:
    • Masks: [1, 4, 256, 256]
    • IoU Predictions: [1, 4]

Processing Flow:

User loads image
    │
    ▼
Run Encoder (slow, ~1-2 seconds)
    │
    ▼
Cache embeddings in memory
    │
    ▼
User taps on object ◄─────┐
    │                     │
    ▼                     │
Transform tap coords      │
to model space (1024x1024)│
    │                     │
    ▼                     │
Run Decoder (fast, <100ms)│
    │                     │
    ▼                     │
Select best mask by IoU   │
    │                     │
    ▼                     │
Resize to original size   │
    │                     │
    ▼                     │
Display segmentation      │
    │                     │
    └─────────────────────┘
    (User can tap again)
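
A small Kotlin sketch of two steps in this loop, assuming the 1024x1024 model space and the four-mask IoU output described above (names are illustrative):

// Map a tap from original image coordinates into the 1024x1024 model space.
fun toModelSpace(x: Float, y: Float, imageWidth: Int, imageHeight: Int): Pair<Float, Float> {
    val scale = 1024f / maxOf(imageWidth, imageHeight)  // longest side scaled to 1024
    return Pair(x * scale, y * scale)
}

// Pick the best of the decoder's four mask variants by predicted IoU.
fun bestMaskIndex(iouPredictions: FloatArray): Int =
    iouPredictions.indices.maxByOrNull { iouPredictions[it] } ?: 0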

Use Cases:

  • Background removal with tap selection
  • Object isolation
  • Quick mask generation
  • Interactive segmentation

3. Auto Enhance (Image Analysis)

Files: analyzer_8param_v2.onnx and hdrnet_fixer_safe.onnx

Purpose: Automatic image quality analysis and parameter extraction

Dual-Model System:

Model 1: Analyzer (Parameter Extraction)
────────────────────────────────────────
Input Image (1x3x256x256)
        │
        ▼
┌──────────────────────┐
│  MobileViT Backbone  │
│  + Analysis Head     │
└──────────┬───────────┘
           │
           ├──► Edit Parameters (1x8)
           │    [exposure, contrast, saturation,
           │     brightness, highlights, shadows,
           │     temperature, sharpness]
           │
           └──► Rationale Logits (1x4)
                [underexposed, overexposed,
                 unsaturated, good]


Model 2: Fixer (Parameter Application)
───────────────────────────────────────
Original Image + Parameters
        │
        ▼
┌──────────────────────┐
│  HDRNet Architecture │
│  Bilateral Grid      │
└──────────┬───────────┘
           ▼
    Enhanced Image

Technical Specifications:

Analyzer:

  • Input Shape: [1, 3, 256, 256]
  • Outputs:
    • Parameters: [1, 8] float values
    • Rationale: [1, 4] classification logits
  • Parameter Ranges: Typically [-1, 1] or [0, 2]

Parameter Mapping:

Index 0: Exposure      → Exposure adjustment
Index 1: Contrast      → Local contrast
Index 2: Saturation    → Color intensity
Index 3: Brightness    → Overall brightness
Index 4: Highlights    → Bright region control
Index 5: Shadows       → Dark region control
Index 6: Temperature   → White balance (cool/warm)
Index 7: Sharpness     → Edge enhancement

Analysis Workflow:

Input Image
    │
    ▼
Resize to 256x256
    │
    ▼
Normalize to [0,1]
    │
    ▼
Run Analyzer Model
    │
    ├──► Extract 8 parameters
    │    │
    │    ▼
    │    Map to adjustment sliders
    │    │
    │    ▼
    │    Apply via GPU filters
    │
    └──► Parse rationale
         │
         ▼
         Display diagnosis
         [Softmax classification]
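
The rationale step could be decoded roughly as follows; a sketch assuming the four classes listed in the analyzer outputs above:

import kotlin.math.exp

// Convert rationale logits into probabilities and pick the diagnosis label.
fun diagnose(logits: FloatArray): String {
    val labels = listOf("underexposed", "overexposed", "unsaturated", "good")
    val maxLogit = logits.maxOrNull() ?: 0f
    val exps = logits.map { exp((it - maxLogit).toDouble()) }  // stabilized softmax
    val probs = exps.map { it / exps.sum() }
    val best = probs.indices.maxByOrNull { probs[it] } ?: 0
    return labels[best]
}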

Use Cases:

  • Automatic image enhancement suggestions
  • Quality analysis
  • Parameter extraction for manual adjustment

4. Vosk Speech Recognition (Voice Commands)

Directory: vosk-model-small-en-us-0.15/

Purpose: Offline voice command recognition for hands-free editing

Model Type: Speech recognition model (Kaldi-based)

Architecture Overview:

Audio Input (16kHz PCM)
        │
        ▼
┌─────────────────────┐
│  Audio Preprocessing│
│  • Framing          │
│  • Feature Extract  │
│  • MFCC/Fbank       │
└─────────┬───────────┘
          ▼
┌─────────────────────┐
│  Acoustic Model     │
│  (DNN/TDNN)         │
└─────────┬───────────┘
          ▼
┌─────────────────────┐
│  Language Model     │
│  (N-gram/RNNLM)     │
└─────────┬───────────┘
          ▼
  Transcribed Text

Technical Specifications:

  • Sample Rate: 16,000 Hz
  • Model Size: Small (~41 MB)
  • Language: English (US)
  • Latency: Real-time streaming
  • Output: JSON with partial and final results

Recognition Flow:

User presses mic button
        │
        ▼
Request RECORD_AUDIO permission
        │
        ▼
Initialize Vosk Model (if first time)
        │
        ▼
Start SpeechService
        │
        ▼
Audio Stream ──┐
               │
    ┌──────────▼─────────────┐
    │ Continuous Recognition │
    │ • Partial results      │
    │ • Final results        │
    │ • Auto-stop (5s)       │
    └──────────┬─────────────┘
               │
               ▼
    Update chat interface
               │
               ▼
    Send to API or execute command
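
A hedged sketch of this flow using Vosk's Android bindings (the model path and listener wiring are left to the caller):

import org.vosk.Model
import org.vosk.Recognizer
import org.vosk.android.RecognitionListener
import org.vosk.android.SpeechService

// Start streaming recognition; partial and final results arrive via the listener as JSON.
fun startVoiceCommands(modelPath: String, listener: RecognitionListener): SpeechService {
    val model = Model(modelPath)                  // unpacked vosk-model-small-en-us-0.15
    val recognizer = Recognizer(model, 16000.0f)  // 16 kHz, matching the spec above
    val speechService = SpeechService(recognizer, 16000.0f)
    speechService.startListening(listener)
    return speechService
}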

Use Cases:

  • Voice-activated controls (e.g., "make it brighter")
  • Instruction-based retouching
  • Hands-free operation
  • Accessibility features

Model Loading and Optimization

Initialization Strategy:

Application Start
        │
        ▼
MainActivity.onCreate()
        │
        ▼
User selects feature
        │
        ▼
Lazy initialization of required model
        │
        ├──► Copy from assets to cache (if needed)
        │
        ├──► Check device capabilities
        │    • NNAPI availability
        │    • GPU acceleration
        │    • Memory constraints
        │
        ├──► Configure OrtSession.SessionOptions
        │    • OptLevel.ALL_OPT
        │    • Add NNAPI provider (if available)
        │
        └──► Create OrtSession
             │
             └──► Model ready for inference
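
A sketch of the session configuration step above, with the NNAPI provider hedged behind a try/catch; this mirrors the described strategy, not the app's exact code:

import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession

fun createSession(modelBytes: ByteArray): OrtSession {
    val env = OrtEnvironment.getEnvironment()
    val options = OrtSession.SessionOptions().apply {
        setOptimizationLevel(OrtSession.SessionOptions.OptLevel.ALL_OPT)
        try {
            addNnapi()  // hardware acceleration where supported
        } catch (e: Exception) {
            // NNAPI unavailable on this device; CPU execution is used instead
        }
    }
    return env.createSession(modelBytes, options)
}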

Memory Management:

┌──────────────────────────────────┐
│  Model Lifecycle                 │
├──────────────────────────────────┤
│                                  │
│  Feature Activated               │
│         │                        │
│         ▼                        │
│  Load model to memory            │
│         │                        │
│         ▼                        │
│  Keep in memory during use       │
│         │                        │
│         ▼                        │
│  User exits feature              │
│         │                        │
│         ▼                        │
│  Release tensors                 │
│         │                        │
│         ▼                        │
│  Session persists (reusable)     │
│         │                        │
│         ▼                        │
│  App background/destroy          │
│         │                        │
│         ▼                        │
│  Full cleanup                    │
│                                  │
└──────────────────────────────────┘

Application Features

Core Editing Features


1. GPU-Powered Image Adjustments

  • Fine-tune exposure, brightness, and contrast
  • Control highlights and shadows
  • Adjust saturation, vibrance, and hue
  • Set temperature/white balance
  • Enhance sharpness and texture
  • Smooth, real-time slider preview
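
These sliders map to GPU filters. A minimal sketch with the GPUImage library from the technology stack (the filter choices and values are illustrative, not the app's actual mapping):

import android.content.Context
import android.graphics.Bitmap
import jp.co.cyberagent.android.gpuimage.GPUImage
import jp.co.cyberagent.android.gpuimage.filter.GPUImageBrightnessFilter
import jp.co.cyberagent.android.gpuimage.filter.GPUImageContrastFilter
import jp.co.cyberagent.android.gpuimage.filter.GPUImageFilterGroup

fun applyAdjustments(context: Context, source: Bitmap): Bitmap {
    val gpuImage = GPUImage(context)
    gpuImage.setImage(source)
    gpuImage.setFilter(GPUImageFilterGroup(listOf(
        GPUImageBrightnessFilter(0.1f),  // range -1.0..1.0
        GPUImageContrastFilter(1.2f)     // range 0.0..4.0, 1.0 = neutral
    )))
    return gpuImage.bitmapWithFilterApplied
}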

2. AI Background Removal

  • One-tap subject selection using edgeSAM
  • Automatic background cutout
  • Manual brush-based refinement
  • Instant background replacement
  • Smart edge clean-up

3. Object Removal & Inpainting

  • Tap to remove unwanted objects
  • LaMa-powered content-aware fill
  • Manual mask painting
  • Multi-object removal
  • Auto-dilation for smooth blending

4. AI Retouch with Natural Language

  • Edit using natural instructions (“make it warmer”)
  • Reference image style transfer
  • Offline voice commands
  • Parameter extraction & visualization
  • Smart filtering of relevant edits

5. Generative Fill (Cloud Enhanced)

  • Fill or extend images using custom prompts
  • Adjustable vibe/style strength
  • AI-powered background replacement
  • Automatic lighting/color harmonization
  • AI-generated content watermark

6. Move Object

  • Select objects with edgeSAM
  • Drag to reposition anywhere
  • Automatic inpainting of original area
  • Harmonized shadows + lighting
  • Optional manual mask refinement

7. Crop & Transform

  • Popular aspect ratios (1:1, 4:3, 16:9, etc.)
  • Freeform cropping
  • Corner-based perspective correction
  • Rotation with angle display
  • Grid overlay for precision

8. Text & Stickers

  • Add customizable text
  • Multiple font options
  • Color, opacity, and size control
  • Rotate, scale, and reposition
  • Shadow effects

9. Watermarking Tools

  • Add visible watermarks
  • Hidden LSB steganographic watermarking (see the sketch after this list)
  • Optional “Edited by AI” stamps
  • Full font & size customization
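
The LSB technique hides payload bits in the least-significant bit of pixel channels. A generic illustration of the idea (not the app's actual implementation):

// Embed one payload bit into the least-significant bit of the blue channel (ARGB pixel).
fun embedBit(pixel: Int, bit: Int): Int {
    val blue = (pixel and 0xFE) or (bit and 1)     // replace lowest blue bit
    return (pixel and 0xFFFFFF00.toInt()) or blue
}

// Recover the hidden bit from a pixel.
fun extractBit(pixel: Int): Int = pixel and 1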

10. Selective Adjustments

  • Brush-based local adjustments
  • Paint to select regions
  • Adjustable brush size
  • Preview before applying

Cloud-Based Features

1. Smart Fill API

  • Endpoint for generative inpainting
  • Prompt-based content generation
  • Configurable strength parameters

2. Harmonization API

  • Blends moved objects naturally
  • Matches lighting and color

3. AI Retouch API

  • Reference-based style transfer
  • Instruction-based parameter extraction
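
A hypothetical Retrofit binding for an endpoint of this shape; the real service interfaces live in data/api/, and the names and route here are illustrative:

import okhttp3.MultipartBody
import okhttp3.RequestBody
import okhttp3.ResponseBody
import retrofit2.http.Multipart
import retrofit2.http.POST
import retrofit2.http.Part

interface SmartFillApi {
    @Multipart
    @POST("smart-fill")                       // hypothetical route
    suspend fun smartFill(
        @Part image: MultipartBody.Part,      // the source image
        @Part("prompt") prompt: RequestBody,  // the generation prompt
    ): ResponseBody                           // the edited image bytes
}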

Workflow and Data Flow

Main Application Flow

App Launch
    │
    ▼
MainActivity.onCreate()
    │
    ├──► Initialize Compose UI
    │
    └──► Create PhotoEditorViewModel
         │
         ▼
PhotoEditorScreen (Idle State)
         │
         │ User Action
         ▼
┌────────────────────────┐
│  Image Selection       │
│  • Gallery Picker      │
│  • Camera Capture      │
└────────┬───────────────┘
         │
         ▼
Load Image → Update ViewModel State
         │
         ▼
┌────────────────────────────────────┐
│  Main Editor View                  │
│                                    │
│  ┌──────────────┐  ┌────────────┐  │
│  │  Image Canvas│  │Side Panel  │  │
│  │              │  │            │  │
│  │  • Display   │  │• Adjust    │  │
│  │  • Interact  │  │• AI Tools  │  │
│  │  • Transform │  │• Effects   │  │
│  └──────────────┘  └────────────┘  │
│                                    │
│  ┌──────────────────────────────┐  │
│  │  Top App Bar                 │  │
│  │  • Undo/Redo                 │  │
│  │  • Save/Export               │  │
│  └──────────────────────────────┘  │
└────────────────────────────────────┘

Feature Activation Flow

User selects feature from Side Panel
                │
    ┌───────────┴────────────┬─────────────────┐
    ▼                        ▼                 ▼
Adjust Mode          AI Tool Mode       Transform Mode
    │                        │                 │
    ▼                        ▼                 ▼
Load AdjustScreen    Initialize Model    Enter Crop/Rotate
    │                        │                 │
    ▼                        ▼                 ▼
Display sliders      Wait for user input  Interactive overlay
    │                        │                 │
    ▼                        ▼                 ▼
Real-time preview    Process inference    Preview transform
    │                        │                 │
    ▼                        ▼                 ▼
Apply on confirm     Update image        Apply on confirm

Image Processing Pipeline

Original Image URI
        │
        ▼
Load to Bitmap
        │
        ▼
┌───────────────────────────────────┐
│  Processing Selection             │
├───────────────────────────────────┤
│                                   │
│  ┌─────────────────────────────┐  │
│  │ On-Device Processing        │  │
│  │                             │  │
│  │ • Adjustments (GPU)         │  │
│  │ • LaMa Inpainting           │  │
│  │ • edgeSAM Segmentation      │  │
│  │ • Auto Enhance              │  │
│  └──────────┬──────────────────┘  │
│             │                     │
│             ▼                     │
│      Process locally              │
│             │                     │
│             ▼                     │
│      Result Bitmap                │
│                                   │
│  ┌─────────────────────────────┐  │
│  │ Cloud Processing            │  │
│  │                             │  │
│  │ • Smart Fill                │  │
│  │ • Harmonization             │  │
│  │ • AI Retouch                │  │
│  └──────────┬──────────────────┘  │
│             │                     │
│             ▼                     │
│      Upload via Retrofit          │
│             │                     │
│             ▼                     │
│      Wait for response            │
│             │                     │
│             ▼                     │
│      Download result              │
└───────────┬─────────────────────┬─┘
            │                     │
            ▼                     ▼
    Save to cache           Update UI
            │                     │
            ▼                     ▼
    Generate URI          Display result

State Management Flow

User Interaction
        │
        ▼
Event Handler in Composable
        │
        ▼
Call ViewModel method
        │
        ▼
┌────────────────────────────────┐
│  PhotoEditorViewModel          │
│                                │
│  1. Validate action            │
│  2. Update state variables     │
│  3. Trigger coroutine          │
│  4. Launch processing          │
│  5. Handle result              │
│  6. Update state again         │
└────────┬───────────────────────┘
         │
         ▼
State changes trigger recomposition
         │
         ▼
UI updates automatically
         │
         ▼
User sees result
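
In code, the pattern looks roughly like this; a sketch with hypothetical state fields, not the actual PhotoEditorViewModel:

import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.flow.update
import kotlinx.coroutines.launch

data class EditorUiState(val brightness: Float = 0f, val isProcessing: Boolean = false)

class EditorViewModel : ViewModel() {
    private val _uiState = MutableStateFlow(EditorUiState())
    val uiState: StateFlow<EditorUiState> = _uiState.asStateFlow()  // observed by Compose

    fun onBrightnessChanged(value: Float) {
        _uiState.update { it.copy(brightness = value, isProcessing = true) }
        viewModelScope.launch(Dispatchers.Default) {
            // heavy processing would happen here ...
            _uiState.update { it.copy(isProcessing = false) }  // state change triggers recomposition
        }
    }
}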

Undo/Redo System

Initial State: currentImageUri = null, history = []
        │
        ▼
User loads image → history = [uri1]
        │
        ▼
User applies adjustment → history = [uri1, uri2]
        │
        ▼
User applies crop → history = [uri1, uri2, uri3]
        │
        ▼
User clicks Undo
        │
        ▼
Pop from history → Display uri2
        │
        ▼
User clicks Undo again
        │
        ▼
Pop from history → Display uri1
        │
        ▼
User applies new edit
        │
        ▼
Clear forward history → history = [uri1, uri4]
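
A compact sketch of this history behavior, with URIs standing in for edit snapshots (it mirrors the diagram, not necessarily the app's exact data structure):

class EditHistory {
    private val history = ArrayDeque<String>()    // uri1, uri2, ...
    private val redoStack = ArrayDeque<String>()

    fun push(uri: String) {
        history.addLast(uri)
        redoStack.clear()                          // a new edit clears forward history
    }

    fun undo(): String? {
        if (history.size <= 1) return null         // keep the original image
        redoStack.addLast(history.removeLast())
        return history.last()                      // image now displayed
    }

    fun redo(): String? {
        val uri = redoStack.removeLastOrNull() ?: return null
        history.addLast(uri)
        return uri
    }
}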

Project Structure

Kortex/
│
├── app/
│   ├── build.gradle.kts           Build configuration
│   ├── proguard-rules.pro         Code obfuscation rules
│   │
│   └── src/
│       ├── main/
│       │   ├── AndroidManifest.xml
│       │   │
│       │   ├── java/test1/example/finalapp/
│       │   │   │
│       │   │   ├── MainActivity.kt                Main entry point
│       │   │   ├── PhotoEditorViewModel.kt        Central state manager
│       │   │   ├── LamaExecutor.kt                LaMa model executor
│       │   │   ├── ZimExecutor.kt                 edgeSAM model executor
│       │   │   │
│       │   │   ├── View/
│       │   │   │   ├── screens/
│       │   │   │   │   └── PhotoEditorScreen.kt   Main editor UI
│       │   │   │   │
│       │   │   │   └── components/
│       │   │   │       ├── BackgroundRemovalScreen.kt
│       │   │   │       ├── MoveObjectPlacementScreen.kt
│       │   │   │       ├── ManualMaskPainter.kt
│       │   │   │       ├── CropComponents.kt
│       │   │   │       ├── AspectRatioCropEngine.kt
│       │   │   │       ├── CornerCropEngine.kt
│       │   │   │       ├── RotationEngine.kt
│       │   │   │       ├── WatermarkComponents.kt
│       │   │   │       ├── LSBWatermarkScreen.kt
│       │   │   │       ├── TextSticker.kt
│       │   │   │       ├── TextEditingScreen.kt
│       │   │   │       ├── SidePanel.kt
│       │   │   │       ├── TopAppBar.kt
│       │   │   │       └── GenerativeFillDialog.kt
│       │   │   │
│       │   │   ├── ui/
│       │   │   │   ├── theme/
│       │   │   │   │   ├── Color.kt
│       │   │   │   │   ├── Theme.kt
│       │   │   │   │   ├── Type.kt
│       │   │   │   │   └── AppTheme.kt
│       │   │   │   │
│       │   │   │   └── adjust/
│       │   │   │       ├── AdjustSheetComposable.kt
│       │   │   │       └── AdjustIntegrationExample.kt
│       │   │   │
│       │   │   ├── component/
│       │   │   │   └── adjust/
│       │   │   │       ├── AdjustEngine.kt         GPU filter engine
│       │   │   │       ├── AdjustViewModel.kt
│       │   │   │       ├── AdjustParams.kt
│       │   │   │       ├── PreviewRenderer.kt
│       │   │   │       ├── CurveEditorView.kt
│       │   │   │       └── ToneCurveData.kt
│       │   │   │
│       │   │   ├── ml/
│       │   │   │   └── AutoEnhanceExecutor.kt     Auto enhance model
│       │   │   │
│       │   │   ├── audio/
│       │   │   │   └── OfflineSpeechManager.kt    Vosk integration
│       │   │   │
│       │   │   ├── data/
│       │   │   │   ├── api/
│       │   │   │   │   ├── RetrofitClient.kt
│       │   │   │   │   ├── RetouchApiService.kt
│       │   │   │   │   ├── CloudEditNetwork.kt
│       │   │   │   │   └── CloudEditApiService.kt
│       │   │   │   │
│       │   │   │   ├── repository/
│       │   │   │   │   ├── RetouchRepository.kt
│       │   │   │   │   └── CloudEditRepository.kt
│       │   │   │   │
│       │   │   │   └── model/
│       │   │   │       ├── RetouchModels.kt
│       │   │   │       └── AdjustState.kt
│       │   │   │
│       │   │   ├── utils/
│       │   │   │   ├── ImageUtils.kt            Image operations
│       │   │   │   ├── AIWatermarkHelper.kt     Watermarking
│       │   │   │   ├── LSBWatermarkUtil.kt      Steganography
│       │   │   │   ├── RetouchUtils.kt
│       │   │   │   └── FontManager.kt
│       │   │   │
│       │   │   └── model/
│       │   │       └── AiEditState.kt
│       │   │
│       │   ├── assets/
│       │   │   ├── lama_fp32.onnx                LaMa model
│       │   │   ├── sam_encoder.onnx              edgeSAM encoder
│       │   │   ├── sam_decoder.onnx              edgeSAM decoder
│       │   │   ├── analyzer_8param_v2.onnx       Auto enhance analyzer
│       │   │   ├── hdrnet_fixer_safe.onnx        Auto enhance fixer
│       │   │   ├── vosk-model-small-en-us-0.15/  Speech model
│       │   │   └── fonts/                        Custom fonts
│       │   │
│       │   └── res/
│       │       ├── values/
│       │       ├── drawable/
│       │       ├── mipmap/
│       │       └── xml/
│       │
│       ├── androidTest/                          Instrumented tests
│       └── test/                                 Unit tests
│
├── gradle/
│   ├── libs.versions.toml                        Dependency catalog
│   └── wrapper/
│       ├── gradle-wrapper.jar
│       └── gradle-wrapper.properties
│
├── build.gradle.kts                              Root build script
├── settings.gradle.kts                           Project settings
├── gradle.properties                             Gradle configuration
├── gradlew                                       Gradle wrapper (Unix)
├── gradlew.bat                                   Gradle wrapper (Windows)
└── local.properties                              Local SDK path

Setup and Installation

Prerequisites

Before setting up the project, ensure you have the following installed:

1. Development Environment

  • Android Studio Iguana (2023.2.1) or newer
  • JDK 17 or higher (required by AGP 8.x)
  • Minimum 8 GB RAM (16 GB recommended)
  • At least 10 GB free disk space

2. Android SDK

  • Android SDK Platform 34
  • Android SDK Build-Tools 34.0.0 or higher
  • Android SDK Platform-Tools
  • Android Emulator (if testing on emulator)

3. Git

  • Git version control system

Extract the Project

Extract the contents of the zip file and locate the Kortex directory inside.

Configure Local Environment

1. Create local.properties file

Create a file named local.properties in the root directory with your Android SDK path:

sdk.dir=C:\\Users\\YourUsername\\AppData\\Local\\Android\\Sdk

On macOS/Linux:

sdk.dir=/Users/YourUsername/Library/Android/sdk

2. Verify Model Files

Ensure all ONNX model files are present in app/src/main/assets/:

  • lama_fp32.onnx
  • sam_encoder.onnx
  • sam_decoder.onnx
  • analyzer_8param_v2.onnx
  • hdrnet_fixer_safe.onnx
  • vosk-model-small-en-us-0.15/ (directory with model files)

3. Configure API Endpoints (Optional)

If using cloud features, update the API base URLs in:

  • app/src/main/java/test1/example/finalapp/data/api/RetrofitClient.kt
  • app/src/main/java/test1/example/finalapp/data/api/CloudEditNetwork.kt

Gradle Sync

Open the project in Android Studio and let Gradle sync automatically. If it doesn't start:

  1. Click "File" > "Sync Project with Gradle Files"
  2. Wait for dependencies to download
  3. Resolve any errors that appear

Dependency Installation

All dependencies are managed through Gradle and will be downloaded automatically during sync. Key dependencies include:

  • Jetpack Compose libraries
  • ONNX Runtime Android
  • Retrofit and OkHttp
  • GPUImage
  • Vosk Android
  • Coil image loader

Building and Running

Build Variants

The project supports two build variants:

Debug Build

  • Includes debugging symbols
  • Logging enabled
  • No code obfuscation
  • Faster build times

Release Build

  • Code optimization enabled
  • ProGuard rules applied
  • Smaller APK size
  • Production-ready

Running on Emulator

1. Create/Start Emulator

In Android Studio:

  • Tools > Device Manager
  • Create a new virtual device or start existing one
  • Recommended: Pixel 5 with API 34 (Android 14)
  • Set Graphics to "Hardware" (for GPU acceleration)

2. Run the App

Click the "Run" button (green play icon)
or
Select Run > Run 'app'
or
Press Shift+F10 (Windows/Linux) or Control+R (macOS)

Running on Physical Device

1. Enable Developer Options

On your Android device:

  • Go to Settings > About Phone
  • Tap "Build Number" 7 times
  • Go back to Settings > System > Developer Options
  • Enable "USB Debugging"

2. Connect Device

  • Connect device via USB
  • Accept USB debugging prompt on device
  • Device should appear in Android Studio device dropdown

3. Run the App

Select your device from the dropdown and click Run

Building APK

Debug APK

./gradlew assembleDebug

Output location: app/build/outputs/apk/debug/app-debug.apk

Release APK

./gradlew assembleRelease

Output location: app/build/outputs/apk/release/app-release.apk

Note: Release builds require a signing configuration.

Building from Command Line

Windows

gradlew.bat clean
gradlew.bat assembleDebug

Troubleshooting Build Issues

Out of Memory Error

Increase heap size in gradle.properties:

org.gradle.jvmargs=-Xmx4096m -Dfile.encoding=UTF-8

Duplicate Class Errors

The project already handles libc++_shared.so conflicts in build.gradle.kts with a pickFirsts configuration.

Model Loading Failures

Ensure all ONNX model files are present in the assets folder and are stored uncompressed. With the Kotlin DSL on recent AGP versions, this is configured in build.gradle.kts:

android {
    androidResources {
        noCompress += "onnx"
    }
}

NNAPI Errors

Some devices may not support NNAPI. The code gracefully falls back to CPU execution.

System Requirements

Minimum Requirements

Device

  • Android 7.0 (API 24) or higher
  • 3 GB RAM
  • 1 GB free storage
  • ARMv7 or ARM64 processor

Permissions

  • READ_EXTERNAL_STORAGE
  • WRITE_EXTERNAL_STORAGE (Android 9 and below)
  • READ_MEDIA_IMAGES (Android 13+)
  • RECORD_AUDIO (for voice commands)
  • CAMERA (for camera capture)
  • INTERNET (for cloud features)

Recommended Requirements

Device

  • Android 12.0 (API 31) or higher
  • 6 GB RAM or more
  • 2 GB free storage
  • ARM64 processor
  • GPU with OpenGL ES 3.0 or higher
  • NNAPI support for hardware acceleration

Performance Notes

Model inference times (approximate, varies by device):

Budget Device (Snapdragon 600 series):
├── LaMa Inpainting: 3-5 seconds
├── edgeSAM Encoder: 2-3 seconds
├── edgeSAM Decoder: 200-300 ms
└── Auto Enhance: 1-2 seconds

Mid-range Device (Snapdragon 700 series):
├── LaMa Inpainting: 1-2 seconds
├── edgeSAM Encoder: 1-1.5 seconds
├── edgeSAM Decoder: 100-150 ms
└── Auto Enhance: 500-800 ms

Flagship Device (Snapdragon 8 series):
├── LaMa Inpainting: 0.5-1 second
├── edgeSAM Encoder: 0.5-0.8 seconds
├── edgeSAM Decoder: 50-80 ms
└── Auto Enhance: 200-400 ms

Storage Requirements

App Size

  • APK: Approximately 150-200 MB
  • ONNX Models: ~180 MB
  • Vosk Model: ~50 MB
  • Code and Resources: ~20 MB

Runtime Storage

  • Cache: 50-500 MB (varies with usage)
  • Temporary files: 100-1000 MB (high-resolution editing)
  • User images: Depends on usage

Network Requirements

Optional (for cloud features)

  • Stable internet connection
  • Recommended: 5 Mbps or higher
  • Cloud API access

Fully Functional Offline

  • All core editing features
  • On-device AI models
  • Voice recognition
  • No internet required for basic operation
