- Overview
- Installation from APK
- Project Architecture
- Technology Stack
- On-Device AI Models
- Application Features
- Workflow and Data Flow
- Project Structure
- Setup and Installation
- Building and Running
- System Requirements
Kortex is a sophisticated Android photo editing app that offers professional-level image editing by fusing cloud-based AI services with on-device machine learning models. The application leverages ONNX Runtime for efficient on-device inference, GPU acceleration for real-time adjustments, and offline speech recognition for voice-controlled editing.
The application is built with contemporary Android development techniques: Jetpack Compose for the UI layer, Kotlin Coroutines for asynchronous operations, and the MVVM (Model-View-ViewModel) architectural pattern for clear separation of concerns.
To try Kortex quickly, a pre-built release APK is included in this repository for easy installation.
Step 1: Enable Unknown Sources
Before installing the APK, you need to allow installation from unknown sources:
- Open Settings on your Android device
- Navigate to Security or Privacy (location varies by Android version)
- Find and enable Install unknown apps or Unknown sources
- Select your file manager or browser and allow installation from that source
Note: On Android 8.0 (Oreo) and above, you grant permission per app. On older versions, there's a global setting.
Step 2: Transfer the APK to Your Device
Choose one of these methods:
Method A: Direct Download (if shared online)
- Download the APK directly on your device from the shared link
- The APK will be saved to your Downloads folder
Method B: USB Transfer
- Connect your Android device to your computer via USB
- Enable File Transfer mode when prompted on your device
- Navigate to the project folder: Kortex-app/apk/
- Copy app-release.apk to your device's Downloads folder or internal storage
Method C: Cloud Transfer
- Upload app-release.apk from the Kortex-app/apk/ folder to Google Drive, Dropbox, etc.
- Download it on your Android device from the cloud service
Step 3: Install the APK
- Open your device's File Manager app
- Navigate to the folder where you saved the APK (usually Downloads)
- Tap on app-release.apk
- Review the permissions requested by the app:
- Storage (for saving/loading images)
- Camera (for taking photos)
- Microphone (for voice commands)
- Internet (for cloud AI features)
- Tap Install
- Wait for installation to complete (may take 30-60 seconds due to ML models)
- Tap Open to launch Kortex, or find it in your app drawer
Step 4: Grant Runtime Permissions
When you first use certain features, Android will ask for permissions:
- Storage/Photos: Required to edit images from your gallery
- Camera: Required to take new photos (optional)
- Microphone: Required for voice commands (optional)
- Internet: Required for cloud AI features (optional - app works offline)
The application follows the MVVM (Model-View-ViewModel) architecture pattern, combined with the repository pattern for data management:
┌─────────────────────────────────────────────────────────────┐
│ View Layer │
│ (Jetpack Compose UI Components) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │PhotoEditor │ │ Adjust │ │ Background │ │
│ │Screen │ │ Screen │ │ Removal │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└────────────┬────────────────────────────────────────────────┘
│
│ User Actions / State Observation
▼
┌─────────────────────────────────────────────────────────────┐
│ ViewModel Layer │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ PhotoEditorViewModel │ │
│ │ • Manages UI state │ │
│ │ • Handles user interactions │ │
│ │ • Coordinates with repositories and executors │ │
│ └──────────────────────────────────────────────────────┘ │
└────────────┬────────────────────────────────────────────────┘
│
│ Data Requests / Commands
▼
┌─────────────────────────────────────────────────────────────┐
│ Data/Model Layer │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Repositories │ │ ML Executors │ │ Utilities │ │
│ │ │ │ │ │ │ │
│ │ • Retouch │ │ • LaMa │ │ • Image │ │
│ │ • CloudEdit │ │ • SAM │ │ • Watermark │ │
│ │ │ │ • AutoEnhance│ │ • Font │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
User Input
│
▼
┌───────────────────┐
│ Compose UI │
│ Components │
└────────┬──────────┘
│ Events
▼
┌───────────────────────────┐
│ PhotoEditorViewModel │
│ • State Management │
│ • Business Logic │
└─────┬─────────────────────┘
│
├──────────────────┬──────────────────┬────────────────┐
▼ ▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌────────────┐ ┌────────────┐
│ On-Device │ │ Cloud │ │ GPU Image │ │ Local │
│ ML Models │ │ APIs │ │ Processing │ │ Storage │
│ │ │ │ │ │ │ │
│ • LaMa │ │ • Retouch │ │ • Adjust │ │ • Files │
│ • EdgeSAM │ │ • SmartFill │ │ • Filters │ │ • Cache │
│ • AutoEnhc. │ │ • Sticker │ │ │ │ │
│ • Vosk │ │ Harmonize │ │ │ │ │
└─────────────┘ └─────────────┘ └────────────┘ └────────────┘
Language and Framework
- Kotlin 2.0.21
- Android SDK (Min SDK 24, Target SDK 34, Compile SDK 36)
- Jetpack Compose (Material3)
Build System
- Gradle 8.13.1 with Kotlin DSL
- Android Gradle Plugin 8.13.1
Architecture Components
- Lifecycle ViewModel Compose 2.7.0
- Kotlin Coroutines with Dispatchers
- StateFlow for reactive state management
Networking
- Retrofit 2.11.0 (REST API communication)
- OkHttp 4.12.0 (HTTP client with logging interceptor)
- Gson Converter 2.11.0 (JSON serialization)
Machine Learning
- ONNX Runtime Android 1.17.0 (On-device inference)
- Vosk Android 0.3.32 (Offline speech recognition)
- JNA 5.13.0 (Java Native Access)
Image Processing
- Coil 2.5.0 (Async image loading)
- GPUImage 2.1.0 (GPU-accelerated filters)
- ExifInterface 1.3.7 (Image metadata handling)
UI and Permissions
- Material Icons Extended
- Accompanist Permissions 0.32.0
The application uses multiple ONNX format neural network models that run entirely on the device without requiring internet connectivity. These models are stored in the assets folder and loaded at runtime.
File: lama_fp32.onnx
Purpose: Advanced image inpainting for object removal and content-aware fill
Architecture:
Input: image (1x3x512x512) + mask (1x1x512x512)
│
▼
┌──────────────────────┐
│ Fast Fourier Conv │
│ Encoder-Decoder │
└──────────┬───────────┘
▼
Output: inpainted_image (1x3x512x512)
Technical Specifications:
- Input Image Shape: [1, 3, 512, 512]
- Input Mask Shape: [1, 1, 512, 512]
- Output Shape: [1, 3, 512, 512]
- Precision: FP32
- Acceleration: NNAPI hardware acceleration when available (Android 8.1+, API 27)
Processing Pipeline:
Original Image → Resize to 512x512 → Normalize to [0,1]
│
Mask Image → Resize to 512x512 → Dilate (10px) → Binary threshold
│
┌───────────────────────┴─────────────────┐
▼ ▼
Image Tensor (CHW format) Mask Tensor
│ │
└──────────────┬──────────────────────────┘
▼
LaMa ONNX Inference
│
▼
Inpainted Result
│
▼
Denormalize → Resize to original size
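For orientation, below is a minimal sketch of the inference step using the ONNX Runtime Java API. It assumes preprocessing has already produced normalized CHW float arrays, reuses the tensor names from the diagram above, and is not the app's actual LamaExecutor:

```kotlin
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.nio.FloatBuffer

// One LaMa inference pass. `imageChw` holds 1*3*512*512 floats in [0,1];
// `maskChw` holds 1*1*512*512 binary floats (after dilation/thresholding).
fun runLama(env: OrtEnvironment, session: OrtSession,
            imageChw: FloatArray, maskChw: FloatArray): FloatArray {
    val image = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(imageChw), longArrayOf(1, 3, 512, 512))
    val mask = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(maskChw), longArrayOf(1, 1, 512, 512))
    image.use { img ->
        mask.use { msk ->
            session.run(mapOf("image" to img, "mask" to msk)).use { results ->
                @Suppress("UNCHECKED_CAST")
                val out = results[0].value as Array<Array<Array<FloatArray>>>
                // Flatten [1,3,512,512] back to CHW floats for denormalization.
                return out[0].flatMap { ch -> ch.flatMap { row -> row.asList() } }
                    .toFloatArray()
            }
        }
    }
}
```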
Use Cases:
- Removing objects from pictures
- Cleaning up unwanted elements
- Manual mask-based inpainting
- Background object removal
Files: sam_encoder.onnx and sam_decoder.onnx
Purpose: Interactive image segmentation with point-based selection
Two-Stage Architecture:
Stage 1: Encoder (Heavy, Run Once)
──────────────────────────────────
Input Image (1x3x1024x1024)
│
▼
┌───────────────────┐
│ Vision Transform │
│ Encoder │
└────────┬──────────┘
▼
Image Embeddings (1x256x64x64)
│
└──────► Cache for reuse
The user then taps a point or draws a bounding box, and only the lightweight decoder needs to run:
Stage 2: Decoder (Lightweight, Interactive)
───────────────────────────────────────────
Cached Embeddings + Point Coords + Labels
│
▼
┌───────────────────────┐
│ Mask Decoder │
│ + Prompt Encoder │
└───────────┬───────────┘
▼
Segmentation Masks
(1x1x256x256) × 4 variants
│
▼
IoU Scores (1x4)
Technical Specifications:
Encoder:
- Input Shape: [1, 3, 1024, 1024]
- Output Shape: [1, 256, 64, 64]
- Execution: Once per image
Decoder:
- Inputs:
- Image Embeddings: [1, 256, 64, 64]
- Point Coordinates: [1, N, 2]
- Point Labels: [1, N] (1=foreground, 0=background, -1=padding)
- Mask Input: [1, 1, 256, 256] (optional)
- Has Mask Input: [1] (boolean)
- Original Image Size: [2] (height, width)
- Outputs:
- Masks: [1, 4, 256, 256]
- IoU Predictions: [1, 4]
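Given these shapes, a single-tap decoder call might look like the sketch below. The input tensor names follow the common SAM ONNX export convention and may differ from the actual names in sam_decoder.onnx:

```kotlin
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.nio.FloatBuffer

// Decodes one foreground tap against cached encoder embeddings.
// Coordinates must already be mapped into the 1024x1024 model space.
fun decodeTap(env: OrtEnvironment, decoder: OrtSession,
              embeddings: OnnxTensor, x: Float, y: Float,
              origH: Float, origW: Float): OrtSession.Result {
    // One real point plus one padding point (label -1), as SAM decoders expect.
    val coords = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(x, y, 0f, 0f)), longArrayOf(1, 2, 2))
    val labels = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(1f, -1f)), longArrayOf(1, 2))
    val emptyMask = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(FloatArray(256 * 256)), longArrayOf(1, 1, 256, 256))
    val hasMask = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(0f)), longArrayOf(1))
    val origSize = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(origH, origW)), longArrayOf(2))
    // Caller picks the variant with the highest IoU score among the 4 masks.
    return decoder.run(mapOf(
        "image_embeddings" to embeddings,
        "point_coords" to coords,
        "point_labels" to labels,
        "mask_input" to emptyMask,
        "has_mask_input" to hasMask,
        "orig_im_size" to origSize))
}
```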
Processing Flow:
User loads image
│
▼
Run Encoder (slow, ~1-2 seconds)
│
▼
Cache embeddings in memory
│
▼
User taps on object ◄─────┐
│ │
▼ │
Transform tap coords │
to model space (1024x1024)│
│ │
▼ │
Run Decoder (fast, <100ms)│
│ │
▼ │
Select best mask by IoU │
│ │
▼ │
Resize to original size │
│ │
▼ │
Display segmentation │
│ │
└─────────────────────┘
(User can tap again)
Use Cases:
- Background removal with tap selection
- Object isolation
- Quick mask generation
- Interactive segmentation
Files: analyzer_8param_v2.onnx and hdrnet_fixer_safe.onnx
Purpose: Automatic image quality analysis and parameter extraction
Dual-Model System:
Model 1: Analyzer (Parameter Extraction)
────────────────────────────────────────
Input Image (1x3x256x256)
│
▼
┌──────────────────────┐
│ MobileViT Backbone │
│ + Analysis Head │
└──────────┬───────────┘
│
├──► Edit Parameters (1x8)
│ [exposure, contrast, saturation,
│ brightness, highlights, shadows,
│ temperature, sharpness]
│
└──► Rationale Logits (1x4)
[underexposed, overexposed,
unsaturated, good]
Model 2: Fixer (Parameter Application)
───────────────────────────────────────
Original Image + Parameters
│
▼
┌──────────────────────┐
│ HDRNet Architecture │
│ Bilateral Grid │
└──────────┬───────────┘
▼
Enhanced Image
Technical Specifications:
Analyzer:
- Input Shape: [1, 3, 256, 256]
- Outputs:
- Parameters: [1, 8] float values
- Rationale: [1, 4] classification logits
- Parameter Ranges: Typically [-1, 1] or [0, 2]
Parameter Mapping:
Index 0: Exposure → Exposure adjustment
Index 1: Contrast → Local contrast
Index 2: Saturation → Color intensity
Index 3: Brightness → Overall Brightness
Index 4: Highlights → Bright region control
Index 5: Shadows → Dark region control
Index 6: Temperature → White balance (cool/warm)
Index 7: Sharpness → Edge enhancement
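A sketch of how these outputs might be consumed is shown below; the data class and field names are illustrative, not the app's actual AdjustParams:

```kotlin
import kotlin.math.exp

// Illustrative container mirroring the 8-parameter layout above.
data class EnhanceParams(
    val exposure: Float, val contrast: Float, val saturation: Float,
    val brightness: Float, val highlights: Float, val shadows: Float,
    val temperature: Float, val sharpness: Float
)

fun mapAnalyzerOutput(p: FloatArray): EnhanceParams {
    require(p.size == 8) { "Analyzer emits exactly 8 parameters" }
    return EnhanceParams(p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7])
}

// Softmax over the rationale logits:
// [underexposed, overexposed, unsaturated, good].
fun rationaleProbabilities(logits: FloatArray): FloatArray {
    val exps = logits.map { exp(it.toDouble()) }
    val sum = exps.sum()
    return exps.map { (it / sum).toFloat() }.toFloatArray()
}
```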
Analysis Workflow:
Input Image
│
▼
Resize to 256x256
│
▼
Normalize to [0,1]
│
▼
Run Analyzer Model
│
├──► Extract 8 parameters
│ │
│ ▼
│ Map to adjustment sliders
│ │
│ ▼
│ Apply via GPU filters
│
└──► Parse rationale
│
▼
Display diagnosis
[Softmax classification]
Use Cases:
- Automatic image enhancement suggestions
- Quality analysis
- Parameter extraction for manual adjustment
Directory: vosk-model-small-en-us-0.15/
Purpose: Offline voice-command recognition for hands-free editing
Model Type: Speech recognition model (Kaldi-based)
Architecture Overview:
Audio Input (16kHz PCM)
│
▼
┌─────────────────────┐
│ Audio Preprocessing│
│ • Framing │
│ • Feature Extract │
│ • MFCC/Fbank │
└─────────┬───────────┘
▼
┌─────────────────────┐
│ Acoustic Model │
│ (DNN/TDNN) │
└─────────┬───────────┘
▼
┌─────────────────────┐
│ Language Model │
│ (N-gram/RNNLM) │
└─────────┬───────────┘
▼
Transcribed Text
Technical Specifications:
- Sample Rate: 16,000 Hz
- Model Size: Small (~41 MB)
- Language: English (US)
- Latency: Real-time streaming
- Output: JSON with partial and final results
Recognition Flow:
User presses mic button
│
▼
Request RECORD_AUDIO permission
│
▼
Initialize Vosk Model (if first time)
│
▼
Start SpeechService
│
▼
Audio Stream ──┐
│
┌──────────▼─────────────┐
│ Continuous Recognition │
│ • Partial results │
│ • Final results │
│ • Auto-stop (5s) │
└──────────┬─────────────┘
│
▼
Update chat interface
│
▼
Send to API or execute command
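A condensed sketch of this loop with the Vosk Android API follows (the 5-second auto-stop timer and chat plumbing are omitted; results arrive as JSON strings containing partial/final text):

```kotlin
import org.vosk.Model
import org.vosk.Recognizer
import org.vosk.android.RecognitionListener
import org.vosk.android.SpeechService

// Assumes the model directory has been unpacked from assets to `modelPath`
// and RECORD_AUDIO has already been granted.
fun startVoiceCapture(modelPath: String, onText: (String) -> Unit): SpeechService {
    val recognizer = Recognizer(Model(modelPath), 16000.0f)
    val service = SpeechService(recognizer, 16000.0f)
    service.startListening(object : RecognitionListener {
        override fun onPartialResult(hypothesis: String) = onText(hypothesis)
        override fun onResult(hypothesis: String) = onText(hypothesis)
        override fun onFinalResult(hypothesis: String) = onText(hypothesis)
        override fun onError(e: Exception) { service.stop() }
        override fun onTimeout() { service.stop() }
    })
    return service
}
```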
Use Cases:
- Voice-activated commands (e.g., "make it brighter")
- Instruction-based retouching
- Hands-free operation
- Accessibility features
Initialization Strategy:
Application Start
│
▼
MainActivity.onCreate()
│
▼
User selects feature
│
▼
Lazy initialization of required model
│
├──► Copy from assets to cache (if needed)
│
├──► Check device capabilities
│ • NNAPI availability
│ • GPU acceleration
│ • Memory constraints
│
├──► Configure OrtSession.SessionOptions
│ • OptLevel.ALL_OPT
│ • Add NNAPI provider (if available)
│
└──► Create OrtSession
│
└──► Model ready for inference
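A compact sketch of that session setup with the ONNX Runtime Java API (error handling trimmed to the NNAPI fallback):

```kotlin
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtException
import ai.onnxruntime.OrtSession

// Full graph optimization, NNAPI when available, CPU otherwise.
fun createSession(env: OrtEnvironment, modelBytes: ByteArray): OrtSession {
    val options = OrtSession.SessionOptions().apply {
        setOptimizationLevel(OrtSession.SessionOptions.OptLevel.ALL_OPT)
        try {
            addNnapi() // NNAPI execution provider
        } catch (e: OrtException) {
            // No NNAPI on this device/runtime: stay on the CPU provider.
        }
    }
    return env.createSession(modelBytes, options)
}
```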
Memory Management:
┌──────────────────────────────────┐
│ Model Lifecycle │
├──────────────────────────────────┤
│ │
│ Feature Activated │
│ │ │
│ ▼ │
│ Load model to memory │
│ │ │
│ ▼ │
│ Keep in memory during use │
│ │ │
│ ▼ │
│ User exits feature │
│ │ │
│ ▼ │
│ Release tensors │
│ │ │
│ ▼ │
│ Session persists (reusable) │
│ │ │
│ ▼ │
│ App background/destroy │
│ │ │
│ ▼ │
│ Full cleanup │
│ │
└──────────────────────────────────┘
Adjustments
- Fine-tune exposure, brightness, and contrast
- Control highlights and shadows
- Adjust saturation, vibrance, and hue
- Set temperature/white balance
- Enhance sharpness and texture
- Smooth, real-time slider preview
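For a feel of how such slider values could drive GPU filters, here is an illustrative GPUImage sketch; the app's real AdjustEngine likely differs in structure and parameter ranges:

```kotlin
import android.content.Context
import android.graphics.Bitmap
import jp.co.cyberagent.android.gpuimage.GPUImage
import jp.co.cyberagent.android.gpuimage.filter.GPUImageBrightnessFilter
import jp.co.cyberagent.android.gpuimage.filter.GPUImageContrastFilter
import jp.co.cyberagent.android.gpuimage.filter.GPUImageFilterGroup
import jp.co.cyberagent.android.gpuimage.filter.GPUImageSaturationFilter

// Chains a few GPU-accelerated filters the way a slider-driven engine might.
fun applyBasicAdjustments(
    context: Context,
    source: Bitmap,
    brightness: Float,  // -1f..1f, 0f = neutral
    contrast: Float,    // 0f..4f, 1f = neutral
    saturation: Float   // 0f..2f, 1f = neutral
): Bitmap {
    val gpuImage = GPUImage(context)
    gpuImage.setImage(source)
    gpuImage.setFilter(
        GPUImageFilterGroup(
            listOf(
                GPUImageBrightnessFilter(brightness),
                GPUImageContrastFilter(contrast),
                GPUImageSaturationFilter(saturation)
            )
        )
    )
    return gpuImage.bitmapWithFilterApplied
}
```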
Background Removal
- One-tap subject selection using edgeSAM
- Automatic background cutout
- Manual brush-based refinement
- Instant background replacement
- Smart edge clean-up
Object Removal
- Tap to remove unwanted objects
- LaMa-powered content-aware fill
- Manual mask painting
- Multi-object removal
- Auto-dilation for smooth blending
AI Retouch
- Edit using natural-language instructions (“make it warmer”)
- Reference image style transfer
- Offline voice commands
- Parameter extraction & visualization
- Smart filtering of relevant edits
Smart Fill (Generative Fill)
- Fill or extend images using custom prompts
- Adjustable vibe/style strength
- AI-powered background replacement
- Automatic lighting/color harmonization
- AI-generated content watermark
Move Object
- Select objects with edgeSAM
- Drag to reposition anywhere
- Automatic inpainting of original area
- Harmonized shadows + lighting
- Optional manual mask refinement
Crop, Rotate, and Perspective
- Popular aspect ratios (1:1, 4:3, 16:9, etc.)
- Freeform cropping
- Corner-based perspective correction
- Rotation with angle display
- Grid overlay for precision
Text and Stickers
- Add customizable text
- Multiple font options
- Color, opacity, and size control
- Rotate, scale, and reposition
- Shadow effects
Watermarking
- Add visible watermarks
- Hidden LSB steganographic watermarking
- Optional “Edited by AI” stamps
- Full font & size customization
Brush Tool
- Brush-based local adjustments
- Paint to select regions
- Adjustable brush size
- Preview before applying
Cloud AI Services
1. Smart Fill API
- Endpoint for generative inpainting
- Prompt-based content generation
- Configurable strength parameters
2. Harmonization API
- Blends moved objects naturally
- Matches lighting and color
3. AI Retouch API
- Reference-based style transfer
- Instruction-based parameter extraction
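As a hypothetical Retrofit sketch, one such endpoint could look like the interface below; the route, field names, and response shape are illustrative, not the app's actual RetouchApiService or CloudEditApiService:

```kotlin
import okhttp3.MultipartBody
import okhttp3.RequestBody
import retrofit2.http.Multipart
import retrofit2.http.POST
import retrofit2.http.Part

// Illustrative response shape; the real API contract may differ.
data class SmartFillResponse(val imageUrl: String)

interface SmartFillApi {
    // Uploads the image plus a text prompt and a strength value.
    @Multipart
    @POST("v1/smart-fill")
    suspend fun smartFill(
        @Part image: MultipartBody.Part,
        @Part("prompt") prompt: RequestBody,
        @Part("strength") strength: RequestBody
    ): SmartFillResponse
}
```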
App Launch
│
▼
MainActivity.onCreate()
│
├──► Initialize Compose UI
│
└──► Create PhotoEditorViewModel
│
▼
PhotoEditorScreen (Idle State)
│
│ User Action
▼
┌────────────────────────┐
│ Image Selection │
│ • Gallery Picker │
│ • Camera Capture │
└────────┬───────────────┘
│
▼
Load Image → Update ViewModel State
│
▼
┌────────────────────────────────────┐
│ Main Editor View │
│ │
│ ┌──────────────┐ ┌────────────┐ │
│ │ Image Canvas│ │Side Panel │ │
│ │ │ │ │ │
│ │ • Display │ │• Adjust │ │
│ │ • Interact │ │• AI Tools │ │
│ │ • Transform │ │• Effects │ │
│ └──────────────┘ └────────────┘ │
│ │
│ ┌──────────────────────────────┐ │
│ │ Top App Bar │ │
│ │ • Undo/Redo │ │
│ │ • Save/Export │ │
│ └──────────────────────────────┘ │
└────────────────────────────────────┘
User selects feature from Side Panel
│
┌───────────┴────────────┬─────────────────┐
▼ ▼ ▼
Adjust Mode AI Tool Mode Transform Mode
│ │ │
▼ ▼ ▼
Load AdjustScreen Initialize Model Enter Crop/Rotate
│ │ │
▼ ▼ ▼
Display sliders Wait for user input Interactive overlay
│ │ │
▼ ▼ ▼
Real-time preview Process inference Preview transform
│ │ │
▼ ▼ ▼
Apply on confirm Update image Apply on confirm
Original Image URI
│
▼
Load to Bitmap
│
▼
┌───────────────────────────────────┐
│ Processing Selection │
├───────────────────────────────────┤
│ │
│ ┌─────────────────────────────┐ │
│ │ On-Device Processing │ │
│ │ │ │
│ │ • Adjustments (GPU) │ │
│ │ • LaMa Inpainting │ │
│ │ • edgeSAM Segmentation │ │
│ │ • Auto Enhance │ │
│ └──────────┬──────────────────┘ │
│ │ │
│ ▼ │
│ Process locally │
│ │ │
│ ▼ │
│ Result Bitmap │
│ │
│ ┌─────────────────────────────┐ │
│ │ Cloud Processing │ │
│ │ │ │
│ │ • Smart Fill │ │
│ │ • Harmonization │ │
│ │ • AI Retouch │ │
│ └──────────┬──────────────────┘ │
│ │ │
│ ▼ │
│ Upload via Retrofit │
│ │ │
│ ▼ │
│ Wait for response │
│ │ │
│ ▼ │
│ Download result │
└───────────┬─────────────────────┬─┘
│ │
▼ ▼
Save to cache Update UI
│ │
▼ ▼
Generate URI Display result
User Interaction
│
▼
Event Handler in Composable
│
▼
Call ViewModel method
│
▼
┌────────────────────────────────┐
│ PhotoEditorViewModel │
│ │
│ 1. Validate action │
│ 2. Update state variables │
│ 3. Trigger coroutine │
│ 4. Launch processing │
│ 5. Handle result │
│ 6. Update state again │
└────────┬───────────────────────┘
│
▼
State changes trigger recomposition
│
▼
UI updates automatically
│
▼
User sees result
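A condensed sketch of this loop follows; the real PhotoEditorViewModel manages far more state than this single flow:

```kotlin
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.launch

data class EditorUiState(val isProcessing: Boolean = false, val resultUri: String? = null)

// `process` stands in for a repository or ML-executor call.
class EditorViewModelSketch(
    private val process: suspend (String) -> String
) : ViewModel() {
    private val _uiState = MutableStateFlow(EditorUiState())
    val uiState: StateFlow<EditorUiState> = _uiState.asStateFlow()

    fun onApplyEdit(inputUri: String) {
        _uiState.value = _uiState.value.copy(isProcessing = true)  // update state
        viewModelScope.launch {                                    // trigger coroutine
            val result = process(inputUri)                         // launch processing
            _uiState.value = EditorUiState(resultUri = result)     // handle result
        }
    }
}
```

Compose collects uiState (e.g., via collectAsStateWithLifecycle), so each emission triggers recomposition of the affected UI.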
Initial State: currentImageUri = null, history = []
│
▼
User loads image → history = [uri1]
│
▼
User applies adjustment → history = [uri1, uri2]
│
▼
User applies crop → history = [uri1, uri2, uri3]
│
▼
User clicks Undo
│
▼
Pop from history → Display uri2
│
▼
User clicks Undo again
│
▼
Pop from history → Display uri1
│
▼
User applies new edit
│
▼
Clear forward history → history = [uri1, uri4]
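A sketch of the linear history behavior described above, where any new edit discards the forward entries:

```kotlin
// Generic undo stack with a movable cursor; T would be an image Uri here.
class EditHistorySketch<T> {
    private val entries = mutableListOf<T>()
    private var cursor = -1

    fun push(entry: T) {
        // Drop redo entries beyond the cursor, then append the new state.
        while (entries.size > cursor + 1) entries.removeAt(entries.size - 1)
        entries.add(entry)
        cursor = entries.lastIndex
    }

    fun undo(): T? = if (cursor > 0) entries[--cursor] else null
    fun current(): T? = entries.getOrNull(cursor)
}
```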
Kortex/
│
├── app/
│ ├── build.gradle.kts Build configuration
│ ├── proguard-rules.pro Code obfuscation rules
│ │
│ └── src/
│ ├── main/
│ │ ├── AndroidManifest.xml
│ │ │
│ │ ├── java/test1/example/finalapp/
│ │ │ │
│ │ │ ├── MainActivity.kt Main entry point
│ │ │ ├── PhotoEditorViewModel.kt Central state manager
│ │ │ ├── LamaExecutor.kt LaMa model executor
│ │ │ ├── ZimExecutor.kt edgeSAM model executor
│ │ │ │
│ │ │ ├── View/
│ │ │ │ ├── screens/
│ │ │ │ │ └── PhotoEditorScreen.kt Main editor UI
│ │ │ │ │
│ │ │ │ └── components/
│ │ │ │ ├── BackgroundRemovalScreen.kt
│ │ │ │ ├── MoveObjectPlacementScreen.kt
│ │ │ │ ├── ManualMaskPainter.kt
│ │ │ │ ├── CropComponents.kt
│ │ │ │ ├── AspectRatioCropEngine.kt
│ │ │ │ ├── CornerCropEngine.kt
│ │ │ │ ├── RotationEngine.kt
│ │ │ │ ├── WatermarkComponents.kt
│ │ │ │ ├── LSBWatermarkScreen.kt
│ │ │ │ ├── TextSticker.kt
│ │ │ │ ├── TextEditingScreen.kt
│ │ │ │ ├── SidePanel.kt
│ │ │ │ ├── TopAppBar.kt
│ │ │ │ └── GenerativeFillDialog.kt
│ │ │ │
│ │ │ ├── ui/
│ │ │ │ ├── theme/
│ │ │ │ │ ├── Color.kt
│ │ │ │ │ ├── Theme.kt
│ │ │ │ │ ├── Type.kt
│ │ │ │ │ └── AppTheme.kt
│ │ │ │ │
│ │ │ │ └── adjust/
│ │ │ │ ├── AdjustSheetComposable.kt
│ │ │ │ └── AdjustIntegrationExample.kt
│ │ │ │
│ │ │ ├── component/
│ │ │ │ └── adjust/
│ │ │ │ ├── AdjustEngine.kt GPU filter engine
│ │ │ │ ├── AdjustViewModel.kt
│ │ │ │ ├── AdjustParams.kt
│ │ │ │ ├── PreviewRenderer.kt
│ │ │ │ ├── CurveEditorView.kt
│ │ │ │ └── ToneCurveData.kt
│ │ │ │
│ │ │ ├── ml/
│ │ │ │ └── AutoEnhanceExecutor.kt Auto enhance model
│ │ │ │
│ │ │ ├── audio/
│ │ │ │ └── OfflineSpeechManager.kt Vosk integration
│ │ │ │
│ │ │ ├── data/
│ │ │ │ ├── api/
│ │ │ │ │ ├── RetrofitClient.kt
│ │ │ │ │ ├── RetouchApiService.kt
│ │ │ │ │ ├── CloudEditNetwork.kt
│ │ │ │ │ └── CloudEditApiService.kt
│ │ │ │ │
│ │ │ │ ├── repository/
│ │ │ │ │ ├── RetouchRepository.kt
│ │ │ │ │ └── CloudEditRepository.kt
│ │ │ │ │
│ │ │ │ └── model/
│ │ │ │ ├── RetouchModels.kt
│ │ │ │ └── AdjustState.kt
│ │ │ │
│ │ │ ├── utils/
│ │ │ │ ├── ImageUtils.kt Image operations
│ │ │ │ ├── AIWatermarkHelper.kt Watermarking
│ │ │ │ ├── LSBWatermarkUtil.kt Steganography
│ │ │ │ ├── RetouchUtils.kt
│ │ │ │ └── FontManager.kt
│ │ │ │
│ │ │ └── model/
│ │ │ └── AiEditState.kt
│ │ │
│ │ ├── assets/
│ │ │ ├── lama_fp32.onnx LaMa model
│ │ │ ├── sam_encoder.onnx edgeSAM encoder
│ │ │ ├── sam_decoder.onnx edgeSAM decoder
│ │ │ ├── analyzer_8param_v2.onnx Auto enhance analyzer
│ │ │ ├── hdrnet_fixer_safe.onnx Auto enhance fixer
│ │ │ ├── vosk-model-small-en-us-0.15/ Speech model
│ │ │ └── fonts/ Custom fonts
│ │ │
│ │ └── res/
│ │ ├── values/
│ │ ├── drawable/
│ │ ├── mipmap/
│ │ └── xml/
│ │
│ ├── androidTest/ Instrumented tests
│ └── test/ Unit tests
│
├── gradle/
│ ├── libs.versions.toml Dependency catalog
│ └── wrapper/
│ ├── gradle-wrapper.jar
│ └── gradle-wrapper.properties
│
├── build.gradle.kts Root build script
├── settings.gradle.kts Project settings
├── gradle.properties Gradle configuration
├── gradlew Gradle wrapper (Unix)
├── gradlew.bat Gradle wrapper (Windows)
└── local.properties Local SDK path
Before setting up the project, ensure you have the following installed:
1. Development Environment
- Android Studio Iguana (2023.2.1) or newer
- JDK 17 or higher (required by the Android Gradle Plugin 8.x)
- Minimum 8 GB RAM (16 GB recommended)
- At least 10 GB free disk space
2. Android SDK
- Android SDK Platform 34
- Android SDK Build-Tools 34.0.0 or higher
- Android SDK Platform-Tools
- Android Emulator (if testing on emulator)
3. Git
- Git version control system
Extract the contents of the ZIP file and move the Kortex directory to a convenient location.
1. Create local.properties file
Create a file named local.properties in the root directory with your Android SDK path:
sdk.dir=C:\\Users\\YourUsername\\AppData\\Local\\Android\\Sdk
On macOS/Linux:
sdk.dir=/Users/YourUsername/Library/Android/sdk
2. Verify Model Files
Ensure all ONNX model files are present in app/src/main/assets/:
- lama_fp32.onnx
- sam_encoder.onnx
- sam_decoder.onnx
- analyzer_8param_v2.onnx
- hdrnet_fixer_safe.onnx
- vosk-model-small-en-us-0.15/ (directory with model files)
3. Configure API Endpoints (Optional)
If using cloud features, update the API base URLs in:
- app/src/main/java/test1/example/finalapp/data/api/RetrofitClient.kt
- app/src/main/java/test1/example/finalapp/data/api/CloudEditNetwork.kt
Open the project in Android Studio and let Gradle sync automatically. If it doesn't start:
- Click "File" > "Sync Project with Gradle Files"
- Wait for dependencies to download
- Resolve any errors that appear
All dependencies are managed through Gradle and will be downloaded automatically during sync. Key dependencies include:
- Jetpack Compose libraries
- ONNX Runtime Android
- Retrofit and OkHttp
- GPUImage
- Vosk Android
- Coil image loader
The project supports two build variants:
Debug Build
- Includes debugging symbols
- Logging enabled
- No code obfuscation
- Faster build times
Release Build
- Code optimization enabled
- ProGuard rules applied
- Smaller APK size
- Production-ready
1. Create/Start Emulator
In Android Studio:
- Tools > Device Manager
- Create a new virtual device or start existing one
- Recommended: Pixel 5 with API 34 (Android 14)
- Set the Graphics option to "Hardware" (enables GPU acceleration)
2. Run the App
Click the "Run" button (green play icon)
or
Select Run > Run 'app'
or
Press Shift+F10 (Windows/Linux) or Control+R (macOS)
1. Enable Developer Options
On your Android device:
- Go to Settings > About Phone
- Tap "Build Number" 7 times
- Go back to Settings > System > Developer Options
- Enable "USB Debugging"
2. Connect Device
- Connect device via USB
- Accept USB debugging prompt on device
- Device should appear in Android Studio device dropdown
3. Run the App
Select your device from the dropdown and click Run
Debug APK
./gradlew assembleDebug
Output location: app/build/outputs/apk/debug/app-debug.apk
Release APK
./gradlew assembleRelease
Output location: app/build/outputs/apk/release/app-release.apk
Note: Release builds require signing configuration
macOS/Linux
./gradlew clean
./gradlew assembleDebug
Windows
gradlew.bat clean
gradlew.bat assembleDebug
Out of Memory Error
Increase heap size in gradle.properties:
org.gradle.jvmargs=-Xmx4096m -Dfile.encoding=UTF-8
Duplicate Class Errors
The project already handles libc++_shared.so conflicts in build.gradle.kts via a pickFirsts configuration.
Model Loading Failures
Ensure all ONNX files are in the assets folder and are not compressed by adding the following to build.gradle.kts (Kotlin DSL form; the older aaptOptions block is deprecated):
android {
    androidResources {
        noCompress += "onnx"
    }
}
NNAPI Errors
Some devices may not support NNAPI. The code gracefully falls back to CPU execution
Device
- Android 7.0 (API 24) or higher
- 3 GB RAM
- 1 GB free storage
- ARMv7 or ARM64 processor
Permissions
- READ_EXTERNAL_STORAGE
- WRITE_EXTERNAL_STORAGE (Android 9 and below)
- READ_MEDIA_IMAGES (Android 13+)
- RECORD_AUDIO (for voice commands)
- CAMERA (for camera capture)
- INTERNET (for cloud features)
Device
- Android 12.0 (API 31) or higher
- 6 GB RAM or more
- 2 GB free storage
- ARM64 processor
- GPU with OpenGL ES 3.0 or higher
- NNAPI support for hardware acceleration
Performance Notes
Model inference times (approximate, varies by device):
Budget Device (Snapdragon 600 series):
├── LaMa Inpainting: 3-5 seconds
├── edgeSAM Encoder: 2-3 seconds
├── edgeSAM Decoder: 200-300 ms
└── Auto Enhance: 1-2 seconds
Mid-range Device (Snapdragon 700 series):
├── LaMa Inpainting: 1-2 seconds
├── edgeSAM Encoder: 1-1.5 seconds
├── edgeSAM Decoder: 100-150 ms
└── Auto Enhance: 500-800 ms
Flagship Device (Snapdragon 8 series):
├── LaMa Inpainting: 0.5-1 second
├── edgeSAM Encoder: 0.5-0.8 seconds
├── edgeSAM Decoder: 50-80 ms
└── Auto Enhance: 200-400 ms
App Size
- APK: Approximately 150-200 MB
- ONNX Models: ~180 MB
- Vosk Model: ~50 MB
- Code and Resources: ~20 MB
Runtime Storage
- Cache: 50-500 MB (varies with usage)
- Temporary files: 100-1000 MB (high-resolution editing)
- User images: Depends on usage
Optional (for cloud features)
- Stable internet connection
- Recommended: 5 Mbps or higher
- Cloud API access
Fully Functional Offline
- All core editing features
- On-device AI models
- Voice recognition
- No internet required for basic operation