- Overview
- Installation from APK
- Project Architecture
- Technology Stack
- On-Device AI Models
- Application Features
- Workflow and Data Flow
- Project Structure
- Setup and Installation
- Building and Running
- System Requirements
Kortex is a sophisticated Android photo editing app that offers professional-level image editing by fusing cloud-based AI services with on-device machine learning models. The application leverages ONNX Runtime for efficient on-device inference, GPU acceleration for real-time adjustments, and offline speech recognition for voice-controlled editing.
The application is built with contemporary Android development techniques: Jetpack Compose for the UI layer, Kotlin Coroutines for asynchronous operations, and the MVVM (Model-View-ViewModel) architectural pattern for clear separation of concerns.
To try Kortex quickly, a pre-built release APK is included in this repository for easy installation.
Step 1: Enable Unknown Sources
Before installing the APK, you need to allow installation from unknown sources:
- Open Settings on your Android device
- Navigate to Security or Privacy (location varies by Android version)
- Find and enable Install unknown apps or Unknown sources
- Select your file manager or browser and allow installation from that source
Note: On Android 8.0 (Oreo) and above, you grant permission per app. On older versions, there's a global setting.
Step 2: Transfer the APK to Your Device
Choose one of these methods:
Method A: Direct Download (if shared online)
- Download the APK directly on your device from the shared link
- The APK will be saved to your Downloads folder
Method B: USB Transfer
- Connect your Android device to your computer via USB
- Enable File Transfer mode when prompted on your device
- Navigate to the project folder: Kortex-app/apk/
- Copy app-release.apk to your device's Downloads folder or internal storage
Method C: Cloud Transfer
- Upload app-release.apk from the Kortex-app/apk/ folder to Google Drive, Dropbox, etc.
- Download it on your Android device from the cloud service
Step 3: Install the APK
- Open your device's File Manager app
- Navigate to the folder where you saved the APK (usually Downloads)
- Tap on app-release.apk
- Review the permissions requested by the app:
- Storage (for saving/loading images)
- Camera (for taking photos)
- Microphone (for voice commands)
- Internet (for cloud AI features)
- Tap Install
- Wait for installation to complete (may take 30-60 seconds due to ML models)
- Tap Open to launch Kortex, or find it in your app drawer
Step 4: Grant Runtime Permissions
When you first use certain features, Android will ask for permissions:
- Storage/Photos: Required to edit images from your gallery
- Camera: Required to take new photos (optional)
- Microphone: Required for voice commands (optional)
- Internet: Required for cloud AI features (optional - app works offline)
The application follows the MVVM (Model-View-ViewModel) architecture pattern, combined with the repository pattern for data management:
┌─────────────────────────────────────────────────────────────┐
│ View Layer │
│ (Jetpack Compose UI Components) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │PhotoEditor │ │ Adjust │ │ Background │ │
│ │Screen │ │ Screen │ │ Removal │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└────────────┬────────────────────────────────────────────────┘
│
│ User Actions / State Observation
▼
┌─────────────────────────────────────────────────────────────┐
│ ViewModel Layer │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ PhotoEditorViewModel │ │
│ │ • Manages UI state │ │
│ │ • Handles user interactions │ │
│ │ • Coordinates with repositories and executors │ │
│ └──────────────────────────────────────────────────────┘ │
└────────────┬────────────────────────────────────────────────┘
│
│ Data Requests / Commands
▼
┌─────────────────────────────────────────────────────────────┐
│ Data/Model Layer │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Repositories │ │ ML Executors │ │ Utilities │ │
│ │ │ │ │ │ │ │
│ │ • Retouch │ │ • LaMa │ │ • Image │ │
│ │ • CloudEdit │ │ • SAM │ │ • Watermark │ │
│ │ │ │ • AutoEnhance│ │ • Font │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
User Input
│
▼
┌───────────────────┐
│ Compose UI │
│ Components │
└────────┬──────────┘
│ Events
▼
┌───────────────────────────┐
│ PhotoEditorViewModel │
│ • State Management │
│ • Business Logic │
└─────┬─────────────────────┘
│
├──────────────────┬──────────────────┬────────────────┐
▼ ▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌────────────┐ ┌────────────┐
│ On-Device │ │ Cloud │ │ GPU Image │ │ Local │
│ ML Models │ │ APIs │ │ Processing │ │ Storage │
│ │ │ │ │ │ │ │
│ • LaMa │ │ • Retouch │ │ • Adjust │ │ • Files │
│ • EdgeSAM │ │ • SmartFill │ │ • Filters │ │ • Cache │
│ • AutoEnhc. │ │ • Sticker │ │ │ │ │
│ • Vosk │ │ Harmonize │ │ │ │ │
└─────────────┘ └─────────────┘ └────────────┘ └────────────┘
Language and Framework
- Kotlin 2.0.21
- Android SDK (Min SDK 24, Target SDK 34, Compile SDK 36)
- Jetpack Compose (Material3)
Build System
- Gradle 8.13.1 with Kotlin DSL
- Android Gradle Plugin 8.13.1
Architecture Components
- Lifecycle ViewModel Compose 2.7.0
- Kotlin Coroutines with Dispatchers
- StateFlow for reactive state management
Networking
- Retrofit 2.11.0 (REST API communication)
- OkHttp 4.12.0 (HTTP client with logging interceptor)
- Gson Converter 2.11.0 (JSON serialization)
Machine Learning
- ONNX Runtime Android 1.17.0 (On-device inference)
- Vosk Android 0.3.32 (Offline speech recognition)
- JNA 5.13.0 (Java Native Access)
Image Processing
- Coil 2.5.0 (Async image loading)
- GPUImage 2.1.0 (GPU-accelerated filters)
- ExifInterface 1.3.7 (Image metadata handling)
UI and Permissions
- Material Icons Extended
- Accompanist Permissions 0.32.0
The application uses multiple ONNX format neural network models that run entirely on the device without requiring internet connectivity. These models are stored in the assets folder and loaded at runtime.
File: lama_fp32.onnx
Purpose: Advanced image inpainting for object removal and content-aware fill
Architecture:
Input: image (1x3x512x512) + mask (1x1x512x512)
│
▼
┌──────────────────────┐
│ Fast Fourier Conv │
│ Encoder-Decoder │
└──────────┬───────────┘
▼
Output: inpainted_image (1x3x512x512)
Technical Specifications:
- Input Image Shape: [1, 3, 512, 512]
- Input Mask Shape: [1, 1, 512, 512]
- Output Shape: [1, 3, 512, 512]
- Precision: FP32
- Acceleration: NNAPI hardware acceleration when available (Android 8.1+, API 27)
Processing Pipeline:
Original Image → Resize to 512x512 → Normalize to [0,1]
│
Mask Image → Resize to 512x512 → Dilate (10px) → Binary threshold
│
┌───────────────────────┴─────────────────┐
▼ ▼
Image Tensor (CHW format) Mask Tensor
│ │
└──────────────┬──────────────────────────┘
▼
LaMa ONNX Inference
│
▼
Inpainted Result
│
▼
Denormalize → Resize to original size
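For orientation, below is a minimal sketch of the inference step using the ONNX Runtime Java API. It assumes preprocessing has already produced normalized CHW float arrays, reuses the tensor names from the diagram above, and is not the app's actual LamaExecutor:

```kotlin
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.nio.FloatBuffer

// One LaMa inference pass. `imageChw` holds 1*3*512*512 floats in [0,1];
// `maskChw` holds 1*1*512*512 binary floats (after dilation/thresholding).
fun runLama(env: OrtEnvironment, session: OrtSession,
            imageChw: FloatArray, maskChw: FloatArray): FloatArray {
    val image = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(imageChw), longArrayOf(1, 3, 512, 512))
    val mask = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(maskChw), longArrayOf(1, 1, 512, 512))
    image.use { img ->
        mask.use { msk ->
            session.run(mapOf("image" to img, "mask" to msk)).use { results ->
                @Suppress("UNCHECKED_CAST")
                val out = results[0].value as Array<Array<Array<FloatArray>>>
                // Flatten [1,3,512,512] back to CHW floats for denormalization.
                return out[0].flatMap { ch -> ch.flatMap { row -> row.asList() } }
                    .toFloatArray()
            }
        }
    }
}
```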
Use Cases:
- Removing objects from pictures
- Cleaning up unwanted elements
- Manual mask-based inpainting
- Background object removal
Files: sam_encoder.onnx and sam_decoder.onnx
Purpose: Interactive image segmentation with point-based selection
Two-Stage Architecture:
Stage 1: Encoder (Heavy, Run Once)
──────────────────────────────────
Input Image (1x3x1024x1024)
│
▼
┌───────────────────┐
│ Vision Transform │
│ Encoder │
└────────┬──────────┘
▼
Image Embeddings (1x256x64x64)
│
└──────► Cache for reuse
The user then taps a point or draws a bounding box, and only the lightweight decoder needs to run:
Stage 2: Decoder (Lightweight, Interactive)
───────────────────────────────────────────
Cached Embeddings + Point Coords + Labels
│
▼
┌───────────────────────┐
│ Mask Decoder │
│ + Prompt Encoder │
└───────────┬───────────┘
▼
Segmentation Masks
(1x1x256x256) × 4 variants
│
▼
IoU Scores (1x4)
Technical Specifications:
Encoder:
- Input Shape: [1, 3, 1024, 1024]
- Output Shape: [1, 256, 64, 64]
- Execution: Once per image
Decoder:
- Inputs:
- Image Embeddings: [1, 256, 64, 64]
- Point Coordinates: [1, N, 2]
- Point Labels: [1, N] (1=foreground, 0=background, -1=padding)
- Mask Input: [1, 1, 256, 256] (optional)
- Has Mask Input: [1] (boolean)
- Original Image Size: [2] (height, width)
- Outputs:
- Masks: [1, 4, 256, 256]
- IoU Predictions: [1, 4]
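Given these shapes, a single-tap decoder call might look like the sketch below. The input tensor names follow the common SAM ONNX export convention and may differ from the actual names in sam_decoder.onnx:

```kotlin
import ai.onnxruntime.OnnxTensor
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
import java.nio.FloatBuffer

// Decodes one foreground tap against cached encoder embeddings.
// Coordinates must already be mapped into the 1024x1024 model space.
fun decodeTap(env: OrtEnvironment, decoder: OrtSession,
              embeddings: OnnxTensor, x: Float, y: Float,
              origH: Float, origW: Float): OrtSession.Result {
    // One real point plus one padding point (label -1), as SAM decoders expect.
    val coords = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(x, y, 0f, 0f)), longArrayOf(1, 2, 2))
    val labels = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(1f, -1f)), longArrayOf(1, 2))
    val emptyMask = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(FloatArray(256 * 256)), longArrayOf(1, 1, 256, 256))
    val hasMask = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(0f)), longArrayOf(1))
    val origSize = OnnxTensor.createTensor(
        env, FloatBuffer.wrap(floatArrayOf(origH, origW)), longArrayOf(2))
    // Caller picks the variant with the highest IoU score among the 4 masks.
    return decoder.run(mapOf(
        "image_embeddings" to embeddings,
        "point_coords" to coords,
        "point_labels" to labels,
        "mask_input" to emptyMask,
        "has_mask_input" to hasMask,
        "orig_im_size" to origSize))
}
```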
Processing Flow:
User loads image
│
▼
Run Encoder (slow, ~1-2 seconds)
│
▼
Cache embeddings in memory
│
▼
User taps on object ◄─────┐
│ │
▼ │
Transform tap coords │
to model space (1024x1024)│
│ │
▼ │
Run Decoder (fast, <100ms)│
│ │
▼ │
Select best mask by IoU │
│ │
▼ │
Resize to original size │
│ │
▼ │
Display segmentation │
│ │
└─────────────────────┘
(User can tap again)
Use Cases:
- Background removal with tap selection
- Object isolation
- Quick mask generation
- Interactive segmentation
Files: analyzer_8param_v2.onnx and hdrnet_fixer_safe.onnx
Purpose: Automatic image quality analysis and parameter extraction
Dual-Model System:
Model 1: Analyzer (Parameter Extraction)
────────────────────────────────────────
Input Image (1x3x256x256)
│
▼
┌──────────────────────┐
│ MobileViT Backbone │
│ + Analysis Head │
└──────────┬───────────┘
│
├──► Edit Parameters (1x8)
│ [exposure, contrast, saturation,
│ brightness, highlights, shadows,
│ temperature, sharpness]
│
└──► Rationale Logits (1x4)
[underexposed, overexposed,
unsaturated, good]
Model 2: Fixer (Parameter Application)
───────────────────────────────────────
Original Image + Parameters
│
▼
┌──────────────────────┐
│ HDRNet Architecture │
│ Bilateral Grid │
└──────────┬───────────┘
▼
Enhanced Image
Technical Specifications:
Analyzer:
- Input Shape: [1, 3, 256, 256]
- Outputs:
- Parameters: [1, 8] float values
- Rationale: [1, 4] classification logits
- Parameter Ranges: Typically [-1, 1] or [0, 2]
Parameter Mapping:
Index 0: Exposure → Exposure adjustment
Index 1: Contrast → Local contrast
Index 2: Saturation → Color intensity
Index 3: Brightness → Overall Brightness
Index 4: Highlights → Bright region control
Index 5: Shadows → Dark region control
Index 6: Temperature → White balance (cool/warm)
Index 7: Sharpness → Edge enhancement
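A sketch of how these outputs might be consumed is shown below; the data class and field names are illustrative, not the app's actual AdjustParams:

```kotlin
import kotlin.math.exp

// Illustrative container mirroring the 8-parameter layout above.
data class EnhanceParams(
    val exposure: Float, val contrast: Float, val saturation: Float,
    val brightness: Float, val highlights: Float, val shadows: Float,
    val temperature: Float, val sharpness: Float
)

fun mapAnalyzerOutput(p: FloatArray): EnhanceParams {
    require(p.size == 8) { "Analyzer emits exactly 8 parameters" }
    return EnhanceParams(p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7])
}

// Softmax over the rationale logits:
// [underexposed, overexposed, unsaturated, good].
fun rationaleProbabilities(logits: FloatArray): FloatArray {
    val exps = logits.map { exp(it.toDouble()) }
    val sum = exps.sum()
    return exps.map { (it / sum).toFloat() }.toFloatArray()
}
```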
Analysis Workflow:
Input Image
│
▼
Resize to 256x256
│
▼
Normalize to [0,1]
│
▼
Run Analyzer Model
│
├──► Extract 8 parameters
│ │
│ ▼
│ Map to adjustment sliders
│ │
│ ▼
│ Apply via GPU filters
│
└──► Parse rationale
│
▼
Display diagnosis
[Softmax classification]
Use Cases:
- Automatic image enhancement suggestions
- Quality analysis
- Parameter extraction for manual adjustment
Directory: vosk-model-small-en-us-0.15/
Purpose: Offline voice-command recognition for hands-free editing
Model Type: Speech recognition model (Kaldi-based)
Architecture Overview:
Audio Input (16kHz PCM)
│
▼
┌─────────────────────┐
│ Audio Preprocessing│
│ • Framing │
│ • Feature Extract │
│ • MFCC/Fbank │
└─────────┬───────────┘
▼
┌─────────────────────┐
│ Acoustic Model │
│ (DNN/TDNN) │
└─────────┬───────────┘
▼
┌─────────────────────┐
│ Language Model │
│ (N-gram/RNNLM) │
└─────────┬───────────┘
▼
Transcribed Text
Technical Specifications:
- Sample Rate: 16,000 Hz
- Model Size: Small (~41 MB)
- Language: English (US)
- Latency: Real-time streaming
- Output: JSON with partial and final results
Recognition Flow:
User presses mic button
│
▼
Request RECORD_AUDIO permission
│
▼
Initialize Vosk Model (if first time)
│
▼
Start SpeechService
│
▼
Audio Stream ──┐
│
┌──────────▼─────────────┐
│ Continuous Recognition │
│ • Partial results │
│ • Final results │
│ • Auto-stop (5s) │
└──────────┬─────────────┘
│
▼
Update chat interface
│
▼
Send to API or execute command
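A condensed sketch of this loop with the Vosk Android API follows (the 5-second auto-stop timer and chat plumbing are omitted; results arrive as JSON strings containing partial/final text):

```kotlin
import org.vosk.Model
import org.vosk.Recognizer
import org.vosk.android.RecognitionListener
import org.vosk.android.SpeechService

// Assumes the model directory has been unpacked from assets to `modelPath`
// and RECORD_AUDIO has already been granted.
fun startVoiceCapture(modelPath: String, onText: (String) -> Unit): SpeechService {
    val recognizer = Recognizer(Model(modelPath), 16000.0f)
    val service = SpeechService(recognizer, 16000.0f)
    service.startListening(object : RecognitionListener {
        override fun onPartialResult(hypothesis: String) = onText(hypothesis)
        override fun onResult(hypothesis: String) = onText(hypothesis)
        override fun onFinalResult(hypothesis: String) = onText(hypothesis)
        override fun onError(e: Exception) { service.stop() }
        override fun onTimeout() { service.stop() }
    })
    return service
}
```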
Use Cases:
- Voice-activated commands (e.g., "make it brighter")
- Instruction-based retouching
- Hands-free operation
- Accessibility features
Initialization Strategy:
Application Start
│
▼
MainActivity.onCreate()
│
▼
User selects feature
│
▼
Lazy initialization of required model
│
├──► Copy from assets to cache (if needed)
│
├──► Check device capabilities
│ • NNAPI availability
│ • GPU acceleration
│ • Memory constraints
│
├──► Configure OrtSession.SessionOptions
│ • OptLevel.ALL_OPT
│ • Add NNAPI provider (if available)
│
└──► Create OrtSession
│
└──► Model ready for inference
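A compact sketch of that session setup with the ONNX Runtime Java API (error handling trimmed to the NNAPI fallback):

```kotlin
import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtException
import ai.onnxruntime.OrtSession

// Full graph optimization, NNAPI when available, CPU otherwise.
fun createSession(env: OrtEnvironment, modelBytes: ByteArray): OrtSession {
    val options = OrtSession.SessionOptions().apply {
        setOptimizationLevel(OrtSession.SessionOptions.OptLevel.ALL_OPT)
        try {
            addNnapi() // NNAPI execution provider
        } catch (e: OrtException) {
            // No NNAPI on this device/runtime: stay on the CPU provider.
        }
    }
    return env.createSession(modelBytes, options)
}
```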
Memory Management:
┌──────────────────────────────────┐
│ Model Lifecycle │
├──────────────────────────────────┤
│ │
│ Feature Activated │
│ │ │
│ ▼ │
│ Load model to memory │
│ │ │
│ ▼ │
│ Keep in memory during use │
│ │ │
│ ▼ │
│ User exits feature │
│ │ │
│ ▼ │
│ Release tensors │
│ │ │
│ ▼ │
│ Session persists (reusable) │
│ │ │
│ ▼ │
│ App background/destroy │
│ │ │
│ ▼ │
│ Full cleanup │
│ │
└──────────────────────────────────┘
Adjustments
- Fine-tune exposure, brightness, and contrast
- Control highlights and shadows
- Adjust saturation, vibrance, and hue
- Set temperature/white balance
- Enhance sharpness and texture
- Smooth, real-time slider preview
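For a feel of how such slider values could drive GPU filters, here is an illustrative GPUImage sketch; the app's real AdjustEngine likely differs in structure and parameter ranges:

```kotlin
import android.content.Context
import android.graphics.Bitmap
import jp.co.cyberagent.android.gpuimage.GPUImage
import jp.co.cyberagent.android.gpuimage.filter.GPUImageBrightnessFilter
import jp.co.cyberagent.android.gpuimage.filter.GPUImageContrastFilter
import jp.co.cyberagent.android.gpuimage.filter.GPUImageFilterGroup
import jp.co.cyberagent.android.gpuimage.filter.GPUImageSaturationFilter

// Chains a few GPU-accelerated filters the way a slider-driven engine might.
fun applyBasicAdjustments(
    context: Context,
    source: Bitmap,
    brightness: Float,  // -1f..1f, 0f = neutral
    contrast: Float,    // 0f..4f, 1f = neutral
    saturation: Float   // 0f..2f, 1f = neutral
): Bitmap {
    val gpuImage = GPUImage(context)
    gpuImage.setImage(source)
    gpuImage.setFilter(
        GPUImageFilterGroup(
            listOf(
                GPUImageBrightnessFilter(brightness),
                GPUImageContrastFilter(contrast),
                GPUImageSaturationFilter(saturation)
            )
        )
    )
    return gpuImage.bitmapWithFilterApplied
}
```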
Background Removal
- One-tap subject selection using edgeSAM
- Automatic background cutout
- Manual brush-based refinement
- Instant background replacement
- Smart edge clean-up
Object Removal
- Tap to remove unwanted objects
- LaMa-powered content-aware fill
- Manual mask painting
- Multi-object removal
- Auto-dilation for smooth blending
AI Retouch
- Edit using natural-language instructions (“make it warmer”)
- Reference image style transfer
- Offline voice commands
- Parameter extraction & visualization
- Smart filtering of relevant edits
Smart Fill (Generative Fill)
- Fill or extend images using custom prompts
- Adjustable vibe/style strength
- AI-powered background replacement
- Automatic lighting/color harmonization
- AI-generated content watermark
Move Object
- Select objects with edgeSAM
- Drag to reposition anywhere
- Automatic inpainting of original area
- Harmonized shadows + lighting
- Optional manual mask refinement
Crop, Rotate, and Perspective
- Popular aspect ratios (1:1, 4:3, 16:9, etc.)
- Freeform cropping
- Corner-based perspective correction
- Rotation with angle display
- Grid overlay for precision
Text and Stickers
- Add customizable text
- Multiple font options
- Color, opacity, and size control
- Rotate, scale, and reposition
- Shadow effects
Watermarking
- Add visible watermarks
- Hidden LSB steganographic watermarking
- Optional “Edited by AI” stamps
- Full font & size customization
Brush Tool
- Brush-based local adjustments
- Paint to select regions
- Adjustable brush size
- Preview before applying
Cloud AI Services
1. Smart Fill API
- Endpoint for generative inpainting
- Prompt-based content generation
- Configurable strength parameters
2. Harmonization API
- Blends moved objects naturally
- Matches lighting and color
3. AI Retouch API
- Reference-based style transfer
- Instruction-based parameter extraction
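As a hypothetical Retrofit sketch, one such endpoint could look like the interface below; the route, field names, and response shape are illustrative, not the app's actual RetouchApiService or CloudEditApiService:

```kotlin
import okhttp3.MultipartBody
import okhttp3.RequestBody
import retrofit2.http.Multipart
import retrofit2.http.POST
import retrofit2.http.Part

// Illustrative response shape; the real API contract may differ.
data class SmartFillResponse(val imageUrl: String)

interface SmartFillApi {
    // Uploads the image plus a text prompt and a strength value.
    @Multipart
    @POST("v1/smart-fill")
    suspend fun smartFill(
        @Part image: MultipartBody.Part,
        @Part("prompt") prompt: RequestBody,
        @Part("strength") strength: RequestBody
    ): SmartFillResponse
}
```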
App Launch
│
▼
MainActivity.onCreate()
│
├──► Initialize Compose UI
│
└──► Create PhotoEditorViewModel
│
▼
PhotoEditorScreen (Idle State)
│
│ User Action
▼
┌────────────────────────┐
│ Image Selection │
│ • Gallery Picker │
│ • Camera Capture │
└────────┬───────────────┘
│
▼
Load Image → Update ViewModel State
│
▼
┌────────────────────────────────────┐
│ Main Editor View │
│ │
│ ┌──────────────┐ ┌────────────┐ │
│ │ Image Canvas│ │Side Panel │ │
│ │ │ │ │ │
│ │ • Display │ │• Adjust │ │
│ │ • Interact │ │• AI Tools │ │
│ │ • Transform │ │• Effects │ │
│ └──────────────┘ └────────────┘ │
│ │
│ ┌──────────────────────────────┐ │
│ │ Top App Bar │ │
│ │ • Undo/Redo │ │
│ │ • Save/Export │ │
│ └──────────────────────────────┘ │
└────────────────────────────────────┘
User selects feature from Side Panel
│
┌───────────┴────────────┬─────────────────┐
▼ ▼ ▼
Adjust Mode AI Tool Mode Transform Mode
│ │ │
▼ ▼ ▼
Load AdjustScreen Initialize Model Enter Crop/Rotate
│ │ │
▼ ▼ ▼
Display sliders Wait for user input Interactive overlay
│ │ │
▼ ▼ ▼
Real-time preview Process inference Preview transform
│ │ │
▼ ▼ ▼
Apply on confirm Update image Apply on confirm
Original Image URI
│
▼
Load to Bitmap
│
▼
┌───────────────────────────────────┐
│ Processing Selection │
├───────────────────────────────────┤
│ │
│ ┌─────────────────────────────┐ │
│ │ On-Device Processing │ │
│ │ │ │
│ │ • Adjustments (GPU) │ │
│ │ • LaMa Inpainting │ │
│ │ • edgeSAM Segmentation │ │
│ │ • Auto Enhance │ │
│ └──────────┬──────────────────┘ │
│ │ │
│ ▼ │
│ Process locally │
│ │ │
│ ▼ │
│ Result Bitmap │
│ │
│ ┌─────────────────────────────┐ │
│ │ Cloud Processing │ │
│ │ │ │
│ │ • Smart Fill │ │
│ │ • Harmonization │ │
│ │ • AI Retouch │ │
│ └──────────┬──────────────────┘ │
│ │ │
│ ▼ │
│ Upload via Retrofit │
│ │ │
│ ▼ │
│ Wait for response │
│ │ │
│ ▼ │
│ Download result │
└───────────┬─────────────────────┬─┘
│ │
▼ ▼
Save to cache Update UI
│ │
▼ ▼
Generate URI Display result
User Interaction
│
▼
Event Handler in Composable
│
▼
Call ViewModel method
│
▼
┌────────────────────────────────┐
│ PhotoEditorViewModel │
│ │
│ 1. Validate action │
│ 2. Update state variables │
│ 3. Trigger coroutine │
│ 4. Launch processing │
│ 5. Handle result │
│ 6. Update state again │
└────────┬───────────────────────┘
│
▼
State changes trigger recomposition
│
▼
UI updates automatically
│
▼
User sees result
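A condensed sketch of this loop follows; the real PhotoEditorViewModel manages far more state than this single flow:

```kotlin
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.launch

data class EditorUiState(val isProcessing: Boolean = false, val resultUri: String? = null)

// `process` stands in for a repository or ML-executor call.
class EditorViewModelSketch(
    private val process: suspend (String) -> String
) : ViewModel() {
    private val _uiState = MutableStateFlow(EditorUiState())
    val uiState: StateFlow<EditorUiState> = _uiState.asStateFlow()

    fun onApplyEdit(inputUri: String) {
        _uiState.value = _uiState.value.copy(isProcessing = true)  // update state
        viewModelScope.launch {                                    // trigger coroutine
            val result = process(inputUri)                         // launch processing
            _uiState.value = EditorUiState(resultUri = result)     // handle result
        }
    }
}
```

Compose collects uiState (e.g., via collectAsStateWithLifecycle), so each emission triggers recomposition of the affected UI.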
Initial State: currentImageUri = null, history = []
│
▼
User loads image → history = [uri1]
│
▼
User applies adjustment → history = [uri1, uri2]
│
▼
User applies crop → history = [uri1, uri2, uri3]
│
▼
User clicks Undo
│
▼
Pop from history → Display uri2
│
▼
User clicks Undo again
│
▼
Pop from history → Display uri1
│
▼
User applies new edit
│
▼
Clear forward history → history = [uri1, uri4]
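A sketch of the linear history behavior described above, where any new edit discards the forward entries:

```kotlin
// Generic undo stack with a movable cursor; T would be an image Uri here.
class EditHistorySketch<T> {
    private val entries = mutableListOf<T>()
    private var cursor = -1

    fun push(entry: T) {
        // Drop redo entries beyond the cursor, then append the new state.
        while (entries.size > cursor + 1) entries.removeAt(entries.size - 1)
        entries.add(entry)
        cursor = entries.lastIndex
    }

    fun undo(): T? = if (cursor > 0) entries[--cursor] else null
    fun current(): T? = entries.getOrNull(cursor)
}
```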
Kortex/
│
├── app/
│ ├── build.gradle.kts Build configuration
│ ├── proguard-rules.pro Code obfuscation rules
│ │
│ └── src/
│ ├── main/
│ │ ├── AndroidManifest.xml
│ │ │
│ │ ├── java/test1/example/finalapp/
│ │ │ │
│ │ │ ├── MainActivity.kt Main entry point
│ │ │ ├── PhotoEditorViewModel.kt Central state manager
│ │ │ ├── LamaExecutor.kt LaMa model executor
│ │ │ ├── ZimExecutor.kt edgeSAM model executor
│ │ │ │
│ │ │ ├── View/
│ │ │ │ ├── screens/
│ │ │ │ │ └── PhotoEditorScreen.kt Main editor UI
│ │ │ │ │
│ │ │ │ └── components/
│ │ │ │ ├── BackgroundRemovalScreen.kt
│ │ │ │ ├── MoveObjectPlacementScreen.kt
│ │ │ │ ├── ManualMaskPainter.kt
│ │ │ │ ├── CropComponents.kt
│ │ │ │ ├── AspectRatioCropEngine.kt
│ │ │ │ ├── CornerCropEngine.kt
│ │ │ │ ├── RotationEngine.kt
│ │ │ │ ├── WatermarkComponents.kt
│ │ │ │ ├── LSBWatermarkScreen.kt
│ │ │ │ ├── TextSticker.kt
│ │ │ │ ├── TextEditingScreen.kt
│ │ │ │ ├── SidePanel.kt
│ │ │ │ ├── TopAppBar.kt
│ │ │ │ └── GenerativeFillDialog.kt
│ │ │ │
│ │ │ ├── ui/
│ │ │ │ ├── theme/
│ │ │ │ │ ├── Color.kt
│ │ │ │ │ ├── Theme.kt
│ │ │ │ │ ├── Type.kt
│ │ │ │ │ └── AppTheme.kt
│ │ │ │ │
│ │ │ │ └── adjust/
│ │ │ │ ├── AdjustSheetComposable.kt
│ │ │ │ └── AdjustIntegrationExample.kt
│ │ │ │
│ │ │ ├── component/
│ │ │ │ └── adjust/
│ │ │ │ ├── AdjustEngine.kt GPU filter engine
│ │ │ │ ├── AdjustViewModel.kt
│ │ │ │ ├── AdjustParams.kt
│ │ │ │ ├── PreviewRenderer.kt
│ │ │ │ ├── CurveEditorView.kt
│ │ │ │ └── ToneCurveData.kt
│ │ │ │
│ │ │ ├── ml/
│ │ │ │ └── AutoEnhanceExecutor.kt Auto enhance model
│ │ │ │
│ │ │ ├── audio/
│ │ │ │ └── OfflineSpeechManager.kt Vosk integration
│ │ │ │
│ │ │ ├── data/
│ │ │ │ ├── api/
│ │ │ │ │ ├── RetrofitClient.kt
│ │ │ │ │ ├── RetouchApiService.kt
│ │ │ │ │ ├── CloudEditNetwork.kt
│ │ │ │ │ └── CloudEditApiService.kt
│ │ │ │ │
│ │ │ │ ├── repository/
│ │ │ │ │ ├── RetouchRepository.kt
│ │ │ │ │ └── CloudEditRepository.kt
│ │ │ │ │
│ │ │ │ └── model/
│ │ │ │ ├── RetouchModels.kt
│ │ │ │ └── AdjustState.kt
│ │ │ │
│ │ │ ├── utils/
│ │ │ │ ├── ImageUtils.kt Image operations
│ │ │ │ ├── AIWatermarkHelper.kt Watermarking
│ │ │ │ ├── LSBWatermarkUtil.kt Steganography
│ │ │ │ ├── RetouchUtils.kt
│ │ │ │ └── FontManager.kt
│ │ │ │
│ │ │ └── model/
│ │ │ └── AiEditState.kt
│ │ │
│ │ ├── assets/
│ │ │ ├── lama_fp32.onnx LaMa model
│ │ │ ├── sam_encoder.onnx edgeSAM encoder
│ │ │ ├── sam_decoder.onnx edgeSAM decoder
│ │ │ ├── analyzer_8param_v2.onnx Auto enhance analyzer
│ │ │ ├── hdrnet_fixer_safe.onnx Auto enhance fixer
│ │ │ ├── vosk-model-small-en-us-0.15/ Speech model
│ │ │ └── fonts/ Custom fonts
│ │ │
│ │ └── res/
│ │ ├── values/
│ │ ├── drawable/
│ │ ├── mipmap/
│ │ └── xml/
│ │
│ ├── androidTest/ Instrumented tests
│ └── test/ Unit tests
│
├── gradle/
│ ├── libs.versions.toml Dependency catalog
│ └── wrapper/
│ ├── gradle-wrapper.jar
│ └── gradle-wrapper.properties
│
├── build.gradle.kts Root build script
├── settings.gradle.kts Project settings
├── gradle.properties Gradle configuration
├── gradlew Gradle wrapper (Unix)
├── gradlew.bat Gradle wrapper (Windows)
└── local.properties Local SDK path
Before setting up the project, ensure you have the following installed:
1. Development Environment
- Android Studio Iguana (2023.2.1) or newer
- JDK 17 or higher (required by the Android Gradle Plugin 8.x)
- Minimum 8 GB RAM (16 GB recommended)
- At least 10 GB free disk space
2. Android SDK
- Android SDK Platform 34
- Android SDK Build-Tools 34.0.0 or higher
- Android SDK Platform-Tools
- Android Emulator (if testing on emulator)
3. Git
- Git version control system
Extract the contents of the ZIP file and move the Kortex directory to a convenient location.
1. Create local.properties file
Create a file named local.properties in the root directory with your Android SDK path:
sdk.dir=C:\\Users\\YourUsername\\AppData\\Local\\Android\\Sdk
On macOS/Linux:
sdk.dir=/Users/YourUsername/Library/Android/sdk
2. Verify Model Files
Ensure all ONNX model files are present in app/src/main/assets/:
- lama_fp32.onnx
- sam_encoder.onnx
- sam_decoder.onnx
- analyzer_8param_v2.onnx
- hdrnet_fixer_safe.onnx
- vosk-model-small-en-us-0.15/ (directory with model files)
3. Configure API Endpoints (Optional)
If using cloud features, update the API base URLs in:
- app/src/main/java/test1/example/finalapp/data/api/RetrofitClient.kt
- app/src/main/java/test1/example/finalapp/data/api/CloudEditNetwork.kt
Open the project in Android Studio and let Gradle sync automatically. If it doesn't start:
- Click "File" > "Sync Project with Gradle Files"
- Wait for dependencies to download
- Resolve any errors that appear
All dependencies are managed through Gradle and will be downloaded automatically during sync. Key dependencies include:
- Jetpack Compose libraries
- ONNX Runtime Android
- Retrofit and OkHttp
- GPUImage
- Vosk Android
- Coil image loader
The project supports two build variants:
Debug Build
- Includes debugging symbols
- Logging enabled
- No code obfuscation
- Faster build times
Release Build
- Code optimization enabled
- ProGuard rules applied
- Smaller APK size
- Production-ready
1. Create/Start Emulator
In Android Studio:
- Tools > Device Manager
- Create a new virtual device or start existing one
- Recommended: Pixel 5 with API 34 (Android 14)
- Set the Graphics option to "Hardware" (enables GPU acceleration)
2. Run the App
Click the "Run" button (green play icon)
or
Select Run > Run 'app'
or
Press Shift+F10 (Windows/Linux) or Control+R (macOS)
1. Enable Developer Options
On your Android device:
- Go to Settings > About Phone
- Tap "Build Number" 7 times
- Go back to Settings > System > Developer Options
- Enable "USB Debugging"
2. Connect Device
- Connect device via USB
- Accept USB debugging prompt on device
- Device should appear in Android Studio device dropdown
3. Run the App
Select your device from the dropdown and click Run
Debug APK
./gradlew assembleDebug
Output location: app/build/outputs/apk/debug/app-debug.apk
Release APK
./gradlew assembleRelease
Output location: app/build/outputs/apk/release/app-release.apk
Note: Release builds require signing configuration
macOS/Linux
./gradlew clean
./gradlew assembleDebug
Windows
gradlew.bat clean
gradlew.bat assembleDebug
Out of Memory Error
Increase heap size in gradle.properties:
org.gradle.jvmargs=-Xmx4096m -Dfile.encoding=UTF-8
Duplicate Class Errors
The project already handles libc++_shared.so conflicts in build.gradle.kts via a pickFirsts configuration.
Model Loading Failures
Ensure all ONNX files are in the assets folder and are not compressed by adding the following to build.gradle.kts (Kotlin DSL form; the older aaptOptions block is deprecated):
android {
    androidResources {
        noCompress += "onnx"
    }
}
NNAPI Errors
Some devices may not support NNAPI. The code gracefully falls back to CPU execution
Device
- Android 7.0 (API 24) or higher
- 3 GB RAM
- 1 GB free storage
- ARMv7 or ARM64 processor
Permissions
- READ_EXTERNAL_STORAGE
- WRITE_EXTERNAL_STORAGE (Android 9 and below)
- READ_MEDIA_IMAGES (Android 13+)
- RECORD_AUDIO (for voice commands)
- CAMERA (for camera capture)
- INTERNET (for cloud features)
Device
- Android 12.0 (API 31) or higher
- 6 GB RAM or more
- 2 GB free storage
- ARM64 processor
- GPU with OpenGL ES 3.0 or higher
- NNAPI support for hardware acceleration
Performance Notes
Model inference times (approximate, varies by device):
Budget Device (Snapdragon 600 series):
├── LaMa Inpainting: 3-5 seconds
├── edgeSAM Encoder: 2-3 seconds
├── edgeSAM Decoder: 200-300 ms
└── Auto Enhance: 1-2 seconds
Mid-range Device (Snapdragon 700 series):
├── LaMa Inpainting: 1-2 seconds
├── edgeSAM Encoder: 1-1.5 seconds
├── edgeSAM Decoder: 100-150 ms
└── Auto Enhance: 500-800 ms
Flagship Device (Snapdragon 8 series):
├── LaMa Inpainting: 0.5-1 second
├── edgeSAM Encoder: 0.5-0.8 seconds
├── edgeSAM Decoder: 50-80 ms
└── Auto Enhance: 200-400 ms
App Size
- APK: Approximately 150-200 MB
- ONNX Models: ~180 MB
- Vosk Model: ~50 MB
- Code and Resources: ~20 MB
Runtime Storage
- Cache: 50-500 MB (varies with usage)
- Temporary files: 100-1000 MB (high-resolution editing)
- User images: Depends on usage
Optional (for cloud features)
- Stable internet connection
- Recommended: 5 Mbps or higher
- Cloud API access
Fully Functional Offline
- All core editing features
- On-device AI models
- Voice recognition
- No internet required for basic operation