
Releases: Liquid4All/on-prem-stack

`liquidai-cli@0.0.1b0`

12 May 06:48
c617b9d

Pre-release

LFM-1B-6GB@v0.0.2

04 May 15:11

Summary

This is the stack for LFM-1B, which can run with 6 GB of GPU memory.

How to run for the first time

  • Download Source code (zip) below.
  • Unzip the file into an empty folder.
  • Run launch.sh.

How to upgrade

In .env, make these updates:

Variable        Value
STACK_VERSION   2685ff757d-0312
MODEL_IMAGE     liquidai/lfm-1b-e:0.0.1
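
For reference, the resulting lines in .env would look like this (standard KEY=value syntax is assumed here):

    # .env (KEY=value syntax assumed)
    STACK_VERSION=2685ff757d-0312
    MODEL_IMAGE=liquidai/lfm-1b-e:0.0.1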

In docker-compose.yaml, make these changes:

Argument                  Value
--max-model-len           "2048"
--max-seq-len-to-capture  "2048"
--max-num-seqs            "100"
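
As a rough sketch, these arguments would sit under the model service's command in docker-compose.yaml; the service name (vllm) and surrounding layout below are assumptions, so match them to your actual file:

    # Sketch only: the service name and surrounding keys are assumptions.
    services:
      vllm:
        command:
          # ... existing arguments ...
          - --max-model-len
          - "2048"
          - --max-seq-len-to-capture
          - "2048"
          - --max-num-seqs
          - "100"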

Then run launch.sh.

If the model container throws an out-of-memory error, decrease these arguments further, keep max-seq-len-to-capture equal to max-model-len, and run launch.sh to retry.
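
For example, halving both context-related values would look like this (1024 is only an illustrative choice; pick sizes that fit your GPU):

    # Illustrative values only; keep the two context arguments equal.
    --max-model-len           "1024"
    --max-seq-len-to-capture  "1024"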

How to test

  • After running launch.sh, wait up to 2 minutes for model initialization, then run test-api.sh.
    • This script triggers a smoke test to verify that the inference server is running correctly; a manual alternative is sketched below.
  • Visit http://0.0.0.0:3000 to chat with the model in a web UI.
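
If you want to poke the API by hand instead of using test-api.sh, a request along the following lines should work, assuming the stack exposes an OpenAI-compatible chat endpoint; the port (8000) and the model name are assumptions, so adjust them to your deployment:

    # Port 8000 and the model name "lfm-1b" are assumptions; adjust to your deployment.
    curl http://0.0.0.0:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "lfm-1b",
            "messages": [{"role": "user", "content": "Hello"}]
          }'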

Full Changelog: https://github.com/Liquid4All/on-prem-stack/compare/1b-6gb@0.0.1...1b-6gb@0.0.2

v0.2.0

28 Mar 08:05
92ce553

Summary

Important

This version has a breaking change: the web, python-api, and vllm services can now each run a different Docker image version. The stack automatically upgrades web and python-api to the latest version; vllm can still only be upgraded manually.

What's Changed

Full Changelog: 0.1.0...0.2.0

v0.1.0

13 Mar 22:04
035546e

Summary

Important

This version has breaking changes. The on-prem stack now needs a config.yaml to run properly.

What's Changed

  • Get default model from yaml config by @tuliren in #17

Full Changelog: 0.0.4...0.1.0

v0.0.4

13 Mar 22:01

What's Changed

Full Changelog: 0.0.3...0.0.4

LFM-3B-JP v0.0.3

18 Jan 07:38
b276d59

Summary

This is the stack for LFM-3B-JP. Two models are available: lfm-3b-jp and lfm-3b-ichikara.

How to run for the first time

  • Download Source code (zip) below.
  • Unzip the file into an empty folder.
  • Run launch.sh.

Models

Currently, each on-prem stack can run only one model at a time. The launch script runs lfm-3b-jp by default. To switch models, run ./switch-model.sh and select the desired model. The script will stop the current model and start the newly chosen one.

Update

To update the stack, change STACK_VERSION and MODEL_IMAGE in the .env file and run the launch script again.

How to test

  • After running launch.sh, wait up to 2 minutes for model initialization, then run test-api.sh.
    • This script triggers a smoke test to verify that the inference server is running correctly.
  • Visit http://0.0.0.0:3000 to chat with the model in a web UI.

LFM-3B-JP v0.0.2

18 Jan 06:59

Summary

This is the stack for LFM-3B-JP.

How to run for the first time

  • Download Source code (zip) below.
  • Unzip the file into an empty folder.
  • Run launch.sh.

Models

Currently, each on-prem stack can run only one model at a time. The launch script runs lfm-3b-jp by default. To switch models, change MODEL_IMAGE in the .env file according to the table below, and run ./launch.sh again.

Model Image
liquidai/lfm-3b-jp:0.0.1-e
liquidai/lfm-3b-ichikara:0.0.1-e
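
For example, to switch to the Ichikara model, set the following in .env and rerun ./launch.sh (standard KEY=value .env syntax is assumed here):

    # .env (KEY=value syntax assumed)
    MODEL_IMAGE=liquidai/lfm-3b-ichikara:0.0.1-e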

Update

To update the stack, change STACK_VERSION and MODEL_IMAGE in the .env file and run the launch script again.

How to test

  • After running launch.sh, wait up to 2 minutes for model initialization, then run test-api.sh.
    • This script triggers a smoke test to verify that the inference server is running correctly.
  • Visit http://0.0.0.0:3000 to chat with the model in a web UI.

LFM-3B-JP v0.0.1

18 Dec 03:16

Summary

This is the stack for LFM-3B-JP.

How to run for the first time

  • Download Source code (zip) below.
  • Unzip the file into an empty folder.
  • Run launch.sh.

Models

Currently, each on-prem stack can run only one model at a time. The launch script runs lfm-3b-jp by default. To switch models, change MODEL_NAME and MODEL_IMAGE in the .env file according to the table below, and run ./launch.sh again.

Model Name       Model Image
lfm-3b-jp        liquidai/lfm-3b-jp:0.0.1-e
lfm-3b-ichikara  liquidai/lfm-3b-ichikara:0.0.1-e
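
For example, to run the Ichikara model, the corresponding .env entries would be (standard KEY=value syntax is assumed here):

    # .env (KEY=value syntax assumed)
    MODEL_NAME=lfm-3b-ichikara
    MODEL_IMAGE=liquidai/lfm-3b-ichikara:0.0.1-e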

Update

To update the stack, change STACK_VERSION and MODEL_IMAGE in the .env file and run the launch script again.

How to test

  • After running launch.sh, wait up to 2 minutes for model initialization, then run test-api.sh.
    • This script triggers a smoke test to verify that the inference server is running correctly.
  • Visit http://0.0.0.0:3000 to chat with the model in a web UI.

LFM-1B-6GB@v0.0.1

11 Dec 09:35

Summary

This is the stack for LFM-1B, which can run with 6 GB of GPU memory.

How to run for the first time

  • Download Source code (zip) below.
  • Unzip the file into an empty folder.
  • Run launch.sh.

How to upgrade

In .env, make these updates:

Variable        Value
STACK_VERSION   2685ff757d
MODEL_IMAGE     liquidai/lfm-1be:0.0.1

In docker-compose.yaml, make these changes:

Argument                  Value
--max-model-len           "2048"
--max-seq-len-to-capture  "2048"
--max-num-seqs            "100"

Then run launch.sh.

If the model container throws an out-of-memory error, decrease these arguments further, keep max-seq-len-to-capture equal to max-model-len, and run launch.sh to retry.

How to test

  • After running launch.sh, wait up to 2 minutes for model initialization, then run test-api.sh.
    • This script triggers a smoke test to verify that the inference server is running correctly.
  • Visit http://0.0.0.0:3000 to chat with the model in a web UI.

v0.0.3

26 Nov 10:30

How to run for the first time

  • Download Source code (zip) below.
  • Unzip the file into an empty folder.
  • Run launch.sh.

How to upgrade

  • Download Source code (zip) below.
  • Unzip the file into the current deployment folder, overwriting all existing files.
  • Make sure to keep the existing .env file.
  • In the .env file, update STACK_VERSION to 2b3f969864 and MODEL_IMAGE to liquidai/lfm-3be:0.0.6.
    • Please note that all previous versions have been removed.
  • Run launch.sh.

How to test

  • After running launch.sh, wait up to 2 minutes for model initialization, then run test-api.sh.
    • This script triggers a smoke test to verify that the inference server is running correctly.
  • Visit http://0.0.0.0:3000 to chat with the model in a web UI.

What's Changed

  • More robust Python backend.
  • Updated 3B model.

Full Changelog: 0.0.2...0.0.3