Releases: Liquid4All/on-prem-stack
`liquidai-cli@0.0.1b0`
LFM-1B-6GB@v0.0.2
Summary
This is the stack for LFM-1B that can run with 6GB GPU memory.
How to run for the first time
- Download Source code (zip) below.
- Unzip the file into an empty folder.
- Run `launch.sh`.
How to upgrade
In `.env`, make these updates:

| Variable | Value |
|---|---|
| `STACK_VERSION` | `2685ff757d-0312` |
| `MODEL_IMAGE` | `liquidai/lfm-1b-e:0.0.1` |
In `docker-compose.yaml`, make these changes:

| Argument | Value |
|---|---|
| `--max-model-len` | `"2048"` |
| `--max-seq-len-to-capture` | `"2048"` |
| `--max-num-seqs` | `"100"` |

Then run `launch.sh`.
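Put together, the two files should contain entries along these lines. This is a sketch: the rest of `.env` stays as it is, and the exact layout of the vLLM arguments in `docker-compose.yaml` (shown here as a `command` list) may differ in your copy.

```bash
# .env — stack and model versions for this release
STACK_VERSION=2685ff757d-0312
MODEL_IMAGE=liquidai/lfm-1b-e:0.0.1
```

```yaml
# docker-compose.yaml (model service) — vLLM launch arguments for a 6GB GPU
command:
  - --max-model-len
  - "2048"
  - --max-seq-len-to-capture
  - "2048"
  - --max-num-seqs
  - "100"
```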
If the model container throws an out-of-memory error, decrease these arguments further, keep `--max-seq-len-to-capture` equal to `--max-model-len`, and run `launch.sh` to retry.
How to test
- After running `launch.sh`, wait up to 2 minutes for model initialization, and run `test-api.sh`.
  - This script will trigger a smoke test to verify that the inference server is running correctly.
- Visit `0.0.0.0:3000` and chat with the model in a web UI.
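Besides `test-api.sh`, the API can be probed manually. The sketch below assumes the inference server exposes an OpenAI-compatible `/v1/chat/completions` endpoint on localhost port 8000 and that the model is registered as `lfm-1b`; the port, model name, and API key are assumptions, so take the real values from `test-api.sh` and your `.env`.

```bash
# Hypothetical manual smoke test against the OpenAI-compatible endpoint.
# Port, model name, and $API_KEY are placeholders — check .env / test-api.sh for the actual values.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
        "model": "lfm-1b",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64
      }'
```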
Full Changelog: https://github.com/Liquid4All/on-prem-stack/compare/1b-6gb@0.0.1...1b-6gb@0.0.2
v0.2.0
Summary
Important
This version has a breaking change. The web, python-api, and vllm services can now each use a different Docker image version. The stack automatically upgrades web and python-api to the latest version, while vllm remains upgradable only manually.
What's Changed
- Support VLM by @tuliren in #18
- Add local image as default vlm input by @tuliren in #19
- Mount local files for checkpoint by @tuliren in #20
- Use separate image versions by @tuliren in #21
Full Changelog: 0.1.0...0.2.0
v0.1.0
v0.0.4
What's Changed
- Add script to run vLLM for any HF model by @tuliren in #5
- Run local checkpoint by @tuliren in #6
- Pass in HF token for gated repository by @tuliren in #7
- Launch model checkpoint with one parameter by @tuliren in #8
- Add more vLLM launch parameters by @tuliren in #9
- Support 7B models by @tuliren in #11
- Use fixed database password and api key by @tuliren in #12
- Update service dependencies by @tuliren in #13
Full Changelog: 0.0.3...0.0.4
LFM-3B-JP v0.0.3
Summary
This is the stack for LFM-3B-JP. Two models are available: lfm-3b-jp and lfm-3b-ichikara.
How to run for the first time
- Download Source code (zip) below.
- Unzip the file into an empty folder.
- Run `launch.sh`.
Models
Currently, each on-prem stack can only run one model at a time. The launch script runs `lfm-3b-jp` by default. To switch models, run `./switch-model.sh` and select the desired model. The script will then stop the current model and start the newly chosen model.
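For example, switching to the Ichikara model might look like the following; the interactive prompt described here is illustrative, not the script's exact output.

```bash
# Run from the deployment folder; the script lists the available models and asks for a choice.
./switch-model.sh
# Select lfm-3b-ichikara when prompted — the current model is stopped and the chosen one is started.
```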
Update
To update the stack, change STACK_VERSION and MODEL_IMAGE in the .env file and run the launch script again.
How to test
- After running `launch.sh`, wait up to 2 minutes for model initialization, and run `test-api.sh`.
  - This script will trigger a smoke test to verify that the inference server is running correctly.
- Visit `0.0.0.0:3000` and chat with the model in a web UI.
LFM-3B-JP v0.0.2
Summary
This is the stack for LFM-3B-JP.
How to run for the first time
- Download Source code (zip) below.
- Unzip the file into an empty folder.
- Run `launch.sh`.
Models
Currently, each on-prem stack can only run one model at a time. The launch script runs `lfm-3b-jp` by default. To switch models, change `MODEL_IMAGE` in the `.env` file according to the table below, and run `./launch.sh` again.
| Model Image |
|---|
| `liquidai/lfm-3b-jp:0.0.1-e` |
| `liquidai/lfm-3b-ichikara:0.0.1-e` |
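For example, to run the Ichikara model, the relevant `.env` entry would look like this (a minimal excerpt; the rest of `.env` stays unchanged), followed by `./launch.sh`:

```bash
# .env — pick the model image the stack should run
MODEL_IMAGE=liquidai/lfm-3b-ichikara:0.0.1-e
```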
Update
To update the stack, change STACK_VERSION and MODEL_IMAGE in the .env file and run the launch script again.
How to test
- After running `launch.sh`, wait up to 2 minutes for model initialization, and run `test-api.sh`.
  - This script will trigger a smoke test to verify that the inference server is running correctly.
- Visit `0.0.0.0:3000` and chat with the model in a web UI.
LFM-3B-JP v0.0.1
Summary
This is the stack for LFM-3B-JP.
How to run for the first time
- Download Source code (zip) below.
- Unzip the file into an empty folder.
- Run `launch.sh`.
Models
Currently, each on-prem stack can only run one model at a time. The launch script runs `lfm-3b-jp` by default. To switch models, change `MODEL_NAME` and `MODEL_IMAGE` in the `.env` file according to the table below, and run `./launch.sh` again.
| Model Name | Model Image |
|---|---|
| `lfm-3b-jp` | `liquidai/lfm-3b-jp:0.0.1-e` |
| `lfm-3b-ichikara` | `liquidai/lfm-3b-ichikara:0.0.1-e` |
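For example, to run the Ichikara model, the two `.env` entries would be (a minimal excerpt; both variables must refer to the same model):

```bash
# .env — model selection for this stack
MODEL_NAME=lfm-3b-ichikara
MODEL_IMAGE=liquidai/lfm-3b-ichikara:0.0.1-e
```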
Update
To update the stack, change STACK_VERSION and MODEL_IMAGE in the .env file and run the launch script again.
How to test
- After running `launch.sh`, wait up to 2 minutes for model initialization, and run `test-api.sh`.
  - This script will trigger a smoke test to verify that the inference server is running correctly.
- Visit `0.0.0.0:3000` and chat with the model in a web UI.
LFM-1B-6GB @ v0.0.1
Summary
This is the stack for LFM-1B that can run with 6GB GPU memory.
How to run for the first time
- Download Source code (zip) below.
- Unzip the file into an empty folder.
- Run `launch.sh`.
How to upgrade
In `.env`, make these updates:

| Variable | Value |
|---|---|
| `STACK_VERSION` | `2685ff757d` |
| `MODEL_IMAGE` | `liquidai/lfm-1be:0.0.1` |
In `docker-compose.yaml`, make these changes:

| Argument | Value |
|---|---|
| `--max-model-len` | `"2048"` |
| `--max-seq-len-to-capture` | `"2048"` |
| `--max-num-seqs` | `"100"` |

Then run `launch.sh`.
If the model container throws an out-of-memory error, decrease these arguments further, keep `--max-seq-len-to-capture` equal to `--max-model-len`, and run `launch.sh` to retry.
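If further reduction is needed, the vLLM arguments in `docker-compose.yaml` could be lowered along these lines. This is a hypothetical excerpt: it assumes the model service passes the flags as a `command` list, and 1024/50 are illustrative values, not recommendations.

```yaml
# docker-compose.yaml (model service) — example of further reduced limits after an OOM
command:
  - --max-model-len
  - "1024"              # keep equal to --max-seq-len-to-capture
  - --max-seq-len-to-capture
  - "1024"
  - --max-num-seqs
  - "50"
```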
How to test
- After running `launch.sh`, wait up to 2 minutes for model initialization, and run `test-api.sh`.
  - This script will trigger a smoke test to verify that the inference server is running correctly.
- Visit `0.0.0.0:3000` and chat with the model in a web UI.
v0.0.3
How to run for the first time
- Download Source code (zip) below.
- Unzip the file into an empty folder.
- Run `launch.sh`.
How to upgrade
- Download Source code (zip) below.
- Unzip the file into the current deployment folder, overwriting all existing files.
- Make sure to keep the existing `.env` file.
- In the `.env` file, update `STACK_VERSION` to `2b3f969864` and `MODEL_IMAGE` to `liquidai/lfm-3be:0.0.6` (see the excerpt after this list).
  - Please note that all previous versions have been removed.
- Run `launch.sh`.
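The updated `.env` entries should look like this (other variables stay as they are):

```bash
# .env — values for upgrading to this release
STACK_VERSION=2b3f969864
MODEL_IMAGE=liquidai/lfm-3be:0.0.6
```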
How to test
- After running `launch.sh`, wait up to 2 minutes for model initialization, and run `test-api.sh`.
  - This script will trigger a smoke test to verify that the inference server is running correctly.
- Visit `0.0.0.0:3000` and chat with the model in a web UI.
What's Changed
- More robust Python backend.
- Updated 3B model.
Full Changelog: 0.0.2...0.0.3