Thanks for this great example!
To help with cold starts, I ran some experiments with provisioned concurrency: lazy-loading the transformers module inside sentiment, and priming the sentiment function when provisioned concurrency is enabled. This reduced the cold-start time to 1 to 2 seconds, and the first predict call completes in about 1 second.
import json
import os

def sentiment(payload):
    # Lazy import: the heavy transformers load happens here, not at module init
    from transformers import pipeline

    clf = pipeline("sentiment-analysis", model="model/")
    prediction = clf(payload, return_all_scores=True)
    # Convert the list of {label, score} entries to a dict keyed by label
    result = {}
    for pred in prediction[0]:
        result[pred["label"]] = pred["score"]
    return result

# Prime the sentiment function for provisioned concurrency
init_type = os.environ.get("AWS_LAMBDA_INITIALIZATION_TYPE", "on-demand")
if init_type == "provisioned-concurrency":
    payload = json.dumps({"fn_index": 0, "data": [
        "Running Gradio on AWS Lambda is amazing"], "session_hash": "fpx8ngrma3d"})
    sentiment(payload)
