gcloud ml-engine predict is very slow on inference
I'm testing a segmentation model on gcloud, and inference is incredibly slow: it takes 3 minutes to get a result (averaged over 5 runs). The same model runs in ~2.5 s on my laptop when served through TF Serving. Is this normal? I didn't find any mention in the documentation of how to define the instance type, and it seems impossible to run inference on a GPU. The steps I'm using are straightforward and follow the examples and tutorials:
    gcloud ml-engine models create "seg_model"

    gcloud ml-engine versions create v1 \
        --model "seg_model" \
        --origin $deployment_source \
        --runtime-version 1.2 \
        --staging-bucket gs://$bucket_name

    gcloud ml-engine predict --model ${model_name} --version v1 --json-instances request.json
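For reference, a minimal sketch of how a request file like this can be built for a single image. The input alias image_bytes is just an assumption for illustration: Cloud ML Engine treats aliases ending in _bytes as binary and expects base64 data wrapped in a {"b64": ...} object, but the real alias is defined by your SavedModel's serving signature.

    # Hypothetical request builder: wraps one PNG as a base64-encoded instance.
    # "image_bytes" is a placeholder input alias; check the real one with
    # saved_model_cli show --dir $deployment_source --all
    # --json-instances expects one JSON object per line.
    python -c 'import base64, json; img = base64.b64encode(open("input.png", "rb").read()).decode("ascii"); print(json.dumps({"image_bytes": {"b64": img}}))' > request.json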
UPD: After running more experiments I found that redirecting the output to a file gets inference time down to 27 s. The model output size is 512x512, which apparently causes the delay on the client side. Although much lower than 3 min, it is still an order of magnitude slower than TF Serving.
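Concretely, the two measurements above come from something like the following, using the shell's time builtin (exact numbers will of course vary):

    # Response printed to the terminal: ~3 min for me; rendering the
    # 512x512 output in the terminal dominates the wall-clock time.
    time gcloud ml-engine predict --model ${model_name} --version v1 --json-instances request.json

    # Same call with the response redirected to a file: ~27 s for me.
    time gcloud ml-engine predict --model ${model_name} --version v1 --json-instances request.json > response.json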