gcloud ml-engine predict is very slow on inference


I'm testing a segmentation model on Google Cloud, and inference is incredibly slow: it takes about 3 minutes to get a result (averaged over 5 runs). The same model runs in ~2.5 s on my laptop when served through TensorFlow Serving. Is this normal? I found no mention in the documentation of how to define the instance type, and it seems impossible to run inference on a GPU. The steps I'm using are straightforward and follow the examples and tutorials:

gcloud ml-engine models create "seg_model"

gcloud ml-engine versions create v1 \
    --model "seg_model" \
    --origin $DEPLOYMENT_SOURCE \
    --runtime-version 1.2 \
    --staging-bucket gs://$BUCKET_NAME

gcloud ml-engine predict --model ${MODEL_NAME} --version v1 --json-instances request.json
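The contents of request.json are not shown above. A minimal sketch of building it for --json-instances (which expects one JSON object per line) might look like the following; the input key "image_bytes" and the {"b64": ...} wrapper are assumptions about the model's serving signature (ML Engine decodes b64-wrapped strings into raw bytes for string-typed inputs), and the placeholder file stands in for a real PNG:

```shell
# Stand-in for a real input image; replace with the actual PNG.
printf 'placeholder image bytes' > input.png

# Base64-encode the image as a single line (tr strips wrapping newlines).
B64=$(base64 input.png | tr -d '\n')

# Write one JSON instance per line, as --json-instances expects.
printf '{"image_bytes": {"b64": "%s"}}\n' "$B64" > request.json
```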

UPD: after running more experiments, I found that redirecting the output to a file brings inference time down to 27 s. The model's output is 512x512, and printing it causes delays on the client side. Although far lower than 3 minutes, this is still an order of magnitude slower than TF Serving.
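Concretely, the redirect that produced the 27 s timing is just a shell redirection of stdout; response.json is a hypothetical filename:

```shell
# Write the prediction response to a file instead of the terminal;
# rendering the large 512x512 response in the terminal is what
# dominated the 3-minute client-side timing.
gcloud ml-engine predict \
    --model ${MODEL_NAME} \
    --version v1 \
    --json-instances request.json > response.json
```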

