
5 minute read
September 2, 2022
Run Machine Learning ONNX models in one line of Python
TUTORIAL: How to deploy serverless GPU machine learning on Pipeline.ai
I’ve been working on a new serverless GPU inference feature on Pipeline.ai that I’m proud to be releasing this week.
You’ll be able to deploy ONNX-formatted machine learning models with a single line of Python and serve API inference calls on GPUs.
The simplicity and cost effectiveness of this service is really *chef’s kiss*. A year ago, as an ML app developer, I struggled with deployment and ended up settling for slow serverless CPU infra with long spin-up times; this is my gift to my past self.
I’ll go through a simple example of deploying an ML model to production via ONNX with the pipeline-ai Python package in five steps:
- Convert your model to ONNX format
- Create an account on pipeline.ai
- Install pipeline-ai python package
- Upload your model
- Make inference API calls
1. ONNX

Open Neural Network Exchange, see: https://onnx.ai/
In its own words, ONNX is:
“an open format built to represent machine learning models”
Developed by Microsoft, it aims to be a cross-platform standard for ML models, with conversion support for all major ML framework formats. If you’re looking to deploy ML models in production, it’s a really good idea to use ONNX. Not only can you easily get a 3x reduction in model size, but ONNX Runtime (ONNX’s inference engine) also has built-in optimisations that can speed up inference by as much as 17x.
ONNX is actively maintained (it recently extended its support for transformer models) and has great documentation. Here are examples for converting your own ML models into ONNX format. If you want to read more about ONNX Runtime, you can find more info here.
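If you’re starting from a PyTorch model, for example, the export typically looks something like the sketch below. The toy model, tensor shape and file name are placeholders for illustration, not the MODNet specifics used later in this tutorial:

import torch
import onnx

# Placeholder model and dummy input; swap in your own trained network
model = torch.nn.Sequential(torch.nn.Linear(16, 4))
model.eval()
dummy_input = torch.randn(1, 16)

# Export to ONNX, naming the graph input/output so they can be
# referenced later when making inference calls
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
)

# Sanity-check that the exported graph is well formed
onnx.checker.check_model(onnx.load("model.onnx"))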
2. Create an account on Pipeline.ai

Sign up for an account on pipeline.ai. Once you’re logged in, you can generate an API token from the dashboard under Settings → API Tokens; we’ll need it in step 4.
3. Install pipeline-ai python package

To install the latest version of pipeline-ai, make sure you have pip installed on Python 3.9. Then run the following command in a terminal:
pip install pipeline-ai
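To sanity-check the install, you can, for example, confirm the package is visible to pip and that it imports (these commands only inspect your local environment):

pip show pipeline-ai
python -c "from pipeline import PipelineCloud"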
4. Upload your model
Finally, we’re ready to do some programming. We’ll need the file path of our downloaded modnet.onnx file from earlier, and our API token, which we can get from the Pipeline dashboard under Settings → API Tokens.
from pipeline import PipelineCloud, onnx_to_pipeline

# This line creates a pipeline from our onnx file
onnx_pipeline = onnx_to_pipeline("MODNET_FILEPATH")

# Authenticate with PipelineCloud
api = PipelineCloud(token="YOUR_API_TOKEN")

# Upload pipeline to PipelineCloud
uploaded_pipeline = api.upload_pipeline(onnx_pipeline)
# Keep track of the returned pipeline id for making API calls
print(f"Uploaded pipeline: {uploaded_pipeline.id}")
5. Make inference API calls
The MODNet model removes the background from portrait photos. Here is the result from the example we’re about to run (you can download the starting image of Dr. Mike Levin here):
Portrait photo | Predicted alpha matte | Alpha composite
import cv2
import numpy as np
from PIL import Image
from pipeline import PipelineCloud

# read image
img = cv2.imread('IMAGE_FILEPATH')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
im_h, im_w, im_c = img.shape

def preprocessing(im):
    # determines input resolution to MODNet's model
    ref_size = 512

    # Get resized dim for MODNet input
    def get_resize(im_h, im_w, ref_size):
        if im_w >= im_h:
            im_rh = ref_size
            im_rw = int(im_w / im_h * ref_size)
        elif im_w < im_h:
            im_rw = ref_size
            im_rh = int(im_h / im_w * ref_size)

        im_rw = im_rw - im_rw % 32
        im_rh = im_rh - im_rh % 32

        return im_rw, im_rh

    # unify image channels to 3
    if len(im.shape) == 2:
        im = im[:, :, None]
    if im.shape[2] == 1:
        im = np.repeat(im, 3, axis=2)
    elif im.shape[2] == 4:
        im = im[:, :, 0:3]

    # normalize values to scale them between -1 and 1
    im = (im - 127.5) / 127.5
    # get resize dimensions for MODNet inference
    x, y = get_resize(im_h, im_w, ref_size)

    # resize image
    im = cv2.resize(im, (x, y), interpolation=cv2.INTER_AREA)
    # prepare input shape (NCHW)
    im = np.transpose(im)
    im = np.swapaxes(im, 1, 2)
    im = np.expand_dims(im, axis=0).astype('float32')

    return im

def post_processing(result, im):
    matte = (np.squeeze(result) * 255).astype('uint8')
    # resize matte to original image dim
    matte = cv2.resize(matte, (im_w, im_h), interpolation=cv2.INTER_AREA)

    def combined_display(image, matte):
        # calculate display resolution
        w, h = image.width, image.height
        rw, rh = 800, int(h * 800 / (3 * w))

        # obtain predicted foreground
        image = np.asarray(image)
        if len(image.shape) == 2:
            image = image[:, :, None]
        if image.shape[2] == 1:
            image = np.repeat(image, 3, axis=2)
        elif image.shape[2] == 4:
            image = image[:, :, 0:3]
        matte = np.repeat(np.asarray(matte)[:, :, None], 3, axis=2) / 255
        foreground = image * matte + np.full(image.shape, 255) * (1 - matte)

        # combine image, alpha matte, and foreground into one image
        combined = np.concatenate((image, matte * 255, foreground), axis=1)
        combined = Image.fromarray(np.uint8(combined)).resize((rw, rh))
        return combined

    # show composite
    combined_display(Image.fromarray(im), matte).show()

im = preprocessing(img)
# authenticate with api
api = PipelineCloud(token="YOUR_API_TOKEN")
# inference call
result_detailed = api.run_pipeline("PIPELINE_ID", [["output"], {"input": im}])
# get MODNet result without metadata
result = result_detailed['result_preview']
# create and show composite
post_processing(result, img)
A note on the “output_names” argument passed to ONNX Runtime: in PipelineCloud it should be given either as an empty list or as a list of the model’s explicit output name strings. You won’t have to worry about this in our example, but it matters if you are using your own ONNX model. For more information, you can check out the docs here.
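To make that concrete, here is a small sketch mirroring the run_pipeline call from the example above. It assumes a model whose single graph output is named "output" (as in the MODNet example), and shows the empty-list form as the alternative when you don’t want to name outputs explicitly:

# Ask for a specific, explicitly named output
result = api.run_pipeline("PIPELINE_ID", [["output"], {"input": im}])

# Or pass an empty list instead of naming the outputs
result = api.run_pipeline("PIPELINE_ID", [[], {"input": im}])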
Conclusion
It’s dangerous to go alone! Take this. MLOps is a dark and treacherous landscape, and finding your way through can be a gruelling endeavour. Pipeline.ai builds the tools that make that journey easier. Don’t hesitate to contact us with questions/suggestions. We’re very friendly and always happy to help. GLHF!