TransWikia.com

Parallel inference for 1000 different Tensorflow models using the GPU?

Data Science Asked by Daan Klijn on January 12, 2021

I have 1000 TensorFlow models with the same architecture but different weights. I would like to run inference with each of these models, and each model receives a different input (all inputs have the same size). This seems like a task that would benefit from the parallelization capabilities of the GPU, but I could not find any documentation on doing this kind of parallel inference. Is anyone aware of a technique that would speed up my inference, perhaps using the GPU?

The most basic form of what I am trying to achieve is this.

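import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# One (100, 100) input per model
dataset = [np.random.rand(100, 100) for _ in range(1000)]

# 1000 models with the same architecture but independently initialized weights
models = []
for _ in range(1000):
    input_layer = Input(shape=(100, 100))
    layer1 = Dense(512)(input_layer)
    output = Dense(1)(layer1)
    models.append(Model(inputs=input_layer, outputs=output))

# Input(shape=(100, 100)) describes one sample, so add a leading batch axis
for i, model in enumerate(models):
    y = model.predict(dataset[i][np.newaxis])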

I thought it would be beneficial to put all 1000 models into a single big TensorFlow model (see code below). This did give a speedup, but only about 20%, and GPU utilization is still at 1%.

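import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

dataset = [np.random.rand(100, 100) for _ in range(1000)]
multiple_inputs = []
multiple_outputs = []
for _ in range(1000):
    # Named input_layer to avoid shadowing the built-in input()
    input_layer = Input(shape=(100, 100))
    layer1 = Dense(512)(input_layer)
    output = Dense(1)(layer1)
    multiple_inputs.append(input_layer)
    multiple_outputs.append(output)

# A single model with 1000 independent input/output branches
model = Model(inputs=multiple_inputs, outputs=multiple_outputs)
# Each branch receives a batch of size 1
y = model.predict([x[np.newaxis] for x in dataset])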

Is there a better way to improve the performance?
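Because all the models share one architecture, one direction that might be worth exploring is to stack their weights along a new leading "model" axis and evaluate every model with a single batched matrix multiplication, instead of 1000 separate branches. The sketch below only illustrates the arithmetic and is not a tested solution to the problem above. It uses NumPy rather than TensorFlow, 8 models instead of 1000, and random arrays in place of real weights. The names `W1`, `b1`, `W2`, `b2`, `batched_forward` and `looped_forward` are hypothetical. It assumes the two `Dense` layers have no activation, as in the code above.

```python
import numpy as np

rng = np.random.default_rng(0)
# 8 models instead of 1000 to keep the demonstration small (assumption)
n_models, n_rows, n_in, n_hidden = 8, 100, 100, 512

# Hypothetical stacked parameters: slice k along axis 0 belongs to model k.
W1 = rng.standard_normal((n_models, n_in, n_hidden))
b1 = rng.standard_normal((n_models, 1, n_hidden))
W2 = rng.standard_normal((n_models, n_hidden, 1))
b2 = rng.standard_normal((n_models, 1, 1))

# One (100, 100) input per model, stacked the same way.
x = rng.random((n_models, n_rows, n_in))


def batched_forward(x, W1, b1, W2, b2):
    """Apply model k to input k for all k at once (linear Dense layers)."""
    h = np.einsum("nri,nih->nrh", x, W1) + b1
    return np.einsum("nrh,nho->nro", h, W2) + b2


def looped_forward(x, W1, b1, W2, b2):
    """Reference implementation: evaluate the models one at a time."""
    return np.stack(
        [(x[k] @ W1[k] + b1[k]) @ W2[k] + b2[k] for k in range(len(x))]
    )


y_batched = batched_forward(x, W1, b1, W2, b2)
y_looped = looped_forward(x, W1, b1, W2, b2)
print(y_batched.shape, np.allclose(y_batched, y_looped))
```

`tf.einsum` accepts the same subscript strings, so the two contractions could in principle run as two GPU operations over all models. Whether that actually raises GPU utilization for this workload would have to be measured.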
