
TensorFlow MirroredStrategy() looks like it may only be working on one GPU?

Asked by sectechguy on August 4, 2020 (Data Science)

I finally got a computer with two GPUs and tested out https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html and https://github.com/tensorflow/models/tree/master/tutorials/image/cifar10_estimator, confirming that both GPUs are utilized in each case (wattage rises to 160-180 W on both, memory is nearly maxed out on both, and GPU-Util climbs to about 45% on both at the same time).

So I decided to try out TensorFlow’s MirroredStrategy() on an existing neural net I had previously trained with one GPU.

What I don’t understand is that the wattage increases on both, and the memory is pretty much maxed out on both, but only one GPU shows about 98% utilization while the other just sits at 3%. Am I messing something up in my code, or is this working as designed?

import tensorflow

# X_train / y_train come from my existing preprocessing (853 input features)
strategy = tensorflow.distribute.MirroredStrategy()
with strategy.scope():
    model = tensorflow.keras.models.Sequential([
        tensorflow.keras.layers.Dense(units=427, kernel_initializer='uniform', activation='relu', input_dim=853),
        tensorflow.keras.layers.Dense(units=427, kernel_initializer='uniform', activation='relu'),
        tensorflow.keras.layers.Dense(units=1, kernel_initializer='uniform', activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(X_train, y_train, batch_size=1000, epochs=100)
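
For reference, this is the kind of sanity check I would run first (just a sketch; on older TF releases list_physical_devices sits under tensorflow.config.experimental):

import tensorflow

# Sketch: confirm TensorFlow sees both cards and that MirroredStrategy uses them.
gpus = tensorflow.config.experimental.list_physical_devices('GPU')
print('Visible GPUs:', gpus)                             # expect two entries

check = tensorflow.distribute.MirroredStrategy()
print('Replicas in sync:', check.num_replicas_in_sync)   # expect 2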

nvidia-smi:

Fri Nov 22 09:26:21 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp COLLEC...  Off  | 00000000:0A:00.0 Off |                  N/A |
| 24%   47C    P2    81W / 250W |  11733MiB / 12196MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp COLLEC...  Off  | 00000000:41:00.0  On |                  N/A |
| 28%   51C    P2    64W / 250W |  11736MiB / 12187MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2506      C   python3                                    11721MiB |
|    1      1312      G   /usr/lib/xorg/Xorg                            18MiB |
|    1      1353      G   /usr/bin/gnome-shell                          51MiB |
|    1      1620      G   /usr/lib/xorg/Xorg                           108MiB |
|    1      1751      G   /usr/bin/gnome-shell                          72MiB |
|    1      2506      C   python3                                    11473MiB |
+-----------------------------------------------------------------------------+

One Answer

I'm seeing the same thing; here I enabled just the first two GPUs. [nvidia-smi screenshot]

If I force it to use just one GPU, the memory utilization drops, but the time per epoch is unchanged.
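
For the single-GPU comparison I pass an explicit device list to the strategy, roughly like this (a sketch; device names may differ on your machine):

import tensorflow

# Sketch: pin the strategy to a single GPU to compare epoch times.
one_gpu = tensorflow.distribute.MirroredStrategy(devices=['/gpu:0'])
print('Replicas in sync:', one_gpu.num_replicas_in_sync)  # expect 1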


Answered by gt1485a on August 4, 2020
