RPI cluster performance related to network performance

Computer Science Asked by mozzie on October 21, 2020

I’m writing my thesis and have built an RPI cluster of 10 nodes, each of which is an RPI model 3B. They are connected to two gigabit switches. I don’t know the category (CAT) of the cables. The nodes are not connected to the Internet; they just live in their own private network.

Furthermore, I’ve calculated the theoretical performance with the formula:

number of cores * average frequency * 16 FLOPs/cycle = x GFLOPS (Got the formula from )

After applying it, it turns out I should have 76,8 GFLOPS in theory, assuming that the source is reliable and correct. When I benchmark the cluster using HPL 2.1, I’ve only reached about 7 GFLOPS at best across the various configurations I tried.
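The formula can be written as a small function to make the arithmetic easy to check. This is only a sketch: the function name `peak_gflops` is made up for illustration, and the inputs (4 cores at 1,2 GHz, which is what the RPI 3B provides) are my reading of where the 76,8 figure comes from. Note that this matches the numbers for a single board, not for all 10 nodes.

```python
def peak_gflops(cores: int, freq_ghz: float, flops_per_cycle: float) -> float:
    """Theoretical peak in GFLOPS: cores * frequency (GHz) * FLOPs per cycle."""
    return cores * freq_ghz * flops_per_cycle


# Assumed inputs: one RPI 3B has 4 cores at 1.2 GHz.
# The 16 FLOPs/cycle is the value taken from the formula's source.
single_board = peak_gflops(cores=4, freq_ghz=1.2, flops_per_cycle=16)
print(f"{single_board:.1f} GFLOPS")
```

The result is only as good as the FLOPs/cycle value put into it, so that factor is the one to double-check for the actual CPU.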

Now to my question: I’ve read up on the RPI model, and it says it only has 10/100 Ethernet. Is that the source of my problem? Seeing that I get GFLOPS out of a node, but the network transfer is much slower than gigabit speed, that would mean 76,8 GFLOPS * 0,1 Gigabit/second = 7,68 GFLOPS (in theory). Or am I way off track in my thinking?

I’d really appreciate any help, so I know whether I have to keep working on the benchmark configuration or can move on. Also, I’m sorry if I have posted in the wrong place.

Stay safe out there!

One Answer

I think I have solved it now. The problem was that the example in the source had:

8*3,50*16, and I was assuming (which is a very bad thing to do) that I should change everything except the 16, that it was a constant of sorts. It turns out my RPI should have somewhere between 0 and 2 as the last factor, so it should be

4*1,2*x, where x is somewhere between 0 and 2 (my estimate). So I only have to find the actual value of x, which for some reason seems hard to find; one would have thought that the instructions per second or instructions per cycle would be documented for the RPI.
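To see what different values of x would mean, here is a rough sketch that evaluates 4*1,2*x for a few candidates and also turns a measured HPL result back into an implied FLOPs/cycle per core. The function names and the `nodes` parameter are made up for illustration, and I am assuming the roughly 7 GFLOPS was measured across all 10 nodes. Since HPL never reaches the theoretical peak, the implied value is only a lower bound on x, not the real hardware figure.

```python
def peak_gflops(cores, freq_ghz, flops_per_cycle, nodes=1):
    """Theoretical peak in GFLOPS: nodes * cores * GHz * FLOPs per cycle."""
    return nodes * cores * freq_ghz * flops_per_cycle


def implied_flops_per_cycle(measured_gflops, cores, freq_ghz, nodes=1):
    """FLOPs/cycle per core that a measured GFLOPS result corresponds to."""
    return measured_gflops / (nodes * cores * freq_ghz)


# Candidate values for x; the true value for the RPI is still unknown here.
for x in (0.5, 1.0, 2.0):
    print(f"x = {x}: {peak_gflops(4, 1.2, x):.1f} GFLOPS per board")

# Assumption: about 7 GFLOPS measured with HPL over 10 boards.
lower_bound = implied_flops_per_cycle(7.0, cores=4, freq_ghz=1.2, nodes=10)
print(f"implied x is at least {lower_bound:.3f}")
```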

Bottom line: there was nothing wrong except my thinking. Sorry for this post; I should have read up a little more before I posted. I blame it on stress.

EDIT: I forgot to give the source that helped me solve the problem, here it is:

And thanks for the help!

Answered by mozzie on October 21, 2020
