TransWikia.com

NFS4 web directory throughput drops every 60 seconds - Apache / PHP-FPM / CentOS

Server Fault Asked by Gnosis on November 4, 2021

We recently switched to an NFS4 share for our web directory (/var/www/sites). Ever since the switch, throughput from the NFS mount drops on the client side exactly every 60 seconds. CPU usage drops (Apache and PHP are waiting) and the network load dips. Each drop lasts between 500 ms and 1.5 s.

I tested with dd if=/dev/zero of=/mnt/files/samplefile bs=1M count=1024 oflag=direct and saw the read/write times increase during one of the 60-second drops.
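To line the stalls up against the clock, a single large dd run can be replaced by many small timed writes. The following is only a rough sketch: the function name probe_latency and its arguments are made up for illustration, it uses conv=fsync instead of oflag=direct so that it also runs on filesystems without O_DIRECT support, and it relies on GNU date for nanosecond timestamps.

```shell
# Hypothetical helper: write 1 MiB to <dir> <samples> times, pausing
# <interval> seconds between writes, and print wall-clock time plus the
# latency of each write in milliseconds.
probe_latency() (
    dir=$1
    samples=$2
    interval=$3
    i=0
    while [ "$i" -lt "$samples" ]; do
        start=$(date +%s%N)   # GNU date: seconds followed by nanoseconds
        # conv=fsync makes dd flush the data before it exits
        dd if=/dev/zero of="$dir/latency_probe" bs=1M count=1 conv=fsync 2>/dev/null
        end=$(date +%s%N)
        printf '%s %d ms\n' "$(date +%T)" "$(( (end - start) / 1000000 ))"
        sleep "$interval"
        i=$((i + 1))
    done
    rm -f "$dir/latency_probe"
)
```

Something like probe_latency /mnt/files 600 0.2 would cover roughly two minutes (longer if writes stall); if the stall is periodic, the slow samples should show up about 60 seconds apart in the first column.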

On the NFS mount, I’ve added FS-Cache (fsc), noatime, and nodiratime, without any change.

/etc/exports

/mnt/files {clientIP}(rw,fsid=0,sync,no_root_squash)

Client mount

mount -v -t nfs4 {server_ip}:/ /mnt/files -o fsc,noatime,nodiratime
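The options passed to mount are not necessarily the full set in effect; the kernel lists the options it actually applied (negotiated NFS version, attribute cache timeouts, and so on) in /proc/mounts. A small sketch for pulling them out, where mount_opts is a made-up helper name and the optional second argument exists only to make it easy to try against a sample file:

```shell
# Hypothetical helper: print the option string for one mount point.
# $1 = mount point, $2 = mounts table to read (defaults to /proc/mounts)
mount_opts() {
    awk -v mp="$1" '$2 == mp { print $4 }' "${2:-/proc/mounts}"
}
```

For example, mount_opts /mnt/files | tr ',' '\n' prints one option per line, which makes it easier to spot any timeout-style value that matches the 60-second period.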

Given how regular the drop-off is, I suspect a timer-driven setting or a misconfiguration.

Any tips would be greatly appreciated.

Server-side nfsstat:

Server rpc stats:
calls      badcalls   badfmt     badauth    badclnt
4066505251   262        22         240        0

Server nfs v3:
null             getattr          setattr          lookup           access
8       100%     0         0%     0         0%     0         0%     0         0%
readlink         read             write            create           mkdir
0         0%     0         0%     0         0%     0         0%     0         0%
symlink          mknod            remove           rmdir            rename
0         0%     0         0%     0         0%     0         0%     0         0%
link             readdir          readdirplus      fsstat           fsinfo
0         0%     0         0%     0         0%     0         0%     0         0%
pathconf         commit
0         0%     0         0%

Server nfs v4:
null             compound
72        0%     4066507670 99%

Server nfs v4 operations (CentOS 8):
op0-unused       op1-unused       op2-future       access           close
0         0%     0         0%     0         0%     187752303  1%     117353691  0%
commit           create           delegpurge       delegreturn      getattr
6175      0%     7467      0%     0         0%     36808013  0%     3988907750 31%
getfh            link             lock             lockt            locku
20592505  0%     0         0%     1988679   0%     0         0%     1978415   0%
lookup           lookup_root      nverify          open             openattr
32913665  0%     0         0%     0         0%     117761749  0%     0         0%
open_conf        open_dgrd        putfh            putpubfh         putrootfh
0         0%     24        0%     4050816618 32%     0         0%     328       0%
read             readdir          readlink         remove           rename
3970684   0%     1199340   0%     480       0%     181949    0%     18432     0%
renew            restorefh        savefh           secinfo          setattr
0         0%     0         0%     18432     0%     0         0%     2287964   0%
setcltid         setcltidconf     verify           write            rellockowner
0         0%     0         0%     0         0%     211708    0%     0         0%
bc_ctl           bind_conn        exchange_id      create_ses       destroy_ses
0         0%     2         0%     40        0%     46        0%     37        0%
free_stateid     getdirdeleg      getdevinfo       getdevlist       layoutcommit
1978400   0%     0         0%     0         0%     0         0%     0         0%
layoutget        layoutreturn     secinfononam     sequence         set_ssv
0         0%     0         0%     37        0%     4066651259 32%     0         0%
test_stateid     want_deleg       destroy_clid     reclaim_comp     allocate
13642707  0%     0         0%     31        0%     37        0%     0         0%
copy             copy_notify      deallocate       ioadvise         layouterror
0         0%     0         0%     0         0%     0         0%     0         0%
layoutstats      offloadcancel    offloadstatus    readplus         seek
0         0%     0         0%     0         0%     0         0%     0         0%
write_same
0         0%

Client-side nfsstat (CentOS 7):

calls      badcalls   badclnt    badauth    xdrcall
0          0          0          0          0

Client rpc stats:
calls      retrans    authrefrsh
4157327074   6          4157501443

Client nfs v4:
null         read         write        commit       open         open_conf
0         0% 12539371  0% 2010537   0% 171586    0% 17387625  0% 19761     0%
open_noat    open_dgrd    close        setattr      fsinfo       renew
117435773  2% 28        0% 134408077  3% 2365580   0% 425       0% 736357    0%
setclntid    confirm      lock         lockt        locku        access
68577     0% 14        0% 1998403   0% 0         0% 1988136   0% 73334903  1%
getattr      lookup       lookup_root  remove       rename       link
3686184054 88% 35401700  0% 149       0% 4909916   0% 378484    0% 0         0%
symlink      create       pathconf     statfs       readlink     readdir
0         0% 15960     0% 276       0% 11593628  0% 490       0% 2002535   0%
server_caps  delegreturn  getacl       setacl       fs_locations rel_lkowner
931       0% 36853705  0% 0         0% 0         0% 0         0% 0         0%
secinfo      exchange_id  create_ses   destroy_ses  sequence     get_lease_t
0         0% 0         0% 31        0% 37        0% 28        0% 16        0%
reclaim_comp layoutget    getdevinfo   layoutcommit layoutreturn getdevlist
251       0% 28        0% 0         0% 0         0% 0         0% 0         0%
(null)
34        0%

Update: watching htop on the client, I noticed that whenever a drop happens, the top process is

{NFS-IP}-mana

Each time the drop occurs, this process shows up in the running state:

48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:00 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:01 [{nfsIP_address}-mana]
48800 R ?        00:00:02 [{nfsIP_address}-mana]
48800 R ?        00:00:02 [{nfsIP_address}-mana]
48800 R ?        00:00:02 [{nfsIP_address}-mana]
48800 R ?        00:00:02 [{nfsIP_address}-mana]
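The listing above has no timestamps, so it does not show how long each burst lasts or how far apart the bursts are. One possible way to capture that, assuming the thread name contains "-mana" as it does here (stamp_manager_threads is a made-up name, and the input is expected in the pid, stat, comm layout produced by ps -eo pid,stat,comm):

```shell
# Hypothetical helper: read "pid stat comm" lines on stdin and print the
# ones whose command name contains "-mana", prefixed with the current time.
stamp_manager_threads() {
    awk -v now="$(date +%T)" '$3 ~ /-mana/ { print now, $0 }'
}

# Usage sketch (runs until interrupted), sampling about five times a second:
#   while sleep 0.2; do ps -eo pid,stat,comm | stamp_manager_threads; done
```

Comparing these timestamps with the throughput dips would show whether the thread becomes busy just before, during, or after each stall.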
