
Ansible Gather facts failing at findmnt command for some hosts

Unix & Linux Asked by Indranil on August 27, 2020

ANSIBLE VERSION

ansible 2.4.6.0
config file = /home/xxxxxx/ansible.cfg
configured module search path = [u'/home/xxxxxx/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Aug 7 2019, 00:51:29) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]

CONFIGURATION

cat ~/.ansible.cfg

[defaults]
host_key_checking = False
forks = 5
log_path = /home/userid/ansible.log

[ssh_connection]
pipelining = true

grep ^[^#] /etc/ansible/ansible.cfg
[defaults]
roles_path = /etc/ansible/roles:/usr/share/ansible/roles
host_key_checking = False

OS / ENVIRONMENT
Client: CentOS Linux release 7.5.1804 (Core)

STEPS TO REPRODUCE
All Ansible playbooks work fine except those that gather facts. The setup (gather facts) module, and any play that relies on fact gathering, hangs.

Example – Command
ansible all -i ansible/inventory/inventory -m setup -u userid -k -K -vvv

ACTUAL RESULTS

ansible all -i ansible/inventory/inventory-file -m setup -u userid -k -K --limit="130.100.136.118,130.100.136.114" -vvv
ansible 2.4.6.0
config file = /home/userid/ansible.cfg
configured module search path = [u'/home/userid/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Aug 7 2019, 00:51:29) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
Using /home/userid/ansible.cfg as config file
SSH password:
SUDO password[defaults to SSH password]:
Parsed /home/userid/ansible/inventory/dop-poc-ibm inventory source with ini plugin
META: ran handlers
Using module file /usr/lib/python2.7/site-packages/ansible/modules/system/setup.py
<130.100.136.114> ESTABLISH SSH CONNECTION FOR USER: userid
Using module file /usr/lib/python2.7/site-packages/ansible/modules/system/setup.py
<130.100.136.118> ESTABLISH SSH CONNECTION FOR USER: userid
<130.100.136.114> SSH: EXEC sshpass -d14 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=userid -o ConnectTimeout=10 -o ControlPath=/home/userid/.ansible/cp/1f9f8629ab 130.100.136.114 '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''
<130.100.136.118> SSH: EXEC sshpass -d15 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=userid -o ConnectTimeout=10 -o ControlPath=/home/userid/.ansible/cp/e3a887b653 130.100.136.118 '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''
<130.100.136.114> (1, '\n{"exception": "Traceback (most recent call last):\n File \"/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/basic.py\", line 2786, in run_command\n cmd = subprocess.Popen(args, **kwargs)\n File \"/usr/lib64/python2.7/subprocess.py\", line 711, in __init__\n errread, errwrite)\n File \"/usr/lib64/python2.7/subprocess.py\", line 1308, in _execute_child\n data = _eintr_retry_call(os.read, errpipe_read, 1048576)\n File \"/usr/lib64/python2.7/subprocess.py\", line 478, in _eintr_retry_call\n return func(*args)\n File \"/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/facts/timeout.py\", line 37, in _handle_timeout\n raise TimeoutError(msg)\nTimeoutError: Timer expired after 10 seconds\n", "cmd": "/usr/bin/findmnt --list --noheadings --notruncate", "failed": true, "rc": 257, "invocation": {"module_args": {"filter": "*", "gather_subset": ["all"], "fact_path": "/etc/ansible/facts.d", "gather_timeout": 10}}, "msg": "Timer expired after 10 seconds"}\n', '')
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/basic.py", line 2786, in run_command
    cmd = subprocess.Popen(args, **kwargs)
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child
    data = _eintr_retry_call(os.read, errpipe_read, 1048576)
  File "/usr/lib64/python2.7/subprocess.py", line 478, in _eintr_retry_call
    return func(*args)
  File "/tmp/ansible_5w_PfH/ansible_modlib.zip/ansible/module_utils/facts/timeout.py", line 37, in _handle_timeout
    raise TimeoutError(msg)
TimeoutError: Timer expired after 10 seconds

130.100.136.114 | FAILED! => {
    "changed": false,
    "cmd": "/usr/bin/findmnt --list --noheadings --notruncate",
    "failed": true,
    "invocation": {
        "module_args": {
            "fact_path": "/etc/ansible/facts.d",
            "filter": "*",
            "gather_subset": [
                "all"
            ],
            "gather_timeout": 10
        }
    },
    "msg": "Timer expired after 10 seconds",
    "rc": 257
}
<130.100.136.118> (1, '\n{"exception": "Traceback (most recent call last):\n File \"/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/basic.py\", line 2786, in run_command\n cmd = subprocess.Popen(args, **kwargs)\n File \"/usr/lib64/python2.7/subprocess.py\", line 711, in __init__\n errread, errwrite)\n File \"/usr/lib64/python2.7/subprocess.py\", line 1308, in _execute_child\n data = _eintr_retry_call(os.read, errpipe_read, 1048576)\n File \"/usr/lib64/python2.7/subprocess.py\", line 478, in _eintr_retry_call\n return func(*args)\n File \"/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/facts/timeout.py\", line 37, in _handle_timeout\n raise TimeoutError(msg)\nTimeoutError: Timer expired after 10 seconds\n", "cmd": "/usr/bin/findmnt --list --noheadings --notruncate", "failed": true, "rc": 257, "invocation": {"module_args": {"filter": "*", "gather_subset": ["all"], "fact_path": "/etc/ansible/facts.d", "gather_timeout": 10}}, "msg": "Timer expired after 10 seconds"}\n', '')
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/basic.py", line 2786, in run_command
    cmd = subprocess.Popen(args, **kwargs)
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1308, in _execute_child
    data = _eintr_retry_call(os.read, errpipe_read, 1048576)
  File "/usr/lib64/python2.7/subprocess.py", line 478, in _eintr_retry_call
    return func(*args)
  File "/tmp/ansible_Alx9Sv/ansible_modlib.zip/ansible/module_utils/facts/timeout.py", line 37, in _handle_timeout
    raise TimeoutError(msg)
TimeoutError: Timer expired after 10 seconds

130.100.136.118 | FAILED! => {
    "changed": false,
    "cmd": "/usr/bin/findmnt --list --noheadings --notruncate",
    "failed": true,
    "invocation": {
        "module_args": {
            "fact_path": "/etc/ansible/facts.d",
            "filter": "*",
            "gather_subset": [
                "all"
            ],
            "gather_timeout": 10
        }
    },
    "msg": "Timer expired after 10 seconds",
    "rc": 257
}
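
Both tracebacks end in `_handle_timeout` while `subprocess` is still waiting on the child process, which is consistent with a signal-based timer firing in the middle of the `findmnt` call. Below is a minimal sketch of that general pattern, assuming a Unix host and code running in the main thread. `FactTimeoutError` and `call_with_timer` are hypothetical names chosen for illustration; this is not Ansible's actual `timeout.py`.

```python
import signal
import time


class FactTimeoutError(Exception):
    """Hypothetical stand-in for the TimeoutError seen in the traceback."""


def call_with_timer(func, seconds):
    """Run func(); raise FactTimeoutError if it takes longer than `seconds`."""

    def _handle_timeout(signum, frame):
        # Runs in the main thread when SIGALRM arrives, interrupting
        # whatever blocking call func() is currently inside.
        raise FactTimeoutError("Timer expired after %s seconds" % seconds)

    previous = signal.signal(signal.SIGALRM, _handle_timeout)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func()
    finally:
        # Always cancel the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


if __name__ == "__main__":
    try:
        call_with_timer(lambda: time.sleep(3), 1)
    except FactTimeoutError as exc:
        print(exc)
```

With a timer like this, a command that merely runs slowly and one that never returns look the same from the caller's side: both surface as an expired timer.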

Steps Tried

Increased gather_timeout to 20 and then 30 in the ansible.cfg in my home folder. It didn't help.

Tried gather_subset = !all. It didn't help.

Ran the findmnt command manually through the shell module:
ansible -i ansible/inventory/inventory -u userid@domain --become -m shell -a '/usr/bin/findmnt --list --noheadings --notruncate' linux -k -K
This worked, although it took a few seconds to return results.
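
The module_args in the failed output still show "gather_subset": ["all"] and "gather_timeout": 10, which suggests the ansible.cfg values were not picked up by the ad-hoc setup call. One thing that may be worth trying is passing them as module arguments with `-a`. The sketch below only builds and prints such a command line; `build_setup_command` is a hypothetical helper, and the `!hardware` subset and 30-second timeout are assumed values, not settings confirmed to fix this case.

```python
import shlex


def build_setup_command(inventory, pattern, gather_subset="!hardware", gather_timeout=30):
    """Return the argv for an ad-hoc setup run with explicit module arguments."""
    # gather_subset and gather_timeout are the same setup module arguments
    # that appear under "module_args" in the failed output.
    module_args = "gather_subset={0} gather_timeout={1}".format(gather_subset, gather_timeout)
    return ["ansible", pattern, "-i", inventory, "-m", "setup", "-a", module_args]


argv = build_setup_command("ansible/inventory/inventory", "all")
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in argv))
```

If the module_args echoed back in the `-vvv` output then show the new values, the settings reached the module.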

Workaround so far

Commented out the following section in "/usr/lib/python2.7/site-packages/ansible/module_utils/facts/hardware/linux.py":
#    def _run_findmnt(self, findmnt_path):
#        args = ['--list', '--noheadings', '--notruncate']
#        cmd = [findmnt_path] + args
#        rc, out, err = self.module.run_command(cmd, errors='surrogate_then_replace')
#        return rc, out, err
#
#    def _find_bind_mounts(self):
#        bind_mounts = set()
#        findmnt_path = self.module.get_bin_path("findmnt")
#        if not findmnt_path:
#            return bind_mounts
#
#        rc, out, err = self._run_findmnt(findmnt_path)
#        if rc != 0:
#            return bind_mounts
#
#        # find bind mounts, in case /etc/mtab is a symlink to /proc/mounts
#        for line in out.splitlines():
#            fields = line.split()
#            # fields[0] is the TARGET, fields[1] is the SOURCE
#            if len(fields) < 2:
#                continue
#
#            # bind mounts will have a [/directory_name] in the SOURCE column
#            if self.BIND_MOUNT_RE.match(fields[1]):
#                bind_mounts.add(fields[0])
#
#        return bind_mounts
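
For readers who want to see what the disabled code computes, here is a standalone sketch of the same parsing step, run on a string instead of live findmnt output. The regular expression is an assumption standing in for the class attribute `BIND_MOUNT_RE`, and the sample mount lines are invented for illustration.

```python
import re

# Assumption: a stand-in for the BIND_MOUNT_RE class attribute; it matches
# any SOURCE value that contains a closing square bracket.
BIND_MOUNT_RE = re.compile(r".*\]")


def find_bind_mounts(findmnt_output):
    """Return the set of TARGET paths whose SOURCE looks like a bind mount."""
    bind_mounts = set()
    for line in findmnt_output.splitlines():
        fields = line.split()
        # fields[0] is the TARGET, fields[1] is the SOURCE
        if len(fields) < 2:
            continue
        # Bind mounts carry a [/directory_name] suffix in the SOURCE column.
        if BIND_MOUNT_RE.match(fields[1]):
            bind_mounts.add(fields[0])
    return bind_mounts


# Hypothetical sample in the shape of `findmnt --list --noheadings --notruncate`.
sample = """\
/ /dev/sda1 xfs rw,relatime
/srv/data /dev/sdb1[/exports/data] ext4 rw,relatime
"""
print(find_bind_mounts(sample))
```

The parsing itself is cheap, so the time is spent in the findmnt call rather than in this loop.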

2 Answers

After updating Ansible to 2.8, the problem no longer occurred.

Answered by Indranil on August 27, 2020

I'm not sure this is the problem, but I've had similar problems caused by stale NFS mounts. To rule that out, SSH to one of the failing servers and check whether the df command completes without hanging.
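
A rough way to script that check on the affected host is sketched below, assuming Python 3 is available there. `finishes_within` is a hypothetical helper name, and the timeout you pass is an arbitrary choice.

```python
import subprocess


def finishes_within(cmd, timeout_s):
    """Return True if cmd exits within timeout_s seconds, False if it hangs."""
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Best-effort cleanup; do not wait again, since a process blocked
        # on an unresponsive mount may take a long time to exit.
        proc.kill()
        return False
```

For example, `finishes_within(["df"], 10)` or `finishes_within(["/usr/bin/findmnt", "--list", "--noheadings", "--notruncate"], 10)` returning False would point toward a mount that is not responding. The function only reports whether the command finished, not whether it succeeded.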

Answered by user103944 on August 27, 2020
