Bug ID 1032593
Summary NFS Client Hangs on "df" command or "find" or "ls" or getting state data
Classification openSUSE
Product openSUSE Distribution
Version Leap 42.1
Hardware All
OS openSUSE 42.1
Status NEW
Severity Major
Priority P5 - None
Component Other
Assignee bnc-team-screening@forge.provo.novell.com
Reporter manish2479@gmail.com
QA Contact qa-bugs@suse.de
Found By ---
Blocker ---

I am seeing the NFS client mounts getting hung up for 5-60 seconds    
for the jobs or any tasks and it can also be reproduced using "df"      
command. Though this issues happens intermittently and it's not         
associated to particular client node.                   

I have captured Sysrq trace logs from the NFS hung client node when df 
was ran and from that it seems that NFS client is getting  Processes in 
a "D" or uninterruptible sleep state are usually waiting on I/O when    
the "df" or "ls" or "find" command is ran and this seems related to     
stats loading issues, where sometimes it getting locked up in getting   
stats from the CNFS servers.               

NFS client Node Detail                                                  

# cat /etc/SuSE-release                                                 
openSUSE 42.1 (x86_64)                                                  
VERSION = 42.1                                                          
CODENAME = Malachite                                                    
# uname -a                                                              
Linux monster2200 4.1.34-33-default #1 SMP PREEMPT Thu Oct 20 08:03:29  
UTC 2016 (fe18aba) x86_64 x86_64 x86_64 GNU/Linux                       

NFS version on client                                                   

# rpm -qa | grep ^nfs                                                   
nfs-client-1.3.0-26.1.x86_64                                            
nfsidmap-0.25-7.1.x86_64                                                
nfs-kernel-server-1.3.0-26.1.x86_64                                     


Here is the stack trace from that NFS client when "df" was ran.         

[10069929.999275] Call Trace:                                           
[10069929.999278]  [<ffffffff81662107>] schedule+0x37/0x90              
[10069929.999296]  [<ffffffffa06ad4a1>] rpc_wait_bit_killable+0x31/0x80 
[sunrpc]                                                                
[10069929.999300]  [<ffffffff8166252d>] __wait_on_bit+0x5d/0x90         
[10069929.999302]  [<ffffffff816625ce>]                                 
out_of_line_wait_on_bit+0x6e/0x80                                       
[10069929.999309]  [<ffffffffa06ae65b>] __rpc_execute+0x18b/0x4a0       
[sunrpc]                                                                
[10069929.999315]  [<ffffffffa06a531d>] rpc_run_task+0x6d/0x90 [sunrpc] 
[10069929.999321]  [<ffffffffa06a5384>] rpc_call_sync+0x44/0xa0 [sunrpc]
[10069929.999333]  [<ffffffffa043b596>] nfs3_rpc_wrapper.constprop.     
12+0x66/0xa0 [nfsv3]                                                    
[10069929.999337]  [<ffffffffa043b906>] nfs3_proc_access+0xb6/0x170     
[nfsv3]                                                                 
[10069929.999343]  [<ffffffffa0514c9b>] nfs_do_access+0xdb/0x430 [nfs]  
[10069929.999349]  [<ffffffffa0515161>] nfs_permission+0x141/0x1d0 [nfs]
[10069929.999354]  [<ffffffff811e98b9>] __inode_permission+0x59/0xb0    
[10069929.999357]  [<ffffffff811ece26>] link_path_walk+0x226/0x790      
[10069929.999359]  [<ffffffff811ed0cc>] link_path_walk+0x4cc/0x790      
[10069929.999362]  [<ffffffff811ed80d>] path_lookupat+0x1d/0x690        
[10069929.999364]  [<ffffffff811edea3>] filename_lookup+0x23/0xb0       
[10069929.999366]  [<ffffffff811efc18>] user_path_at_empty+0x48/0x80    
[10069929.999368]  [<ffffffff811e3c66>] vfs_fstatat+0x46/0x90           
[10069929.999371]  [<ffffffff811e410c>] SYSC_newstat+0x1c/0x40          
[10069929.999373]  [<ffffffff81665fb2>] system_call_fastpath+0x16/0x75  
[10069929.999377]  [<00007f673250c625>] 0x7f673250c625                  
[10069929.999404] df              D ffff887f7f4d6700     0 16361  16304 
0x00000080                                                              
[10069929.999405]  ffff88799db3fa38 ffff883f66e26390 ffff8871b9fb85d0   
ffff887b8675f040                                                        
[10069929.999407]  ffff88799db40000 ffff88807ffd7100 0000000000000082   
ffffffffa06ad470                                                        
[10069929.999408]  ffff88799db3fe28 ffff88799db3fa58 ffffffff81662107   
ffff88799db3fe28                                                        
[10069929.999409] Call Trace:                                           
[10069929.999412]  [<ffffffff81662107>] schedule+0x37/0x90              
[10069929.999419]  [<ffffffffa06ad4a1>] rpc_wait_bit_killable+0x31/0x80 
[sunrpc]                                                                
[10069929.999422]  [<ffffffff8166252d>] __wait_on_bit+0x5d/0x90         
[10069929.999423]  [<ffffffff816625ce>]                                 
out_of_line_wait_on_bit+0x6e/0x80                                       
[10069929.999430]  [<ffffffffa06ae65b>] __rpc_execute+0x18b/0x4a0       
[sunrpc]                                                                
[10069929.999435]  [<ffffffffa06a531d>] rpc_run_task+0x6d/0x90 [sunrpc] 
[10069929.999441]  [<ffffffffa06a5384>] rpc_call_sync+0x44/0xa0 [sunrpc]
[10069929.999444]  [<ffffffffa043b596>] nfs3_rpc_wrapper.constprop.     
12+0x66/0xa0 [nfsv3]                                                    
[10069929.999446]  [<ffffffffa043c118>] nfs3_proc_getattr+0x48/0x90     
[nfsv3]                                                                 
[10069929.999452]  [<ffffffffa051b535>]                                 
__nfs_revalidate_inode+0xb5/0x360 [nfs]                                 
[10069929.999458]  [<ffffffffa0512b6b>] nfs_weak_revalidate+0x4b/0xf0   
[nfs]                                                                   
[10069929.999461]  [<ffffffff811e8f1f>] complete_walk+0x8f/0xf0         
[10069929.999463]  [<ffffffff811eb6c0>] do_last+0x120/0x1200            
[10069929.999465]  [<ffffffff811ee473>] path_openat+0x83/0x5b0          
[10069929.999467]  [<ffffffff811efce0>] do_filp_open+0x30/0x80          
[10069929.999469]  [<ffffffff811de429>] do_sys_open+0x129/0x220         
[10069929.999472]  [<ffffffff81665fb2>] system_call_fastpath+0x16/0x75  
[10069929.999476]  [<00007f387edecac0>] 0x7f387edecac0                  
[10069929.999480] Sched Debug Version: v0.11, 4.1.34-33-default #1      
[10069929.999482] ktime                                   :             
10065954755.141627                                                      
[10069929.999483] sched_clk                               :             
10069929999.480075                                                      
[10069929.999483] cpu_clk                                 :             
10069929999.480090                                                      
[10069929.999484] jiffies                                 : 6811380985  
[10069929.999485] sched_clock_stable()                    : 1           

[10069929.999486] sysctl_sched                                          
[10069929.999487]   .sysctl_sched_latency                    : 24.000000
[10069929.999488]   .sysctl_sched_min_granularity            : 3.000000 
[10069929.999488]   .sysctl_sched_wakeup_granularity         : 4.000000 
[10069929.999489]   .sysctl_sched_child_runs_first           : 0        
[10069929.999490]   .sysctl_sched_features                   : 220795   
[10069929.999491]   .sysctl_sched_tunable_scaling            : 1        
(logaritmic)                                                            

I have attached complete stack trace logs to this ticket and if you need  
additional logs or data to be collected then let me know.               

Thank You,                                                              
          Manish


You are receiving this mail because: