Channel: Questions in topic: "heavy-forwarder"

Diagnosing Issues with Python and Splunk Add-on for EMC VNX data_loader scripts "hanging"

We are trying to perform storage monitoring, and the Python scripts that collect data for both EMC VNX and EMC XtremIO seem to break after a period of time. I think it's due to sockets staying open or the .py scripts not ending cleanly, but I am not proficient enough in Python to diagnose it. This post is specific to the Splunk Add-on for EMC VNX.

We have 2 heavy forwarders that run the Splunk Add-on for EMC VNX against several different arrays. The problem consistently seems to be the Python scripts staying running, and doing a splunk stop and splunk start fixes the issue for anywhere from half a day to several days.

Here is the error from the VNX log:

    [splunk@log1 splunk]$ tail -f data_loader.log
      File "/opt/splunk/etc/apps/Splunk_TA_emc-vnx/bin/timed_popen.py", line 55, in timed_popen
        return _do_timed_popen(args, timeout)
      File "/opt/splunk/etc/apps/Splunk_TA_emc-vnx/bin/timed_popen.py", line 41, in _do_timed_popen
        sub = Popen(args, stdout=PIPE, stderr=PIPE)
      File "/opt/splunk/lib/python2.7/subprocess.py", line 710, in __init__
        errread, errwrite)
      File "/opt/splunk/lib/python2.7/subprocess.py", line 1335, in _execute_child
        raise child_exception
    OSError: [Errno 2] No such file or directory

It repeats this over and over on both heavy forwarders until Splunk is stopped and started. Running a Splunk restart from cron every morning at 5am did not work as a workaround for this issue.

The healthy log looks like this (you can also see it shutting down from the splunk stop, then starting again):

    2017-04-05 11:39:19,173 INFO 140455369918272 - Data loader is going to exit...
    2017-04-05 11:39:19,173 INFO 140454121158400 - Worker thread Thread-16 going to exit
    2017-04-05 11:39:19,174 INFO 140454146336512 - Worker thread Thread-13 going to exit
    2017-04-05 11:39:19,174 INFO 140455200052992 - Worker thread Thread-1 going to exit
    2017-04-05 11:39:19,174 INFO 140454137943808 - Worker thread Thread-14 going to exit
    2017-04-05 11:39:19,174 INFO 140454154729216 - Worker thread Thread-12 going to exit
    2017-04-05 11:39:19,175 INFO 140455191660288 - Worker thread Thread-2 going to exit
    2017-04-05 11:39:19,175 INFO 140454129551104 - Worker thread Thread-15 going to exit
    2017-04-05 11:39:19,175 INFO 140454691600128 - Worker thread Thread-5 going to exit
    2017-04-05 11:39:19,175 INFO 140454683207424 - Worker thread Thread-6 going to exit
    2017-04-05 11:39:19,175 INFO 140455174874880 - Worker thread Thread-4 going to exit
    2017-04-05 11:39:19,175 INFO 140455183267584 - Worker thread Thread-3 going to exit
    2017-04-05 11:39:19,176 INFO 140454658029312 - Worker thread Thread-9 going to exit
    2017-04-05 11:39:19,176 INFO 140454666422016 - Worker thread Thread-8 going to exit
    2017-04-05 11:39:19,176 INFO 140454649636608 - Worker thread Thread-10 going to exit
    2017-04-05 11:39:19,176 INFO 140454674814720 - Worker thread Thread-7 going to exit
    2017-04-05 11:39:19,176 INFO 140454641243904 - Worker thread Thread-11 going to exit
    2017-04-05 11:39:19,178 INFO 140455369918272 - ProcessPool is going to exit...
    2017-04-05 11:39:19,210 INFO 140454112765696 - Event writer thread is going to exit...
    2017-04-05 11:39:19,229 INFO 140454104372992 - TimerQueue thread is going to exit...
    2017-04-05 11:39:43,188 INFO 140321121437504 - thread_pool_size = 16
    2017-04-05 11:39:43,190 INFO 140321121437504 - process_pool_size = 2
    2017-04-05 11:39:43,807 INFO 140321121437504 - Get 0 ready jobs, next duration is 5.506924, and there are 12 jobs scheduling
    2017-04-05 11:39:49,318 INFO 140321121437504 - Get 1 ready jobs, next duration is 3.996371, and there are 12 jobs scheduling
    2017-04-05 11:39:49,321 INFO 140320742307584 - thread work_queue_size=0
    2017-04-05 11:39:53,315 INFO 140321121437504 - Get 1 ready jobs, next duration is 8.999508, and there are 12 jobs scheduling
    2017-04-05 11:39:53,315 INFO 140320733914880 - thread work_queue_size=0
    2017-04-05 11:40:02,315 INFO 140321121437504 - Get 1 ready jobs, next duration is 0.999262, and there are 12 jobs scheduling
    2017-04-05 11:40:02,315 INFO 140320725522176 - thread work_queue_size=0
    2017-04-05 11:40:03,314 INFO 140321121437504 - Get 1 ready jobs, next duration is 11.999513, and there are 12 jobs scheduling
    2017-04-05 11:40:03,315 INFO 140320717129472 - thread work_queue_size=0
    2017-04-05 11:40:15,315 INFO 140321121437504 - Get 2 ready jobs, next duration is 7.999429, and there are 12 jobs scheduling
    2017-04-05 11:40:15,315 INFO 140320708736768 - thread work_queue_size=1
    2017-04-05 11:40:15,315 INFO 140320700344064 - thread work_queue_size=0
    2017-04-05 11:40:23,315 INFO 140321121437504 - Get 1 ready jobs, next duration is 0.999395, and there are 12 jobs scheduling
    2017-04-05 11:40:23,315 INFO 140320691951360 - thread work_queue_size=0
    2017-04-05 11:40:24,315 INFO 140321121437504 - Get 1 ready jobs, next duration is 3.999494, and there are 12 jobs scheduling
    2017-04-05 11:40:24,315 INFO 140320205436672 - thread work_queue_size=0
    2017-04-05 11:40:28,315 INFO 140321121437504 - Get 1 ready jobs, next duration is 7.999428, and there are 12 jobs scheduling
    2017-04-05 11:40:28,315 INFO 140320197043968 - thread work_queue_size=0
    2017-04-05 11:40:36,314 INFO 140321121437504 - Get 1 ready jobs, next duration is 0.999498, and there are 12 jobs scheduling
    2017-04-05 11:40:36,315 INFO 140320188651264 - thread work_queue_size=0
    2017-04-05 11:40:37,314 INFO 140321121437504 - Get 1 ready jobs, next duration is 2.999524, and there are 12 jobs scheduling
    2017-04-05 11:40:37,315 INFO 140320180258560 - thread work_queue_size=0
    2017-04-05 11:40:40,314 INFO 140321121437504 - Get 1 ready jobs, next duration is 95.000096, and there are 12 jobs scheduling
    2017-04-05 11:40:40,315 INFO 140320171865856 - thread work_queue_size=0
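
A note on reading the traceback, in case it helps whoever picks this up: an `OSError: [Errno 2]` raised out of `Popen` usually means the program named in `args[0]` could not be found when the child process was launched, so it may be worth ruling that out before chasing open sockets. Below is a minimal sketch for testing a command outside the add-on. `find_on_path` and `diagnose_popen` are hypothetical helper names, not part of Splunk_TA_emc-vnx, and the command to test is an assumption: the log does not show what `timed_popen` was asked to run.

```python
import errno
import os
from subprocess import PIPE, Popen


def find_on_path(program, path=None):
    """Return the full path of an executable `program`, or None if not found.

    Hypothetical helper; works on both Python 2.7 and Python 3.
    """
    if os.sep in program:
        # A path was given explicitly, so no PATH search applies.
        ok = os.path.isfile(program) and os.access(program, os.X_OK)
        return program if ok else None
    search = os.environ.get("PATH", os.defpath) if path is None else path
    for directory in search.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def diagnose_popen(args):
    """Launch `args` the same way the traceback does and describe the result."""
    try:
        sub = Popen(args, stdout=PIPE, stderr=PIPE)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            # Errno 2: the executable could not be located at launch time.
            return "ENOENT: %r not found (PATH lookup gives %r)" % (
                args[0], find_on_path(args[0]))
        return "OSError %d: %s" % (exc.errno, exc.strerror)
    sub.communicate()  # drain both pipes so the child cannot block on output
    return "started, exit code %d" % sub.returncode


if __name__ == "__main__":
    # Placeholder command; substitute whatever CLI the add-on invokes.
    print(diagnose_popen(["some-storage-cli", "--help"]))
```

Running this under Splunk's own interpreter (for example through `splunk cmd python`) as the splunk user should give it the same PATH the add-on sees. If it reports ENOENT only while the add-on is in its broken state, the environment or the binary's location is changing at runtime; if it always succeeds, the cause is more likely elsewhere.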
