There are two heavy forwarders, with an F5 load balancer in place to manage the syslog load; these two servers monitor the syslog files and forward the data (received over a TCP port) to the indexer clusters.
Currently, file system usage is growing drastically under the path `/opt/syslogs/generic/*/*.log`, and we are unable **to delete or log-rotate the syslogs** because there are too many subdirectories under the generic folder, each containing millions of files. As a result, splunkd keeps failing at short intervals, and I could not find the exact errors in splunkd.log to explain why the process fails so frequently.
My questions:
1) Is this due to the space crunch in the /opt file system where Splunk is installed? I am not sure whether this is what causes the problem and in turn makes the splunkd process fail.
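To help narrow this down, I was planning to look at splunkd's own internal logs for errors around the failure times with something like the sketch below (this assumes the `_internal` index from the heavy forwarders is searchable and relies on the default `splunkd` sourcetype fields):

```
index=_internal sourcetype=splunkd log_level=ERROR
| stats count by host, component
| sort - count
```

Please let me know if there is a better way to confirm whether a disk space shortage is what is killing splunkd.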
2) How can I measure the amount of data coming into the two heavy forwarder servers from source="*syslogs*" every day or every hour? I have tried the search below, but I am not sure whether it fetches the correct details. Please correct me if it's not the right search.
```
index=* source="syslogs" sourcetype="/opt/syslogs/generic*"
| eval indextime=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| eval length=len(_raw)/1024
| stats sum(length) count by source indextime index host
```
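As an alternative, I was also considering measuring the hourly volume from the forwarders' own metrics.log, along the lines of the sketch below (assumptions: `_internal` from the heavy forwarders is searchable from the search head, and I understand per_source_thruput only reports the top sources per sampling interval, so it may undercount):

```
index=_internal source=*metrics.log* group=per_source_thruput series="/opt/syslogs/generic*"
| timechart span=1h sum(kb) by host
```

Would that be a more reliable way to get the per-hour / per-day volume than the len(_raw) approach above?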
3) Once the volume of data coming into the servers is known, what kind of calculation should be done to work out the required disk capacity? With that, I can ask the hardware team to increase the size of the partition or to provide a separate server for syslogs.
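For example (purely illustrative numbers, not our measurements): if the searches show roughly 50 GB of syslog arriving per day and the raw files have to stay on disk for 7 days before they can be rotated away, the partition would need about 50 GB/day × 7 days = 350 GB, plus 20-30% headroom for bursts, so roughly 450 GB. Is that the right way to think about it, or are there other factors I should include in the calculation?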
4) Or should I make any changes in the Splunk configuration files to limit the amount of syslog data coming into the server?
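For instance, would something along the lines of the inputs.conf sketch below help (all values are illustrative guesses, not our current configuration), so that Splunk at least stops re-scanning millions of old files, even though the files themselves would still have to be rotated or deleted at the OS level?

```
# inputs.conf on the heavy forwarders -- illustrative values only
[monitor:///opt/syslogs/generic]
recursive = true
ignoreOlderThan = 2d     # skip files whose modification time is older than 2 days
blacklist = \.gz$        # don't re-read files that have already been rotated/compressed
sourcetype = syslog
index = syslog
```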