Tuesday, August 14, 2012

PSOFT : Hung DBX processes consuming maximum CPU% on AIX


Description : I came across an issue where CPU usage spiked on our AIX 6.1 server because active dbx processes associated with PSAPPSRV processes were consuming a high percentage of CPU.

dbx is the process debugger on AIX. It serves no useful purpose from a PeopleSoft performance perspective, so we used to kill the dbx processes manually to bring CPU usage down.
(On AIX, check the man pages for details on dbx.)


Take a look at Oracle document ID 1275845.1 for reference:

E-AS: dbx Process Hangs on AIX and can use Significant System CPU resources
Cause :

When an exception occurs for a PeopleSoft process, typically PSAPPSRV, a script, psprocinfo, is run to collect process specific information needed to help identify what might have caused the exception. When psprocinfo runs it will use dbx to collect stack and library specific data from the process. It has been seen recently, that the dbx process will start but may not terminate. In these instances the dbx process has also been seen to use a significant amount of CPU resources which can impact overall usability of the AIX box.
Solution
That dbx hangs and consumes CPU is not a problem PSoft development can resolve directly. In PT 8.51, and later Tools versions, the psprocinfo script was rewritten to capture the data we need using different AIX utilities such that it does not need to use dbx.
This dbx issue is not common. Under most instances dbx does not get stuck in a pointer loop, hence currently the psprocinfo script still uses it if it's available.
Note:
If encountering this issue you can avoid it by renaming the dbx process. When renamed, the psprocinfo script will not be able to execute it hence the overall performance impact on the AIX server is negated. Psprocinfo will collect other available data on the process, will log a message indicating dbx could not be run and will then terminate normally.
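
The rename described in the note could look roughly like the sketch below. This is only an illustration: the function name rename_binary is hypothetical, the real location of dbx on your server is an assumption you should confirm first (for example with "whence dbx" or "which dbx"), and renaming a system binary normally needs root.

```shell
# Sketch: rename a binary so that scripts which look it up by name can no longer run it.
# Usage: rename_binary /path/to/dbx   (the path is an assumption; confirm it on your server)
rename_binary() {
    target=$1
    if [ ! -e "$target" ]; then
        echo "Not found: $target" >&2
        return 1
    fi
    # Keep the original next to its old location so the rename is easy to undo.
    mv "$target" "$target.disabled" && echo "Renamed $target to $target.disabled"
}
```

To undo it, move the ".disabled" file back to its original name.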


So I wrote a script that reports and kills any dbx process running on the server, and scheduled it to run every 30 minutes.
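
For the scheduling part, a crontab entry along these lines would run it on the hour and the half hour. The script location shown is hypothetical (only the log path appears in the script itself), so adjust it to wherever the script actually lives.

```
0,30 * * * * /home/NjServer/scripts/MonitorKill_DBEX.ksh >/dev/null 2>&1
```

The minutes are listed explicitly as 0,30 because that form works in any standard cron.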

_________________________________________________________________________________
#!/usr/bin/ksh
# This script checks for any running dbx processes owned by $USER and kills them.
# Usage : MonitorKill_DBEX.ksh
# (The script name deliberately avoids the string "dbx" so that the script itself
#  never shows up in the grep for running dbx processes.)
# Niraj Patil

export maillist="nirajdpatil1986@gmail.com groupemail@abc.com"
export LOGFILE1=/home/NjServer/logs/MonitorKill_DBEX.log
PROCLIST=/tmp/dbex_processes.log

echo "\n ----------- MonitorKill_DBEX executed at `date` ---------" >> $LOGFILE1

# Take one snapshot of the matching processes, so the log, the kill loop
# and the mail report all work from the same list.
ps -ef | grep -i dbx | grep -i $USER | grep -v grep > $PROCLIST

if [ -s $PROCLIST ]; then
    # Field 2 of "ps -ef" output is the PID.
    for PRCKLL in `awk '{ print $2 }' $PROCLIST`
    do
        # Match on the PID field exactly; a plain "grep $PRCKLL" would also
        # match any other line that merely contains the same digits.
        echo "Killed dbx process :\n `awk -v pid=$PRCKLL '$2 == pid' $PROCLIST`" | tee -a $LOGFILE1
        kill -9 $PRCKLL
    done
    mailx -s "DBX processes spawned by $USER killed on `hostname` at `date`" $maillist < $PROCLIST
else
    echo "No dbx process was found for $USER on `hostname`" | tee -a $LOGFILE1
fi
_________________________________________________________________________________


The script looks a bit dirty with the multiple greps, no?
I know I'm not a great Unix developer, but the script serves its purpose well :)
I just tried to keep the script as simple as possible to understand.
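
For anyone who wants fewer greps, here is one possible variation, offered as a sketch and not as a replacement for the script above. It asks ps for only the PID and command name and lets awk match the command name exactly. The function names filter_dbx_pids and list_dbx_pids are made up for this example, and it assumes your ps accepts the POSIX "-o pid= -o comm=" options.

```shell
#!/usr/bin/ksh
# Sketch: find dbx PIDs for one user without chaining several greps.

# Reads "PID COMMAND" lines on stdin and prints the PID of every line whose
# command name is exactly "dbx", so "MonitorKill_DBEX.ksh" or "grep dbx" never match.
# A leading directory (e.g. /usr/bin/dbx) is stripped before comparing.
filter_dbx_pids() {
    awk '{ n = split($2, parts, "/"); if (parts[n] == "dbx") print $1 }'
}

# Lists dbx PIDs owned by the given user (defaults to $USER).
# "-o pid= -o comm=" prints only the PID and command name, with no header line.
list_dbx_pids() {
    ps -u "${1:-$USER}" -o pid= -o comm= | filter_dbx_pids
}

# Example use (not run here):
#   for pid in $(list_dbx_pids); do kill -9 "$pid"; done
```

Because the match is on the exact command name, the "grep -v grep" step and the careful script naming are no longer needed.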
