Tuesday, March 12, 2019

WebLogic Troubleshooting Issues

Deployment issues:

Code issues: send the error log to the application team so they can correct the code. Example:
Caused By: weblogic.utils.ErrorCollectionException:
There are 1 nested errors:
weblogic.j2ee.dd.xml.AnnotationProcessException: Duplicate ejb name 'BDAccountEjbBean' found: annotation 'Stateless' on bean
Deployment failed due to a connection pool issue: fix the connection pool issue, then redeploy the application.

Out of memory issue during deployment:
Error: java.lang.OutOfMemoryError: PermGen space
This error occurs when the permanent generation (PermGen) area runs out of space.
The PermGen sizes are set in setDomainEnv.sh:
-XX:PermSize=64m
-XX:MaxPermSize=128m
We set the initial PermSize equal to MaxPermSize, restarted the servers, and redeployed the application.
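
As a minimal sketch, the equal-sized PermGen setting could look like this in setDomainEnv.sh. It assumes a HotSpot JVM older than Java 8 (where PermGen still exists) and the stock WebLogic 10.3 script, which uses USER_MEM_ARGS to override all default memory arguments. The heap values are placeholders, not values from this environment, and should match what the domain already uses.

```shell
# Sketch only: place before the MEM_ARGS handling in setDomainEnv.sh.
# -Xms/-Xmx are placeholder heap sizes (assumption).
# PermGen initial size is set equal to the maximum, as described above.
USER_MEM_ARGS="-Xms256m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=128m"
export USER_MEM_ARGS
```
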
If one or two applications fail when the deployment is triggered through a script, fix the issue and deploy those applications from the console.

JDBC issues:

1) Database down (raise a ticket with the DB team)
2) Incorrect hostname or port number (raise a ticket with the network team)
3) Database connection lost (test connectivity with: telnet ipaddress port)
4) Database user account locked (raise a ticket with the DB team to unlock the account)
5) Invalid package error (raise a ticket with the DB team)
6) TNS listener error (raise a ticket with the DB team)
7) Schema does not exist (raise a ticket with the DB team)
8) Cannot allocate resource error (increase the connection pool maximum capacity)
Initial capacity: 5
Max capacity: 15, increase to 25
9) Connection leaks (send the error to the application team)
10) Connection timeout (raise a ticket with the DB team to check for long-running queries)

JMS issues:

Stuck messages:
Check whether the destination queue is available, and check the message format and the queue name.
Rolling messages (messages are redelivered continuously in a loop):
Delete those messages from the queue.

Disk space issues:

If disk usage reaches 95%-100%, delete old log files.
[root@localhost ~]# df -kh
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             3.8G  1.9G  1.8G  52% /
/dev/sda1              46M  9.2M   35M  22% /boot
tmpfs                 506M     0  506M   0% /dev/shm
/dev/sda3              14G  14G    0G  100% /home
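
The check above can be scripted. The following is a minimal sketch, not part of the original procedure: a hypothetical helper, full_filesystems, that reads df output on stdin and prints the mount points at or above a given usage percentage. It assumes the df -P column layout (Use% in column 5, mount point in column 6) and mount points without spaces.

```shell
# Print mount points whose usage is at or above a threshold (percent).
# Reads df -P style output on stdin; the header line is skipped.
full_filesystems() {
    threshold=$1
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)            # strip the percent sign
        if (use + 0 >= t) print $6   # column 6 is the mount point
    }'
}

# Usage on a live system:
#   df -P | full_filesystems 95
```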

du -kh (disk usage)

[root@localhost ~]# du -sh /home
1.8G    /home

[root@localhost bea10.3]# du -sh *
181M    jdk160_05
211M    jrockit_160_05
28K     logs
100M    modules
24K     registry.dat
8.0K    registry.xml
19M     user_projects
556K    utils
429M    wlserver_10.3

Delete old log files:
cd /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/logs

rm -f AdminServer.log00001 AdminServer.log00002 AdminServer.log00003
rm -f AdminServer.out00001 AdminServer.out00002 AdminServer.out00003
rm -f access.log00001 access.log00002 access.log00003

cd /home/bea10.3/user_projects/domains/sherkhan/servers/ms1/logs

rm -f ms1.log00001
rm -f ms1.out00001

Or compress the log files instead (note that gzip -r also compresses the active log files the server is still writing to):
cd /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/logs
gzip -r *
or, from the parent directory:
cd /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer
gzip -r logs
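
To compress only the rotated files and leave the active logs alone, something like the sketch below could be used. The helper name compress_rotated_logs is hypothetical, and the sketch assumes rotated files carry a numeric suffix as in the listing above (AdminServer.log00001, ms1.out00001, access.log00001).

```shell
# Compress rotated log files under one directory.
# Matches names such as AdminServer.log00001 or ms1.out00001.
# Skips the active files (no numeric suffix) and anything already gzipped.
compress_rotated_logs() {
    log_dir=$1
    find "$log_dir" -type f \
        \( -name '*.log[0-9]*' -o -name '*.out[0-9]*' \) \
        ! -name '*.gz' \
        -exec gzip {} \;
}

# Usage:
#   compress_rotated_logs /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/logs
```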


High CPU utilization:

top (Linux)
prstat (Solaris)

top - 07:45:22 up  3:03,  3 users,  load average: 0.16, 0.33, 0.17
Tasks: 113 total,   2 running, 109 sleeping,   0 stopped,   2 zombie
Cpu(s):  0.0%us,  0.7%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1035400k total,  1020348k used,    15052k free,    77688k buffers
Swap:  2040212k total,        0k used,  2040212k free,   483724k cached

ps -ef | grep 9523 (shows the full command line for a PID reported by top, here 9523)

If the zombie process count is above 50, raise a ticket with the OS (Solaris/Linux) administrators.
If a java process is using 95-100% CPU, check the log files for continuously looping messages or JDBC transaction timeouts.
Fix the problem, kill the managed server using kill -9 pid, and restart the server instance.
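
The two checks above (busy java processes and zombie count) can be sketched as small filters over ps output. The helper names are hypothetical, and the ps options shown in the usage comments are the Linux ones. Note that ps reports %CPU averaged over the life of the process, while top shows current usage, so treat this as a first pass only.

```shell
# Print PIDs of java processes at or above a CPU threshold (percent).
# Reads the output of: ps -eo pid,pcpu,comm
busy_java_pids() {
    threshold=$1
    awk -v t="$threshold" 'NR > 1 && $3 == "java" && $2 + 0 >= t { print $1 }'
}

# Count zombie processes. Reads the output of: ps -eo stat
# A zombie has a state string starting with "Z".
count_zombies() {
    awk 'NR > 1 && $1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Usage on a live Linux system:
#   ps -eo pid,pcpu,comm | busy_java_pids 95
#   ps -eo stat | count_zombies
```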

Application log files not rotating:

Check the disk space; if the disk is full, delete old logs.
Check whether the log4j properties file is on the classpath.

404 error:

404 error: the page cannot be displayed.
RFC 2616, section 10.4.5 (404 Not Found):

The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent.

Solution:

1) Check whether the users are using the correct URL.
2) Check whether the Apache server is running (ps -ef | grep httpd) (ps -ef | grep -i apache)

From the Apache2.2/bin directory, start or stop Apache with either pair of commands:

httpd -k start
httpd -k stop

apachectl -k start
apachectl -k stop

3) Check the disk space on the Apache server (df -kh); if it is full, go to Apache2.2/logs and delete the old log files.
4) Check whether the deployed application is in the Active state.
5) If the deployed application has failed, fix the issue and redeploy the application.

500 error:

Internal Server Error (a proxy may report 503 Service Unavailable instead).
This error usually means a back-end server is down.
Check whether the Apache or WebLogic server instance is down; if it is, start it.

403 error:

Access forbidden
Check whether the proxy mapping is correct.
Check whether the grants and synonyms scripts ran correctly on the database side.
Check whether the user has access to the application.

Issue: replicas.prop file corrupted

<BEA-000386> Server subsystem failed. Reason: java.lang.NumberFormatException: null
java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:417)
        at java.lang.Integer.parseInt(Integer.java:499)
        at weblogic.ldap.EmbeddedLDAP.validateVDEDirectories(EmbeddedLDAP.java:1097)
        at weblogic.ldap.EmbeddedLDAP.start(EmbeddedLDAP.java:242)
        at weblogic.t3.srvr.SubsystemRequest.run(SubsystemRequest.java:64)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:207)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:176)

This mostly happens when the LDAP files under the ../domain-name/servers/AdminServer/data/ldap/ directory are corrupted. A possible cause is a full disk: when the associated volume is 100% full, WebLogic Server can corrupt these files.

Solution:
Remove the ../domain-name/servers/AdminServer/data/ldap/conf/replicas.prop file and restart the Admin Server. For example:
rm -f /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/data/ldap/conf/replicas.prop

Error: unable to obtain lock file

<May 13, 2012 8:34:54 PM IST> <Critical> <WebLogicServer> <BEA-000362> <Server failed. Reason:

There are 1 nested errors:
weblogic.management.ManagementException: Unable to obtain lock on C:\bea10.3\user_projects\domains\sherkhan\servers\AdminServer\tmp\AdminServer.lok. Server may already be running
        at weblogic.management.internal.ServerLocks.getServerLock(ServerLocks.java:159)
        at weblogic.management.internal.ServerLocks.getServerLock(ServerLocks.java:58)
        at weblogic.management.internal.DomainDirectoryService.start(DomainDirectoryService.java:73)
        at weblogic.t3.srvr.ServerServicesManager.startService(ServerServicesManager.java:459)
        at weblogic.t3.srvr.ServerServicesManager.startInStandbyState(ServerServicesManager.java:164)
        at weblogic.t3.srvr.T3Srvr.initializeStandby(T3Srvr.java:711)
        at weblogic.t3.srvr.T3Srvr.startup(T3Srvr.java:482)
        at weblogic.t3.srvr.T3Srvr.run(T3Srvr.java:440)
        at weblogic.Server.main(Server.java:67)
<May 13, 2012 8:34:54 PM IST> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FAILED>
<May 13, 2012 8:34:54 PM IST> <Error> <WebLogicServer> <BEA-000383> <A critical service failed. The server will shut itself down>
<May 13, 2012 8:34:54 PM IST> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN>

Solution:
If the server is already running, ignore this error.
If the server is not running and still fails to start, delete the .lok file and restart the server:
del C:\bea10.3\user_projects\domains\sherkhan\servers\AdminServer\tmp\AdminServer.lok
(on Unix, use rm -f on the same file under the domain directory)
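
On Unix, the "is the server already running?" check can be sketched as a filter over ps output before the lock file is removed. The helper name server_running is hypothetical. The sketch assumes the server was started by the standard start scripts, which pass -Dweblogic.Name=<server> on the java command line.

```shell
# Succeeds if the process listing on stdin shows a WebLogic server with this name.
# Looks for the -Dweblogic.Name=<server> JVM argument, followed by a space or end of line.
server_running() {
    grep -Eq -- "-Dweblogic\.Name=$1( |\$)"
}

# Usage (Unix path under the domain directory):
#   ps -ef | server_running AdminServer \
#       || rm -f /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/tmp/AdminServer.lok
```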


Error: users intermittently get a 404 error but can access the application at other times.

1) Check whether all managed servers are in the RUNNING state.
If one of the managed servers is shut down, bring it up.
2) Check the HTTP requests in the access.log file of every managed server.
If one managed server's access.log shows 404 errors, check that server's log for errors.
In my case the server log contained a socket exception saying the address or port was already in use (typically java.net.BindException: Address already in use).
3) Check what is listening on the port:
netstat -anp | grep 8002
If another instance is listening on the port, restart the managed server.
If the issue still persists, raise a request with the network team.
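
To compare the managed servers quickly, the 404 responses in each access.log can be counted. This is a minimal sketch with a hypothetical helper, count_404. It assumes the access log uses the common log format, in which the HTTP status code is the ninth whitespace-separated field.

```shell
# Count 404 responses in an access log read from stdin.
# Assumes the common log format:
#   host - user [date zone] "METHOD /path HTTP/1.1" status bytes
count_404() {
    awk '$9 == 404 { n++ } END { print n + 0 }'
}

# Usage, run once per managed server:
#   count_404 < /home/bea10.3/user_projects/domains/sherkhan/servers/ms1/logs/access.log
```

A server whose count is much higher than the others is the one whose server log is worth reading first.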

Servers running in ADMIN mode:

A server comes up in ADMIN mode when a deployment or connection pool fails during startup.

Fix the deployment or JDBC problem, then resume the servers.

FAILED_NOT_RESTARTABLE state:

If the disk is full, servers go into the FAILED_NOT_RESTARTABLE state.

Stack overflow error:

If a stack overflow error (java.lang.StackOverflowError) appears in the log file,
restart the server or increase the thread stack size with the -Xss JVM option, for example -Xss1024k (or -Xss2048k).
