Learning Oracle: WebLogic Troubleshooting Issues

Deployment issues:

Code issues: we send the error log to the application team so that they can fix the code. Example:
Caused By: weblogic.utils.ErrorCollectionException:
There are 1 nested errors:
weblogic.j2ee.dd.xml.AnnotationProcessException: Duplicate ejb name 'BDAccountEjbBean' found: annotation 'Stateless' on bean
Deployment failed due to a connection pool issue: we fix the connection pool issue and then redeploy the application.

Out of memory issue during the deployment:
Error: java.lang.OutOfMemoryError: PermGen space
This error occurs when the permanent generation (PermGen) area of the JVM runs out of space.
Set the following JVM options in setDomainEnv.sh:
-XX:PermSize=64m
-XX:MaxPermSize=128m
We then set the initial PermSize equal to MaxPermSize, restarted the servers, and redeployed the application.
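As a minimal sketch of the setting above, the two flags can be appended to JAVA_OPTIONS in setDomainEnv.sh. The 128m value and the choice of JAVA_OPTIONS as the place to add them are assumptions, not taken from the original setup. PermGen flags apply only to HotSpot JVMs up to Java 7.

```shell
# Hypothetical fragment for setDomainEnv.sh; PERM_SIZE is an assumed value.
PERM_SIZE="128m"

# Initial and maximum PermGen sizes are set to the same value,
# so the JVM does not have to grow the area at runtime.
JAVA_OPTIONS="${JAVA_OPTIONS:-} -XX:PermSize=${PERM_SIZE} -XX:MaxPermSize=${PERM_SIZE}"
export JAVA_OPTIONS
```

The servers must be restarted for the new options to take effect.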
If one or two applications fail when we trigger the deployment through a script, we fix the issue and deploy them using the console.

JDBC issues:

1) DB down (raise a ticket to the DB team)
2) Incorrect hostname or port number (raise a ticket to the network team)
3) Database connection lost (check connectivity with: telnet ipaddress port)
4) Database user account locked (raise a ticket to the DB team to unlock the account)
5) Invalid package error (raise a ticket to the DB team)
6) TNS listener error (raise a ticket to the DB team)
7) Schema does not exist (raise a ticket to the DB team)
8) Cannot allocate resource error (the connection pool is exhausted)
Initial capacity: 5
Maximum capacity: 15 (increase the maximum to 25)
9) Connection leaks (send the error to the application team)
10) Connection timeout (raise a ticket to the DB team for long-running queries)
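Where telnet is not installed, the connectivity check in item 3 can be sketched with bash alone. The function name check_port is hypothetical, and the sketch assumes bash built with /dev/tcp support and the coreutils timeout command.

```shell
# check_port HOST PORT
# Returns 0 if a TCP connection to HOST:PORT can be opened within 5 seconds,
# non-zero otherwise. Uses bash's /dev/tcp redirection in a child shell.
check_port() {
  local host="$1" port="$2"
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `check_port dbhost 1521 && echo reachable` tests a database listener, where dbhost and 1521 stand in for the real host and port. A successful connection only shows that the port is open, not that the database accepts logins.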

JMS issues:

Stuck message issues:
Check whether the destination queue is available, and check the message format and the queue name.
Rolling message issues (the same messages keep cycling through the queue in a loop):
Delete those messages from the queue.

Disk space issues:

If disk space usage is at 95%-100%, we delete old log files.
[root@localhost ~]# df -kh
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 3.8G 1.9G 1.8G 52% /
/dev/sda1 46M 9.2M 35M 22% /boot
tmpfs 506M 0 506M 0% /dev/shm
/dev/sda3 14G 14G 0G 100% /home
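The 95% check can be sketched as a small filter over df output. The function name over_threshold is hypothetical; it assumes the six-column layout shown above, with the usage percentage in column 5 and the mount point in column 6.

```shell
# over_threshold LIMIT
# Reads df-style output on stdin and prints "mountpoint use%" for every
# filesystem whose usage is at or above LIMIT percent. The header line is skipped.
over_threshold() {
  awk -v limit="$1" 'NR > 1 {
    use = $5
    sub(/%/, "", use)          # strip the percent sign before comparing
    if (use + 0 >= limit) print $6, $5
  }'
}
```

A typical call is `df -P | over_threshold 95`; the -P flag keeps each filesystem on a single line so the columns stay aligned.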

du -kh (disk usage)

[root@localhost ~]# du -sh /home
1.8G /home

[root@localhost bea10.3]# du -sh *
181M jdk160_05
211M jrockit_160_05
28K logs
100M modules
24K registry.dat
8.0K registry.xml
19M user_projects
556K utils
429M wlserver_10.3

Delete old (rotated) log files:
cd /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/logs

rm -f AdminServer.log00001 AdminServer.log00002 AdminServer.log00003
rm -f AdminServer.out00001 AdminServer.out00002 AdminServer.out00003
rm -f access.log00001 access.log00002 access.log00003

cd /home/bea10.3/user_projects/domains/sherkhan/servers/ms1/logs

rm -f ms1.log00001
rm -f ms1.out00001

Or compress the log files (either of the two forms below; note that both also compress the log files currently being written):
cd /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/logs
gzip -r *
cd /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer
gzip -r logs
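To avoid touching the active log files, the cleanup can be limited to rotated files older than a given age. This is a sketch under the assumption that rotated files carry a numeric suffix as in the names above (for example AdminServer.log00001); the function name old_rotated_logs is hypothetical.

```shell
# old_rotated_logs DIR DAYS
# Prints rotated .log/.out files in DIR (not subdirectories) that were last
# modified more than DAYS days ago. Files without a numeric suffix, such as
# the active AdminServer.log, do not match.
old_rotated_logs() {
  find "$1" -maxdepth 1 -type f \
    \( -name '*.log[0-9]*' -o -name '*.out[0-9]*' \) \
    -mtime +"$2"
}
```

Reviewing the printed list before piping it to gzip or rm keeps the deletion step deliberate.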

High CPU utilization:

top (Linux)
prstat (Solaris)

top - 07:45:22 up 3:03, 3 users, load average: 0.16, 0.33, 0.17
Tasks: 113 total, 2 running, 109 sleeping, 0 stopped, 2 zombie
Cpu(s): 0.0%us, 0.7%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1035400k total, 1020348k used, 15052k free, 77688k buffers
Swap: 2040212k total, 0k used, 2040212k free, 483724k cached

ps -ef | grep 9523

If the zombie process count is greater than 50, raise a ticket to the Solaris admins.
If any Java process is using 95-100% CPU, check the log files for continuously looping messages or JDBC transaction timeouts.
Fix the problem, kill the managed server using kill -9 pid, and restart the server instance.
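The zombie count can be read from the Tasks line of the top summary shown above. A minimal sketch, assuming the Linux top layout in which the count directly precedes the word "zombie"; the function name zombie_count is hypothetical.

```shell
# zombie_count
# Reads top's summary on stdin and prints the number that precedes
# the word "zombie" on the "Tasks:" line.
zombie_count() {
  awk '/^Tasks:/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^zombie/) print $(i - 1)
  }'
}
```

On Linux the summary can be captured non-interactively with `top -b -n 1 | zombie_count`, and the result compared against the threshold of 50.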

Application log files not rotating:

Check the disk space; if the disk is full, delete old logs.
Check whether the log4j properties file is on the classpath.

404 error:

404 error: the page cannot be displayed.
RFC 2616, section 10.4.5, 404 Not Found:

The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent.

Solution:

1) Check whether the users are using the correct URL.
2) Check whether the Apache server is running (ps -ef | grep httpd) (ps -ef | grep -i apache).

To start or stop Apache, run one of the following from Apache2.2/bin:

httpd -k start
httpd -k stop

apachectl -k start
apachectl -k stop

3) Check the disk space on the Apache server (df -kh); if the disk is full, delete old log files:

Go to Apache2.2/logs and delete the old logs.

4) Check whether the deployed application is in the Active state.
5) If the deployed application is in the Failed state, fix the issue and redeploy the application.

500 error:

Service unavailable (strictly, HTTP 500 is Internal Server Error and HTTP 503 is Service Unavailable)
This error occurs when a server is down.
Check whether the Apache or WebLogic server instance is down; if it is, start the server.

403 error:

Access forbidden
Check whether the proxy mapping is correct.
Check that the grants and synonyms were created properly on the database side.
Check whether the user has access to the application.

Issue: replicas.prop file corrupted

<BEA-000386> Server subsystem failed. Reason: java.lang.NumberFormatException: null
java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:417)
at java.lang.Integer.parseInt(Integer.java:499)
at weblogic.ldap.EmbeddedLDAP.validateVDEDirectories(EmbeddedLDAP.java:1097)
at weblogic.ldap.EmbeddedLDAP.start(EmbeddedLDAP.java:242)
at weblogic.t3.srvr.SubsystemRequest.run(SubsystemRequest.java:64)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:207)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:176)

This mostly happens when the embedded LDAP files under the ../domain-name/servers/AdminServer/data/ldap/ directory are corrupted. A possible cause is a full disk: when the associated volume is 100% full, WebLogic Server can corrupt these files.

Solution:
Remove the ../domain-name/servers/AdminServer/data/ldap/conf/replicas.prop file and restart the Admin Server; it should then start normally:
rm -f /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/data/ldap/conf/replicas.prop
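A more cautious variant of the removal is to move the file aside, so that it can be inspected or restored later. This is only a sketch: the function name backup_file and the .bak.<timestamp> suffix are hypothetical, and it assumes the Admin Server is stopped while the file is moved.

```shell
# backup_file FILE
# Moves FILE to FILE.bak.<timestamp> and prints the backup path.
# Returns non-zero if FILE does not exist or the move fails.
backup_file() {
  local dest
  [ -f "$1" ] || { echo "not found: $1" >&2; return 1; }
  dest="$1.bak.$(date +%Y%m%d%H%M%S)"
  mv -- "$1" "$dest" && echo "$dest"
}
```

With replicas.prop moved out of the way, the restart step is the same as after deleting it.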

Error: unable to obtain lock file

<May 13, 2012 8:34:54 PM IST> <Critical> <WebLogicServer> <BEA-000362> <Server failed. Reason:

There are 1 nested errors:
weblogic.management.ManagementException: Unable to obtain lock on C:\bea10.3\user_projects\domains\sherkhan\servers\AdminServer\tmp\AdminServer.lok. Server may already be running
at weblogic.management.internal.ServerLocks.getServerLock(ServerLocks.java:159)
at weblogic.management.internal.ServerLocks.getServerLock(ServerLocks.java:58)
at weblogic.management.internal.DomainDirectoryService.start(DomainDirectoryService.java:73)
at weblogic.t3.srvr.ServerServicesManager.startService(ServerServicesManager.java:459)
at weblogic.t3.srvr.ServerServicesManager.startInStandbyState(ServerServicesManager.java:164)
at weblogic.t3.srvr.T3Srvr.initializeStandby(T3Srvr.java:711)
at weblogic.t3.srvr.T3Srvr.startup(T3Srvr.java:482)
at weblogic.t3.srvr.T3Srvr.run(T3Srvr.java:440)
at weblogic.Server.main(Server.java:67)
<May 13, 2012 8:34:54 PM IST> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FAILED>
<May 13, 2012 8:34:54 PM IST> <Error> <WebLogicServer> <BEA-000383> <A critical service failed. The server will shut itself down>
<May 13, 2012 8:34:54 PM IST> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to FORCE_SHUTTING_DOWN>

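If the server is already running, ignore this error.
If the server is not running and cannot be started, delete the .lok file and restart the server:
del C:\bea10.3\user_projects\domains\sherkhan\servers\AdminServer\tmp\AdminServer.lok
On Linux, the equivalent is:
rm -f /home/bea10.3/user_projects/domains/sherkhan/servers/AdminServer/tmp/AdminServer.lok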

Error: users intermittently get a 404 error, while at other times they can access the application.

1) Check whether all managed servers are in the RUNNING state.
If one of the managed servers is in the SHUTDOWN state, bring it up.
2) Check the HTTP requests in the access.log file of every managed server.
If one managed server's access log shows 404 errors, check that server's log for errors.
I got the following error in the log file (a socket exception, typically java.net.BindException):
Address already in use
netstat -anp | grep 8002
If another process is listening on the port, free the port and restart the managed server.
If the issue still persists, raise a request to the network team.
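The netstat check can be narrowed to listening sockets on the exact port, which avoids matches such as 18002 or established client connections. A minimal sketch, assuming the Linux `netstat -anp` column layout (local address in column 4, state in column 6, PID/program in column 7); the function name listeners_on_port is hypothetical.

```shell
# listeners_on_port PORT
# Reads `netstat -anp` output on stdin and prints "local-address pid/program"
# for every TCP socket in LISTEN state bound to exactly PORT.
listeners_on_port() {
  awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $4, $7 }'
}
```

For the case above the call would be `netstat -anp | listeners_on_port 8002`; the PID in the output identifies the process that holds the port.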

Servers running in ADMIN mode:

A server comes up in ADMIN mode because of a deployment or connection pool issue.

Fix the deployment or JDBC problems and resume the servers.

FAILED_NOT_RESTARTABLE state:

If the disk is full, the servers go to the FAILED_NOT_RESTARTABLE state.

Stack overflow error:

If you get a java.lang.StackOverflowError in the log file,
restart the server or increase the thread stack size using -Xss1024k (or -Xss2048k).

Tuesday, March 12, 2019
