Most WebSphere instability is caused by application code defects. However, as WebSphere system engineers, we still need to help stabilise the WebSphere system. There are three areas where we can help.
- Take appropriate measures to achieve relative stability and help with customer experience
- Work with the application team to isolate and fix the defects
- Use engineering processes to prevent system instability
Relative stability is achievable if you take the right approach. For example, for a high-traffic system under heavy load, adding JVM instances is quite often the shortest path to relative stability and can significantly improve customer experience. For a slow memory leak, scheduled recycling of the servers is very effective in achieving a level of stability.
Before you create more JVM instances, consider the following.
- Does the application support vertical clustering?
- Does the application support horizontal clustering?
- Does the application have only limited clustering support?
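Once you know how far the application can be clustered, a rough sizing estimate helps decide how many JVM instances to add. The following is only a minimal sketch: the function name, the requests-per-second figures, the 75% utilisation target and the single spare instance are illustrative assumptions, not values from WebSphere or from any measured system.

```python
import math


def instances_needed(peak_rps, per_jvm_rps, target_utilisation=0.75, spare=1):
    """Estimate the number of JVM instances needed to carry peak load.

    peak_rps           -- peak requests per second for the whole application (assumed, measured by you)
    per_jvm_rps        -- requests per second one JVM sustains before response times degrade (assumed)
    target_utilisation -- fraction of per-JVM capacity to plan for, leaving headroom
    spare              -- extra instances so one can be recycled or fail without overload
    """
    if peak_rps < 0 or per_jvm_rps <= 0:
        raise ValueError("peak_rps must be >= 0 and per_jvm_rps must be > 0")
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in (0, 1]")
    if spare < 0:
        raise ValueError("spare must be >= 0")

    usable_rps = per_jvm_rps * target_utilisation
    # Always keep at least one serving instance, then add the spares.
    return max(1, math.ceil(peak_rps / usable_rps)) + spare


if __name__ == "__main__":
    # Hypothetical numbers: 900 req/s at peak, 150 req/s per JVM.
    print(instances_needed(900, 150))  # prints 9
```

Whether those instances go on the same host or on additional hosts depends on the answers to the clustering questions above.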
For a slow memory leak, increasing the heap size, switching to 64-bit systems, and recycling the servers more frequently can help maintain a level of stability and buy time to fix the bugs causing the leak.
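To see how heap size and recycle frequency trade off, it can help to estimate how long a server survives before the leak exhausts the heap. This is a rough sketch that assumes the leak grows linearly. The function names, the heap and leak figures, and the 0.5 safety factor are hypothetical, so substitute numbers from your own heap monitoring.

```python
def hours_until_heap_exhaustion(max_heap_mb, baseline_mb, leak_mb_per_hour):
    """Hours from a fresh restart until a linearly growing leak fills the heap."""
    if leak_mb_per_hour <= 0:
        raise ValueError("leak_mb_per_hour must be > 0")
    if baseline_mb >= max_heap_mb:
        raise ValueError("baseline_mb must be smaller than max_heap_mb")
    return (max_heap_mb - baseline_mb) / leak_mb_per_hour


def recycle_interval_hours(max_heap_mb, baseline_mb, leak_mb_per_hour, safety_factor=0.5):
    """Recycle well before exhaustion; safety_factor is an assumed margin in (0, 1]."""
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    return hours_until_heap_exhaustion(max_heap_mb, baseline_mb, leak_mb_per_hour) * safety_factor


if __name__ == "__main__":
    # Hypothetical server: 512 MB in use after startup, leaking 32 MB per hour.
    print(recycle_interval_hours(2048, 512, 32))  # prints 24.0
    # Doubling the heap stretches the interval, which buys time rather than fixing the leak.
    print(recycle_interval_hours(4096, 512, 32))  # prints 56.0
```

In this example a 2 GB heap calls for a daily recycle, while a 4 GB heap allows a recycle every two days or so.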
Work closely with the application team, stay away from finger-pointing, build a good working relationship with all engineering and application teams, and proactively produce dumps and share logs. Teamwork and collaboration will help you isolate the defects and fix them.
Be very careful when making changes. Design and implement audit and peer review processes. Diligent and careful system engineering processes help prevent system instability from occurring or recurring.