Serious Memory Leak on 64-Bit Windows Server 2008 R2 SP1
There is a fairly serious memory leak that certainly affects 64-bit Windows Server 2008 R2 SP1 running on VMware ESX 4.1 with ColdFusion 9.0.1, and which may be a general issue with Windows Server 2008 R2 SP1. It is important to give you the full details, as the leak can be hard to identify and can lead to the conclusion that ColdFusion or the JVM is at fault, which is decidedly not the case. Here are the details...
I was on-site at a client and we were load testing a new infrastructure before going live. By the way, and I realize this is my ongoing mantra, please load test thoroughly before ever moving to new hardware or, more to the point, before ever putting new code into production. It is no exaggeration to say that it could save tens of thousands of dollars in lost time and wrong decisions. So, back to my client, and I have to give them a shout-out: they have a first-class team, and we were able to work in harmony with both the development and infrastructure teams to hunt down this issue. That is another key point, and here I am addressing all you VPs and CTOs: it is imperative to ensure that all divisions under your watch work cohesively and with an understanding of the overall needs of the enterprise. All egos should be left at the door. Apologies for the digression, but all of this is truly critical to having any chance of success.
We were load testing fairly aggressively and noticed that all of the VMware instances (64-bit with 8GB RAM, each with two horizontally clustered ColdFusion instances and 3GB allocated to the JVM heap) were almost maxed out on system memory. We did what seemed logical and added another 4GB of system RAM to each VMware server. After a restart we were once again running out of system RAM, and Windows Task Manager appeared to show ColdFusion taking up 4.5GB when the JVM heap was set to a 3GB maximum, something that in my experience had never happened. One of the team then had the good idea to set the heap maximum of one of the ColdFusion instances down to 700MB and reboot the VMware server again. Even so, and now with 12GB of overall system RAM, memory utilization kept climbing, and ColdFusion again appeared to be taking 4.5GB of system RAM even though it was now limited to a 700MB heap. Another member of the team had been coding away during this time and rolled out some CF code designed to give good details on multiple-instance, clustered installs of CF, including the heap memory allocated. Sure enough, ColdFusion only had just under 700MB allocated. We then realized we had an overall system problem, and sure enough we found many articles on the issue.
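The check that settled the question was asking the JVM itself how much heap it holds, rather than trusting the figure in Task Manager. The team's CF code is not reproduced here, so what follows is only a minimal Java sketch of the same idea using the standard java.lang.management API. The class name HeapReport and its helper methods are made up for illustration. To say anything about ColdFusion, the same calls would have to run inside the ColdFusion JVM; run standalone, the program only reports on its own JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not the client's code: prints what the JVM itself
// reports for heap and non-heap memory, for comparison with Task Manager.
public class HeapReport {

    // Convert bytes to whole megabytes (rounds down).
    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Format one MemoryUsage snapshot as a single readable line.
    static String describe(String label, MemoryUsage usage) {
        // getMax() returns -1 when no maximum is defined for this memory area.
        String max = usage.getMax() < 0 ? "undefined" : toMb(usage.getMax()) + " MB";
        return label + ": used=" + toMb(usage.getUsed()) + " MB"
                + ", committed=" + toMb(usage.getCommitted()) + " MB"
                + ", max=" + max;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(describe("heap", memory.getHeapMemoryUsage()));
        System.out.println(describe("non-heap", memory.getNonHeapMemoryUsage()));
    }
}
```

If the heap maximum reported this way matches the configured setting (700MB in our case) while the operating system shows far more memory in use, the JVM heap is unlikely to be where the memory is going.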
This issue is not just cosmetic. It impacts performance very markedly, so I wanted to get this blog post out there quickly.