Thursday 14 June 2012

Investigating memory issues. High memory usage.

I divide memory issues in 2 types:
  • customer complains about OOM exception that sometimes occur.  For OOM exception, dump should be gathered with DebugDiag set for catching OOM (explained here)
  • customer complains about high memory usage, however OOM exception is not thrown. For high memory usage - with DebugDiag set for high performance counters value (explained here)
Let's first grasp how to handle high memory usage.

After you've got a dump, what are our next steps? First of all, you should get main idea: find objects that consume most of the memory, then find where these objects come from ... and at last find a solution, how to reduce memory used.

I'll explain this main idea on one dump sample.

1. Run !dumpheap -stat command. This will output all objects by their type. If you occasionally forget -stat flag, and windbg started to produce hundreds of lines in a second, simply press Ctrl+Break to stop execution. This also relates to every other command execution in windbg :)

As all of the screenshots from WinDBG look extremely boring, I'll draw some pictures on them )) don't blame me for being childish, cause I will totally agree ))


Here I show tail of the output of !dumpheap command for my project

This dump is from healthy and almost clean Sitecore 6.4.1, so if count the overall memory usage, it is not that high. 11903568 bytes is only about 10Mb, this is not a number for really struggling solution.

System.Object and System.String can freely be ignored, only  unless they are too huge. A lot of properties/objects are stored as Strings or Objects, so there is always a huge number of them in each dump."Free" objects are space that already used by application, but it is now free and new portions of information will be stored in this space. As I understand it, Free is already allocated memory, but it wasn't passed yet back to OS, so our application may use it to our needs. This relates not only to memory that appears after cleaning and compacting Gen0, Gen1,Gen2, but also by cleaning (and not compacting!) LOH (Large Object Heap)

So having a lot of Hashtable+bucket objects and Sitecore.Data.IDs look now suspicious to us. Surely, buckets come from Hashtables. So our next step is to find all hashtables:

2. !dumpheap -type System.Collections.Hashtable, Ctrl+Break.


We output all objects of types, that have System.Collections.Hashtable in their names. Our buckets are also here.

Let's now take one of these objects and investigate it. I'll take one of 576 size, I do not like large objects :)

3. !do 00000000ffb6f800 


!do stands for DumpObject. It shows us main information about an object, like its type/size/fields. If this is a string, we can even read that string!

0:000> !do 00000000ffb6f800
Name: System.Collections.Hashtable+bucket[]
MethodTable: 000007feec10f7c0
EEClass: 000007feebdb14a8
Size: 576(0x240) bytes
Array: Rank 1, Number of elements 23, Type VALUETYPE
Element Type: System.Collections.Hashtable+bucket
Fields:
None
Some beautiful watercolor roots


Ok, so this is a bucket (one of the pieces of any hashtable). Next valuable command will allow us to find any Garbage Collector roots to our object. What do these roots mean? Every object on the heap should be referenced from somewhere. This object may be a local variable of some method - then it is rooted from this method. Or this object may be a field of some other object - then it is rooted from it's "mommy". gcroot command allows us to find these roots for an object of interest.


What do you think happens, if our object hasn't got any roots? If you have ever heard about memory leaks, this is one of them - it could be one of them, but not necessary. Here is a wonderful scheme explaining memory leaks: memory leak for children.

Case of memory leak is so special, that I'm going to dedicate a separate post to it. Let's assume that everything goes well and we will get our roots.

4. !gcroot 00000000ffb6f800


Note: Roots found on stacks may be false positives. Run "!help gcroot" for more info.
Scan Thread 8 OSTHread 16dc
Scan Thread 20 OSTHread 1478
...
Scan Thread 61 OSTHread 1748
Scan Thread 3 OSTHread 1210
DOMAIN(00000000010134B0):HANDLE(WeakSh):1381be8:Root:000000013f58d4c0(Sitecore.Caching.Cache)->
000000013f58d530(System.Collections.Hashtable)->
000000013f58d588(System.Collections.Hashtable+bucket[])->
000000017fe475e8(Sitecore.Caching.Cache+CacheEntry)->
000000017fe474b8(Sitecore.Data.Items.Item)->
000000013f58cd10(Sitecore.Data.Database)->
000000013f58cdf8(Sitecore.Collections.DataProviderCollection)->
000000013f58ce18(System.Collections.Generic.List`1[[Sitecore.Data.DataProviders.DataProvider, Sitecore.Kernel]])->
000000013f5b1718(System.Object[])->
000000013f595250(Sitecore.Data.SqlServer.SqlServerDataProvider)->
000000017fe13fe0(Sitecore.Caching.Cache)->
000000017fe14050(System.Collections.Hashtable)->
000000017fe2e930(System.Collections.Hashtable+bucket[])->
000000017fe21a18(Sitecore.Caching.Cache+CacheEntry)->
00000000ffb6df78(Sitecore.Data.DataProviders.PrefetchData)->
00000000ffb6dfc0(Sitecore.Collections.SafeDictionary`2[[System.String, mscorlib],[Sitecore.Data.FieldList, Sitecore.Kernel]])->
00000000ffb6dfe8(System.Collections.Hashtable)->
00000000ffb6e040(System.Collections.Hashtable+bucket[])->
00000000ffb6f398(Sitecore.Data.FieldList)->
00000000ffb6f3b8(Sitecore.Collections.HashList`2[[Sitecore.Data.ID, Sitecore.Kernel],[System.String, mscorlib]])->
00000000ffb6f3d8(System.Collections.Hashtable)->
00000000ffb6f800(System.Collections.Hashtable+bucket[])

There are several useful things that we can get from this output.

Scan Thread 3 OSTHread 1210

It means that our object belongs to thread number 3. Later we can go into this thread and check what happens over there if we need.

You can read our stack of roots from the end to beginning. Last one is our object in question:

00000000ffb6f800(System.Collections.Hashtable+bucket[])

First one is beginning of "relationships chain":

You can check any of object in this chain using !do command. I'll look at the first one:

Root:000000013f58d4c0(Sitecore.Caching.Cache)

5. !do 000000013f58d4c0

Name: Sitecore.Caching.Cache
MethodTable: 000007ff005bd678
EEClass: 000007ff005a81d0
Size: 64(0x40) bytes
 (C:\Windows\Microsoft.NET\Framework64\v2.0.50727\Temporary ASP.NET Files\root\628e0e86\514809be\assembly\dl3\f8260b31\000daaab_d481cc01\Sitecore.Kernel.DLL)
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
000007feec107ca0  4000eec        8        System.String  0 instance 000000013f58d488 name
000007ff005bdd28  4000eed       10     Sitecore.Data.ID  0 instance 000000013f58d500 id
000007feec10b1c0  4000eef       20         System.Int64  1 instance 10485760 maxSize
000007feec10b1c0  4000ef0       28         System.Int64  1 instance 5030 currentSize
000007feec10f5f8  4000ef1       18 ...ections.Hashtable  0 instance 000000013f58d530 data
000007ff005bd520  4000ef2       30         System.Int32  1 instance                3 defaultPriority
000007feec106e70  4000ef3       34       System.Boolean  1 instance                1 enabled
000007feec106e70  4000ef4       35       System.Boolean  1 instance                0 <Scavengable>k__BackingField
000007feec106e70  4000eee      e30       System.Boolean  1   static                0 debugEnabled

This is definitely one of the Sitecore caches. Sitecore is our beloved CMS system - just to note where all this class names comes from :)

We are interested in the exact kind of cache that we are investigating. Remember that we can see the string from dump of string object? So we can take "name" field of our object above, and dump it:

6. !do 000000013f58d488

Name: System.String
MethodTable: 000007feec107ca0
EEClass: 000007feebd0e550
Size: 52(0x34) bytes
 (C:\Windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: master[items]

Hooray! this is "master[items]" cache, that means Sitecore's item cache for specific database. So we are digging somewhere in it.
But what about our exact object? "master[items]" is not its "mommy", as we can see a lot of other object's between first root and end of relationship chain.

7. Look again at the output of gcroot command in step #4.

00000000ffb6f398(Sitecore.Data.FieldList)->
00000000ffb6f3b8(Sitecore.Collections.HashList`2[[Sitecore.Data.ID, Sitecore.Kernel],[System.String, mscorlib]])->
00000000ffb6f3d8(System.Collections.Hashtable)->
00000000ffb6f800(System.Collections.Hashtable+bucket[])

We can see that our object belongs to some FieldList. Where does it come from?

00000000ffb6df78(Sitecore.Data.DataProviders.PrefetchData)->
00000000ffb6dfc0(Sitecore.Collections.SafeDictionary`2[[System.String, mscorlib],[Sitecore.Data.FieldList, Sitecore.Kernel]])->
00000000ffb6dfe8(System.Collections.Hashtable)->
00000000ffb6e040(System.Collections.Hashtable+bucket[])->

Looks like our bucket comes from PrefetchData. Here we go!

We can also run gcroot for some other hashtables from the list in step #2, but most of them will be from some Sitecore caches - I dare you.

Conclusion.

This was just a simple dump of healthy Sitecore installation, so having a lot of objects in Sitecore caches is not a problem - caches should be large to give us fast access to objects. But in another situation - when customer's solution really suffers from memory problems - you will be able to suggest customer to reduce size of the caches. And go have some rest! :)


Credits for images to:

3 comments:

  1. very nice pictures on the boring dumps! :D :D

    ReplyDelete
  2. Wonderful blog. Thanks it helped me...appreciate it..keep it up. Visit more info:- Skype Help and Call +1-800-231-46Free).

    ReplyDelete