Finding out the size (memory/heap usage) of Java Objects and Classes can be somewhat tricky as there is no built-in sizeof()-like functionality. Luckily most people don’t have to face this problem. But when you do, it can be difficult and in the end is almost always a best-guess approximation.
The reason I started to look into this subject was to build a memory-aware, LRU eviction cache for the open source database project HBase. You can see the final result of that effort here.
When sizing a Class you can take two approaches. One is to use the methods totalMemory and freeMemory from the Java Runtime class. The other is to analyze the internal structure of your Class. The first approach has the benefit of giving you the actual memory footprint of your object rather than the expected one, but has the downside of relying on the garbage collector to do its thing. For this to work you often have to make repeated calls to the GC and sleep in between, to give it time to process and minimize heap usage. As one can see, this is not suitable for use in a real-time system, but it can be a solid fallback when everything else fails, or a way to verify your findings.
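A minimal sketch of the first approach follows. The class name HeapProbe, its method names, the number of GC rounds and the 100 ms pause are illustrative assumptions, not values taken from HBase or from any library. It averages over many instances, since the footprint of a single object is easily lost in the noise of a heap measurement.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of the original post.
class HeapProbe {
    // Heap currently in use, as reported by the Runtime class.
    static long usedHeap() {
        Runtime r = Runtime.getRuntime();
        return r.totalMemory() - r.freeMemory();
    }

    // Request a collection several times, pausing so the GC can settle.
    static void settle(int rounds, long pauseMillis) {
        for (int i = 0; i < rounds; i++) {
            System.gc(); // only a request; the JVM may ignore it
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Rough per-instance footprint: allocate `count` instances, keep them
    // reachable, and divide the growth in used heap by `count`.
    static long averageFootprint(Supplier<Object> factory, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        Object[] keep = new Object[count]; // allocated before the baseline
        settle(3, 100);
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.get();
        }
        settle(3, 100);
        long after = usedHeap();
        // Use the array after measuring so the instances stay reachable.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("factory returned null");
        }
        return (after - before) / count;
    }

    public static void main(String[] args) {
        long bytes = averageFootprint(Object::new, 100_000);
        System.out.println("Approximate bytes per java.lang.Object: " + bytes);
    }
}
```

Because System.gc() is only a request to the JVM, the printed number should be treated as an estimate and compared across several runs.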
The second way of sizing your Class is to dissect it into its fundamental building blocks. Let’s start by introducing the notion of a reference, known in other languages as a pointer. A reference, or REF for short, is 4 bytes on a 32-bit system and 8 bytes on a 64-bit system. Also note that the total size of an Object is always rounded up to an alignment boundary, which is 8 bytes on both 32-bit and 64-bit systems.
Every instantiated Object in Java has a fixed overhead of 2 REFs. To that, you add the contributions from all the non-static variables/fields contained within the Object. For the most basic primitives, these sizes are:
byte = 1 byte, int = 4 bytes, long = 8 bytes
For example, let’s define and size the Class PrimitiveContainer
public class PrimitiveContainer { public int x = 1; public long y = 2; }
The calculation for this would be:
overhead + sizeof(int) + sizeof(long) = (2 * REF) + 4 + 8 = 28 bytes on 64-bit, which becomes 32 bytes after word alignment
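The same arithmetic can be written down as a small helper. This is only a sketch of the rules stated in this post: the class name ShallowSize and its methods are made up for illustration, and the 8-byte alignment is the one used in the 64-bit example above.

```java
// Hypothetical helper that encodes the sizing rules from this post.
class ShallowSize {
    static final int ALIGNMENT = 8; // assumption: sizes round up to 8 bytes

    // Round `size` up to the next multiple of ALIGNMENT.
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Object overhead (2 * REF) plus the summed field sizes, then aligned.
    static long of(int refBytes, long... fieldBytes) {
        long size = 2L * refBytes;
        for (long f : fieldBytes) {
            size += f;
        }
        return align(size);
    }

    public static void main(String[] args) {
        // PrimitiveContainer on 64-bit: REF = 8, one int (4) and one long (8).
        // 16 + 4 + 8 = 28, which rounds up to 32.
        System.out.println("PrimitiveContainer: " + of(8, 4, 8) + " bytes");
    }
}
```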
Another example that uses Objects instead of primitives:
public class ObjectContainer { public Integer x = 1; public Long y = 2L; }
Since this Object contains references to other Objects (not primitives), we will calculate and include both the size of the referenced Objects (Integer and Long) as well as the references to those Objects (in ObjectContainer). Both Integer and Long will have the fixed Object overhead of 2 * REF in addition to either an int or long primitive in each. So:
sizeof(Integer) = (2 * REF) + 4 = 20 bytes on 64-bit, which aligns to 24 bytes
sizeof(Long) = (2 * REF) + 8 = 24 bytes
The total calculation would be:
overhead + reference to Integer + sizeof(Integer) + reference to Long + sizeof(Long) = (2 * REF) + (REF + 24) + (REF + 24) = 16 + 32 + 32 = 80 bytes
As you can see, there is a significant increase in memory usage when using Objects instead of their primitive counterparts, in this case 2.5X larger.
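To make the 80-byte figure easier to check, here is a sketch that adds up the container and the two Objects it refers to on a 64-bit system. The class DeepSize and its methods are hypothetical names, and the code assumes each boxed value is its own heap object; small values produced by autoboxing may in practice be shared from a cache.

```java
// Hypothetical worked example for ObjectContainer on a 64-bit system.
class DeepSize {
    static final int REF = 8;       // assumption: 64-bit references
    static final int ALIGNMENT = 8; // assumption: 8-byte alignment

    // Round `size` up to the next multiple of ALIGNMENT.
    static long align(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // An Object wrapping a single primitive: overhead plus payload.
    static long boxed(int payloadBytes) {
        return align(2L * REF + payloadBytes);
    }

    // ObjectContainer itself plus the Integer and Long it points to.
    static long objectContainer() {
        long container = align(2L * REF + 2L * REF); // overhead + two reference fields
        long integerSize = boxed(4);                 // 20, aligned to 24
        long longSize = boxed(8);                    // 24
        return container + integerSize + longSize;
    }

    public static void main(String[] args) {
        System.out.println("ObjectContainer: " + objectContainer() + " bytes");
    }
}
```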
Arrays in Java, though really Objects underneath, are unlike other complex Classes (Lists, Maps, etc.) because they can hold primitives directly. Experimentation and instrumentation of the JVM show that the fixed overhead of any array is 3 * REF.
So with this in mind you can treat an array as a regular Object: add the size of its elements to the fixed overhead and align the total.
For example if you have:
byte [] bs = {1,2,3,4};
on a 32-bit system, you get a total size of: 3 * REF + 4 = 3 * 4 + 4 = 16 bytes
on a 64-bit system, you get a total size of: 3 * REF + 4 = 3 * 8 + 4 = 28 bytes, which aligns to 32 bytes (twice the memory usage in this case)
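The array rule can be sketched the same way. ArraySize is a hypothetical name, and the 3 * REF overhead is the measured figure quoted above rather than anything guaranteed by the JVM.

```java
// Hypothetical helper applying the array rule from this post.
class ArraySize {
    // Round `size` up to the next multiple of `alignment`.
    static long align(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Fixed array overhead (3 * REF) plus the elements, aligned to 8 bytes.
    static long of(int refBytes, int elementBytes, int length) {
        long raw = 3L * refBytes + (long) elementBytes * length;
        return align(raw, 8);
    }

    public static void main(String[] args) {
        // byte[] bs = {1, 2, 3, 4}: four elements of one byte each.
        System.out.println("32-bit: " + of(4, 1, 4) + " bytes"); // 12 + 4 = 16
        System.out.println("64-bit: " + of(8, 1, 4) + " bytes"); // 24 + 4 = 28, aligned to 32
    }
}
```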
One more thing worth mentioning is the cost of static and final variables. Static variables are usually not taken into consideration when sizing your object, since they are allocated once per Class and not for each instantiated Object. Final variables should almost always be included, but of course it depends on your use case and the reason that you are sizing your objects.
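As an illustration of that rule, the sketch below walks a Class with reflection, skips static fields and counts everything else, including final fields. FieldSizer and Sample are hypothetical names, REF is fixed at 8 bytes for a 64-bit system, and the 1-byte size for boolean is an assumption that this post does not cover.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical estimator: counts instance fields only, using this post's rules.
class FieldSizer {
    static final int REF = 8;       // assumption: 64-bit references
    static final int ALIGNMENT = 8; // assumption: 8-byte alignment

    // Example class: the static field is skipped, the final field is counted.
    static class Sample {
        static long created;
        final int id = 7;
        long stamp;
    }

    // Size of one field slot; any non-primitive field is a reference.
    static int fieldBytes(Class<?> type) {
        if (!type.isPrimitive()) return REF;
        if (type == byte.class || type == boolean.class) return 1; // boolean: assumed
        if (type == short.class || type == char.class) return 2;
        if (type == int.class || type == float.class) return 4;
        return 8; // long or double
    }

    // Overhead plus all non-static fields, including inherited ones, aligned.
    static long shallowSize(Class<?> clazz) {
        long size = 2L * REF;
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue; // stored once per Class, not per instance
                }
                size += fieldBytes(f.getType());
            }
        }
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        // 16 (overhead) + 4 (int) + 8 (long) = 28, aligned to 32.
        System.out.println("Sample: " + shallowSize(Sample.class) + " bytes");
    }
}
```

This only gives the shallow size; Objects reachable through reference fields would have to be sized separately and added.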
#1 by Shen at July 24th, 2009
Dear Erik Holstad,
In the article, you mentioned it “Every instantiated Object in Java has a fixed overhead of 2 REFs.”. Could you share any resources or references about the topic? Thanks.
#2 by Erik Holstad at July 26th, 2009
Hi Shen!
All the numbers in that part were measured using http://sizeof.sourceforge.net/ and/or a small test program like:
public class Sizer {
  public static void main(String [] args) throws Exception {
    Runtime r = Runtime.getRuntime();
    // Pre-instantiate variables
    long memoryBefore = 0;
    long memoryAfter = 0;
    int loops = 10;
    runGC(r, loops);
    memoryBefore = getMemoryUsage(r);
    // Long lo = new Long(1);
    TestClass in = new TestClass();
    runGC(r, loops);
    memoryAfter = getMemoryUsage(r);
    System.out.println("Diff in size is " + (memoryAfter - memoryBefore));
  }
  public static void runGC(Runtime r, int loops) throws Exception {
    for(int i=0; i<loops; i++) {
      r.gc();
      Thread.sleep(2000);
    }
  }
  public static long getMemoryUsage(Runtime r) throws Exception {
    long usedMemory = r.totalMemory() - r.freeMemory();
    System.out.println("Memory Usage: " + usedMemory);
    return usedMemory;
  }
  private static class TestClass {
    public TestClass(){}
  }
}
Erik
#3 by Shen at July 27th, 2009
Thanks Erik,
I try to explain that “Every instantiated Object in Java has a fixed overhead of 2 REFs”.
In the snippet above:
TestClass in = new TestClass();
“in” is a reference, so it holds 4 bytes on 32-bit. Another is the implicit “this” reference, so “Every instantiated Object in Java has a fixed overhead of 2 REFs”. Is that right?
#4 by Shen at July 27th, 2009
class Test
{
int i = 1;
}
In the above example, I get 24 bytes (2 * REF + sizeof(int)) on a 64-bit system. Why is it 16 bytes on a 32-bit system and not 12 bytes? Thanks.
#5 by Erik Holstad at August 3rd, 2009
Hi Shen!
The reason that you get 16B instead of 12B is alignment. As I mentioned in the post, everything is aligned to 8B chunks, which is why you get that extra overhead on the 32-bit system.
Erik
#6 by Shen at August 10th, 2009
Do you mean everything is aligned to 8B chunks regardless of whether the system is 32-bit or 64-bit? Why not 4B chunks on a 32-bit system? Thanks.
#7 by Erik Holstad at August 11th, 2009
Hi Shen!
Yeah, that is what I’m saying. For both 32-bit and 64-bit systems the alignment is 8 bytes. I’m not 100% sure why this is, but it would be really weird if you had to send a long, 8 bytes, in two packages, right?
#8 by Doug Clayton at August 25th, 2009
Everything is aligned to 8-byte boundaries because most processors won’t let you access an 8-byte value unless it is aligned to 8 bytes. Since the longest primitive Java supports is 8 bytes (double, long, or reference in 64 bit), it aligns its objects to that size, so that any object can be placed at any valid location in memory without worrying about whether a long or double inside it ends up properly aligned. The JVM just has to make sure that the long or double is aligned properly relative to the top of the object, and that guarantees it will be aligned in memory for the processor.
#9 by Doug Clayton at August 25th, 2009
Shen,
The instance of TestClass itself has 8 bytes of overhead. That has to contain at least one reference to point to the class object for that instance. Based on my tests, the other 4 hold some value, because the overhead is 16 bytes on a 64-bit JVM, rather than 8 as it would be if those 4 bytes were just padding.
In your example, the 4 bytes taken up by the “in” reference are allocated on the stack, not on the heap. The implicit “this” reference is also allocated on the stack: it’s passed to methods just like any parameter, but it’s hidden.