Java GCs

Java application code does not release memory explicitly. Instead, the garbage collector (GC) finds objects that are no longer reachable and reclaims their memory.

The generational garbage collectors in the HotSpot VM are designed around two assumptions (the weak generational hypothesis):

  1. Most objects become unnecessary soon after they are created.
  2. References from old objects to young objects are rare.

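Based on these hypotheses, the HotSpot VM divides memory into Young, Old, and Metaspace areas and manages them separately. Young and Old are the two generations of the Java heap; Metaspace is allocated from native memory outside the heap.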

Young
Most newly created objects are allocated here. A collection that removes objects from this space is called a minor GC (or young GC). The young space is divided into one Eden space and two Survivor spaces.

Old
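Objects that survive young GCs and meet certain criteria (such as reaching an age threshold) are moved to the old space. A collection that removes objects from this space is called a major GC. The term is often used interchangeably with full GC, although a full GC collects the whole heap.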

Metaspace
Class metadata is stored in Metaspace. Its memory is reclaimed when classes are unloaded, which typically happens as part of a major GC.
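
To see how these areas appear in a running JVM, the standard `java.lang.management` API can list the memory pools. This is a minimal sketch: the class name `MemoryPools` is only illustrative, and the pool names that get printed depend on the JVM and on the collector in use, so no particular output is assumed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    /** Returns one "name (type)" line per memory pool reported by the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() distinguishes heap pools (Eden, Survivor, Old) from non-heap pools (Metaspace).
            lines.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

On a HotSpot VM the list usually contains separate pools for Eden, Survivor, the old generation, and Metaspace, with names prefixed by the active collector.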

Allocation of Objects

  1. Most newly created objects are allocated in the Eden space.
  2. Objects in Eden that survive a young GC are moved to one of the Survivor spaces.
  3. On later young GCs, the unnecessary objects in the occupied Survivor space are discarded and the surviving objects are copied to the other Survivor space, so one Survivor space is always empty.
  4. This step repeats. When an object is old enough (has survived enough young GCs), it is moved to the old space.
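
The steps above can be sketched as a toy model. This is not how HotSpot is implemented: `TenuringSketch`, its `Obj` type, the `live` flag that stands in for reachability, and the threshold of 3 are all assumptions made for illustration. In HotSpot the maximum age is tunable with `-XX:MaxTenuringThreshold`.

```java
import java.util.ArrayList;
import java.util.List;

class TenuringSketch {
    // Hypothetical threshold chosen for the example.
    static final int TENURING_THRESHOLD = 3;

    static final class Obj {
        final String name;
        boolean live; // stands in for "still reachable"
        int age;      // number of young GCs survived
        Obj(String name, boolean live) { this.name = name; this.live = live; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>();
    List<Obj> toSurvivor = new ArrayList<>(); // always empty between GCs
    final List<Obj> old = new ArrayList<>();

    /** Step 1: new objects start in Eden. */
    Obj allocate(String name, boolean live) {
        Obj o = new Obj(name, live);
        eden.add(o);
        return o;
    }

    /** Steps 2-4: copy survivors, age them, promote the old ones. */
    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (Obj o : candidates) {
            if (!o.live) continue;          // dead objects are simply not copied
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                old.add(o);                 // old enough: promote
            } else {
                toSurvivor.add(o);          // otherwise copy to the empty Survivor space
            }
        }
        eden.clear();
        fromSurvivor.clear();
        List<Obj> tmp = fromSurvivor;       // swap the roles of the two Survivor spaces
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }

    public static void main(String[] args) {
        TenuringSketch heap = new TenuringSketch();
        heap.allocate("keeper", true);
        heap.allocate("temp", false);
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc();
            System.out.println("after GC " + gc + ": survivor=" + heap.fromSurvivor.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

In this model the short-lived object disappears at the first GC, while the long-lived one bounces between the Survivor spaces until its age reaches the threshold.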


GC Implementations

There are many ways to implement GC. I’ll introduce 3 of them:

  1. Serial GC
  2. Concurrent Mark Sweep (CMS) GC
  3. G1 (Garbage-First) GC
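
Each of these collectors is selected with a HotSpot command-line flag. The sketch below lists the flags in a comment and uses the standard `GarbageCollectorMXBean` API to report which collectors the running JVM actually uses. The class name `ActiveCollectors` is illustrative, and the reported names vary by JVM version, so none are assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Collector selection flags on HotSpot:
//   java -XX:+UseSerialGC ...         Serial GC
//   java -XX:+UseConcMarkSweepGC ...  CMS (deprecated in JDK 9, removed in JDK 14)
//   java -XX:+UseG1GC ...             G1 GC
class ActiveCollectors {
    /** Names of the collectors registered in the running JVM. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time (ms) since JVM start; -1 means "undefined".
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```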

Serial GC

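Uses a single thread (one CPU core) and stops the application while it runs. The old generation is collected in a mark-sweep-compact way:

  1. It “marks” the objects that are still necessary (reachable).
  2. It “sweeps” the unnecessary objects, scanning from the start of the heap.
  3. Then it “compacts” the remaining objects toward the start of the heap so that the free space is contiguous.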
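
The three phases can be illustrated on a toy heap. This is a simplified sketch, not the HotSpot implementation: the heap is modeled as an array of slots, and `MarkSweepCompactSketch`, `Node`, and `collect` are names invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MarkSweepCompactSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        boolean marked;
        Node(String name) { this.name = name; }
    }

    /** Runs one collection over the heap slots (null = free) and returns the compacted heap. */
    static Node[] collect(Node[] heap, List<Node> roots) {
        // 1. Mark: everything reachable from the roots is necessary.
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.marked) continue;
            n.marked = true;
            for (Node ref : n.refs) stack.push(ref);
        }
        // 2. Sweep: free every slot that holds an unmarked object.
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) heap[i] = null;
        }
        // 3. Compact: slide the survivors to the start so the free space is contiguous.
        Node[] compacted = new Node[heap.length];
        int next = 0;
        for (Node n : heap) {
            if (n != null) {
                n.marked = false; // reset for the next collection
                compacted[next++] = n;
            }
        }
        return compacted;
    }
}
```

For example, with a heap of `a, b, c, d` where only `a` is a root and `a` references `c`, the result holds `a` and `c` in the first two slots and two free slots after them.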

CMS GC

CMS focuses on reducing STW (stop-the-world) pause time. Marking proceeds in three phases:

  1. Initial mark (STW): it marks only the objects directly reachable from the GC roots, such as class loaders.
  2. Concurrent mark: it traces the objects reachable from those marked in phase 1 while the application threads keep running.
  3. Remark (STW): it re-checks objects that were newly created or dereferenced during the concurrent mark.

During the concurrent sweep, it reclaims the unnecessary objects while the application keeps running. The disadvantages of CMS are:

  1. High CPU and memory usage.
  2. There is no compaction step on each GC. When the heap becomes too fragmented, CMS has to fall back to a stop-the-world compaction, which can be much slower than a collection by the other GCs.
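
The fragmentation problem in item 2 can be shown with a toy model in which the heap is an array of equally sized slots. The names `FragmentationSketch`, `totalFree`, `largestFreeRun`, and `compact` are hypothetical, and real heaps track free space quite differently. The sketch only shows why free space that is not contiguous cannot serve a large allocation.

```java
class FragmentationSketch {
    /** Number of free slots (false = free). */
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean slot : used) {
            if (!slot) free++;
        }
        return free;
    }

    /** Length of the longest contiguous run of free slots. */
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean slot : used) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Models compaction: all used slots slide to the start of the heap. */
    static boolean[] compact(boolean[] used) {
        boolean[] result = new boolean[used.length];
        int live = used.length - totalFree(used);
        for (int i = 0; i < live; i++) {
            result[i] = true;
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] heap = { true, false, true, false, true, false, true, false };
        System.out.println("free slots: " + totalFree(heap));
        System.out.println("largest run before compaction: " + largestFreeRun(heap));
        System.out.println("largest run after compaction: " + largestFreeRun(compact(heap)));
    }
}
```

With every second slot in use, half of the heap is free, yet an object that needs two adjacent slots cannot be placed until the heap is compacted.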

G1 GC

G1 GC divides the heap into equally sized “regions”.

  1. It allocates objects in regions and runs GC on individual regions as needed.
  2. When a region is full, it runs GC and allocates objects in another region.
  3. Under certain criteria, it promotes objects from survivor regions to old regions.

The disadvantages of G1 GC are:

  1. Once an object has been moved to the old space, it is hard to clean up (so a full GC is hard to avoid).
  2. Handling of humongous objects (size > region size / 2) is not optimized.
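
The size rule in item 2 is simple arithmetic and can be sketched as follows. `HumongousSketch` and its method names are invented for the example, and the 1 MB region size is only an assumed value. In HotSpot the region size depends on the heap size and can be set with `-XX:G1HeapRegionSize`.

```java
class HumongousSketch {
    /** Applies the rule from the text: an object is humongous if it is larger than half a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of whole regions an object of this size occupies (rounded up). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 1024 * 1024; // assumed 1 MB region size
        long[] sizes = { 400_000, 600_000, 3_000_000 };
        for (long size : sizes) {
            System.out.println(size + " bytes: humongous=" + isHumongous(size, region)
                    + " regions=" + regionsNeeded(size, region));
        }
    }
}
```

Under the 1 MB assumption, a 600,000-byte array already counts as humongous and takes a whole region to itself, which is one reason large allocations are costly for G1.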