Summary

This page describes adding support for Async Monitor Deflation to OpenJDK. The primary goal of this project is to reduce the time spent in safepoint cleanup operations.

RFE: 8153224 Monitor deflation prolong safepoints
         https://bugs.openjdk.java.net/browse/JDK-8153224

Full Webrev: http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/

Inc Webrev: http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/

Background

This patch for Async Monitor Deflation is based on Carsten Varming's

http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/

which has been ported to work with monitor lists. Monitor lists were optional via the '-XX:+MonitorInUseLists' option in JDK8, the option became default 'true' in JDK9, the option became deprecated in JDK10 via JDK-8180768, and the option became obsolete in JDK12 via JDK-8211384. Carsten's webrev is based on JDK10 so there was a bit of porting work needed to merge his code and/or algorithms with jdk/jdk.

Carsten also submitted a JEP back in the JDK10 time frame:

JDK-8183909 Concurrent Monitor Deflation

https://bugs.openjdk.java.net/browse/JDK-8183909

The OpenJDK JEP process has evolved a bit since JDK10 and a JEP is no longer required for a project that is well defined to be within one area of responsibility. Async Monitor Deflation is clearly defined to be in the JVM Runtime team's area of responsibility so it is likely that the JEP (JDK-8183909) will be withdrawn and the work will proceed via the RFE (JDK-8153224).

Introduction

The current idle monitor deflation mechanism executes at a safepoint during cleanup operations. Due to this execution environment, the current mechanism does not have to worry about interference from concurrently executing JavaThreads. Async Monitor Deflation uses JavaThreads and the ServiceThread to deflate idle monitors so the new mechanism has to detect interference and adapt as appropriate. In other words, data races are a natural part of Async Monitor Deflation and the algorithms have to detect the races and react without data loss or corruption.

Key Parts of the Algorithm

1) Deflation With Interference Detection

ObjectSynchronizer::deflate_monitor_using_JT() is the new counterpart to ObjectSynchronizer::deflate_monitor() and does the heavy lifting of asynchronously deflating a monitor using a three-part protocol:

  1. Setting a NULL owner field to DEFLATER_MARKER with cmpxchg() forces any contending thread through the slow path. A racing thread would be trying to set the owner field.
  2. Making a zero contentions field a large negative value with cmpxchg() forces racing threads to retry. A racing thread would have set the owner field (after we stored DEFLATER_MARKER) and would be trying to increment the contentions field.
  3. If the owner field is still equal to DEFLATER_MARKER, then we have won all the races and can deflate the monitor.

If we lose any of the races, the monitor cannot be deflated at this time.
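
The three parts above can be sketched with standard C++ atomics. This is only an illustration under stated assumptions, not the HotSpot code: MonitorSketch, try_async_deflate() and MAX_JINT are hypothetical stand-ins, std::atomic replaces HotSpot's cmpxchg(), and the ref_count and waiters checks that appear later on this page are left out.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the two ObjectMonitor fields used by the protocol.
struct MonitorSketch {
  std::atomic<void*>   owner{nullptr};
  std::atomic<int32_t> contentions{0};
};

// Assumption: any unique address that can never be a real thread works as a marker.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;
static const int32_t MAX_JINT = INT32_MAX;

// Returns true if the deflater won all the races and may deflate the monitor.
bool try_async_deflate(MonitorSketch& m) {
  // Part 1: a NULL owner becomes DEFLATER_MARKER.
  void* expected_owner = nullptr;
  if (!m.owner.compare_exchange_strong(expected_owner, DEFLATER_MARKER)) {
    return false;  // the monitor is owned, so it is not idle
  }
  // Part 2: a zero contentions field becomes a large negative value.
  int32_t expected_contentions = 0;
  if (m.contentions.compare_exchange_strong(expected_contentions, -MAX_JINT)) {
    // Part 3: if the owner is still DEFLATER_MARKER, all races are won.
    if (m.owner.load() == DEFLATER_MARKER) {
      return true;
    }
    // The owner changed underneath us (A-B-A); undo the negative value.
    m.contentions.fetch_add(MAX_JINT);
  }
  // Lost a race: give the owner field back if it still holds our marker.
  void* marker = DEFLATER_MARKER;
  m.owner.compare_exchange_strong(marker, nullptr);
  return false;
}
```

On an idle monitor the call returns true and leaves owner equal to DEFLATER_MARKER with a negative contentions value; on a monitor with a non-zero contentions field it returns false and puts owner back to NULL.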

Once we know it is safe to deflate the monitor (which is mostly field resetting and monitor list management), we have to restore the object's header. That's another racy operation that is described below in "Restoring the Header With Interference Detection".

2) Restoring the Header With Interference Detection

ObjectMonitor::install_displaced_markword_in_object() is the new piece of code that handles all the racy situations with restoring an object's header asynchronously. The function is called from a few places (deflation, object monitor entry, and saving an ObjectMonitor* in an ObjectMonitorHandle). The restoration protocol for the object's header uses the mark bit along with the hash() value staying at zero to indicate that the object's header is being restored. Only one of the possible racing scenarios can win and the losing scenarios all adapt to the winning scenario's object header value.

3) Using "owner" or "contentions" With Interference Detection

Various code paths have been updated to recognize an owner field equal to DEFLATER_MARKER or a negative contentions field and those code paths will retry their operation. This is the shortest "Key Part" description, but don't be fooled. See "Gory Details" below.
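
As a rough sketch of what such a check can look like on the contending side, the fragment below follows the "atomic inc contentions" and "contentions <= 0 && owner == DEFLATER_MARKER" steps shown in the diagrams on this page. MonitorSketch, EnterStep and note_contention() are hypothetical names, std::atomic stands in for HotSpot's atomics, and restoring the object's header before the retry is omitted.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the two ObjectMonitor fields involved.
struct MonitorSketch {
  std::atomic<void*>   owner{nullptr};
  std::atomic<int32_t> contentions{0};
};

// Assumption: a unique address that can never be a real thread.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;

enum class EnterStep { kContinue, kRetry };

// Called by a thread that found the owner field contended.
EnterStep note_contention(MonitorSketch& m) {
  // Announce that the monitor is busy.
  m.contentions.fetch_add(1);
  // A non-positive value after our increment means the deflater already made
  // the field a large negative value; combined with the marker in the owner
  // field this tells us async deflation won the race.
  if (m.contentions.load() <= 0 && m.owner.load() == DEFLATER_MARKER) {
    return EnterStep::kRetry;  // caller retries its operation from the top
  }
  return EnterStep::kContinue;  // caller proceeds with the contended path
}
```

If the increment lands before the deflater's second cmpxchg(), the field reads 1, the deflater's cmpxchg() from zero fails, and the contending thread proceeds.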

An Example of ObjectMonitor Interference

For example, when ObjectMonitor::enter() detects genuine contention via the owner field, it atomically increments the contentions field to indicate that the ObjectMonitor is busy. The thread calling enter() (T-enter) is potentially racing with an Async Monitor Deflation by another JavaThread (T-deflate) so both threads have to check the result of the race.

Start of the Race

                            ObjectMonitor              T-deflate
    T-enter                 +-----------------------+  ----------------------------------------
    ----------------------  | owner=NULL            |  deflate_monitor_using_JT() {
                            | contentions=0         |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
                            +-----------------------+

Racing Threads

                            ObjectMonitor              T-deflate
    T-enter                 +-----------------------+  --------------------------------------------
    ----------------------  | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
    <owner is contended>    | contentions=0         |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
    atomic inc contentions  +-----------------------+    :
                                                         prev = cmpxchg(-max_jint, &contentions, 0)

T-deflate Wins

                                      ObjectMonitor              T-deflate
    T-enter                           +-----------------------+  --------------------------------------------
    --------------------------------  | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
    <owner is contended>              | contentions=-max_jint |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
    atomic inc contentions            +-----------------------+    :
    if (contentions <= 0 &&                                        prev = cmpxchg(-max_jint, &contentions, 0)
        owner == DEFLATER_MARKER) {                                if (prev == 0 &&
      restore obj header                                               owner == DEFLATER_MARKER) {
      retry enter                                                    restore obj header
    }                                                                finish the deflation
                                                                   }

T-enter Wins

                                      ObjectMonitor              T-deflate
    T-enter                           +-----------------------+  --------------------------------------------
    --------------------------------  | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
    <owner is contended>              | contentions=1         |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
    atomic inc contentions            +-----------------------+    :
    if (contentions <= 0 &&                                        prev = cmpxchg(-max_jint, &contentions, 0)
        owner == DEFLATER_MARKER) {                                if (prev == 0 &&
    } else {                                                           owner == DEFLATER_MARKER) {
      do contended                                                 } else {
      enter work                                                     cmpxchg(NULL, &owner, DEFLATER_MARKER)

T-enter Wins By A-B-A

                                                ObjectMonitor                T-deflate
    T-enter                                     +-------------------------+  ------------------------------------------
    ------------------------------------------  | owner=DEFLATER_MARKER   |  deflate_monitor_using_JT() {
    <owner is contended>                        | contentions=1           |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
    atomic inc contentions                      +-------------------------+ 1> :
 1> if (contentions <= 0 &&                                 ||              2> : <thread_stalls>
        owner == DEFLATER_MARKER) {                         \/                 :
    } else {                                    +-------------------------+    :
      EnterI()                                  | owner=Self/T-enter      |    :
        cmpxchg(Self, &owner, DEFLATER_MARKER)  | contentions=0           |    : <thread_resumes>
      atomic dec contentions                    +-------------------------+    prev = cmpxchg(-max_jint, &contentions, 0)
 2> }                                                       ||                 if (prev == 0 &&
    // finished with enter                                  \/                     owner == DEFLATER_MARKER) {
 3> : <does app work>                           +-------------------------+    } else {
    exit() monitor                              | owner=Self/T-enter|NULL |      cmpxchg(NULL, &owner, DEFLATER_MARKER)
      owner = NULL                              | contentions=0           |      atomic add max_jint to contentions
                                                +-------------------------+   3> bailout on deflation
                                                                               }

T-enter and T-deflate Both Lose

This subsection is pure theory right now. I don't have a failing test case that illustrates this race result.

After working out the bug described in the "T-deflate and T-hash Both Lose" subsection below, it is time to take a closer look at the T-enter versus T-deflate race. For analysis of this race to make sense, the ref_count field has to be introduced in this subsection instead of in the "Hashcodes and Object Header Interference" section below.

                                          ObjectMonitor                T-deflate
    T-enter                               +-------------------------+  -----------------------------------------------
    ------------------------------------  | owner=DEFLATER_MARKER   |  deflate_monitor_using_JT() {
    ref_count inc by ObjectMonitorHandle  | contentions=0           |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
    <owner is contended>                  | ref_count=1             |    if (waiters != 0 || ref_count != 0) {
 1> atomic inc contentions                +-------------------------+    }
    if (contentions <= 0 &&                           ||              1> prev = cmpxchg(-max_jint, &contentions, 0)
        owner == DEFLATER_MARKER) {                   \/              2> if (prev == 0 &&
 2>   restore obj header                  +-------------------------+        owner == DEFLATER_MARKER &&
      retry enter                         | owner=DEFLATER_MARKER   |        cmpxchg(-max_jint, &ref_count, 0) == 0) {
    }                                     | contentions=-max_jint   |      restore obj header
                                          | ref_count=1             |      finish the deflation
                                          +-------------------------+    } else {
                                                                           cmpxchg(NULL, &owner, DEFLATER_MARKER)
                                                                           atomic add max_jint to contentions
                                                                           bailout on deflation
                                                                         }

I have to look at this new theory with fresh eyes, but if it holds together, then T-enter's "contentions <= 0 && owner == DEFLATER_MARKER" check will need to be changed to "contentions <= 0 && owner == DEFLATER_MARKER && ref_count <= 0" as was done for save_om_ptr().

An Example of Object Header Interference

After T-deflate has won the race for deflating an ObjectMonitor it has to restore the header in the associated object. Of course another thread can be trying to do something to the object's header at the same time. Isn't asynchronous work exciting?!?!

ObjectMonitor::install_displaced_markword_in_object() is called from two places so we can have a race between a T-enter thread and a T-deflate thread:

Start of the Race

    T-enter                                      object           T-deflate
    -------------------------------------------  +-------------+  -------------------------------------------
    install_displaced_markword_in_object() {     | mark=om_ptr |  install_displaced_markword_in_object() {
      dmw = header()                             +-------------+    dmw = header()
      if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
          dmw->hash() == 0) {                                           dmw->hash() == 0) {
        create marked_dmw                                             create marked_dmw
        dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
      }                                                             }

T-deflate Wins First Race

    T-enter                                      object           T-deflate
    -------------------------------------------  +-------------+  -------------------------------------------
    install_displaced_markword_in_object() {     | mark=om_ptr |  install_displaced_markword_in_object() {
      dmw = header()                             +-------------+    dmw = header()
      if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
          dmw->hash() == 0) {                                           dmw->hash() == 0) {
        create marked_dmw                                             create marked_dmw
        dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
      }                                                             }
      // dmw == marked_dmw here                                     // dmw == original dmw here
      if (dmw->is_marked())                                         if (dmw->is_marked())
        unmark dmw                                                    unmark dmw
      obj = object()                                                obj = object()
      obj->cas_set_mark(dmw, this)                                  obj->cas_set_mark(dmw, this)

T-enter Wins First Race

    T-enter                                      object           T-deflate
    -------------------------------------------  +-------------+  -------------------------------------------
    install_displaced_markword_in_object() {     | mark=om_ptr |  install_displaced_markword_in_object() {
      dmw = header()                             +-------------+    dmw = header()
      if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
          dmw->hash() == 0) {                                           dmw->hash() == 0) {
        create marked_dmw                                             create marked_dmw
        dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
      }                                                             }
      // dmw == original dmw here                                   // dmw == marked_dmw here
      if (dmw->is_marked())                                         if (dmw->is_marked())
        unmark dmw                                                    unmark dmw
      obj = object()                                                obj = object()
      obj->cas_set_mark(dmw, this)                                  obj->cas_set_mark(dmw, this)

Either Wins the Second Race

    T-enter                                      object           T-deflate
    -------------------------------------------  +-------------+  -------------------------------------------
    install_displaced_markword_in_object() {     | mark=dmw    |  install_displaced_markword_in_object() {
      dmw = header()                             +-------------+    dmw = header()
      if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
          dmw->hash() == 0) {                                           dmw->hash() == 0) {
        create marked_dmw                                             create marked_dmw
        dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
      }                                                             }
      // dmw == ...                                                 // dmw == ...
      if (dmw->is_marked())                                         if (dmw->is_marked())
        unmark dmw                                                    unmark dmw
      obj = object()                                                obj = object()
      obj->cas_set_mark(dmw, this)                                  obj->cas_set_mark(dmw, this)

Please notice that install_displaced_markword_in_object() does not do any retries on any code path.
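
The retry-free shape of the restoration steps in the diagrams above can be sketched in standalone C++. Everything here is an assumption made for illustration: the toy tag-bit encoding (low two bits, hash above bit 8), the Word, ObjectSketch and MonitorSketch types, and the function name install_displaced_mark() are hypothetical, and std::atomic's compare_exchange_strong() stands in for cmpxchg() and cas_set_mark().

```cpp
#include <atomic>
#include <cstdint>

using Word = std::uintptr_t;

// Toy header encoding (assumption): the low two bits tag the word.
constexpr Word kTagMask  = 0x3;
constexpr Word kUnlocked = 0x1;     // neutral header value
constexpr Word kMonitor  = 0x2;     // object's mark refers to a monitor
constexpr Word kMarked   = 0x3;     // "header is being restored"
constexpr unsigned kHashShift = 8;  // hash occupies the bits above bit 8 here

inline bool is_marked(Word w)    { return (w & kTagMask) == kMarked; }
inline Word set_marked(Word w)   { return (w & ~kTagMask) | kMarked; }
inline Word set_unmarked(Word w) { return (w & ~kTagMask) | kUnlocked; }
inline Word hash_of(Word w)      { return w >> kHashShift; }

struct ObjectSketch {
  std::atomic<Word> mark{kUnlocked};
};

struct MonitorSketch {
  std::atomic<Word> header{kUnlocked};  // displaced mark word (dmw)
  ObjectSketch*     object = nullptr;
};

inline Word encode_monitor(const MonitorSketch* m) {
  return reinterpret_cast<Word>(m) | kMonitor;
}

// Each caller runs straight through once; there is no loop.
void install_displaced_mark(MonitorSketch& m) {
  Word dmw = m.header.load();
  if (!is_marked(dmw) && hash_of(dmw) == 0) {
    // First race: try to mark the monitor's header. The winner keeps the
    // original dmw; a loser's dmw is overwritten with the current header.
    m.header.compare_exchange_strong(dmw, set_marked(dmw));
  }
  if (is_marked(dmw)) {
    dmw = set_unmarked(dmw);  // losers clean up the winner's marked value
  }
  // Second race: only one caller swings the object's mark from the monitor
  // pointer back to the dmw; the other CAS fails harmlessly.
  Word expected = encode_monitor(&m);
  m.object->mark.compare_exchange_strong(expected, dmw);
}
```

Calling the function twice in a row models the loser arriving after the winner: the second call sees a marked header, unmarks its local copy, and its final compare-and-swap fails because the object's mark no longer refers to the monitor.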

Hashcodes and Object Header Interference

If we have a race between a T-deflate thread and a thread trying to get/set a hashcode (T-hash), then the race is between the ObjectMonitorHandle.save_om_ptr(obj, mark) call in T-hash and the deflation protocol in T-deflate.

Note: ref_count is left out of most of the previous sections for simplicity; only the "T-enter and T-deflate Both Lose" subsection mentions it.

Start of the Race

    T-hash                  ObjectMonitor              T-deflate
    ----------------------  +-----------------------+  ----------------------------------------
    save_om_ptr() {         | owner=NULL            |  deflate_monitor_using_JT() {
      :                     | contentions=0         | 1> cmpxchg(DEFLATER_MARKER, &owner, NULL)
   1> atomic inc ref_count  | ref_count=0           |
                            +-----------------------+

Racing Threads

    T-hash                  ObjectMonitor              T-deflate
    ----------------------  +-----------------------+  --------------------------------------------
    save_om_ptr() {         | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
      :                     | contentions=0         |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
   1> atomic inc ref_count  | ref_count=0           | 1> if (waiters != 0 || ref_count != 0) {
                            +-----------------------+    }
                                                         prev = cmpxchg(-max_jint, &contentions, 0)

T-deflate Wins

If T-deflate wins the race, then T-hash will have to retry at most once.

    T-hash                     ObjectMonitor              T-deflate
    -------------------------  +-----------------------+  -----------------------------------------------
    save_om_ptr() {            | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
   1> atomic inc ref_count     | contentions=-max_jint |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
      if (owner ==             | ref_count=0           |    if (waiters != 0 || ref_count != 0) {
          DEFLATER_MARKER &&   +-----------------------+    }
          contentions <= 0 &&             ||                prev = cmpxchg(-max_jint, &contentions, 0)
          ref_count <= 0) {               \/             1> if (prev == 0 &&
        restore obj header     +-----------------------+        owner == DEFLATER_MARKER &&
        atomic dec ref_count   | owner=DEFLATER_MARKER |        cmpxchg(-max_jint, &ref_count, 0) == 0) {
     2> return false to        | contentions=-max_jint |      restore obj header
          cause a retry        | ref_count=-max_jint   |   2> finish the deflation
      }                        +-----------------------+    }

T-hash Wins Scenario 1

If T-hash wins the race, then the ref_count will cause T-deflate to bail out on deflating the monitor.

Note: header is not mentioned in any of the previous sections for simplicity.

    T-hash                     ObjectMonitor              T-deflate
    -------------------------  +-----------------------+  --------------------------------------------
    save_om_ptr() {            | header=dmw_no_hash    |  deflate_monitor_using_JT() {
      atomic inc ref_count     | owner=DEFLATER_MARKER |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
   1> if (owner ==             | contentions=0         | 1> if (waiters != 0 || ref_count != 0) {
          DEFLATER_MARKER &&   | ref_count=1           |      cmpxchg(NULL, &owner, DEFLATER_MARKER)
          contentions <= 0 &&  +-----------------------+   2> bailout on deflation
          ref_count <= 0) {               ||                }
      }                                   \/                prev = cmpxchg(-max_jint, &contentions, 0)
      if (object no longer     +-----------------------+
          has a monitor or     | header=dmw_no_hash    |
          is a different       | owner=NULL            |
          monitor) {           | contentions=0         |
        atomic dec ref_count   | ref_count=1           |
        return false to        +-----------------------+
          cause a retry                   ||
      }                                   \/
   2> save om_ptr in the       +-----------------------+
        ObjectMonitorHandle    | header=dmw_hash       |
    }                          | owner=NULL            |
    if save_om_ptr() {         | contentions=0         |
      if no hash               | ref_count=1           |
        gen hash & merge       +-----------------------+
      hash = hash(header)
    }
 3> atomic dec ref_count
    return hash

T-hash Wins Scenario 2

In this T-hash wins scenario, the need for setting ref_count to a large negative value in the third part of the protocol is illustrated.

    T-hash                     ObjectMonitor              T-deflate
    -------------------------  +-----------------------+  -----------------------------------------------
    save_om_ptr() {            | header=dmw_no_hash    |  deflate_monitor_using_JT() {
      atomic inc ref_count     | owner=DEFLATER_MARKER |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
      if (owner ==             | contentions=0         |    if (waiters != 0 || ref_count != 0) {
          DEFLATER_MARKER &&   | ref_count=1           |    }
          contentions <= 0 &&  +-----------------------+ 1> prev = cmpxchg(-max_jint, &contentions, 0)
          ref_count <= 0) {               ||             2> if (prev == 0 &&
      }                                   \/                    owner == DEFLATER_MARKER &&
   1> if (object no longer     +-----------------------+        cmpxchg(-max_jint, &ref_count, 0) == 0) {
          has a monitor or     | header=dmw_no_hash    |    } else {
          is a different       | owner=DEFLATER_MARKER |      cmpxchg(NULL, &owner, DEFLATER_MARKER)
          monitor) {           | contentions=-max_jint |      atomic add max_jint to contentions
        atomic dec ref_count   | ref_count=1           |   3> bailout on deflation
        return false to        +-----------------------+    }
          cause a retry                   ||
      }                                   \/
   2> save om_ptr in the       +-----------------------+
        ObjectMonitorHandle    | header=dmw_hash       |
    }                          | owner=NULL            |
    if save_om_ptr() {         | contentions=0         |
      if no hash               | ref_count=1           |
        gen hash & merge       +-----------------------+
      hash = hash(header)
    }
 3> atomic dec ref_count
    return hash

T-deflate and T-hash Both Lose

This subsection title is NOT a typo. It was previously possible for both T-deflate and T-hash to lose the race. In the "CR0/v2.00/3-for-jdk13" version of the code, the double loss was not an issue. The addition of "restore obj header" to save_om_ptr() created a bug where the ObjectMonitor can be disconnected from the object without the ObjectMonitor being deflated. This led to a situation where the ObjectMonitor is still on the in-use list and thinks it is owned by the object, but the object does not agree. This rare bug existed in the "CR1/v2.01/4-for-jdk13" version of the code. save_om_ptr() and deflate_monitor_using_JT() have been changed to recognize a large negative ref_count value as a marker that async deflation has won the race. With that change in place, it is no longer possible for both T-deflate and T-hash to lose the same race.

    T-hash                     ObjectMonitor              T-deflate
    -------------------------  +-----------------------+  -----------------------------------------------
    save_om_ptr() {            | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
   1> atomic inc ref_count     | contentions=-max_jint |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
      if (owner ==             | ref_count=0           |    if (waiters != 0 || ref_count != 0) {
          DEFLATER_MARKER &&   +-----------------------+    }
          contentions <= 0 &&                               prev = cmpxchg(-max_jint, &contentions, 0)
          ref_count <= 0) {                                 if (prev == 0 &&
        restore obj header                                      owner == DEFLATER_MARKER &&
        atomic dec ref_count                                 1> cmpxchg(-max_jint, &ref_count, 0) == 0) {
        return false to                                     } else {
          cause a retry                                       cmpxchg(NULL, &owner, DEFLATER_MARKER)
      }                                                       atomic add max_jint to contentions
                                                              bailout on deflation
                                                            }
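
The ref_count handshake shown in the diagrams of this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the HotSpot code: MonitorSketch, try_async_deflate() and try_save_om_ptr() are hypothetical stand-ins, std::atomic replaces cmpxchg(), and header restoration plus the "object no longer has a monitor" check are left out.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the ObjectMonitor fields used in the diagrams.
struct MonitorSketch {
  std::atomic<void*>   owner{nullptr};
  std::atomic<int32_t> contentions{0};
  std::atomic<int32_t> ref_count{0};
  std::atomic<int32_t> waiters{0};
};

// Assumption: a unique address that can never be a real thread.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;
static const int32_t MAX_JINT = INT32_MAX;

// T-deflate side: true means the monitor may be deflated.
bool try_async_deflate(MonitorSketch& m) {
  void* expected_owner = nullptr;
  if (!m.owner.compare_exchange_strong(expected_owner, DEFLATER_MARKER)) {
    return false;
  }
  bool busy = (m.waiters.load() != 0 || m.ref_count.load() != 0);
  int32_t zero = 0;
  if (!busy && m.contentions.compare_exchange_strong(zero, -MAX_JINT)) {
    if (m.owner.load() == DEFLATER_MARKER) {
      // A large negative ref_count marks that async deflation won the race.
      int32_t no_refs = 0;
      if (m.ref_count.compare_exchange_strong(no_refs, -MAX_JINT)) {
        return true;
      }
    }
    m.contentions.fetch_add(MAX_JINT);  // undo the negative contentions value
  }
  // Bail out: give the owner field back if it still holds our marker.
  void* marker = DEFLATER_MARKER;
  m.owner.compare_exchange_strong(marker, nullptr);
  return false;
}

// T-hash side: false means the caller has to retry.
bool try_save_om_ptr(MonitorSketch& m) {
  m.ref_count.fetch_add(1);
  if (m.owner.load() == DEFLATER_MARKER &&
      m.contentions.load() <= 0 &&
      m.ref_count.load() <= 0) {
    // Deflation won; drop our reference and let the caller retry.
    m.ref_count.fetch_sub(1);
    return false;
  }
  return true;  // the positive ref_count now keeps the deflater away
}
```

With this shape exactly one side wins: either the increment makes the deflater's cmpxchg() on ref_count fail, or the large negative ref_count tells save_om_ptr() to retry.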

Please note that in Carsten's original prototype, there was another race in ObjectSynchronizer::FastHashCode() when the object's monitor had to be inflated. The setting of the hashcode in the ObjectMonitor's header/dmw could race with T-deflate. That race is resolved in this version by the use of an ObjectMonitorHandle in the call to ObjectSynchronizer::inflate(). The ObjectMonitor* returned by ObjectMonitorHandle.om_ptr() has a non-zero ref_count so no additional races with T-deflate are possible.

Housekeeping Parts of the Algorithm

The devil is in the details! Housekeeping or administrative stuff is usually detailed, but necessary.

Gory Details