This page describes adding support for Async Monitor Deflation to OpenJDK. The primary goal of this project is to reduce the time spent in safepoint cleanup operations.
RFE: 8153224 Monitor deflation prolong safepoints
         https://bugs.openjdk.java.net/browse/JDK-8153224
Full Webrev: http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.full/
Inc Webrev: http://cr.openjdk.java.net/~dcubed/8153224-webrev/5-for-jdk13.inc/
This patch for Async Monitor Deflation is based on Carsten Varming's
http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
which has been ported to work with monitor lists. Monitor lists were optional via the '-XX:+MonitorInUseLists' option in JDK8, the option became default 'true' in JDK9, the option became deprecated in JDK10 via JDK-8180768, and the option became obsolete in JDK12 via JDK-8211384. Carsten's webrev is based on JDK10 so there was a bit of porting work needed to merge his code and/or algorithms with jdk/jdk.
Carsten also submitted a JEP back in the JDK10 time frame:
JDK-8183909 Concurrent Monitor Deflation
https://bugs.openjdk.java.net/browse/JDK-8183909
The OpenJDK JEP process has evolved a bit since JDK10 and a JEP is no longer required for a project that is well defined to be within one area of responsibility. Async Monitor Deflation is clearly defined to be in the JVM Runtime team's area of responsibility so it is likely that the JEP (JDK-8183909) will be withdrawn and the work will proceed via the RFE (JDK-8153224).
The current idle monitor deflation mechanism executes at a safepoint during cleanup operations. Due to this execution environment, the current mechanism does not have to worry about interference from concurrently executing JavaThreads. Async Monitor Deflation uses JavaThreads and the ServiceThread to deflate idle monitors so the new mechanism has to detect interference and adapt as appropriate. In other words, data races are natural part of Async Monitor Deflation and the algorithms have to detect the races and react without data loss or corruption.
ObjectSynchronizer::deflate_monitor_using_JT() is the new counterpart to ObjectSynchronizer::deflate_monitor() and does the heavy lifting of asynchronously deflating a monitor using a three part prototcol:
If we lose any of the races, the monitor cannot be deflated at this time.
Once we know it is safe to deflate the monitor (which is mostly field resetting and monitor list management), we have to restore the object's header. That's another racy operation that is described below in "Restoring the Header With Interference Detection".
ObjectMonitor::install_displaced_markword_in_object() is the new piece of code that handles all the racy situations with restoring an object's header asynchronously. The function is called from a few of places (deflation, object monitor entry, and saving an ObjectMonitor* in an ObjectMonitorHandle). The restoration protocol for the object's header uses the mark bit along with the hash() value staying at zero to indicate that the object's header is being restored. Only one of the possible racing scenarios can win and the losing scenarios all adapt to the winning scenario's object header value.
Various code paths have been updated to recognize an owner field equal to DEFLATER_MARKER or a negative contentions field and those code paths will retry their operation. This is the shortest "Key Part" description, but don't be fooled. See "Gory Details" below.
For example, when ObjectMonitor::enter() detects genuine contention via the owner field, it atomically increments the contentions field to indicate that the ObjectMonitor is busy. The thread calling enter() (T-enter) is potentially racing with an Async Monitor Deflation by another JavaThread (T-deflate) so both threads have to check the result of the race.
ObjectMonitor T-deflate
T-enter +-----------------------+ ----------------------------------------
---------------- | owner=NULL | deflate_monitor_using_JT() {
| contentions=0 | cmpxchg(DEFLATER_MARKER, &owner, NULL)
+-----------------------+
ObjectMonitor T-deflate
T-enter +-----------------------+ --------------------------------------------
---------------------- | owner=DEFLATER_MARKER | deflate_monitor_using_JT() {
owner contended | contentions=0 | cmpxchg(DEFLATER_MARKER, &owner, NULL)
atomic inc contentions +-----------------------+ :
prev = cmpxchg(-max_jint, &contentions, 0)
ObjectMonitor T-deflate
T-enter +-----------------------+ --------------------------------------------
------------------------------- | owner=DEFLATER_MARKER | deflate_monitor_using_JT() {
owner contended | contentions=-max_jint | cmpxchg(DEFLATER_MARKER, &owner, NULL)
atomic inc contentions +-----------------------+ :
if (contentions <= 0 && prev = cmpxchg(-max_jint, &contentions, 0)
owner == DEFLATER_MARKER) { if (prev == 0 &&
restore obj header owner == DEFLATER_MARKER) {
retry enter restore obj header
} finish the deflation
}
ObjectMonitor T-deflate
T-enter +-----------------------+ --------------------------------------------
------------------------------- | owner=DEFLATER_MARKER | deflate_monitor_using_JT() {
owner contended | contentions=1 | cmpxchg(DEFLATER_MARKER, &owner, NULL)
atomic inc contentions +-----------------------+ :
if (contentions <= 0 && prev = cmpxchg(-max_jint, &contentions, 0)
owner == DEFLATER_MARKER) { if (prev == 0 &&
} else { owner == DEFLATER_MARKER) {
do contended } else {
enter work cmpxchg(NULL, &owner, DEFLATER_MARKER)
ObjectMonitor T-deflate
T-enter +-------------------------+ ------------------------------------------
------------------------------------------ | owner=DEFLATER_MARKER | deflate_monitor_using_JT() {
owner contended | contentions=1 | cmpxchg(DEFLATER_MARKER, &owner, NULL)
atomic inc contentions +-------------------------+ 1> :
1> if (contentions <= 0 && || 2> : <thread_stalls>
owner == DEFLATER_MARKER) { \/ :
} else { +-------------------------+ :
EnterI() | owner=Self/T-enter | :
cmpxchg(Self, &owner, DEFLATER_MARKER) | contentions=0 | : <thread_resumes>
atomic dec contentions +-------------------------+ prev = cmpxchg(-max_jint, &contentions, 0)
2> } || if (prev == 0 &&
// finished with enter \/ owner == DEFLATER_MARKER) {
3> : <does app work> +-------------------------+ } else {
exit() monitor | owner=Self/T-enter|NULL | cmpxchg(NULL, &owner, DEFLATER_MARKER)
owner = NULL | contentions=0 | atomic add max_jint to contentions
+-------------------------+ 3> bailout on deflation
}
If the T-enter thread has managed to enter but not exit the monitor during the T-deflate stall, then our owner field A-B-A transition is:
    NULL → DEFLATE_MARKER → Self/T-enter
so we really have A1-B-A2, but the A-B-A principal still holds.
After T-deflate has won the race for deflating an ObjectMonitor it has to restore the header in the associated object. Of course another thread can be trying to do something to the object's header at the same time. Isn't asynchronous work exciting?!?!
ObjectMonitor::install_displaced_markword_in_object() is called from two places so we can have a race between a T-enter thread and a T-deflate thread:
T-enter object T-deflate
------------------------------------------- +-------------+ --------------------------------------------
install_displaced_markword_in_object() { | mark=om_ptr | install_displaced_markword_in_object() {
dmw = header() +-------------+ dmw = header()
if (!dmw->is_marked() && if (!dmw->is_marked() &&
dmw->hash() == 0) { dmw->hash() == 0) {
create marked_dmw create marked_dmw
dmw = cmpxchg(marked_dmw, &header, dmw) dmw = cmpxchg(marked_dmw, &header, dmw)
} }
T-enter object T-deflate
------------------------------------------- +-------------+ -------------------------------------------
install_displaced_markword_in_object() { | mark=om_ptr | install_displaced_markword_in_object() {
dmw = header() +-------------+ dmw = header()
if (!dmw->is_marked() && if (!dmw->is_marked() &&
dmw->hash() == 0) { dmw->hash() == 0) {
create marked_dmw create marked_dmw
dmw = cmpxchg(marked_dmw, &header, dmw) dmw = cmpxchg(marked_dmw, &header, dmw)
} }
// dmw == marked_dmw here // dmw == original dmw here
if (dmw->is_marked()) if (dmw->is_marked())
unmark dmw unmark dmw
obj = object() obj = object()
obj->cas_set_mark(dmw, this) obj->cas_set_mark(dmw, this)
T-enter object T-deflate
------------------------------------------- +-------------+ -------------------------------------------
install_displaced_markword_in_object() { | mark=om_ptr | install_displaced_markword_in_object() {
dmw = header() +-------------+ dmw = header()
if (!dmw->is_marked() && if (!dmw->is_marked() &&
dmw->hash() == 0) { dmw->hash() == 0) {
create marked_dmw create marked_dmw
dmw = cmpxchg(marked_dmw, &header, dmw) dmw = cmpxchg(marked_dmw, &header, dmw)
} }
// dmw == original dmw here // dmw == marked_dmw here
if (dmw->is_marked()) if (dmw->is_marked())
unmark dmw unmark dmw
obj = object() obj = object()
obj->cas_set_mark(dmw, this) obj->cas_set_mark(dmw, this)
T-enter object T-deflate
------------------------------------------- +-------------+ -------------------------------------------
install_displaced_markword_in_object() { | mark=dmw | install_displaced_markword_in_object() {
dmw = header() +-------------+ dmw = header()
if (!dmw->is_marked() && if (!dmw->is_marked() &&
dmw->hash() == 0) { dmw->hash() == 0) {
create marked_dmw create marked_dmw
dmw = cmpxchg(marked_dmw, &header, dmw) dmw = cmpxchg(marked_dmw, &header, dmw)
} }
// dmw == ... // dmw == ...
if (dmw->is_marked()) if (dmw->is_marked())
unmark dmw unmark dmw
obj = object() obj = object()
obj->cas_set_mark(dmw, this) obj->cas_set_mark(dmw, this)
Please notice that install_displaced_markword_in_object() does not do any retries on any code path:
If we have a race between a T-deflate thread and a thread trying to get/set a hashcode (T-hash), then the race is between the ObjectMonitorHandle.save_om_ptr(obj, mark) call in T-hash and deflation protocol in T-deflate.
Note: ref_count is not mentioned in any of the previous sections for simplicity.
T-hash ObjectMonitor T-deflate
---------------------- +-----------------------+ ----------------------------------------
save_om_ptr() { | owner=NULL | deflate_monitor_using_JT() {
: | contentions=0 | 1> cmpxchg(DEFLATER_MARKER, &owner, NULL)
1> atomic inc ref_count | ref_count=0 |
+-----------------------+
T-hash ObjectMonitor T-deflate
---------------------- +-----------------------+ --------------------------------------------
save_om_ptr() { | owner=DEFLATER_MARKER | deflate_monitor_using_JT() {
: | contentions=0 | cmpxchg(DEFLATER_MARKER, &owner, NULL)
1> atomic inc ref_count | ref_count=0 | 1> if (waiters != 0 || ref_count != 0) {
+-----------------------+ }
prev = cmpxchg(-max_jint, &contentions, 0)
If T-deflate wins the race, then T-hash will have to retry at most once.
T-hash ObjectMonitor T-deflate
------------------------- +-----------------------+ -----------------------------------------------
save_om_ptr() { | owner=DEFLATER_MARKER | deflate_monitor_using_JT() {
1> atomic inc ref_count | contentions=-max_jint | cmpxchg(DEFLATER_MARKER, &owner, NULL)
if (owner == | ref_count=0 | if (waiters != 0 || ref_count != 0) {
DEFLATER_MARKER && +-----------------------+ }
contentions <= 0 && || prev = cmpxchg(-max_jint, &contentions, 0)
ref_count <= 0) { \/ 1> if (prev == 0 &&
restore obj header +-----------------------+ owner == DEFLATER_MARKER &&
atomic dec ref_count | owner=DEFLATER_MARKER | cmpxchg(-max_jint, &ref_count, 0) == 0) {
2> return false to | contentions=-max_jint | restore obj header
cause a retry | ref_count=-max_jint | 2> finish the deflation
} +-----------------------+ }
If T-hash wins the race, then the ref_count will cause T-deflate to bail out on deflating the monitor.
Note: header is not mentioned in any of the previous sections for simplicity.
T-hash ObjectMonitor T-deflate
------------------------- +-----------------------+ --------------------------------------------
save_om_ptr() { | header=dmw_no_hash | deflate_monitor_using_JT() {
atomic inc ref_count | owner=DEFLATER_MARKER | cmpxchg(DEFLATER_MARKER, &owner, NULL)
1> if (owner == | contentions=0 | 1> if (waiters != 0 || ref_count != 0) {
DEFLATER_MARKER && | ref_count=1 | cmpxchg(NULL, &owner, DEFLATER_MARKER)
contentions <= 0 && +-----------------------+ 2> bailout on deflation
ref_count <= 0) { || }
} \/ prev = cmpxchg(-max_jint, &contentions, 0)
if (object no longer +-----------------------+
has a monitor or | header=dmw_no_hash |
is a different | owner=NULL |
monitor) { | contentions=0 |
atomic dec ref_count | ref_count=1 |
return false to +-----------------------+
cause a retry ||
} \/
2> save om_ptr in the +-----------------------+
ObjectMonitorHandle | header=dmw_hash |
} | owner=NULL |
if save_om_ptr() { | contentions=0 |
if no hash | ref_count=1 |
gen hash & merge +-----------------------+
hash = hash(header)
}
3> atomic dec ref_count
return hash
In this T-hash wins scenario, the need for setting ref_count to a large negative value in the third part of the protocol is illustrated.
T-hash ObjectMonitor T-deflate
------------------------- +-----------------------+ -----------------------------------------------
save_om_ptr() { | header=dmw_no_hash | deflate_monitor_using_JT() {
atomic inc ref_count | owner=DEFLATER_MARKER | cmpxchg(DEFLATER_MARKER, &owner, NULL)
if (owner == | contentions=0 | if (waiters != 0 || ref_count != 0) {
DEFLATER_MARKER && | ref_count=1 | }
contentions <= 0 && +-----------------------+ 1> prev = cmpxchg(-max_jint, &contentions, 0)
ref_count <= 0) { || 2> if (prev == 0 &&
} \/ owner == DEFLATER_MARKER &&
1> if (object no longer +-----------------------+ cmpxchg(-max_jint, &ref_count, 0) == 0) {
has a monitor or | header=dmw_no_hash | } else {
is a different | owner=DEFLATER_MARKER | cmpxchg(NULL, &owner, DEFLATER_MARKER)
monitor) { | contentions=-max_jint | atomic add max_jint to contentions
atomic dec ref_count | ref_count=1 | 3> bailout on deflation
return false to +-----------------------+ }
cause a retry ||
} \/
2> save om_ptr in the +-----------------------+
ObjectMonitorHandle | header=dmw_hash |
} | owner=NULL |
if save_om_ptr() { | contentions=0 |
if no hash | ref_count=1 |
gen hash & merge +-----------------------+
hash = hash(header)
}
3> atomic dec ref_count
return hash
This subsection title is NOT a typo. It was previously possible for both T-deflate and T-hash to lose the race. In the "CR0/v2.00/3-for-jdk13" version of the code, the double loss was not an issue. The addition of "restore obj header" to save_om_ptr() created a bug where the ObjectMonitor can be disconnected from the object without the ObjectMonitor being deflated. This led to a situation where the ObjectMonitor is still on the in-use list and thinks it is owned by the object, but the object does not agree. This rare bug existed in the "CR1/v2.01/4-for-jdk13" version of the code. save_om_ptr() and deflate_monitor_using_JT() have been changed to recognize a large negative ref_count value as a marker that async deflation has won the race. With that change in place, it is no longer possible for both T-deflate and T-hash to lose the same race.
T-hash ObjectMonitor T-deflate
------------------------- +-----------------------+ -----------------------------------------------
save_om_ptr() { | owner=DEFLATER_MARKER | deflate_monitor_using_JT() {
1> atomic inc ref_count | contentions=-max_jint | cmpxchg(DEFLATER_MARKER, &owner, NULL)
if (owner == | ref_count=0 | if (waiters != 0 || ref_count != 0) {
DEFLATER_MARKER && +-----------------------+ }
contentions <= 0 && prev = cmpxchg(-max_jint, &contentions, 0)
ref_count <= 0) { if (prev == 0 &&
restore obj header owner == DEFLATER_MARKER &&
atomic dec ref_count 1> cmpxchg(-max_jint, &ref_count, 0) == 0) {
return false to } else {
cause a retry cmpxchg(NULL, &owner, DEFLATER_MARKER)
} atomic add max_jint to contentions
bailout on deflation
}
Please note that in Carsten's original prototype, there was another race in ObjectSynchronizer::FastHashCode() when the object's monitor had to be inflated. The setting of the hashcode in the ObjectMonitor's header/dmw could race with T-deflate. That race is resolved in this version by the use of an ObjectMonitorHandle in the call to ObjectSynchronizer::inflate(). The ObjectMonitor* returned by ObjectMonitorHandle.om_ptr() has a non-zero ref_count so no additional races with T-deflate are possible.
The devil is in the details! Housekeeping or administrative stuff are usually detailed, but necessary.