Summary
This page describes adding support for Async Monitor Deflation to OpenJDK. The primary goal of this project is to reduce the time spent in safepoint cleanup operations.
RFE: 8153224 Monitor deflation prolong safepoints
https://bugs.openjdk.java.net/browse/JDK-8153224
Webrev: http://cr.openjdk.java.net/~dcubed/8153224-webrev/3-for-jdk13
Background
This patch for Async Monitor Deflation is based on Carsten Varming's
http://cr.openjdk.java.net/~cvarming/monitor_deflate_conc/0/
which has been ported to work with monitor lists. Monitor lists were optional via the '-XX:+MonitorInUseLists' option in JDK8; the option became default 'true' in JDK9, was deprecated in JDK10 via JDK-8180768, and became obsolete in JDK12 via JDK-8211384. Carsten's webrev is based on JDK10, so some porting work was needed to merge his code and/or algorithms with jdk/jdk.
Carsten also submitted a JEP back in the JDK10 time frame:
JDK-8183909 Concurrent Monitor Deflation
https://bugs.openjdk.java.net/browse/JDK-8183909
The OpenJDK JEP process has evolved a bit since JDK10 and a JEP is no longer required for a project that is well defined to be within one area of responsibility. Async Monitor Deflation is clearly defined to be in the JVM Runtime team's area of responsibility so it is likely that the JEP (JDK-8183909) will be withdrawn and the work will proceed via the RFE (JDK-8153224).
Introduction
The current idle monitor deflation mechanism executes at a safepoint during cleanup operations. Due to this execution environment, the current mechanism does not have to worry about interference from concurrently executing JavaThreads. Async Monitor Deflation uses JavaThreads and the ServiceThread to deflate idle monitors, so the new mechanism has to detect interference and adapt as appropriate. In other words, data races are a natural part of Async Monitor Deflation and the algorithms have to detect the races and react without data loss or corruption.
Key Parts of the Algorithm
1) Deflation With Interference Detection
ObjectSynchronizer::deflate_monitor_using_JT() is the new counterpart to ObjectSynchronizer::deflate_monitor() and does the heavy lifting of asynchronously deflating a monitor using a three-part protocol:
- Setting a NULL owner field to DEFLATER_MARKER with cmpxchg() forces any contending thread through the slow path. A racing thread would be trying to set the owner field.
- Making a zero contentions field a large negative value with cmpxchg() forces racing threads to retry. A racing thread would have set the owner field (after we stored DEFLATER_MARKER) and would be trying to increment the contentions field.
- If the owner field is still equal to DEFLATER_MARKER, then we have won all the races and can deflate the monitor.
If we lose any of the races, the monitor cannot be deflated at this time.
Once we know it is safe to deflate the monitor (which is mostly field resetting and monitor list management), we have to restore the object's header. That's another racy operation that is described below in "Restoring the Header With Interference Detection".
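The three-part protocol above can be sketched with std::atomic. This is a minimal illustration, not the HotSpot code: SketchMonitor, try_deflate() and the marker storage are hypothetical stand-ins, and restoring the object's header and the list management are left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor; only the two fields that the
// protocol touches are modeled.
struct SketchMonitor {
  std::atomic<void*> owner{nullptr};
  std::atomic<std::int32_t> contentions{0};
};

// Hypothetical sentinel; the real DEFLATER_MARKER is defined by HotSpot.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;

// Returns true if the deflater won all the races and may deflate.
inline bool try_deflate(SketchMonitor& m) {
  // Part 1: NULL owner -> DEFLATER_MARKER.
  void* expected_owner = nullptr;
  if (!m.owner.compare_exchange_strong(expected_owner, DEFLATER_MARKER)) {
    return false;  // The monitor is owned, so it is not idle.
  }
  // Part 2: zero contentions -> large negative value.
  std::int32_t expected = 0;
  if (m.contentions.compare_exchange_strong(expected, -INT32_MAX)) {
    // Part 3: the owner field must still be DEFLATER_MARKER.
    if (m.owner.load() == DEFLATER_MARKER) {
      return true;
    }
    // Lost part 3: undo the negative marking of contentions.
    m.contentions.fetch_add(INT32_MAX);
  }
  // Lost a race: put NULL back only if the marker is still in place.
  void* marker = DEFLATER_MARKER;
  m.owner.compare_exchange_strong(marker, nullptr);
  return false;
}
```

A caller would deflate only when try_deflate() returns true; on false the monitor is left exactly as the racing threads set it.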
2) Restoring the Header With Interference Detection
ObjectMonitor::install_displaced_markword_in_object() is the new piece of code that handles all the racy situations with restoring an object's header asynchronously. The function is called from a couple of places (deflation and object monitor entry). The restoration protocol for the object's header uses the mark bit along with the hash() value staying at zero to indicate that the object's header is being restored. Only one of the possible racing scenarios can win and the losing scenarios all adapt to the winning scenario's object header value.
3) Using "owner" or "contentions" With Interference Detection
Various code paths have been updated to recognize an owner field equal to DEFLATER_MARKER or a negative contentions field and those code paths will retry their operation. This is the shortest "Key Part" description, but don't be fooled. See "Gory Details" below.
An Example of ObjectMonitor Interference
For example, when ObjectMonitor::enter() detects genuine contention via the owner field, it atomically increments the contentions field to indicate that the ObjectMonitor is busy. The thread calling enter() (T-enter) is potentially racing with an Async Monitor Deflation by another JavaThread (T-deflate) so both threads have to check the result of the race.
Start of the Race
                    ObjectMonitor              T-deflate
T-enter             +-----------------------+  ----------------------------------------
----------------    | owner=NULL            |  deflate_monitor_using_JT() {
                    | contentions=0         |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
                    +-----------------------+
- The data fields are at their starting values.
- T-deflate is about to execute cmpxchg().
- T-enter hasn't done anything yet.
Racing Threads
                          ObjectMonitor              T-deflate
T-enter                   +-----------------------+  --------------------------------------------
----------------------    | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
owner contended           | contentions=0         |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
atomic inc contentions    +-----------------------+    :
                                                       prev = cmpxchg(-max_jint, &contentions, 0)
- T-deflate has executed cmpxchg() and set owner to DEFLATER_MARKER.
- T-enter has observed the contended owner field.
- T-enter and T-deflate are racing to update the contentions field.
T-deflate Wins
                                 ObjectMonitor              T-deflate
T-enter                          +-----------------------+  --------------------------------------------
-----------------------------    | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
owner contended                  | contentions=-max_jint |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
atomic inc contentions           +-----------------------+    :
if (contentions <= 0 && owner                                 prev = cmpxchg(-max_jint, &contentions, 0)
    == DEFLATER_MARKER) {                                     if (prev == 0 &&
  restore obj header                                              owner == DEFLATER_MARKER) {
  retry enter                                                   restore obj header
}                                                               finish the deflation
                                                              }
- This diagram starts after "Racing Threads".
- T-enter and T-deflate both observe owner == DEFLATER_MARKER and a negative contentions field.
- T-enter has lost the race, so it restores the obj header and retries.
- T-deflate restores the obj header and it finishes deflation of the ObjectMonitor.
T-enter Wins
                          ObjectMonitor              T-deflate
T-enter                   +-----------------------+  --------------------------------------------
----------------------    | owner=DEFLATER_MARKER |  deflate_monitor_using_JT() {
owner contended           | contentions=1         |    cmpxchg(DEFLATER_MARKER, &owner, NULL)
atomic inc contentions    +-----------------------+    :
if (contentions > 0)                                   prev = cmpxchg(-max_jint, &contentions, 0)
  do contended                                         if (prev == 0 &&
  enter work                                               owner == DEFLATER_MARKER) {
                                                       } else {
                                                         cmpxchg(NULL, &owner, DEFLATER_MARKER)
- This diagram starts after "Racing Threads".
- T-enter and T-deflate both observe a contentions field > 0.
- T-enter has won the race and it proceeds with the normal contended enter work.
- T-deflate detects that it has lost the race (prev != 0) and bails out on deflating the ObjectMonitor:
- Before bailing out T-deflate tries to restore the owner field to NULL if it is still DEFLATER_MARKER.
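The T-enter side of the two outcomes above ("T-deflate Wins" and "T-enter Wins") can be sketched as a single check made right after the atomic increment. This is an illustrative sketch only: SketchMonitor, EnterCheck and after_owner_contended() are hypothetical names, and restoring the obj header before the retry is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor.
struct SketchMonitor {
  std::atomic<void*> owner{nullptr};
  std::atomic<std::int32_t> contentions{0};
};

// Hypothetical sentinel; the real DEFLATER_MARKER is defined by HotSpot.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;

enum class EnterCheck { RetryEnter, DoContendedEnter };

// Called once T-enter has seen a contended owner field.
inline EnterCheck after_owner_contended(SketchMonitor& m) {
  // fetch_add() returns the previous value, so add 1 to get the new one.
  const std::int32_t contentions = m.contentions.fetch_add(1) + 1;
  if (contentions <= 0 && m.owner.load() == DEFLATER_MARKER) {
    // T-deflate won: the monitor is being deflated, so the caller
    // has to retry the enter from the top.
    return EnterCheck::RetryEnter;
  }
  // T-enter won: contentions > 0 makes T-deflate bail out.
  return EnterCheck::DoContendedEnter;
}
```

Because T-deflate's cmpxchg() expects contentions to be exactly zero, the increment alone is enough to make the deflater lose when T-enter gets there first.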
T-enter Wins By A-B-A
                                                ObjectMonitor                T-deflate
T-enter                                         +-------------------------+  ------------------------------------------
------------------------------------------      | owner=DEFLATER_MARKER   |     deflate_monitor_using_JT() {
   owner contended                              | contentions=1           |       cmpxchg(DEFLATER_MARKER, &owner, NULL)
   atomic inc contentions                       +-------------------------+  1>   :
1> if (contentions > 0)                                     ||               2>   : <thread_stalls>
     EnterI()                                               \/                    :
       cmpxchg(Self, &owner, DEFLATER_MARKER)   +-------------------------+       :
   atomic dec contentions                       | owner=Self/T-enter      |       :
2> }                                            | contentions=0           |       : <thread_resumes>
   // finished with enter                       +-------------------------+       prev = cmpxchg(-max_jint, &contentions, 0)
3> : <does app work>                                        ||                    if (prev == 0 &&
   exit() monitor                                           \/                        owner == DEFLATER_MARKER) {
     owner = NULL                               +-------------------------+       } else {
                                                | owner=Self/T-enter|NULL |         cmpxchg(NULL, &owner, DEFLATER_MARKER)
                                                | contentions=0           |         atomic add max_jint to contentions
                                                +-------------------------+  3>     bailout on deflation
                                                                                  }
- This diagram starts after "Racing Threads".
- T-enter incremented contentions to 1.
- The first ObjectMonitor box is showing the fields at this point and the "1>" markers are showing where each thread is at for that ObjectMonitor box.
- T-deflate stalls after setting the owner field to DEFLATER_MARKER.
- T-enter has won the race and calls EnterI() to do the contended enter work.
- EnterI() observes owner == DEFLATER_MARKER and uses cmpxchg() to set the owner field to Self/T-enter.
- T-enter decrements the contentions field because it is no longer contending for the monitor; it owns the monitor.
- The second ObjectMonitor box is showing the fields at this point and the "2>" markers are showing where each thread is at for that ObjectMonitor box.
- T-deflate resumes, sets the contentions field to -max_jint (not shown), and passes the first part of the bailout expression because "prev == 0".
- T-deflate observes that "owner != DEFLATER_MARKER" and bails out on deflation:
- Depending on when T-deflate resumes after the stall, it will see "owner == T-enter" or "owner == NULL".
- Both of those values will cause deflation to bail out so we have to conditionally undo work:
- restore the owner field to NULL if it is still DEFLATER_MARKER (it's not DEFLATER_MARKER)
- undo setting contentions to -max_jint by atomically adding max_jint to contentions which will restore contentions to its proper value.
- The third ObjectMonitor box is showing the fields at this point and the "3>" markers are showing where each thread is at for that ObjectMonitor box.
- If the T-enter thread has managed to enter but not exit the monitor during the T-deflate stall, then our owner field A-B-A transition is:
NULL → DEFLATER_MARKER → Self/T-enter
so we really have A1-B-A2, but the A-B-A principle still holds.
- If the T-enter thread has managed to enter and exit the monitor during the T-deflate stall, then our owner field A-B-A transition is:
NULL → DEFLATER_MARKER → Self/T-enter → NULL
so we really have A-B1-B2-A, but the A-B-A principle still holds.
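The enter-but-not-exit variant of the A-B-A scenario can be replayed on a single thread to check that the conditional undo leaves both fields intact. This is a hedged sketch: SketchMonitor and replay_aba_scenario() are hypothetical, and the interleaving is hard-coded rather than produced by real threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor.
struct SketchMonitor {
  std::atomic<void*> owner{nullptr};
  std::atomic<std::int32_t> contentions{0};
};

// Hypothetical sentinel; the real DEFLATER_MARKER is defined by HotSpot.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;

// Replays the stall in program order. Returns true if T-deflate bailed out.
inline bool replay_aba_scenario(SketchMonitor& m, void* t_enter) {
  // T-deflate: NULL -> DEFLATER_MARKER, then it stalls.
  void* expected_owner = nullptr;
  m.owner.compare_exchange_strong(expected_owner, DEFLATER_MARKER);

  // T-enter: increment contentions, take ownership away from the
  // marker, then decrement contentions because it now owns the monitor.
  m.contentions.fetch_add(1);
  void* marker = DEFLATER_MARKER;
  m.owner.compare_exchange_strong(marker, t_enter);
  m.contentions.fetch_sub(1);

  // T-deflate resumes: contentions is zero again, so this cmpxchg wins.
  std::int32_t zero = 0;
  const bool marked = m.contentions.compare_exchange_strong(zero, -INT32_MAX);
  if (marked && m.owner.load() == DEFLATER_MARKER) {
    return false;  // Would deflate; not reached in this replay.
  }
  // Bail out: restore owner only if it is still the marker (it is not),
  // and undo the negative marking of contentions.
  marker = DEFLATER_MARKER;
  m.owner.compare_exchange_strong(marker, nullptr);
  if (marked) {
    m.contentions.fetch_add(INT32_MAX);
  }
  return true;
}
```

After the replay the owner field still names T-enter and contentions is back to zero, which is the state the third ObjectMonitor box shows.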
An Example of Object Header Interference
After T-deflate has won the race for deflating an ObjectMonitor it has to restore the header in the associated object. Of course another thread can be trying to do something to the object's header at the same time. Isn't asynchronous work exciting?!?!
ObjectMonitor::install_displaced_markword_in_object() is called from two places so we can have a race between a T-enter thread and a T-deflate thread:
Start of the Race
T-enter                                      object           T-deflate
-------------------------------------------  +-------------+  --------------------------------------------
install_displaced_markword_in_object() {     | mark=om_ptr |  install_displaced_markword_in_object() {
  dmw = header()                             +-------------+    dmw = header()
  if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
      dmw->hash() == 0) {                                           dmw->hash() == 0) {
    create marked_dmw                                             create marked_dmw
    dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
  }                                                             }
- The data field (mark) is at its starting value.
- 'dmw' and 'marked_dmw' are local copies in each thread.
- T-enter and T-deflate are both calling install_displaced_markword_in_object() at the same time.
- Both threads are poised to call cmpxchg() at the same time.
T-deflate Wins First Race
T-enter                                      object           T-deflate
-------------------------------------------  +-------------+  -------------------------------------------
install_displaced_markword_in_object() {     | mark=om_ptr |  install_displaced_markword_in_object() {
  dmw = header()                             +-------------+    dmw = header()
  if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
      dmw->hash() == 0) {                                           dmw->hash() == 0) {
    create marked_dmw                                             create marked_dmw
    dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
  }                                                             }
  // dmw == marked_dmw here                                     // dmw == original dmw here
  if (dmw->is_marked())                                         if (dmw->is_marked())
    unmark dmw                                                    unmark dmw
  obj = object()                                                obj = object()
  obj->cas_set_mark(dmw, this)                                  obj->cas_set_mark(dmw, this)
- The return value from cmpxchg() in each thread will be different.
- Since T-deflate won the race, its 'dmw' variable contains the header/dmw from the ObjectMonitor.
- Since T-enter lost the race, its 'dmw' variable contains the 'marked_dmw' set by T-deflate.
- T-enter will unmark its 'dmw' variable.
- Both threads are poised to call cas_set_mark() at the same time.
T-enter Wins First Race
T-enter                                      object           T-deflate
-------------------------------------------  +-------------+  -------------------------------------------
install_displaced_markword_in_object() {     | mark=om_ptr |  install_displaced_markword_in_object() {
  dmw = header()                             +-------------+    dmw = header()
  if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
      dmw->hash() == 0) {                                           dmw->hash() == 0) {
    create marked_dmw                                             create marked_dmw
    dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
  }                                                             }
  // dmw == original dmw here                                   // dmw == marked_dmw here
  if (dmw->is_marked())                                         if (dmw->is_marked())
    unmark dmw                                                    unmark dmw
  obj = object()                                                obj = object()
  obj->cas_set_mark(dmw, this)                                  obj->cas_set_mark(dmw, this)
- This diagram is the same as "T-deflate Wins First Race" except we've swapped the post cmpxchg() comments.
- Since T-enter won the race, its 'dmw' variable contains the header/dmw from the ObjectMonitor.
- Since T-deflate lost the race, its 'dmw' variable contains the 'marked_dmw' set by T-enter.
- T-deflate will unmark its 'dmw' variable.
- Both threads are poised to call cas_set_mark() at the same time.
Either Wins the Second Race
T-enter                                      object           T-deflate
-------------------------------------------  +-------------+  -------------------------------------------
install_displaced_markword_in_object() {     | mark=dmw    |  install_displaced_markword_in_object() {
  dmw = header()                             +-------------+    dmw = header()
  if (!dmw->is_marked() &&                                      if (!dmw->is_marked() &&
      dmw->hash() == 0) {                                           dmw->hash() == 0) {
    create marked_dmw                                             create marked_dmw
    dmw = cmpxchg(marked_dmw, &header, dmw)                       dmw = cmpxchg(marked_dmw, &header, dmw)
  }                                                             }
  // dmw == ...                                                 // dmw == ...
  if (dmw->is_marked())                                         if (dmw->is_marked())
    unmark dmw                                                    unmark dmw
  obj = object()                                                obj = object()
  obj->cas_set_mark(dmw, this)                                  obj->cas_set_mark(dmw, this)
- It does not matter whether T-enter or T-deflate won the cmpxchg() call so the comment does not say who won.
- It does not matter whether T-enter or T-deflate won the cas_set_mark() call; in this scenario both were trying to restore the same value.
- The object's mark field has changed from 'om_ptr' → 'dmw'.
Please notice that install_displaced_markword_in_object() does not do any retries on any code path:
- Instead the code adapts to being the loser in a cmpxchg() by unmarking its copy of the dmw.
- In the second race, if a thread loses the cas_set_mark() race, there is also no need to retry because the object's header has been restored by the other thread.
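The no-retry property can be seen in a small model of the two races. This is a sketch under stated assumptions: the mark word layout (bit 0 as the mark flag, bits 8 and up as the hash), SketchObject, SketchMonitor and install_displaced_markword() are all hypothetical and do not match HotSpot's real markOop encoding.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical mark word layout, chosen only for this sketch:
// bit 0 is the "marked" flag and bits 8 and up hold the hash.
constexpr std::uintptr_t MARK_FLAG = 0x1;
constexpr unsigned HASH_SHIFT = 8;

inline bool is_marked(std::uintptr_t w) { return (w & MARK_FLAG) != 0; }
inline std::uintptr_t hash_of(std::uintptr_t w) { return w >> HASH_SHIFT; }

struct SketchObject {
  std::atomic<std::uintptr_t> mark{0};
};

struct SketchMonitor {
  std::atomic<std::uintptr_t> header{0};  // the displaced mark word (dmw)
  SketchObject* object = nullptr;
};

// om_ptr is the value the object's mark holds while it is inflated.
inline void install_displaced_markword(SketchMonitor& m, std::uintptr_t om_ptr) {
  std::uintptr_t dmw = m.header.load();
  if (!is_marked(dmw) && hash_of(dmw) == 0) {
    const std::uintptr_t marked_dmw = dmw | MARK_FLAG;
    // First race. On failure compare_exchange_strong() stores the current
    // header into dmw, which mirrors cmpxchg() returning the old value.
    m.header.compare_exchange_strong(dmw, marked_dmw);
  }
  if (is_marked(dmw)) {
    dmw &= ~MARK_FLAG;  // The loser adapts by unmarking its local copy.
  }
  // Second race. Losing is fine: the other thread installed the same dmw.
  std::uintptr_t expected = om_ptr;
  m.object->mark.compare_exchange_strong(expected, dmw);
}
```

Calling the function twice in a row stands in for the winner followed by the loser: the second call finds the header already marked, unmarks its copy, and its final compare-and-swap fails harmlessly.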
Hashcodes and Object Header Interference
If we have a race between a T-deflate thread and a thread trying to get/set a hashcode (T-hash), then the race is between the ObjectMonitorHandle.save_om_ptr(obj, mark) call in T-hash and the deflation protocol in T-deflate.
Note: ref_count is not mentioned in any of the previous sections for simplicity.
Start of the Race
T-hash                        ObjectMonitor              T-deflate
----------------------------  +-----------------------+  ----------------------------------------
   save_om_ptr() {            | owner=NULL            |     deflate_monitor_using_JT() {
     :                        | contentions=0         |  1>   cmpxchg(DEFLATER_MARKER, &owner, NULL)
1>   atomic inc ref_count     | ref_count=0           |
                              +-----------------------+
- The data fields are at their starting values.
- T-deflate is about to execute cmpxchg().
- T-hash is about to increment ref_count.
- The "1>" markers are showing where each thread is at for the ObjectMonitor box.
Racing Threads
T-hash                        ObjectMonitor              T-deflate
----------------------------  +-----------------------+  --------------------------------------------
   save_om_ptr() {            | owner=DEFLATER_MARKER |     deflate_monitor_using_JT() {
     :                        | contentions=0         |       cmpxchg(DEFLATER_MARKER, &owner, NULL)
1>   atomic inc ref_count     | ref_count=0           |  1>   if (waiters != 0 || ref_count != 0) {
                              +-----------------------+       }
                                                              prev = cmpxchg(-max_jint, &contentions, 0)
- T-deflate has set the owner field to DEFLATER_MARKER.
- T-deflate is about to check the waiters and ref_count fields.
- T-hash is about to inc the ref_count field (T-hash has made no progress).
- The "1>" markers are showing where each thread is at for the ObjectMonitor box.
T-deflate Wins
If T-deflate wins the race, then T-hash will have to retry until the object and/or ObjectMonitor are stable.
T-hash                        ObjectMonitor              T-deflate
----------------------------  +-----------------------+  --------------------------------------------
   save_om_ptr() {            | owner=DEFLATER_MARKER |     deflate_monitor_using_JT() {
     atomic inc ref_count     | contentions=-max_jint |       cmpxchg(DEFLATER_MARKER, &owner, NULL)
1>   if (owner ==             | ref_count=1           |       if (waiters != 0 || ref_count != 0) {
         DEFLATER_MARKER &&   +-----------------------+       }
         contentions <= 0) {             ||                   prev = cmpxchg(-max_jint, &contentions, 0)
       restore obj header                \/              1>   if (prev == 0 &&
       atomic dec ref_count   +-----------------------+           owner == DEFLATER_MARKER &&
2>     return false to        | owner=DEFLATER_MARKER |           ref_count == 0) {
       cause a retry          | contentions=-max_jint |         restore obj header
     }                        | ref_count=0           |  2>     finish the deflation
                              +-----------------------+       }
- T-deflate made it past the first ref_count check before T-hash incremented it.
- T-deflate set the contentions field to -max_jint and T-hash incremented the ref_count field.
- The first ObjectMonitor box is showing the fields at this point and the "1>" markers are showing where each thread is at for that ObjectMonitor box.
- T-hash observes "owner == DEFLATER_MARKER && contentions <= 0" so it restores obj header (not shown) and decrements ref_count.
- T-deflate sees "prev == 0 && owner == DEFLATER_MARKER && ref_count == 0" so it has won the race.
- T-deflate restores obj header (not shown).
- The second ObjectMonitor box is showing the fields at this point and the "2>" markers are showing where each thread is at for that ObjectMonitor box.
- T-deflate finishes the deflation work.
- T-hash returns false to cause a retry and when T-hash retries:
- if it observes "owner == DEFLATER_MARKER && contentions <= 0" it will retry again.
- if it observes the restored object header:
- if the object's header does not have a hash, then generate a hash and merge it with the object's header.
- Otherwise, extract the hash from the object's header and return it.
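The T-hash side of this race, as drawn in the diagram above, can be sketched as follows. This is an illustration only: SketchMonitor, SketchObject, SketchHandle and this save_om_ptr() signature are hypothetical, the object's mark is modeled as a plain pointer to its current monitor, and the "restore obj header" step is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor.
struct SketchMonitor {
  std::atomic<void*> owner{nullptr};
  std::atomic<std::int32_t> contentions{0};
  std::atomic<std::int32_t> ref_count{0};
};

// Hypothetical object: the mark is modeled as a pointer to the object's
// current monitor (nullptr means "no monitor").
struct SketchObject {
  std::atomic<SketchMonitor*> mark{nullptr};
};

// Hypothetical stand-in for ObjectMonitorHandle.
struct SketchHandle {
  SketchMonitor* om_ptr = nullptr;
};

// Hypothetical sentinel; the real DEFLATER_MARKER is defined by HotSpot.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;

// Returns false when the caller has to retry.
inline bool save_om_ptr(SketchHandle& h, SketchObject& obj, SketchMonitor* om) {
  om->ref_count.fetch_add(1);  // Makes a later deflation attempt bail out.
  if (om->owner.load() == DEFLATER_MARKER && om->contentions.load() <= 0) {
    // Async deflation is in progress (header restore omitted here).
    om->ref_count.fetch_sub(1);
    return false;
  }
  if (obj.mark.load() != om) {
    // The object no longer has a monitor or has a different monitor.
    om->ref_count.fetch_sub(1);
    return false;
  }
  h.om_ptr = om;  // Safe to use until ref_count is decremented again.
  return true;
}
```

A caller in this model would loop until save_om_ptr() returns true or until it observes that the object's header has been restored.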
T-hash Wins Scenario 1
If T-hash wins the race, then the ref_count will cause T-deflate to bail out on deflating the monitor.
Note: header is not mentioned in any of the previous sections for simplicity.
T-hash                        ObjectMonitor              T-deflate
----------------------------  +-----------------------+  --------------------------------------------
   save_om_ptr() {            | header=dmw_no_hash    |     deflate_monitor_using_JT() {
     atomic inc ref_count     | owner=DEFLATER_MARKER |       cmpxchg(DEFLATER_MARKER, &owner, NULL)
1>   if (owner ==             | contentions=0         |  1>   if (waiters != 0 || ref_count != 0) {
         DEFLATER_MARKER &&   | ref_count=1           |         cmpxchg(NULL, &owner, DEFLATER_MARKER)
         contentions <= 0) {  +-----------------------+  2>     bailout on deflation
     }                                   ||                   }
     if (object no longer                \/                   prev = cmpxchg(-max_jint, &contentions, 0)
         has a monitor or     +-----------------------+
         is a different       | header=dmw_no_hash    |
         monitor) {           | owner=NULL            |
       atomic dec ref_count   | contentions=0         |
       return false to        | ref_count=1           |
       cause a retry          +-----------------------+
     }                                   ||
2>   save om_ptr in the                  \/
     ObjectMonitorHandle      +-----------------------+
   }                          | header=dmw_hash       |
   if save_om_ptr() {         | owner=NULL            |
     if no hash               | contentions=0         |
       gen hash & merge       | ref_count=1           |
     hash = hash(header)      +-----------------------+
   }
3> atomic dec ref_count
   return hash
- T-hash has incremented ref_count before T-deflate made it past that check.
- The first ObjectMonitor box is showing the fields at this point and the "1>" markers are showing where each thread is at for that ObjectMonitor box.
- T-deflate bails out on deflation, but first it tries to restore the owner field:
- The return value of cmpxchg() is not checked here.
- If T-deflate cannot restore the owner field to NULL, then another thread has managed to enter the monitor (or enter and exit the monitor) and we don't want to overwrite that information.
- T-hash observes:
- "owner == DEFLATER_MARKER && contentions == 0" or
- "owner == NULL && contentions == 0" so it does not cause a retry.
- T-hash verifies that the object still has a monitor and that monitor still refers to our current ObjectMonitor.
- The second ObjectMonitor box is showing the fields at this point and the "2>" markers are showing where each thread is at for that ObjectMonitor box.
- T-hash saves the ObjectMonitor* in the ObjectMonitorHandle (not shown) and returns to the caller.
- save_om_ptr() returns true since the ObjectMonitor is safe:
- if ObjectMonitor's 'header/dmw' field does not have a hash, then generate a hash and merge it with the 'header/dmw' field.
- Otherwise, extract the hash from the ObjectMonitor's 'header/dmw' field.
- The third ObjectMonitor box is showing the fields at this point and the "3>" marker is showing where T-hash is at for that ObjectMonitor box.
- T-hash decrements the ref_count field.
- T-hash returns the hash value.
T-hash Wins Scenario 2
In this T-hash wins scenario, the need for the "ref_count == 0" check in the third phase of the protocol is illustrated.
T-hash                        ObjectMonitor              T-deflate
----------------------------  +-----------------------+  --------------------------------------------
   save_om_ptr() {            | header=dmw_no_hash    |     deflate_monitor_using_JT() {
     atomic inc ref_count     | owner=DEFLATER_MARKER |       cmpxchg(DEFLATER_MARKER, &owner, NULL)
     if (owner ==             | contentions=0         |       if (waiters != 0 || ref_count != 0) {
         DEFLATER_MARKER &&   | ref_count=1           |       }
         contentions <= 0) {  +-----------------------+  1>   prev = cmpxchg(-max_jint, &contentions, 0)
     }                                   ||              2>   if (prev == 0 &&
1>   if (object no longer                \/                       owner == DEFLATER_MARKER &&
         has a monitor or     +-----------------------+           ref_count == 0) {
         is a different       | header=dmw_no_hash    |       } else {
         monitor) {           | owner=DEFLATER_MARKER |         cmpxchg(NULL, &owner, DEFLATER_MARKER)
       atomic dec ref_count   | contentions=-max_jint |         atomic add max_jint to contentions
       return false to        | ref_count=1           |  3>     bailout on deflation
       cause a retry          +-----------------------+       }
     }                                   ||
2>   save om_ptr in the                  \/
     ObjectMonitorHandle      +-----------------------+
   }                          | header=dmw_hash       |
   if save_om_ptr() {         | owner=NULL            |
     if no hash               | contentions=0         |
       gen hash & merge       | ref_count=1           |
     hash = hash(header)      +-----------------------+
   }
3> atomic dec ref_count
   return hash
- T-deflate made it past the first ref_count check before T-hash incremented it.
- T-hash made it past the "owner == DEFLATER_MARKER && contentions <= 0" check before T-deflate updated contentions.
- The first ObjectMonitor box is showing the fields at this point and the "1>" markers are showing where each thread is at for that ObjectMonitor box.
- T-deflate sets the contentions field to -max_jint and is about to make the last of the protocol checks.
- T-hash verifies that the object still has a monitor and that monitor still refers to our current ObjectMonitor.
- The second ObjectMonitor box is showing the fields at this point and the "2>" markers are showing where each thread is at for that ObjectMonitor box.
- T-deflate sees that "ref_count != 0" and bails out on deflation but it has to restore some data if possible:
- The return value of cmpxchg() is not checked here.
- If T-deflate cannot restore the owner field to NULL, then another thread has managed to enter the monitor (or enter and exit the monitor) and we don't want to overwrite that information.
- Add back max_jint to restore the contentions field to its proper value (which may not be the same as when we started).
- T-hash saves the ObjectMonitor* in the ObjectMonitorHandle (not shown) and returns to the caller.
- save_om_ptr() returns true since the ObjectMonitor is safe:
- if ObjectMonitor's 'header/dmw' field does not have a hash, then generate a hash and merge it with the 'header/dmw' field.
- Otherwise, extract the hash from the ObjectMonitor's 'header/dmw' field.
- The third ObjectMonitor box is showing the fields at this point and the "3>" markers are showing where each thread is at for that ObjectMonitor box.
- T-hash decrements the ref_count field.
- T-hash returns the hash value.
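Why the third phase needs the "ref_count == 0" check can be shown by splitting the deflater into two steps and letting a reference slip in between them. This is a hedged sketch: SketchMonitor, deflate_begin() and deflate_finish() are hypothetical names, and the split into two functions exists only so the interleaving can be written down in program order.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor.
struct SketchMonitor {
  std::atomic<void*> owner{nullptr};
  std::atomic<std::int32_t> contentions{0};
  std::atomic<std::int32_t> ref_count{0};
  std::atomic<std::int32_t> waiters{0};
};

// Hypothetical sentinel; the real DEFLATER_MARKER is defined by HotSpot.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;

// Step 1: mark the owner field and make the early waiters/ref_count check.
inline bool deflate_begin(SketchMonitor& m) {
  void* expected_owner = nullptr;
  if (!m.owner.compare_exchange_strong(expected_owner, DEFLATER_MARKER)) {
    return false;
  }
  if (m.waiters.load() != 0 || m.ref_count.load() != 0) {
    void* marker = DEFLATER_MARKER;
    m.owner.compare_exchange_strong(marker, nullptr);  // Result not checked.
    return false;
  }
  return true;
}

// Step 2: mark contentions and make the final three-way check.
inline bool deflate_finish(SketchMonitor& m) {
  std::int32_t zero = 0;
  const bool marked = m.contentions.compare_exchange_strong(zero, -INT32_MAX);
  if (marked && m.owner.load() == DEFLATER_MARKER &&
      m.ref_count.load() == 0) {
    return true;  // Won every race; safe to deflate.
  }
  // Bail out: restore what can be restored without clobbering others.
  void* marker = DEFLATER_MARKER;
  m.owner.compare_exchange_strong(marker, nullptr);
  if (marked) {
    // Only undo the marking if this thread actually applied it.
    m.contentions.fetch_add(INT32_MAX);
  }
  return false;
}
```

If T-hash increments ref_count after deflate_begin() has returned true, deflate_finish() still refuses to deflate and leaves owner at NULL and contentions at zero.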
T-deflate and T-hash Both Lose
This subsection title is NOT a typo. It is possible for both T-deflate and T-hash to lose the race.
T-hash                        ObjectMonitor              T-deflate
----------------------------  +-----------------------+  --------------------------------------------
   save_om_ptr() {            | owner=DEFLATER_MARKER |     deflate_monitor_using_JT() {
     atomic inc ref_count     | contentions=-max_jint |       cmpxchg(DEFLATER_MARKER, &owner, NULL)
1>   if (owner ==             | ref_count=1           |       if (waiters != 0 || ref_count != 0) {
         DEFLATER_MARKER &&   +-----------------------+       }
         contentions <= 0) {             ||                   prev = cmpxchg(-max_jint, &contentions, 0)
       restore obj header                \/              1>   if (prev == 0 &&
       atomic dec ref_count   +-----------------------+           owner == DEFLATER_MARKER &&
2>     return false to        | owner=NULL            |           ref_count == 0) {
       cause a retry          | contentions=0         |       } else {
     }                        | ref_count=0           |         cmpxchg(NULL, &owner, DEFLATER_MARKER)
                              +-----------------------+         atomic add max_jint to contentions
                                                         2>     bailout on deflation
                                                              }
- T-deflate made it past the first ref_count check before T-hash incremented it.
- T-deflate set the contentions field to -max_jint and T-hash incremented the ref_count field.
- The first ObjectMonitor box is showing the fields at this point and the "1>" markers are showing where each thread is at for that ObjectMonitor box.
- T-hash observes "owner == DEFLATER_MARKER && contentions <= 0" and starts to bail out.
- T-deflate sees "ref_count != 0" and bails out on deflation but it has to restore some data if possible:
- The return value of cmpxchg() is not checked here.
- If T-deflate cannot restore the owner field to NULL, then another thread has managed to enter the monitor (or enter and exit the monitor) and we don't want to overwrite that information.
- Add back max_jint to restore the contentions field to its proper value (which may not be the same as when we started).
- T-hash restores obj header (not shown) and decrements ref_count.
- The second ObjectMonitor box is showing the fields at this point and the "2>" markers are showing where each thread is at for that ObjectMonitor box.
- T-hash returns false to cause a retry and when T-hash retries:
- the object's header will still refer to the ObjectMonitor (i.e., not deflated)
- it will observe "owner != DEFLATER_MARKER"
- if the ObjectMonitor's header/dmw does not have a hash, then generate a hash and merge it with the ObjectMonitor's header/dmw.
- Otherwise, extract the hash from the ObjectMonitor's header/dmw and return it.
Note: The addition of "restore obj header" to save_om_ptr() created a bug where the ObjectMonitor can be disconnected from the object without the ObjectMonitor being deflated. This leads to a situation where the ObjectMonitor is still on the in-use list and thinks it is owned by the object, but the object does not agree. This bug exists in the "CR1/v2.01/4-for-jdk13" version of the code that is currently out for review. The bug occurs rarely in extensive testing. I have a fix and I'm in the process of testing it.
Please note that in Carsten's original prototype, there was another race in ObjectSynchronizer::FastHashCode() when the object's monitor had to be inflated. The setting of the hashcode in the ObjectMonitor's header/dmw could race with T-deflate. That race is resolved in this version by the use of an ObjectMonitorHandle in the call to ObjectSynchronizer::inflate(). The ObjectMonitor* returned by ObjectMonitorHandle.om_ptr() has a non-zero ref_count so no additional races with T-deflate are possible.
Housekeeping Parts of the Algorithm
The devil is in the details! Housekeeping or administrative stuff is usually detailed, but necessary.
- New diagnostic option '-XX:AsyncDeflateIdleMonitors' that is default 'true' so that the new mechanism is used by default, but it can be disabled for potential failure diagnosis.
- ObjectMonitor deflation is still initiated or signaled as needed at a safepoint. When Async Monitor Deflation is in use, flags are set so that the work is done by JavaThreads and the ServiceThread which offloads the safepoint cleanup mechanism.
- ObjectSynchronizer::omAlloc() is modified to call (as needed) ObjectSynchronizer::deflate_per_thread_idle_monitors_using_JT(). Having the JavaThread clean up its own per-thread monitor list permits this work to happen without any per-thread list locking or critical sections.
- Having a JavaThread deflate a potentially long list of in-use monitors could potentially delay the start of a safepoint. This is detected in ObjectSynchronizer::deflate_monitor_list_using_JT() which will save the current state when it is safe to do so and return to its caller to drop locks as needed before honoring the safepoint request.
- ObjectSynchronizer::inflate() has to be careful how omAlloc() is called. If the inflation cause is inflate_cause_vm_internal, then it is not safe to deflate monitors on the per-thread lists so we skip that. When monitor deflation is done, inflate() has to do the oop refresh dance that is common to any code that can go to a safepoint while holding a naked oop. And no, you can't use a Handle here either. :-)
- Everything else is just monitor list management, infrastructure, logging, debugging and the like. :-)
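The "save the current state and return to the caller" behavior described for ObjectSynchronizer::deflate_monitor_list_using_JT() can be sketched as a resumable list walk. Everything here is hypothetical (SketchNode, DeflateProgress, deflate_list_until_yield() and the safepoint_pending callback); the real code works on ObjectMonitor lists and also has to drop locks before honoring the safepoint.

```cpp
#include <cassert>
#include <functional>

// Hypothetical in-use list node standing in for an ObjectMonitor.
struct SketchNode {
  bool idle = true;
  bool deflated = false;
  SketchNode* next = nullptr;
};

// Hypothetical saved state handed back to the caller.
struct DeflateProgress {
  SketchNode* resume_at;  // nullptr once the whole list has been walked
  int deflated_count;
};

// Walks the list from 'start', deflating idle nodes, and stops early when
// the (hypothetical) safepoint_pending callback reports a pending request.
inline DeflateProgress deflate_list_until_yield(
    SketchNode* start, const std::function<bool()>& safepoint_pending) {
  int count = 0;
  SketchNode* cur = start;
  while (cur != nullptr) {
    if (safepoint_pending()) {
      return DeflateProgress{cur, count};  // Save where to pick up again.
    }
    if (cur->idle && !cur->deflated) {
      cur->deflated = true;
      ++count;
    }
    cur = cur->next;
  }
  return DeflateProgress{nullptr, count};
}
```

The caller would honor the safepoint and later call the function again with resume_at as the new starting point.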
Gory Details
- Counterpart function mapping for those that know the existing code:
- ObjectSynchronizer class:
- deflate_idle_monitors() has deflate_global_idle_monitors_using_JT(), deflate_per_thread_idle_monitors_using_JT(), and deflate_common_idle_monitors_using_JT().
- deflate_monitor_list() has deflate_monitor_list_using_JT()
- deflate_monitor() has deflate_monitor_using_JT()
- ObjectMonitor class:
- is_busy() has is_busy_async()
- clear() has clear_using_JT()
- These functions recognize the Async Monitor Deflation protocol and adapt their operations:
- ObjectMonitor::enter()
- ObjectMonitor::EnterI()
- ObjectMonitor::ReenterI()
- most callers to enter() had to indirectly adapt to the protocol and retry their operations.
- Also these functions had to adapt and retry their operations:
- ObjectSynchronizer::quick_enter()
- ObjectSynchronizer::slow_enter()
- ObjectSynchronizer::reenter()
- ObjectSynchronizer::jni_enter()
- ObjectSynchronizer::FastHashCode()
- ObjectSynchronizer::current_thread_holds_lock()
- ObjectSynchronizer::query_lock_ownership()
- ObjectSynchronizer::get_lock_owner()
- ObjectSynchronizer::monitors_iterate()
- ObjectSynchronizer::inflate_helper()
- ObjectSynchronizer::inflate()
- Various assertions had to be modified to pass without their real check when AsyncDeflateIdleMonitors is true; this is due to the change in semantics for the ObjectMonitor owner and contentions fields.
- ObjectMonitor has a new allocation_state field that supports three states: 'Free', 'New', 'Old'. Async Monitor Deflation is only applied to ObjectMonitors that have reached the 'Old' state. When the Async Monitor Deflation code sees an ObjectMonitor in the 'New' state, it is changed to the 'Old' state, but is not deflated. This prevents a newly allocated ObjectMonitor from being immediately deflated which could cause an inflation<->deflation oscillation.
- ObjectMonitor has a new ref_count field that is used to indicate that an ObjectMonitor* is in use so the ObjectMonitor should not be deflated; this is needed for operations on non-busy monitors so that ObjectMonitor values don't change while they are being queried. There is a new ObjectMonitorHandle helper to manage the ref_count.
- The ObjectMonitor::owner() accessor detects DEFLATER_MARKER and returns NULL in that case to minimize the places that need to understand the new DEFLATER_MARKER value.
- System.gc()/JVM_GC() causes a special monitor list cleanup request which uses the safepoint based monitor list mechanism. So even if AsyncDeflateIdleMonitors is enabled, the safepoint based mechanism is still used by this special case.
- This is necessary for those tests that do something to cause an object's monitor to be inflated, clear the only reference to the object and then expect that enough System.gc() calls will eventually cause the object to be GC'ed even when the thread never inflates another object's monitor. Yes, we have several tests like that. :-)
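Two of the smaller items in the list above, the allocation_state aging and the owner() accessor that hides DEFLATER_MARKER, can be sketched together. This is an illustration under assumptions: SketchMonitor, raw_owner and is_async_deflation_candidate() are hypothetical names, and only the 'Free', 'New' and 'Old' states come from the text.

```cpp
#include <atomic>
#include <cassert>

enum class AllocationState { Free, New, Old };

// Hypothetical sentinel; the real DEFLATER_MARKER is defined by HotSpot.
static char deflater_marker_storage;
static void* const DEFLATER_MARKER = &deflater_marker_storage;

// Hypothetical stand-in for ObjectMonitor.
struct SketchMonitor {
  AllocationState allocation_state = AllocationState::New;
  std::atomic<void*> raw_owner{nullptr};

  // Callers that only ask "who owns this?" never see the marker.
  void* owner() const {
    void* o = raw_owner.load();
    return o == DEFLATER_MARKER ? nullptr : o;
  }
};

// A 'New' monitor is aged to 'Old' and skipped on this pass, so a freshly
// inflated monitor is never deflated on the first pass that sees it.
inline bool is_async_deflation_candidate(SketchMonitor& m) {
  if (m.allocation_state == AllocationState::New) {
    m.allocation_state = AllocationState::Old;
    return false;
  }
  return m.allocation_state == AllocationState::Old;
}
```

The first deflation pass over a new monitor only ages it; a second pass is needed before it becomes a candidate, which is what damps the inflation<->deflation oscillation.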