...
Source: Source compatibility concerns translating Java source code into class files.
Binary: Binary compatibility is defined in The Java Language Specification as preserving the ability to link without error.
Behavioral: Behavioral compatibility includes the semantics of the code that is executed at runtime.
Note that non-source compatibility is sometimes colloquially referred to as "binary compatibility." Such usage is incorrect since the Java Language Specification (JLS) spends an entire chapter precisely defining the term binary compatibility; often behavioral compatibility is the intended notion instead.
...
A Java compiler's job also includes mapping more abstract names to more concrete ones, specifically mapping simple and qualified names appearing in source code into binary names in class files. Source compatibility concerns this mapping of source code into class files, not only whether or not such a mapping is possible, but also whether or not the resulting class files are suitable. Source compatibility is influenced by changing the set of types available during compilation, such as adding a new class, as well as changes within existing types themselves, such as adding an overloaded method. There is a large set of possible changes to classes and interfaces examined for their binary compatibility impact. All these changes could also be classified according to their source compatibility repercussions, but only a few kinds of changes will be analyzed below.
The most rudimentary kind of positive source compatibility is whether code that compiles against L1 will continue to compile against L2; however, that is not the entirety of the space of concerns since the class file resulting from compilation might not be equivalent. Java source code often uses simple names for types; using information about imports, the compiler will interpret these simple names and transform them into binary names for use in the resulting class file(s). In a class file, the binary name of an entity (along with its signature in the case of methods and constructors) serves as the unique, universal identifier to allow the entity to be referenced. So different degrees of source compatibility can be identified:
...
Due to the * import wrinkle, a more reasonable definition of source compatibility considers programs transformed to only use fully qualified names. Let FQN(P, L) be program P where each name is replaced by its fully qualified form in the context of libraries L. Call such a library transformation from L1 to L2 binary-preserving source compatible with source program P if FQN(P, L1) equals FQN(P, L2). This is a strict form of source compatibility that will usually result in class files for P using the same binary names when compiled against both versions of the library. Class files with the same binary names will result when each type has a distinct fully qualified name. Multiple types can have the same fully qualified name but differing binary names; those cases do not arise when the standard naming conventions are being followed.
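As a small illustration of why the fully qualified form is the stable one, the sketch below imports two packages on demand that both declare a type named Proxy (java.lang.reflect.Proxy and java.net.Proxy). The class name ImportWrinkle and the method directProxyType are hypothetical, chosen only for this example. Writing the simple name Proxy here would be rejected as ambiguous, while the fully qualified name compiles no matter which other types the imported packages contain.

```java
import java.lang.reflect.*; // declares java.lang.reflect.Proxy
import java.net.*;          // declares java.net.Proxy

public class ImportWrinkle {
    // Using the simple name "Proxy" here would be ambiguous, because both
    // on-demand imports supply a type with that name. The fully qualified
    // name plays the role of FQN(P, L): it does not depend on what else
    // the imported packages happen to contain.
    static java.net.Proxy.Type directProxyType() {
        java.net.Proxy direct = java.net.Proxy.NO_PROXY;
        return direct.type();
    }

    public static void main(String[] args) {
        System.out.println(directProxyType());
    }
}
```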
...
In the original version of Lib, a call to foo with an integer argument will resolve to foo(double) and under the rules for method invocation conversion the value of the int argument will be converted to a double through a primitive widening conversion. So given client code
    public class Client {
        public static void main(String... args) {
            int i = 42;
            double d = (new Lib()).foo(i);
        }
    }
...
In the previous versions of Lib, a call to foo with a long argument would resolve to calling foo(double), in which case the value of the long argument would be converted to double before the method was called. Then inside the body of foo, the value would be multiplied by 2.0 and returned. However, with the presence of the foo(long) overloading, call sites to foo with a long argument will resolve to calling foo(long) instead of foo(double). The foo(long) method first multiplies by 2 and then converts to double, the opposite order of operations compared to calling foo(double). Whether the argument value is converted to double before or after the multiplication by two matters since the two sequences of operations can yield different results. For example, a large positive long value multiplied by two can overflow to a negative value, but a large positive double value when multiplied by two will retain a positive sign. This kind of subtle change in overloading behavior occurred with the addition of a BigDecimal constructor taking a long argument as part of JSR 13.
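The difference in the order of operations can be made concrete with a minimal sketch. The classes LibV1 and LibV2 below are hypothetical stand-ins for the two library versions, reconstructed only from the description above (multiply by 2.0 in foo(double); multiply by 2 and then convert in foo(long)); the actual Lib source is not reproduced here.

```java
public class OverloadOrder {
    // Hypothetical stand-in for the earlier library: only foo(double) exists.
    static final class LibV1 {
        double foo(double d) { return d * 2.0; }
    }

    // Hypothetical stand-in for the later library: foo(long) has been added.
    static final class LibV2 {
        double foo(double d) { return d * 2.0; }
        double foo(long l)   { return (double) (l * 2); } // multiply first, then convert
    }

    // The long argument is widened to double, then doubled.
    static double viaV1(long arg) { return new LibV1().foo(arg); }

    // The same call shape now resolves to foo(long): doubled in long
    // arithmetic (which may overflow), then converted to double.
    static double viaV2(long arg) { return new LibV2().foo(arg); }

    public static void main(String[] args) {
        long big = Long.MAX_VALUE;
        System.out.println(viaV1(big)); // a large positive value
        System.out.println(viaV2(big)); // prints -2.0 after long overflow
    }
}
```

The call expression is identical in both helper methods; only the set of overloads visible at compile time differs, which is why the change surfaces only when the client is recompiled.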
When adding an overloaded method or constructor to an existing library, if the newly added method could be applicable to the same call sites as the original method, such as if the new method takes the same number of arguments as an original method and has more specific types, call sites in existing clients may now resolve to the new method when recompiled. Well-written programs will follow the Liskov substitution principle and perform "the same" operation on the argument no matter which overloaded method is called. Less than well-written programs may fail to follow this principle.
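The BigDecimal case mentioned earlier can be observed directly by forcing each overload with a cast. This is only an illustrative sketch; the class name BigDecimalOverloads and its two helper methods are hypothetical. The cast to double imitates what a call site compiled before the long constructor existed would have done implicitly.

```java
import java.math.BigDecimal;

public class BigDecimalOverloads {
    // Resolves to BigDecimal(long): the value is represented exactly.
    static BigDecimal viaLong(long value) {
        return new BigDecimal(value);
    }

    // Forces BigDecimal(double): the long is first widened to double,
    // which can round for magnitudes above 2^53.
    static BigDecimal viaDouble(long value) {
        return new BigDecimal((double) value);
    }

    public static void main(String[] args) {
        long big = Long.MAX_VALUE; // 2^63 - 1, not exactly representable as a double
        System.out.println(viaLong(big));
        System.out.println(viaDouble(big));
        System.out.println(viaLong(big).compareTo(viaDouble(big)) == 0); // prints false
    }
}
```

For small arguments the two constructors agree, so a difference like this one can go unnoticed until a client passes a value near the edge of the long range.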
...
If a new method or constructor cannot change resolution in existing clients, then the change is a binary-preserving source transformation. With binary-preserving source compatibility, existing clients yield equivalent class files when recompiled. When a new method does change resolution, whether the change is behaviorally equivalent or merely compilation-preserving depends on the implementations of the methods in question. If the different class file that results behaves similarly enough, the change may still be acceptable; changing resolution in a way that does not preserve semantics is likely problematic. Changing a library in such a way that current clients no longer compile is seldom appropriate.
Binary Compatibility
JLSv3 §13.2 – What Binary Compatibility Is and Is Not
A change to a type is binary compatible with (equivalently, does not break binary compatibility with) preexisting binaries if preexisting binaries that previously linked without error will continue to link without error.
The JLS defines binary compatibility strictly according to linkage; if P links with L1 and continues to link with L2, the change made in L2 is binary compatible. The runtime behavior after linking is not included in binary compatibility:
JLSv3 §13.4.22 – Method and Constructor Body
Changes to the body of a method or constructor do not break [binary] compatibility with pre-existing binaries.
As an extreme example, if the body of a method is changed to throw an error instead of compute a useful result, while the change is certainly a compatibility issue, it is not a binary compatibility issue since client classes would continue to link. Also, it is not a binary compatibility issue to add methods to an interface. Class files compiled against the old version of the interface will still link against the new interface despite the class not having an implementation of the new method. If the new method is called at runtime, an AbstractMethodError is thrown; if the new method is not called, the existing methods can be used without incident. (Adding a method to an interface is a source incompatibility that can break compilation though.)
...
Intuitively, behavioral compatibility should mean that with the same inputs program P does "the same" or an "equivalent" operation under different versions of libraries or the platform. Defining equivalence can be a bit involved; for example, even just defining a proper equals method in a class can be nontrivial. In this case, formalizing the concept would require an operational semantics for the JVM covering the aspects of the system a program was interested in. For example, there is a fundamental difference in visible changes between programs that introspect on the system and those that do not. Examples of introspection include calling core reflection, relying on stack trace output, using timing measurements to influence code execution, and so on. For programs that do not use, say, core reflection, changes to the structure of libraries, such as adding new public methods, are entirely transparent. In contrast, a (poorly behaved) program could use reflection to look up the set of public methods on a library class and throw an exception if any unexpected methods were present. A tricky program could even make decisions based on information like a timing side channel. For example, two threads could repeatedly run different operations and make some indication of progress, for example, incrementing an atomic counter, and the relative rates of progress could be compared. If the ratio is over a certain threshold, some unrelated action could be taken, or not. This allows a program to create a dependence on the optimization capabilities of a particular JVM implementation, which is generally outside a reasonable behavioral compatibility contract.
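The poorly behaved reflective program described above might look something like the following sketch. The nested Lib class, the helper names, and the choice of exception are all assumptions made for illustration; the point is only that such a check starts failing as soon as the library gains a public method, even though no documented contract was broken.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class BrittleReflection {
    // Hypothetical library class with a single public method.
    public static final class Lib {
        public double foo(double d) { return d * 2.0; }
    }

    // Collects the names of the public methods declared directly on the type.
    static Set<String> declaredPublicMethodNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    // The brittle part: any method outside the expected set is treated as an error.
    static void requireExactShape(Class<?> type, Set<String> expected) {
        Set<String> actual = declaredPublicMethodNames(type);
        if (!actual.equals(expected)) {
            throw new IllegalStateException("Unexpected methods on " + type.getName() + ": " + actual);
        }
    }

    public static void main(String[] args) {
        // Passes today; would throw if a later Lib added another public method.
        requireExactShape(Lib.class, new TreeSet<>(Arrays.asList("foo")));
        System.out.println("shape check passed");
    }
}
```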
The evolution of a library is constrained by the library's contract included in its specification; for final classes this contract doesn't usually include a prohibition of adding new public methods! While an end-user may not care why a program does not work with a newer version of a library, what contracts are being followed or broken should determine which party has the onus for fixing the problem. That said, there are times in evolving the JDK when differences are found between the specified behavior and the actual behavior (for example JDK-4707389, JDK-6365176). The two basic approaches to fixing these bugs are to change the implementation to match the specified behavior or to change the specification (in a platform release) to match the implementation's (perhaps long-standing) behavior; often the latter option is chosen since it has a lower de facto impact on behavioral compatibility.
While many classes and methods in the platform describe the exact input-output relationship between arguments and returned values, a few methods eschew this approach and are specified to have unspecified behavior. One such example is HashSet:
[HashSet] makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
...[The iterator method] Returns an iterator over the elements in this set. The elements are returned in no particular order.
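A client that needs a predictable order can impose one itself rather than rely on whatever order a particular HashSet implementation happens to produce. The sketch below, with a hypothetical class IterationOrder and helper stableOrder, copies the elements and sorts them, so its result depends only on the set's contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IterationOrder {
    // Returns the elements in natural (sorted) order, independent of the
    // iteration order the underlying set implementation chooses.
    static List<String> stableOrder(Set<String> elements) {
        List<String> sorted = new ArrayList<>(elements);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(Arrays.asList("b", "a", "c"));
        // Iterating over "names" directly is allowed to yield any order.
        System.out.println(stableOrder(names)); // prints [a, b, c]
    }
}
```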
...