The internationalization enhancements in JDK 9 include support for Unicode 8.0, UTF-8 properties files, and CLDR locale data enabled by default.
1. Unicode 8.0
Java 8 supported Unicode 6.2.
Java 9 supports up to Unicode 8.0, which adds 10,555 characters, 29 scripts, and 42 blocks compared with Unicode 6.2.
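As a quick illustration, the sketch below asks the runtime whether it knows a code point that was introduced in Unicode 8.0. The class name Unicode8Check and the choice of U+1F914 (THINKING FACE) are assumptions made for this example only; any character added after Unicode 6.2 would serve the same purpose.

```java
class Unicode8Check {
    // U+1F914 THINKING FACE was introduced in Unicode 8.0 (example choice).
    static final int THINKING_FACE = 0x1F914;

    // True when the running JDK's Unicode tables include the code point.
    static boolean knowsUnicode8() {
        return Character.isDefined(THINKING_FACE);
    }

    public static void main(String[] args) {
        System.out.println("U+1F914 defined: " + knowsUnicode8());
        System.out.println("Name:  " + Character.getName(THINKING_FACE));
        System.out.println("Block: " + Character.UnicodeBlock.of(THINKING_FACE));
    }
}
```

On a Java 8 runtime the same code point is expected to be reported as undefined, since it lies outside Unicode 6.2.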
2. UTF-8 Properties Files
In previous releases, ISO-8859-1 encoding was used when loading property resource bundles (constructing a PropertyResourceBundle instance from an InputStream required the input stream to be encoded in ISO-8859-1). However, ISO-8859-1 is not a convenient way to represent non-Latin characters, which had to be written as Unicode escapes.
In Java 9, properties files are loaded in UTF-8 encoding. The rules are:
– By default, if reading the input stream throws a MalformedInputException or an UnmappableCharacterException, the PropertyResourceBundle instance:
+ resets to its state before the exception
+ re-reads the input stream in ISO-8859-1
+ continues reading.
– If the system property java.util.PropertyResourceBundle.encoding is set to either ISO-8859-1 or UTF-8, the PropertyResourceBundle instance reads the input stream in that encoding and throws the exception if it encounters an invalid sequence.
– The system property is read and evaluated when the PropertyResourceBundle class is initialized. Changing or removing the property after that has no effect.
– If ISO-8859-1 is specified, characters that cannot be represented in that encoding must be written as Unicode escapes.
– Any other value of this system property is ignored.
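The default behaviour described in these rules can be tried out with a small sketch. It assumes JDK 9 or later and that java.util.PropertyResourceBundle.encoding is not set; the class name Utf8BundleDemo, the helper load, and the sample keys and values are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;

class Utf8BundleDemo {
    // Encodes the properties text with the given charset, then loads the
    // resulting bytes as a bundle and looks up one key.
    static String load(String content, Charset charset, String key) {
        byte[] bytes = content.getBytes(charset);
        try (ByteArrayInputStream in = new ByteArrayInputStream(bytes)) {
            return new PropertyResourceBundle(in).getString(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only; the bytes handed
        // to the bundle are genuine UTF-8 and ISO-8859-1 respectively.
        String fromUtf8 = load("greeting=\u3053\u3093\u306b\u3061\u306f\n",
                StandardCharsets.UTF_8, "greeting");
        // The byte 0xE9 followed by a space is not valid UTF-8, so the
        // bundle is expected to fall back to ISO-8859-1 for this input.
        String fromLatin1 = load("drink=caf\u00e9 au lait\n",
                StandardCharsets.ISO_8859_1, "drink");
        System.out.println(fromUtf8);
        System.out.println(fromLatin1);
    }
}
```

If the system property were set to UTF-8 at JVM startup, the second call would be expected to fail with an exception instead of falling back.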
If an existing properties file no longer loads correctly, consider one of the following options:
– Convert the properties file to UTF-8 encoding.
– Specify the runtime system property:
java.util.PropertyResourceBundle.encoding=ISO-8859-1
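For the first option, a conversion could look roughly like the sketch below. It assumes the source file really is ISO-8859-1 encoded; the class name PropertiesToUtf8 and its methods are hypothetical helpers, not part of the JDK. Existing Unicode escapes in the file are plain ASCII and pass through unchanged.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PropertiesToUtf8 {
    // Decodes the bytes as ISO-8859-1 and re-encodes the text as UTF-8.
    static byte[] toUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Reads the source file, converts it, and writes the target file.
    static void convert(Path source, Path target) {
        try {
            Files.write(target, toUtf8(Files.readAllBytes(source)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: PropertiesToUtf8 <source> <target>");
            return;
        }
        convert(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

For the second option, the property has to be in place before the PropertyResourceBundle class is initialized, so passing it on the command line with -D is the safer route.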
3. Default Locale Data Change
There are 4 locale providers:
– CLDR: the locale data provided by the Unicode Common Locale Data Repository (CLDR) project.
– HOST: the current user’s customization of the underlying operating system’s settings. What is supported depends on the operating system, but it is primarily date, time, number, and currency formats.
– SPI: the locale-sensitive services implemented in the installed SPI providers.
– COMPAT (JRE): the locale data that is compatible with releases prior to JDK 9. The value JRE can still be used, but it is deprecated and will be removed in the future; COMPAT is preferred.
In JDK 8 and earlier releases, JRE was the default locale data. JDK 9 gives CLDR the highest priority by default.
The java.locale.providers system property selects the locale data sources in order of preference. If a provider cannot supply the requested locale data, the next provider in the list is tried:
java.locale.providers=COMPAT,CLDR,HOST,SPI
If we don’t set the property, the default behaviour is equivalent to:
java.locale.providers=CLDR,COMPAT,SPI
To stay compatible with JDK 8, put COMPAT ahead of CLDR:
java.locale.providers=COMPAT,CLDR
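One way to see the effect of the provider order is to format the same instant under different settings and compare the output. The sketch below is only an illustration: the class name LocaleProviderDemo, its methods, and the choice of locales are assumptions, and the exact strings printed depend on the JDK version and its locale data.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class LocaleProviderDemo {
    // Returns the configured provider list, or a note when it is not set.
    static String configuredProviders() {
        return System.getProperty("java.locale.providers",
                "(not set, JDK default order applies)");
    }

    // Formats the Unix epoch in UTC with the FULL date and time styles.
    static String formatEpoch(Locale locale) {
        DateFormat format = DateFormat.getDateTimeInstance(
                DateFormat.FULL, DateFormat.FULL, locale);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(new Date(0L));
    }

    public static void main(String[] args) {
        System.out.println("java.locale.providers = " + configuredProviders());
        for (Locale locale : new Locale[] {Locale.US, Locale.GERMANY, Locale.JAPAN}) {
            System.out.println(locale + ": " + formatEpoch(locale));
        }
    }
}
```

Running it once without the property and once with -Djava.locale.providers=COMPAT,CLDR on JDK 9 should make any formatting differences between the two data sets visible. The property must be given at JVM startup, and later JDK releases may not offer COMPAT at all.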