Java 9 Internationalization Enhancements

The internationalization enhancements in JDK 9 are: support for Unicode 8.0, UTF-8 properties files, and CLDR locale data enabled by default.

1. Unicode 8.0

Java 8 supported Unicode 6.2.
Java 9 supports Unicode 8.0. Compared with Unicode 6.2, that adds 10,555 characters, 29 scripts, and 42 blocks.
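
One way to see the newer Unicode data at work is to query a few code points that were assigned after Unicode 6.2. The sketch below is only an illustration: the class name Unicode8Probe and the choice of sample code points are assumptions, not part of the JDK documentation. It relies on Character.isDefined, Character.getName, and Character.UnicodeScript.of, all of which are standard methods.

```java
public class Unicode8Probe {
    // Sample code points assigned after Unicode 6.2 (chosen for illustration).
    static final int RUBLE_SIGN = 0x20BD;        // added in Unicode 7.0
    static final int NERD_FACE = 0x1F913;        // added in Unicode 8.0
    static final int CHEROKEE_SMALL_A = 0xAB70;  // added in Unicode 8.0

    // True when the running JDK's Unicode tables know this code point.
    static boolean isKnown(int codePoint) {
        return Character.isDefined(codePoint);
    }

    public static void main(String[] args) {
        int[] samples = {RUBLE_SIGN, NERD_FACE, CHEROKEE_SMALL_A};
        for (int cp : samples) {
            // getName returns null and UnicodeScript.of returns UNKNOWN
            // for code points the runtime does not recognize.
            System.out.printf("U+%04X defined=%b name=%s script=%s%n",
                    cp, isKnown(cp), Character.getName(cp),
                    Character.UnicodeScript.of(cp));
        }
    }
}
```

On Java 8 these code points should be reported as undefined, while on Java 9 and later they should be recognized.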

2. UTF-8 Properties Files

In previous releases, property resource bundles were loaded in ISO-8859-1 encoding: constructing a PropertyResourceBundle instance from an InputStream required the input stream to be encoded in ISO-8859-1. But ISO-8859-1 is not a convenient way to represent non-Latin characters, which had to be written as Unicode escapes.

In Java 9, properties files are loaded in UTF-8 encoding. These are the rules:
– By default, if reading the input stream in UTF-8 throws a MalformedInputException or an UnmappableCharacterException, the PropertyResourceBundle instance:
+ resets to the state before the exception
+ re-reads the input stream in ISO-8859-1
+ continues reading.

– If the system property java.util.PropertyResourceBundle.encoding is set to either ISO-8859-1 or UTF-8, the PropertyResourceBundle instance reads the input stream in that encoding and throws the exception if it encounters an invalid sequence.

– The system property is read and evaluated once, when the PropertyResourceBundle class is initialized. Changing or removing the property after that has no effect.

– If we specify ISO-8859-1, characters that cannot be represented in that encoding must be written as Unicode escapes.

– Any other value of this system property is ignored.
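
The default behaviour described above can be sketched with two in-memory "files", one encoded in UTF-8 and one in ISO-8859-1. The class name BundleEncodingDemo, the helper read, and the sample keys are hypothetical. The sketch assumes Java 9 or later and that java.util.PropertyResourceBundle.encoding is not set. If the property were set to UTF-8, the second bundle would be expected to fail instead of falling back.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;

public class BundleEncodingDemo {
    // Loads a bundle from raw bytes and looks up one key.
    static String read(byte[] fileBytes, String key) {
        try {
            return new PropertyResourceBundle(
                    new ByteArrayInputStream(fileBytes)).getString(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A UTF-8 encoded file with a Japanese value: read directly as UTF-8.
        byte[] utf8File = "greeting=\u3053\u3093\u306b\u3061\u306f\n"
                .getBytes(StandardCharsets.UTF_8);

        // A legacy ISO-8859-1 file: the byte 0xE9 followed by a space is not
        // valid UTF-8, so the bundle falls back to ISO-8859-1.
        byte[] latin1File = "drink=caf\u00e9 au lait\n"
                .getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(read(utf8File, "greeting"));
        System.out.println(read(latin1File, "drink"));
    }
}
```

Both lookups should return the original text, which is why many legacy ISO-8859-1 files keep working without any change.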

If an existing properties file no longer loads correctly, consider the following options:
– convert the properties file into UTF-8 encoding.
– specify the runtime system property:

java.util.PropertyResourceBundle.encoding=ISO-8859-1
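
For the first option, converting a legacy file amounts to decoding its bytes as ISO-8859-1 and encoding them again as UTF-8. The following is a minimal sketch of that step on a byte array. The class name PropertiesTranscoder and the method latin1ToUtf8 are hypothetical, and reading and writing the actual file is left out.

```java
import java.nio.charset.StandardCharsets;

public class PropertiesTranscoder {
    // Decodes ISO-8859-1 bytes and re-encodes the same text as UTF-8.
    // Existing \\uXXXX escapes are plain ASCII and pass through unchanged.
    static byte[] latin1ToUtf8(byte[] legacyBytes) {
        String text = new String(legacyBytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] legacy = "drink=caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] converted = latin1ToUtf8(legacy);
        // The single byte 0xE9 becomes the two-byte UTF-8 sequence 0xC3 0xA9.
        System.out.println(legacy.length + " bytes -> " + converted.length + " bytes");
    }
}
```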

3. Default Locale Data Change

There are 4 locale providers:
CLDR: the locale data provided by the Unicode Common Locale Data Repository (CLDR) project.
HOST: the current user’s customization of the underlying operating system’s settings. What is supported depends on the operating system, but it is primarily date, time, number, and currency formats.
SPI: the locale-sensitive services implemented in the installed SPI providers.
COMPAT (JRE): the locale data that is compatible with releases prior to JDK 9. The value JRE can still be used, but it is deprecated and will be removed in the future. COMPAT is preferred.

In JDK 8 and earlier releases, JRE was the default locale data source. JDK 9 gives CLDR the highest priority by default.

The java.locale.providers system property selects the locale data sources in order of preference. If a provider cannot supply the requested locale data, the next provider in the list is tried:

java.locale.providers=COMPAT,CLDR,HOST,SPI

If we don’t set the property, the default behaviour is:

java.locale.providers=CLDR,COMPAT,SPI

To stay compatible with JDK 8, keep COMPAT ahead of CLDR:

java.locale.providers=COMPAT,CLDR
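
To see what a given provider order changes in practice, a small probe can format the same date and amount and print them next to the current property value. This is only a sketch: the class name LocaleProviderProbe and the sample locales are assumptions. Whether the output differs between settings depends on the locale and on the JDK version in use.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Date;
import java.util.Locale;

public class LocaleProviderProbe {
    // Formats a fixed date and amount so runs with different
    // java.locale.providers settings can be compared line by line.
    static String describe(Locale locale) {
        String providers = System.getProperty("java.locale.providers", "(not set)");
        String date = DateFormat.getDateInstance(DateFormat.FULL, locale)
                .format(new Date(0L));
        String amount = NumberFormat.getCurrencyInstance(locale).format(1234.5);
        return "java.locale.providers=" + providers
                + " | " + locale.toLanguageTag()
                + " | " + date + " | " + amount;
    }

    public static void main(String[] args) {
        for (Locale locale : new Locale[] {Locale.US, Locale.GERMANY, Locale.JAPAN}) {
            System.out.println(describe(locale));
        }
    }
}
```

The property has to be passed when the JVM starts, for example with java -Djava.locale.providers=COMPAT,CLDR LocaleProviderProbe. Running the probe once with and once without the option makes any formatting differences visible.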