Mutant World

Monday, January 01, 2007

UTF-8 handling for ResourceBundle and Properties

Every new version of the JDK comes with small, often unnoticed improvements that nonetheless fix long-standing bugs or fulfill long-standing requests for enhancement.

One such improvement is, finally, the ability of java.util.Properties to handle properties files that are not encoded in ISO-8859-1.
At first it seems like a very small improvement, but anyone who has worked on localizing an application (for example, translating messages into a country-specific language) knows that encoding problems are just painful, and that it is far too easy to waste days trying to get them right.

JDK 5 added a new API, namely Properties.loadFromXML(InputStream), that allows properties to be written in an XML syntax, and hence allows the encoding of the file to be declared in the XML declaration (the first line of an XML file):

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<entry key="hello.world">Καλημέρα κόσμε</entry>
</properties>
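As a rough illustration, a document like this can be read back as follows. This is only a sketch: the class name XmlPropertiesSketch and the in-memory copy of the file are made up for the example, and the Greek text is spelled with Java source escapes purely so that the snippet compiles whatever source encoding the compiler assumes. The XML also carries the DOCTYPE line that the Properties.loadFromXML Javadoc calls for.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.Properties;

class XmlPropertiesSketch
{
    // "Good morning, world" in Greek, the same text used in the XML example.
    static final String GREETING =
        "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1 \u03BA\u03CC\u03C3\u03BC\u03B5";

    static Properties loadXml() throws IOException
    {
        // Hypothetical in-memory stand-in for an XML properties file on disk.
        String xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n" +
            "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n" +
            "<properties>\n" +
            "<entry key=\"hello.world\">" + GREETING + "</entry>\n" +
            "</properties>\n";

        // Encode the text with the charset announced in the XML declaration.
        InputStream stream = new ByteArrayInputStream(xml.getBytes(Charset.forName("UTF-8")));
        try
        {
            Properties properties = new Properties();
            // The XML parser picks the encoding up from the declaration.
            properties.loadFromXML(stream);
            return properties;
        }
        finally
        {
            stream.close();
        }
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println(GREETING.equals(loadXml().getProperty("hello.world")));
    }
}
```

The program never names the encoding when reading: it comes from the file itself, which is what makes the XML format attractive for translators.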

JDK 6 went further, adding Properties.load(Reader): if the program knows the encoding a priori, it can create a Reader that uses that encoding to read the properties file.
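A minimal sketch of that approach, assuming the file is known to be UTF-8. The class name ReaderPropertiesSketch and the in-memory content are invented for the example, and the Greek text is written with Java source escapes only to keep the snippet independent of the compiler's source encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.Properties;

class ReaderPropertiesSketch
{
    static final Charset UTF_8 = Charset.forName("UTF-8");

    // "Good morning, world" in Greek.
    static final String GREETING =
        "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1 \u03BA\u03CC\u03C3\u03BC\u03B5";

    // Loads properties from a stream whose encoding the caller knows in advance.
    static Properties load(InputStream stream, Charset charset) throws IOException
    {
        Reader reader = new InputStreamReader(stream, charset);
        try
        {
            Properties properties = new Properties();
            properties.load(reader); // the Reader overload exists since JDK 6
            return properties;
        }
        finally
        {
            reader.close();
        }
    }

    // Round trip over a hypothetical UTF-8 properties file held in memory.
    static String demo() throws IOException
    {
        byte[] content = ("hello.world=" + GREETING + "\n").getBytes(UTF_8);
        return load(new ByteArrayInputStream(content), UTF_8).getProperty("hello.world");
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println(GREETING.equals(demo()));
    }
}
```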

However, in JDK 5, java.util.ResourceBundle was only able to read properties files encoded in ISO-8859-1, which forced translators to write horrible Unicode escapes instead of real text. Following the above example, in JDK 5 you have to write the properties file containing the translations like this:

hello.world=\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1 \u03BA\u03CC\u03C3\u03BC\u03B5

Pretty ugly, especially when you consider that the API to read properties files in XML format is already present in JDK 5; ResourceBundle simply was not updated to use it.
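To see that the escaped line really does carry the same text, here is a small sketch that feeds it to Properties.load(InputStream), which reads ISO-8859-1 and decodes the escapes. The class name EscapedPropertiesSketch is made up, and the file content is kept in memory for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Properties;

class EscapedPropertiesSketch
{
    // The escaped properties line; each doubled backslash in the Java source
    // stands for the single backslash that appears in the properties file.
    static final String ESCAPED_LINE =
        "hello.world=\\u039A\\u03B1\\u03BB\\u03B7\\u03BC\\u03AD\\u03C1\\u03B1 "
        + "\\u03BA\\u03CC\\u03C3\\u03BC\\u03B5\n";

    static String decode() throws IOException
    {
        // The content is plain ASCII, hence also valid ISO-8859-1.
        byte[] content = ESCAPED_LINE.getBytes(Charset.forName("ISO-8859-1"));
        Properties properties = new Properties();
        // load(InputStream) assumes ISO-8859-1 and turns the escapes into characters.
        properties.load(new ByteArrayInputStream(content));
        return properties.getProperty("hello.world");
    }

    public static void main(String[] args) throws IOException
    {
        String value = decode();
        // 8 letters, a space and 5 letters: 14 characters, starting with a capital kappa.
        System.out.println(value.length() == 14 && value.charAt(0) == 0x039A);
    }
}
```

The program gets the right string either way; the cost of the escapes is paid entirely by whoever has to read and edit the file.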

Fortunately, with JDK 6 it is possible to solve this problem, although it requires a bit of coding (see below).

In JDK 6, ResourceBundle has been extended to give the user more control over how resources are loaded: you subclass ResourceBundle.Control and pass an instance of the subclass like this:

ResourceBundle bundle = ResourceBundle.getBundle("messages", new MyControl());

where MyControl is the ResourceBundle.Control subclass.
The class ResourceBundle.Control allows you to tune:
  • the kind of resource to load (a Java class, a properties file, a properties file in XML format, or any other format you decide)

  • the expiration policy of the resource, thus allowing the resource to be reloaded upon some condition

  • the ResourceBundle subclass to instantiate, normally depending on the kind of resource
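As a rough sketch of the first two knobs, a subclass might restrict the lookup to properties files and let cached bundles expire after a while. The class name TunedControl and the 60 second lifetime are arbitrary choices made for this example.

```java
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

class TunedControl extends ResourceBundle.Control
{
    // Kind of resource: only look for properties files, skipping Java classes.
    @Override
    public List<String> getFormats(String baseName)
    {
        if (baseName == null) throw new NullPointerException("baseName");
        return FORMAT_PROPERTIES;
    }

    // Expiration policy: cached bundles expire after 60 seconds (the value is in
    // milliseconds). Once expired, needsReload() decides whether to reload.
    @Override
    public long getTimeToLive(String baseName, Locale locale)
    {
        if (baseName == null || locale == null) throw new NullPointerException();
        return 60 * 1000L;
    }
}
```

The third knob, the ResourceBundle subclass to instantiate, is handled by overriding newBundle.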

Here's the JDK 6 class that loads XML properties files as resource bundles.

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.Properties;
import java.util.ResourceBundle;

/**
 * JDK 6's {@link ResourceBundle.Control} subclass that allows
 * loading of bundles in XML format.
 * The bundles are searched first as Java classes, then as
 * properties files (these two methods are the standard
 * search mechanism of ResourceBundle), then as XML properties
 * files.
 * The filename extension of the XML properties files is assumed
 * to be *.properties.xml
 */
public class ExtendedControl extends ResourceBundle.Control
{
    private static final String FORMAT_XML_SUFFIX = "properties.xml";
    private static final String FORMAT_XML = "java." + FORMAT_XML_SUFFIX;
    private static final List<String> FORMATS;
    static
    {
        // Standard formats first (class, properties), then the XML format.
        List<String> formats = new ArrayList<String>(FORMAT_DEFAULT);
        formats.add(FORMAT_XML);
        FORMATS = Collections.unmodifiableList(formats);
    }

    @Override
    public List<String> getFormats(String baseName)
    {
        return FORMATS;
    }

    @Override
    public ResourceBundle newBundle(String baseName, Locale locale,
                                    String format, ClassLoader loader,
                                    boolean reload)
        throws IllegalAccessException, InstantiationException, IOException
    {
        // Let the superclass handle the standard formats.
        if (!FORMAT_XML.equals(format))
            return super.newBundle(baseName, locale, format, loader, reload);

        String bundleName = toBundleName(baseName, locale);
        String resourceName = toResourceName(bundleName, FORMAT_XML_SUFFIX);
        final URL resourceURL = loader.getResource(resourceName);
        if (resourceURL == null) return null;

        InputStream stream = getResourceInputStream(resourceURL, reload);

        try
        {
            PropertyXMLResourceBundle result = new PropertyXMLResourceBundle();
            result.load(stream);
            return result;
        }
        finally
        {
            stream.close();
        }
    }

    private InputStream getResourceInputStream(final URL resourceURL,
                                               boolean reload)
        throws IOException
    {
        if (!reload) return resourceURL.openStream();

        try
        {
            // This permission has already been checked by
            // ClassLoader.getResource(String), which will return null
            // in case the code has not enough privileges.
            return AccessController.doPrivileged(
                new PrivilegedExceptionAction<InputStream>()
                {
                    public InputStream run() throws IOException
                    {
                        // Bypass the URL cache so that a reload sees fresh content.
                        URLConnection connection = resourceURL.openConnection();
                        connection.setUseCaches(false);
                        return connection.getInputStream();
                    }
                });
        }
        catch (PrivilegedActionException x)
        {
            throw (IOException)x.getCause();
        }
    }

    /**
     * ResourceBundle that loads definitions from an XML properties file.
     */
    public static class PropertyXMLResourceBundle extends ResourceBundle
    {
        private final Properties properties = new Properties();

        public void load(InputStream stream) throws IOException
        {
            properties.loadFromXML(stream);
        }

        protected Object handleGetObject(String key)
        {
            return properties.getProperty(key);
        }

        public Enumeration<String> getKeys()
        {
            final Enumeration<Object> keys = properties.keys();
            return new Enumeration<String>()
            {
                public boolean hasMoreElements()
                {
                    return keys.hasMoreElements();
                }

                public String nextElement()
                {
                    return (String)keys.nextElement();
                }
            };
        }
    }
}
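The same hook can, in principle, also serve plain properties files stored in UTF-8, since JDK 6 added a PropertyResourceBundle(Reader) constructor. What follows is only a sketch under that assumption, not part of the class above: the names Utf8Control and Utf8ControlDemo are invented, the reload flag is ignored for brevity, and the demo uses java.nio.file (JDK 7) and a temporary directory as a stand-in for the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class Utf8Control extends ResourceBundle.Control
{
    private static final Charset UTF_8 = Charset.forName("UTF-8");

    @Override
    public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                    ClassLoader loader, boolean reload)
        throws IllegalAccessException, InstantiationException, IOException
    {
        // Only take over the properties format; classes go through the superclass.
        if (!"java.properties".equals(format))
            return super.newBundle(baseName, locale, format, loader, reload);

        String resourceName = toResourceName(toBundleName(baseName, locale), "properties");
        // Simplification: the reload flag is ignored here.
        InputStream stream = loader.getResourceAsStream(resourceName);
        if (stream == null) return null;
        try
        {
            // Decode the file as UTF-8 instead of ISO-8859-1.
            return new PropertyResourceBundle(new InputStreamReader(stream, UTF_8));
        }
        finally
        {
            stream.close();
        }
    }
}

class Utf8ControlDemo
{
    // "Good morning, world" in Greek, written with source escapes so that the
    // snippet does not depend on the compiler's source encoding.
    static final String GREETING =
        "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1 \u03BA\u03CC\u03C3\u03BC\u03B5";

    static String lookup() throws IOException
    {
        // Hypothetical setup: a temporary directory plays the role of a classpath
        // root that holds a UTF-8 encoded messages.properties.
        Path root = Files.createTempDirectory("bundles");
        Path file = root.resolve("messages.properties");
        Files.write(file, ("hello.world=" + GREETING + "\n").getBytes(Charset.forName("UTF-8")));

        URLClassLoader loader = new URLClassLoader(new URL[]{root.toUri().toURL()});
        try
        {
            ResourceBundle bundle =
                ResourceBundle.getBundle("messages", Locale.ROOT, loader, new Utf8Control());
            return bundle.getString("hello.world");
        }
        finally
        {
            loader.close();
            Files.deleteIfExists(file);
            Files.deleteIfExists(root);
        }
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println(GREETING.equals(lookup()));
    }
}
```

Compared with the XML variant, this keeps the familiar key=value syntax, at the price that the encoding is a convention between the program and the translators rather than something the file declares.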

6 Comments:

  • Yup, it's a lot nicer when you can just use the characters rather than the Unicode escapes. I've been working with JavaCC a lot lately; it handles the escaped characters and can also accept a Reader with the encoding of your choice. Just using a Reader and the real characters is a big readability win.

    By Blogger Tom Copeland, at 05 January, 2007 17:29  

  • Thanks for the nice post!

    By Anonymous free ps3, at 08 September, 2007 15:08  

  • Yep, thanks for the nice post. I've been fighting with exactly the same problem these days and, being limited to JDK 1.5, I'm looking for a different solution.
    At the moment I find the Spring implementation ReloadableResourceBundleMessageSource pretty good, because you can specify which encoding to use to load the .properties file.
    The only problem is that there's no custom tag implementation ready to use with this set of classes.

    By Blogger Giorgio, at 07 February, 2008 10:54  

  • Thanks, Simo!

    antonio signore

    By Blogger antonio signore, at 16 November, 2011 16:10  

  • I have no words for this great post; I gathered such awesome information from it. Thanks to the author.

    By Anonymous html5 audio player, at 30 March, 2012 09:07  

  • Hello Simon!

    To translate .properties files online, it's possible to use the app localization platform POEditor, which makes it easy to edit strings alone or in a team and to automate various aspects of the localization process.

    I highly recommend it if you wish to save precious time while working on l10n projects.

    By Blogger Sonia Krugers, at 13 December, 2016 15:31  
