JSP page encoding
I recently had a problem with a web application using JSP pages that was not handling non-ASCII characters correctly: submitted text containing characters such as ù or è resulted in garbage characters stored in the database and consequently garbage displayed by the browser.
The application was developed on Linux, using an IDE configured to save files using UTF-8 encoding, and deployed on a Linux box.
The database was configured to use UTF-8 encoding.
Every page contained the tag
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> in the <head> section.
I thought that, given these premises, there was no chance of misbehavior with respect to character encoding, but I was wrong.
The browser kept telling me that the page it was displaying had ISO-8859-1 encoding.
It turned out that the JSP specification says that if the page encoding of a JSP page is not explicitly declared, then ISO-8859-1 should be used (!).
The Jetty servlet container was correctly setting the HTTP header as:
Content-Type: text/html; charset=ISO-8859-1, following the specification. Browsers give the charset in the HTTP header precedence over the <meta> tag, so the tag had no effect.
The fix is simple: just add this to web.xml:
<jsp-config>
  <jsp-property-group>
    <url-pattern>*.jsp</url-pattern>
    <page-encoding>UTF-8</page-encoding>
  </jsp-property-group>
</jsp-config>
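To illustrate the kind of garbage I was seeing, here is a minimal Java sketch of what happens when UTF-8 bytes are decoded as ISO-8859-1. It is not code from the application: the class and method names are hypothetical, and the exact path the bytes took through the browser, the container and the database may have differed. It only shows the general mechanism of the mismatch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original application.
class MojibakeDemo {

    // Encode the text as UTF-8, then decode those bytes as ISO-8859-1.
    // Each non-ASCII character becomes two garbage characters.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Undo the damage: recover the original bytes via ISO-8859-1,
    // then decode them with the encoding they were really written in.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String original = "\u00f9 \u00e8"; // u-grave, space, e-grave: 3 characters
        String garbled = misdecode(original);

        System.out.println(original.length());               // 3
        System.out.println(garbled.length());                // 5
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

The repair step only works because ISO-8859-1 maps every byte value to a character, so no information is lost in the wrong decoding. Declaring the page encoding in web.xml avoids the problem in the first place.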
It would be interesting to know why the JSP expert group did not pick UTF-8 as the default character encoding.




2 Comments:
The problem with the JSP spec is that it is very Euro/America centric. I noticed even the latest specification still says ISO-8859-1 instead of UTF-8. The XML specification names UTF-8 and UTF-16 as the only two encodings that are required to be supported. This covers the vast majority of character sets.
It would be nice if the Servlet spec took a wider world view.
By
David Carver, at 10 December, 2009 04:53
You star, thanks. You just saved me going through all my JSPs and adding <%@page contentType="text/html; charset=UTF-8" %> to each one to fix my UTF-8 character encoding issue.
By
Anonymous, at 25 September, 2012 11:14