I was recently trying to serve our site in UTF-8 end to end. It is a Spring application running on Tomcat, using the Velocity templating engine with a MySQL backend.
To complete the loop, I had to do all of the following:
- Update Tomcat character encoding setting
- Add Filter to set the Request object's character encoding
- Update Spring VelocityLayoutViewResolver to set content-type with correct character encoding
- Set character encoding for the database connection
- Set character encoding in the database
Update Tomcat character encoding setting
In the Tomcat server, edit conf/server.xml and set the URIEncoding attribute on the Connector element:
<Connector connectionTimeout="20000" port="80" protocol="HTTP/1.1" redirectPort="8443" URIEncoding="UTF-8"/>
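To see why URIEncoding matters, here is a small standalone Java sketch. It is not Tomcat code, and the class and method names are made up for illustration. It percent-decodes the same query-string value once as UTF-8 and once as ISO-8859-1, which as far as I know was the default on older Tomcat versions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustration only: the charset used to decode a URI changes the resulting text.
class UriDecodingSketch {

    // Percent-decodes a value using the named charset.
    // Wraps the checked exception so callers do not have to declare it.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "caf%C3%A9" is the word "cafe" with an accented e, percent-encoded as UTF-8
        // (the two bytes C3 A9 form the single character U+00E9).
        String raw = "caf%C3%A9";

        String asUtf8 = decode(raw, "UTF-8");        // bytes C3 A9 become one character
        String asLatin1 = decode(raw, "ISO-8859-1"); // bytes C3 A9 become two characters

        System.out.println(asUtf8.length());   // prints 4
        System.out.println(asLatin1.length()); // prints 5
    }
}
```

The second result is the classic garbled text you get when the browser sends UTF-8 but the server decodes the URI with a single-byte charset.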
Add Filter to set the Request object's character encoding
Add the following filter to the web.xml of the web application. If you don't use the Spring framework, create a class implementing javax.servlet.Filter that sets the character encoding on the HttpServletRequest to UTF-8. A filter declaration only takes effect once it is mapped, so a filter-mapping is included as well; the /* URL pattern is an assumption that applies the filter to every request.
<filter>
  <filter-name>Spring character encoding filter</filter-name>
  <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
  <init-param>
    <param-name>forceEncoding</param-name>
    <param-value>true</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>Spring character encoding filter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
Update Spring VelocityLayoutViewResolver to set content-type with correct character encoding
<bean id="viewResolver" class="org.springframework.web.servlet.view.velocity.VelocityLayoutViewResolver">
  <property name="contentType" value="text/html;charset=UTF-8"></property>
</bean>
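A rough Java sketch of what the charset in the content type buys you on the response side. This is a simulation, not Spring or browser code: the class and method names are hypothetical, the parser ignores quoted charset values, and the fallback to ISO-8859-1 when no charset is declared is a simplifying assumption (real browsers may guess differently).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Illustration only: a client decoding UTF-8 bytes with the charset declared in Content-Type.
class ContentTypeSketch {

    // Extracts the charset from a value such as "text/html;charset=UTF-8".
    // Assumption: when no charset parameter is present, fall back to ISO-8859-1.
    static Charset charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return Charset.forName(p.substring("charset=".length()).trim());
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Simulates a server writing the text as UTF-8 bytes and a client
    // decoding those bytes with whatever charset the header declares.
    static String roundTrip(String text, String contentType) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, charsetOf(contentType));
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an accented e

        // Declared charset matches the bytes: the text survives.
        System.out.println(original.equals(roundTrip(original, "text/html;charset=UTF-8"))); // prints true

        // No charset declared: the two UTF-8 bytes of the accented e are read as two characters.
        System.out.println(original.equals(roundTrip(original, "text/html"))); // prints false
    }
}
```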
Set character encoding for the database connection
jdbc:mysql://${database.host}:3306/${database.name}?maxQuerySizeToLog=10000&dumpQueriesOnException=true&useUnicode=true&characterEncoding=UTF-8&includeInnodbStatusInDeadlockExceptions=true
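Of all the parameters in that URL, only useUnicode and characterEncoding relate to the encoding; the rest are logging and debugging options. If you assemble the URL in code rather than in a properties file, a minimal sketch could look like this. The class name, method names, and the localhost/mydb values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: builds a MySQL JDBC URL carrying the two encoding parameters.
class JdbcUrlSketch {

    // The two connection parameters from the URL above that control character encoding.
    static Map<String, String> unicodeParams() {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("useUnicode", "true");
        params.put("characterEncoding", "UTF-8");
        return params;
    }

    // Joins host, database name, and parameters into a JDBC URL on port 3306.
    // Values are appended as-is; this sketch does no escaping.
    static String mysqlUrl(String host, String database, Map<String, String> params) {
        StringBuilder url = new StringBuilder("jdbc:mysql://")
                .append(host).append(":3306/").append(database);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator).append(entry.getKey()).append('=').append(entry.getValue());
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // prints jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=UTF-8
        System.out.println(mysqlUrl("localhost", "mydb", unicodeParams()));
    }
}
```

The resulting string can be passed to DriverManager.getConnection, provided the MySQL driver is on the classpath. If the URL lives inside an XML file instead, each & has to be written as &amp;.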
Set character encoding in the database
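ALTER TABLE tbl_name DEFAULT CHARACTER SET 'utf8';
Note that this only changes the table's default, which applies to columns added later. To convert the existing character columns as well, use ALTER TABLE tbl_name CONVERT TO CHARACTER SET utf8;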
I don’t quite agree with your assessment here. Although I haven’t tested this with the Spring libraries, setting the JVM file encoding parameter to UTF-8, specifically -Dfile.encoding=UTF-8, should do the trick. Of course, there are exceptions; for example, in JSPs, UTF-8 should be specified for the page encoding.
Obviously, databases would need specific attention, but I’m sure there is a JVM equivalent option to deal with character encoding. After all, character encoding is fundamental to software.
In case you test my suggestion out, leave your feedback here.
Comment by Getting Around — May 10, 2009 @ 12:46 am
[…] This is actually an update and extension to Learning Monk’s nice article “How many palces [sic] do you have to set Character encoding for a site“. […]
Pingback by Unicode Web Apps « Ted Young — August 8, 2011 @ 6:03 pm