UTF-8 text is garbled when form is posted as multipart/form-data
I had the same problem using Apache Commons FileUpload. I never found out what causes it, especially since I already set UTF-8 in the following places:
1. HTML meta tag
2. Form accept-charset attribute
3. Tomcat filter on every request that sets the "UTF-8" encoding
My solution was to explicitly convert the strings from ISO-8859-1 (or whatever the default encoding of your platform is) to UTF-8:
new String(s.getBytes("ISO-8859-1"), "UTF-8");
Hope that helps.
Edit: starting with Java 7 you can also use the following:
new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
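For anyone who wants to try the conversion in isolation, here is a minimal sketch of it as a helper. The class name EncodingRepair and the method name reDecodeAsUtf8 are made up for illustration and are not part of any library. The main method simulates the bug by decoding UTF-8 bytes as ISO-8859-1 and then repairs the result.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are illustrative only.
class EncodingRepair {

    // Undo a wrong ISO-8859-1 decode: recover the original bytes,
    // then decode those bytes as UTF-8.
    static String reDecodeAsUtf8(String garbled) {
        if (garbled == null) {
            return null;
        }
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Gruesse" with u-umlaut and sharp s, written with escapes so the
        // source file encoding does not matter.
        String original = "Gr\u00fc\u00dfe";

        // Simulate the bug: UTF-8 bytes read as if they were ISO-8859-1.
        String garbled = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        String repaired = reDecodeAsUtf8(garbled);
        System.out.println("garbled equals original:  " + garbled.equals(original));
        System.out.println("repaired equals original: " + repaired.equals(original));
    }
}
```

Note that this only helps for strings that really were decoded with the wrong charset. Applying it to a string that was already decoded correctly will damage any non-ASCII characters in it.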
I had the same problem and it turned out that in addition to specifying the encoding in the Filter
request.setCharacterEncoding("UTF-8");
response.setCharacterEncoding("UTF-8");
it is necessary to add "accept-charset" to the form
<form method="post" enctype="multipart/form-data" accept-charset="UTF-8">
and run the JVM with
-Dfile.encoding=UTF-8
The HTML meta tag is not necessary if you send it in the HTTP header using response.setCharacterEncoding().
Just use the Apache Commons FileUpload library.
Add URIEncoding="UTF-8"
to Tomcat's connector, and use FileItem.getString("UTF-8") instead of FileItem.getString() without a charset specified.
Hope this helps.
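To see why the explicit charset matters, here is a rough sketch using plain byte arrays instead of a real FileItem, so it runs without the library. The class FieldDecoding and its decode method are hypothetical stand-ins. The sketch assumes that the no-argument getString() falls back to ISO-8859-1 when the part carries no charset, which I believe is the Commons FileUpload default but have not verified for every version.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for decoding a multipart field; not the real FileItem API.
class FieldDecoding {

    // Decode the raw bytes of a form field with the named charset,
    // roughly what a getString(charsetName) style call does.
    static String decode(byte[] fieldBytes, String charsetName) {
        return new String(fieldBytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The browser sends "e with acute accent" as two UTF-8 bytes.
        String typedByUser = "\u00e9";
        byte[] fieldBytes = typedByUser.getBytes(StandardCharsets.UTF_8);

        // Explicit charset: the two bytes become one character again.
        String explicit = decode(fieldBytes, "UTF-8");

        // Assumed fallback charset: each byte becomes its own character.
        String fallback = decode(fieldBytes, "ISO-8859-1");

        System.out.println("explicit length: " + explicit.length());
        System.out.println("fallback length: " + fallback.length());
    }
}
```

With the explicit charset the field comes back as the single character the user typed. With the assumed fallback it comes back as two characters, which is the garbling described in the question.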
I got stuck with this problem and found that it was the order of the call to
request.setCharacterEncoding("UTF-8");
that was causing the problem. It has to be called before any call to request.getParameter(), so I made a special filter to use at the top of my filter chain.
https://rogerkeays.com/servletrequest-setcharactercoding-ignored