JAVA CHARACTER ENCODING PART 2
In one of my development phases, the encoding type turned out to be a big problem: foreign characters that were UTF-8 showed up as undefined characters.
The problem was on a JSP page, and the JSP had the structure below:
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF8" %>
As you can see above, the document was saved as UTF8 and the charset of the content displayed on screen was also UTF-8.
If you want to type special characters directly into your JSP page, for example in variable names, you should set pageEncoding to the desired type.
1 Problem and Solution
Suppose you want to use special characters in the JSP code but the pageEncoding does not permit it. What do you do?
My solution: if the special characters are only for display in the page, I would use the Unicode escapes of the characters together with the appropriate encoding defined for out or in. Here "in" means reading a value correctly from the query string or from any other external source.
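To make the out and in directions concrete, here is a minimal sketch in plain Java (no servlet container). The class and method names are made up for this illustration. It assumes Java 10 or later for the URLDecoder.decode(String, Charset) overload, and a JDK that ships the ISO-8859-8 charset. It encodes the escape \u05E0 for output and decodes the same letter from a percent-encoded query string value.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class EscapeRoundTrip {

    // "Out": turn text into the bytes a writer bound to this charset would send.
    static byte[] encodeForOutput(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // "In": decode a percent-encoded query string value with an explicit charset.
    static String decodeFromQuery(String rawValue, Charset charset) {
        return URLDecoder.decode(rawValue, charset);
    }

    public static void main(String[] args) {
        String nun = "\u05E0"; // Hebrew letter nun, written as a Unicode escape

        byte[] out = encodeForOutput(nun, "ISO-8859-8");
        System.out.printf("ISO-8859-8 byte: 0x%02X%n", out[0] & 0xFF);

        // %D7%A0 is the UTF-8 percent-encoding of the same letter.
        String in = decodeFromQuery("%D7%A0", StandardCharsets.UTF_8);
        System.out.println("decoded value equals the escape: " + in.equals(nun));
    }
}
```

The source file itself stays pure ASCII, so the result does not depend on the pageEncoding or on how the file was saved.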
As an example, I will output a figure and a Hebrew character to the page, then a Turkish character to the console.
I have noticed that most character encoding problems do not come from the JSP encoding; many times, changing only the charset of the pages did not give me the right result. Other causes I have run into:
1 ) The server language parameter, in one of my problems
2 ) Some explicit encodings are hidden in your 3rd party jars, such as
Locale explicitylyLocale = new Locale("...");
...
3 ) Some problems are caused by the response character encoding. In my experience, setting one response character encoding affects the whole page. The same goes for the request character encoding, of course.
4 ) Also the OS language.
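When chasing the causes listed above, it helps to print what the JVM actually uses. The following is only a sketch with a hypothetical class name. It reads standard system properties plus the default locale and charset. Inside a JSP, request.getCharacterEncoding() and response.getCharacterEncoding() from the Servlet API could be logged in the same way.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical diagnostic class, not part of any library.
class EncodingReport {

    // Collect the JVM-level settings that commonly influence character encoding.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        // Server parameter, e.g. set with -Dfile.encoding=... at startup.
        report.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        // Charset the JVM falls back to when none is given explicitly.
        report.put("defaultCharset", Charset.defaultCharset().name());
        // Locale that library code picks up unless it creates its own.
        report.put("defaultLocale", Locale.getDefault().toString());
        // Language and OS as seen by the JVM.
        report.put("user.language", String.valueOf(System.getProperty("user.language")));
        report.put("os.name", String.valueOf(System.getProperty("os.name")));
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```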
In this small example
<%@ page contentType="text/html; charset=ISO-8859-9" pageEncoding="ISO-8859-9" %>
The page directive above is set up to display Turkish characters.
<%
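// Override the response charset: Hebrew instead of the Turkish charset declared above.
response.setCharacterEncoding("ISO-8859-8");
out.println("\u2708");        // figure (airplane symbol), written to the page
out.println("\u05E0");        // Hebrew letter nun, written to the page
// Turkish letter G with breve, written to the server console.
// (The original \u022A is not a Turkish letter and is not in ISO-8859-9.)
System.out.println("\u011E");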
%>
However, response.setCharacterEncoding is set to the Hebrew charset.
The first character is the figure, the second is the Hebrew character, and the third is the Turkish character.
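Whether a character survives a given charset can be checked up front. Below is a small sketch, assuming the JDK ships ISO-8859-8 and ISO-8859-9 and using the Turkish letter Ğ (U+011E) as the third sample. The class and method names are invented for this illustration.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of any library.
class CanEncodeCheck {

    // True if every character of the text can be represented in the charset.
    static boolean representable(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String[] samples = {"\u2708", "\u05E0", "\u011E"}; // airplane, Hebrew nun, Turkish G with breve
        String[] charsets = {"ISO-8859-8", "ISO-8859-9", "UTF-8"};
        for (String charset : charsets) {
            for (String sample : samples) {
                System.out.printf("%s can encode U+%04X: %b%n",
                        charset, (int) sample.charAt(0), representable(sample, charset));
            }
        }
    }
}
```

With these charsets, ISO-8859-8 can encode the Hebrew letter but not the airplane or the Turkish letter, and ISO-8859-9 can encode only the Turkish letter. A character that cannot be encoded is normally replaced with ? in the output.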
What is printed on the console depends on the server parameter. For example, Tomcat can be started with -Dfile.encoding=ISO-8859-9, ISO-8859-8, Cp1252, windows-1256, etc. for different languages.
I used ISO-8859-8 to change the server language, which affects the encoding used for console output.
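The effect of the console encoding can be imitated without restarting the server: System.out is a PrintStream, and a PrintStream can be given an explicit charset. Here is a minimal sketch, with a hypothetical class name and an in-memory stream standing in for the console. It assumes the JDK ships both ISO-8859-8 and ISO-8859-9.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of any library.
class ConsoleEncodingDemo {

    // Print the text through a PrintStream bound to the given charset
    // and return the raw bytes that would reach the console.
    static byte[] printWith(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream console = new PrintStream(sink, true, charsetName)) {
            console.print(text);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String turkish = "\u011E"; // Turkish letter G with breve
        for (String charset : new String[] {"ISO-8859-9", "ISO-8859-8"}) {
            byte[] bytes = printWith(turkish, charset);
            System.out.printf("%s -> 0x%02X%n", charset, bytes[0] & 0xFF);
        }
    }
}
```

With the Turkish charset the letter becomes the single byte 0xD0. With the Hebrew charset it cannot be represented and comes out as 0x3F, a question mark. This is the same symptom seen when the console encoding does not match the text.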