Using Unicode Literals in Python

By: Guido

In Python source code, specific Unicode code points can be written using the \u escape sequence, which is followed by four hex digits giving the code point. The \U escape sequence is similar, but expects eight hex digits, not four:

>>> s = "a\xac\u1234\u20ac\U00008000"
... #     ^^^^ two-digit hex escape
... #         ^^^^^^ four-digit Unicode escape
... #                     ^^^^^^^^^^ eight-digit Unicode escape
>>> for c in s: print(ord(c), end=" ")
...
97 172 4660 8364 32768

Using escape sequences for code points greater than 127 is fine in small doses, but becomes an annoyance if you’re using many accented characters, as you would in a program with messages in French or some other accent-using language. You can also assemble strings using the chr() built-in function, but this is even more tedious.
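
To make the comparison concrete, here is a minimal sketch that builds the same five-character string once with escape sequences and once with chr(). The variable names escaped and assembled are only illustrative and do not come from the original text.

```python
# The same five-character string, written two ways.

# 1. With escape sequences inside a single literal.
escaped = "a\xac\u1234\u20ac\U00008000"

# 2. Assembled one code point at a time with chr().
assembled = chr(97) + chr(0xAC) + chr(0x1234) + chr(0x20AC) + chr(0x8000)

print(escaped == assembled)             # True
print([hex(ord(c)) for c in escaped])   # ['0x61', '0xac', '0x1234', '0x20ac', '0x8000']
```

Both forms produce identical strings. The chr() version simply needs one call per character, which is why it gets tedious quickly.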

Ideally, you’d want to be able to write literals in your language’s natural encoding. You could then edit Python source code with your favorite editor, which would display the accented characters naturally, and have the right characters used at runtime.

Python supports writing source code in UTF-8 by default, but you can use almost any encoding if you declare the encoding being used. This is done by including a special comment as either the first or second line of the source file:

#!/usr/bin/env python
# -*- coding: latin-1 -*-

u = 'abcdé'
print(ord(u[-1]))

The syntax is inspired by Emacs’s notation for specifying variables local to a file. Emacs supports many different variables, but Python only supports ‘coding’. The -*- symbols indicate to Emacs that the comment is special; they have no significance to Python but are a convention. Python looks for coding: name or coding=name in the comment.

If you don’t include such a comment, the default encoding used will be UTF-8.
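
One way to check which encoding Python will pick for a given file is the standard library's tokenize.detect_encoding() function, which reads the first two lines and looks for the coding comment. The sketch below is illustrative only: the helper name declared_encoding and the two in-memory sample sources are assumptions made for this example, not part of the original tutorial.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding name Python would use to decode this source.

    Hypothetical helper for illustration; it wraps the real
    tokenize.detect_encoding(), which expects a readline callable
    that yields bytes.
    """
    encoding, _lines_read = tokenize.detect_encoding(
        io.BytesIO(source_bytes).readline
    )
    return encoding


# A source file saved as Latin-1 with a coding comment on its second line.
with_comment = (
    "#!/usr/bin/env python\n"
    "# -*- coding: latin-1 -*-\n"
    "\n"
    "u = 'abcdé'\n"
).encode("latin-1")

# A source file saved as UTF-8 with no coding comment at all.
without_comment = "u = 'abcdé'\n".encode("utf-8")

print(declared_encoding(with_comment))     # iso-8859-1 (normalized name for latin-1)
print(declared_encoding(without_comment))  # utf-8

# Decoding with the detected encoding recovers the accented character.
text = with_comment.decode(declared_encoding(with_comment))
print(ord(text.rstrip()[-2]))              # 233, the code point of é
```

Note that detect_encoding() reports a normalized codec name, so latin-1 comes back as iso-8859-1; both names refer to the same codec.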

