How does the magic comment ( # Encoding: utf-8 ) in ruby​​ work?

This magic comment tells Ruby the source encoding of the currently parsed file. As Ruby 1.9.x by default assumes US_ASCII you have tell the interpreter what encoding your source code is in if you use non-ASCII characters (like umlauts or accented characters).

The comment has to be the first line of the file (or below the shebang if used) to be recognized.

There are other encoding settings. See this question for more information.

Since version 2.0, Ruby assumes UTF-8 encoding of the source file by default. As such, this magic encoding comment has become a rarer sight in the wild if you write your source code in UTF-8 anyway.


As you noted, magic comments are a special preprocessing construct. They must be defined at the top of the file (except, if there is already a unix shebang at the top). As of Ruby 2.3 there are three kinds of magic comments:

  • Encoding comment: See other answers. Must always be the first magic comment. Must be ASCII compatible. Sets the source encoding, so you will run into problems if the real encoding of the file does not match the specified encoding
  • frozen_string_literal: true: Freezes all string literals in the current file
  • warn_indent: true: Activates indentation warnings for the current file

More info: Magic Instructions


Ruby interpreter instructions at the top of the source file - this is called magic comment. Before processing your source code interpreter reads this line and sets proper encoding. It's quite common for interpreted languages I believe. At least Python uses the same approach.

You can specify encoding in a number of different ways (some of them are recognized by editors):

# encoding: UTF-8
# coding: UTF-8
# -*- coding: UTF-8 -*-

You can read some interesting stuff about source encoding in this article.

The only thing I'm aware of that has similar construction is shebang, but it is related to Unix shells in general and is not Ruby-specific.

magic_comments defined in ruby/ruby

Tags:

Ruby

Encoding