How to print strings separated by TAB in bash?
the whitespace between the two is actually 5 spaces.
No, it's not. Not in the output of echo or printf.
$ echo -e 'foo\tbar' | od -c
0000000   f   o   o  \t   b   a   r  \n
0000010
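If reading a dump by eye is error-prone, the tabs can be counted instead. This is only a minimal sketch; count_tabs is a hypothetical helper name, not something from the question or from any standard tool:

```shell
# count_tabs is a hypothetical helper: it counts the TAB bytes on stdin.
count_tabs() {
    # Delete everything except tabs, then count the bytes that remain.
    # The final tr strips the padding some wc implementations add.
    tr -cd '\t' | wc -c | tr -d ' '
}

# Both commands emit one real tab between foo and bar, so each prints 1.
printf 'foo\tbar\n' | count_tabs
printf '%s\t%s\n' foo bar | count_tabs
```

The sketch uses printf rather than echo -e because not every shell's echo understands -e.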
What is the correct way to force tab being printed as tab, so I can select the output and copy it to somewhere else, with tabs?
This is a different issue. It's not about the shell but the terminal emulator, which converts the tabs to spaces on output. Many, but not all of them do that.
It may be easier to redirect the output with tabs to a file and copy it from there, or to run unexpand on the output to convert spaces back to tabs. (unexpand can't know which whitespace was a tab to begin with, though, and will convert every run of spaces it can to tabs.) Which of these fits depends, of course, on what exactly you need to do with the output.
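A rough illustration of that round trip, assuming the default tab stops every 8 columns. Here expand stands in for a terminal that turns tabs into spaces, and the variable names are made up for the example:

```shell
# Real tab between the fields, as printf emits it.
with_tab=$(printf 'foo\tbar')

# expand pads to the next tab stop, which is roughly what you get back
# when selecting the line in a terminal that converts tabs to spaces.
with_spaces=$(printf '%s\n' "$with_tab" | expand)

# unexpand -a turns runs of blanks that end at a tab stop back into tabs.
restored=$(printf '%s\n' "$with_spaces" | unexpand -a)

printf '%s\n' "$with_spaces" | od -c
printf '%s\n' "$restored" | od -c

# Caveat: spaces that were typed as spaces get the same treatment.
typed=$(printf 'foo     bar\n' | unexpand -a)
if [ "$typed" = "$with_tab" ]; then
    echo 'typed spaces became a tab too'
fi
```

Whether that caveat matters depends on the data: for columns that were always tab-separated it is harmless, for free text it may not be.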
As ilkkachu said, this isn't an issue with bash but with the terminal emulator, which converts tabs to spaces on output.
Checking different terminals, putty, xterm, and konsole convert tabs to spaces, while urxvt and gnome-terminal do not. So, another solution is to switch terminals.
In printf '%s\t%s\n' foo bar, printf does output foo<TAB>bar<LF>.
f, o, b, a and r are single-width graphical characters.
Upon receiving those characters, the terminal will display a corresponding glyph and move the cursor one column to the right, unless it has already reached the right edge of the screen (or of the paper, in original tele-typewriters), in which case it may feed a line and return to the left edge of the screen (wrap) or just discard the character, depending on the terminal and how it's been configured.
<Tab> and <LF> are two control characters. <LF> (aka newline) is the line delimiter in Unix text, but for terminals, it just feeds a line (moves the cursor one position down). So the terminal driver in the kernel will actually translate it to <CR> (return to the left edge of the screen), <LF> (cursor down) (stty onlcr, generally on by default).
<Tab> tells the terminal to move the cursor to the next tab stop (tab stops are 8 positions apart by default on most terminals, but can be configured to be set anywhere) without filling the gap with blanks.
So if those characters are sent to a terminal with tab stops every 8 columns whilst the cursor is at the start of an empty line, that will result in:
foo     bar
printed on the screen at that line. If they are sent whilst the cursor is in third position in a line that contains xxxxyyyyzzzz, that will result in:
xxfooyyybarz
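The column arithmetic behind both results can be sketched as follows. next_tab_stop is a hypothetical helper, columns are counted from 0, and evenly spaced tab stops are assumed:

```shell
# next_tab_stop COLUMN [WIDTH]: hypothetical helper. Columns are counted
# from 0; WIDTH is the spacing of the tab stops (8 unless given).
next_tab_stop() {
    col=$1
    width=${2:-8}
    echo $(( (col / width + 1) * width ))
}

next_tab_stop 3    # cursor just after "foo" on an empty line: prints 8
next_tab_stop 5    # cursor just after "xxfoo": prints 8
next_tab_stop 8    # already on a stop, the next one is a full width away: prints 16
```

In the first case bar starts at column 8 after a 5-column gap; in the second it also starts at column 8, overwriting zzz while yyy in columns 5 to 7 is left untouched.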
On terminals that don't support tabulation, the terminal driver can be configured to translate those tabs to sequences of spaces (stty tab3).
The SPC character, in original tele-typewriters, would move the cursor to the right, while backspace (\b) would move it to the left. Now in modern terminals, SPC moves to the right and also erases (writes a space character, as you'd expect). So the counterpart of \b had to be something newer than ASCII. On most modern terminals, it's actually a sequence of characters: <Esc>, [, C.
There are more escape sequences to move n characters left, right, up or down, or to any position on the screen. There are other escape sequences to erase (fill with blanks) parts of lines or regions of the screen, etc.
Those sequences are typically used by visual applications like vi, lynx, mutt or dialog, where text is written at arbitrary positions on the screen.
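As a small illustration of such sequences, here is a sketch that emits the cursor-right and cursor-left ones. The helper names are invented for the example, and it assumes a terminal that understands the common ANSI/ECMA-48 sequences (most do):

```shell
# Hypothetical helpers emitting ANSI/ECMA-48 cursor-movement sequences.
# \033 is the octal escape for <Esc>, which POSIX printf understands.
cursor_right() { printf '\033[%dC' "${1:-1}"; }
cursor_left()  { printf '\033[%dD' "${1:-1}"; }

# Show the bytes instead of sending them to the terminal.
cursor_right 1 | od -c

# On a terminal this writes "ab", steps back two columns and overwrites
# the "a" with "X", leaving "Xb" on the screen.
printf 'ab%sX\n' "$(cursor_left 2)"
```

Real applications normally get these sequences from the terminfo database (for instance via tput) rather than hard-coding them.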
Now, all X11 terminal emulators and a few other non-X11 ones like GNU screen let you select areas of the screen for copy-paste. When you select a part of what you see in the vi editor, you don't want to copy all the escape sequences that have been used to produce that output. You want to select the text you see there.
For instance if you run:
printf 'abC\rAC\bB\t\e[C\b\bD\n'
This simulates an editor session where you enter abC, go back to the beginning of the line, overwrite ab with AC, step back one column and overwrite that C with B, move to the next tab stop, then one more column to the right, then two columns to the left, then enter D.
You see:
ABC    D
That is, ABC, a 4-column gap and D.
If you select that with the mouse in xterm or putty, they will store in the selection ABC, 4 space characters and D, not abC<CR>AC<BS>B<Tab><Esc>[C<BS><BS>D.
What ends up in the selection is what has been sent by printf, but post-processed by both the terminal driver and the terminal emulator.
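To make that post-processing concrete, here is a toy model of a single screen row. It is only a sketch: render_line is a hypothetical helper, and it assumes tab stops every 8 columns, an initially empty row, ASCII input, and only the control characters used in this answer (<CR>, <BS>, <Tab>, <LF> and <Esc>[C). A real terminal handles far more.

```shell
# render_line: hypothetical helper that replays one line of terminal output
# and prints the characters left on that screen row.
render_line() {
    # od turns each input byte into a decimal number that awk can inspect.
    od -An -v -tu1 | awk '
        function step(c) {
            if (esc == 1) { esc = (c == 91) ? 2 : 0; return }     # "[" after <Esc>
            if (esc == 2) { if (c == 67) col++; esc = 0; return } # "C": one column right
            if (c == 27) { esc = 1; return }                      # <Esc>
            if (c == 13) { col = 0; return }                      # <CR>
            if (c == 8)  { if (col > 0) col--; return }           # <BS>
            if (c == 9)  { col = (int(col / 8) + 1) * 8; return } # <Tab>
            if (c == 10) { return }                               # <LF>
            cell[col++] = sprintf("%c", c)                        # printable: overwrite
            if (col > width) width = col
        }
        { for (i = 1; i <= NF; i++) step($i + 0) }
        END {
            # Cells that were skipped over but never written show as blanks.
            for (i = 0; i < width; i++)
                line = line ((i in cell) ? cell[i] : " ")
            print line
        }'
}

# Prints ABC, four spaces and D.
printf 'abC\rAC\bB\t\033[C\b\bD\n' | render_line
```

The model stores only what ends up in each cell, which is why the tab and the cursor movements come out as plain blanks. That is the same information a terminal has left when it builds the selection from its screen contents.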
For other kinds of transformation, see <U+0065><U+0301> (e followed by a combining acute accent) being changed to <U+00E9> (é, the pre-composed form) by xterm.
Or the output of echo abc, which ends up being translated to ABC by the terminal driver before being sent to the terminal after a stty olcuc.
Now, <Tab>, like <LF>, is one of those few control characters that are actually sometimes found in text files (also <CR> in MSDOS text files, and sometimes <FF> for page breaks).
So some terminal emulators do choose to copy them into the copy-paste buffers when possible, to preserve them (that's generally not the case for <CR> or <LF>, though).
For instance, in VTE-based terminals like gnome-terminal, you may see that, when you select the output of printf 'a\tb\n' on an empty line, gnome-terminal actually stores a\tb in the X11 selection instead of a, 7 spaces and b.
But for the output of printf 'a\t\bb\n', it stores a, 6 spaces and b, and for printf 'a\r\tb\n', a, 7 spaces and b.
There are other cases where terminals will try to copy the actual input, like when you select two lines after running printf 'a \nb\n'
and that invisible trailing space is preserved. Or when a selection spanning two lines doesn't include an LF character because the two lines result from wrapping at the right margin.
Now, if you want to store the output of printf in the X11 CLIPBOARD selection, it's best to do it directly, like with:
printf 'foo\tbar\n' | xclip -sel c
Note that when you paste that in xterm or most other terminals, the terminal actually replaces that \n with \r, because that's the character it sends when you press Enter (and the terminal driver may translate it back to \n).