How do I split a multi-line string into multiple lines?
Use inputString.splitlines().
Why splitlines is better
splitlines handles newlines properly, unlike split.
It can also optionally keep the line-break characters in the split result when called with a True argument, which is useful in some specific scenarios.
Why you should NOT use split("\n")
Using split creates very confusing bugs when sharing files across operating systems.
\n in Python represents a Unix line break (ASCII decimal code 10), independently of the OS where you run it. However, the line-break representation used in text files is OS-dependent.
On Windows, a line break is two characters, CR and LF (ASCII decimal codes 13 and 10, \r and \n), while on modern Unix (Mac OS X, Linux, Android), it's the single character LF.
print works correctly even if you have a string with line endings that don't match your platform:
>>> print(" a \n b \r\n c ")
a
b
c
However, explicitly splitting on "\n" leaves the "\r" of any Windows-style line ending in the result:
>>> " a \n b \r\n c ".split("\n")
[' a ', ' b \r', ' c ']
Even if you use os.linesep, it will only split according to the newline separator on your platform, and will fail if you're processing text created on other platforms, or text with a bare \n (the output below is from Windows, where os.linesep is "\r\n"):
>>> import os
>>> " a \n b \r\n c ".split(os.linesep)
[' a \n b ', ' c ']
splitlines solves all these problems:
>>> " a \n b \r\n c ".splitlines()
[' a ', ' b ', ' c ']
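As a side-by-side check, the sketch below runs both approaches on the same mixed-ending string used above. It also shows one extra difference that is easy to miss: a trailing newline produces an empty last element with split("\n") but not with splitlines(). The variable names are only illustrative.

```python
text = " a \n b \r\n c "

# Splitting on "\n" keeps the "\r" from the Windows-style ending.
by_newline = text.split("\n")

# splitlines() treats "\n", "\r\n" and "\r" as line boundaries.
by_splitlines = text.splitlines()

print(by_newline)     # [' a ', ' b \r', ' c ']
print(by_splitlines)  # [' a ', ' b ', ' c ']

# A trailing newline also behaves differently.
trailing_split = "a\nb\n".split("\n")
trailing_splitlines = "a\nb\n".splitlines()

print(trailing_split)       # ['a', 'b', '']
print(trailing_splitlines)  # ['a', 'b']
```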
Reading files in text mode partially mitigates the newline representation problem, as it converts Python's \n into the platform's newline representation.
However, in Python 2, text mode only makes a difference on Windows. On Unix systems, files are effectively opened in binary mode, so using split('\n') on a Unix system with a Windows file will lead to undesired behavior. This can also happen when transferring files over the network.
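To make the cross-platform case concrete, here is a minimal sketch that writes a small file with Windows (CRLF) line endings, reads it back in binary mode so the bytes arrive untouched on any platform, and compares the two splitting methods. The file contents and the variable names are made up for illustration, and the data is assumed to be plain ASCII.

```python
import os
import tempfile

# Hypothetical sample: a file written with Windows (CRLF) line endings.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"Line 1\r\nLine 2\r\nLine 3")
    path = tmp.name

try:
    # Binary mode returns the bytes unchanged, whatever the platform.
    with open(path, "rb") as handle:
        raw = handle.read()
finally:
    os.remove(path)

content = raw.decode("ascii")

naive = content.split("\n")    # each "\r" stays attached to its line
robust = content.splitlines()  # line endings are removed cleanly

print(naive)   # ['Line 1\r', 'Line 2\r', 'Line 3']
print(robust)  # ['Line 1', 'Line 2', 'Line 3']
```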
Like the others said:
inputString.split('\n') # --> ['Line 1', 'Line 2', 'Line 3']
This is identical to the above, but the string module's functions are deprecated (and were removed in Python 3) and should be avoided:
import string
string.split(inputString, '\n') # --> ['Line 1', 'Line 2', 'Line 3']
Alternatively, if you want each line to include the break sequence (CR, LF, CRLF), use the splitlines method with a True argument:
inputString.splitlines(True) # --> ['Line 1\n', 'Line 2\n', 'Line 3']
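One situation where keeping the break sequences helps is when the text has to be reassembled exactly as it was. The sketch below assumes a sample string with mixed endings (input_string is a made-up name) and checks that joining the pieces restores the original.

```python
# Hypothetical input mixing LF and CRLF endings.
input_string = "Line 1\nLine 2\r\nLine 3"

# Passing True keeps each line's own break sequence.
with_endings = input_string.splitlines(True)
print(with_endings)  # ['Line 1\n', 'Line 2\r\n', 'Line 3']

# Because nothing was dropped, joining the pieces restores the input.
rebuilt = "".join(with_endings)
print(rebuilt == input_string)  # True
```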
inputString.splitlines()
This will give you a list with one item per line; the splitlines() method is designed to split each line into a list element.