Why does Python say this Netscape cookie file isn't valid?

I see nothing in your example code or copy of the cookies.txt file that is obviously wrong.

I've checked the source code for the MozillaCookieJar._really_load method, which throws the exception that you see.

The first thing this method does is read the first line of the file you specified (using f.readline()) and use re.search to look for the regular expression pattern "#( Netscape)? HTTP Cookie File". This is the check that fails for your file.

It certainly looks like your cookies.txt would match that format, so the error you see is quite surprising.

Note that your file is opened with a simple open(filename) call earlier on, so it'll be opened in text mode with universal line ending support, meaning it doesn't matter that you are running this on Windows. The code will see \n newline terminated strings, regardless of what newline convention was used in the file itself.
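To illustrate the newline point, here is a minimal sketch that writes a header line with Windows-style \r\n endings as raw bytes and reads it back in default text mode. The temporary file is only a stand-in for your cookies.txt.

```python
import os
import tempfile

# Write a header line with Windows-style CRLF endings, as raw bytes.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"# Netscape HTTP Cookie File\r\n")
    path = tmp.name

try:
    # Default text mode (newline=None) translates \r\n to \n on read.
    with open(path) as f:
        first_line = f.readline()
finally:
    os.remove(path)

print(repr(first_line))  # '# Netscape HTTP Cookie File\n'
```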

What I'd do in this case is triple-check that your file's first line is really correct. It needs to contain either "# HTTP Cookie File" or "# Netscape HTTP Cookie File" (single spaces only, no tabs, between the words, with matching capitalisation). Test this at the Python prompt:

>>> f = open('cookies.txt')
>>> line = f.readline()
>>> line
'# Netscape HTTP Cookie File\n'
>>> import re
>>> re.search("#( Netscape)? HTTP Cookie File", line)
<_sre.SRE_Match object at 0x10fecfdc8>

Python echoed the line representation back to me when I typed line at the prompt, including the \n newline character. Any surprises like tab characters or Unicode zero-width spaces will show up there as escape codes. I also verified that the regular expression used by the cookiejar code matches.
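If you'd rather script that check, something along these lines might help. It is only a sketch: check_header is a hypothetical helper name, and the pattern is the one quoted above from the cookiejar source. It reports whether the pattern matches and lists every character in the line that is not printable ASCII.

```python
import re

# Same pattern the answer quotes from the cookiejar source.
MAGIC = re.compile(r"#( Netscape)? HTTP Cookie File")

def check_header(line):
    """Return (matches, suspicious) for a cookie file's first line.

    suspicious lists (index, repr) for every character that is not
    printable ASCII, ignoring the trailing newline.
    """
    body = line.rstrip("\r\n")
    suspicious = [(i, repr(ch)) for i, ch in enumerate(body)
                  if not (" " <= ch <= "~")]
    return MAGIC.search(line) is not None, suspicious

# A clean header matches and has nothing suspicious in it.
print(check_header("# Netscape HTTP Cookie File\n"))
# A tab between the words breaks the match and is reported by position.
print(check_header("# Netscape\tHTTP Cookie File\n"))
```

To run it against a real file, pass it the result of f.readline() on the opened cookies.txt.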

You can also use the pdb Python debugger to verify what the http.cookiejar module really does:

>>> import pdb
>>> import http.cookiejar
>>> jar = http.cookiejar.MozillaCookieJar('cookies.txt')
>>> pdb.run('jar.load()')
> <string>(1)<module>()
(Pdb) s
--Call--
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1759)load()
-> def load(self, filename=None, ignore_discard=False, ignore_expires=False):
(Pdb) s
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1761)load()
-> if filename is None:
(Pdb) s
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1762)load()
-> if self.filename is not None: filename = self.filename
(Pdb) s
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1765)load()
-> f = open(filename)
(Pdb) n
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1766)load()
-> try:
(Pdb) 
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1767)load()
-> self._really_load(f, filename, ignore_discard, ignore_expires)
(Pdb) s
--Call--
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1989)_really_load()
-> def _really_load(self, f, filename, ignore_discard, ignore_expires):
(Pdb) s
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1990)_really_load()
-> now = time.time()
(Pdb) n
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1992)_really_load()
-> magic = f.readline()
(Pdb) 
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1993)_really_load()
-> if not self.magic_re.search(magic):
(Pdb) 
> /opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/http/cookiejar.py(1999)_really_load()
-> try:

In the above sample pdb session I used a combination of the step and next commands to verify that the regular expression test (self.magic_re.search(magic)) actually passed.
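Without a debugger, you can also just attempt the load and catch the exception. The sketch below assumes Python 3's http.cookiejar; try_load is a hypothetical helper, and the cookie row (domain, names, values) is made up for illustration.

```python
import http.cookiejar
import os
import tempfile

def try_load(text):
    """Write text to a temporary file and try to load it as a cookie jar.

    Returns the number of cookies loaded, or the LoadError message.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
        path = tmp.name
    try:
        jar = http.cookiejar.MozillaCookieJar(path)
        jar.load()
        return len(jar)
    except http.cookiejar.LoadError as exc:
        return str(exc)
    finally:
        os.remove(path)

# Made-up row: domain, flag, path, secure, expires (far future), name, value.
ROW = ".example.com\tTRUE\t/\tFALSE\t4102444800\tsession\tabc123\n"

good = try_load("# Netscape HTTP Cookie File\n" + ROW)
bad = try_load("not a cookie file\n" + ROW)
print(good)  # number of cookies loaded
print(bad)   # the LoadError message for the bad header
```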


In my case, two modifications were needed to the MozillaCookieJar implementation under /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/:

  1. The magic header

    You can either remove the check logic or add the magic header to the file, which I prefer:

    # Netscape HTTP Cookie File

  2. The new file format seems to allow you to omit the expires field

    vals = line.split("\t")
    if len(vals) == 7:
        domain, domain_specified, path, secure, expires, name, value = vals
    elif len(vals) == 6:
        # newer exports omit the expires column
        domain, domain_specified, path, secure, name, value = vals
        expires = None
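
If you would rather not edit the standard library, the same two fixes can be applied to the cookie file instead. This is a sketch under my own assumptions, not part of the change above: normalize_cookie_text is a hypothetical helper, lines starting with # are assumed to be comments, and a missing expires column is filled with an empty field.

```python
import re

MAGIC = re.compile(r"#( Netscape)? HTTP Cookie File")
MAGIC_HEADER = "# Netscape HTTP Cookie File"

def normalize_cookie_text(text):
    """Return cookie-file text with the magic header and 7 fields per row."""
    lines = text.splitlines()
    # Fix 1: add the magic header when the first line lacks it.
    if not lines or not MAGIC.search(lines[0]):
        lines.insert(0, MAGIC_HEADER)
    fixed = []
    for line in lines:
        fields = line.split("\t")
        is_row = line.strip() and not line.lstrip().startswith("#")
        # Fix 2: pad 6-field rows with an empty expires column.
        if is_row and len(fields) == 6:
            fields.insert(4, "")
            line = "\t".join(fields)
        fixed.append(line)
    return "\n".join(fixed) + "\n"

print(repr(normalize_cookie_text(".example.com\tTRUE\t/\tFALSE\tname\tvalue\n")))
```

As far as I can tell from the cookiejar source, rows with an empty expires field are treated as session cookies, so they are only kept when you call load(ignore_discard=True).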
    

Lastly, I really hope the standard library implementation gets updated to handle these changes.


Paste this in your browser's dev console:

copy('# Netscape HTTP Cookie File\n' + document.cookie.split(/; /g).map(e => e.replace('=', '\t')).map(e => window.location.hostname.replace('www.', '.') + '\tTRUE\t/\tFALSE\t-1\t' + e).join('\n'))

Netscape-formatted cookies will be in your system's clipboard :)
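
If you already have the document.cookie string and want to do the conversion on the Python side, a rough equivalent could look like this. cookie_string_to_netscape is a hypothetical name, and two choices here are mine rather than taken from the snippet above: the domain always gets a leading dot, and expires defaults to an empty field instead of -1, since http.cookiejar appears to treat -1 as an already-expired timestamp.

```python
def cookie_string_to_netscape(cookie_string, hostname, expires=""):
    """Convert a "name=value; name2=value2" string to Netscape cookie-file text."""
    # Drop a leading "www." and make sure the domain starts with a dot,
    # since the TRUE flag below marks it as a domain-wide cookie.
    domain = hostname[4:] if hostname.startswith("www.") else hostname
    if not domain.startswith("."):
        domain = "." + domain

    rows = ["# Netscape HTTP Cookie File"]
    for pair in cookie_string.split("; "):
        if not pair:
            continue
        # Split on the first "=" only, so values may contain "=".
        name, _, value = pair.partition("=")
        rows.append("\t".join([domain, "TRUE", "/", "FALSE", expires, name, value]))
    return "\n".join(rows) + "\n"

print(cookie_string_to_netscape("a=1; b=x=y", "www.example.com"))
```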
