Sunday, 19 April 2015

Python fails to open 11 GB CSV in r+ mode but opens in r mode

I'm having problems with some code that loops through a set of .csv files and deletes the final line of each file if that line is empty (i.e. the file ends with the "\n" newline character).
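For readers who want something concrete, here is a minimal sketch of that kind of operation. It is not my actual code: the function name strip_trailing_newline is made up for illustration, and it assumes the file is opened in binary update mode ("r+b") and that a trailing "\r\n" should be treated the same way as a trailing "\n".

```python
import os


def strip_trailing_newline(path):
    """Remove one trailing newline (LF or CRLF) from the file at `path`.

    Hypothetical helper for illustration only.
    Returns True if the file was shortened, False otherwise.
    """
    with open(path, "r+b") as f:
        # Find the file size without reading the whole file.
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size == 0:
            return False

        # Only the last two bytes are needed to detect LF or CRLF.
        tail_len = min(2, size)
        f.seek(size - tail_len)
        tail = f.read(tail_len)

        if tail.endswith(b"\r\n"):
            new_size = size - 2
        elif tail.endswith(b"\n"):
            new_size = size - 1
        else:
            return False

        # Cut the file off just before the trailing newline.
        f.truncate(new_size)
        return True
```

Because it only seeks to the end and reads two bytes, a sketch like this never loads the file into memory, so file size should not matter for RAM.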


My code works on every file except one, the largest in the directory at 11 GB. The second-largest file is 4.5 GB.


The line it fails on is simply:



with open(path_str, "r+") as my_file:


and I get the following message:



IOError: [Errno 22] invalid mode ('r+') or filename: 'F:\\Shapefiles\\ab_premium\\processed_csvs\\a.csv'


I build path_str with os.path.join to avoid path errors, and I tried renaming the file to a.csv to make sure nothing odd was going on with the filename. This made no difference.
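As a rough illustration of how the paths are put together, a sketch along these lines collects the .csv files in a directory and joins each name onto the directory with os.path.join. The function name csv_paths is hypothetical and this is a simplified stand-in, not the exact loop I run.

```python
import os


def csv_paths(directory):
    """Return full paths of all .csv files directly inside `directory`.

    Hypothetical helper for illustration only.
    """
    paths = []
    for name in sorted(os.listdir(directory)):
        # Match the extension case-insensitively (a.csv, A.CSV, ...).
        if name.lower().endswith(".csv"):
            paths.append(os.path.join(directory, name))
    return paths
```

Each returned path would then play the role of path_str in the open() call.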


Even more strangely, the file opens without complaint in r mode, i.e. the following code works fine:



with open(path_str, "r") as my_file:


I have tried navigating around the file in read mode, and it reads characters at the start, end, and middle of the file without problems.
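The check was roughly of the following shape. This is a sketch rather than the exact code I ran: the name sample_bytes and the count parameter are invented for illustration, and it assumes binary read mode ("rb") so that seek offsets are plain byte positions.

```python
import os


def sample_bytes(path, count=16):
    """Read `count` bytes from the start, middle and end of a file.

    Hypothetical helper for illustration only.
    Returns (size_in_bytes, {"start": ..., "middle": ..., "end": ...}).
    """
    with open(path, "rb") as f:
        # Determine the size by seeking to the end.
        f.seek(0, os.SEEK_END)
        size = f.tell()

        offsets = (
            ("start", 0),
            ("middle", size // 2),
            ("end", max(size - count, 0)),
        )
        samples = {}
        for name, offset in offsets:
            f.seek(offset)
            samples[name] = f.read(count)
    return size, samples
```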


Does anyone know of any limits on the size of file that Python can handle, or why I might be getting this error? I'm on 64-bit Windows 7 and have 16 GB of RAM.


Thanks,


Robin

