Python string.replace is stripping off my quote characters


What I'm doing is feeding a Python script a CSV file that contains millions of records separated by commas. The strings are "contained in double quotes".

I pass the .csv file through this Python script:

    import csv
    import string
    import sys, getopt

    infile = open(sys.argv[1], 'r')
    outfile = open(sys.argv[1][:-4] + '_no-nulls.csv', 'w')
    data = csv.reader(infile)
    writer = csv.writer(outfile)

    specials = "null"

    for line in data:
        line = [value.replace(specials, '') for value in line]
        writer.writerow(line)

    infile.close()
    outfile.close()

and the end result has the quotes stripped off the strings.
What am I doing wrong?

Edit:

sample input:

897555,2021-03-31 00:00:00.000,null,"45687","b","qa",29,null,null,null,null,null,null,null,"5648987qexxx",6,null,null,"doe","john",null,null,null,null,null,"q",1994-04-24 00:00:00.000,"r","cx","zz",null,null,null,null,null,"y",null,"ga","r","de",null,null,null,null,null,"en",null,"y","op",null,"r","xz",null,null,null,"8945564",2005-03-01 12:00:00.000,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null 

sample output:

897555,2021-03-31 00:00:00.000,,"45687","b","qa",29,,,,,,,,"5648987qexxx",6,,,"doe","john",,,,,,"q",1994-04-24 00:00:00.000,"r","cx","zz",,,,,,"y",,"ga","r","de",,,,,,"en",,"y","op",,"r","xz",,,,"8945564",2005-03-01 12:00:00.000,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, 

This is normal. When reading, csv.reader strips off the quotes because it's assumed that the program consuming the data doesn't want or need them. csv.writer will put them back on if necessary, depending on the quoting setting you pass. The default is QUOTE_MINIMAL: add quotes only if there are characters in the string that could be misinterpreted, such as the delimiter or the quote character itself.
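To see that round trip in isolation, here is a minimal sketch using an in-memory string instead of a file. The sample row is made up for illustration (the "doe, john" field with an embedded comma is not from the question); it only shows the default reader and writer behavior described above.

```python
import csv
import io

# Made-up sample row; the last field contains the delimiter on purpose.
sample = '897555,null,"45687","doe, john"\n'

# The reader removes the surrounding quotes while parsing.
row = next(csv.reader(io.StringIO(sample)))
print(row)  # ['897555', 'null', '45687', 'doe, john']

# The writer's default QUOTE_MINIMAL quotes only the field that needs it,
# here the one containing a comma.
out = io.StringIO()
csv.writer(out).writerow(row)
print(repr(out.getvalue()))  # '897555,null,45687,"doe, john"\r\n'
```

So the quotes around 45687 are gone after the round trip, while the field with the embedded comma is still quoted.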

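You can set both the reader and the writer to QUOTE_NONE to preserve the quotes from the original file, or set the writer to QUOTE_ALL to requote the output. Note that QUOTE_ALL quotes every field, including numbers and empty ones, so the output will not match the original quoting exactly.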
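A rough sketch of both options follows, using a shortened version of the sample row and in-memory strings. The helper name drop_nulls is hypothetical. Passing quotechar=None to the writer is my own assumption rather than part of the answer: as far as I can tell, a QUOTE_NONE writer with the default quote character refuses to write a field that contains that character unless an escapechar is set.

```python
import csv
import io

# Shortened from the sample input in the question.
sample = '897555,null,"45687","b",29,null\n'

def drop_nulls(text, reader_kwargs, writer_kwargs):
    """Replace the literal 'null' with '' in every field, as in the question."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n", **writer_kwargs)
    for row in csv.reader(io.StringIO(text), **reader_kwargs):
        writer.writerow([value.replace("null", "") for value in row])
    return out.getvalue()

# Option 1: QUOTE_NONE on both sides, so quotes are treated as ordinary
# characters and pass through untouched. quotechar=None is an assumption
# added here so the writer does not try to escape the quotes.
preserved = drop_nulls(
    sample,
    {"quoting": csv.QUOTE_NONE},
    {"quoting": csv.QUOTE_NONE, "quotechar": None},
)
print(preserved)  # 897555,,"45687","b",29,

# Option 2: default reader (quotes stripped), QUOTE_ALL writer (all requoted).
requoted = drop_nulls(sample, {}, {"quoting": csv.QUOTE_ALL})
print(requoted)   # "897555","","45687","b","29",""
```

One caveat with the first option: because the reader no longer understands quoting, a comma inside a quoted string would split that field in two, so it only fits data where quoted strings never contain the delimiter.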

