python - Error tokenizing data -

I have this code:

import pandas
import datetime
from decimal import Decimal

file_ = open('myfile.csv', 'r')
result = pandas.read_csv(
    file_, header=None,
    names=('sec', 'date', 'sale', 'buy'),
    usecols=('date', 'sale', 'buy'),
    parse_dates=['date'],
    iterator=True,
    chunksize=100,
    compression=None,
    engine="c",
    date_parser=lambda dt: datetime.datetime.strptime(dt, '%Y%m%d %H:%M:%S.%f'),
    converters={'sale': (lambda u: Decimal(u)), 'buy': (lambda u: Decimal(u))}
)

and when I try...

result.get_chunk()

I only get this error:

CParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 4

from this file (I show the first 4 lines; the file has no header, and all lines have this format):

eur/usd,20160701 00:00:00.071,1.11031,1.11033
eur/usd,20160701 00:00:00.255,1.11031,1.11033
eur/usd,20160701 00:00:00.256,1.11025,1.11033
eur/usd,20160701 00:00:00.258,1.11027,1.11033
... (more than 10.000.000 lines like these)

My intention is to have an object that iterates over chunks, and not to have the whole thing in memory (the actual file is 560 MB!). I want to discard the first column (there are 4 columns, and since the file has the same value in the first column on every line, I want to discard that column). I want to keep columns 1, 2, and 3 (discarding 0) as date, sale, and purchase price.

Actually, this is my first attempt with pandas, since my former solution used the standard Python csv module, and it takes a lot of time.

What am I missing? Why am I getting such an error?
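For context, the C tokenizer generally reports this message when a row contains more fields than it expects from the rows before it. The sketch below reproduces the same wording on a tiny in-memory file. The data and variable names are made up for illustration, and it is not claimed to be the exact cause of the error in the question.

```python
import io

import pandas as pd

# Hypothetical three-line input: a 3-field header, a 3-field row,
# and a row with one field too many.
bad_csv = io.StringIO("a,b,c\n1,2,3\n4,5,6,7\n")

try:
    pd.read_csv(bad_csv)
    message = ""
except pd.errors.ParserError as exc:
    # Keep the text so it can be compared with the error in the question.
    message = str(exc)

print(message)
```

In recent pandas versions the exception class is `pandas.errors.ParserError`; older releases showed it as `CParserError`, as in the traceback above.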

Try this code. Give names to the columns in the CSV file (with ',' as the separator), so that myfile.csv looks like this:

sec,date,sale,buy
eur/usd,20160701 00:00:00.071,1.11031,1.11033
eur/usd,20160701 00:00:00.255,1.11031,1.11033
eur/usd,20160701 00:00:00.256,1.11025,1.11033
eur/usd,20160701 00:00:00.258,1.11027,1.11033

Then read it, build a data frame from the 3 columns you want, and print it:

import pandas as pd
import numpy as np
import csv

# Read the file; the first line supplies the column names.
data = pd.read_csv('myfile.csv', sep=',')
# Keep only the date, sale and buy columns.
df = pd.DataFrame({'date': data.date, 'sale': data.sale, 'buy': data.buy})
print(df)

Output:

       buy                   date     sale
0  1.11033  20160701 00:00:00.071  1.11031
1  1.11033  20160701 00:00:00.255  1.11031
2  1.11033  20160701 00:00:00.256  1.11025
3  1.11033  20160701 00:00:00.258  1.11027
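The answer above assumes a header line has been added to the file and loads everything at once. Since the question asks for chunked reading of a headerless 560 MB file, here is a minimal sketch of one way that could look. It uses an in-memory copy of the four sample rows instead of myfile.csv, a chunk size of 2 purely for illustration, and it converts dates and decimals after each chunk is read rather than through `date_parser` and `converters`. Those choices are assumptions, not the asker's code.

```python
import io
from decimal import Decimal

import pandas as pd

# Stand-in for open('myfile.csv'): the four sample rows from the question.
sample = io.StringIO(
    "eur/usd,20160701 00:00:00.071,1.11031,1.11033\n"
    "eur/usd,20160701 00:00:00.255,1.11031,1.11033\n"
    "eur/usd,20160701 00:00:00.256,1.11025,1.11033\n"
    "eur/usd,20160701 00:00:00.258,1.11027,1.11033\n"
)

reader = pd.read_csv(
    sample,
    header=None,                           # the file has no header line
    names=["sec", "date", "sale", "buy"],  # name all four columns
    usecols=["date", "sale", "buy"],       # drop the constant first column
    dtype=str,                             # keep raw text so Decimal sees exact digits
    chunksize=2,                           # illustrative; use a larger value for real data
)

chunks = []
for chunk in reader:
    # Convert each chunk after reading it.
    chunk["date"] = pd.to_datetime(chunk["date"], format="%Y%m%d %H:%M:%S.%f")
    chunk["sale"] = chunk["sale"].map(Decimal)
    chunk["buy"] = chunk["buy"].map(Decimal)
    chunks.append(chunk)  # collected here only so the result can be inspected

print(chunks[0])
```

For the real file you would process each chunk inside the loop and discard it, instead of appending it to a list, so that only one chunk is held in memory at a time.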
