I have an ETL requirement like this:
I need to fetch around 20,000 records from a table and process each record separately (the processing of each record involves a couple of steps: creating a table for that record and inserting data into it). For a prototype I implemented 2 jobs (with corresponding transformations), and rather than creating a table I just created a simple empty file. Even this simple case doesn't seem to work smoothly (and when I do create a table for each record, Kettle exits after 5,000 records).
When I run it, Kettle slows down and hangs after 2,000-3,000 files; the processing does complete after a long time, but Kettle appears to stall along the way. Is this design approach right? When I replace the write-to-file step with the actual requirement of creating a new table (through the SQL script step) for each id and inserting data into it, Kettle exits after 5,000 records. I need a flow that works. Would increasing Java memory help (Xmx is already at 2 GB)? Is there any other configuration I can change, or another way to do this? Time shouldn't be a constraint; the flow just has to work.
My initial guess was that since the prototype isn't storing any data, it at least should run smoothly. I am using Kettle 3.2.
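For context, here is a minimal sketch of what the per-record SQL inside the "Execute SQL script" step might look like, assuming the current record's id is exposed as a variable (called ${RECORD_ID} here) and variable substitution is enabled in the step; all table and column names are made-up placeholders, not taken from the original post:

    -- Hypothetical per-record DDL/DML run once per incoming row.
    -- ${RECORD_ID} is substituted by Kettle before execution.
    CREATE TABLE record_${RECORD_ID} (
        id        INTEGER,
        payload   VARCHAR(255),
        loaded_at TIMESTAMP
    );

    INSERT INTO record_${RECORD_ID} (id, payload, loaded_at)
    SELECT src.id, src.payload, CURRENT_TIMESTAMP
    FROM   source_table src
    WHERE  src.id = ${RECORD_ID};

Running DDL like this 20,000 times in a row is exactly the kind of per-record looping that the answer below suggests restructuring.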
I seem to remember this being a known issue/restriction, which is why looping via jobs is deprecated these days.
Are you able to rebuild the job using the Transformation Executor and/or Job Executor steps? These steps can be set to execute once for every incoming row (or for a given number of rows).
They have their own quirks, namely that you have to handle errors explicitly, but it's worth a try to see if you can achieve what you want. It's a different mindset, but a much nicer way to build loops than the job-based approach.