I am using the Spout library for reading and writing Excel files in PHP. I have a question about speed.
When I try to read/write 100k records, it takes 15 minutes. When I try to read/write 200k records, it takes 1.5 hours.
I tried uploading 600k records overnight, and it took 9 hours.
I don't know if it's my machine or something else. The processing time does not simply double when the number of records doubles.
Any tips on speeding this up?
Thanks in advance! :)
The time Spout takes to write data to a spreadsheet should be more or less proportional to the size of the dataset. Reading a spreadsheet is different though. There are 3 possible options:
- Your spreadsheet uses inline strings instead of shared strings: the reading time should be proportional to the dataset size.
- Your spreadsheet uses shared strings:
  - The number of shared strings is limited and they can all fit in memory: the reading time should be proportional to the dataset size.
  - There are too many shared strings to fit in memory: Spout will split the shared strings into chunks that can fit in memory. Each chunk is saved to disk, and only the chunk containing the string being read is loaded in memory.
With the first 2 options, you're fine and Spout goes as fast as possible. With the 3rd option though, things will take longer. That's the catch to avoid running out of memory... If your spreadsheet uses shared strings that are more or less ordered (A1 uses string 1, B1 uses string 2... Z10 uses string 840), the perf hit won't be too bad (it only adds a few IO operations to read data from disk). If the shared strings are not ordered (A1 uses string 1, B1 uses string 200,000, which is stored in another chunk, and C1 uses string 3), then because Spout reads cells sequentially, you'll have a lot more IO operations to load the correct chunks in memory.
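To get a feel for why the ordering matters, here is a rough sketch that counts how many chunk loads a sequential read would trigger. It is not Spout code: `countChunkLoads`, the chunk size of 10,000 and the assumption that only one chunk is held in memory at a time are all made up for illustration.

```php
<?php
// Hypothetical helper (not part of Spout): counts how often a reader that
// keeps a single chunk in memory has to load a different chunk from disk.
function countChunkLoads(array $stringIndexes, int $chunkSize): int
{
    $loads = 0;
    $currentChunk = null; // no chunk loaded yet

    foreach ($stringIndexes as $index) {
        $chunk = intdiv($index, $chunkSize); // chunk holding this shared string
        if ($chunk !== $currentChunk) {
            $loads++;                        // a different chunk must be read from disk
            $currentChunk = $chunk;
        }
    }

    return $loads;
}

// Shared string indexes in the order the cells are read (illustrative data).
$ordered   = [0, 1, 2, 3, 4, 5];
$scattered = [0, 200000, 2, 200001, 4, 200002];

echo countChunkLoads($ordered, 10000), "\n";   // prints 1
echo countChunkLoads($scattered, 10000), "\n"; // prints 6
```

Both lists contain six cells, but the scattered one forces a disk read for every single cell, which is where the non-linear slowdown would come from.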
So for your problem, you can take a look at how the data is defined in the XML files describing the spreadsheet. If you used Spout to create the spreadsheet, make sure you use inline strings (the final file size will be bigger, but reading will be way faster).
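If you want to check which kind of strings a sheet uses, something along these lines might help. It is only a sketch: `countStringCellTypes` is a hypothetical helper, and it relies on the usual XLSX convention that shared-string cells carry `t="s"` while inline-string cells carry `t="inlineStr"`.

```php
<?php
// Hypothetical helper: counts shared-string and inline-string cells
// in the XML of one worksheet.
function countStringCellTypes(string $sheetXml): array
{
    // Cells referencing the shared strings table: <c ... t="s">
    $shared = preg_match_all('/<c\b[^>]*\bt="s"/', $sheetXml);
    // Cells carrying their text inline: <c ... t="inlineStr">
    $inline = preg_match_all('/<c\b[^>]*\bt="inlineStr"/', $sheetXml);

    return ['shared' => $shared, 'inline' => $inline];
}

// Tiny hand-written sample row, for illustration only.
$sample = '<row r="1">'
    . '<c r="A1" t="s"><v>0</v></c>'
    . '<c r="B1" t="inlineStr"><is><t>hello</t></is></c>'
    . '<c r="C1"><v>42</v></c>'
    . '</row>';

print_r(countStringCellTypes($sample)); // shared => 1, inline => 1
```

An XLSX file is a zip archive, so the sheet XML can be pulled out with PHP's `ZipArchive::getFromName()`. The entry name is often `xl/worksheets/sheet1.xml`, but that is an assumption about your file. For a sheet with hundreds of thousands of rows it is probably safer to unzip the file and look at the first few rows rather than load the whole XML into a string.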
Something else you can do is modify this file: CachingStrategyFactory.php. If you know that all your characters are 1-byte characters, you'll be able to increase the number of strings you can put in memory by a factor of 4 (as Spout assumes 4-byte characters for its computation).
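The factor of 4 is just arithmetic on the per-string memory estimate. The sketch below is not Spout's actual formula or constants: `maxStringsInMemory`, the 64 MiB budget and the 3,000 characters per string are made-up values to show the effect of the bytes-per-character assumption.

```php
<?php
// Hypothetical estimate (not Spout's real computation): how many strings fit
// in a memory budget, given a worst-case length and a byte size per character.
function maxStringsInMemory(int $availableBytes, int $maxCharsPerString, int $bytesPerChar): int
{
    return intdiv($availableBytes, $maxCharsPerString * $bytesPerChar);
}

$available = 64 * 1024 * 1024; // illustrative memory budget: 64 MiB
$maxChars  = 3000;             // illustrative worst-case string length

echo maxStringsInMemory($available, $maxChars, 4), "\n"; // prints 5592
echo maxStringsInMemory($available, $maxChars, 1), "\n"; // prints 22369
```

With 1-byte characters, roughly 4 times as many strings fit in the same budget, so fewer files would fall into the slow chunked case.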
Hope that helps!