performance - Efficiently join multiple CSV files keeping the header from first file in C#


I am given multiple CSV files, each of which can be hundreds of megabytes or more. They all have the same header row at the start of the file and a CRLF at the end of each line. Each file may or may not have a CRLF at the end of the file. The goal is to:

  1. Join the list of files.
  2. Keep the header from the first file only.
  3. Output them to a new file.
  4. These files may have thousands of columns and millions of rows.
  5. The files must be processed in the order given, and the order of rows is significant.

Given the size of the files, this needs to be as fast and as memory efficient as possible.

If the headers are the same, you can open a write stream, then go through the input files, opening a read stream on each of them and copying the data across. The first file is copied in its entirety. Subsequent files have their first line skipped.

That approach is the fastest, as long as you are 100% sure the columns align and it's only the first line that needs skipping.
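A minimal sketch of that stream-copy approach, under some assumptions that are not stated in the question: the files use an ASCII-compatible encoding such as UTF-8 (so a line feed is always the single byte 0x0A), and the header row contains no quoted line breaks. The names `CsvJoiner` and `Join` are hypothetical, and the 1 MiB buffer size is an arbitrary choice. Because a file may lack a CRLF at its end, the sketch writes one before appending the next file's rows.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvJoiner
{
    // Hypothetical helper: concatenates the files in the order given,
    // keeping the header line from the first file only.
    public static void Join(IReadOnlyList<string> inputPaths, string outputPath)
    {
        var buffer = new byte[1 << 20]; // 1 MiB, an arbitrary choice
        int lastByte = '\n';            // last byte written; nothing written yet

        using (FileStream output = File.Create(outputPath))
        {
            for (int i = 0; i < inputPaths.Count; i++)
            {
                using (FileStream input = File.OpenRead(inputPaths[i]))
                {
                    bool skippingHeader = i > 0; // the first file keeps its header
                    bool firstWrite = true;
                    int read;
                    while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        int start = 0;
                        if (skippingHeader)
                        {
                            // Assumption: the header ends at the first LF byte.
                            int lf = Array.IndexOf(buffer, (byte)'\n', 0, read);
                            if (lf < 0) continue;        // header spans this whole chunk
                            skippingHeader = false;
                            start = lf + 1;
                            if (start == read) continue; // chunk ended with the header
                        }

                        // If the previous file had no trailing line break, add a CRLF
                        // so its last row is not merged with this file's first row.
                        if (firstWrite && lastByte != '\n')
                        {
                            output.WriteByte((byte)'\r');
                            output.WriteByte((byte)'\n');
                        }
                        firstWrite = false;

                        output.Write(buffer, start, read - start);
                        lastByte = buffer[read - 1];
                    }
                }
            }
        }
    }
}
```

A call might look like `CsvJoiner.Join(new[] { "a.csv", "b.csv" }, "joined.csv");`, with the file names being placeholders. Working on raw bytes avoids decoding and re-encoding every line, and memory use stays at roughly one buffer regardless of file size.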

This kind of thing is quite straightforward on a Unix-style command line, by the way.

