Member-only story
Please Stop Using CSV Files as an ETL Delivery Vehicle
I realize how convenient and “compatible” using CSV files for import and export appears to be at first glance, and I also realize that many languages tantalizingly have libraries that make enabling CSV import/export to structs or classes native to your language of choice look appealing.
I’m here to go against your initial intuition and point out some things you may not have thought of that can (and likely will) come back to bite you in the ass. The problem with using CSV is the data itself. Data in a DB is not often as sanitized as it should be. A product DB, for example, might contain things like double-quotation marks in the middle of text fields (like maybe for an inch abbreviation, or just random quotes someone on the floor input, just because). Fields which could contain free-form text such notes or remarks fields, will contain quotes, commas, or other embedded delimiters which you might look fine upon export, but muck up your import.
Whether you’re writing your own parser (CSV parsers are not as trivial as they first appear because of some of these things) or using someone else’s, chances are you’re going to have put in some time finding workarounds for the weird exceptions that you encounter when you’re processing data, particularly on the import side.
To that I say, if you have any influence at all over the export process, please do not use CSV files. Instead, I suggest you use SQLite. It’s open-source, fast, available on all platforms and pretty much all…