JREPL to check 12 columns for a certain string?
Posted: 17 Jun 2020 04:18
Hi Folks -
I have a massive data file (600K rows) that I need to "clean". The team that supplies the file is pushing back on doing this cleanup on their end, so I need to build a solution on my end before I import the file into our financial application.
There are TONS of rows that can be deleted. For instance, the ENTIRE row can be deleted if columns 5-16 contain only "" or '#MISSING'. Example:
E01,S00900,6016-10,2020,"","","","","","",'#MISSING','#MISSING','#MISSING','#MISSING','#MISSING','#MISSING'
E01,S00900,6016-13,2020,"","","","","","",'#MISSING','#MISSING','#MISSING','#MISSING','#MISSING','#MISSING'
E01,S00900,6016-14,2020,"","","","","","",'#MISSING','#MISSING','#MISSING','#MISSING','#MISSING','#MISSING'
E01,S00900,6016-15,2020,"","","","","","",'#MISSING','#MISSING','#MISSING','#MISSING','#MISSING','#MISSING'
These rows however can't be deleted:
E01,S00900,6520-00,2020,279.56,2812.98,435.01,1072.39,51.75,318.20,'#MISSING','#MISSING','#MISSING','#MISSING','#MISSING','#MISSING'
E01,S00900,6520-01,2020,32.10,32.10,"",450.00,"","",'#MISSING','#MISSING','#MISSING','#MISSING','#MISSING','#MISSING'
Can this be done with JREPL? If so, how would I code the find portion of it? I guess you could say if columns 5+ do not have a numerical value, then delete entire row.
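To pin down the rule I'm after, here is a rough sketch of the "find" logic in Python rather than JREPL, just to show the pattern. It assumes every row has exactly 16 comma-separated columns and that no field contains an embedded comma. The names `is_deletable` and `clean` are made up for illustration.

```python
import re

# A "blank" field is either "" or '#MISSING' (quotes included, as in the file).
BLANK = r"""(?:""|'#MISSING')"""

# Columns 1-4 can be anything; columns 5-16 (12 fields) must all be blank.
DELETABLE = re.compile(rf"^(?:[^,]*,){{4}}{BLANK}(?:,{BLANK}){{11}}$")


def is_deletable(line: str) -> bool:
    """Return True when none of columns 5-16 holds a value."""
    return DELETABLE.match(line.rstrip("\r\n")) is not None


def clean(lines):
    """Yield only the rows that must be kept."""
    for line in lines:
        if not is_deletable(line):
            yield line
```

Feeding the lines of the source file through `clean` and writing the result to a new file would give the cleaned import file. The same regular expression should carry over to any tool that can drop lines matching a pattern.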