How to remove CR/LF in between the line

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: How to remove CR/LF in between the line

#16 Post by dbenham » 07 Mar 2013 19:35

I think records are supposed to be terminated by CR/LF, but I'm not sure about new lines within columns. I shouldn't think it would matter.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: How to remove CR/LF in between the line

#17 Post by foxidrive » 07 Mar 2013 20:03

dbenham wrote:I think records are supposed to be terminated by CR/LF, but I'm not sure about new lines within columns. I shouldn't think it would matter.


I think it will certainly matter.

aaksar
Posts: 105
Joined: 17 Nov 2012 05:13

Re: How to remove CR/LF in between the line

#18 Post by aaksar » 07 Mar 2013 21:19

Mine is a txt file.

Squashman
Expert
Posts: 4488
Joined: 23 Dec 2011 13:59

Re: How to remove CR/LF in between the line

#19 Post by Squashman » 07 Mar 2013 21:46

foxidrive wrote:
dbenham wrote:I think records are supposed to be terminated by CR/LF, but I'm not sure about new lines within columns. I shouldn't think it would matter.


I think it will certainly matter.

I would say that is a true statement for batch files. I do work with software where I can specify a record length and it will ignore the CR and LF.

But if I receive a file from a client that is a delimited file it is pretty much hopeless unless it is just a LF in the middle of the data. At which point you could just use SED or TR to remove the Line Feeds from the data. But if it is truly a CRLF in the middle of the data it is probably almost impossible to fix.
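A minimal sketch of the LF-only case described above, written in Python instead of SED or TR (that substitution is mine, not something from the post). It assumes real record ends are always CR+LF and that any LF without a CR in front of it is stray; `strip_bare_lf` and the sample bytes are made up for illustration.

```python
import re


def strip_bare_lf(data: bytes) -> bytes:
    """Remove every LF that is not preceded by a CR; CR+LF pairs are kept."""
    return re.sub(rb"(?<!\r)\n", b"", data)


# Hypothetical sample: the first record has a lone LF inside its second field.
sample = b'"a","first half\nsecond half"\r\n"b","ok"\r\n'
cleaned = strip_bare_lf(sample)
print(cleaned)
```

Working on bytes avoids any newline translation by the file layer. As the post says, this does nothing for a true CR+LF inside the data, because that looks exactly like a record end.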

aaksar
Posts: 105
Joined: 17 Nov 2012 05:13

Re: How to remove CR/LF in between the line

#20 Post by aaksar » 07 Mar 2013 23:04

Aacini wrote:@aaksar:

- A file is a series of lines/records, each one terminated in CR+LF.

- If your file has several CR+LF embedded in one record, then THERE IS NO WAY to fix your problem, unless there is a way to distinguish the "real" CR+LF that terminate normal records from the additional CR+LF that come "in the middle of the line". You should have started your description with these details!

- I have assumed that your file has records with a fixed number of fields (70 in your example) and that the fields are separated by commas. The Batch file below reads lines from the input file and concatenates several lines until it completes 70 fields in one record. Note that this method would fail if a field in a "broken" record has a comma (even if it is enclosed in quotes).

Code:

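@echo off
setlocal EnableDelayedExpansion

rem Number of fields expected in each complete record
set numOfFields=70

set record=
rem A line with N commas has N+1 fields, so the count starts at 1
set fields=1
(for /F "delims=" %%a in (inputFile.txt) do (
   set "record=!record!%%a"
   rem Count the fields in this line
   set "line=%%a"
   call :StrLen line allChars=
   set "noCommas=!line:,=!"
   call :StrLen noCommas withoutCommas=
   set /A fields+=allChars-withoutCommas
   rem Once enough fields are collected, write the record and start over
   if !fields! geq %numOfFields% (
      echo !record!
      set record=
      set fields=1
   )
)) > outputFile.txt
goto :EOF


rem Get the length of the variable named by the first argument, up to 8191
rem characters, and store it in the variable named by the second argument
:StrLen var len=
set "str=0!%1!"
set len=0
for /L %%A in (12,-1,0) do (
   set /A "len|=1<<%%A"
   for %%B in (!len!) do if "!str:~%%B,1!" == "" set /A "len&=~1<<%%A"
)
set %2=%len%
exit /B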

Output:
Output wrote:"0711105","11","00000","119",2010-05-03-22.13.00.000000,"119",2012-12-12-12.13.00.000000,"M15",08/01/2008,"13",07/15/2007,07/15/2008,"E",," ",,"Prop XOL 10M x5M + 5M Swiss 41",,0,"N",0,"N","N",,"900","B5","30","46","44","4","N","P","P","D","119","119",07/15/2007,07/14/2008," ","00000","N","119"," ",,,," "," "," "," "," "," ",,0.00,0.00,0.00," "," "," ",0.00,"D"," "," ","N",,"","Wind excluded only to all 1st and 2nd tier counties from TX to SC inclusive.Reinstatement equals 1 x 150% "," "," "," "


Antonio


The above script removes the extra CRLFs that come inside one column, but it puts a CRLF just before that column, and there is again a CRLF in the next row, which there shouldn't be, as it is only one column value that carries on. That means my next column starts on a new line and my line breaks there because of the CR LF.

"0911002","11","00000","130",2010-07-02-14.44.00.000000,"130",2013-01-15-13.25.00.000000,"M15",03/01/2007,"13",04/01/2009,04/01/2010,"E",," ",,"Global Prprty QS w Terror 221 ",,0,"N",0,"N","N",,"900","B5","30","46","10","4","N","P","P","M","130","130",04/01/2009,04/01/2010," ","00000","N","130"," ",,,," "," "," "," "," "," ",,0.00,0.00,0.00," "," "," ",0.00,"D"," "," ","N",,"","Agreed loss reporting and cash calls for losses in excess of $1,000,000.Limits:1) $10M any one risk, any one insured or CRLF
2) Two times policy limit as respects ECO/XPL or3) $270M as respects all risks in any one loss occurence, $50M any one loss occurence worldwide excl. US & its T&P or4) $460M as respects all losses "," "," "," ""0911003","11","00000","119",2010-07-02-17.41.00.000000,"130",2013-01-15-13.25.00.000000,"M15",03/01/2007,"13",04/01/2009,04/01/2010,"E",," ",,"Global Prprty QS wo Terror 221",,0,"N",0,"N","N",,"900","B5","30","46","10","4","N","P","P","M","119","119",04/01/2009,04/01/2010," ","00000","N","119"," ",,,," "," "," "," "," "," ",,0.00,0.00,0.00," "," "," ",0.00,"D"," "," ","N",,"","Agreed loss reporting and cash calls for losses in excess of $1,000,000.CRLFLimits:1) $10M any one risk, any one insured or2) Two times policy limit as respects ECO/XPL or3) $270M as respects all risks in any one loss occurence, $50M any one loss occurence worldwide excl. US & its T&P or4) $460M as respects all losses "," "," "," ""0911101","11","00000","119",2010-05-04-19.48.00.000000,"119",2012-12-12-12.13.00.000000,"M15",03/01/2007,"13",04/01/2009,03/30/2010,"E",," ",,"Property Quota Share ",,0,"N",0,"N","N",,"130","B5","30","46","10","4","N","P","P","M","119","119",04/01/2009,03/30/2010," ","00000","N","119"," ",,,," "," "," "," "," "," ",,0.00,0.00,0.00," "," "," ",0.00,"D"," "," ","N",,""," "," "," CRLF "," "

aaksar
Posts: 105
Joined: 17 Nov 2012 05:13

Re: How to remove CR/LF in between the line

#21 Post by aaksar » 07 Mar 2013 23:09

Here is the link to the file. It is comma delimited, but I have a fixed-width version also.

https://www.dropbox.com/s/8c0y906ceonsq3u/XY_110A_CONTRACT

Squashman
Expert
Posts: 4488
Joined: 23 Dec 2011 13:59

Re: How to remove CR/LF in between the line

#22 Post by Squashman » 07 Mar 2013 23:09

Like Antonio said in his post: "Note that this method would fail if a field in a "broken" record has a comma (even if it is enclosed in quotes)."

And your data has this: $1,000,000

This is near impossible to fix with batch!
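Outside of batch, one way around the `$1,000,000` problem is to count double quotes instead of commas. This is only a sketch in Python, and it rests on an assumption the thread does not confirm: every field that contains a comma or a line break is enclosed in double quotes, and the quotes are balanced. `join_broken_records` and the sample text are hypothetical.

```python
def join_broken_records(text: str) -> list:
    """Glue physical lines together until the double quotes balance.

    An odd number of quotes means the line ended inside a quoted field,
    so the CR+LF that followed it was not a real record terminator.
    """
    records = []
    pending = ""
    for line in text.split("\r\n"):
        pending += line
        if pending.count('"') % 2 == 0:
            if pending:  # skip empty lines between records
                records.append(pending)
            pending = ""
    if pending:  # unbalanced leftover at end of file, kept as-is
        records.append(pending)
    return records


# Hypothetical sample: record 1 has a comma and a CR+LF inside a quoted field.
sample = (
    '"1","loss over $1,000,000.\r\nLimits: $10M"," "\r\n'
    '"2","ok"," "\r\n'
)
for rec in join_broken_records(sample):
    print(rec)
```

Commas inside the quoted text do not matter here, since only quote parity is checked. A doubled quote (`""`) adds two to the count, so it does not upset the parity either.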

aaksar
Posts: 105
Joined: 17 Nov 2012 05:13

Re: How to remove CR/LF in between the line

#23 Post by aaksar » 08 Mar 2013 01:25

Yeah...
I think we could do this through JavaScript? And later call the JS from batch.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: How to remove CR/LF in between the line

#24 Post by dbenham » 08 Mar 2013 08:17

foxidrive wrote:
dbenham wrote:I think records are supposed to be terminated by CR/LF, but I'm not sure about new lines within columns. I shouldn't think it would matter.


I think it will certainly matter.

:lol: By "doesn't matter", I meant that I don't think the "official" specification cares. Everything within the quotes is a literal, except for quotes being doubled.

But certainly it can impact software trying to parse the CSV.
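To illustrate the "everything within the quotes is a literal" point, here is a small Python sketch using the standard `csv` module, which accepts line breaks and doubled quotes inside quoted fields. The sample text is made up; it is not taken from the file in this thread.

```python
import csv
import io

# Hypothetical CSV text: record 1 has a CR+LF inside a quoted field,
# record 2 has a doubled quote and a comma inside a quoted field.
raw = '"id","note"\r\n"1","line one\r\nline two"\r\n"2","say ""hi"", ok"\r\n'

# newline="" leaves the line endings untouched so the csv module sees them.
rows = list(csv.reader(io.StringIO(raw, newline="")))

# Flatten the embedded line breaks once the fields are correctly split.
flat = [[field.replace("\r\n", " ") for field in row] for row in rows]
for row in flat:
    print(row)
```

The parser returns three records even though the text has four physical lines, which is the behaviour a naive line-by-line reader lacks.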

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: How to remove CR/LF in between the line

#25 Post by foxidrive » 08 Mar 2013 08:45

dbenham wrote:I meant that I don't think the "official" specification cares.


I doubt the official spec allows random CR/LF anywhere in the fields in a CSV file.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: How to remove CR/LF in between the line

#26 Post by dbenham » 08 Mar 2013 10:05

foxidrive wrote:
dbenham wrote:I meant that I don't think the "official" specification cares.


I doubt the official spec allows random CR/LF anywhere in the fields in a CSV file.


From Wikipedia:

The huge variety among "CSV" formats has led to the assertion that there is no "CSV standard".[6][7] In common usage, almost any delimiter-separated text data may be referred to as a "CSV" file. Different CSV formats may not be compatible.

Nevertheless, RFC 4180 is an effort to formalize CSV. It defines the MIME type "text/csv", and CSV files that follow its rules should be very widely portable. Among its requirements:

- DOS-style lines that end with (CRLF) characters (optional for the last line)

- An optional header record (there is no sure way to detect whether it is present, so care is required when importing).

- Each record "should" contain the same number of comma-separated fields.

- Any field may be quoted (with double quotes). Fields containing a line-break, double-quote, and/or commas should be quoted. (If they are not, the file will likely be impossible to process correctly).

- A (double) quote character in a field must be represented by two (double) quote characters.

The format is simple and can be processed by most programs that claim to read CSV files. The exceptions are (a) programs may not support line-breaks within quoted fields, and (b) programs may confuse the optional header with data or interpret the first data line as an optional header.
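As a rough illustration of the quoting rules listed above, this Python sketch writes one record with the standard `csv` module, whose default dialect ends records with CR+LF, quotes only the fields that need it, and doubles embedded quotes. The row contents are invented for the example.

```python
import csv
import io

buf = io.StringIO(newline="")  # newline="" so CR+LF is written unchanged
writer = csv.writer(buf)       # default dialect: comma, double quote, CR+LF

# Hypothetical row: plain text, a comma, a double quote, and a line break.
writer.writerow(["1", "a,b", 'say "hi"', "two\r\nlines"])

encoded = buf.getvalue()
print(repr(encoded))
```

Only the three fields that contain a comma, a quote, or a line break end up quoted; the plain `1` is written bare.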

techno
Posts: 1
Joined: 09 Dec 2015 05:31

Re: How to remove CR/LF in between the line

#27 Post by techno » 09 Dec 2015 05:36

Hello,

Been looking into a similar problem and the above batch works a treat :-)

However, the file we export from our system uses | [PIPES] as the column delimiter, I've gotten as far as changing the following line:
set "noCommas=!line:,=!"

to:

set "noCommas=!line:^^|=!"

But it will not process the input file. The problem column is a notes column in our program that is free text for users, so I need to avoid using any delimiters that a user would use in standard text.

Any help / suggestions would be appreciated.

Many Thanks
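For comparison, here is the same field-counting idea as the batch file earlier in the thread, sketched in Python with the delimiter as a parameter. It is a rough port rather than a fix for the batch line above; `rejoin_by_field_count`, the 4-field layout, and the sample lines are all assumptions made for illustration.

```python
def rejoin_by_field_count(lines, num_fields, delimiter="|"):
    """Concatenate lines until a record holds num_fields fields.

    Like the batch version, this fails if the free-text column itself
    contains the delimiter, and it drops an incomplete trailing record.
    """
    records = []
    record = ""
    fields = 1  # N delimiters separate N+1 fields
    for line in lines:
        record += line
        fields += line.count(delimiter)
        if fields >= num_fields:
            records.append(record)
            record = ""
            fields = 1
    return records


# Hypothetical 4-field export where the notes column was split by a CR+LF.
broken = [
    "1|Smith|first part of note",
    "second part|done",
    "2|Jones|short note|done",
]
for rec in rejoin_by_field_count(broken, 4):
    print(rec)
```

The lines are expected without their trailing CR+LF, for example from `str.splitlines()`.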
