CONVERTCP.exe - Convert text from one code page to another

miskox
Posts: 553
Joined: 28 Jun 2010 03:46

Re: CONVERTCP.exe - Convert text from one code page to another

#31 Post by miskox » 26 Sep 2017 12:25

@aGerman and @Dave:

As Steffen wrote, he made CONVERTCP for me. I had an old 16-bit 'pure' DOS .exe that I used almost daily to convert between CP852 and CP1250 (and vice versa), but of course these 16-bit .exe files no longer work on 64-bit systems. Steffen checked the source file and decided it was easier for him to write the program from scratch than to try to rebuild it. I cannot thank him enough.

Dave's JREPL is an excellent tool. It is just too complex for my needs (as Steffen also mentioned). Also, as Steffen mentioned, I have very large .txt files to process (a few hundred megabytes).

Thank you.
Saso

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#32 Post by aGerman » 30 Dec 2017 08:28

Recently I found a silly little bug. In the past I supported options with a leading dash (in addition to a leading forward slash). It was an undocumented "feature" which, in the end, was a failure: file names with a leading dash were erroneously recognized as options. Fixed in version 1.5.

Virustotal:
x86: https://www.virustotal.com/en/file/cfbd ... /analysis/
x64: https://www.virustotal.com/en/file/1113 ... /analysis/

Steffen

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: CONVERTCP.exe - Convert text from one code page to another

#33 Post by dbenham » 30 Dec 2017 08:53

Out of curiosity, how did you solve the problem :?:

Did you remove the "feature" of - options?

Or did you come up with a mechanism for differentiating - option from file name beginning with - ? (Perhaps differentiating quoted vs. unquoted)


Dave

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#34 Post by aGerman » 30 Dec 2017 11:03

Yes, I removed it, Dave. It would have been possible to check for the existence of the file argument that follows option /i. Leading dashes could also have been permitted after option /o. But any additional check would decrease the performance of the tool. Thus, I decided to remove it and keep it simple. I don't think it will cause a lot of backward compatibility problems because it wasn't the documented way to pass options.
FWIW, arguments in C arrive without surrounding quotes (similar to the WScript.Arguments items in the WSH). To differentiate quoted and unquoted arguments I would have to parse the command line myself. Possible but ... no :lol:
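
To illustrate the "keep it simple" route, here is a minimal Python sketch (not CONVERTCP's actual C code) of a parser in which only a leading forward slash marks an option, so a file name such as -issues.csv is taken as an ordinary value. The function name parse_args and the returned dictionary layout are hypothetical; only the /i and /o options and the argument order come from the usage shown in this thread.

```python
def parse_args(argv):
    """Parse '<in_cp> <out_cp> /i <infile> /o <outfile>' style arguments.

    Hypothetical sketch: only '/i' and '/o' are options, so file names
    that start with '-' are never mistaken for options.
    """
    result = {"in_cp": None, "out_cp": None, "infile": None, "outfile": None}
    positional = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ("/i", "/o"):
            if i + 1 >= len(argv):
                raise ValueError(f"option {arg} needs a file name")
            key = "infile" if arg == "/i" else "outfile"
            result[key] = argv[i + 1]  # taken verbatim, even if it starts with '-'
            i += 2
        else:
            positional.append(arg)
            i += 1
    if len(positional) != 2:
        raise ValueError("expected input and output code page IDs")
    result["in_cp"], result["out_cp"] = (int(p) for p in positional)
    return result


args = parse_args(["65001", "1250", "/i", "-issues.csv", "/o", "tmpout.csv"])
print(args["infile"])  # -issues.csv
```

As in C's argv, the list passed in here already has any surrounding quotes stripped, so the parser cannot tell a quoted file name from an unquoted one.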

Steffen

rahulk
Posts: 1
Joined: 24 Jan 2018 06:31

Re: CONVERTCP.exe - Convert text from one code page to another

#35 Post by rahulk » 24 Jan 2018 06:35

Hi, please let me know how to proceed to convert. Should I use cmd? What do I need to do then? Please explain in detail. Suppose I have a file on my system at C:\abc\abc.txt; what do I have to do now?

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: CONVERTCP.exe - Convert text from one code page to another

#36 Post by Squashman » 24 Jan 2018 09:49

rahulk wrote:
24 Jan 2018 06:35
Hi, please let me know how to proceed to convert. Should I use cmd? What do I need to do then? Please explain in detail. Suppose I have a file on my system at C:\abc\abc.txt; what do I have to do now?
The very first post of this thread shows you how to use it. Read that, try something, and then come back with a specific question.

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#37 Post by aGerman » 24 Jan 2018 11:23

rahulk wrote:
24 Jan 2018 06:35
Please let me know how to proceed to convert
First of all, what character encoding does your file currently have, and to what character encoding do you want to convert it? I can't help without this information.

Steffen

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#38 Post by aGerman » 01 Feb 2018 10:51

Lately I wrote a little cross-platform library in pure C to convert between different Unicode charsets. In doing so I came across UTF-32, for which Windows doesn't provide any conversion API functions. Since I had already written my own functions, I thought I could also add UTF-32 support to CONVERTCP. Some bigger changes were needed in the core functions of the source code, which is why I increased the major version number to 2.
Use codepage ID 12000 for UTF-32 Little Endian and 12001 for UTF-32 Big Endian.
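
As a quick illustration of what the two byte orders mean, this Python sketch encodes one character outside the Basic Multilingual Plane both ways. It uses Python's built-in utf-32-le and utf-32-be codecs rather than CONVERTCP; the dictionary that maps the IDs 12000 and 12001 to those codec names is just shorthand for this example.

```python
# Code page IDs from the post, mapped to Python codec names (illustration only).
UTF32_CODECS = {12000: "utf-32-le", 12001: "utf-32-be"}

text = "\U0001F600"  # U+1F600, a character outside the BMP

le = text.encode(UTF32_CODECS[12000])
be = text.encode(UTF32_CODECS[12001])

# Every code point takes exactly four bytes; only the byte order differs.
print(le.hex(" "))  # 00 f6 01 00
print(be.hex(" "))  # 00 01 f6 00
```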

While doing some tests I also found and fixed a bug that could occur while reading UTF-16 BE input that contains surrogate pairs and is larger than 1 MB.
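
The post does not say what exactly went wrong, so the following only illustrates the general class of problem and is an assumption on my part: when a large file is read in fixed-size chunks, a UTF-16 surrogate pair can straddle a chunk boundary, and a chunk decoded in isolation is then invalid. A minimal Python sketch using the standard incremental decoder, which carries the dangling half over to the next chunk (the tiny chunk size is chosen only to force the split):

```python
import codecs

text = "A\U0001F600B"            # the middle character needs a surrogate pair
data = text.encode("utf-16-be")  # 2 + 4 + 2 = 8 bytes

chunk_size = 4                   # the first chunk ends right after the high surrogate
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

decoder = codecs.getincrementaldecoder("utf-16-be")()
parts = [decoder.decode(chunk) for chunk in chunks]
parts.append(decoder.decode(b"", final=True))  # raises if the input ends mid-character

decoded = "".join(parts)
print(decoded == text)  # True
```

Decoding the first chunk on its own with chunks[0].decode("utf-16-be") raises a UnicodeDecodeError, which is the failure mode a chunked reader has to guard against.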

Virustotal scans of version 2.0:
x86: https://www.virustotal.com/en/file/964c ... /analysis/
x64: https://www.virustotal.com/en/file/66ce ... /analysis/

Steffen

lbriza
Posts: 1
Joined: 11 Apr 2018 05:58

Re: CONVERTCP.exe - Convert text from one code page to another

#39 Post by lbriza » 11 Apr 2018 06:41

Hello,
I have discovered a bug in the CONVERTCP.exe utility. I converted the following file, and the last several lines of the output file are totally different from the original file; please take a look.
This output was generated by the command:

Code: Select all

convertcp.exe 65001 1250 /i .\issues-03-2018qqqq.csv /o tmpout.csv
You can download an archive with both files here (the size exceeded the allowed limit):

https://emerson.sendthisfile.com/c.jsp? ... Crx0yfTx6T

Note: These files will expire in 14 days.


best regards
Lubomir

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#40 Post by aGerman » 11 Apr 2018 10:18

Thank you very much for your feedback Lubomir!
I was able to reproduce this issue. I'll get back with a bugfix as soon as possible.

Steffen

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#41 Post by aGerman » 11 Apr 2018 12:35

Bug fixed :)

Virustotal scans of version 2.1:
x86: https://www.virustotal.com/en/file/a313 ... /analysis/
x64: https://www.virustotal.com/en/file/b9ae ... /analysis/

For those of you that are interested in the technical reason ...
Output streams are buffered by the operating system. That's something that I already knew. But since the failure never occurred in my tests, I thought that the buffer was flushed at the latest when the thread function terminates. Obviously I was wrong. It took a while to find the reason, but the fix is quite simple. It's just a call to the FlushFileBuffers API function at the end of the thread function. EDIT: NOPE, THIS DID NOT SOLVE THE PROBLEM

So thanks again Lubomir! Much appreciated indeed. Developers need people like you who report bugs rather than silently move on to the next program they find :D

Steffen

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: CONVERTCP.exe - Convert text from one code page to another

#42 Post by dbenham » 11 Apr 2018 21:58

aGerman wrote:
11 Apr 2018 12:35
So thanks again Lubomir! Much appreciated indeed. Developers need people like you who report bugs rather than silently move on to the next program they find :D
Indeed - Well said :!:

miskox
Posts: 553
Joined: 28 Jun 2010 03:46

Re: CONVERTCP.exe - Convert text from one code page to another

#43 Post by miskox » 12 Apr 2018 02:45

Thanks again Steffen!

Though I never had this problem, it is good to have a new version.

The supplied test file has 40,000+ lines, and my test file had more. Strange that I never encountered the problem.

Saso

P.S.: Steffen: maybe you could add a date below your name in the first post when you changed the file. Release notes are way down the post. So it would look something like this:

Code: Select all

Steffen

(updated 11-apr-2018)
convertcp_v2.1.zip
    (86.46 KiB) Downloaded 5 times


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#44 Post by aGerman » 12 Apr 2018 03:04

Yeah there must have been something magic in the file of Lubomir :lol:

Sure I can add the date somewhere near the link. Although there is already a list of release notes at the end of the initial post as you mentioned.

Steffen

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#45 Post by aGerman » 14 Apr 2018 07:07

Just one thing that might be worth noting:

I just discovered that the outputs of CONVERTCP and JREPL (as well as of various text editors) still differ when converting Lubomir's file.
CONVERTCP does not change line endings automatically. E.g. things like double line feeds (LF LF) can be found in Lubomir's file. CONVERTCP leaves it as LF LF while other software may automatically convert it to CR LF CR LF.

If software changes the line ending then your data could get corrupted. Typical example:
If you wrap the line in an Excel cell using [Alt]+[Enter] and export this data as CSV, you'll find the wrap as a single LF (while the end of the row is CR LF). It would be fatal if this single LF were converted to CR LF.
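
To make the point concrete, here is a small Python sketch (not CONVERTCP itself) that re-encodes bytes from UTF-8 to CP1250 without touching line endings, next to a variant that normalizes every line break to CR LF the way some tools do. The sample row is made up: a CSV record whose quoted cell contains a bare LF and whose row ends in CR LF.

```python
def convert_preserving(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode only; every CR and LF stays exactly where it was."""
    return data.decode(src).encode(dst)


def convert_normalizing(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode and rewrite every line break as CR LF."""
    text = data.decode(src)
    text = text.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "\r\n")
    return text.encode(dst)


# Hypothetical CSV row: the quoted cell holds a bare LF, the row ends in CR LF.
row = 'id;"first line\nsecond line";š\r\n'.encode("utf-8")

kept = convert_preserving(row, "utf-8", "cp1250")
changed = convert_normalizing(row, "utf-8", "cp1250")

print(kept.count(b"\r\n"))     # 1
print(changed.count(b"\r\n"))  # 2
```

After normalization a reader can no longer tell the in-cell line wrap from the end of the row, which is exactly the corruption described above.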

Steffen
