CONVERTCP.exe - Convert text from one code page to another

aGerman
Expert
Posts: 3628
Joined: 22 Jan 2010 18:01
Location: Germany

Re: CONVERTCP.exe - Convert text from one code page to another

#61 Post by aGerman » 27 Apr 2018 00:43

@Sqashman

I hope you found my former reply to your posts. I have a question regarding the encoding you mentioned. What is the difference between EBCIDIC and EBCDIC? I only know the latter, and I know that several EBCDIC code pages are supported on Windows. See viewtopic.php?f=3&t=7570&start=15#p50475

Steffen

Squashman
Expert
Posts: 4106
Joined: 23 Dec 2011 13:59

#62 Post by Squashman » 27 Apr 2018 00:58

aGerman wrote:
27 Apr 2018 00:43
@Sqashman

I hope you found my former reply to your posts. I have a question regarding the encoding you mentioned. What is the difference between EBCIDIC and EBCDIC? I only know the latter, and I know that several EBCDIC code pages are supported on Windows. See viewtopic.php?f=3&t=7570&start=15#p50475

Steffen

Spelling error.

#63 Post by aGerman » 27 Apr 2018 01:50

Yes the u in Squashman was missing, sorry :lol:

Seriously, could you use any of the EBCDIC encodings for direct conversion from other encodings? I'm not sure whether IBM-1047 is really the same as code page ID 1047 on Windows, although the description suggests it is. CONVERTCP always leaves line endings as they are. I don't know if that would cause a problem for you.
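
To illustrate what such a direct conversion amounts to, here is a minimal Python sketch. It is not how CONVERTCP works internally. It assumes Python's built-in `cp037` codec as a stand-in EBCDIC code page, because IBM-1047 does not appear to ship with Python's standard codecs, and `convert_code_page` is a made-up name.

```python
def convert_code_page(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from one code page and re-encode them in another.

    No line-ending translation happens here: control characters are
    mapped by the codecs like any other character.
    """
    return data.decode(src).encode(dst)


# "HELLO" in EBCDIC (cp037 used as an assumed stand-in for IBM-1047).
ebcdic = "HELLO".encode("cp037")
print(ebcdic.hex())                                  # c8c5d3d3d6
print(convert_code_page(ebcdic, "cp037", "utf-8"))   # b'HELLO'
```

Whether untouched line endings are a problem depends on what the receiving system expects, which the sketch does not try to answer.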

FWIW You seem to have a night shift :? I didn't expect to get an answer that early.

Steffen

#64 Post by Squashman » 27 Apr 2018 08:16

Yikes. Didn't even realize you spelled my name wrong. :lol: I was just referring to my spelling mistake on EBCDIC.

I couldn't sleep last night. Happens to me a couple days a week.

#65 Post by aGerman » 27 Apr 2018 09:32

Squashman wrote:
27 Apr 2018 08:16
I was just referring to my spelling mistake on EBCDIC.

Yes I understood. I was just joking around :wink:

Squashman wrote:
27 Apr 2018 08:16
I couldn't sleep last night. Happens to me a couple days a week.

:(
Happens to me every day but that's because it's my nature. Unfortunately I'm an owl and 3-5 hours of sleep is normal for me. Only get enough sleep on the weekend :roll:

Steffen

#66 Post by aGerman » 27 Apr 2018 14:15

As Carlos said, option /n made CONVERTCP rather overcomplicated to use. I removed this option in v4.2 but kept the increased limit of 511 MB for certain code pages. The decision whether to use threading is now made automatically.

Virustotal scans of version 4.2:
x86: https://www.virustotal.com/en/file/6560 ... /analysis/
x64: https://www.virustotal.com/en/file/a71f ... /analysis/

Steffen

#67 Post by aGerman » 29 Apr 2018 05:23

Conversion to UTF-32 without threading caused a memory leak and CONVERTCP crashed. Fixed with v4.3.

Virustotal scans of version 4.3:
x86: https://www.virustotal.com/en/file/6351 ... /analysis/
x64: https://www.virustotal.com/en/file/fd06 ... /analysis/
Currently 2 of 67 AV engines flag the 32-bit tool, which is a false positive. I don't know why. The 64-bit tool, built from the same source code, passes all tests.
EDIT: They changed their minds. The 32-bit version now passes the tests, too.

Steffen

#68 Post by aGerman » 12 May 2018 11:30

Version 5.0 supports passing aliases (such as MIME types or IANA numbers) rather than only code page IDs. That enables you to pass e.g. UTF-8 instead of 65001.
A list of supported aliases is available on SourceForge, too.
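
The alias idea boils down to a lookup from a name to a numeric code page ID. The following Python sketch shows one possible way to do that. The table is a tiny hypothetical excerpt using well-known Windows code page IDs, and `resolve_code_page` is a made-up name. The real alias list is the one published with CONVERTCP.

```python
# Hypothetical excerpt; not CONVERTCP's actual lookup table.
ALIASES = {
    "utf-8": 65001,
    "utf-16": 1200,
    "windows-1252": 1252,
    "iso-8859-1": 28591,
}


def resolve_code_page(name: str) -> int:
    """Return a code page ID for a numeric ID string or a known alias."""
    key = name.strip().lower()
    if key.isdigit():
        # Plain IDs such as "65001" are passed through unchanged.
        return int(key)
    try:
        return ALIASES[key]
    except KeyError:
        raise ValueError(f"unknown code page or alias: {name!r}") from None


print(resolve_code_page("UTF-8"))   # 65001
print(resolve_code_page("65001"))   # 65001
```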

Virustotal scans of version 5.0:
x86: https://www.virustotal.com/en/file/a1c3 ... /analysis/
x64: https://www.virustotal.com/en/file/cee3 ... /analysis/

Steffen

#69 Post by aGerman » 14 Jun 2018 10:13

Version 5.1 behaves the same as version 5.0. I only changed the way that strings of the internal lookup table are saved in order to shrink the file size of the binaries.

Virustotal scans of version 5.1:
x86: https://www.virustotal.com/en/file/9612 ... /analysis/
x64: https://www.virustotal.com/en/file/1c95 ... /analysis/

Steffen

carlos
Expert
Posts: 485
Joined: 20 Aug 2010 13:57
Location: Chile

#70 Post by carlos » 15 Jun 2018 19:05

Thanks aGerman, I got the latest version.
The only thing I did not like was having to download the source code file, the readme and the pdf file of the codes separately. I would have liked a zip file that had everything necessary.

#71 Post by aGerman » 16 Jun 2018 05:26

Most of the projects at SourceForge have this structure. People who only want to use the tool are not interested in the source code or in a lot of verbose information. That way users can download only the files they need.

Steffen

#72 Post by aGerman » 02 Sep 2018 13:30

Carlos' concerns about false positives from antivirus software in his recent thread ...
viewtopic.php?f=3&t=8813&p=57913#p57911
... prompted me to write a few sentences about the VirusTotal links for CONVERTCP.

- I wrote the code, so I already know that the program isn't malicious. Users who don't trust me can read the source code and compile it on their own. But plenty of people neither understand the code nor are able to compile it. They have no choice but to use the uploaded binaries. In this case, however, the only thing the VirusTotal scans show is whether some engines report a false positive for a harmless program.
- I discovered a few ways to avoid false positives. Adding extended file properties is one, having plain-text sequences in the binary file is another, and changing variable types to wider ones also works every now and then *). I could have signed the tool with a purchased certificate. But firstly, I'm not willing to spend money on a certificate, and secondly, I don't see the point in terms of AV perception. Even malware could be signed that way.
- In the end, VirusTotal is more of a service for testing the AV engines. And in the case of CONVERTCP, it's a service for testing the engines for false positives. Don't rely on VirusTotal if you don't trust me.
The reason I run the tests on VirusTotal is that I want to get an idea of how often users will be bothered by their antivirus software. If it's too often, I'll try to write the source code in a slightly different way and test again.
So you are wondering why I always post the links? Um ... because I want to wrap you up in a warm and fluffy blanket? Haha, no, that's not the reason. As I said, I didn't use code signing. The actual purpose of code signing would have been to confirm that I'm the author and that the tool wasn't changed before you downloaded it. The links point to the analysis pages for the binaries I uploaded. That means if you upload the tool to VirusTotal and get redirected to the same page, you know you have the unchanged tool that I authored. Thus, it's just to make it easy for you to validate your downloads.
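
The same validation can be done locally without uploading anything, on the assumption that the long hexadecimal part of a VirusTotal file link is the SHA-256 hash of the analyzed file. The Python sketch below computes that hash for a downloaded file and compares it with a published value. The function names are made up, and because the hashes in the links above are truncated, the expected value has to be supplied by the caller.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 16) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a local file against a hash copied from a trusted source."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

A match only shows that the file is byte-for-byte the one that was analyzed. It says nothing more about the file than the analysis page does.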

*) It might be worth pausing for a second at this point. The techniques I was talking about don't change the behavior of the program in any way. They don't make a malicious program harmless and they don't make a harmless program malicious. They just calm down a nervous AV in this case. If I can use these techniques to avoid false positives, what about developers of malware? Wouldn't they be able to prevent their malicious programs from being detected, at least to a certain extent? If you think about that, how much do you still trust AV software? By far the best antivirus is the one between chair and keyboard! No 3rd party AV will ever help if you thoughtlessly click on any available link or email attachment, if you download programs from suspicious sites, if you always choose the "default" installation or click on "Next" without reading what it means ...

Steffen

#73 Post by aGerman » 04 Mar 2019 11:44

During some tests in another C project I found that the standard input stream doesn't behave like other file streams.
1) Even if the stream still has enough characters to read, it frequently happens that the ReadFile function returns before the buffer has been completely filled.
2) The standard input stream does not support stepping backwards.
This influences how redirected UTF-8 streams are processed. In fact, previous versions may not read redirected streams correctly if the console code page was changed using CHCP 65001. This bug has been fixed in version 5.2. Besides that, some minor optimizations were made.
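
Both observations can be handled generically, as this Python sketch shows. It is not CONVERTCP's C code. The assumption is that a short read may end in the middle of a multi-byte UTF-8 sequence, so the reader either loops until the buffer is full or decodes incrementally instead of seeking backwards. `TrickleStream` is a made-up stand-in for a pipe that returns fewer bytes than requested.

```python
import codecs
import io


def read_fully(stream, size: int) -> bytes:
    """Keep reading until `size` bytes have arrived or the stream ends."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:          # end of stream
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def decode_chunks(chunks, encoding: str = "utf-8") -> str:
    """Decode chunks that may split a multi-byte sequence, without seeking."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b"", final=True))  # flush; raises if truncated
    return "".join(parts)


class TrickleStream:
    """Hypothetical stream that never returns more than `max_chunk` bytes."""

    def __init__(self, data: bytes, max_chunk: int = 3):
        self._buffer = io.BytesIO(data)
        self._max_chunk = max_chunk

    def read(self, size: int) -> bytes:
        return self._buffer.read(min(size, self._max_chunk))


print(read_fully(TrickleStream(b"abcdefghij"), 8))   # b'abcdefgh'
print(decode_chunks([b"\xe2", b"\x82", b"\xac"]))    # €
```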

Virustotal scans of version 5.2:
x86: https://www.virustotal.com/en/file/2547 ... /analysis/
x64: https://www.virustotal.com/en/file/1b4d ... /analysis/

Steffen

#74 Post by aGerman » 17 Mar 2019 09:54

I was acting on a proposal that I received on SourceForge.
The default behavior of CONVERTCP is that it silently replaces all characters that are invalid or that don't exist in the passed code pages with either a replacement character or with an approximated ASCII character which looks similar but has different semantics. The new option /v (for "verify") doesn't change this behavior but CONVERTCP returns 1 instead of 0 if at least one character has been replaced with a character that doesn't match the same Unicode code point.
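
The idea behind such a verification flag can be sketched in Python as follows. This is an illustration under assumptions, not the tool's implementation: Python's "replace" error handler only substitutes a replacement character and does no look-alike approximation, and `convert_with_check` is a made-up name. The return code mirrors the description above: 0 for a clean conversion, 1 if anything had to be replaced.

```python
def convert_with_check(data: bytes, src: str, dst: str):
    """Convert between encodings; report 1 if any character was replaced."""
    replaced = False
    try:
        text = data.decode(src)                      # strict: invalid input raises
    except UnicodeDecodeError:
        text = data.decode(src, errors="replace")    # substitutes U+FFFD
        replaced = True
    try:
        out = text.encode(dst)                       # strict: unmappable raises
    except UnicodeEncodeError:
        out = text.encode(dst, errors="replace")     # substitutes "?"
        replaced = True
    return out, (1 if replaced else 0)


print(convert_with_check("é".encode("utf-8"), "utf-8", "latin-1"))   # (b'\xe9', 0)
print(convert_with_check("é€".encode("utf-8"), "utf-8", "latin-1"))  # (b'\xe9?', 1)
```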

In most cases I have to rely on the conversion routines of the WinAPI, which have restrictions and rules for different encodings. Since I have only little experience with DBCSs as used in CJK environments, I'd really like those of you who are able to run some tests with Chinese, Japanese, or Korean code pages to give a little feedback on how CONVERTCP behaves.


Virustotal scans of version 6.0:
x86: https://www.virustotal.com/en/file/769f ... /analysis/
x64: https://www.virustotal.com/en/file/d280 ... /analysis/

Steffen

#75 Post by aGerman » 19 Mar 2019 16:26

Fixed inconsistent processing of Byte Order Marks read from redirected streams.
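
For readers unfamiliar with Byte Order Marks, this Python sketch shows one common way to recognize them at the start of a stream. It is only an assumption about the general technique, not a description of the fix. `detect_bom` is a made-up name, and the BOM constants come from Python's `codecs` module.

```python
import codecs

# UTF-32 must be checked before UTF-16: the UTF-32 LE BOM (FF FE 00 00)
# starts with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head: bytes):
    """Return (encoding name, BOM length) for the first bytes of a stream."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name, len(bom)
    return None, 0


print(detect_bom(b"\xef\xbb\xbfhi"))   # ('utf-8', 3)
print(detect_bom(b"\xff\xfea\x00"))    # ('utf-16-le', 2)
print(detect_bom(b"hi"))               # (None, 0)
```

Since a redirected stream cannot be rewound, the bytes inspected here have to be kept and handed on to the converter rather than read a second time.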

Virustotal scans of version 6.1:
x86: https://www.virustotal.com/gui/file/267 ... /detection
x64: https://www.virustotal.com/gui/file/7be ... /detection

Steffen
