Quoted from there: viewtopic.php?f=3&t=7703&p=51312#p51310
I have tested your CONVERTCP utility and read the source code.
I saw no error, but I noticed that your tool does more than just convert between codepages: it also approximates characters that are not in the target codepage (which is not that bad, because cmd.exe does the same, but I would mention it somewhere).
For example, I created a file "string.txt" with this content (I hope it is not corrupted), encoded using UTF-8:
If you convert it to codepage 850 you get:
As far as I know, the recommended behaviour in such cases is to use the REPLACEMENT CHARACTER, a question mark, a square, or a question mark in a square.
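To make the difference concrete, here is a minimal Python sketch of the "replacement" behaviour described above. It is neither CONVERTCP nor the Windows API, just an illustration. The sample string and the function name to_codepage_replace are made up for this example, and the string is not the content of string.txt.

```python
def to_codepage_replace(text, codepage="cp850"):
    """Encode text to a legacy codepage, substituting '?' for every
    character the target codepage cannot represent (no approximation)."""
    return text.encode(codepage, errors="replace")


if __name__ == "__main__":
    # Hypothetical sample: C-caron, s-caron and the euro sign are not in cp850;
    # e-acute is (byte 0x82).
    sample = "Čaša é €"
    print(to_codepage_replace(sample))
```

With errors="strict" (the default) the same call would raise UnicodeEncodeError instead. Neither mode approximates Č as C the way the best-fit conversion does.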
This is by design and actually wanted behavior.
int WideCharToMultiByte(
  _In_      UINT    CodePage,
  _In_      DWORD   dwFlags,
  _In_      LPCWSTR lpWideCharStr,
  _In_      int     cchWideChar,
  _Out_opt_ LPSTR   lpMultiByteStr,
  _In_      int     cbMultiByte,
  _In_opt_  LPCSTR  lpDefaultChar,
  _Out_opt_ LPBOOL  lpUsedDefaultChar
);
lpDefaultChar
For the CP_UTF7 and CP_UTF8 settings for CodePage, this parameter must be set to NULL. Otherwise, the function fails with ERROR_INVALID_PARAMETER.
lpUsedDefaultChar
For the CP_UTF7 and CP_UTF8 settings for CodePage, this parameter must be set to NULL. Otherwise, the function fails with ERROR_INVALID_PARAMETER.
That means at least for UTF-7 and UTF-8 I'm not even able to define a default character.
I noted this behavior in my first reply to Dave: viewtopic.php?f=3&t=7570#p50285
2) The reason why I don't even want to work around it is that the utility was requested by miskox. He told me via email:
I 'patched' original .exe to make another .exe version with NOCSZ (that is NOČŠŽ) which replaces ČŠŽĐĆ characters with ordinary CZSDC - depending on the input code page.
That's why I called it "wanted behavior".
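For readers who want to see what such a substitution amounts to, here is a rough Python sketch. It is not miskox's patched .exe and not what CONVERTCP does internally. The table NOCSZ_MAP and the function strip_csz are hypothetical names, and the lowercase entries are my own assumption.

```python
# Hypothetical mapping for illustration only: each accented letter is
# replaced by its plain ASCII base letter.
NOCSZ_MAP = str.maketrans("ČŠŽĐĆčšžđć", "CSZDCcszdc")


def strip_csz(text):
    """Replace the accented letters listed in NOCSZ_MAP with plain ASCII letters."""
    return text.translate(NOCSZ_MAP)


if __name__ == "__main__":
    plain = strip_csz("Čaša")
    print(plain)
    # After the substitution, a strict conversion to cp850 no longer fails.
    print(plain.encode("cp850"))
```

The best-fit conversion gives a similar result for these letters without an explicit table, which is why it suits this use case.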