JSORT.BAT v4.2 - problems with german umlauts

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Savion
Posts: 5
Joined: 29 Jul 2022 01:35

#1 Post by Savion » 29 Jul 2022 02:11

Hello.

I have tested this nice script, but JSORT has problems with German umlauts.
If I convert a text file with this:

JSORT old.txt /p 12 /I /N /o new.txt
the new.txt has errors with German umlauts.

old.txt is an ANSI file. This is new.txt (ANSI), with errors:
[Attachment: 1.jpg]
If I convert new.txt to UTF-8:
[Attachment: 2.jpg]
Correct would be:
[Attachment: 3.jpg]
What is the right syntax for the output file?
I hope someone can help me.

Thanks.

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#2 Post by aGerman » 29 Jul 2022 06:06

Without trying to reproduce the behavior yet - just an idea: What happens if you place a

chcp 1252
before calling JSORT?

Steffen

Savion
Posts: 5
Joined: 29 Jul 2022 01:35

#3 Post by Savion » 29 Jul 2022 13:22

Hello Steffen.

I have tested it.
Same error: all the umlauts are still wrong.
I am out of ideas.

Any other suggestion?

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#4 Post by aGerman » 30 Jul 2022 04:42

I'm afraid fixing this would require refactoring JSORT, e.g. reading and writing files in the JScript section instead of the Batch section of this hybrid script.

Steffen

Savion
Posts: 5
Joined: 29 Jul 2022 01:35

#5 Post by Savion » 30 Jul 2022 07:15

Thanks for the information, Steffen.

Do you know a program or script that can correct the umlauts?
Then I would run it over the txt files.

miskox
Posts: 553
Joined: 28 Jun 2010 03:46

#6 Post by miskox » 30 Jul 2022 08:15

Can't SORT do the job? I see that only /N could be a problem. But if you don't need it... or you can rearrange the input file.

Saso

Savion
Posts: 5
Joined: 29 Jul 2022 01:35

#7 Post by Savion » 30 Jul 2022 08:34

Hello miskox.
JSORT can sort; that is not the problem. The only problem is the German umlauts in the generated new.txt.

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#8 Post by aGerman » 30 Jul 2022 09:37

I guess Saso refers to the SORT command that ships with Windows anyways.

Steffen

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#9 Post by aGerman » 30 Jul 2022 11:07

I found a relatively simple fix:
Replace all occurrences of
WScript.Echo
with
WScript.StdOut.WriteLine
in JSORT.BAT
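
A minimal sketch of that search-and-replace, written in Python rather than batch purely for illustration. The function names and the output file name are hypothetical and not part of JSORT; the sketch works on raw bytes so that no encoding of the script itself is assumed, and it writes a patched copy instead of overwriting the original.

```python
from pathlib import Path

# The two strings named in the fix above.
OLD = b"WScript.Echo"
NEW = b"WScript.StdOut.WriteLine"


def patch_bytes(data: bytes) -> tuple[bytes, int]:
    """Return the patched content and the number of replacements made."""
    count = data.count(OLD)
    return data.replace(OLD, NEW), count


def patch_file(src: str, dst: str) -> int:
    """Write a patched copy of src to dst; src itself is left untouched."""
    patched, count = patch_bytes(Path(src).read_bytes())
    Path(dst).write_bytes(patched)
    return count
```

Calling `patch_file("JSORT.BAT", "JSORT_patched.BAT")` would return the number of occurrences replaced; a result of 0 suggests the script was already patched or uses a different spelling.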

Steffen

miskox
Posts: 553
Joined: 28 Jun 2010 03:46

#10 Post by miskox » 30 Jul 2022 12:17

aGerman wrote:
30 Jul 2022 09:37
I guess Saso refers to the SORT command that ships with Windows anyways.

Steffen
Yes Steffen, you are right: I meant SORT.exe, which is part of the Windows OS.

Saso

Savion
Posts: 5
Joined: 29 Jul 2022 01:35

#11 Post by Savion » 30 Jul 2022 21:10

Steffen - YES THIS IS IT! :D
THANKS, THANKS, THANKS!

No more problems with umlauts.

Only replace
WScript.Echo
with
WScript.StdOut.WriteLine

Have a beautiful Sunday, Steffen.

miskox
Posts: 553
Joined: 28 Jun 2010 03:46

#12 Post by miskox » 31 Jul 2022 10:12

Looks like Dave has some work to do.

Thanks Steffen.

Saso

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#13 Post by aGerman » 01 Aug 2022 07:26

Not sure if Dave will still be maintaining this script. I should probably add this workaround to the original topic since the last commenter in 2019 seemingly faced the same problem.

Steffen

Sponge Belly
Posts: 216
Joined: 01 Oct 2012 13:32
Location: Ireland

#14 Post by Sponge Belly » 06 Aug 2022 05:59

aGerman wrote:
Replace all occurrences of WScript.Echo with WScript.StdOut.WriteLine
Thanks for the tip, Steffen! :)

But can you explain why replacing WScript.Echo with WScript.StdOut.WriteLine solves the umlaut problem?

- SB

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#15 Post by aGerman » 06 Aug 2022 06:54

Quick investigation.

Script:

@if (0)==(0) echo off
<%1 >CONOUT$ cscript //nologo //e:jscript "%~fs0"
pause
goto :eof @end

var ch = WScript.StdIn.ReadLine();
WScript.Echo(ch.charCodeAt(0).toString(16));
WScript.Echo(ch);
WScript.StdOut.WriteLine(ch);
Precondition for the output shown below:
ACP: 1252
OEMCP: 850
A test file containing only the byte 0xE9

Known character representation for byte 0xE9:
é in my ACP
Ú in my OEMCP

Output if the test file is dropped onto the script:

e9
é
Ú
Drücken Sie eine beliebige Taste . . .
Conclusion:
- WScript.StdIn.ReadLine reads byte 0xE9 without any charset conversion.
- WScript.Echo performs a conversion from ACP to OEMCP. The byte becomes 0x82, the value that represents é in CP 850. Redirected to a file and interpreted in the ACP, byte 0x82 would be a "single low-9 quotation mark" character (‚). This can be verified by replacing CONOUT$ with a file name in the script.
- WScript.StdOut.WriteLine passes the original byte value through unchanged. It appears as Ú in CP 850. Redirected to a file and interpreted in the ACP, it is still é.
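
The byte arithmetic in this conclusion can be reproduced outside Windows Script Host. The following is a small sketch in Python (not the language of JSORT), using Python's built-in `cp1252` and `cp850` codecs to stand in for the ACP and OEMCP of the test above. It only models the conversions as described here; it does not call or emulate WScript itself, and the variable names are illustrative.

```python
# The single byte contained in the test file.
raw = b"\xe9"

# How each code page displays that byte.
as_acp = raw.decode("cp1252")  # é (U+00E9) in the ACP
as_oem = raw.decode("cp850")   # Ú (U+00DA) in the OEMCP

# ACP -> OEMCP conversion, the behavior attributed to WScript.Echo above:
# the character stays é, but the byte that encodes it changes.
converted = as_acp.encode("cp850")

# Redirected to a file and read back as ANSI, the converted byte is a
# different character: the single low-9 quotation mark (U+201A).
misread = converted.decode("cp1252")

# Pass-through, the behavior attributed to WScript.StdOut.WriteLine above:
# the byte is unchanged, so an ANSI reader still sees é.
passthrough = raw

# Print hex values only, to stay independent of the terminal's encoding.
print(raw.hex(), converted.hex(), passthrough.hex())
```

The converted byte is 0x82 while the pass-through byte stays 0xE9, which matches why the redirected output file is only correct with WScript.StdOut.WriteLine.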

Steffen
