Detecting two bytes and replacing them by one
Posted: 22 Jul 2015 02:38
In my directory tree there are many *.txt files containing UTF-8 encoded text (without a BOM marker).
The text inside such a file may contain UTF-8 encoded German umlauts, each of which consists of TWO bytes.
Example:
xC3xB6 for o Umlaut = "ö"
Now I want to convert all these occurrences in all the text files from two-byte UTF-8 encoding to one-byte ISO-8859-1 (or ISO-8859-15) encoding.
So, referring to the example above, hex xC3xB6 should be replaced by hex xF6
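To make the intended mapping concrete, here is a minimal sketch in Python rather than batch, so it is not the one-liner being asked for. It assumes every file is valid UTF-8 and contains only characters that exist in ISO-8859-1. The names `utf8_to_latin1` and `convert_tree` are made up for illustration.

```python
from pathlib import Path


def utf8_to_latin1(data: bytes) -> bytes:
    """Decode UTF-8 bytes and re-encode them as ISO-8859-1.

    Raises UnicodeDecodeError if the input is not valid UTF-8 and
    UnicodeEncodeError if a character has no ISO-8859-1 equivalent.
    """
    return data.decode("utf-8").encode("iso-8859-1")


def convert_tree(root: str) -> int:
    """Convert every *.txt file below root in place; return the file count."""
    count = 0
    for path in Path(root).rglob("*.txt"):
        path.write_bytes(utf8_to_latin1(path.read_bytes()))
        count += 1
    return count


if __name__ == "__main__":
    # The two bytes C3 B6 ("ö" in UTF-8) become the single byte F6.
    print(utf8_to_latin1(b"sch\xc3\xb6n"))
```

Running the conversion a second time on an already converted file would typically fail, because a lone xF6 byte is not valid UTF-8. Test it on a copy of the directory tree first.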
How can I achieve this from a DOS batch script?
As far as I found out there are some advanced scripts for this task like:
viewtopic.php?f=3&t=3855
However this is too comprehensive and big for this small task.
Isn't there a smaller (one-liner) script command for that?
Thank you
Peter