UTF-8 encoding while replacing string [SOLVED]
Posted: 05 Aug 2021 12:10
by AlphaInc.
Hello everybody,
I recently set up a MediaInfo script that outputs information about the media files inside a specific folder.
After each run I need to change some strings in the generated output. For that I use the following command:
powershell -Command "(gc myFile.txt) -replace 'foo', 'bar' | Out-File -encoding ASCII myFile.txt"
The problem is that an umlaut gets mangled (ö turns into ??). I also tried replacing the ASCII encoding with utf8, without success. Is there any way to keep the umlaut in the output file, or is there another way to do the string replacement?
Re: UTF-8 encoding while replacing string
Posted: 05 Aug 2021 13:51
by aGerman
The important question is: what is the encoding of the input? If you know it, you have to specify it along with gc.
FWIW, ASCII obviously makes things worse, because only 7-bit ASCII characters are supported in that case.
Steffen
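For readers wondering why a single ö ends up as two question marks, here is a minimal Python sketch of one plausible chain of events. It assumes the input file is UTF-8 without a BOM and that it gets read using the Windows-1252 ANSI code page; neither assumption is confirmed anywhere in this thread.

```python
# Assumption: the file on disk is UTF-8, so "ö" is stored as two bytes.
raw = "ö".encode("utf-8")

# Assumption: the reader decodes those bytes with Windows-1252 instead,
# so each byte becomes its own (wrong) character.
misread = raw.decode("cp1252")

# Writing as ASCII cannot represent either character, so each one is
# replaced by a question mark.
written = misread.encode("ascii", errors="replace")

print(raw)      # b'\xc3\xb6'
print(misread)  # Ã¶
print(written)  # b'??'
```

If the reader had been told the real input encoding, the two bytes would have been decoded back into a single ö before the replacement ran.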
Re: UTF-8 encoding while replacing string
Posted: 05 Aug 2021 14:14
by AlphaInc.
I don't know. I created the text file by running MediaInfo with a template and redirecting its output to a text file (Mediainfo -Inform=file://template.txt Video.mkv >> Output.txt).
Re: UTF-8 encoding while replacing string
Posted: 05 Aug 2021 14:45
by aGerman
I don't know anything about MediaInfo.
Try to run the tool I uploaded in Jean-François' thread. Maybe it's able to tell you the encoding.
viewtopic.php?p=64494#p64494
If you don't succeed, put the text file in a zip archive and upload it here. I'll probably find it out for you in no time.
Steffen
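As a rough illustration of how such an encoding check can work, here is a small Python sketch. It is not the tool linked above; the function name guess_encoding and its return labels are made up for this example. It only looks for a byte order mark and then tests whether the bytes are valid UTF-8, which is a heuristic and not proof.

```python
import codecs

def guess_encoding(data: bytes) -> str:
    """Heuristic guess based on a BOM and on strict UTF-8 validity (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Note: UTF-32 is not handled here; this is only a sketch.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8")  # strict decoding fails on most non-UTF-8 8-bit text
    except UnicodeDecodeError:
        return "unknown 8-bit encoding"
    # Pure 7-bit data is valid in ASCII, UTF-8 and most ANSI code pages alike.
    if all(byte < 0x80 for byte in data):
        return "ascii"
    return "utf-8"
```

For a file on disk, something like guess_encoding(open("Output.txt", "rb").read()) would apply it to the MediaInfo output, with the file name taken from the post above.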
Re: UTF-8 encoding while replacing string [SOLVED]
Posted: 06 Aug 2021 02:28
by AlphaInc.
I'll try it, thank you.
But (for all who may run into something similar) I found a workaround. I replaced the PowerShell command with a Python script that replaces the string without my having to deal with the encoding of the file. Maybe not ideal, but it's good enough for my use case.
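The script itself was not posted, so the following is only a sketch of what such a replacement could look like in Python. The function name replace_in_file is made up for this example, and UTF-8 as the default encoding is an assumption; the encoding is passed explicitly so that the result does not depend on the system default.

```python
from pathlib import Path

def replace_in_file(path, old, new, encoding="utf-8"):
    """Replace every occurrence of old with new in the file at path.

    Hypothetical helper; the default encoding is an assumption.
    Returns True if the file content changed.
    """
    file_path = Path(path)
    # Read and write raw bytes so that line endings are left untouched.
    text = file_path.read_bytes().decode(encoding)
    updated = text.replace(old, new)
    if updated != text:
        file_path.write_bytes(updated.encode(encoding))
    return updated != text
```

Calling replace_in_file("myFile.txt", "foo", "bar") would mirror the PowerShell one-liner from the first post. If the file is not valid in the given encoding, decoding raises UnicodeDecodeError instead of silently damaging the text.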
Re: UTF-8 encoding while replacing string [SOLVED]
Posted: 06 Aug 2021 03:52
by aGerman
If Python works out of the box then it's a strong indication that the text is UTF-8-encoded.
AlphaInc. wrote: ↑06 Aug 2021 02:28
without having to deal with the encoding of the file
That's just luck.
Steffen