On 2023-07-21 14:43, Vojtěch Zeisek wrote:
On Friday, 21 July 2023 14:09:31 CEST, Carlos E. R. wrote:
In an old text file, which "file" reports as UTF-8 text, there are some characters which are not valid UTF-8.
Did you try "enca file.txt"? In my experience it's relatively successful. But if the file was at some point saved under a wrong encoding, it might be hard. You can also guess the encoding from the tables at <https://en.wikipedia.org/wiki/Code_page>.
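If guessing by hand from the code page tables, it may help to first list exactly which bytes break UTF-8 decoding and see how each one reads under a few legacy encodings. The following is only a minimal Python sketch, not something from enca: the function names and the CANDIDATES list are made up for illustration, and the list assumes Western European (e.g. Spanish) text.

```python
# Candidate legacy encodings: an assumption, chosen as common ones for
# Western European text. Adjust to whatever the file might have used.
CANDIDATES = ("iso-8859-1", "iso-8859-15", "cp1252", "cp850")


def find_invalid_utf8(data: bytes):
    """Yield (offset, bad_bytes) for each span that is not valid UTF-8."""
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            return  # the rest of the data decodes cleanly
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice we decoded.
            start, end = pos + err.start, pos + err.end
            yield start, data[start:end]
            pos = end  # resume right after the offending bytes


def show_candidates(bad: bytes) -> dict:
    """Map each candidate encoding to how it decodes the bad bytes.

    The value is None where the bytes are undefined in that encoding.
    """
    result = {}
    for enc in CANDIDATES:
        try:
            result[enc] = bad.decode(enc)
        except UnicodeDecodeError:
            result[enc] = None
    return result
```

To use it on a real file, read it in binary mode (open(path, "rb").read()), loop over find_invalid_utf8(data), and print each offset together with show_candidates(bad). Whichever encoding turns the stray bytes into plausible letters in context is a reasonable guess, though not a proof.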
I had to install it.

cer@Telcontar:~> enca p
enca: Cannot determine (or understand) your language preferences.
Please use `-L language', or `-L none' if your language is not supported
(only a few multibyte encodings can be recognized then).
Run `enca --list languages' to get a list of supported languages.
cer@Telcontar:~>

Doesn't include Spanish... :-(

cer@Telcontar:~> enca --list languages
belarusian: CP1251 IBM866 ISO-8859-5 KOI8-UNI maccyr IBM855 KOI8-U
bulgarian: CP1251 ISO-8859-5 IBM855 maccyr ECMA-113
czech: ISO-8859-2 CP1250 IBM852 KEYBCS2 macce KOI-8_CS_2 CORK
estonian: ISO-8859-4 CP1257 IBM775 ISO-8859-13 macce baltic
croatian: CP1250 ISO-8859-2 IBM852 macce CORK
hungarian: ISO-8859-2 CP1250 IBM852 macce CORK
lithuanian: CP1257 ISO-8859-4 IBM775 ISO-8859-13 macce baltic
latvian: CP1257 ISO-8859-4 IBM775 ISO-8859-13 macce baltic
polish: ISO-8859-2 CP1250 IBM852 macce ISO-8859-13 ISO-8859-16 baltic CORK
russian: KOI8-R CP1251 ISO-8859-5 IBM866 maccyr
slovak: CP1250 ISO-8859-2 IBM852 KEYBCS2 macce KOI-8_CS_2 CORK
slovene: ISO-8859-2 CP1250 IBM852 macce CORK
ukrainian: CP1251 IBM855 ISO-8859-5 CP1125 KOI8-U maccyr
chinese: GBK BIG5 HZ
none:
cer@Telcontar:~>

-- 
Cheers / Saludos,

Carlos E. R.
(from 15.4 x86_64 at Telcontar)