[oS-EN] Text conversion problem

21 Jul 2023

      Hi,

In an old text file, which "file" says is UTF-8 text, there are some 
chars which are not. Some were obviously accented letters, like "á", so 
I just did a search and replace on them. There is one entry I can't 
identify:

Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·

rsync -a myfolder/*/*.jpg my-new-folder/

Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·

(The file got corrupted at some point because one editor thought the 
file was UTF-8, and another editor thought differently.)

I saved the bad part to a file, but the command "file" insists it is UTF-8:

cer@Telcontar:~> cat p
Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·
Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·Â·
cer@Telcontar:~> file p
p: UTF-8 Unicode text
cer@Telcontar:~>

And I fail to force a conversion:

cer@Telcontar:~> iconv -f LATIN6 -t UTF-8  p
ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·
ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·ÃÂ·
cer@Telcontar:~>

Perhaps I'm guessing the non-UTF encoding wrong. I also don't remember 
for sure which old Latin encoding we used in Spain.

My guess is that the string is

················

which is a centered dot, which on my keyboard (Spain) is on the [.] key, 
pressed together with [AltGr].
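
One way that guess could be checked is a minimal Python sketch like the one below. It assumes the damage is a single round of double encoding: the centered dot was stored as UTF-8 (bytes C2 B7), those bytes were then read as Latin-1 and saved again as UTF-8. That would also explain why "file" still reports UTF-8, since "Â" and "·" are themselves valid UTF-8 characters. The function name and the choice of Latin-1 as the wrongly assumed encoding are assumptions made for illustration, not something confirmed by the file itself.

```python
def undo_double_encoding(text: str, wrong_codec: str = "latin-1") -> str:
    """Reverse one round of 'UTF-8 bytes misread as wrong_codec' damage.

    Assumption: the text was UTF-8, got decoded as `wrong_codec`,
    and was then re-saved as UTF-8. Raises UnicodeError if the text
    does not fit that pattern.
    """
    # Step 1: turn each character back into the single byte it came from.
    raw = text.encode(wrong_codec)
    # Step 2: read those bytes the way they were originally meant.
    return raw.decode("utf-8")


# "\u00c2\u00b7" is the pair "Â·" seen in the damaged file.
bad = "\u00c2\u00b7" * 3
fixed = undo_double_encoding(bad)
print(fixed)
```

If that assumption holds, the same reversal on the command line would go from UTF-8 to Latin-1 rather than the other way round, e.g. "iconv -f UTF-8 -t LATIN1 p". Strings that were not double-encoded make the function raise an error instead of silently producing more garbage.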

Testing it beside the bad string:

Â·Â··

Ideas?

-- 
Cheers / Saludos,

		Carlos E. R.

   (from Telcontar, using openSUSE Leap 15.4)
