[opensuse] hexadecimal boxes on webpages
Right. This is annoying me, I have to get an instant answer! Can somebody tell me why I get these symbols in all sorts of places on the Internet (Twitter, random web pages, etc.)?

দইসাবু

This may not seem like a particularly openSUSE-related question, but the reason I ask on this newsgroup is because it would seem that for many other computer users these symbols are replaced by meaningful characters, but on all my openSUSE installations I see these nonsensical symbols regularly. The thing is, it's possible that some of you won't know what I'm talking about if your own system interprets them correctly and shows the actual characters above, so for your benefit, I'm referring to little boxes the full height and width of a text character (or bigger), containing four hexadecimal figures. Like:

----
|04|
|A6|
----

From their placement on Twitter I assume it's often emoticons or obscure symbols available in some Windows font I don't have that I should be seeing, maybe Wingdings or something like that. What is it about openSUSE / Linux / Firefox or whatever else that results in me seeing these hexadecimal boxes instead of the correct characters? I use UTF-8 system-wide.

For a long while I genuinely thought it was some geeky meme that required people in the know to reference some list! Until I started seeing them in far too regular places, like on Wikipedia articles.

Peter

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On 2/22/2014 3:58 PM, Peter wrote:
Right. This is annoying me, I have to get an instant answer! Can somebody tell me why I get these symbols in all sorts of places on the Internet (Twitter, random web pages, etc.)?
দইসাবু
This may not seem like a particularly openSUSE-related question, but the reason I ask on this newsgroup is because it would seem that for many other computer users these symbols are replaced by meaningful characters, but on all my openSUSE installations I see these nonsensical symbols regularly. The thing is, it's possible that some of you won't know what I'm talking about if your own system interprets them correctly and shows the actual characters above, so for your benefit, I'm referring to little boxes the full height and width of a text character (or bigger), containing four hexadecimal figures. Like:
---- |04| |A6| ----
From their placement on Twitter I assume it's often emoticons or obscure symbols available in some Windows font I don't have that I should be seeing, maybe Wingdings or something like that. What is it about openSUSE / Linux / Firefox or whatever else that results in me seeing these hexadecimal boxes instead of the correct characters? I use UTF-8 system-wide.
For a long while I genuinely thought it was some geeky meme that required people in the know to reference some list! Until I started seeing them in far too regular places, like on Wikipedia articles.
Peter
দইসাবু

The internet isn't just for English any more. But yes, emoji are the likely source of some of this, probably posted from cell phones, and your browser may not be handling them all that well.

--
_____________________________________
---This space for rent---
On 02/22/2014 06:58 PM, Peter pecked at the keyboard and wrote:
Right. This is annoying me, I have to get an instant answer! Can somebody tell me why I get these symbols in all sorts of places on the Internet (Twitter, random web pages, etc.)?
দইসাবু
This may not seem like a particularly openSUSE-related question, but the reason I ask on this newsgroup is because it would seem that for many other computer users these symbols are replaced by meaningful characters, but on all my openSUSE installations I see these nonsensical symbols regularly. The thing is, it's possible that some of you won't know what I'm talking about if your own system interprets them correctly and shows the actual characters above, so for your benefit, I'm referring to little boxes the full height and width of a text character (or bigger), containing four hexadecimal figures. Like:
---- |04| |A6| ----
From their placement on Twitter I assume it's often emoticons or obscure symbols available in some Windows font I don't have that I should be seeing, maybe Wingdings or something like that. What is it about openSUSE / Linux / Firefox or whatever else that results in me seeing these hexadecimal boxes instead of the correct characters? I use UTF-8 system-wide.
For a long while I genuinely thought it was some geeky meme that required people in the know to reference some list! Until I started seeing them in far too regular places, like on Wikipedia articles.
Peter
I would most likely say it would be a missing font. Try looking at the source code for a web page that has the "box characters" on it for clues. -- Ken Schneider SuSe since Version 5.2, June 1998
On 23/02/14 01:12, Ken Schneider - openSUSE wrote:
I would most likely say it would be a missing font. Try looking at the source code for a web page that has the "box characters" on it for clues.
Too often these are big sites where the source is very complex. Take this page on Wikipedia, for example: http://en.wikipedia.org/wiki/Tapioca The first incidence of these characters appears for me in the 3rd paragraph of the main description, before the word "kappa" in brackets. I do always have the msttfonts package installed but I don't think that includes things like Webdings.
On 2014-02-23 01:19, Peter wrote:
The first incidence of these characters appears for me in the 3rd paragraph of the main description, before the word "kappa" in brackets.
That one is easy. The sentence says:

«It is widely named by "കപ്പ" ("kappa") in Malayalam»

Obviously, they have written the name in Malayalam letters... which I also cannot read.

If you get a box like

----
|04|
|A6|
----

your fonts are not set right, or you have to change language settings in FF (view -> char encoding).

--
Cheers / Saludos,
Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
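Editorial sketch: the hexadecimal figures inside such a box are the Unicode code point of the character that no installed font could draw, so the box itself says what is missing. The Python below is only an illustration of that lookup; the helper name `describe_box` is hypothetical, and the second example (0D 15) is an assumption about what the box for the first Malayalam letter of "കപ്പ" would contain, since the thread never shows it.

```python
import unicodedata


def describe_box(rows):
    """Join the hex digits shown in a fallback box and look up the character.

    `rows` is the list of digit pairs as they appear in the box,
    top row first, e.g. ["04", "A6"].
    """
    code_point = int("".join(rows), 16)
    char = chr(code_point)
    # unicodedata.name() raises ValueError for unnamed code points,
    # so a default string is supplied instead.
    return f"U+{code_point:04X}", unicodedata.name(char, "<unnamed>")


# The box drawn in the original message.
print(describe_box(["04", "A6"]))
# Assumed box for the first letter of the Malayalam word on the Tapioca page.
print(describe_box(["0D", "15"]))
```

The returned name starts with the script ("CYRILLIC", "MALAYALAM", and so on), which is usually enough to tell which kind of font package to look for.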
I also saw the strange characters looking at the page, after installing indic-fonts it looks reasonable.
On 2/22/2014 4:19 PM, Peter wrote:
On 23/02/14 01:12, Ken Schneider - openSUSE wrote:
I would most likely say it would be a missing font. Try looking at the source code for a web page that has the "box characters" on it for clues.
Too often these are big sites where the source is very complex. Take this page on Wikipedia, for example: http://en.wikipedia.org/wiki/Tapioca
The first incidence of these characters appears for me in the 3rd paragraph of the main description, before the word "kappa" in brackets.
I do always have the msttfonts package installed but I don't think that includes things like Webdings.
On my browser, they render as what appear to be proper foreign-language characters, on Chrome, Firefox, on both Windows and Linux. Heck, even Kong shows them properly. "കപ്പ" does not appear as Webdings on my screen. Looks like the attached jpg.
On 23/02/14 01:52, John Andersen wrote:
On my browser, they render as what appear to be proper foreign-language characters, on Chrome, Firefox, on both Windows and Linux. Heck, even Kong shows them properly.
"കപ്പ" does not appear as Webdings on my screen. Looks like the attached jpg.
Well from your attached image, kappa would appear to be represented by a small penis being approached by a sleepwalking stick robot. No wonder I don't have those characters on my installation. It just surprises me how many characters I seem to be lacking from a default openSUSE install + ms tt fonts. I frequently get these boxes at the end of tweets in place of what must be emoticons, and there was a glut the other week around Valentine's Day where I could see little heart symbols, but whatever other symbols people were tweeting alongside them weren't translating at my end.
On 2/22/2014 5:04 PM, Peter wrote:
On 23/02/14 01:52, John Andersen wrote:
On my browser, they render as what appear to be proper foreign-language characters, on Chrome, Firefox, on both Windows and Linux. Heck, even Kong shows them properly.
"കപ്പ" does not appear as Webdings on my screen. Looks like the attached jpg.
Well from your attached image, kappa would appear to be represented by a small penis being approached by a sleepwalking stick robot. No wonder I don't have those characters on my installation.
It just surprises me how many characters I seem to be lacking from a default openSUSE install + ms tt fonts. I frequently get these boxes at the end of tweets in place of what must be emoticons, and there was a glut the other week around Valentine's Day where I could see little heart symbols, but whatever other symbols people were tweeting alongside them weren't translating at my end.
I honestly don't remember installing any extra fonts in my openSUSE other than MS tt fonts. However, I do have LibreOffice installed, and it might be that it comes with a boatload of fonts because it is a world package.
On 2014-02-23 01:19 (GMT+0100) Peter composed:
Ken Schneider wrote:
I would most likely say it would be a missing font. Try looking at the source code for a web page that has the "box characters" on it for clues.
Too often these are big sites where the source is very complex. Take this page on Wikipedia, for example: http://en.wikipedia.org/wiki/Tapioca
The first incidence of these characters appears for me in the 3rd paragraph of the main description, before the word "kappa" in brackets.
Here I get what are obviously non-western glyphs of some sort instead of those boxes. Copied and pasted here: കപ്പ in this plain text email composition window they look the same as in the browser window.
I do always have the msttfonts package installed but I don't think that includes things like Webdings.
You probably just don't have enough UTF-8 character set coverage from your installed fonts. I would think the DejaVus would include those used on that Wikipedia page, but I'm pretty sure the DejaVus are installed by default in openSUSE since many years ago. My global default is Droid, but I also have Liberation and Linux Libertine in addition to the DejaVus and X11 basics. Add more and see what happens. -- "The wise are known for their understanding, and pleasant words are persuasive." Proverbs 16:21 (New Living Translation) Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
On 23/02/14 01:55, Felix Miata wrote:
Here I get what are obviously non-western glyphs of some sort instead of those boxes. Copied and pasted here:
കപ്പ
in this plain text email composition window they look the same as in the browser window.
< snip >
You probably just don't have enough UTF-8 character set coverage from your installed fonts. I would think the DejaVus would include those used on that Wikipedia page, but I'm pretty sure the DejaVus are installed by default in openSUSE since many years ago. My global default is Droid, but I also have Liberation and Linux Libertine in addition to the DejaVus and X11 basics. Add more and see what happens.
That would make sense I guess. If people are getting more UTF-8 coverage from the fonts on their smartphone than on a full Linux distro installation, then maybe I should try and install the Droid fonts.
On 2014-02-23 02:07 (GMT+0100) Peter composed:
If people are getting more UTF-8 coverage from the fonts on their smartphone than on a full Linux distro installation, then maybe I should try and install the Droid fonts.
Others I have that if not already installed might avoid the problem:

ghostscript-font*
efont-unicode
cantarell-fonts
google-croscore-fonts
google-opensans-fonts

In SeaMonkey mail composition here കപ്പ renders as intended, but only as gibberish in received and sent mail.
On 23/02/14 01:55, Felix Miata wrote:
You probably just don't have enough UTF-8 character set coverage from your installed fonts. I would think the DejaVus would include those used on that Wikipedia page, but I'm pretty sure the DejaVus are installed by default in openSUSE since many years ago. My global default is Droid, but I also have Liberation and Linux Libertine in addition to the DejaVus and X11 basics. Add more and see what happens.
On 23/02/14 02:27, Felix Miata wrote:
Others I have that if not already installed might avoid the problem: ghostscript-font* efont-unicode cantarell-fonts google-croscore-fonts google-opensans-fonts
In SeaMonkey mail composition here കപ്പ renders as intended, but only as gibberish in received and sent mail.
Looking in YaST now, these are the fonts that were installed by default on my standard openSUSE 13.1 KDE install:

cantarell-fonts
dejavu-fonts
efont-unicode-bitmap-fonts
ghostscript-fonts-other
ghostscript-fonts-std
gnu-unifont-bitmap-fonts
google-droid-fonts
intlfonts-euro-bitmap-fonts
liberation-fonts
xorg-x11-fonts
xorg-x11-fonts-core

to which I added

fetchmsttfonts

The package ghostscript-fonts is NOT installed, although this appears to be merely a helper package that pulls in the two other ghostscript-fonts packages listed above.

There's a whole lot more available in the openSUSE OSS repo, so many in fact that I'm not sure which to choose. Without delving into obscurities that may not be around in the next version of the distro, I might go for:

bitstream-vera-fonts
gnu-free-fonts
intlfonts-arabic-bitmap-fonts
intlfonts-asian-bitmap-fonts
intlfonts-chinese-bitmap-fonts
intlfonts-ethiopic-bitmap-fonts
intlfonts-japanese-bitmap-fonts
intlfonts-ttf-fonts

and see if that improves things.

Peter
On 2014-02-24 12:28 (GMT+0100) Peter composed:
I might go for:
bitstream-vera-fonts gnu-free-fonts intlfonts-arabic-bitmap-fonts intlfonts-asian-bitmap-fonts intlfonts-chinese-bitmap-fonts intlfonts-ethiopic-bitmap-fonts intlfonts-japanese-bitmap-fonts intlfonts-ttf-fonts
and see if that improves things.
FWIW, those Malayalam glyphs display here in 11.4 with none of the above packages installed. What I do have:

agfa-fonts
cantarell-fonts
efont-unicode
fonts-config
ghostscript-fonts-other
ghostscript-fonts-std
gnu-free-fonts
google-croscore-fonts
google-droid-fonts
google-opensans-fonts
liberation-fonts
linux-libertine-fonts
misc-console-font
xorg-x11-fonts
xorg-x11-fonts-core
xorg-x11-libfontenc
On 2/24/2014 3:28 AM, Peter wrote:
There's a whole lot more available in the openSUSE OSS repo, so many in fact that I'm not sure which to choose. Without delving into obscurities that may not be around in the next version of the distro, I might go for:
Assuming that one doesn't actually read these fonts, why would one care if they were displayed correctly or as random gibberish?
On 24/02/14 20:17, John Andersen wrote:
Assuming that one doesn't actually read these fonts, why would one care if they were displayed correctly or as random gibberish?
The example I gave of the Wikipedia page wasn't a good one, since that was just a word in a foreign language. Further down that page were many more instances but again these were all just translations of a word in all sorts of more obscure languages. The main case where I'm constantly coming across these symbols is in tweets or comments attached to articles. I'll have to try and provide an example the next time I stumble across one. In these instances, it's definitely not foreign languages / alphabets but just special characters, perhaps sometimes symbols designed to form a certain image together. These are just regular tweets and comments from everyday occidental computer / phone users, so I assume they're picking easy-to-find symbols from some Android / iOS or PC character map application.
Peter wrote:
The main case where I'm constantly coming across these symbols is in tweets or comments attached to articles.
I don't use Twitter, but the obvious question is whether it is completely clean, in the sense of all tweets being correctly encoded Unicode? Or do users of e.g. PCs manage to post other character sets or double-encoded Unicode or such monstrosities? Because unless it is all completely correct Unicode, there's no point trying to fix it with additional fonts.
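Editorial sketch: the two failure modes raised here look different on screen. A missing font gives boxes with hex digits; text that was decoded with the wrong character set gives runs of accented Latin letters and stray symbols instead. The Python below is a minimal illustration, assuming Windows-1252 as the wrong character set (the thread does not say which one would be involved), and the helper names `garble` and `repair` are hypothetical.

```python
def garble(text, wrong="cp1252"):
    """Simulate UTF-8 bytes being misread as a legacy single-byte charset."""
    return text.encode("utf-8").decode(wrong)


def repair(text, wrong="cp1252"):
    """Undo one round of the misreading above, if the bytes survived intact."""
    return text.encode(wrong).decode("utf-8")


heart = "\u2665"          # BLACK HEART SUIT, the kind of symbol seen in tweets
broken = garble(heart)    # three Latin-1-range characters instead of one heart

print(repr(heart), "->", repr(broken))
print(repair(broken) == heart)
```

If a page shows that kind of letter salad rather than hex boxes, installing more fonts would not be expected to help; hex boxes point to font coverage instead.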
On Sun, 23 Feb 2014 00:58:22 +0100 Peter <gumb@linuxmail.org> wrote:
---- |04| |A6| ----
Go to http://fontinfo.opensuse.org , find which font provides this glyph and install it. It even offers a one-click install link.
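Editorial sketch: the same question ("which font provides this glyph?") can also be asked of the fonts already installed, using fontconfig's `fc-list` with a `:charset=` pattern. The Python wrapper below is only an illustration under the assumption that `fc-list` is on the PATH; the function names are hypothetical and it returns None rather than failing when the tool is absent.

```python
import shutil
import subprocess


def fc_list_command(char):
    """Build an fc-list query for font families whose charset covers `char`."""
    # fontconfig patterns take the code point in hexadecimal.
    return ["fc-list", f":charset={ord(char):x}", "family"]


def fonts_covering(char):
    """Return sorted family names covering `char`, or None if fc-list is missing."""
    if shutil.which("fc-list") is None:
        return None
    result = subprocess.run(
        fc_list_command(char), capture_output=True, text=True, check=False
    )
    return sorted({line.strip() for line in result.stdout.splitlines() if line.strip()})


if __name__ == "__main__":
    # U+0D15 is MALAYALAM LETTER KA, the first letter of the word on the Tapioca page.
    print(fc_list_command("\u0d15"))
    print(fonts_covering("\u0d15"))
```

An empty list would suggest that no installed font covers the character, which is the situation in which the browser falls back to a hex box.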
Peter wrote:
Right. This is annoying me, I have to get an instant answer! Can somebody tell me why I get these symbols in all sorts of places on the Internet (Twitter, random web pages, etc.)?
দইসাবু
They render correctly on my system(s), Linux and Windows (depends on location), but I was told by the person responsible for SUSE's font-caching and rendering system that my system was all screwed up because I have collected fonts for 10-15 years, and most of them are not in packages...

If you can run a Windows package -- OR there is a website with similar functionality, I strongly suggest getting "BabelMap" to run on Windows (it's donation-ware) @ http://www.babelstone.co.uk/Software/BabelMap.html.

The 'online version' in JavaScript tries to use your browser fonts and is at http://www.babelstone.co.uk/Unicode/babelmap.html.

The main page has pointers to the JavaScript utils as well as the Windows software (@ http://www.babelstone.co.uk/)

The Windows software tells you exactly what fonts cover what ranges and what ranges you have 'covered' and which ones you don't.

With that software, I can just paste the characters that were on the screen into its buffer, and see the hex code and what characters they are:

দইসাবু = দইসাবু

Bengali letters: 'Da', 'I', 'Sa', 'AA' 'va/wa', 'U'
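Editorial sketch: for readers without BabelMap, the same "paste the characters, see the hex code and the name" step can be approximated with Python's standard `unicodedata` module. This is an illustration only, not a replacement for BabelMap's font-coverage view; the function name `identify` is hypothetical, and the sample string is the one from the original message written as escapes.

```python
import unicodedata


def identify(text):
    """Return (code point, Unicode name) for every character in `text`."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The six code points of the sample string from the first message.
sample = "\u09a6\u0987\u09b8\u09be\u09ac\u09c1"

for code_point, name in identify(sample):
    print(code_point, name)
```

Every name in the output starts with the script, here BENGALI, which matches the identification given above and narrows down which font package is worth trying.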
On 24/02/14 08:39, Linda Walsh wrote:
They render correctly on my system(s) linux and windows(depends on location), but I was told by the person responsible for suse's font-caching and rendering system that my system was all screwed up because I have collected fonts for 10-15 years, and most of them are not in packages...
If you can run a windows package -- OR there is a website with similar functionality,
I strongly suggest getting "BabelMap" to run on windows (it's donation-ware) @ http://www.babelstone.co.uk/Software/BabelMap.html.
The 'online version' in javascript tries to use your browser fonts and is at http://www.babelstone.co.uk/Unicode/babelmap.html.
The main page has pointers to the javascript utils as well as the windows software (@ http://www.babelstone.co.uk/)
The windows software tells you exactly what fonts cover what ranges and what ranges you have 'covered' and which ones you don't.
With that software, I can just paste the characters that were on the screen into its buffer, and see the hex code and what characters they are:
দইসাবু = দইসাবু
Bengali letters: 'Da', 'I', 'Sa', 'AA' 'va/wa', 'U'
Well I run Linux exclusively. I do have an old box that's been gathering dust for a while with a Windows ME partition on it, but let's not go there... Besides, for all the openSUSE machines I administer, not all of them in my own home, I'd prefer a quick Linux-based solution of installing one or two additional packages. I'm just looking through all the available fonts in the standard openSUSE repo now. Peter
On 24.02.2014 12:12, Peter wrote:
I'm just looking through all the available fonts in the standard openSUSE repo now.

Have you looked at the package indic-fonts which as I commented in the first post should solve the problem you have with the malaysian characters?
On 24/02/14 12:17, Martin Helm wrote:
On 24.02.2014 12:12, Peter wrote:
I'm just looking through all the available fonts in the standard openSUSE repo now. Have you looked at the package indic-fonts which as I commented in the first post should solve the problem you have with the malaysian characters?
Ah, I missed that one. I'll add it to the list of fonts I'm going to add in my other message from a minute ago, thanks. Peter
Martin Helm wrote:
the malaysian characters?
Just an FYI. They are Malayalam characters, not Malaysian. Two completely different things.
Peter wrote:
On 24/02/14 08:39, Linda Walsh wrote:
The 'online version' in javascript tries to use your browser fonts and is at http://www.babelstone.co.uk/Unicode/babelmap.html.
The main page has pointers to the javascript utils as well as the windows software (@ http://www.babelstone.co.uk/)
Well I run Linux exclusively. I do have an old box that's been gathering dust for a while with a Windows ME partition on it, but let's not go there...
But with the ones you mentioned, the webpage version would at least be able to tell you what range the char is in - like "Bengali", in this case, and that might help in finding what covers it next.

While I'm impressed at the better coverage and configurability of the Linux solution -- the fact that the code writers don't know how to marshal data and have to turn arch-free fonts into a format that is dependent on each machine's memory architecture is rather appalling. I.e. they want to be able to mmap the cached data into C data structures that are not padded or portable in any way -- which is why the fc-config has to be run twice if you have 32 and 64 bit apps installed -- which prompted me to do a "purge" of 32-bit apps on my 64-bit machine.

Also, since it's 1 mmap'ed structure on disk, they can't update part of it when you add 1 font -- so the whole thing gets updated. Solution to that? Double the access time by putting each font in a separate directory. An ill-thought-out solution that wreaks havoc with existing X11 font utils that create 1 index file/directory.
Hi
On 22 February 2014 18:58:22 GMT-05:00, Peter <gumb@linuxmail.org> wrote:
Right. This is annoying me, I have to get an instant answer! Can somebody tell me why I get these symbols in all sorts of places on the Internet (Twitter, random web pages, etc.)?
দইসাবু
This may not seem like a particularly openSUSE-related question, but the reason I ask on this newsgroup is because it would seem that for many other computer users these symbols are replaced by meaningful characters, but on all my openSUSE installations I see these nonsensical symbols regularly. The thing is, it's possible that some of you won't know what I'm talking about if your own system interprets them correctly and shows the actual characters above, so for your benefit, I'm referring to little boxes the full height and width of a text character (or bigger), containing four hexadecimal figures. Like:
---- |04| |A6| ----
From their placement on Twitter I assume it's often emoticons or obscure symbols available in some Windows font I don't have that I should be seeing, maybe Wingdings or something like that. What is it about openSUSE / Linux / Firefox or whatever else that results in me seeing these hexadecimal boxes instead of the correct characters? I use UTF-8 system-wide.
For a long while I genuinely thought it was some geeky meme that required people in the know to reference some list! Until I started seeing them in far too regular places, like on Wikipedia articles.
Peter
I sometimes had a similar problem when using ad-blocking software: the browser then wasn't able to correctly parse some parts of the source code of the web page... V. -- Vojtěch Zeisek Community of the openSUSE GNU/Linux http://www.opensuse.org/ http://trapa.cz/
participants (10)
- Andrey Borzenkov
- Carlos E. R.
- Dave Howorth
- Felix Miata
- John Andersen
- Ken Schneider - openSUSE
- Linda Walsh
- Martin Helm
- Peter
- Vojtěch Zeisek