On Fri, 30 Mar 2012 17:01:04 +0200 Ladislav Slezak <lslezak@suse.cz> wrote:
Hi all,
This is just a note in case you come across a strange UTF-8 string problem in YCP (maybe you already know this, but for me it was quite a surprise):
The YCP string operator [] takes a _byte_ index into the string, while size(string) returns the _number_of_characters_. If you combine the two, the result will probably be buggy, see https://bugzilla.novell.com/show_bug.cgi?id=728588
Example from the bug:
size("áa") => 2,
but
"áa"[1] is not "a" as expected but the second _byte_ of the string, which is one half of the "á" UTF-8 character. If you remove it, you'll get garbage in the string...
Keep this in mind when iterating over YCP strings...
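The mismatch can be reproduced outside YCP. The following is only a sketch in Python (not YCP code), assuming "á" is the single code point U+00E1: it compares the character view that size() corresponds to with the byte view that [] indexes, and shows what is left after removing the byte at index 1.

```python
# Illustration in Python, not YCP: compare character and byte views of one string.
text = "\u00e1a"              # "áa": U+00E1 followed by ASCII "a"
raw = text.encode("utf-8")    # b'\xc3\xa1a': "á" takes two bytes

char_count = len(text)        # number of characters, what size() reports
byte_count = len(raw)         # number of bytes, the range a byte index covers

second_char = text[1]         # character index 1
second_byte = raw[1:2]        # byte index 1, the trailing half of "á"

# Dropping that single byte leaves a dangling UTF-8 lead byte (0xC3).
broken = raw[:1] + raw[2:]
try:
    broken.decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    still_valid = False

print(char_count, byte_count, second_char, second_byte, still_valid)
```

Any loop that runs an index from 0 to size(string) - 1 and passes it to a byte-indexed [] will hit the same mismatch as soon as the string contains a multi-byte character.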
(I'm not sure whether fixing YCPString::[] would be a good idea, it might break something else. Martin?)
When I compare this behaviour to other languages, like wstring in C++ or the Ruby string, I find it really strange. operator[] is expected to return the element at the given position. For a lot of programmers who don't have YCP as their first language, a string is an array of characters (not bytes). So I really vote for changing this behaviour. Josef
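For comparison, here is a minimal sketch of character-based indexing over UTF-8 bytes, again in Python rather than YCP. The helper name char_at is hypothetical and this is not how YCPString is implemented; it only illustrates that a character index can be resolved by skipping UTF-8 continuation bytes (those of the form 10xxxxxx).

```python
def char_at(raw: bytes, index: int) -> str:
    """Return the character at a character index in UTF-8 encoded bytes.

    Hypothetical helper for illustration; assumes `raw` is valid UTF-8.
    """
    if index < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1      # index of the character currently being scanned
    start = None    # byte offset where the requested character begins
    for pos, byte in enumerate(raw):
        if byte & 0xC0 != 0x80:         # not a continuation byte: a new character starts
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:    # next character starts, so the requested one ended
                return raw[start:pos].decode("utf-8")
    if start is not None:               # requested character is the last one
        return raw[start:].decode("utf-8")
    raise IndexError("character index out of range")


raw = "\u00e1a".encode("utf-8")   # "áa" as three bytes
first = char_at(raw, 0)           # the two-byte "á"
second = char_at(raw, 1)          # the ASCII "a"
```

The trade-off is that each lookup scans from the start of the string, so character indexing on UTF-8 data is linear rather than constant time unless the string is stored in a fixed-width form.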
--
Ladislav Slezák Appliance department / YaST Developer Lihovarská 1060/12 190 00 Prague 9 / Czech Republic tel: +420 284 028 960 lslezak@suse.com SUSE