Mailinglist Archive: yast-devel (23 mails)

Re: [yast-devel] Re: YCP substring() Was: YCP String operator [] and UTF-8
On Tue, 3 Apr 2012 11:58:16 +0200
Arvin Schnell <aschnell@xxxxxxx> wrote:

On Tue, Apr 03, 2012 at 11:33:09AM +0200, Klaus Kaempf wrote:
* Ladislav Slezak <lslezak@xxxxxxx> [Apr 03. 2012 11:10]:

I used substring() to get one character. So the problematic call is
actually:

substring("áa", 1, 1);

which returns "\0xF1" instead of "a" as I expected.

The documentation does not say whether the substring() arguments are
in bytes or characters.
http://doc.opensuse.org/projects/YaST/openSUSE11.3/tdg/substring-rest.html
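To make the bytes-versus-characters question concrete, here is a minimal sketch in Python (not YCP). The helper names substring_bytes and substring_chars are hypothetical and exist only for this illustration; it assumes the "á" is the precomposed code point U+00E1, which takes two bytes in UTF-8.

```python
# Illustration only: these helpers are hypothetical, not part of YCP or YaST.

def substring_bytes(s: str, start: int, length: int) -> bytes:
    """Slice the UTF-8 encoding of s by byte offsets."""
    return s.encode("utf-8")[start:start + length]


def substring_chars(s: str, start: int, length: int) -> str:
    """Slice s by character (code point) offsets."""
    return s[start:start + length]


text = "\u00e1a"  # "áa", with á as the single code point U+00E1

print(text.encode("utf-8"))         # b'\xc3\xa1a' (three bytes, two characters)
print(substring_bytes(text, 1, 1))  # b'\xa1' (second byte of the encoded á)
print(substring_chars(text, 1, 1))  # a
```

With byte offsets, index 1 lands in the middle of the encoded "á" and yields a lone continuation byte, which is not valid UTF-8 on its own. With character offsets, the same call yields "a".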

So any opinions on changing this call? Is the UTF-8 assumption also
valid here?

Yes. sub_string_ is operating on strings and strings are defined to be
UTF-8 encoded.

Generally I agree that strings in YCP are UTF-8 encoded and
functions should respect this.

But simply fixing the functions might require converting from
UTF-8 to wstring and back in every function and that sounds very
costly. E.g. the size function in YCP converts the string to
wstring. When I noticed that and saw how many times
size(string) == 0 is used, I added an isempty function in YCP.
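As a rough sketch of the cost concern, the following Python (not YCP or the YaST C++ code) works directly on UTF-8 bytes without a full conversion. All function names are hypothetical, the input is assumed to be valid UTF-8, and start and length are assumed to be non-negative. It is one possible approach, not what YaST actually does.

```python
# Illustration only: hypothetical helpers operating on raw UTF-8 bytes.

def utf8_is_empty(data: bytes) -> bool:
    """Emptiness needs no decoding: zero bytes means zero characters."""
    return len(data) == 0


def utf8_char_count(data: bytes) -> int:
    """Count characters by skipping continuation bytes (bit pattern 10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def utf8_substring(data: bytes, start: int, length: int) -> bytes:
    """Return `length` characters starting at character index `start`."""
    # Byte offsets at which each character begins, plus the end of the data.
    offsets = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]
    offsets.append(len(data))
    count = len(offsets) - 1
    start = min(start, count)
    end = min(start + length, count)
    return data[offsets[start]:offsets[end]]


raw = "\u00e1a".encode("utf-8")  # "áa" as three UTF-8 bytes

print(utf8_is_empty(raw))         # False
print(utf8_char_count(raw))       # 2
print(utf8_substring(raw, 1, 1))  # b'a'
```

The emptiness check is constant time, while counting and slicing still scan the bytes once. That scan is the kind of per-call cost that storing a wide string internally would avoid.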

Could be that using wstring internally in YCPString is the better
solution.


I absolutely agree. If every string in YCP is a UTF-8 string, then not
using wstring doesn't make much sense to me. Of course we need to check what
depends on it, but I think it should mainly be the various bindings. Other
parts of the code should not care what the internal representation is.

Josef

Regards,
Arvin

