[opensuse] Apache 2.4 performance issue / ProcessWire
A customer has reported a performance issue - their new website is slow, 3-4 seconds per page load, and I can confirm it. It is a shared server, quite old: 2 x 2.4GHz Xeons, 4 GB RAM, running openSUSE 12.3 and Apache 2.4 with mpm-itk. However, it does not appear to be hardware related; other pages are served perfectly well, e.g. http://www.dns24.ch/ (I'll send you the customer website link privately if you want).

The website is using a CMS called ProcessWire: http://processwire.com/ They don't have a support mailing list, only fora (which I detest). Anyone here happen to be familiar with ProcessWire? The web developer who's been doing the work doesn't seem overly familiar with tuning it, so I'm kind of left with solving the problem (aka proving it's ProcessWire that has a problem).

-- Per Jessen, Zürich (19.9°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.

-- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
Have you tried opening the pages and viewing what's causing the
slowness with Firefox's Network Monitor or the Tamper Data plugin?
https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor
--
Later,
Darin
Darin Perusich wrote:
Have you tried opening the pages and viewing what's causing the slowness with Firefox's Network Monitor or the Tamper Data plugin?
https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor
Nope, haven't tried that - it seems fairly certain the problem is server-side (given that non-ProcessWire sites work just fine).

-- Per Jessen, Zürich (18.1°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
Per Jessen wrote:
Nope, haven't tried that - it seems fairly certain the problem is server-side (given that non-ProcessWire sites work just fine).
Hi Darin,

Nice FF tool, thanks, I didn't know about that one: http://files.jessen.ch/ff-netmon-mmsx-com2.jpeg Looks like that first 404 takes 4 seconds to happen? After that, everything is served at normal speed.

-- Per Jessen, Zürich (18.6°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
Per Jessen wrote:
Looks like that first 404 takes 4 seconds to happen? After that, everything is served at normal speed.
Okay, the error-handler for 404 is /index.php, so everything is served through ProcessWire. Fair enough. Without knowing anything about the innards, how do I "profile" index.php? I googled it a bit; "xdebug" looks like one option.

-- Per Jessen, Zürich (18.9°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
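For the xdebug route, a minimal profiler setup looks something like this (xdebug 2.x settings; file locations are illustrative - on openSUSE the snippet would normally go under /etc/php5/conf.d/):

```shell
# Sketch of enabling the xdebug profiler. The ini file is written
# locally here; all paths are illustrative.
mkdir -p /tmp/xdebug-out
cat > xdebug.ini <<'EOF'
zend_extension = xdebug.so
; do not profile every request on a production box ...
xdebug.profiler_enable = 0
; ... only those carrying XDEBUG_PROFILE as a GET/POST/cookie value
xdebug.profiler_enable_trigger = 1
xdebug.profiler_output_dir = /tmp/xdebug-out
EOF
# Request the slow page with ?XDEBUG_PROFILE=1 appended, then open the
# resulting cachegrind.out.* file in KCachegrind.
```

The trigger-based approach keeps the profiler off for normal traffic, which matters on a shared production host.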
On 2016-06-08 08:58, Per Jessen wrote:
nice FF tool, thanks, I didn't know about that one:
Indeed! Nice tool.
Looks like that first 404 takes 4 seconds to happen? After that, everything is served at normal speed.
Some of the graphics there (png, jpeg) also seem to be large. One takes 0.6 s to download or to serve.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Carlos E. R. wrote:
Some of the graphics there (png, jpeg) also seem to be large. One takes 0.6 s to download or to serve.
Yeah, I told the customer about that a long time ago and suggested some optimizations, but it's really about the 3-4 seconds it takes to run index.php. I would run Apache single-threaded with strace, but I can't on a production server :-)

-- Per Jessen, Zürich (16.6°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On 2016-06-08 13:16, Per Jessen wrote:
Yeah, I told the customer about that a long time ago and suggested some optimizations, but it's really about the 3-4 seconds it takes to run index.php.
Some sites optimize wonderfully for "mobile" users. Others don't do it at all. Both the width of the page and the byte size of the things to download need optimizing. I hate looking at some news sites on my tablet because they are close to unreadable.
I would run apache single-threaded with strace, but I can't on a production server :-)
You can attach to the PID of an already-running process. strace can attach to children automatically, but I don't know if that works when attaching by PID.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Carlos E. R. wrote:
I would run apache single-threaded with strace, but I can't on a production server :-)
You can attach to a PID of an already running process. strace can attach to children automatically, but I don't know if you can do that with PID.
Yes, but I can't tell which Apache thread is going to run my request. Hmm, I guess I could strace all of them ...

-- Per Jessen, Zürich (16.7°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
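Stracing all of them can be done in one go. A sketch, assuming the typical openSUSE prefork/mpm-itk worker name "httpd2-prefork" (the command is only echoed here so it can be reviewed before running it as root):

```shell
# Sketch: building one strace command that attaches to every Apache
# worker at once. The worker process name is an assumption.
args=""
for p in $(pgrep httpd2-prefork); do
    args="$args -p $p"
done
cmd="strace -f -tt -T -o /tmp/apache-trace$args"
echo "$cmd"
# -f follows forked children, -tt gives microsecond timestamps, and -T
# appends the time spent inside each syscall, so slow lstat()/open()
# calls stand out in the output.
```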
You can update mod_log_config.conf to include %P, which logs the PID of
the child process that served each request.
--
Later,
Darin
On 2016-06-08 13:35, Per Jessen wrote:
Yes, but I can't tell which apache thread is going to run my request. Hmm, I guess I could strace all of them ....
Yes, that is what I meant.

-f   Trace child processes as they are created by currently traced processes as a result of the fork(2), vfork(2) and clone(2) system calls. Note that -p PID -f will attach all threads of process PID if it is multi-threaded, not only the thread with thread_id = PID.

-ff   If the -o filename option is in effect, each process's trace is written to filename.pid where pid is the numeric process id of each process. This is incompatible with -c, since no per-process counts are kept.

-b syscall   If the specified syscall is reached, detach from the traced process. Currently, only the execve syscall is supported. This option is useful if you want to trace a multi-threaded process and therefore require -f, but don't want to trace its (potentially very complex) children.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Carlos E. R. wrote:
Yes, that is what I meant.
-f Trace child processes as they are created by currently traced processes as a result of the fork(2), vfork(2) and clone(2) system calls. Note that -p PID -f will attach all threads of process PID if it is multi-threaded, not only thread with thread_id = PID.
-ff If the -o filename option is in effect, each process's trace is written to filename.pid where pid is the numeric process id of each process. This is incompatible with -c, since no per-process counts are kept.
That is slightly different - Apache runs a number of threads; afaict each forks a new one per request. Anyway, I straced all the running threads and got a good trace!

/home/per/Diagnostics/strace-http-21602-23194.txt

The request starts around 13:52:03.875969 and finishes at 13:52:11.622257 (roughly). It took a little longer than normal, presumably due to the strace? I count 184 open() calls, of which 157 are on '/srv/www/vhosts', 176 calls of access(), and 15136 x lstat64().

-- Per Jessen, Zürich (17.4°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
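Tallying such a trace can be scripted; a sketch, with a few fabricated strace lines standing in for the real /home/per/Diagnostics trace file:

```shell
# Sketch: counting syscalls in an strace log the way Per did.
# These trace lines are fabricated examples, not from the real trace.
cat > trace.txt <<'EOF'
13:52:03.875969 open("/srv/www/vhosts/site/index.php", O_RDONLY) = 5
13:52:03.876101 lstat64("/srv/www/vhosts", {st_mode=S_IFDIR|0755, ...}) = 0
13:52:03.876150 lstat64("/srv/www/vhosts/site", {st_mode=S_IFDIR|0755, ...}) = 0
13:52:03.876202 access("/srv/www/vhosts/site/.htaccess", F_OK) = -1 ENOENT
EOF
# Count each syscall name (the word before the opening parenthesis):
grep -oE '[a-z_][a-z0-9_]*\(' trace.txt | sort | uniq -c | sort -rn
```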
On 2016-06-08 15:18, Per Jessen wrote:
I count 184 open() calls, of which 157 are on '/srv/www/vhosts', 176 calls of access(), and 15136 x lstat64().
There is another tool that says how much time is used in each zone or call. Ah, a profiler. Maybe thousands of calls to the same function is the issue, or maybe not, because the I/O is cached. But right now I don't remember how to use a profiler on Linux.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
Carlos E. R. wrote:
On 2016-06-08 15:18, Per Jessen wrote:
I count 184 open() calls, of which 157 are on '/srv/www/vhosts', 176 calls of access(), and 15136 x lstat64().
There is another tool that says how much time is used in each zone or call. Ah, a profiler. Maybe thousands of calls to the same function is the issue, or maybe not, because the I/O is cached.
But right now I don't remember how to use a profiler on Linux.
If it were C code, no problem, but it's PHP. Regardless, it's not so important how much time is spent where; it's about reducing the time from 4 seconds to 1 second.

-- Per Jessen, Zürich (15.1°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 2016-06-09 08:06, Per Jessen wrote:
If it were C code, no problem, but it's PHP. Regardless, it's not so important how much time is spent where; it's about reducing the time from 4 seconds to 1 second.
Well, exactly. Find out the line(s) in the source code where most time is spent. Hopefully those 4 seconds are lost in a small section, even a single line, perhaps a loop. Once found, the code can be changed. Unfortunately, it is PHP code.

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Carlos E. R. wrote:
Well, exactly. Find out the line(s) in the source code where most time is spent.
Already done - 15000 calls of lstat().

-- Per Jessen, Zürich (18.0°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On 2016-06-09 13:10, Per Jessen wrote:
Already done - 15000 calls of lstat().
SOURCE code! Which line in the PHP?
Dave Howorth wrote:
SOURCE code! Which line in the PHP?
I know, I know - there are no such calls in the code, so it's in the PHP interpreter.

-- Per Jessen, Zürich (19.3°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On 2016-06-09 14:39, Per Jessen wrote:
I know, I know - there are no such calls in the code, so it's in the PHP interpreter.
Yes, but which line in the source code causes the interpreter to make those calls? There's no obvious place, so it's necessary to use the debugger to discover where it is. The developer should be seeing this too, so I'm very curious if he isn't.
Dave Howorth wrote:
Yes but which line in the source code causes the interpreter to make those calls? There's no obvious place, so it's necessary to use the debugger to discover where it is.
The developer should see this too, so I'm very curious if he isn't.
Well, I can at least report some progress -

a) I have a test system up and running.
b) I have xdebug working. I find the documentation a little lacking, but I'm getting there.
c) In utter desperation, I actually went and googled "php many lstat calls": http://grokbase.com/t/php/php-internals/087f1t68mm/lstat-call-on-each-direct... (precisely what I am seeing).

I did a function trace and I do not see loads of clearstatcache() calls. I've increased "realpath_cache_size", but per http://php.net/manual/de/function.realpath-cache-size.php: "Note that the realpath cache is not used if either safe_mode is on or an open_basedir restriction is in effect." This is having a huge performance effect, causing lots of calls to lstat: http://bugs.php.net/bug.php?id=52312

I am using "open_basedir", of course. Can't get through to bugs.php.net at the moment; I'm still reading the bug report. On my test system, a plain office PC, but with directly attached disk (vs. iSCSI), the 15000 lstat calls are done in about 500 ms.

-- Per Jessen, Zürich (19.6°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
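For reference, the realpath-cache settings being discussed look like this (values illustrative, not tuned; as the quoted manual note says, the cache is bypassed entirely while open_basedir or safe_mode is in effect, so raising them only pays off once the restriction is lifted):

```shell
# Config sketch: php.ini fragment raising the realpath cache.
# Values are examples only.
cat > realpath-cache.ini <<'EOF'
realpath_cache_size = 256k
realpath_cache_ttl = 300
EOF
```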
On 06/09/2016 10:32 AM, Per Jessen wrote:
On my test-system, a plain office PC, but directly attached disk (vs. iSCSI), the 15000 lstat calls are done in about 500ms.
Still, the matter of why there are 15,000 lstat() calls is unresolved.

-- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
Anton Aylward wrote:
On 06/09/2016 10:32 AM, Per Jessen wrote:
On my test-system, a plain office PC, but directly attached disk (vs. iSCSI), the 15000 lstat calls are done in about 500ms.
Still, the matter of why there are 15,000 lstat() calls is unresolved.
That is actually well explained in the bug report and the link: http://grokbase.com/t/php/php-internals/087f1t68mm/lstat-call-on-each-direct... "Well explained" = it sounds reasonable; I won't pretend to have familiarized myself with the PHP internals to that level.

-- Per Jessen, Zürich (21.9°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On 06/09/2016 11:15 AM, Per Jessen wrote:
"well explained" = it sounds reasonable. I won't pretend to have familiarized myself with the php internals to that level.
It's doing something that, perhaps, Apache should be doing. I look at that
<?php $fp=fopen("/var/www/metacafe/test","r"); fclose($fp); ?>
When running with strace -e lstat I see this:

lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/var/www", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
lstat("/var/www/metacafe", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/var/www/metacafe/test", 0x7fbfff9b10) = -1 ENOENT (No such file or directory)
and I think it's doing
I want to get to "/var/www/metacafe/test". Can I access "/var"?
lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
Yes I can. Can I get to "/var/www"?
lstat("/var/www", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
Yes I can. Can I get to "/var/www/metacafe/test"?
lstat("/var/www/metacafe", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/var/www/metacafe/test", 0x7fbfff9b10) = -1 ENOENT (No such file or directory)
No I can't.

This doesn't explain why it scans so many files, though.

And yes, moving the base up makes sense. I have a number of applications installed in ... well, on my system it's "/srv/www/htdocs/". As far as, in my case, "owncloud" goes, its base is set to "/srv/www/htdocs/owncloud/" to start with; that's its logical root. This should be taken care of by the vhost configuration. Yes, it means you are now down to one or two lstat() calls per file, but it still doesn't explain why it needs to do the 4500 files.

Quite apart from the caching that PHP may or may not be doing between HTTP invocations (probably not, this is a connectionless protocol, remember), there should be inode and pathname caching done by the kernel. It won't be so effective in a shared environment, competing with other users opening and stating files.
Anton Aylward wrote:
I look at that
<?php $fp=fopen("/var/www/metacafe/test","r"); fclose($fp); ?>
When running with strace -e lstat I see this:

lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/var/www", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
lstat("/var/www/metacafe", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("/var/www/metacafe/test", 0x7fbfff9b10) = -1 ENOENT (No such file or directory)
and I think it's doing
I want to get to "/var/www/metacafe/test". Can I access "/var"?
As it is realpath() doing the calling, I think it's about determining if it's a symlink or not.
This doesn’t explain why it scans so many files, though.
I see two reasons - a modular design which causes many include()s, and the use of "open_basedir", which disables caching of realpath() results. I could probably provide some real numbers for you tomorrow, but I'm now more interested to see a) how much removing open_basedir will improve the situation, and b) what the security implications of that might be. Fixing ProcessWire and/or the PHP interpreter is not on my list of priorities :-)
Quite apart from the caching that PHP may or may not be doing between HTTP invocations (probably not, this is a connectionless protocol remember) there should be inode and pathname caching done by the kernel.
Yeah, that is an interesting point.

-- Per Jessen, Zürich (17.2°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
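One way to measure the open_basedir cost Per mentions would be an A/B pair of vhosts over the same docroot, one restricted and one not (ServerNames and paths are hypothetical; php_admin_value assumes mod_php):

```shell
# Sketch: Apache config for comparing the same site with and without
# an open_basedir restriction. All names/paths are placeholders.
cat > basedir-ab.conf <<'EOF'
<VirtualHost *:80>
    ServerName with-basedir.example.com
    DocumentRoot /srv/www/vhosts/testsite
    php_admin_value open_basedir /srv/www/vhosts/testsite
</VirtualHost>

<VirtualHost *:80>
    ServerName no-basedir.example.com
    DocumentRoot /srv/www/vhosts/testsite
</VirtualHost>
EOF
# Then time the same page via both names, e.g.:
#   time curl -s http://with-basedir.example.com/ >/dev/null
```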
On 2016-06-09 23:15, Per Jessen wrote:
Anton Aylward wrote:
Quite apart from the caching that PHP may or may not be doing between HTTP invocations (probably not, this is a connectionless protocol remember) there should be inode and pathname caching done by the kernel.
Yeah, that is an interesting point.
nscd?

-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
Carlos E. R. wrote:
On 2016-06-09 23:15, Per Jessen wrote:
Anton Aylward wrote:
Quite apart from the caching that PHP may or may not be doing between HTTP invocations (probably not, this is a connectionless protocol remember) there should be inode and pathname caching done by the kernel.
Yeah, that is an interesting point.
nscd?
No, there's no name resolution involved here.

-- Per Jessen, Zürich (24.9°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On Thu, 2016-06-09 at 23:15 +0200, Per Jessen wrote:
Anton Aylward wrote:
Quite apart from the caching that PHP may or may not be doing between HTTP invocations (probably not, this is a connectionless protocol, remember), there should be inode and pathname caching done by the kernel.
Yeah, that is an interesting point.
What caching can the kernel do? The filesystem is multi-hosted (at least potentially), yes? So surely the kernel has to go right back to the filesystem every time to make sure it gets current state information.

What atime setting does the filesystem have? Anything other than noatime will definitely require the kernel to go right back to the disk blocks at least some of the time.

The bug report describes what PHP does; there's lots of discussion of the caching it does and doesn't do. In particular, it does cache between [HTTP] requests, because, as it says, what's the point of a cache otherwise?
On 06/09/2016 07:09 PM, Dave Howorth wrote:
What caching can the kernel do? The filesystem is multi-hosted (at least potentially), yes? So surely the kernel has to go right back to the filesystem every time to make sure it gets current state information.
No. If user A searches for a file (be it to read, write or just stat()) and the kernel reads the inode for that file, it caches it, LRU. If later, user B opens that file to write to it, so modifying the size and time, or perhaps instead does a chmod(), the kernel has that inode in its cache once again. It will sync the inode to disk to preserve the metastructure, but it's still in the cache. So when user A comes to open or stat the file again, the updated version is still in the cache.

IIRC inode caching was in the non-networked UNIX V7 of the 1970s. I don't know if it was in V6; I'd have to go to the library and find a copy of Lions.

This holds true if the file system is NFS-mounted by multiple hosts; the caching is done by the host that mounts the disk, not the ones that do the NFS mounting. The clients may be configured to cache the results of the NFS 'getattr' call, but it will be for a very short time.

You might look at the 'slabtop' command. I see that ext4 has additional inode caching, and so too, it seems, does Btrfs. This is news to me :-/

Sun later introduced pathname caching as well. I saw the code for that once, but it was years ago and I don't recall the algorithm. At https://www.kernel.org/doc/Documentation/sysctl/vm.txt you can read the entry for vfs_cache_pressure.
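The kernel-side knob mentioned there can be inspected directly; a quick sketch (vfs_cache_pressure defaults to 100; lower values keep dentries and inodes cached longer, which is exactly what tens of thousands of lstat() calls benefit from):

```shell
# Sketch: peeking at the dentry/inode cache reclaim setting.
pressure=$(cat /proc/sys/vm/vfs_cache_pressure)
echo "vfs_cache_pressure = $pressure"
# slabtop (as root) shows the live dentry and inode slab usage.
```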
Dave Howorth wrote:
On Thu, 2016-06-09 at 23:15 +0200, Per Jessen wrote:
Anton Aylward wrote:
Quite apart from the caching that PHP may or may not be doing between HTTP invocations (probably not, this is a connectionless protocol, remember), there should be inode and pathname caching done by the kernel.
Yeah, that is an interesting point.
What caching can the kernel do? The filesystem is multi-hosted (at least potentially), yes?
It's multipathed, yes, but there's only one copy (as seen from the running system). The system sees a SCSI disk, that's all.
So surely the kernel has to go right back to the filesystem every time to make sure it gets current state information.
What atime setting does the filesystem have? Anything other than noatime will definitely require the kernel to go right back to the disk blocks at least some of the time.
Umm, /srv/www is not mounted with noatime; maybe that needs testing too. Seems like a reasonable thing for a webserver.

-- Per Jessen, Zürich (13.4°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
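Testing noatime doesn't need downtime; a sketch (the device name is a placeholder, and the live remount needs root):

```shell
# Sketch: trying the noatime theory. The remount can be done live;
# the fstab line makes it permanent. Device name is illustrative.
cat > fstab-snippet <<'EOF'
# /etc/fstab entry for the web root, with atime updates disabled
/dev/sdb1  /srv/www  ext4  noatime  0 2
EOF
# Live test (as root):
#   mount -o remount,noatime /srv/www
#   mount | grep /srv/www    # confirm noatime is now in effect
```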
On 2016-06-10 07:12, Per Jessen wrote:
It's multipath'ed yes, but there's only one copy (as seen from the running system). The system sees a SCSI disk, that's all.
I was mistaken - I didn't understand how iSCSI worked; thought it allowed dual simultaneous connections.
So surely the kernel has to go right back to the filesystem every time to make sure it gets current state information.
What atime setting does the filesystem have? Anything other than noatime will definitely require the kernel to go right back to the disk blocks at least some of the time.
Umm, /srv/www is not mounted with noatime, maybe that needs testing too. Seems like a reasonable thing for a webserver.
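A quick way to test that is to read a file and compare its access time before and after; a minimal sketch (the function name is mine, the path is whatever file under /srv/www you want to probe, and note that the default relatime option already suppresses most atime writes):

```python
import os
import time

def atime_updated_on_read(path):
    """Return True if reading `path` bumps its access time,
    i.e. the filesystem is not mounted noatime (relatime may
    also suppress the update for recently-read files)."""
    before = os.stat(path).st_atime_ns
    time.sleep(0.01)
    with open(path, "rb") as f:
        f.read()
    return os.stat(path).st_atime_ns != before

# Example (hypothetical path on the web server):
# print(atime_updated_on_read("/srv/www/htdocs/index.php"))
```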
There's some difference between the production server versus the test server and development machine that is causing the problem (in the sense of making bad behaviour unacceptable - lots of lstats taking a very long time - apparently over 250 µs each). I'm just trying to think of things that might affect the timing. I take it reorganizing the paths or using a ramfs cache are not sensible for whatever reason? Have you got anywhere with enquiries to processwire or PHP?
On 06/10/2016 05:36 AM, Dave Howorth wrote:
There's some difference between the production server versus the test server and development machine that is causing the problem (in the sense of making bad behaviour unacceptable - lots of lstats taking a very long time - apparently over 250 µs each). I'm just trying to think of things that might affect the timing.
The normal difference between those and a development machine is the multi-user load. And "multi" a few other things too! A production system *will* have many users and a variegated load. I've mentioned inode caching and pathname caching, which *NIX has always done well. In a heavily multi-user, multitasking environment such as a production web server at an ISP with many Apache virtual web hosts, there is going to be a great deal of churn in both of those.

<Sidebar> I suspect that a major web site like Wikipedia and user-account ISPs like the one I use do load balancing in very different ways. For a start, Wikipedia-scale sites have different servers for the graphics & CSS & JavaScript from the mainline text. Or at least different "names". Those "names" might in themselves be balanced across a number of pieces of hardware with either round-robin DNS or a hardware load balancer in front. The storage for the text of the pages will probably be a database rather than files, which can more easily be shared and which can do a completely different type of caching from what we've discussed so far. How the IMG, CSS and JavaScript are stored & accessed also offers further possibilities.

A multi-user ISP such as the one I use takes a different approach, and it is quite possible the one in question follows this model as well. My ISP loads up user accounts on a single machine, be they shared vhost Apache services or actual virtual hosts, until either a space or load limit is reached; then they start with a new machine. Some services are networked, but all a single user's files are on his 'home' machine. Any NFS sharing, any multiple access of files other than the normal sharing of binaries with Linux, is for the system, not the user applications. As with normal Linux, each user owns his own files, and the login uses chroot or similar to make other users' files invisible to him. As a result, only the user (or sysadmins) have access to a user's files.
Thus the whole issue of the inode in core being changed by some other user accessing the file makes no sense, because it isn't going to happen! The worst case is a hacker breaking into the account and infecting the files; the marginal case is the files being 'accessed' by an automatic backup process. What *will* happen is that the in-core caches suffer churn simply because this is a multi-user system. </sidebar>

Multi-user churn won't happen in a single-user development environment. It can be simulated in a test environment if, and only if, that is part of the defined test suite. While I've seen test suites that simulate 'load' by running processes, even ones that try to grab memory, I've not seen ones that deliberately attack the various SLABs and caches.

"SLABs"? Well yes; quite apart from the generic VFS caches, many file systems have their own inode and other SLAB caches. Run 'slabtop' or read the contents of /proc/slabinfo or 'man 5 slabinfo' and you'll see, for example, not only 'kernfs_node_cache', 'inode_cache' and 'proc_inode_cache' but also 'ext4_inode_cache' and in my case 'reiser_inode_cache' and 'btrfs_inode'. Nobody said this was simple! There's a LOT you can 'tune', or perhaps, 'upset'.

But testing is a strange art. I recall one case where I was testing a web application written by a 3rd party for a client. I just clicked on the blank areas of the screen repeatedly and the application went haywire! The developer's comment was, and I quote: "You're not supposed to do that". Well, if users only ever did what developers thought they should, and applications were installed and used in the kind of situations the developers developed them in, then we wouldn't be having this discussion! -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 2016-06-10 12:49, Anton Aylward wrote:
On 06/10/2016 05:36 AM, Dave Howorth wrote:
There's some difference between the production server versus the test server and development machine that is causing the problem (in the sense of making bad behaviour unacceptable - lots of lstats taking a very long time - apparently over 250 µs each). I'm just trying to think of things that might affect the timing.
The normal difference between those and a development machine is the multi-user load.
But we know that isn't the issue here!
On 06/10/2016 08:53 AM, Dave Howorth wrote:
On 2016-06-10 12:49, Anton Aylward wrote:
On 06/10/2016 05:36 AM, Dave Howorth wrote:
There's some difference between the production server versus the test server and development machine that is causing the problem (in the sense of making bad behaviour unacceptable - lots of lstats taking a very long time - apparently over 250 µs each). I'm just trying to think of things that might affect the timing.
The normal difference between those and a development machine is the multi-user load.
But we know that isn't the issue here!
Yes, that's my point, or part of it at least. Since that isn't the issue, the kernel *should* be caching inodes and directory path fragments, and the kernel cache latency should be greater than the time between the client calls to the Apache application, even if, because it is connectionless, there is a new instance of the service every time and the PHP process and its own cache is thrown away every time (I realize there are ways to avoid that). Even if something else accesses the same user's files, the inodes will still be cached. So subsequent invocations should return from the directory and inode cache very fast. In fact, since the subsequent lstat() calls differ only by the last segment, the file name, the cache should contain the earlier fragments.

Let me be a bit pedantic and explicit about what I mean when I say there is something wrong with the way things are being cached. Imagine this is C code (or even Perl code come to that) rather than PHP, and there is no caching in the application; we are only considering the kernel cache. What is happening is:

1. lstat("/var")
2. lstat("/var/www") - well, the DPATH and inode for "/var" should be in the kernel cache, so that should not take long. We get one disk hit for #1, one disk hit for #2.
3. lstat("/var/www/metacafe") - same logic as above but now for the next pathname fragment; one disk hit for #3, the rest come from the kernel cache.
4. lstat("/var/www/metacafe/file1") - same logic. One more disk hit.

Now we access the same but for the next file:

5. lstat("/var") - served from kernel cache
6. lstat("/var/www") - served from kernel cache
7. lstat("/var/www/metacafe") - served from kernel cache
8. lstat("/var/www/metacafe/file2") - hit the disk for that

I've prefaced that by saying that it is C code so as to eliminate the issue of any caching in the application. I'm focusing on the kernel.
You can code that up and see the timing of 1..4 vs 5..8. Now let's put an interpreter above that - PHP, Perl, Ruby - write the code and see it from the command line. I realize that PHP is more a web scripting language than Perl or Ruby, but yes, a command line version is available. So what does it look like with the interpreter layer above?

IF the 3 interpreters show a similar overhead, then EITHER all three interpreters have innards that bugger around with their own idea of 'optimizing' system calls and caching outside of being explicitly told to do so by the programmer, OR the problem lies with the code being run under Apache (or some other web server).

Hmm, "some other web server"? Have you tried it with nginx or thttpd or even lighttpd ???
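The 1..4 vs 5..8 experiment is easy to sketch from a scripting layer too; here is a hypothetical Python version (function name is mine) that times an lstat() on each successive prefix of a path, so running it twice over the same path shows the effect of the kernel's dentry/inode caches:

```python
import os
import time

def lstat_prefix_timings(path):
    """Time lstat() on each successive prefix of an absolute `path`
    (/var, /var/www, ...), returning (prefix, nanoseconds) pairs."""
    timings = []
    prefix = ""
    for part in path.strip("/").split("/"):
        prefix += "/" + part
        t0 = time.perf_counter_ns()
        os.lstat(prefix)
        timings.append((prefix, time.perf_counter_ns() - t0))
    return timings
```

Run it twice on the same path and compare the totals; with a warm cache the second pass should be dominated by syscall overhead rather than disk hits.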
Dave Howorth wrote:
So surely the kernel has to go right back to the filesystem every time to make sure it gets current state information.
What atime setting does the filesystem have? Anything other than noatime will definitely require the kernel to go right back to the disk blocks at least some of the time.
Umm, /srv/www is not mounted with noatime, maybe that needs testing too. Seems like a reasonable thing for a webserver.
There's some difference between the production server versus the test server and development machine that is causing the problem (in the sense of making bad behaviour unacceptable - lots of lstats taking a very long time - apparently over 250 µs each). I'm just trying to think of things that might affect the timing.
I take it reorganizing the paths or using a ramfs cache are not sensible for whatever reason?
It's shared environment, minor changes that don't change the general scheme would be fine, I could even disable open_basedir.
Have you got anywhere with enquiries to processwire or PHP?
Well - a) processwire, never even got my forum registration confirmation. b) I tried logging into http://bugs.php.net to update the bug, but my old userid doesn't seem valid anymore - login only possible with @php.net addresses. :-) -- Per Jessen, Zürich (24.5°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On Fri, 2016-06-10 at 17:42 +0200, Per Jessen wrote:
Dave Howorth wrote:
I take it reorganizing the paths or using a ramfs cache are not sensible for whatever reason?
It's shared environment, minor changes that don't change the general scheme would be fine, I could even disable open_basedir.
I haven't thought about the implications of disabling open_basedir at all. Could you try connecting an iSCSI filesystem to your test machine? That might prove whether iSCSI is an essential feature of the problem. If it is, then adding a local hard disk to the server might be the cheapest solution? Though it would probably cause you maintenance issues in the future.
Have you got anywhere with enquiries to processwire or PHP?
Well -
a) processwire, never even got my forum registration confirmation. b) I tried logging into http://bugs.php.net to update the bug, but my old userid doesn't seem valid anymore - login only possible with @php.net addresses. :-)
The bug seems moribund, in that only people who have the problem have posted to it recently, not people who might play a part in its solution. So I thought it might be more productive to post to the mailing list. But if you already had a login, I can see why the bug would be first port of call. Cheers, Dave
Dave Howorth wrote:
On Fri, 2016-06-10 at 17:42 +0200, Per Jessen wrote:
Dave Howorth wrote:
I take it reorganizing the paths or using a ramfs cache are not sensible for whatever reason?
It's shared environment, minor changes that don't change the general scheme would be fine, I could even disable open_basedir.
I haven't thought about the implications of disabling open_basedir at all.
It seems to be the main stumbling block - I haven't had time to think it through yet.
Could you try connecting an iSCSI filesystem to your test machine? That might prove whether iSCSI is an essential feature of the problem.
Not a bad idea, that should be easy enough to do.
If it is then adding a local hard disk to the server might be the cheapest solution? Though probably cause you maintenance issues in the future.
Yeah. It would mean a minimum of two disks, and then it's not too far to a dedicated server.
Have you got anywhere with enquiries to processwire or PHP?
Well -
a) processwire, never even got my forum registration confirmation. b) I tried logging into http://bugs.php.net to update the bug, but my old userid doesn't seem valid anymore - login only possible with @php.net addresses. :-)
The bug seems moribund, in that only people who have the problem have posted to it recently, not people who might play a part in its solution. So I thought it might be more productive to post to the mailing list. But if you already had a login, I can see why the bug would be first port of call.
I'll check the mailing list too, but I went to http://lists.php.net, got redirected to http://news.php.net where it's not really clear how to subscribe to a list. I think it's probably subscribe by email only. -- Per Jessen, Zürich (18.8°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On Fri, 2016-06-10 at 22:08 +0200, Per Jessen wrote:
Dave Howorth wrote: [snip]
I haven't thought about the implications of disabling open_basedir at all.
It seems to be the main stumbling block - I haven't had time to think it through yet. [snip]
Have you got anywhere with enquiries to processwire or PHP? [snip] The bug seems moribund, in that only people who have the problem have posted to it recently, not people who might play a part in its solution. So I thought it might be more productive to post to the mailing list. But if you already had a login, I can see why the bug would be first port of call.
I'll check the mailing list too, but I went to http://lists.php.net, got redirected to http://news.php.net where it's not really clear how to subscribe to a list. I think it's probably subscribe by email only.
The subscription instructions seem to be at http://php.net/mailing-lists.php I think I'd try the internals list. It might also be worth asking about the implications of disabling open_basedir. Have a good weekend. Cheers, Dave
On 2016-06-09 15:32, Per Jessen wrote:
Well, I can at least report some progress -
Congratulations.
a) I have a test system up and running.
b) I have xdebug working, I find the documentation is a little lacking, but I'm getting there.
c) In utter desperation, I actually went and googled "php many lstat calls":
http://grokbase.com/t/php/php-internals/087f1t68mm/lstat-call-on-each-direct...
(precisely what I am seeing).
Looks like you struck gold. If it is doing five lstats for every file and there are 4500 files, then hitting each file once with no caching would give you the bulk of your lstats, but why would it be hitting every file? Making a sitemap? It seems that moving the apache root to the system root gets around the problem. Obviously not desirable with a real system, but could you chroot or something? Maybe even reducing the number of levels in the path would help. It might be faster to access /htdoc than the current path.
I did a function trace, and I do not see loads of clearstatcache() calls. I've increased "realpath_cache_size", but
http://php.net/manual/de/function.realpath-cache-size.php
"Note that the realpath cache is not used if either safe_mode is on or an open_basedir restriction is in effect. This is having a huge performance effect, causing lots of calls to lstat. http://bugs.php.net/bug.php?id=52312
I am using "open_basedir", of course. Can't get through to bugs.php.net at the moment. I'm still reading the bugreport.
I can access http://bugs.php.net/bug.php?id=52312 but I'm not sure what you're having trouble with if you're already reading the bug report.
On my test-system, a plain office PC, but directly attached disk (vs. iSCSI), the 15000 lstat calls are done in about 500ms.
The interesting question to me there is why is that system faster than your server? Or rather, why is your server slow?
Dave Howorth wrote:
On 2016-06-09 15:32, Per Jessen wrote:
Well, I can at least report some progress -
Congratulations.
a) I have a test system up and running.
b) I have xdebug working, I find the documentation is a little lacking, but I'm getting there.
c) In utter desperation, I actually went and googled "php many lstat calls":
http://grokbase.com/t/php/php-internals/087f1t68mm/lstat-call-on-each-direct...
(precisely what I am seeing).
Looks like you struck gold.
Yes, very much so.
If it is doing five lstats for every file and there are 4500 files, then hitting each file once with no caching would give you the bulk of your lstats, but why would it be hitting every file? Making a sitemap?
I think it happens with include()/require() too, and processwire is quite modularised.
I am using "open_basedir", of course. Can't get through to bugs.php.net at the moment. I'm still reading the bugreport.
I can access http://bugs.php.net/bug.php?id=52312 but I'm not sure what you're having trouble with if you're already reading the bug report.
When I started writing the above, bugs.php.net was hanging ... forgot to delete the middle sentence before I posted.
On my test-system, a plain office PC, but directly attached disk (vs. iSCSI), the 15000 lstat calls are done in about 500ms.
The interesting question to me there is why is that system faster than your server? Or rather, why is your server slow?
The test system has a local disk, whereas the server uses iSCSI to the SAN. It's not even about bandwidth available, I'm guessing it's simply a longer path for each call. When I googled, people were reporting similar issues with NFS. I'll try to summarise the problem - the stat() calls are made by realpath(), which resolves a path with relative and symbolic links to an absolute name. These calls are normally cached, except when open_basedir is active (which it typically would be in a shared environment). It is a pity that there is no official patch for this yet, it's been going on for a few years. I found something called "turbo_realpath" which seems to work, at least on the test system. It enables the realpath() cache and disables the functions for creating symlinks. The other possibility is to disable open_basedir. As we're using mpm-itk, every request is run with user permissions anyway, so open_basedir might not actually be necessary. -- Per Jessen, Zürich (21.9°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
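Conceptually, what the realpath cache (and hence turbo_realpath) buys you is just memoisation of the symlink-resolving walk. A rough Python analogy, with lru_cache standing in for PHP's fixed-size cache (this mirrors the idea only, not PHP's internals):

```python
import os
from functools import lru_cache

# Rough analogy: PHP's realpath cache memoises resolved paths so that
# repeated lookups skip the per-component lstat() chain entirely.
@lru_cache(maxsize=4096)
def cached_realpath(path):
    return os.path.realpath(path)
```

The first call pays for the full walk; subsequent calls for the same path return from the cache, which is exactly what disappears when open_basedir forces the cache off.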
On 2016-06-09 16:31, Per Jessen wrote:
Dave Howorth wrote:
On 2016-06-09 15:32, Per Jessen wrote:
Well, I can at least report some progress -
Congratulations.
a) I have a test system up and running.
b) I have xdebug working, I find the documentation is a little lacking, but I'm getting there.
c) In utter desperation, I actually went and googled "php many lstat calls":
http://grokbase.com/t/php/php-internals/087f1t68mm/lstat-call-on-each-direct...
(precisely what I am seeing).
Looks like you struck gold.
Yes, very much so.
If it is doing five lstats for every file and there are 4500 files then hitting each file once with no cacheing would give you the bulk of your lstats but why would it be hitting every file? Making a sitemap?
I think it happens with include()/require() too, and processwire is quite modularised.
I am using "open_basedir", of course. Can't get through to bugs.php.net at the moment. I'm still reading the bugreport.
I can access http://bugs.php.net/bug.php?id=52312 but I'm not sure what you're having trouble with if you're already reading the bug report.
When I started writing the above, bugs.php.net was hanging ... forgot to delete the middle sentence before I posted.
Ah, OK.
On my test-system, a plain office PC, but directly attached disk (vs. iSCSI), the 15000 lstat calls are done in about 500ms.
The interesting question to me there is why is that system faster than your server? Or rather, why is your server slow?
The test system has a local disk, whereas the server uses iSCSI to the SAN. It's not even about bandwidth available, I'm guessing it's simply a longer path for each call. When I googled, people were reporting similar issues with NFS.
I'll try to summarise the problem -
the stat() calls are made by realpath(), which resolves a path with relative and symbolic links to an absolute name. These calls are normally cached, except when open_basedir is active (which it typically would be in a shared environment).
It is a pity that there is no official patch for this yet, it's been going on for a few years. I found something called "turbo_realpath" which seems to work, at least on the test system. It enables the realpath() cache and disables the functions for creating symlinks.
The other possibility is to disable open_basedir. As we're using mpm-itk, every request is run with user permissions anyway, so open_basedir might not actually be necessary.
The bug doesn't seem to have had any updates from Rasmus in recent years. The suggestion to only do the realpath test when actually opening the file seemed sensible. I wonder if the discussion has moved somewhere else? Maybe post to the PHP-INTERNALS list and ask? Disabling open_basedir while also disabling symlinks in PHP, and ideally in Apache if possible, might be a fix. I haven't looked closely enough to know. I'm nervous about deliberately opening security holes, for good reason, since the bad guys will be testing for those. Adding a local disk is almost certainly a fix, if possible. Making a RAM-based filesystem to cache that part of your disks might be a possibility, if you can separate out anything that changes state whilst the service is active.
On 06/09/2016 11:56 AM, Dave Howorth wrote:
Disabling open_basedir while also disabling symlinks in PHP and ideally in Apache if possible might be a fix. I haven't looked closely enough to know. I'm nervous about deliberately opening security holes for good reason since the bad guys will be testing for those.
Adding a local disk is almost certainly a fix, if possible.
Making a RAM-based filesystem to cache that part of your disks might be a possibility, if you can separate out anything that changes state whilst the service is active.
Those are good ideas, Dave, but they really patch over the problem, which I think is a design/architecture one: Why is it lstat()ing all those files every time? Solve that and those 'patches' aren't critical. I'm not saying that speed improvements like these are of no use, just that a design/architecture change gets to the root cause, and fixing that is more bountiful.
On Thu, 2016-06-09 at 12:13 -0400, Anton Aylward wrote:
On 06/09/2016 11:56 AM, Dave Howorth wrote:
Disabling open_basedir while also disabling symlinks in PHP and ideally in Apache if possible might be a fix. I haven't looked closely enough to know. I'm nervous about deliberately opening security holes for good reason since the bad guys will be testing for those.
Adding a local disk is almost certainly a fix, if possible.
Making a RAM-based filesystem to cache that part of your disks might be a possibility, if you can separate out anything that changes state whilst the service is active.
Those are good ideas, Dave, but they really patch over the problem, which I think is a design/architecture one:
The point is what can Per change with reasonable effort versus what can Rasmus and friends sensibly fix (given that they've shown no signs of doing it for up to a decade, depending how you measure). Why do I know Perl and not PHP? It's not a coincidence. Cheers, Dave
Why is it lstat()ing all those files every time?
Solve that and those 'patches' aren't critical. I'm not saying that speed improvements like these are of no use, just that a design/architecture change gets to the root cause and fixing that is more bountiful.
On 06/09/2016 04:19 PM, Dave Howorth wrote:
Why do I know Perl and not PHP? It's not a coincidence.
LOL! + LOTS! I've been using Perl for over 25 years, done some amazing stuff with it. I've looked at PHP a couple of times, bought a book about it years ago; opened it, read the first chapter and decided that this was not for me. Like this: <quote src="https://blog.nexcess.net/2010/03/31/php-open_basedir-and-magento-performance/"> There are negative side effects in relation to system performance when using open_basedir. The most significant is that when open_basedir is enabled, the PHP realpath cache will be disabled. The PHP realpath cache has been available in PHP since 5.1.0 and caches the paths of PHP include files. If running a smaller site with a low file count and with relatively shallow directory paths, the fact that directory paths will not be cached isn't necessarily essential. But with applications such as Magento built on the Zend Framework, you end up with both a large base file count with a large include path and a very deep directory structure. In this situation you need to make sure that your paths do become cached with the realpath cache which means leaving open_basedir disabled. </quote> You don't get nonsense like that with Perl.
Anton Aylward wrote:
On 06/09/2016 04:19 PM, Dave Howorth wrote:
Why do I know Perl and not PHP? It's not a coincidence.
LOL!
+ LOTS!
I've been using Perl for over 25 years, done some amazing stuff with it. I've looked at PHP a couple of times, bought a book about it years ago; opened it, read the first chapter and decided that this was not for me.
Like this: <quote src="https://blog.nexcess.net/2010/03/31/php-open_basedir-and-magento-performance/"> There are negative side effects in relation to system performance when using open_basedir. The most significant is that when open_basedir is enabled, the PHP realpath cache will be disabled. The PHP realpath cache has been available in PHP since 5.1.0 and caches the paths of PHP include files.
Anton, despite the current topic, that has got nothing to do with the language as such, it's an environmental setting for securing shared environments in Apache.
If running a smaller site with a low file count and with relatively shallow directory paths, the fact that directory paths will not be cached isn’t necessarily essential. But with applications such as Magento built on the Zend Framework, you end up with both a large base file count with a large include path and a very deep directory structure.
To my knowledge, we have two hosted Magento customers, both on their own servers.
You don't get nonsense like that with Perl.
There is no doubt a reason why the most popular web-based tools are written in PHP (owncloud, wordpress, roundcube, joomla, moodle and many more). I don't recall in the last ten years having had reason to install a web-based perl application, whereas I have installed PHP, Ruby-on-Rails and even Objective-C apps. It's always about the right tool for the right job, but you also have to have the people to wield it. -- Per Jessen, Zürich (16.3°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
Dave Howorth wrote:
The point is what can Per change with reasonable effort versus what can Rasmus and friends sensibly fix (given that they've shown no signs of doing it for up to a decade (depending how you measure).
Yep, that is exactly it.
Why do I know Perl and not PHP? It's not a coincidence.
Funny, the opposite here - I'm essentially a C & assembler programmer, and I like PHP for its C-like structure/syntax. -- Per Jessen, Zürich (16.6°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 06/08/2016 07:25 AM, Carlos E. R. wrote:
Some sites optimize wonderfully for "mobile" users. Others don't do it at all. It needs optimizing the width of the page and the byte size of things to download, both. I hate looking at some news sites on my tablet because they are close to unreadable.
+1 :-) And I have a large TABLET! So make that 'phone'.
On 2016-06-08 13:45, Anton Aylward wrote:
On 06/08/2016 07:25 AM, Carlos E. R. wrote:
Some sites optimize wonderfully for "mobile" users. Others don't do it at all. It needs optimizing the width of the page and the byte size of things to download, both. I hate looking at some news sites on my tablet because they are close to unreadable.
+1 :-) And I have a large TABLET! So make that 'phone'.
My phone is large :-) Judging from the difficulty in finding a big enough... er, what do you call it - bag, envelope, thing to put the phone in, and that in your pocket. It falls out from my pocket. 8*15 cm. There are bigger phones, though. Too expensive for me. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 2016-06-08 12:16, Per Jessen wrote:
I would run apache single-threaded with strace, but I can't on a production server :-)
Which begs the question of whether there is a test server? Can't you start and trace a new copy of apache on a different port?
Dave Howorth wrote:
On 2016-06-08 12:16, Per Jessen wrote:
I would run apache single-threaded with strace, but I can't on a production server :-)
Which begs the question of whether there is a test server?
There isn't. Never had much use for one. I was thinking of copying the whole setup to a test-server for this occasion though.
Can't you start and trace a new copy of apache on a different port?
Might be possible, yeah. -- Per Jessen, Zürich (17.4°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On 06/08/2016 07:11 AM, Carlos E. R. wrote:
Some of the graphics there (png, jpeg) also seem to be large. One takes 0.6s to download or to serve.
OUCH and DOUBLE OUCH. You said earlier that it was a server problem. In view of this I'd say "yes, but...": if a browser has to resize an image to fit the specified dimensions, or otherwise fit the image into a bounding box, that takes time and computation on the part of the browser.

You can quite reasonably call this a server-side error. If the page designer had resized the image to the required dimensions in the first place, it would be smaller, quicker to download, and would not need resizing by the browser.

You can view this as one of a class of possible server-side errors that I've seen:

- error-ridden HTML that the browser has to make some sense of as best it can
- error-ridden, massive or massively redundant CSS: multiple definitions, multiple files that the browser has to download and parse. Again an error on the part of the designer.
- as above, but for JavaScript or other mobile code
- useless eye-candy that consumes browser and CPU power to little effect

I will grant you that some sites, Wikipedia being one, are very heavy-handed with CSS and JavaScript. But it is worth noting that they have a separate server for those. The browser can parse the <HEAD> of the page and start downloading in parallel from that other server; while it may eat the end user's bandwidth, it does not add load to the HTML server. If graphics, CSS or JavaScript add a significant amount of traffic, this parallelism (quite different from the parallelism of it all being done by one Apache server on one host) can be put to very good use. IIRC Wikipedia and others publish statistics and analysis about this.
--
A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
Anton Aylward wrote:
On 06/08/2016 07:11 AM, Carlos E. R. wrote:
Some of the graphics there (png, jpeg) also seem to be large. One takes 0.6S to download or to serve.
OUCH and DOUBLE OUCH.
You said earlier that it was a server problem.
The 0.6 seconds to serve is not a problem - yet. It is easily reduced by a factor of 10 just by reducing the quality of the JPEG. It looks like 15'000 calls to lstat() are more likely the issue. I have not yet determined whether that is a reasonable number or not.
if a browser has to resize an image to fit the specified dimensions or otherwise fit the image in a bounding box, that takes time and computation on the part of the browser.
Yes, but that is not causing any problem. More importantly, it wouldn't be my problem anyway :-)
You can quite reasonably say this is a server error. If the page designer had resized the image to the required dimensions in the first place, it would be smaller, quicker to download, and would not need resizing by the browser.
How does the designer know beforehand which size will be required? The background image on the first page is 1500x1000, probably a reasonable compromise. -- Per Jessen, Zürich (17.6°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 06/08/2016 09:35 AM, Per Jessen wrote:
It looks like 15'000 calls to lstat() are more likely the issue. I have not yet determined whether that is a reasonable number or not.
Hmm. Not cheap. And why? Is this because of directories with a large number of files and obsessive step-and-repeat?

Lstat() or even stat() should not be expensive, except perhaps for a badly done copy of the data to user space. However, the real cost of a stat() call is that of reading the inode from disk. It may involve multiple hits because of path traversal. If *all* of those 15,000 calls *always* involve traversing from "/", that might make it worse.

The critical factors are caching of the inodes, obviously - and not just the terminal inodes, but the inodes along the traversal path. There's also the pathname caching, which should be able to short-cut some of that traversal and make better use of the inode caching.

But it gets back to "why"? Certainly a change in coding - a "cd('/$base_directory')" then lstat() of just "$filename", rather than an lstat() of "/$base_directory/$filename" - would be a better implementation. But, once again, if there is a directory with 15,000 files and it's doing a step-and-repeat to find one, then that's bad architecture. Take a look, for example, at how Postfix deals with a two-level management structure.
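The two call shapes being contrasted here can be sketched in a few lines of Python (illustrative only - the CMS is PHP, and the directory names below are made up). The absolute form hands the kernel the full path to resolve on every call; after a chdir(), an lstat() of just the basename starts resolution from the current directory. In practice the dentry cache makes both cheap, but the sketch shows the difference in what each call asks for:

```python
import os
import tempfile

# Build a hypothetical deep tree, similar in shape to /srv/www/vhosts/.../htdocs/site
root = tempfile.mkdtemp()
deep = os.path.join(root, "www", "vhosts", "srv000004b", "htdocs", "site")
os.makedirs(deep)
target = os.path.join(deep, "finished.php")
open(target, "w").close()

# Absolute form: every call gives the kernel the full path to re-resolve,
# component by component.
st_abs = os.lstat(target)

# Relative form: change directory once, then lstat() just the basename.
os.chdir(deep)
st_rel = os.lstat("finished.php")

# Either way we reach the same inode; only the resolution work differs.
assert st_abs.st_ino == st_rel.st_ino
```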
Anton Aylward wrote:
On 06/08/2016 09:35 AM, Per Jessen wrote:
It looks like 15'000 calls to lstat() are more likely the issue. I have not yet determined whether that is a reasonable number or not.
Hmm. Not cheap. And why? Is this because of directories with a large number of files and obsessive step-and-repeat?
It seems to be 405 unique files or directories, 285 of which belong to processwire (the CMS).

Otherwise:

/srv/www/vhosts/srv000004b/htdocs - 1461 calls
/srv/www/vhosts/srv000004b - 2076
/srv/www/vhosts - 2076
/srv/www - 2076
/srv - 2076

I don't really understand why anything above "/srv/www/vhosts/srv000004b/htdocs" should be interesting ....
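If each file lookup triggers an lstat() of the path and then of every parent directory (the pattern visible in the trace), the counts at /srv and everything below it should simply equal the total number of lookups, which would explain why every level above htdocs shows the same 2076. A quick sketch of that arithmetic (the three file paths are hypothetical examples, not taken from the real trace):

```python
from collections import Counter

def ancestors(path):
    """Yield the path and every parent directory up to the root,
    mimicking the per-component lstat() pattern seen in the strace output."""
    while path and path != "/":
        yield path
        path = path.rsplit("/", 1)[0] or "/"
    yield "/"

counts = Counter()
# Three hypothetical files touched while serving one request:
for f in ["/srv/www/vhosts/srv000004b/htdocs/site/finished.php",
          "/srv/www/vhosts/srv000004b/htdocs/index.php",
          "/srv/www/vhosts/srv000004b/htdocs/site/config.php"]:
    counts.update(ancestors(f))

print(counts["/srv"])                                # → 3, one per lookup
print(counts["/srv/www/vhosts/srv000004b/htdocs"])   # → 3
```

With ~2000 lookups per request, that mechanism alone would account for 2076 calls at each level above the DocRoot.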
But, once again, if there is a directory with 15,000 files and it's doing a step-and-repeat to find one, then that's bad architecture.
The total website contains 4561 files, everything included, also the CMS code. -- Per Jessen, Zürich (16.7°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On 06/08/2016 11:08 AM, Per Jessen wrote:
I don't really understand why anything above "/srv/www/vhosts/srv000004b/htdocs" should be interesting ....
This gives me the impression that some of the code, at least, is doing full traversal searches rather than a "cd()" and local "lstat()".
Anton Aylward wrote:
On 06/08/2016 11:08 AM, Per Jessen wrote:
I don't really understand why anything above "/srv/www/vhosts/srv000004b/htdocs" should be interesting ....
This gives me the impression that some of the code, at least, is doing full traversal searches rather than a "cd()" and local "lstat()".
It's pretty weird - here's a snippet with the filenames from the strace, in that order:

/srv/www/vhosts/srv000004b/htdocs/site/finished.php
/srv/www/vhosts/srv000004b/htdocs/site
/srv/www/vhosts/srv000004b/htdocs
/srv/www/vhosts/srv000004b
/srv/www/vhosts
/srv/www
/srv

-- Per Jessen, Zürich (16.4°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On 2016-06-08 16:08, Per Jessen wrote:
Anton Aylward wrote:
On 06/08/2016 09:35 AM, Per Jessen wrote:
It looks like 15'000 calls to lstat() are more likely the issue. I have not yet determined whether that is a reasonable number or not.
Hmm. Not cheap. And why? Is this because of directories with a large number of files and obsessive step-and-repeat?
It seems to be 405 unique files or directories, 285 that belong to processwire (the cms).
Otherwise:
/srv/www/vhosts/srv000004b/htdocs - 1461 calls
/srv/www/vhosts/srv000004b - 2076
/srv/www/vhosts - 2076
/srv/www - 2076
/srv - 2076
I don't really understand why anything above "/srv/www/vhosts/srv000004b/htdocs" should be interesting ....
But, once again, if there is a directory with 15,000 files and it's doing a step-and-repeat to find one, then that's bad architecture.
The total website contains 4561 files, all included, also the CMS code.
The first question is whether the inefficiency is in apache or the PHP code. Given that you say other sites using the same apache are much more speedy, it seems very likely that it's the PHP code. In which case, debugging the PHP should be way easier than trying to figure out what's happening from an strace log. Am I right in thinking that you still feel that you need to solve this, rather than the web programmer? If so, then posting the source of index.php or whatever page it is that you're looking at might help us find the problem.
On 06/08/2016 11:56 AM, Dave Howorth wrote:
The first question is whether the inefficiency is in apache or the PHP code. Given that you say other sites using the same apache are much more speedy, it seems very likely that it's the PHP code. In which case, debugging the PHP should be way easier than trying to figure out what's happening from an strace log.
+1
Am I right in thinking that you still feel that you need to solve this, rather than the web programmer? If so, then posting the source of index.php or whatever page it is that you're looking at might help us find the problem.
+1

Even those of us who are not PHP programmers can read 'structured code', and, as I said, if there is some inefficiency in structure, such as using the full path for lstat() every time, we'll see that.
Dave Howorth wrote:
On 2016-06-08 16:08, Per Jessen wrote:
Anton Aylward wrote:
On 06/08/2016 09:35 AM, Per Jessen wrote:
It looks like 15'000 calls to lstat() are more likely the issue. I have not yet determined whether that is a reasonable number or not.
Hmm. Not cheap. And why? Is this because of directories with a large number of files and obsessive step-and-repeat?
It seems to be 405 unique files or directories, 285 that belong to processwire (the cms).
Otherwise:
/srv/www/vhosts/srv000004b/htdocs - 1461 calls
/srv/www/vhosts/srv000004b - 2076
/srv/www/vhosts - 2076
/srv/www - 2076
/srv - 2076
I don't really understand why anything above "/srv/www/vhosts/srv000004b/htdocs" should be interesting ....
But, once again, if there is a directory with 15,000 files and it's doing a step-and-repeat to find one, then that's bad architecture.
The total website contains 4561 files, all included, also the CMS code.
The first question is whether the inefficiency is in apache or the PHP code.
Right - I am using mpm_itk, which adds a little bit of overhead, but it's worth it. The developer claims the site serves fine on his own PC (typical multi-core thingie), so another question I have been pondering is whether the hardware is suitably potent. I don't really see an issue, but as the filesystem is on our SAN connected with iSCSI, IO might not be quite as plentiful as on a stand-alone PC with SSD drives. Still, 15000 superfluous calls to lstat() ...
Given that you say other sites using the same apache are much more speedy, it seems very likely that it's the PHP code. In which case, debugging the PHP should be way easier than trying to figure out what's happening from an strace log.
Hmm, that is a possibility. I was even thinking of registering for the processwire forum and posting a question there.
Am I right in thinking that you still feel that you need to solve this, rather than the web programmer? If so, then posting the source of index.php or whatever page it is that you're looking at might help us find the problem.
My impression of the programmer is that he's more of a designer than a programmer. Yes, the ball is still very much with me - it's a matter of "the website is slow on your machine" vs. "it works very well on mine". -- Per Jessen, Zürich (16.5°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 06/08/2016 12:23 PM, Per Jessen wrote:
The developer claims the site serves fine on his own PC
yea, right! I had that years ago from an accounting system developer. Only my client loaded up a complete warehouse of inventory and about 500 transactions per day. When we marched down to visit the developer, he only had a couple of dozen items and tried it with one or two transactions.

He simply didn't understand 'scalability' and was locking every transaction. For much too long. Yes, the code was 'accounting practices perfect', but the system was unusable in a real-world situation. A massive recoding and finer-grained modularization (and hence locking) also speeded things up in other ways, such as cache reticence.
Anton Aylward wrote:
On 06/08/2016 12:23 PM, Per Jessen wrote:
The developer claims the site serves fine on his own PC
yea, right!
I had that years ago from an accounting system developer. Only my client loaded up a complete warehouse of inventory and about 500 transactions per day. When we marched down to visit the developer he only had a couple of dozen items and tried it with one or two transactions.
He simply didn't understand 'scalability' and was locking every transaction. For much too long. Yes the code was 'accounting practices perfect', but the system was unusable in a real world situation.
A massive recoding and finer grained modularization (and hence locking) also speeded up in other ways such as cache reticence.
Yes, anyone who has been around for long enough will easily recognise that situation. I would be very happy with a plain scalability issue like that :-) I cannot yet explain the - seemingly superfluous - 15000 stat() calls on directories above the DocRoot. -- Per Jessen, Zürich (16.5°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Op woensdag 8 juni 2016 18:23:33 CEST schreef Per Jessen:
The developer claims the site serves fine on his own PC (typical multi-core thingie)
And he has developed for a single-user website on an equivalent of his PC? If he's a real web developer he should know that his claim is useless.

This is about how I work:
- development environment locally, using just a small amount of the data
- test environment, using the full data collection; stress testing should be included (what if X users hit the URL and trigger query Y to run), needs to be on (preferably) an identical machine
- production environment, where only stuff lands that has proven to result in a properly working situation

Things I've seen where developers / designers claimed "it was running OK on their machine":
- MySQL databases with no indices (works fine locally with 2 customers, 10 products, 10 orders)
- huge queries that should only run for logged-in users, but placed outside the closed parts of the site
- badly written queries (with OK results .... easily done)
- calls to external webservices being left out

In my experience apache and nginx are hardly ever the cause of websites running slow. Too many high-res pictures can be one, but most modern CMSs have their own caches, where they keep resized images; some even reduce on the fly when uploading pics. Most of the issues I've seen in the performance area are related to queries. The number of attempts to "load everything we need in one query" .... sigh.
--
Gertjan Lettink, a.k.a. Knurpht
openSUSE Board Member
openSUSE Forums Team
On 2016-06-09 10:08, Knurpht - Gertjan Lettink wrote:
Op woensdag 8 juni 2016 18:23:33 CEST schreef Per Jessen:
The developer claims the site serves fine on his own PC (typical multi-core thingie)
And he has developed for a single-user website on an equivalent of his PC? If he's a real web developer he should know that his claim is useless.
This is about how I work: - development environment locally. Using just a small amount of the data - test environment, using the full data collection, stress testing should be included ( what if X users hit the URL and trigger query Y to run ), needs to be on (preferably) an identical machine. - production environment, where only stuff lands that has proven to result in a proper working situation.
Things I've seen where developers / designers claimed "it was running OK on their machine": - MySQL databases with no indices (works fine locally with 2 customers, 10 products, 10 orders) - huge queries that should only run for logged-in users, but placed outside the closed parts of the site - badly written queries (with OK results .... easily done) - calls to external webservices being left out
In my experience apache and nginx are hardly ever the cause of websites running slow. Too many high res pictures can be one, but most modern CMS's have their own caches, where they keep resized images, some even reduce on the fly when uploading pics. Most of the issues in the performance area I've seen are related to queries. The amount of attempts to "load everything we need in one query" .... sigh.
I agree with the things that you say, but as I understand Per's situation, they don't fit his symptoms. His problem on the production system apparently occurs with a single user making a single request - the first request. So the problem isn't load-related or complex-query-related.

Per, various suggestions:

I do think you need a test system (the customer should pay, really). I think asking a question on the processwire forum might well be useful.

I think it would be useful to understand the difference between the developer's system and the production/test system. Can you or he run exactly the same strace test on his machine and get the same behaviour? If not, what is different?

I'm not a PHP programmer and knew nothing about processwire (though I do know Perl). There seems to be a debugger for PHP called xdebug, and it seems to be possible to run processwire from the command line, which makes using a debugger much easier. So if it were me, I would run the same request that you used for the strace from the command line with the debugger and single-step it to find out where the lstats are happening.

HTH, Dave
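As a halfway house before reaching for xdebug: strace's -T option appends each syscall's duration, so a short script can show whether the 15'000 lstat() calls actually account for the 3-4 seconds, or whether the time is going elsewhere. A sketch, assuming -T style output (the log lines below are made-up examples, not from the real trace):

```python
import re
from collections import defaultdict

# Example lines in the shape produced by `strace -T` (duration in <...> at the end)
log = """\
lstat("/srv/www/vhosts", {st_mode=S_IFDIR|0755, ...}) = 0 <0.000022>
lstat("/srv/www", {st_mode=S_IFDIR|0755, ...}) = 0 <0.000019>
open("/srv/www/vhosts/srv000004b/htdocs/index.php", O_RDONLY) = 4 <0.000310>
"""

totals = defaultdict(lambda: [0, 0.0])  # syscall name -> [call count, total seconds]
for line in log.splitlines():
    m = re.match(r'(\w+)\(.*<([\d.]+)>$', line)
    if m:
        name, dur = m.group(1), float(m.group(2))
        totals[name][0] += 1
        totals[name][1] += dur

# Print syscalls sorted by total time spent, most expensive first
for name, (count, secs) in sorted(totals.items(), key=lambda kv: -kv[1][1]):
    print(f"{name}: {count} calls, {secs:.6f}s")
```

If the lstat total comes out in microseconds rather than seconds, the calls are ugly but not the cause, and the search moves on.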
Dave Howorth wrote:
I agree with the things that you say but as I understand Per's situation, they don't fit his symptoms.
His problem on the production system apparently occurs with a single user making a single request - the first request. So the problem isn't load-related or complex-query-related.
Precisely. It is also reproducible on subsequent requests, although they seem to process slightly faster.
Per, various suggestions:
I do think you need a test system (the customer should pay really).
Yes and yes. Assuming I can come up with the proof needed, I will be sending the customer a bill for my time. Well, some of it :-(
I think asking a question on the processwire forum might well be useful.
Yep. I did in fact register for the forum yesterday, but have yet to receive the confirmation.
I think it would be useful to understand the difference between the developer's system and the production/test system. Can you or he run exactly the same strace test on his machine and get the same behaviour? If not, what is different?
I expect he's been working on Windows or Mac, but I think it ought to be enough to copy the website to some similar hardware and then compare numbers.
I'm not a PHP programmer and knew nothing about processwire (though I do know Perl). There seems to be a debugger for PHP called xdebug, and it seems to be possible to run processwire from the command-line, which makes using a debugger much easier. So if it was me, I would try to run the same request that you used for the strace from the command line with the debugger and single step it to find out where the lstat's are happening.
Interesting idea about running processwire on the command line, thanks. -- Per Jessen, Zürich (16.8°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Knurpht - Gertjan Lettink wrote:
Op woensdag 8 juni 2016 18:23:33 CEST schreef Per Jessen:
The developer claims the site serves fine on his own PC (typical multi-core thingie)
And he has developped for a single user website on an equivalent of his PC ? If he's a real webdevelopper he should know that his claim is useless.
Well yes. The web-programmer/developer/designer world is full of cowboys and self-taught what-have-yous. It is a fact of life; it is just a pity that I end up spending hours and hours sorting things out afterwards.
Things I've seen where developers / designers claimed "it was running OK on their machine": - MySQL databases with no indices (works fine locally with 2 customers, 10 products, 10 orders) - huge queries that should only run for logged-in users, but placed outside the closed parts of the site - badly written queries (with OK results .... easily done) - calls to external webservices being left out
Yes, scalability and preparing for it can be an issue, but it isn't in this case. The site is largely static, but based on an-easy-to-use CMS for the customer to make minor changes himself (news, success stories and such).
In my experience apache and nginx are hardly ever the cause of websites running slow. Too many high res pictures can be one, but most modern CMS's have their own caches, where they keep resized images, some even reduce on the fly when uploading pics. Most of the issues in the performance area I've seen are related to queries. The amount of attempts to "load everything we need in one query" .... sigh.
Fortunately, that sort of thing is easy to diagnose and to fix. Now, what about my 15000 lstat() calls? :-) -- Per Jessen, Zürich (16.8°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
Dave Howorth wrote:
Am I right in thinking that you still feel that you need to solve this, rather than the web programmer? If so, then posting the source of index.php or whatever page it is that you're looking at might help us find the problem.
That one's easy, it's the CMS entrypoint, all open source. If you want, you can download the package from http://processwire.com/ http://files.jessen.ch/processwire-index-php.txt -- Per Jessen, Zürich (16.6°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On 06/08/2016 09:35 AM, Per Jessen wrote:
You can quite reasonably say this is a server error. If the page designer had resized the image to the required dimensions in the first place, it would be smaller, quicker to download, and would not need resizing by the browser.
How does the designer know beforehand which size will be required? The background image on the first page is 1500x1000, probably a reasonable compromise.
That might be OK for a banner; there are a few page templates that start with a large image at the top. But for a portable device that's outrageous!

The answer to "how" _might_ be that the HTML code has a conditional. It asks what the browser is (the browser itself knows), and the "IF" clause deals differently with, for example, Internet Explorer vs Firefox. You'll see this quite often, as IE handles some CSS differently. Fragments like this:

</script>
<!-- JS includes -->
<!--[if lt IE 9]>
<script src="scripts/html5shiv.js?4241844378" type="text/javascript"></script>
<![endif]-->
<!--custom head HTML-->
<script>

I don't know if that can be applied to "devices". However, there's also JavaScript. If you look at the raw feed from Wikipedia you'll see that there's capability for handling smaller screens, dropping the sidebar and graphics, for phones such as Android. It is rather rococo!

See also: http://www.instructables.com/id/Make-your-HTML-Website-suitable-for-Mobile-D...

Google for "fluid layout".
Anton Aylward wrote:
On 06/08/2016 09:35 AM, Per Jessen wrote:
You can quite reasonably say this is a server error. If the page designer had resized the image to the required dimensions in the first place, it would be smaller, quicker to download, and would not need resizing by the browser.
How does the designer know beforehand which size will be required? The background image on the first page is 1500x1000, probably a reasonable compromise.
That might be OK for a banner; there are a few page templates that start with a large image at the top.
But for a portable device that's outrageous!
The answer to "how" _might_ be that the HTML code has a conditional.
Too much load on the server, much better to leave it to the client to sort out.
It asks what the browser is, the browser itself knows, and the "IF" clause deals differently with, for example, Internet Explorer vs Firefox. You'll see this quite often a IE handles some CSS differently.
Sure, but how does it know which size images it needs to produce?
However there's also JavaScript.
Yes, I know you could feed back the screen dimensions with JS, but by now it would be much cheaper to just leave it to the client to sort out, which is what everybody does. For mobile devices, don't people usually/often write separate sites anyway? -- Per Jessen, Zürich (16.6°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On 06/08/2016 11:33 AM, Per Jessen wrote:
The answer to "how" _might_ be that the HTML code has a conditional.
Too much load on the server, much better to leave it to the client to sort out.
Yes, that's what I mean: the 'conditional' (either an "IF", or JavaScript, or a qualifier in the CSS about screen size) is done in the browser. The worst case might be that the browser requests a different CSS or JS from the server if it's a phone, tablet or PC. But if you look at most HTML, it's requesting one or more CSS files and one or more JS files, so selecting version A rather than version B isn't going to affect the load on the server. The issue isn't about load, it's about DESIGN, which is a matter of forethought.
On 06/08/2016 09:35 AM, Per Jessen wrote:
Yes, but that is not causing any problem. More importantly, it wouldn't be my problem anyway :-)
There's an adage I was taught back in my programming/debugging days that amounts to this: if there is sloppiness in one part of the implementation or design, it's indicative that there is sloppiness elsewhere as well, and it may be indicative of a poor[1] design as well as a poor implementation.

[1] Which could mean many things: fragile, inefficient, unmaintainable ...
On 06/08/2016 07:11 AM, Carlos E. R. wrote:
Some of the graphics there (png, jpeg) also seem to be large. One takes 0.6S to download or to serve.
Some people are under the mistaken impression that they must use .png for one of a number of reasons.

- Some think such files are "more portable"
- Some think they have to use .png because of patent/copyright on other formats

My experience is that .png files can be 10 to 10,000 times larger than a .jpeg or .gif. As far as 'copyright' goes, the issue may be more with the subject matter or content of the image than the format in which it is stored!
On 2016-06-08 13:52, Anton Aylward wrote:
On 06/08/2016 07:11 AM, Carlos E. R. wrote:
Some of the graphics there (png, jpeg) also seem to be large. One takes 0.6S to download or to serve.
Some people are under the mistaken impression that they must use .png for one of a number of reasons.
- Some think such files are "more portable" - Some think they have to use .png because of patent/copyright on other formats
When I capture a computer display with png and then convert to jpg with gimp (watching the quality-size balance) I observe that letters and other details become grainy. So I would create both, then decide which to use, one by one. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 06/08/2016 08:50 AM, Carlos E. R. wrote:
When I capture a computer display with png
If you mean using a screen-scraper tool like KSnapshot, then no wonder you get poor quality! If I've generated the image myself, I can recreate it without the text, convert, then reapply the text with gimp; or maybe the tool I used to create the image can export to a gif or jpeg. Many of my images are mindmaps and the tools I use can export to all those formats.
and then convert to jpg with gimp (watching the quality-size balance)
OUCH again! Gimp is not the best tool to do a png -> gif or jpeg conversion. Try 'convert' from the ImageMagick package.
I observe that letters and other details become grainy.
It's a shame that PNG doesn't define a 'text' chunk the way that HTML does, or the other XML-like standards such as SVG. NAPLPS, a sort-of precursor to HTML in some ways, was very text oriented and also had 'text' chunks. https://en.wikipedia.org/wiki/NAPLPS
So I would create both, then decide which to use, one by one.
On 2016-06-08 16:23, Anton Aylward wrote:
On 06/08/2016 08:50 AM, Carlos E. R. wrote:
When I capture a computer display with png
If you mean use a screen-scraper tool like KSnapshot then no wonder you get poor quality!
no no. A screen capture is by definition "maximum quality" always. It has a defined number of pixels, it is a memory image. There is no possible error. However, rendering it as a jpeg is always lossy, by definition. Some quality is lost, and you decide how much, in a compromise between quality and size.
and then convert to jpg with gimp (watching the quality-size balance)
OUCH again! Gimp is not the best tool to do a png -> gif or jpeg conversion
No, gimp is the perfect tool to convert to jpeg (I didn't say gif). Why? Because you can move the quality slider from 1 to 100 and watch the result dynamically, instantly, and decide how much quality you want to sacrifice. No other tool allows this. Because of this human decision procedure, the result is slow. But perfect.
Try 'convert' from the ImageMagick package.
Not if you want to make the informed quality decision.
I observe that letters and other details become grainy.
It's a shame that PNG doesn't define a 'text' chunk the way that HTML does, or the other XML-like standards such as SVG. NAPLPS, a sort-of precursor to HTML in some ways, was very text oriented and also had 'text' chunks. https://en.wikipedia.org/wiki/NAPLPS
Try DjVu, then. It renders letters precisely and with as much compression as jpeg. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Carlos E. R. wrote:
However, rendering it as a jpeg is always lossy, by definition. Some quality is lost, and you decide how much, in a compromise between quality and size.
I don't know, but an uncompressed jpeg might mean no loss? I mean, some digital cameras produce uncompressed JPEGs.
Why? Because you can move the quality slider from 1 to 100 and watch the result dynamically, instantly, and decide how much quality you want to sacrifice. No other tool allows this.
On screenshots for documentation, I invariably use 50-60, unless it's really complex.
Try 'convert' from the ImageMagick package.
Not if you want to make the informed quality decision.
Yeah, I can see that. No pun intended.
It's a shame that PNG doesn't define a 'text' chunk the way that HTML does, or the other XML-like standards such as SVG. NAPLPS, a sort-of precursor to HTML in some ways, was very text oriented and also had 'text' chunks. https://en.wikipedia.org/wiki/NAPLPS
IBM's DCF (specifically GML) is the direct-line ancestor. I worked quite a bit with GML & SCRIPT in the mid-to-late 80s; they had tags such as p, h1, h2, ol, li, ul, sl etc. When HTML turned up, it didn't take much to learn. HTML was written using SGML (ISO 8879:1986), which in turn was derived/expanded from GML, which dates back to the late 60s. -- Per Jessen, Zürich (16.7°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On 06/08/2016 01:24 PM, Carlos E. R. wrote:
If you mean use a screen-scraper tool like KSnapshot then no wonder you get poor quality! no no. A screen capture is by definition "maximum quality" always. It has a defined number of pixels, it is a memory image. There is no possible error.
I disagree. It may well be that the image you are scraping from the screen has a higher definition source. It may be that the original source is a 2650x1800 image that has been rendered down to fit 256x174. Why not wget the original? That's what I've always done in these cases.
On 2016-06-08 21:03, Anton Aylward wrote:
On 06/08/2016 01:24 PM, Carlos E. R. wrote:
If you mean use a screen-scraper tool like KSnapshot then no wonder you get poor quality! no no. A screen capture is by definition "maximum quality" always. It has a defined number of pixels, it is a memory image. There is no possible error.
I disagree. It may well be that the image you are scraping from the screen has a higher definition source. It may be that the original source is a 2650x1800 image that has been rendered down to fit 256x174.
Why not wget the original?
Then it is not a screenshot. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 06/08/2016 04:49 PM, Carlos E. R. wrote:
On 2016-06-08 21:03, Anton Aylward wrote:
On 06/08/2016 01:24 PM, Carlos E. R. wrote:
If you mean use a screen-scraper tool like KSnapshot then no wonder you get poor quality! no no. A screen capture is by definition "maximum quality" always. It has a defined number of pixels, it is a memory image. There is no possible error.
I disagree. It may well be that the image you are scraping from the screen has a higher definition source. It may be that the original source is a 2650x1800 image that has been rendered down to fit 256x174.
Why not wget the original?
Then it is not a screenshot.
Why is a screenshot so important? To get on the screen it must have come from somewhere. Why not make use of that original somewhere? If it's the format you're concerned with, then we get back to the issue of converting. If it's a high resolution SVG then there isn't a problem storing the original and converting to a pixel format. If it's a low resolution image that's been zoomed, then you've lost precision when one pixel gets scaled up to 2..4...8. But that's beside the point. If you had wget the small original you can still scale it up using convert or gimp. What is this obsession with the screen? Like I say, to get on the screen it must have come from somewhere.
Anton Aylward wrote:
On 06/08/2016 04:49 PM, Carlos E. R. wrote:
On 2016-06-08 21:03, Anton Aylward wrote:
On 06/08/2016 01:24 PM, Carlos E. R. wrote:
If you mean use a screen-scraper tool like KSnapshot then no wonder you get poor quality! no no. A screen capture is by definition "maximum quality" always. It has a defined number of pixels, it is a memory image. There is no possible error.
I disagree. It may well be that the image you are scraping from the screen has a higher definition source. It may be that the original source is a 2650x1800 image that has been rendered down to fit 256x174.
Why not wget the original?
Then it is not a screenshot.
Why is a screenshot so important? To get on the screen it must have come from somewhere. Why not make use of that original somewhere?
It depends on what you need it for. If you just like the background image and want to use it on your desktop, copying the original (Ctrl-I -> Media -> Save as) is the better option, but if you want to document what it looked like on your screen, well .... -- Per Jessen, Zürich (15.1°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 2016-06-09 08:05, Per Jessen wrote:
Anton Aylward wrote:
On 06/08/2016 04:49 PM, Carlos E. R. wrote:
On 2016-06-08 21:03, Anton Aylward wrote:
On 06/08/2016 01:24 PM, Carlos E. R. wrote:
If you mean use a screen-scraper tool like KSnapshot then no wonder you get poor quality! no no. A screen capture is by definition "maximum quality" always. It has a defined number of pixels, it is a memory image. There is no possible error.
I disagree. It may well be that the image you are scraping from the screen has a higher definition source. It may be that the original source is a 2650x1800 image that has been rendered down to fit 256x174.
Why not wget the original?
Then it is not a screenshot.
Why is a screenshot so important? To get on the screen it must have come from somewhere. Why not make use of that original somewhere?
It depends on what you need it for. If you just like the background image and want to use it on your desktop, copying the original (Ctrl-I -> Media -> Save as) is the better option, but if you want to document what it looked like on your screen, well ....
Notice that for me a screenshot is a screen capture: something for which the original is the computer display, for whatever reason (think of a page explaining how to use certain software). As such, it has a fixed number of pixels and colours. It cannot be improved, it is what it is (unless you switch the computer video hardware or choose another possible resolution).

Then, given a picture from whatever source, it can be converted to jpeg. Jpeg is a lossy format: it compresses at the cost of worse quality. There is a compromise; you can improve the quality or improve the size, not both. Thus you can do several conversions at different quality factors (from 1 to 100), compare the results visually, and choose the one you like.

The compression itself can be done with whatever software, but gimp has a distinct advantage here: it has a slider to choose the quality, and as you move it the program shows in a preview exactly how that picture will display at the quality you have just selected with the mouse. You move the slider to 99 and see it perfect, and it also tells you what size you get. Too much, so you decrease to 90. Very good, but I want smaller. Move the slider, watch the image and the size counter change the same instant, till you hit the sweet spot of size and quality, then save the picture at that setting. If you still don't understand what I mean, try it. You have to click on the preview to display the target size for this to work; by default it is disabled. No other application does this with jpegs.

(I hate that cameras compress to jpeg. It should be png, or at least a choice. No, I do not know if Q=100 means no losses.)

Try to compress a screenshot with text to jpeg. As you move the slider to smaller sizes you will see the fonts degrade a lot, get grainy, undefined. Often it is not worth it and you have to select png or gif. For the same size as jpeg, you can get much better results with the DjVu format.
It is ideal for scanned text (which is why it is used by libraries). Unfortunately, few programs support it. Not gimp. Nor convert. If you are interested, I can explain more. There is a web site with comparisons. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
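Without gimp's live preview, one can approximate the same decision procedure on the command line. This is a rough sketch only, assuming ImageMagick's `convert` is installed; the file names are invented, and the source image is generated here just as a stand-in:

```shell
# Batch-encode one image at several JPEG quality factors, then compare
# sizes (and view each candidate) before choosing -- a batch version of
# the "move the slider and watch" procedure described above.
convert -size 320x120 gradient:white-black shot.png   # stand-in source image
for q in 40 60 75 90 99; do
    convert shot.png -quality "$q" "shot-q${q}.jpg"
done
ls -l shot-q*.jpg   # lower quality factor, smaller file
```

The loop trades gimp's instant feedback for a set of candidates you can open side by side in any viewer.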
On 06/09/2016 05:43 AM, Carlos E. R. wrote:
It depends on what you need it for. If you just like the background image and want to use it on your desktop, copying the original (Ctrl-I -> Media -> Save as) is the better option, but if you want to document what it looked like on your screen, well ....
Notice that for me a screenshot is a screencapture, it is something for which the original is the computer display, for whatever reason (think of a page explaining how to use certain software). And as such, it has a fixed number of pixels and colours. It can not be improved, it is what it is (unless you switch the computer video hardware or choose another possible resolution).
Yes; I said in an earlier email that I can see a screenshot of the complete screen with menubars and framing. Certainly if you are documenting a product or a bug this makes sense. But if I'm capturing an image, say a witty cat picture or something amusing, it's the image I'm interested in capturing, and it may well be that the page it's displayed on has down-rendered it, losing quality from the original. I go to the original, not the screen image.
Then. Given a given picture, whatever source, it can be converted to jpeg. Jpeg is a lossy format. It compresses at the cost of worse quality. There is a compromise, you can improve the quality, or improve the size, not both.
Yes, and if the picture you start with is the original rather than the screen capture, you may well have a higher resolution, higher quality object to start with, which gives you more options to choose from when you start playing around with it.

I have the same option with my camera. I choose to shoot in RAW format. If I shoot in jpeg mode the camera is already degrading the raw information, probably to one of a number of "scene" settings. If I have the RAW then I can, in post-processing with Darktable (or Photoshop if I were on Windows), convert that to any or all of the 'scenes', crop, and otherwise manipulate the image in many ways the camera is not able to. And still have the high quality original. The RAW original has information that the derivatives do not.

What's more, I'm not stuck with jpeg. I can produce a 16-bit TIFF or a 16-bit JPEG2000 from that RAW. More information than in the jpeg the camera produced. And I can do that in any one of a number of color spaces. Yes, Darktable _could_ start with a jpeg or gif, but it cannot recreate the information that was lost when the image was downgraded.

My nominal 16M P&S camera produces a 4608 x 3456 pixel sized RAW that contains the 12 bits for each of the 3 colours plus phase information. Some cameras do 14 or 16 bits, some have even larger sensors, 20M or 24M. Not only that, my RAW image contains a lot of metadata; not just the exposure settings, whether flash was used, shutter/aperture, date and time, but also the geolocation - my camera has GPS. In fact it has SmartGPS and can also tell me the name of the location and often what I was photographing (e.g. "Stonehenge" or "Parthenon"). If it can't, then Darktable can make use of Google Maps or FreeMaps for the same, based on the Lat/Long. http://photographyconcentrate.com/10-reasons-why-you-should-be-shooting-raw/
Thus you can do several conversions at different quality factors (from 1 to 100) and compare visually the results, and choose which one you like.
Yes, and with the example above of shooting RAW and using Darktable I have all that and much more that I can do ... ... BECAUSE I'M STARTING WITH MORE INFORMATION! Which is the whole point I'm trying to make here. It's one thing to have screenshots to document a program, but if you're saving images from the web as images, then go for quality.
If you still don't understand what I mean, try it.
I do understand. I've been using GIMP and other image processing software for a very long time.
No other application does this with jpegs.
That is not the case. There are a number of photo-processing tools that can.
(I hate that cameras compress to jpeg. It should be png. At least a choice. No, I do not know if Q=100 means no losses)
Indeed, and it would be nice if we could flash the software of older cameras to do this. And while we're about it, let's have them all use the standardized RAW format from Adobe - DNG. .... and then we wake up.
Try to compress a screenshot with text to jpeg. As you move the slider to smaller size you will see the fonts to degrade a lot, get grainy, undefined. Often it is not worth it and you have to select png or gif.
The issue here is loss of information. The 'text' isn't text, it's just an image, and loses resolution just like everything else. This is not HTML, NAPLPS or SVG, where the text is a text object distinct from the image objects or bitmap objects. As long as it's a bitmap, it's going to degrade in resolution whatever you do, because it's just bits, and it's your brain that's seeing it as text, just as your brain sees that the image is a cat and not a house. The thing is that a degraded image of a cat (or a house) still looks, to your brain, like a cat (or a house). When the text looks like it was carved in stone and then subjected to 4,000 years of being exposed to the weather in the Orkneys, then you'll have a hard time reading it. The issue here, and the whole point of what I'm going on about, is ENTROPY. Loss of information in the process you're describing.
On 06/09/2016 02:05 AM, Per Jessen wrote:
Why is a screenshot so important? To get on the screen it must have come from somewhere. Why no make use of that original somewhere?
It depends on what you need it for. If you just like the background image and want to use it on your desktop, copying the original (Ctrl-I -> Media -> Save as) is the better option, but if you want to document what it looked like on your screen, well ....
That still makes no sense to me. If "what it looked like on your screen" means all the framing & menu and stuff from the desktop manager or application, then yes, that's what it looks like on your screen: ksnapshot "full screen" as opposed to ksnapshot "rectangular region". If you're documenting a !FAILURE! then you'll want to show the context as well. But if all you're saving is the image, sorry, no, you're still not making sense. What it looked like on the screen is an artefact of the display tool, and as long as you use the same display tool and the same source you should get the same result, unless you fiddle around with your monitor settings or colour scheme in between times. I can well imagine a situation where I *want* to examine how differently something gets displayed in different tools or with different colour scheme settings. Just the opposite of what you seem to be saying.
On 06/08/2016 01:24 PM, Carlos E. R. wrote:
On 2016-06-08 16:23, Anton Aylward wrote:
On 06/08/2016 08:50 AM, Carlos E. R. wrote:
No, gimp is the perfect tool to convert to jpeg (I didn't say gif).
Why? Because you can move the quality slider from 1 to 100 and watch the result dynamically, instantly, and decide how much quality you want to sacrifice. No other tool allows this.
I think that is wrong-headed. The 'master copy' (or 'reference copy' if you like) of an image should be the best you can get. What gets sent out, what gets displayed, is a matter of choice. Which gets back to, as I said, grabbing the source rather than screen-scraping. Certainly the CMS tools I use let me upload the highest quality I can, then use ImageMagick to produce a thumbnail for the pick list, and optionally let you pre-scale/compress/convert the source to the size/format needed.
Try 'convert' from the ImageMagick package.
Not if you want to make the informed quality decision.
'convert' has a "quality value" option to set how much compression to use ... or not. Along with many other options to flatten, enhance, and generally do all manner of pixel-level manipulation. RTFM. It also lets you overlay text, positioned where you want, in the font you want, with the letter and word spacing and kerning you want. All eminently scriptable by us poor benighted shell programmers :-)
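For instance, a sketch of the two features mentioned above, assuming ImageMagick is installed; the file names and caption here are invented for illustration:

```shell
# An explicit quality setting, as opposed to convert's default:
convert -size 320x120 gradient:white-black input.png   # throwaway source image
convert input.png -quality 75 output.jpg

# Overlaying text at a chosen position (this step needs a usable font,
# so it is allowed to fail quietly on a minimal system):
convert input.png -pointsize 18 -fill black \
        -annotate +10+30 'example caption' annotated.png || true
```

Both steps drop straight into a script, which is the point being made: no interactive slider, but fully repeatable.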
Anton Aylward wrote:
On 06/08/2016 07:11 AM, Carlos E. R. wrote:
Some of the graphics there (png, jpeg) also seem to be large. One takes 0.6S to download or to serve.
Some people are under the mistaken impression that they must use .png for one of a number of reasons.
- Some think such files are "more portable" - Some think they have to use .png because of patent/copyright on other formats
My experience is that .png files can be 10 to 10,000 times larger than a .jpeg or .gif.
As far as 'copyright' goes, the issue may be more with the subject matter or content of the image than the format in which it is stored!
For photos, backgrounds and such, use jpeg - when the individual pixels matter, use png. png has largely replaced gif due to a Unisys compression patent issue. LZW or some such. -- Per Jessen, Zürich (17.8°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
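That rule of thumb is easy to check by hand. A sketch, assuming ImageMagick is installed, using two generated stand-in images rather than real photos:

```shell
# A smooth, photo-like gradient versus a sharp-edged checkerboard
# (the checkerboard stands in for text/line art, where pixels matter).
convert -size 400x200 gradient:white-black photo.png
convert -size 400x200 pattern:checkerboard sharp.png
for f in photo sharp; do
    convert "$f.png" -quality 75 "$f.jpg"
done
# Compare the byte sizes; sharp edges tend to favour png,
# smooth tonal gradients tend to favour jpeg.
stat -c '%s %n' photo.png photo.jpg sharp.png sharp.jpg
```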
Carlos E. R. wrote:
On 2016-06-08 15:26, Per Jessen wrote:
png has largely replaced gif due to a Unisys compression patent issue. LZW or some such.
Didn't that patent expire already?
Yes it did, but I don't see people reverting to gif. -- Per Jessen, Zürich (16.7°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On 06/08/2016 01:32 PM, Per Jessen wrote:
Didn't that patent expire already? Yes it did, but I don't see people reverting to gif.
It depends what you mean by "reverting". I've just done a peek under the hood at a search on eBay. All the structural stuff, the parts that are done by eBay, are .png. I think there are a few jpgs there. The overwhelming majority of the vendor images were gif. If by 'reverting' you mean the design & server and files are already in place, then why change? What is the motive? I'll grant you that. But there's no shortage of people using gif files in many contexts for 'new stuff'.
Anton Aylward wrote:
On 06/08/2016 01:32 PM, Per Jessen wrote:
Didn't that patent expire already? Yes it did, but I don't see people reverting to gif.
It depends what you mean by "reverting".
Abandoning png in favour of gif.
I've just done a peek under the hood at a search on eBay. All the structural stuff, the parts that are done by eBay, are .png. I think there are a few jpgs there.
The overwhelming majority of the vendor images were gif.
eBay may well have bought a gif license back when it was a problem. I'm not sure if such a large commercial site is a representative example. Just thinking out loud.
If by 'reverting' you mean the design & server and files are already in place, then why change? What is the motive? I'll grant you that.
Right.
But there's no shortage of people using gif files in many contexts for 'new stuff'.
I don't think so, but I have no hard data to back it up with, just like you :-) -- Per Jessen, Zürich (15.5°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On 06/08/2016 01:25 PM, Carlos E. R. wrote:
On 2016-06-08 15:26, Per Jessen wrote:
png has largely replaced gif due to a Unisys compression patent issue. LZW or some such.
Didn't that patent expire already?
As I understand it ...
* In the USA the Unisys patent expired on 20 June 2003
* In Europe it expired on 18 June 2004
* In Japan it expired on 20 June 2004
* In Canada it expired on 7 July 2004
Last of all, the U.S. IBM patent expired 11 August 2006.
participants (7)
- Anton Aylward
- Carlos E. R.
- Darin Perusich
- Dave Howorth
- Dave Howorth
- Knurpht - Gertjan Lettink
- Per Jessen