Hello community,
here is the log from the commit of package collectl for openSUSE:Factory checked in at 2016-06-02 09:39:02
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/collectl (Old)
and /work/SRC/openSUSE:Factory/.collectl.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "collectl"
Changes:
--------
--- /work/SRC/openSUSE:Factory/collectl/collectl.changes 2015-06-01 09:53:54.000000000 +0200
+++ /work/SRC/openSUSE:Factory/.collectl.new/collectl.changes 2016-06-02 09:39:04.000000000 +0200
@@ -1,0 +2,29 @@
+Wed May 4 11:51:38 UTC 2016 - tabraham@suse.com
+
+- Update to 4.0.4
+ + if you try to playback a file with --stats and it has recorded
+ processes or slabs, ignore them by removing from $subsys
+ + playback of process data with -P was not skipping first interval and so
+ stats for first entry were not rates but rather raw numbers
+ + change 'yikes' message to something more meaningful
+ + fixed problem with -sZ -P printing all 0s for thread count
+ + added /usr/lib/systemd/system/collectl.service, per sourceforge help
+ discussion on 2015-12-28
+ + added disk read/write wait timing for disk detail in terminal, plot
+ and lexpr format
+ + new switch dskremap allows one to change disk names on the fly because
+ in some cases such as etherd disks, the names are messy for use with
+ other tools like ganglia
+ + removed access to disk name remapping file
+ + the rawdskfilt has been enhanced to allow a preceding + which will
+ cause the following string to be appended to the default filter
+
+- Changes from 4.0.3
+ + add AnonHuge memory to memory stats, both verbose and detailed as
+ well as lexpr
+ + if lexpr called with --import, throw an error
+ + tighten divide-by-zero test for -sM because it looks like in some cases
+ when misses >0 we're getting occasional errors. could hits be somehow
+ negative?
+
+-------------------------------------------------------------------
Old:
----
collectl-4.0.2.src.tar.gz
New:
----
collectl-4.0.4.src.tar.gz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ collectl.spec ++++++
--- /var/tmp/diff_new_pack.YHcrUu/_old 2016-06-02 09:39:05.000000000 +0200
+++ /var/tmp/diff_new_pack.YHcrUu/_new 2016-06-02 09:39:05.000000000 +0200
@@ -1,7 +1,7 @@
#
# spec file for package collectl
#
-# Copyright (c) 2015 SUSE LINUX GmbH, Nuernberg, Germany.
+# Copyright (c) 2016 SUSE LINUX GmbH, Nuernberg, Germany.
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
@@ -17,7 +17,7 @@
Name: collectl
-Version: 4.0.2
+Version: 4.0.4
Release: 0
Summary: Collects data that describes the current system status
License: Artistic-1.0 and GPL-2.0+
++++++ collectl-4.0.2.src.tar.gz -> collectl-4.0.4.src.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/INSTALL new/collectl-4.0.4/INSTALL
--- old/collectl-4.0.2/INSTALL 2015-05-27 15:02:55.000000000 +0200
+++ new/collectl-4.0.4/INSTALL 2016-01-29 15:38:21.000000000 +0100
@@ -6,6 +6,7 @@
DOCDIR=$DESTDIR/usr/share/doc/collectl
SHRDIR=$DESTDIR/usr/share/collectl
MANDIR=$DESTDIR/usr/share/man/man1
+SYSDDIR=$DESTDIR/usr/lib/systemd/system
ETCDIR=$DESTDIR/etc
INITDIR=$ETCDIR/init.d
@@ -56,6 +57,11 @@
/bin/rm -f $ETCDIR/rc.d/rc*.d/*collectl
fi
+# only if systemd is supported
+if [ -d $SYSDDIR ]; then
+ cp service/collectl.service $SYSDDIR
+fi
+
# Try and decide which distro this is based on distro specific files.
distro=1
if [ -f /sbin/yast ]; then
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/RELEASE-collectl new/collectl-4.0.4/RELEASE-collectl
--- old/collectl-4.0.2/RELEASE-collectl 2015-05-27 15:02:55.000000000 +0200
+++ new/collectl-4.0.4/RELEASE-collectl 2016-01-29 15:38:21.000000000 +0100
@@ -27,6 +27,31 @@
COLLECTL CHANGES
+4.0.4 Jan 29, 2016
+ - if you try to playback a file with --stats and it has recorded
+ processes or slabs, ignore them by removing from $subsys [thanks ghassen]
+ - playback of process data with -P was not skipping first interval and so
+ stats for first entry were not rates but rather raw numbers [thanks philippe]
+ - change 'yikes' message to something more meaningful [thanks rob and laurence]
+ - fixed problem with -sZ -P printing all 0s for thread count [thanks philippe]
+ - added /usr/lib/systemd/system/collectl.service, per sourceforge help
+ discussion on 2015-12-28 [thanks george]
+ - added disk read/write wait timing for disk detail in terminal, plot
+ and lexpr format [thanks bud]
+ - new switch dskremap allows one to change disk names on the fly because
+ in some cases such as etherd disks, the names are messy for use with
+ other tools like ganglia [thanks gabriel]
+ - removed access to disk name remapping file
+ - the rawdskfilt has been enhanced to allow a preceding + which will
+ cause the following string to be appended to the default filter
+
+4.0.3 July 2, 2015
+ - add AnonHuge memory to memory stats, both verbose and detailed as
+ well as lexpr [thanks, fred]
+ - if lexpr called with --import, throw an error
+ - tighten divide-by-zero test for -sM because it looks like in some cases when misses >0
+ we're getting occasional errors. could hits be somehow negative? [thanks Robert]
+
4.0.2 May 27, 2015
- add /bin/bash to list of 'known shells' excluded from output with
--procopt k
@@ -35,7 +60,7 @@
- collect nr_shmem so we can track shared memory, apparently something
I thought of but never acted on [thanks Christian]
- do not include guest cpu metrics in totals since already accounted
- for in user time
+ for in user time [thanks Philippe]
4.0.1
- change /usr/sbin to /usr/bin in init.d/collectl [thanks Ladislav]
@@ -70,9 +95,15 @@
COLMUX CHANGES
-4.9.0 ???
+4.9.0 Jan 06, 2016
- header name printing in single line mode not quite right for all
combinations of switches
+ - not trapping 'collectl not installed' errors and just returning
+ the node isn't reachable
+ - new switch -timerange will report warnings for any nodes found to
+ differ from others by more than this number of seconds
+ - added COMMUNICATIONS PROBLEMS section to man page and dropped
+ section describing what changed in Version 3
4.8.3 Mar 9, 2015
- -oT -test wasn't including time column in help output whereas -od and -oD
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/UNINSTALL new/collectl-4.0.4/UNINSTALL
--- old/collectl-4.0.2/UNINSTALL 2015-05-27 15:02:55.000000000 +0200
+++ new/collectl-4.0.4/UNINSTALL 2016-01-29 15:38:21.000000000 +0100
@@ -29,6 +29,7 @@
DOCDIR=$DESTDIR/usr/share/doc/collectl
SHRDIR=$DESTDIR/usr/share/collectl
MANDIR=$DESTDIR/usr/share/man/man1
+SYSDDIR=$DESTDIR/usr/lib/systemd/system
ETCDIR=$DESTDIR/etc
INITDIR=$ETCDIR/init.d
@@ -36,6 +37,7 @@
rm -f $ETCDIR/collectl.conf
rm -f $INITDIR/collectl
rm -f $MANDIR/collectl*
+rm -f $SYSDDIR/collectl.service # may not be there...
rm -fr $DOCDIR
rm -fr $SHRDIR
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/collectl new/collectl-4.0.4/collectl
--- old/collectl-4.0.2/collectl 2015-05-27 15:02:55.000000000 +0200
+++ new/collectl-4.0.4/collectl 2016-01-29 15:38:21.000000000 +0100
@@ -1,6 +1,6 @@
#!/usr/bin/perl -w
-# Copyright 2003-2013Hewlett-Packard Development Company, L.P.
+# Copyright 2003-2016 Hewlett-Packard Development Company, L.P.
#
# collectl may be copied only under the terms of either the Artistic License
# or the GNU General Public License, which may be found in the source kit
@@ -94,7 +94,7 @@
$Resize=$IpmiCache=$IpmiTypes=$ipmiExec='';
$i1DataFlag=$i2DataFlag=$i3DataFlag=0;
$lastSecs=$interval2Print=0;
-$diskRemapFlag=$diskChangeFlag=$cpuDisabledFlag=$cpusDisabled=$cpusEnabled=$noCpusFlag=0;
+$diskChangeFlag=$cpuDisabledFlag=$cpusDisabled=$cpusEnabled=$noCpusFlag=0;
$boottime=0;
# only used once here, but set in formatit.ph
@@ -111,8 +111,8 @@
$rootFlag=(!$PcFlag && `whoami`=~/root/) ? 1 : 0;
$SrcArch= $Config{"archname"};
-$Version= '4.0.2-1';
-$Copyright='Copyright 2003-2015 Hewlett-Packard Development Company, L.P.';
+$Version= '4.0.4-1';
+$Copyright='Copyright 2003-2016 Hewlett-Packard Development Company, L.P.';
$License= "collectl may be copied only under the terms of either the Artistic License\n";
$License.= "or the GNU General Public License, which may be found in the source kit";
@@ -289,7 +289,6 @@
# in the same directory as formatit.ph
print "BinDir: $BinDir ReqDir: $ReqDir\n" if $debug & 1;
require "$ReqDir/formatit.ph";
-$diskRemapFlag=(eval {require "$ReqDir/diskremap.ph" or die}) ? 1 : 0;
$formatitLoaded=1;
# finally try to load these two, both of which are optional
@@ -353,7 +352,7 @@
$comment=$runas='';
$rawDskFilter=$rawDskIgnore=$rawNetFilter=$rawNetIgnore='';
$tcpFiltDefault='ituc';
-
+$dskRemap='';
my ($extract,$extractMode)=('',0);
# Since --top has optionals arguments, we need to see if it was specified without
@@ -444,6 +443,7 @@
'cpufilt=s' => \$cpuFilt,
'dskfilt=s' => \$dskFilt,
'dskopts=s' => \$dskOpts,
+ 'dskremap=s' => \$dskRemap,
'export=s' => \$export,
'from=s' => \$from,
'full!' => \$fullFlag,
@@ -1029,14 +1029,34 @@
# purely an ease of use thing to allow people to use x-y as a cpu range
$cpuFilt=~s/-/../g;
-# Both kinds of DISK and NETWORK filtering
# Remember, this filter overrides the one in collectl.conf
if ($rawDskFilter ne '')
{
- $DiskFilter=$rawDskFilter;
+ # if filter starts with '+', just add to existing string
+ if ($rawDskFilter=~/\+/)
+ {
+ error("+ in rawdskfilt must be the first char")
+ if $rawDskFilter!~/^\+/;
+
+ $rawDskFilter=~s/^\+//;
+ $DiskFilter.="|$rawDskFilter";
+ }
+ else
+ {
+ $DiskFilter=$rawDskFilter;
+ }
$DiskFilterFlag=1;
}
+undef %diskRemap;
+$diskRemap{'cciss\/'}='';
+foreach my $remap (split(/,/, $dskRemap))
+{
+ my ($pat, $sub) = split(/:/, $remap);
+ logmsg('F', "--dskremap string, $remap, missing ':'") if !defined($sub);
+ $diskRemap{$pat}=$sub;
+}
+
# cpu filters are a little differnt because we're detailing with
# numbers and can't use pattern matching without some pain so
# to make it easy just add those to keep or ignore to an array
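The leading '+' handling for --rawdskfilt in the hunk above can be sketched in isolation. This is a minimal illustration, not collectl code: merge_disk_filter is a hypothetical helper, and the default filter string 'sd|hd' is made up to stand in for whatever DiskFilter collectl.conf supplies.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the hunk: returns the effective disk filter.
# $default stands in for the DiskFilter value read from collectl.conf.
sub merge_disk_filter {
    my ($default, $raw) = @_;
    return $default if $raw eq '';          # no --rawdskfilt given

    if ($raw =~ /\+/) {
        # a '+' is only accepted as the very first character
        die "+ in rawdskfilt must be the first char\n" if $raw !~ /^\+/;
        (my $extra = $raw) =~ s/^\+//;      # strip the marker
        return "$default|$extra";           # append to the default filter
    }
    return $raw;                            # otherwise replace it outright
}

print merge_disk_filter('sd|hd', '+etherd'), "\n";   # appended to default
print merge_disk_filter('sd|hd', 'nvme'), "\n";      # default replaced
```

One consequence visible in the hunk as written: because the test is for a '+' anywhere in the string, a filter that uses '+' as a regex quantifier (for example 'sd[a-z]+') is reported as an error rather than accepted.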
@@ -1763,6 +1783,12 @@
$subsys='' if $subsys eq ' ';
print "recSubsys: $recSubsys subsys: $subsys tempSys: $tempSys\n" if $debug & 1;
+ if ($statsFlag && $recSubsys=~/[YZ]/ && $subsys=~/[YZ]/)
+ {
+ print "--stats does not apply to slabs/process and so ignoring those subsystems\n";
+ $subsys=~s/[YZ]//g;
+ }
+
# When processing a batch of files, it's possible none of them have any of the selected subsystems,
# the best example being playing back *.gz files which have been collected with --tworaw and only
# requestion data in one typw. In those cases both files will be processed and we need to skip
@@ -6735,6 +6761,19 @@
my $whatsnew=<\$address,
"age=i" =>\$age,
@@ -191,6 +194,7 @@
"retaddr=s" =>\$retaddr,
"reverse!" =>\$revFlag,
"sshkey=s" =>\$sshkey,
+ "timerange=i" =>\$timerange,
"sudo!" =>\$sudoFlag,
"test!" =>\$testFlag,
"timeout=i" =>\$timeout,
@@ -423,6 +427,29 @@
Time::HiRes::usleep(100000);
}
+ # make sure dates within --timerange secs
+ my $minSecs=9999999999;
+ my $maxSecs=0;
+ for (my $i=0; $i<@hostnames; $i++)
+ {
+ my $year=substr($dates[$i], 0, 4);
+ my $mon= substr($dates[$i], 4, 2);
+ my $day= substr($dates[$i], 6, 2);
+ my $hour=substr($dates[$i], 8, 2);
+ my $mins=substr($dates[$i], 10, 2);
+ my $secs=substr($dates[$i], 12, 2);
+ my $seconds=(timelocal($secs, $mins, $hour, $day, $mon-1, $year-1900));
+ $minSecs=$seconds if $seconds<$minSecs;
+ $maxSecs=$seconds if $seconds>$maxSecs;
+ print "$hostnames[$i]: $dates[$i]" if $debug & 1024;
+ if ($maxSecs-$minSecs>$timerange)
+ {
+ my $plural=($timerange>1) ? 's' : '';
+ print "WARNING: $hostnames[$i]'s time differs by more than $timerange second$plural with at least one other\n";
+ print " run again with -debug 1024 and/or use -timerange or -quiet to suppress this message\n";
+ }
+ }
+
# Finally go back through hosts list in reverse order so we don't shift things
# on top of each other, removing any that report unsuitability for use
my $killSsh=0;
@@ -449,6 +476,7 @@
$reason='could not resolve name' if $threadFailure[$i]==16;
$reason='timed out during banner exchange' if $threadFailure[$i]==32;
$reason='collectl version < 3.5' if $threadFailure[$i]==64;
+ $reason='collectl not installed' if $threadFailure[$i]==128;
printf "$hostnames[$i] removed from list: $reason\n";
$printedReturnFlag=1;
@@ -657,12 +685,10 @@
{
exit if !reformatHeaders();
- print "LASTHEADER: $lastHeader\n";
foreach my $col (split(/\s+/, $lastHeader))
{
# strip detail field names including surrounding []s
$col=~s/\[.*\]// if $colnoinstFlag;
- print "PUSH: $col\n";
push @headernames, $col;
}
}
@@ -1181,7 +1207,7 @@
my $uname= (!defined($usernames{$hostnames[$i]})) ? '' : "$usernames{$hostnames[$i]}\@";
my $command="$Ssh $switch $uname$hostnames[$i] $Collectl -v 2>&1\n";
$command=~s/ -q//; # remove 'quiet' switch so we see 'connection refused'
- print "Command: $command\n" if $debug & 512;
+ print "Command: $command" if $debug & 512;
my $collectl=`$command`;
# note if motd printed, 'collectl' may not start on the 1st line, so need /m on regx
@@ -1209,6 +1235,11 @@
$threadFailure[$i]=8 if $collectl=~/Permission denied/s;
$threadFailure[$i]=16 if $collectl=~/Could not resolve/s;
$threadFailure[$i]=32 if $collectl=~/timed out during banner exchange/s;
+ $threadFailure[$i]=128 if $collectl=~/collectl:\s+No such file/s;
+
+ $command="$Ssh $switch $uname$hostnames[$i] date +%Y%m%d%H%M%S 2>&1\n";
+ print "Command: $command" if $debug & 512;
+ $dates[$i]=`$command`;
}
}
@@ -2334,7 +2365,8 @@
Diagnostics
-debug number primarily for development/debugging, see source code
-nocheck do not check hosts (ping/ssh/collectl) before connecting
- -quiet do not report warnings for mismatched collectl versions
+ -quiet do not report warnings for mismatched collectl versions and
+ unknown connections
-reachable if specified, ALL hosts must be pingable/ssh-able
Miscellanous
@@ -2345,6 +2377,7 @@
start with -deb 1 to see address collectl told to use
-timeout secs use this timeout for remote collectl to connect back
requires collectl V3.6.4 or better
+ -timerange secs report remote systems times wider than this range [def=$timerange]
$Copyright;
EOF
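The -timerange check added to colmux above boils down to converting each host's `date +%Y%m%d%H%M%S` output to epoch seconds and comparing the spread against a threshold. The following is a rough standalone sketch under stated assumptions: the host names, timestamps and threshold are invented, and stamp_to_epoch and clock_spread are illustrative names that do not exist in colmux. Unlike the hunk, which tests the running min/max inside the loop, this sketch computes the overall spread once.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Convert a 'date +%Y%m%d%H%M%S' stamp to epoch seconds (local time).
sub stamp_to_epoch {
    my ($stamp) = @_;
    my ($year, $mon, $day, $hour, $min, $sec) =
        $stamp =~ /^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d)/
        or die "unparsable timestamp: $stamp\n";
    # timelocal takes a 0-based month; a 4-digit year is used as-is
    return timelocal($sec, $min, $hour, $day, $mon - 1, $year);
}

# Return the spread, in seconds, between the earliest and latest stamp.
sub clock_spread {
    my @epochs = sort { $a <=> $b } map { stamp_to_epoch($_) } @_;
    return @epochs ? $epochs[-1] - $epochs[0] : 0;
}

# Hypothetical host => timestamp samples, newline included as backticks return it.
my %dates = (
    nodeA => "20160129153821\n",
    nodeB => "20160129153824\n",
);
my $timerange = 1;      # assumed threshold for this example only
my $spread = clock_spread(values %dates);
print "WARNING: clocks differ by $spread seconds\n" if $spread > $timerange;
```

Checking inside the loop, as the hunk does, attaches the warning to whichever host first pushes the running spread past the limit, so which host gets named depends on the order of the host list.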
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/docs/markseger,collectl@web.sourceforge.net new/collectl-4.0.4/docs/markseger,collectl@web.sourceforge.net
--- old/collectl-4.0.2/docs/markseger,collectl@web.sourceforge.net 1970-01-01 01:00:00.000000000 +0100
+++ new/collectl-4.0.4/docs/markseger,collectl@web.sourceforge.net 2016-01-29 15:38:21.000000000 +0100
@@ -0,0 +1,134 @@
+<html>
+<head>
+<link rel=stylesheet href="style.css" type="text/css">
+<title>Exporting Data to Graphite</title>
+</head>
+
+<body>
+<center><h1>Exporting Data to Graphite</h1></center>
+<p>
+<h3>Introduction</h3>
+<p>
+With the release of Collectl Version 3.6.1, you can now send collectl data directly to <a href=http://graphite.wikidot.com>
+graphite </a>. For existing collectl users this now provides you with yet another way to store/plot collectl data, whether on
+a single system or hundreds. For graphite users who are not yet collectl users, you now have access to literally hundreds
+of performance metrics:
+<p>
+<ul>
+<li>Since all collectl instances collect this data at the same time, system noise on
+clusters running fine-grained parallel jobs is reduced, though for larger clusters.</li>
+<li>You can still log all the data collectl collects locally and only send a subset
+to graphite, reducing the load on both graphite and your network.</li>
+<li>You can monitor as infrequently as you like and send data to graphite at a
+coarser frequency, as either the average, minimum or maximum value over that interval.</li>
+<li>The r= switch, something unique to the graphite plugin, can help reduce the instantaneous
+load on the graphite server itself.</li>
+<li>All this at collectl's low monitoring overhead</li>
+</ul>
+
+<h3>Usage</h3>
+<p>
+You use this export like any other, the only required option being the address to send the data to as in
+the following example:
+
+<div class=terminal>
+<pre>
+collectl --export graphite,192.168.1.113
+</pre></div>
+
+However you should also note that since by design this export does not provide any terminal output, there
+are only 2 real ways to make sure it is doing what you expect, the first being to inspect graphite's
+whisper storage area for your particular host name and make sure the data you're collecting is in fact
+showing up there:
+
+<div class=terminal>
+<pre>
+ls /opt/graphite/storage/whisper/poker
+cpuload cputotals ctxint disktotals nettotals
+</pre></div>
+
+or to simply run with the debug mask set to 1, which tells the graphite module to echo all the data it is
+sending to graphite, noting in this case even though collectl is collecting cpu, disk and network data we're
+not sending cpu data to graphite. This is something you might do if logging more data to disk than you are
+sending to graphite, which in this case we are:
+
+<div class=terminal>
+<pre>
+collectl --export graphite,192.168.1.113,d=1,s=dn -rawtoo -f /var/log/collectl
+poker.disktotals.reads 0 1325609789
+poker.disktotals.readkbs 0 1325609789
+poker.disktotals.writes 0 1325609789
+poker.disktotals.writekbs 0 1325609789
+poker.nettotals.kbin 0 1325609789
+poker.nettotals.pktin 1 1325609789
+poker.nettotals.kbout 0 1325609789
+poker.nettotals.pktout 0 1325609789
+</pre></div>
+
+<b><i>tip</i></b> - if you add 8 to the debug flag, eg <i>d=9</i>, this tells the graphite module not to
+actually establish the connection with graphite's carbon listener but to only echo the data that would
+have been sent.
+<p>
+Once you're happy with the switch settings, be sure to update the <i>DaemonCommands</i> in /etc/collectl.conf
+and restart the collectl daemon to make them take effect.
+<p>
+<h3>Switches unique to graphite</h3>
+<b>e=escape</b>
+<br>When sending data to graphite, collectl prefaces each line item with the hostname. If that name includes
+a domain name, extra <i>dots</i> add additional levels to the variable names, which may not be desirable. By including
+an escape character, those <i>dots</i> will be replaced by that character.
+<p>
+<b>r=seconds</b>
+<br>
+By design, collectl calls the export module as soon as the required data has been collected, and collection
+is synchronized to the nearest millisecond across a cluster. This means all instances of collectl will send
+their data to graphite at almost exactly the same time. This high burst of data can overwhelm graphite, so
+to reduce the load when that is found to be a problem, OR if you just want to smooth out the load, you can
+use <i>r=seconds</i>, which literally means <i>delay sending your data to graphite by a random number of microseconds
+&lt;= seconds</i>.
+<p>
+There is an additional caveat: this stall must have completed by the end of the
+current data collection period, so you're restricted to a maximum delay of the interval less 1 second.
+This means if you run collectl with -i1, you can't use r=. However, since most users run collectl with
+intervals of 5 or 10 seconds, values of 4 or 9 should be more than sufficient. And if you choose a collection
+interval of 30 seconds you may still want to use a value of r closer to 5 or 10 seconds so that the data will
+arrive at graphite reasonably close together.
+<p>
+For help with what other valid switches are, you can actually get the graphite module itself to tell
+you like this:
+<div class=terminal>
+<pre>
+collectl --export graphite,h
+</pre></div>
+
+<h3>Communications</h3>
+<p>
+Collectl will attempt to establish a TCP connection to the specified address/port, noting the default port is 2003.
+If that connection cannot be established, collectl will report an error but <i>not</i> exit! This is because
+graphite itself may be down and need to be restarted.
+
+<div class=terminal>
+<pre>
+collectl --export graphite,192.168.1.113,d=1,s=dn
+Could not create socket to 192.168.1.113:2003. Reason: Connection refused
+</pre></div>
+
+By design collectl assumes the graphite address is correct and will try to reconnect every monitoring
+interval. Further, to avoid generating too many errors, it will silently continue to retry and only report
+the connection failure once every 100 attempts, a constant you can modify in the graphite.ph header if you really
+care. Once graphite comes back online collectl will again start sending data to it.
+<p>
+<table width=100%><tr><td align=right><i>updated November 9, 2012</i></td></tr></colgroup></table>
+
+<script type="text/javascript">
+var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
+document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
+</script>
+<script type="text/javascript">
+try {
+var pageTracker = _gat._getTracker("UA-6408045-1");
+pageTracker._trackPageview();
+} catch(err) {}</script>
+
+</body>
+</html>
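Two behaviours described in the graphite page added above, the e= hostname escape and the r= random send delay, can be illustrated with a short sketch. It is only an approximation under stated assumptions: graphite_line and send_delay_usecs are hypothetical names, not functions from graphite.ph, and the host name poker.example.com is invented. The "name value timestamp" line shape is taken from the sample output in the page itself.

```perl
use strict;
use warnings;

# Build one plaintext line in the "name value timestamp" shape shown in the
# page's sample output. $escape plays the role of the e= switch: when set,
# dots in the host name are replaced so they do not add extra name levels.
sub graphite_line {
    my ($host, $metric, $value, $when, $escape) = @_;
    $host =~ s/\./$escape/g if defined $escape && $escape ne '';
    return "$host.$metric $value $when";
}

# Pick a random delay in microseconds for r=seconds. The page restricts the
# delay to at most the collection interval less one second.
sub send_delay_usecs {
    my ($r, $interval) = @_;
    die "r=$r must be at most interval-1 (", $interval - 1, ")\n"
        if $r > $interval - 1;
    return int(rand($r * 1_000_000));   # 0 up to, but not including, r seconds
}

print graphite_line('poker.example.com', 'nettotals.pktin', 1, 1325609789, '_'), "\n";
printf "delay %d usecs\n", send_delay_usecs(4, 5);
```

With a 5 second interval, r=4 is the largest value this sketch accepts, matching the "interval less 1 second" rule in the text.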
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/formatit.ph new/collectl-4.0.4/formatit.ph
--- old/collectl-4.0.2/formatit.ph 2015-05-27 15:02:55.000000000 +0200
+++ new/collectl-4.0.4/formatit.ph 2016-01-29 15:38:21.000000000 +0100
@@ -1,4 +1,4 @@
-# copyright, 2003-20012 Hewlett-Packard Development Company, LP
+# copyright, 2003-20016 Hewlett-Packard Development Company, LP
#
# collectl may be copied only under the terms of either the Artistic License
# or the GNU General Public License, which may be found in the source kit
@@ -66,7 +66,7 @@
}
}
- # For -sD calculations, we need the HZ of the system
+ # for jiffy based calculations, we need the HZ of the system
$HZ=POSIX::sysconf(&POSIX::_SC_CLK_TCK);
$PageSize=POSIX::sysconf(_SC_PAGESIZE);
@@ -237,8 +237,7 @@
my @fields=split(/\s+/, $line);
my $diskName=$fields[3];
- $diskName=remapDiskName($diskName) if $diskRemapFlag;
- $diskName=~s/cciss\///;
+ $diskName=diskRemapName($diskName);
push @dskOrder, $diskName;
$disks{$diskName}=$dskIndexNext++;
}
@@ -767,6 +766,23 @@
}
}
+sub diskRemapName
+{
+ my $diskName=shift;
+
+ foreach my $key (keys %diskRemap)
+ {
+ if ($diskName=~/$key/)
+ {
+ my $temp=$diskName;
+ $diskName=~s/$key/$diskRemap{$key}/;
+ $remapped{$temp}=$diskName; # save, just in case we want to use some day
+ #print "$temp RENAMED via REMAP: $key TO $diskName\n";
+ }
+ }
+ return($diskName);
+}
+
# Why is initFormat() so damn big?
#
# Since logs can be analyzed on a system on which they were not generated
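The --dskremap parsing in collectl and the diskRemapName routine above work as a pair: a comma-separated list of pattern:replacement entries is loaded into a hash, and every matching pattern is then substituted in the disk name. A minimal self-contained sketch follows, with the assumption that parse_remap and remap_name are illustrative names only and the etherd/e:ed mapping is a made-up example.

```perl
use strict;
use warnings;

# Parse a --dskremap style string "pattern:replacement,pattern:replacement".
sub parse_remap {
    my ($spec) = @_;
    my %remap = ('cciss/' => '');           # pre-populated, as in the hunk
    foreach my $pair (split /,/, $spec) {
        # limit of 2 keeps an empty replacement ("pat:") valid; that is a
        # choice made for this sketch, the hunk splits without a limit
        my ($pat, $sub) = split /:/, $pair, 2;
        die "--dskremap string, $pair, missing ':'\n" unless defined $sub;
        $remap{$pat} = $sub;
    }
    return \%remap;
}

# Apply every matching pattern. Keys are sorted here so the result is
# deterministic; the hunk iterates in plain hash order.
sub remap_name {
    my ($remap, $name) = @_;
    foreach my $pat (sort keys %$remap) {
        $name =~ s/$pat/$remap->{$pat}/;
    }
    return $name;
}

my $remap = parse_remap('etherd/e:ed');
print remap_name($remap, 'cciss/c0d0'), "\n";     # built-in cciss/ rule
print remap_name($remap, 'etherd/e1.2'), "\n";    # user-supplied rule
```

Since each key is used as a regular expression, characters such as '.' or '[' in a pattern act as regex metacharacters rather than literals.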
@@ -1110,7 +1126,7 @@
$header=~/Interval: (\S+)/;
$interval=$1;
- # For -s p calculations, we need the HZ of the system
+ # save HZ and archictecture for later use
$header=~/HZ:\s+(\d+)\s+Arch:\s+(\S+)/;
$HZ=$1;
$SrcArch=$2;
@@ -1320,7 +1336,8 @@
$procsRun=$procsBlock=0;
$pagein=$pageout=$swapin=$swapout=$swapTotal=$swapUsed=$swapFree=0;
$pagefault=$pagemajfault=0;
- $memTot=$memUsed=$memFree=$memShared=$memBuf=$memCached=$memSlab=$memAnon=$memMap=$memCommit=$memLocked=0;
+ $memTot=$memUsed=$memFree=$memShared=$memBuf=$memCached=$memSlab=0;
+ $memAnon=$memAnonH=$memMap=$memCommit=$memLocked=0;
$memHugeTot=$memHugeFree=$memHugeRsvd=$memSUnreclaim=0;
$sockUsed=$sockTcp=$sockOrphan=$sockTw=$sockAlloc=0;
$sockMem=$sockUdp=$sockRaw=$sockFrag=$sockFragM=0;
@@ -1446,6 +1463,7 @@
$dskRead[$i]=$dskReadKB[$i]=$dskReadMrg[$i]=0;
$dskWrite[$i]=$dskWriteKB[$i]=$dskWriteMrg[$i]=0;
$dskRqst[$i]=$dskQueLen[$i]=$dskWait[$i]=$dskSvcTime[$i]=$dskUtil[$i]=0;
+ $dskWaitR[$i]=$dskWaitW[$i]=0;
}
for ($i=0; $i<$netIndexNext; $i++)
@@ -1568,7 +1586,8 @@
$pagefaultLast=$pagemajfaultLast=0;
$opsLast=$readLast=$readKBLast=$writeLast=$writeKBLast=0;
$memFreeLast=$memUsedLast=$memBufLast=$memCachedLast=0;
- $memInactLast=$memSlabLast=$memMapLast=$memAnonLast=$memCommitLast=$memLockedLast=0;
+ $memInactLast=$memSlabLast=$memMapLast=0;
+ $memAnonLast=$memAnonHLast=$memCommitLast=$memLockedLast=0;
$swapFreeLast=$swapUsedLast=0;
for ($i=0; $i<18; $i++)
@@ -1601,7 +1620,7 @@
$numaStat[$i]->{hitsLast}=$numaStat[$i]->{missLast}=$numaStat[$i]->{forLast}=0;
$numaMem[$i]->{freeLast}= $numaMem[$i]->{usedLast}=$numaMem[$i]->{actLast}=0;
$numaMem[$i]->{inactLast}=$numaMem[$i]->{mapLast}= $numaMem[$i]->{anonLast}=0;
- $numaMem[$i]->{lockLast}= $numaMem[$i]->{slabLast}=0;
+ $numaMem[$i]->{anonHLast}=$numaMem[$i]->{lockLast}= $numaMem[$i]->{slabLast}=0;
}
# ...and disks
@@ -2119,7 +2138,6 @@
# during interactive processing, the first interval only provides baseline data
# and so never call print
intervalPrint($seconds) if $playback ne '' || $intFirstSeen;
-
# need to reinitialize all relevant variables at end of each interval.
initInterval();
@@ -3277,7 +3295,9 @@
{
($major, $minor, $diskName, @dskFields)=split(/\s+/, $data);
- $diskName=~s/cciss\///;
+ # if using --dskremap (and also noting prepopulated with 'cciss/'), remap disk name
+ $diskName=diskRemapName($diskName);
+
if (!defined($disks{$diskName}))
{
$dskChangeFlag|=1; # new disk found
@@ -3379,6 +3399,8 @@
$dskRqst[$dskIndex]= $numIOs ? ($dskReadKB[$dskIndex]+$dskWriteKB[$dskIndex])/$numIOs : 0;
$dskQueLen[$dskIndex]= $dskTicks[$dskIndex] ? $dskWeighted[$dskIndex]/$dskTicks[$dskIndex] : 0;
$dskWait[$dskIndex]= $numIOs ? ($dskReadTicks[$dskIndex]+$dskWriteTicks[$dskIndex])/$numIOs : 0;
+ $dskWaitR[$dskIndex]= $dskRead[$dskIndex] ? ($dskReadTicks[$dskIndex]/$dskRead[$dskIndex]) : 0;
+ $dskWaitW[$dskIndex]= $dskWrite[$dskIndex] ? ($dskWriteTicks[$dskIndex]/$dskWrite[$dskIndex]) : 0;
$dskSvcTime[$dskIndex]=$numIOs ? $dskTicks[$dskIndex]/$numIOs : 0;
$dskUtil[$dskIndex]= $dskTicks[$dskIndex]*10/$microInterval;
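The new per-direction wait figures above follow the same pattern as the existing combined wait: ticks divided by the number of I/Os, with a guard so an idle disk yields 0 instead of a division by zero. A small sketch of that arithmetic, assuming the tick and I/O counts are already per-interval deltas; avg_wait is a hypothetical helper and the sample numbers are invented.

```perl
use strict;
use warnings;

# Average wait per completed I/O, guarding against a zero I/O count the same
# way the hunk does with its ternary tests.
sub avg_wait {
    my ($ticks, $ios) = @_;
    return $ios ? $ticks / $ios : 0;
}

# Made-up interval deltas: reads, read ticks, writes, write ticks.
my ($reads, $read_ticks, $writes, $write_ticks) = (4, 20, 0, 0);
printf "WaitR=%.1f WaitW=%.1f Wait=%.1f\n",
    avg_wait($read_ticks, $reads),                               # reads only
    avg_wait($write_ticks, $writes),                             # writes only
    avg_wait($read_ticks + $write_ticks, $reads + $writes);      # combined
```

Splitting the figure by direction shows which side of the workload is slow; in this invented sample all of the wait comes from reads.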
@@ -3814,7 +3836,7 @@
}
}
- elsif ($subsys=~/m/i && $type=~/^Buffers|^Cached|^Dirty|^Active|^Inactive|^AnonPages|^Mapped|^Slab:|^Committed_AS:|^Huge|^SUnreclaim|^Mloc|^nr/)
+ elsif ($subsys=~/m/i && $type=~/^Buffers|^Cached|^Dirty|^Active|^Inactive|^Anon|^Mapped|^Slab:|^Committed_AS:|^Huge|^SUnreclaim|^Mloc|^nr/)
{
$data=(split(/\s+/, $data))[0];
$memBuf=$data if $type=~/^Buf/;
@@ -3823,7 +3845,8 @@
$memAct=$data if $type=~/^Act/;
$memInact=$data if $type=~/^Ina/;
$memSlab=$data if $type=~/^Sla/;
- $memAnon=$data if $type=~/^Anon/;
+ $memAnon=$data if $type=~/^AnonPages/;
+ $memAnonH=$data if $type=~/^AnonHuge/;
$memMap=$data if $type=~/^Map/;
$memLocked=$data if $type=~/^Mlocked/;
$memCommit=$data if $type=~/^Com/;
@@ -3843,6 +3866,7 @@
$memSlabC= $memSlab-$memSlabLast;
$memMapC= $memMap-$memMapLast;
$memAnonC= $memAnon-$memAnonLast;
+ $memAnonHC= $memAnonH-$memAnonHLast;
$memCommitC=$memCommit-$memCommitLast;
$memLockedC=$memLocked-$memLockedLast;
@@ -3852,6 +3876,7 @@
$memSlabLast= $memSlab;
$memMapLast= $memMap;
$memAnonLast= $memAnon;
+ $memAnonHLast= $memAnonH;
$memCommitLast=$memCommit;
$memLockedLast=$memLocked;
}
@@ -3891,8 +3916,10 @@
{ $numaMem[$node]->{inact}=$value; }
elsif ($name=~/^Mapped/)
{ $numaMem[$node]->{map}=$value; }
- elsif ($name=~/^Anon/)
+ elsif ($name=~/^AnonPages/)
{ $numaMem[$node]->{anon}=$value; }
+ elsif ($name=~/^AnonHugePages/)
+ { $numaMem[$node]->{anonH}=$value; }
elsif ($name=~/^Mlock/)
{ $numaMem[$node]->{lock}=$value; }
@@ -3910,6 +3937,7 @@
$numaMem[$node]->{inactC}=$numaMem[$node]->{inact}-$numaMem[$node]->{inactLast};
$numaMem[$node]->{mapC}= $numaMem[$node]->{map}- $numaMem[$node]->{mapLast};
$numaMem[$node]->{anonC}= $numaMem[$node]->{anon}- $numaMem[$node]->{anonLast};
+ $numaMem[$node]->{anonHC}=$numaMem[$node]->{anonH}-$numaMem[$node]->{anonHLast};
$numaMem[$node]->{lockC}= $numaMem[$node]->{lock}- $numaMem[$node]->{lockLast};
$numaMem[$node]->{slabC}= $numaMem[$node]->{slab}- $numaMem[$node]->{slabLast};
@@ -3919,6 +3947,7 @@
$numaMem[$node]->{inactLast}=$numaMem[$node]->{inact};
$numaMem[$node]->{mapLast}= $numaMem[$node]->{map};
$numaMem[$node]->{anonLast}= $numaMem[$node]->{anon};
+ $numaMem[$node]->{anonHLast}=$numaMem[$node]->{anonH};
$numaMem[$node]->{lockLast}= $numaMem[$node]->{lock};
$numaMem[$node]->{slabLast}= $numaMem[$node]->{slab};
}
@@ -3943,13 +3972,13 @@
# These MUST be caused by a kernel bug as counters shouldn't go backwards!!!
if ($numaStat[$node]->{miss}<0)
{
- print "#yikes! numa_miss < 0 on node: $node NOW: $numaStat[$node]->{missNow} LAST: $numaStat[$node]->{missLast}\n";
+ logmsg('E', "Possible kernel metric bug, miss counter went backwards from $numaStat[$node]->{missLast} to $numaStat[$node]->{missNow}");
$numaStat[$node]->{miss}=0;
}
if ($numaStat[$node]->{for}<0)
{
- print "#yikes! numa_foreign < 0 on node: $node NOW: $numaStat[$node]->{forNow} LAST: $numaStat[$node]->{forLast}\n";
+ logmsg('E', "Possible kernel metric bug, foreign counter went backwards from $numaStat[$node]->{forLast} to $numaStat[$node]->{forNow}");
$numaStat[$node]->{for}=0;
}
@@ -4426,7 +4455,7 @@
if ($subsys=~/m/)
{
$headers.="[MEM]Tot${SEP}[MEM]Used${SEP}[MEM]Free${SEP}[MEM]Shared${SEP}[MEM]Buf${SEP}[MEM]Cached${SEP}";
- $headers.="[MEM]Slab${SEP}[MEM]Map${SEP}[MEM]Anon${SEP}[MEM]Commit${SEP}[MEM]Locked${SEP}"; # always from V1.7.5 forward
+ $headers.="[MEM]Slab${SEP}[MEM]Map${SEP}[MEM]Anon${SEP}[MEM]AnonH${SEP}[MEM]Commit${SEP}[MEM]Locked${SEP}";
$headers.="[MEM]SwapTot${SEP}[MEM]SwapUsed${SEP}[MEM]SwapFree${SEP}[MEM]SwapIn${SEP}[MEM]SwapOut${SEP}";
$headers.="[MEM]Dirty${SEP}[MEM]Clean${SEP}[MEM]Laundry${SEP}[MEM]Inactive${SEP}";
$headers.="[MEM]PageIn${SEP}[MEM]PageOut${SEP}[MEM]PageFaults${SEP}[MEM]PageMajFaults${SEP}";
@@ -4594,11 +4623,12 @@
$dskName=$dskOrder[$i];
next if ($dskFiltKeep eq '' && $dskName=~/$dskFiltIgnore/) || ($dskFiltKeep ne '' && $dskName!~/$dskFiltKeep/);
- $temp= "[DSK]Name${SEP}[DSK]Reads${SEP}[DSK]RMerge${SEP}[DSK]RKBytes${SEP}";
- $temp.="[DSK]Writes${SEP}[DSK]WMerge${SEP}[DSK]WKBytes${SEP}[DSK]Request${SEP}";
+ # note I removed remapping of cciss name because it was just discovered I never needed to since
+ # the cciss/ was dropped when $dskOrder array implemented
+ $temp= "[DSK]Name${SEP}[DSK]Reads${SEP}[DSK]RMerge${SEP}[DSK]RKBytes${SEP}[DSK]WaitR${SEP}";
+ $temp.="[DSK]Writes${SEP}[DSK]WMerge${SEP}[DSK]WKBytes${SEP}[DSK]WaitW${SEP}[DSK]Request${SEP}";
$temp.="[DSK]QueLen${SEP}[DSK]Wait${SEP}[DSK]SvcTim${SEP}[DSK]Util${SEP}";
$temp=~s/DSK/DSK:$dskName/g;
- $temp=~s/cciss\///g;
$dskHeaders.=$temp;
}
writeData(0, $ch, \$dskHeaders, DSK, $ZDSK, 'dsk', \$headersAll);
@@ -4626,7 +4656,7 @@
for ($i=0; $i<$CpuNodes; $i++)
{
$numaHeaders.="[NUMA:$i]Used${SEP}[NUMA:$i]Free${SEP}[NUMA:$i]Slab${SEP}[NUMA:$i]Mapped${SEP}";
- $numaHeaders.="[NUMA:$i]Anon${SEP}[NUMA:$i]Inactive${SEP}[NUMA:$i]Hits${SEP}";
+ $numaHeaders.="[NUMA:$i]Anon${SEP}[NUMA:$i]AnonH${SEP}[NUMA:$i]Inactive${SEP}[NUMA:$i]Hits${SEP}";
}
writeData(0, $ch, \$numaHeaders, NUMA, $ZNUMA, 'numa', \$headersAll);
}
@@ -4960,7 +4990,7 @@
# the data is already being recorded in the raw file and we don't want to do
# both
- if (!$rawtooFlag && $subsys=~/[YZ]/ && $interval2Print)
+ if (!$rawtooFlag && $subsys=~/[YZ]/ && $interval2Print && !$firstTime2)
{
printPlotSlab($date, $time) if $subsys=~/Y/ && !$slabAnalOnlyFlag;
printPlotProc($date, $time) if $subsys=~/Z/ && !$procAnalOnlyFlag;
@@ -5003,7 +5033,8 @@
{
$plot.=sprintf("$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS",
$memTot, $memUsed, $memFree, $memShared, $memBuf, $memCached);
- $plot.=sprintf("$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS", $memSlab, $memMap, $memAnon, $memCommit, $memLocked); # Always from V1.7.5 forward
+ $plot.=sprintf("$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS",
+ $memSlab, $memMap, $memAnon, $memAnonH, $memCommit, $memLocked); # Always from V1.7.5 forward
$plot.=sprintf("$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS",
$swapTotal, $swapUsed, $swapFree, $swapin/$intSecs, $swapout/$intSecs,
$memDirty, $clean, $laundry, $memInact,
@@ -5233,16 +5264,16 @@
if (defined($disks{$dskName}))
{
my $i=$disks{$dskName};
- $dskRecord=sprintf("%s$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS",
+ $dskRecord=sprintf("%s$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS",
$dskName,
- $dskRead[$i]/$intSecs, $dskReadMrg[$i]/$intSecs, $dskReadKB[$i]/$intSecs,
- $dskWrite[$i]/$intSecs, $dskWriteMrg[$i]/$intSecs, $dskWriteKB[$i]/$intSecs,
+ $dskRead[$i]/$intSecs, $dskReadMrg[$i]/$intSecs, $dskReadKB[$i]/$intSecs, $dskWaitR[$i]/$intSecs,
+ $dskWrite[$i]/$intSecs, $dskWriteMrg[$i]/$intSecs, $dskWriteKB[$i]/$intSecs, $dskWaitW[$i]/$intSecs,
$dskRqst[$i], $dskQueLen[$i], $dskWait[$i], $dskSvcTime[$i], $dskUtil[$i]);
}
else
{
- $dskRecord=sprintf("%s$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS",
- $dskName, 0,0,0,0,0,0,0,0,0,0,0);
+ $dskRecord=sprintf("%s$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS$SEP%$FS",
+ $dskName, 0,0,0,0,0,0,0,0,0,0,0,0,0);
}
# If exception processing in effect and writing to a file, make sure this entry
@@ -5412,13 +5443,13 @@
for (my $i=0; $i<$CpuNodes; $i++)
{
# don't see how total can ever be 0, but let's be careful anyways
- my $misses=$numaStat[$i]->{for}+$numaStat[$i]->{miss};
- my $hitrate=($misses) ? $numaStat[$i]->{hits}/($numaStat[$i]->{hits}+$misses)*100 : 100;
+ my $hitsplusmisses=$numaStat[$i]->{hits}+$numaStat[$i]->{for}+$numaStat[$i]->{miss};
+ my $hitrate=($hitsplusmisses) ? $numaStat[$i]->{hits}/$hitsplusmisses*100 : 100;
- $numaPlot.=sprintf("$SEP%d$SEP%d$SEP%d$SEP%d$SEP%d$SEP%d$SEP%.2f",
+ $numaPlot.=sprintf("$SEP%d$SEP%d$SEP%d$SEP%d$SEP%d$SEP%d$SEP%d$SEP%.2f",
$numaMem[$i]->{used}, $numaMem[$i]->{free}, $numaMem[$i]->{slab},
$numaMem[$i]->{map}, $numaMem[$i]->{anon},
- $numaMem[$i]->{inact}, $hitrate);
+ $numaMem[$i]->{anonH},$numaMem[$i]->{inact}, $hitrate);
}
writeData(0, $datetime, \$numaPlot, NUMA, $ZNUMA, 'numa', \$oneline);
}
@@ -5965,15 +5996,15 @@
{
if ($dskOpts!~/f/)
{
- $dskhdr1Format="<---------reads---------><---------writes---------><--------averages--------> Pct\n";
- $dskhdr2Format=" KBytes Merged IOs Size KBytes Merged IOs Size RWSize QLen Wait SvcTim Util\n";
- $dskdetFormat="%s%-11s %6d %6d %4s %4s %6d %6d %4s %4s %5d %5d %4d %4d %3d\n";
+ $dskhdr1Format="<---------reads---------------><---------writes--------------><--------averages--------> Pct\n";
+ $dskhdr2Format=" KBytes Merged IOs Size Wait KBytes Merged IOs Size Wait RWSize QLen Wait SvcTim Util\n";
+ $dskdetFormat="%s%-11s %6d %6d %4s %4s %5d %6d %6d %4s %4s %5d %5d %5d %4d %4d %3d\n";
}
else
{
- $dskhdr1Format="<---------reads----------><---------writes---------><---------averages----------> Pct\n";
- $dskhdr2Format=" KBytes Merged IOs Size KBytes Merged IOs Size RWSize QLen Wait SvcTim Util\n";
- $dskdetFormat="%s%-11s %7.1f %6.0f %4s %4s %7.1f %6.0f %4s %4s %6.1f %6.1f %6.1f %6.1f %3.0f\n";
+ $dskhdr1Format="<------------reads--------------><-------------writes------------><---------averages----------> Pct\n";
+ $dskhdr2Format=" KBytes Merged IOs Size Wait KBytes Merged IOs Size Wait RWSize QLen Wait SvcTim Util\n";
+ $dskdetFormat="%s%-11s %7.1f %6.0f %4s %4s %6.1f %7.1f %6.0f %4s %4s %6.1f %6.1f %6.1f %6.1f %6.1f %5.2f\n";
}
}
@@ -6005,9 +6036,9 @@
$line=sprintf($dskdetFormat,
$datetime, $dskName,
$dskReadKB[$i]/$intSecs, $dskReadMrg[$i]/$intSecs, cvt($dskRead[$i]/$intSecs),
- $dskRead[$i] ? cvt($dskReadKB[$i]/$dskRead[$i],4,0,1) : 0,
+ $dskRead[$i] ? cvt($dskReadKB[$i]/$dskRead[$i],4,0,1) : 0, $dskWaitR[$i],
$dskWriteKB[$i]/$intSecs, $dskWriteMrg[$i]/$intSecs, cvt($dskWrite[$i]/$intSecs),
- $dskWrite[$i] ? cvt($dskWriteKB[$i]/$dskWrite[$i],4,0,1) : 0,
+ $dskWrite[$i] ? cvt($dskWriteKB[$i]/$dskWrite[$i],4,0,1) : 0, $dskWaitW[$i],
$dskRqst[$i], $dskQueLen[$i], $dskWait[$i], $dskSvcTime[$i], $dskUtil[$i]);
printText($line);
}
@@ -6673,24 +6704,24 @@
if ($memOpts!~/R/)
{
$line="#$miniFiller";
- $line.="<-------------------------------Physical Memory-------------------------------------->" if $memOpts eq '' || $memOpts=~/P/;
- $line.="<-----------Swap------------><-------Paging------>" if $memOpts eq '' || $memOpts=~/V/;
- $line.="<---Other---|-------Page Alloc------|------Page Refill----->" if $memOpts=~/p/;
- $line.="<------Page Steal-------|-------Scan KSwap------|------Scan Direct----->" if $memOpts=~/s/;
+ $line.="<------------------------------------Physical Memory------------------------------------------>" if $memOpts eq '' || $memOpts=~/P/;
+ $line.="<-----------Swap------------><-------Paging------>" if $memOpts eq '' || $memOpts=~/V/;
+ $line.="<---Other---|-------Page Alloc------|------Page Refill----->" if $memOpts=~/p/;
+ $line.="<------Page Steal-------|-------Scan KSwap------|------Scan Direct----->" if $memOpts=~/s/;
printText("$line\n");
$line="#$miniFiller";
- $line.=" Total Used Free Buff Cached Slab Mapped Anon Commit Locked Inact" if $memOpts eq '' || $memOpts=~/P/;
- $line.=" Total Used Free In Out Fault MajFt In Out" if $memOpts eq '' || $memOpts=~/V/;
- $line.=" Free Activ Dma Dma32 Norm Move Dma Dma32 Norm Move" if $memOpts=~/p/;
- $line.=" Dma Dma32 Norm Move Dma Dma32 Norm Move Dma Dma32 Norm Move" if $memOpts=~/s/;
+ $line.=" Total Used Free Buff Cached Slab Mapped Anon AnonH Commit Locked Inact" if $memOpts eq '' || $memOpts=~/P/;
+ $line.=" Total Used Free In Out Fault MajFt In Out" if $memOpts eq '' || $memOpts=~/V/;
+ $line.=" Free Activ Dma Dma32 Norm Move Dma Dma32 Norm Move" if $memOpts=~/p/;
+ $line.=" Dma Dma32 Norm Move Dma Dma32 Norm Move Dma Dma32 Norm Move" if $memOpts=~/s/;
printText("$line\n");
}
else
{
- $line=sprintf("#$miniFiller<-----------------------------------Physical Memory-------------------------------------------><------------Swap-------------><-------Paging------>\n");
+ $line=sprintf("#$miniFiller<---------------------------------------Physical Memory-----------------------------------------------><------------Swap-------------><-------Paging------>\n");
printText($line);
- printText("#$miniDateTime Total Used Free Buff Cached Slab Mapped Anon Commit Locked Inact Total Used Free In Out Fault MajFt In Out\n");
+ printText("#$miniDateTime Total Used Free Buff Cached Slab Mapped Anon AnonH Commit Locked Inact Total Used Free In Out Fault MajFt In Out\n");
}
exit(0) if $showColFlag;
}
@@ -6698,10 +6729,11 @@
if ($memOpts!~/R/)
{
$line="$datetime ";
- $line.=sprintf(" %7s %7s %7s %7s %7s %7s %7s %7s %7s %7s %5s",
+ $line.=sprintf(" %7s %7s %7s %7s %7s %7s %7s %7s %7s %7s %7s %5s",
cvt($memTot,7,1,1), cvt($memUsed,7,1,1), cvt($memFree,7,1,1),
cvt($memBuf,7,1,1), cvt($memCached,7,1,1),
- cvt($memSlab,7,1,1), cvt($memMap,7,1,1), cvt($memAnon,7,1,1),
+ cvt($memSlab,7,1,1), cvt($memMap,7,1,1),
+ cvt($memAnon,7,1,1), cvt($memAnonH,7,1,1),
cvt($memCommit,7,1,1), cvt($memLocked,7,1,1), cvt($memInact,5,1,1)) if $memOpts eq '' || $memOpts=~/P/;
$line.=sprintf(" %5s %5s %5s %4s %4s %5s %5s %4s %4s",
cvt($swapTotal,5,1,1), cvt($swapUsed,5,1,1), cvt($swapFree,5,1,1),
@@ -6720,10 +6752,11 @@
}
else
{
- $line=sprintf("$datetime %7s %8s %8s %8s %8s %7s %7s %7s %7s %7s %6s %5s %6s %6s %4s %4s %5s %5s %4s %4s\n",
+ $line=sprintf("$datetime %7s %8s %8s %8s %8s %7s %7s %7s %7s %7s %7s %6s %5s %6s %6s %4s %4s %5s %5s %4s %4s\n",
cvt($memTot/$intSecs,7,1,1), cvt($memUsedC/$intSecs,7,1,1), cvt($memFreeC/$intSecs,7,1,1),
cvt($memBufC/$intSecs,7,1,1), cvt($memCachedC/$intSecs,7,1,1),
- cvt($memSlabC/$intSecs,7,1,1), cvt($memMapC/$intSecs,7,1,1), cvt($memAnonC/$intSecs,7,1,1),
+ cvt($memSlabC/$intSecs,7,1,1), cvt($memMapC/$intSecs,7,1,1),
+ cvt($memAnonC/$intSecs,7,1,1), cvt($memAnonHC/$intSecs,7,1,1),
cvt($memCommitC/$intSecs,7,1,1), cvt($memLockedC/$intSecs,7,1,1), cvt($memInactC/$intSecs,5,1,1),
cvt($swapTotal,5,1,1), cvt($swapUsedC/$intSecs,5,1,1), cvt($swapFreeC/$intSecs,5,1,1),
cvt($swapin/$intSecs,5,1,1), cvt($swapout/$intSecs,5,1,1),
@@ -6744,7 +6777,7 @@
{
# we've got the room so let's use an extra column for each and have the same
# headers for 'R' and because I'm lazy.
- printText("#$miniFiller Node Total Used Free Slab Mapped Anon Locked Inact");
+ printText("#$miniFiller Node Total Used Free Slab Mapped Anon AnonH Locked Inact");
printText(" HitPct") if $memOpts!~/R/;
printText("\n");
}
@@ -6757,13 +6790,15 @@
if ($memOpts!~/R/)
{
# total hits can be 0 if no data collected
- my $misses=$numaStat[$i]->{for}+$numaStat[$i]->{miss};
- my $hitrate=($misses) ? $numaStat[$i]->{hits}/($numaStat[$i]->{hits}+$misses)*100 : 100;
- $line.=sprintf("$datetime %4d %8s %8s %8s %8s %8s %8s %8s %8s %6.2f\n", $i,
+ my $hitsplusmisses=$numaStat[$i]->{hits}+$numaStat[$i]->{for}+$numaStat[$i]->{miss};
+ my $hitrate=($hitsplusmisses) ? $numaStat[$i]->{hits}/$hitsplusmisses*100 : 100;
+
+ $line.=sprintf("$datetime %4d %8s %8s %8s %8s %8s %8s %8s %8s %8s %6.2f\n", $i,
cvt($numaMem[$i]->{used}+$numaMem[$i]->{free},7,1,1),
cvt($numaMem[$i]->{used},7,1,1), cvt($numaMem[$i]->{free},7,1,1),
cvt($numaMem[$i]->{slab},7,1,1), cvt($numaMem[$i]->{map},7,1,1),
- cvt($numaMem[$i]->{anon},7,1,1), cvt($numaMem[$i]->{lock},7,1,1),
+ cvt($numaMem[$i]->{anon},7,1,1), cvt($numaMem[$i]->{anonH},7,1,1),
+ cvt($numaMem[$i]->{lock},7,1,1),
cvt($numaMem[$i]->{inact},7,1,1), $hitrate);
}
else
@@ -6772,7 +6807,8 @@
cvt($numaMem[$i]->{usedC}+$numaMem[$i]->{freeC},7,1,1),
cvt($numaMem[$i]->{usedC},7,1,1), cvt($numaMem[$i]->{freeC},7,1,1),
cvt($numaMem[$i]->{slabC},7,1,1), cvt($numaMem[$i]->{mapC},7,1,1),
- cvt($numaMem[$i]->{anonC},7,1,1), cvt($numaMem[$i]->{lockC},7,1,1),
+ cvt($numaMem[$i]->{anonC},7,1,1), cvt($numaMem[$i]->{anonH},7,1,1),
+ cvt($numaMem[$i]->{lockC},7,1,1),
cvt($numaMem[$i]->{inactC},7,1,1));
}
}
@@ -8928,7 +8964,7 @@
undef %procSeen;
}
-# This output only goes to the .prc file
+# This output goes to the .prc file if -f specified
sub printPlotProc
{
my $date=shift;
@@ -8973,7 +9009,7 @@
# Username comes from translation hash OR we just print the UID
$procPlot.=sprintf("%s${SEP}%d${SEP}%s${SEP}%s${SEP}%s${SEP}%d${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%s${SEP}%d${SEP}%s${SEP}%d${SEP}%d${SEP}%d${SEP}%d${SEP}%d${SEP}%d${SEP}%d${SEP}%s${SEP}%s${SEP}%s",
$datetime, $procPid[$i], $procUser[$i], $procPri[$i],
- $procPpid[$i], $procThread[%i], $procState[$i],
+ $procPpid[$i], $procTCount[$i], $procState[$i],
defined($procVmSize[$i]) ? $procVmSize[$i] : 0,
defined($procVmLck[$i]) ? $procVmLck[$i] : 0,
defined($procVmRSS[$i]) ? $procVmRSS[$i] : 0,
@@ -9876,4 +9912,4 @@
return(@dirs);
}
-1;
+1;
\ No newline at end of file
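The tightened divide-by-zero test for the NUMA hit rate (the $hitsplusmisses change in the hunks above) can be sketched outside of collectl as follows. This is an illustrative Python sketch, not collectl's Perl code: the function names and parameters are hypothetical stand-ins for $numaStat[$i]->{hits}, {for} and {miss}.

```python
def old_numa_hit_rate(hits, foreign, miss):
    """Pre-4.0.3 logic: only the miss count is tested before dividing."""
    misses = foreign + miss
    # Divides by hits + misses, which can still be zero when misses is nonzero.
    return hits / (hits + misses) * 100 if misses else 100.0


def numa_hit_rate(hits, foreign, miss):
    """Tightened logic: test the exact value that is used as the divisor."""
    total = hits + foreign + miss
    return hits / total * 100 if total else 100.0
```

Testing the full denominator rather than only the miss count means an unexpected counter combination (the changelog wonders whether hits could somehow be negative) falls back to 100 instead of raising a division error.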
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/lexpr.ph new/collectl-4.0.4/lexpr.ph
--- old/collectl-4.0.2/lexpr.ph 2015-05-27 15:02:55.000000000 +0200
+++ new/collectl-4.0.4/lexpr.ph 2016-01-29 15:38:21.000000000 +0100
@@ -24,6 +24,7 @@
sub lexprInit
{
+ error('lexpr is an export, not an import') if $import=~/lexpr/;
error('--showcolheader not supported by lexpr') if $showColFlag;
# on the odd chance someone did -s-all and have other ways to generate data, collectl
@@ -252,8 +253,10 @@
$diskDetString.=sendData("diskinfo.reads.$dskName", $dskRead[$i]/$intSecs);
$diskDetString.=sendData("diskinfo.readkbs.$dskName", $dskReadKB[$i]/$intSecs);
+ $diskDetString.=sendData("diskinfo.readw.$dskName", $dskWaitR[$i]/$intSecs);
$diskDetString.=sendData("diskinfo.writes.$dskName", $dskWrite[$i]/$intSecs);
$diskDetString.=sendData("diskinfo.writekbs.$dskName", $dskWriteKB[$i]/$intSecs);
+ $diskDetString.=sendData("diskinfo.writew.$dskName", $dskWaitW[$i]/$intSecs);
$diskDetString.=sendData("diskinfo.quelen.$dskName", $dskQueLen[$i]/$intSecs);
$diskDetString.=sendData("diskinfo.wait.$dskName", $dskWait[$i]/$intSecs);
$diskDetString.=sendData("diskinfo.svctime.$dskName", $dskSvcTime[$i]/$intSecs);
@@ -340,6 +343,7 @@
$memString.=sendData("meminfo.slab", $memSlab, 1);
$memString.=sendData("meminfo.map", $memMap, 1);
$memString.=sendData("meminfo.anon", $memAnon, 1);
+ $memString.=sendData("meminfo.anonH", $memAnonH, 1);
$memString.=sendData("meminfo.dirty", $memDirty, 1);
$memString.=sendData("meminfo.locked", $memLocked, 1);
$memString.=sendData("meminfo.inactive", $memInact, 1);
@@ -362,7 +366,7 @@
{
for (my $i=0; $i<$CpuNodes; $i++)
{
- foreach my $field ('used', 'free', 'slab', 'map', 'anon', 'lock', 'act', 'inact')
+ foreach my $field ('used', 'free', 'slab', 'map', 'anon', 'anonH', 'lock', 'act', 'inact')
{
$memDetString.=sendData("numainfo.$field.$i", $numaMem[$i]->{$field}, 1);
}
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/man1/collectl.1 new/collectl-4.0.4/man1/collectl.1
--- old/collectl-4.0.2/man1/collectl.1 2015-05-27 15:02:56.000000000 +0200
+++ new/collectl-4.0.4/man1/collectl.1 2016-01-29 15:38:21.000000000 +0100
@@ -246,6 +246,11 @@
Just remember that unlike --dskfilt which only filters during display, records filtered
with this switch are never even recorded and so lost forever.
+You can optionally specify your filter with a leading plus-sign which tells collectl
+to just add your filter to the default specification. Care should be taken here as
+longer filters will slightly increase overhead and with a lot of disks and/or shorter
+monitoring intervals can add up.
+
As a side benefit of this switch, if you really want to look at partition level stats
you can do so by leaving off the trailing space in the default pattern.
@@ -974,6 +979,13 @@
z \- only applies to disk details, do not report any lines with values of all zeros.
.RE
+.B "--dskremap aaa:bbb,ccc:ddd..."
+.RS
+This will cause disk names matching the perl pattern aaa to be replaced with the string bbb. In some
+cases, you may simply want to remove the entire string in which case the second string should be left empty.
+If you want to remove a string containing a /, be sure to escape it with a backslash.
+.RE
+
.B "--envopts Environmental Options"
.RS
The default is to display ALL data but the following will cause a subset to be displayed
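The --dskremap behaviour described in the man page text above (pattern:replacement pairs separated by commas, with an empty replacement removing the match) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not collectl's implementation: remap_disk_name is a hypothetical helper, Python's re module stands in for Perl patterns, and applying every pair in order is an assumption the man page does not confirm.

```python
import re


def remap_disk_name(name, spec):
    """Apply each 'pattern:replacement' pair in spec to a disk name."""
    for rule in spec.split(","):
        # Split on the first colon only; an empty right side removes the match.
        pattern, _, replacement = rule.partition(":")
        # A callable replacement keeps the string literal (no backreference parsing).
        name = re.sub(pattern, lambda _m, r=replacement: r, name)
    return name
```

For example, remap_disk_name("etherd/e1.0", r"etherd\/:") strips the etherd/ prefix, which is the kind of messy name the changelog mentions.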
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/man1/colmux.1 new/collectl-4.0.4/man1/colmux.1
--- old/collectl-4.0.2/man1/colmux.1 2015-05-27 15:02:56.000000000 +0200
+++ new/collectl-4.0.4/man1/colmux.1 2016-01-29 15:38:21.000000000 +0100
@@ -393,6 +393,10 @@
By default and when -nocheck not specified, colmux checks the versions of all collectl instances
against that of the first node found to be running collectl and if different, reports the
mismatch. This switch suppresses that warning.
+
+When a connection is received from an unexpected address, a warning is also
+reported and the request promptly ignored. This switch also suppresses those
+messages. For more information on problems connecting, see CONNECTION PROBLEMS.
.RE
.B "-reachable"
@@ -424,9 +428,8 @@
.B "-retaddr addr"
.RS
-By default, colmux listens on the default routes interface's address and tell collectl to connect
-back to it using that. However, in some cases this address won't work correctly and so using this
-switch will tell collectl to use a different address to connect back over.
+Tell remote collectls to open a socket on this address instead of the preselected one. For
+more details on this, see CONNECTION PROBLEMS.
.RE
.B "-timeout secs"
@@ -437,55 +440,14 @@
V3.6.4 be used because earlier version do not support this feature.
.RE
-.SH WHAT HAS CHANGED WITH VERSION 3?
-
-Users of Version 2 will find this to look like a new utility though in actuality only a couple
-of enhancements have been made to the functionality, which include:
-
-.B "sorting of multi-line data"
-
-Rather than simply report all the data for all hosts specified, something ver few people actually
-used, only the top-n hosts will now have their data reported, sorted by the column specified by -column.
-
-.B "ability to playback data from collectl files"
-
-Simply add -p to the collectl command and the associated file(s) for the same day will be played back
-and the data reported in either multi- or single-line format.
-
-.B "new features, include -test to show which column(s) selected"
-
-Instead of manually counting which column(s) you wish to select for sorting or single-line mode, -test
-will show you column numbering, which can be particularly useful for wide lines. Additional switches
-for enhanced multi-line formatting have also been included.
-
-.B "several changes to single line mode"
-
-.RS
-.B "new way to request prefacing lines with timestamps:"
-Simply add the desired time format using -o to the collectl command
-
-.B "no longer need -w for non-plot data:"
-colmux is smart enough to recognize fields that end in K/M/G and convert them to the
-appropriate values before sorting. However it will still display them in their
-original forms. Further, you can even sort on non-numeric fields such as device
-names and many of the fields reported for process data.
-.RE
-
-.B "several switched eliminated"
+.B "-timerange secs"
.RS
-Yes, it is hard to believe but a number of switches have been eliminated either
-because their functionality is encompassed in other mechanisms or their function
-has been deemed obsolete.
-
-.B "-date, -mmdd, -time:"
-time formats now handled with -o in collectl command
-
-.B "-hosts, -machines:"
-use -address
-
-.B "-rsh:"
-nobody uses rsh anymore
-.RE
+When colmux starts up and checks the connectivity to all the machines specified by -addr,
+it also gets their current date/time and using that computes the range of system times
+across all nodes. If that range is found to be more than -timerange seconds, colmux generates
+a warning as this difference could cause reporting problems. One can increase the range
+to get rid of the message (not recommended unless other factors are preventing nodes from
+responding quickly enough to the date command) OR suppress the warning with -quiet.
.RE
.SH PLAYBACK MODE RESTRICTIONS
@@ -549,14 +511,33 @@
colmux requires passwordless ssh between the node it is running on those it is monitoring.
also be sure the port you are using for communications, the default is 2655, if open
-.SH KNOWN PROBLEMS
+.SH CONNECTION PROBLEMS
-see source code
+colmux works by choosing an address it wants to communicate over and starting up
+one or more remote copies of collectl, telling them to connect back to colmux using that address.
+The easiest way to see this is to run colmux with -noesc, which tells it NOT to issue any escape
+sequences and therefore not to run in full screen mode. The additional switch of -debug 1 tells
+it to show the remote collectl startup command. When there is a communications problem you
+will typically see 'connection timed out' messages displayed.
+
+There are actually a couple of possibilities here, one of which is that a firewall is preventing
+connections. The easiest way to test this is to run collectl on the local machine like this:
+collectl -Aserver. This tells collectl to run as a server, listening for connections just
+like colmux. Then log into a remote machine and run
+/usr/share/collectl/util/client.pl addr-of-server which tells
+client.pl to open a socket to that copy of collectl. It should fail just like when it was
+run via colmux, so try opening the firewall and try it again. If it fixes the problem,
+it was indeed the firewall blocking things and colmux should now work just fine.
+
+Sometimes there are multiple interfaces defined on the machine hosting colmux and in some
+cases only some addresses will allow socket connections. Again, using client.pl on the remote
+machine, try connecting back to collectl over different addresses and when you find one that
+works, tell colmux to use that address for communication via the -retaddr switch.
.SH AUTHOR
This program was written by Mark Seger (mark.seger@hp.com).
.br
-Copyright 2014 Hewlett-Packard Development Company, L.P.
+Copyright 2015 Hewlett-Packard Development Company, L.P.
.SH SEE ALSO
http://collectl-utils.sourceforge.net/colmux.html
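The -timerange check documented in the colmux.1 hunk above amounts to comparing the spread of node clocks against a threshold. A minimal Python sketch of that idea follows. It is not colmux's code: clock_range_warning, its arguments, and the warning text are all hypothetical, and node times are assumed to be epoch seconds already gathered from each host.

```python
def clock_range_warning(node_times, timerange):
    """Return a warning string if node clocks span more than timerange seconds."""
    if not node_times:
        return None
    # The range is the gap between the fastest and slowest clock seen.
    spread = max(node_times.values()) - min(node_times.values())
    if spread > timerange:
        return f"time range across nodes is {spread}s, exceeds {timerange}s"
    return None
```

Raising the threshold silences the warning without fixing the skew, which is why the man page recommends that only when slow responses to the date command, rather than real clock differences, explain the spread.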
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/collectl-4.0.2/service/collectl.service new/collectl-4.0.4/service/collectl.service
--- old/collectl-4.0.2/service/collectl.service 1970-01-01 01:00:00.000000000 +0100
+++ new/collectl-4.0.4/service/collectl.service 2016-01-29 15:38:21.000000000 +0100
@@ -0,0 +1,11 @@
+[Unit]
+Description=collectl metric collection
+
+[Service]
+Type=forking
+PIDFile=/var/run/collectl.pid
+ExecStart=/usr/bin/collectl -D
+ExecReload=/bin/kill -USR1 $MAINPID
+
+[Install]
+WantedBy=multi-user.target