Hello community,
here is the log from the commit of package pg_comparator for openSUSE:Factory checked in at 2014-08-20 10:51:16
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/pg_comparator (Old)
and /work/SRC/openSUSE:Factory/.pg_comparator.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "pg_comparator"
Changes:
--------
--- /work/SRC/openSUSE:Factory/pg_comparator/pg_comparator.changes 2014-04-30 15:10:14.000000000 +0200
+++ /work/SRC/openSUSE:Factory/.pg_comparator.new/pg_comparator.changes 2014-08-20 10:51:48.000000000 +0200
@@ -1,0 +2,6 @@
+Tue Aug 5 07:17:07 UTC 2014 - aldemir.akpinar@gmail.com
+
+- New Version 2.2.5
+- Minor Update & Bug fixes
+
+-------------------------------------------------------------------
Old:
----
pg_comparator-2.2.3.tgz
New:
----
pg_comparator-2.2.5.tgz
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ pg_comparator.spec ++++++
--- /var/tmp/diff_new_pack.SsfvsH/_old 2014-08-20 10:51:50.000000000 +0200
+++ /var/tmp/diff_new_pack.SsfvsH/_new 2014-08-20 10:51:50.000000000 +0200
@@ -22,14 +22,14 @@
%define pgdocdir %{expand:%%(/usr/bin/pg_config --docdir)}
Name: pg_comparator
-Version: 2.2.3
+Version: 2.2.5
Release: 1%{?dist}
Summary: A tool to compare and sync tables in different locations
License: BSD-3-Clause
Group: Development/Libraries
Url: http://pgfoundry.org/projects/pg-comparator/
-Source: http://pgfoundry.org/frs/download.php/3643/pg_comparator-%{version}.tgz
+Source: http://pgfoundry.org/frs/download.php/3661/pg_comparator-%{version}.tgz
BuildRoot: %{_tmppath}/%{name}-%{version}-build
++++++ pg_comparator-2.2.3.tgz -> pg_comparator-2.2.5.tgz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pg_comparator-2.2.3/pg_comparator new/pg_comparator-2.2.5/pg_comparator
--- old/pg_comparator-2.2.3/pg_comparator 2014-04-19 17:11:43.000000000 +0200
+++ new/pg_comparator-2.2.5/pg_comparator 2014-07-24 10:21:30.000000000 +0200
@@ -1,6 +1,6 @@
#!/usr/bin/perl
#
-# $Id: pg_comparator.pl 1494 2014-04-19 15:11:43Z coelho $
+# $Id: pg_comparator.pl 1512 2014-07-24 08:21:31Z coelho $
#
# HELP 1: pg_comparator --man
# HELP 2: pod2text pg_comparator
@@ -191,7 +191,7 @@
Maximum relative search effort. The search is stopped if the number of results
is above this threshold expressed relatively to the table size.
-Use 2.0 for no limit (all tuples were deleted and new one are inserted).
+Use 2.0 for no limit (all tuples were deleted and new ones are inserted).
Default is B<0.1>, i.e. an overall 10% difference is allowed before giving up.
@@ -203,7 +203,7 @@
the table size is known.
Default is to compute the maximum number of reported differences based on
-the C<--max-ratio> option.
+the C<--max-ratio> option, with a mimimum of 100 differences allowed.
=item C<--max-levels=0>
@@ -227,6 +227,11 @@
Show option summary.
+=item C<--pg-copy=128>
+
+Experimental option to use PostgreSQL's COPY instead of INSERT/UPDATE
+when synchronizing, by chunks of the specified size.
+
=item C<--prefix='pgc_cmp'>
Name prefix, possibly schema qualified, used for generated comparison tables
@@ -282,8 +287,8 @@
=item C<--synchronize> or C<-S>
Actually perform operations to synchronize the second table wrt the first.
-Well, not really. It is only done if you add C<--do-it> or C<-D>.
-Save your data before attempting anything like that!
+Well, not really, it is only a dry run. It is actually done if you add
+C<--do-it> or C<-D>. Save your data before attempting anything like that!
Default is not to synchronize.
@@ -394,6 +399,7 @@
Note that some default value used by DBI drivers may be changed with
driver-specific environment variables, and that DBI also provides its own
defaults and overrides, so what actually happens may not always be clear.
+Default values for the second URL are mostly taken from the first URL.
=over 4
@@ -1024,11 +1030,12 @@
key columns, number of value columns, aggregate function, checksum function,
null handling, folding factor, table locking or not...).
-=item I<feature> - about 5 minutes & 168 or 474 runs
+=item I<feature> - about 5 minutes & 171 or 477 runs
Test various features:
I<cc> for checksum computation strategies,
I<auto> for trigger-maintained checksums on PostgreSQL,
+I<pgcopy> for PostgreSQL copy test,
I<empty> for corner cases with empty tables,
I<quote> for table quoting,
I<engine> for InnoDB vs MyISAM MySQL backends,
@@ -1038,7 +1045,7 @@
I<mylite> for SQLite/MySQL mixed mode with some restrictions,
I<pglite> for SQLite/PostgreSQL mixed mode with some restrictions.
-=item I<release> - about 20 minutes & 938 runs
+=item I<release> - about 20 minutes & 944 runs
This is I<feature> with two table sizes, I<fast>, and I<collisions>
to test possible hash collisions.
@@ -1091,6 +1098,11 @@
Mixed SQLite vs PostgreSQL or MySQL table comparison may not work properly in
all cases, because of SQLite dynamic type handling and reduced capabilities.
+The script creates (temporary) tables on both sides for comparing the target
+tables: this imply that you must be allowed to do that for the comparison...
+However, read-only replicas do not allow creating objects, which mean that you
+cannot use pg_comparator to compare table contents on a synchronized replica.
+
=head1 VERSIONS
See Lhttp://pgfoundry.org/projects/pg-comparator/> for the latest
@@ -1098,6 +1110,27 @@
=over 4
+=item B (r1512 on 2014-07-24)
+
+Fix broken URL defaults to use UNIX sockets with an empty host name,
+per report by I<Ivan Mincik>.
+Fix C<--where> condition handling with C<--pg-copy> in corner cases.
+Do not take execution timestamps when not required.
+Allow a larger number of differences by default for small table comparisons.
+Add more sanity checks.
+Improve some error messages.
+The I<release> validation was run successfully
+on PostgreSQL 9.4b1 and MySQL 5.5.38.
+
+=item B (r1506 on 2014-07-13)
+
+Add experimental support for using COPY instead of INSERT/UPDATE for PostgreSQL,
+in chunks of size specified with option C<--pg-copy>,
+as suggested by I<Graeme Bell>.
+Minor fix when computing the maximum number of differences to report.
+The I<release> validation was run successfully
+on PostgreSQL 9.4b1 and MySQL 5.5.37.
+
=item B (r1494 on 2014-04-19)
Improved documentation.
@@ -1360,8 +1393,8 @@
=cut
-my $script_version = '2.2.3 (r1494)';
-my $revision = '$Revision: 1494 $';
+my $script_version = '2.2.5 (r1512)';
+my $revision = '$Revision: 1512 $';
$revision =~ tr/0-9//cd;
################################################################# SOME DEFAULTS
@@ -1375,7 +1408,7 @@
my ($factor, $expect_warn) = (7, 0);
# condition, tests, max size of blobs, data sources...
my ($expect, $longreadlen, $source1, $source2, $key_cs, $tup_cs, $do_lock,
- $env_pass, $max_report, $stats);
+ $env_pass, $max_report, $stats, $pg_copy);
# algorithm defaults
# hmmm... could rely on base64 to handle binary keys?
@@ -1894,16 +1927,14 @@
my ($db, $user, $pass, $host, $port, $base, $tabl, $keys, $cols);
# get driver name
- if ($c =~ /^(pg|my)(sql)?:\/\//) {
- $db = $1 . 'sql';
- }
- elsif ($c =~ /^(sqlite|firebird):\/\//) {
+ if ($c =~ /^(\w+):\/\//) {
$db = $1;
+ $db .= 'sql' if $db eq 'pg' or $db eq 'my'; # allow pg & my for pgsql & mysql
+ $c =~ s/^\w+:\/\///; # remove driver part
}
else {
- verb 2, "no driver found in URL: $c" if $debug;
+ verb 2, "no driver in URL: $c" if $debug;
}
- $c =~ s/^\w+:\/\///;
# split authority and path on first '/'
die "invalid connection string '$c', must contain '\/'\n"
@@ -1911,27 +1942,25 @@
my ($auth, $path) = ($1, $2);
- if ("$auth")
- {
+ if ("$auth") {
# parse authority if non empty. ??? url-translation?
die "invalid authority string '$auth'\n"
unless $auth =~ /^((\w+) # login
(:([^.]*) # :password
- )?\@)? # @
+ )?\@)? # @ => auth string is before
([^\@:\/]*) # host
(:(\d+))?$ # :port
/x;
$user=$2 if defined $1;
$pass=$4 if defined $3;
- $host=$5; # may be empty, but must be defined!
+ $host=$5; # may be empty, but is always defined *if* there is a non empty auth
$port=$7 if defined $6;
verb 3, "user=$user pass=$pass host=$host port=" . defined $port? $port: '?'
if $debug;
}
- if ("$path")
- {
+ if ("$path") {
my $kc_str;
if (defined $db and ($db eq 'sqlite' or $db eq 'firebird')) {
@@ -1939,7 +1968,7 @@
# if so, the last "/" is mandatory to mark the table name
die "invalid path string '$path'\n"
unless $path =~ /
- ^((.*) # base file path
+ ^((.*) # base file (longest) path
\/(\w+|\"[^\"+]\")?)? # table
(\?(.+))? # key,part:column,list...
/x;
@@ -2844,7 +2873,8 @@
"stats-name=s" => \$name, # name of test
# misc
"long-read-len|lrl|L=i" => \$longreadlen,
- "version|V" => sub { print "$0 version is $script_version\n"; exit 0; }
+ "version|V" => sub { print "$0 version is $script_version\n"; exit 0; },
+ "pg-copy:i" => \$pg_copy
) or die "$! (try $0 --help)";
# propagate expect specification
@@ -2878,6 +2908,9 @@
$factor = 1 if $factor<1;
$factor = 30 if $factor>30;
+# use pg_copy if possible, currently for inserts
+$pg_copy = 128 if defined $pg_copy and ($pg_copy eq '' or $pg_copy eq '0');
+
# intermediate table names
# what about putting the table name as well?
my ($name1, $name2) = ("${prefix}_1_", "${prefix}_2_");
@@ -2895,10 +2928,11 @@
# set defaults and check minimum definitions.
$db1 = 'pgsql' unless defined $db1;
$u1 = $ENV{USER} unless defined $u1;
-$h1 = 'localhost' unless defined $h1;
+$h1 = '' unless defined $h1; # defaults to Unix socket
$p1 = $M{$db1}{port} if not defined $p1 and exists $M{$db1}{port};
+# k/c defaults set later
-# these are necessary
+# these are obviously necessary:-)
die "no base on first connection" unless defined $b1 or defined $source1;
die "no table on first connection" unless defined $t1 or defined $source1;
@@ -2909,13 +2943,16 @@
if defined $db2 and $db2 eq 'sqlite' and
(defined $u2 or defined $h2 or defined $p2);
-# fix some default values for connection 2
+# default values for connection 2 is mostly to reuse from connection 1
$db2 = $db1 unless defined $db2;
$u2 = $u1 unless defined $u2;
-$h2 = 'localhost' unless defined $h2;
-$p2 = $M{$db2}{port} if not defined $p2 and exists $M{$db2}{port};
+$h2 = $h1 unless defined $h2;
+# same as fist iff same driver, or driver default
+$p2 = ($db2 eq $db1)? $p1: $M{$db2}{port}
+ unless defined $p2 and exists $M{$db2}{port};
$b2 = $b1 unless defined $b2;
$t2 = $t1 unless defined $t2;
+# k/c defaults set later
die "null should be 'text' or 'hash', got $null"
unless $null =~ /^(text|hash)$/i;
@@ -2930,13 +2967,11 @@
unless $agg =~ /^(xor|sum)$/i;
# database connection...
-if (defined $env_pass and not defined $w1)
-{
+if (defined $env_pass and not defined $w1) {
$w1 = $ENV{"${env_pass}1"};
$w1 = $ENV{$env_pass} unless defined $w1;
}
-if ($ask_pass and not defined $w1)
-{
+if ($ask_pass and not defined $w1) {
require Term::ReadPassword;
$w1 = Term::ReadPassword::read_password('connection 1 password> ');
}
@@ -2947,21 +2982,29 @@
$w2 = $w1 unless $w2 or not $w1 or $u1 ne $u2 or $h1 ne $h2 or $p1 ne $p2;
-if (defined $env_pass and not defined $w2)
-{
+if (defined $env_pass and not defined $w2) {
$w2 = $ENV{"${env_pass}2"};
$w2 = $ENV{$env_pass} unless defined $w2;
}
-if ($ask_pass and not defined $w2)
-{
+if ($ask_pass and not defined $w2) {
require Term::ReadPassword;
$w2 = Term::ReadPassword::read_password('connection 2 password> ');
}
-# some sanity checks, that are skipped under debugging so as to test
+# some sanity checks
+die "sorry, --pg-copy option requires connections to postgresql"
+ if defined $pg_copy and ($db1 ne 'pgsql' or $db2 ne 'pgsql');
+
+die "--pg_copy must be strictly positive, got '$pg_copy'"
+ if defined $pg_copy and $pg_copy <= 0;
+
+# sanity check skipped under debugging so as to test
die "sorry, threading does not seem to work with PostgreSQL driver"
if not $debug and $threads and ($db1 eq 'pgsql' or $db2 eq 'pgsql');
+die "sorry, --pg-copy currently requires --no-async"
+ if not $debug and defined $pg_copy and $async;
+
# fix some settings for SQLite
if (not $debug and ($db1 eq 'sqlite' or $db2 eq 'sqlite'))
{
@@ -3149,25 +3192,27 @@
!$synchronize);
}
-# set defaults...
-if (not defined $k1)
-{
+# get/set k/c defaults once connected
+if (not defined $k1) {
$k1 = [get_table_pkey($dbh1, $db1, $b1, $t1)];
warn "default key & attribute on first connection but not on second..."
if defined $k2;
+ die "no primary key found on first connection table $t1" unless @$k1;
}
-if (not defined $c1)
-{
+if (not defined $c1) {
$c1 = [get_table_attributes($dbh1, $db1, $b1, $t1, @$k1)];
# warn, as this may lead to unexpected results...
warn "default attributes on first connection but not on second..."
if defined $c2;
}
+# fix second connection default
$k2 = $k1 unless defined $k2;
$c2 = $c1 unless defined $c2;
# some sanity checks
+die "empty key on first connection, must specify one" unless @$k1;
+die "empty key on second connection, must specify one" unless @$k2;
die "key number of attributes does not match" unless @$k1 == @$k2;
die "column number of attributes does not match" unless @$c1 == @$c2;
@@ -3282,11 +3327,16 @@
$size = $count1>$count2? $count1: $count2; # MAX size of both tables
# stop at this number of differences
-$max_report = $max_ratio * $size unless defined $max_report;
+if (not (defined $max_report or $expect_warn and defined $expect)) {
+ $max_report = int($max_ratio * $size);
+ # bee cool with small stuff...
+ $max_report = 100 if $max_report < 100;
+}
# can we already stop now?
my $min_diff = abs($count2-$count1);
-die "too many differences, at least $min_diff > $max_report"
+die "too many differences, at least $min_diff > $max_report, " .
+ "consider raising --max-ratio or --max-report"
if defined $max_report and $min_diff>$max_report;
# compute initial "full" masks which must be larger than size
@@ -3461,82 +3511,109 @@
my $where_k2 = is_equal($dbh2, $dhpbt2, $db2, $k2);
my $set_c2 = (join '=?, ', @$c2) . '=?';
- # delete rows
- if (@$del or @$delb)
+ # DELETE rows, including updates with copy
+ if (@$del or @$delb or ($pg_copy and @$upt))
{
my $del_sql = "DELETE FROM $t2 WHERE " .
($where? "($where) AND ": '') . $where_k2;
verb 2, $del_sql;
my $del_sth = $dbh2->prepare($del_sql) if $do_it;
- for my $d (@$del, @$delb) {
+ for my $d (@$del, @$delb, $pg_copy? @$upt: ()) {
sth_param_exec($do_it, "DELETE $t2", $del_sth, $d);
}
# undef $del_sth;
}
- # get values for insert or update
- my ($val_sql, $val_sth);
- if ($c1 and @$c1)
- {
- $val_sql = "SELECT " . join(',', @$c1) . " FROM $t1 WHERE " .
- ($where? "($where) AND ": '') . $where_k1;
- verb 2, $val_sql;
- $val_sth = $dbh1->prepare($val_sql)
- if @$ins or @$insb or @$upt;
+ # insert/update rows
+ # note: I could skip fetching if there is no data column
+ if ($pg_copy and (@$ins or @$upt or defined $insb)) { # use COPY
+ sql_do($dbh2, $db2, "COPY $t2(" . join(',', @$k2, @$c2) . ") FROM STDIN");
+ #async_wait($dbh2, $db2, 'copy from 2') if $async;
+ my $select = "SELECT " . join(',', @$k1, @$c1) . " FROM $t1 WHERE ";
+ $select .= "($where) AND " if $where;
+ $select .= "(" . join(',', @$k1) . ") IN (";
+ # we COPY both inserts and updates
+ my @allins = (@$ins, @$insb, @$upt);
+ while (@allins) {
+ my $bulk = '';
+ for my $k (splice(@allins, 0, $pg_copy)) { # chunked
+ $bulk .= ',' if $bulk;
+ #$copy_bulk .= $dbh1->quote($k);
+ $bulk .= "(@$k)";
+ $query_data++;
+ }
+ sql_do($dbh1, $db1, "COPY ($select$bulk)) TO STDOUT");
+ #async_wait($dbh1, $db1, 'copy to 1') if $async;
+ my $row = '';
+ while (($dbh1->pg_getcopydata($row)) != -1) {
+ $dbh2->pg_putcopydata($row) if $do_it;
+ }
+ }
+ $dbh2->pg_putcopyend();
}
+ else { # use generic INSERT/UPDATE
- # insert rows
- if (@$ins or @$insb)
- {
- my $ins_sql = "INSERT INTO $t2(" .
- (@$c2? join(',', @$c2) . ',': '') . join(',', @$k2) . ') ' .
- 'VALUES(?' . ',?' x (@$k2+@$c2-1) . ')';
- verb 2, $ins_sql;
- my $ins_sth = $dbh2->prepare($ins_sql) if $do_it;
- for my $i (@$ins, @$insb)
+ # get values for insert or update
+ my ($val_sql, $val_sth);
+ if ($c1 and @$c1)
{
- $query_data++;
- my @c1values = ();
- # query the other column values for key $i
- if ($c1 and @$c1)
- {
- sth_param_exec(1, "SELECT $t1", $val_sth, $i);
- @c1values = $val_sth->fetchrow_array();
- # hmmm... may be raised on blobs?
- die "unexpected values fetched for insert"
- unless @c1values and @c1values == @$c1;
+ $val_sql = "SELECT " . join(',', @$c1) . " FROM $t1 WHERE " .
+ ($where? "($where) AND ": '') . $where_k1;
+ verb 2, $val_sql;
+ $val_sth = $dbh1->prepare($val_sql)
+ if @$ins or @$insb or @$upt;
+ }
- &{$M{$db1}{close_cursor}}($val_sth) if exists $M{$db1}{close_cursor};
+ # handle inserts
+ if (@$ins or @$insb)
+ {
+ my $ins_sql = "INSERT INTO $t2(" . join(',', @$c2, @$k2) . ") " .
+ 'VALUES(?' . ',?' x (@$k2+@$c2-1) . ')';
+ verb 2, $ins_sql;
+ my $ins_sth = $dbh2->prepare($ins_sql) if $do_it;
+ for my $i (@$ins, @$insb) {
+ $query_data++;
+ my @c1values = ();
+ # query the other column values for key $i
+ if ($c1 and @$c1) {
+ sth_param_exec(1, "SELECT $t1", $val_sth, $i);
+ @c1values = $val_sth->fetchrow_array();
+ # hmmm... may be raised on blobs?
+ die "unexpected values fetched for insert"
+ unless @c1values and @c1values == @$c1;
+
+ &{$M{$db1}{close_cursor}}($val_sth) if exists $M{$db1}{close_cursor};
+ }
+ # then insert the missing tuple
+ sth_param_exec($do_it, "INSERT $t2", $ins_sth, $i, @c1values);
}
- # then insert the missing tuple
- sth_param_exec($do_it, "INSERT $t2", $ins_sth, $i, @c1values);
+ # $ins_sth
}
- # $ins_sth
- }
- # update rows
- if (@$upt)
- {
- die "there must be some columns to update" unless $c1;
- my $upt_sql = "UPDATE $t2 SET $set_c2 WHERE " .
- ($where? "($where) AND ": '') . $where_k2;
- verb 2, $upt_sql;
- my $upt_sth = $dbh2->prepare($upt_sql) if $do_it;
- for my $u (@$upt)
+ # handle updates
+ if (@$upt)
{
- $query_data++;
- # get value for key $u
- sth_param_exec(1, "SELECT $t1", $val_sth, $u);
- my @c1values = $val_sth->fetchrow_array();
- # hmmm... may be raised on blobs?
- die "unexpected values fetched for update"
+ die "there must be some columns to update" unless $c1;
+ my $upt_sql = "UPDATE $t2 SET $set_c2 WHERE " .
+ ($where? "($where) AND ": '') . $where_k2;
+ verb 2, $upt_sql;
+ my $upt_sth = $dbh2->prepare($upt_sql) if $do_it;
+ for my $u (@$upt)
+ {
+ $query_data++;
+ # get value for key $u
+ sth_param_exec(1, "SELECT $t1", $val_sth, $u);
+ my @c1values = $val_sth->fetchrow_array();
+ # hmmm... may be raised on blobs?
+ die "unexpected values fetched for update"
unless @c1values and @c1values == @$c1;
- # use it to update the other table
- sth_param_exec($do_it, "UPDATE $t2", $upt_sth, $u, @c1values);
+ # use it to update the other table
+ sth_param_exec($do_it, "UPDATE $t2", $upt_sth, $u, @c1values);
- &{$M{$db1}{close_cursor}}($val_sth) if exists $M{$db1}{close_cursor};
+ &{$M{$db1}{close_cursor}}($val_sth) if exists $M{$db1}{close_cursor};
+ }
+ # $upt_sth
}
- # $upt_sth
}
# close synchronization transaction if any
@@ -3557,7 +3634,7 @@
unless $do_it;
}
-$tsyn = [gettimeofday];
+$tsyn = [gettimeofday] if $stats;
if ($clear)
{
@@ -3580,7 +3657,7 @@
verb 4, "clearing done."
}
-$tclr = [gettimeofday];
+$tclr = [gettimeofday] if $stats;
# recreate database handler for the end...
dbh_materialize($dbh1, $db1);
@@ -3607,7 +3684,7 @@
}
# final timestamp
-$tend = [gettimeofday];
+$tend = [gettimeofday] if $stats;
# some stats are collected out of time measures
if ($stats)
@@ -3644,6 +3721,7 @@
# build options as a bit vector
my $options =
+ (($pg_copy?1:0) << 11) | # --pg-copy=...
(($tup_cs?1:0) << 10) | # --tuple-checksum=...
(($key_cs?1:0) << 9) | # --key-checksum=...
($do_lock << 8) | # --lock
--
To unsubscribe, e-mail: opensuse-commit+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-commit+help@opensuse.org