On Sunday 15 September 2024, Andrei Borzenkov wrote:
> On 15.09.2024 03:10, Michael Hamilton wrote:
>> Thanks for the feedback - it inspired me to update the gist. I've added a -g (--group) option that groups packages that have the same change-log text.
>
> Just check the source rpm. Same changelog means they are built from the same SRPM.
>
>     rpm --qf '%{sourcerpm}'
Thanks, good suggestion. I experimentally modified it to do:

    rpm -q --qf '%{sourcerpm}\n' --changelog

Grouping by sourcerpm didn't speed up the script, so a smaller key wasn't a win. I still have to accumulate the text, so there's no win on RAM either. I could refrain from retrieving the changelog until the final pass, but that doubles the number of rpm execs. I could potentially use the non-standard DiskDict module to keep the dictionary on disk, though that would probably be slower.

Grouping by sourcerpm could be argued to be more accurate, because unrelated rpms will not be grouped together just because they have the same change-log text. On the other hand, if different sourcerpms do share the same change-log text, the output for those rpms will be duplicated, so some output compression may be lost.

It does add a smidgen of complexity, but it feels tidier. I'll leave it to simmer and see what other suggestions/opinions come in. If it begins to be used widely, I suspect I should do something to reduce the number of execs of rpm; that may be a big win on speed (an xargs kind of approach).

Out of curiosity I tried python3.13t with and without the JIT; it didn't seem to make a noticeable difference.

Michael
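
For illustration only, here is a minimal sketch of the "xargs kind of approach" combined with grouping by sourcerpm. It is not the code from the gist. The function names group_by_sourcerpm and query_packages are hypothetical, and it assumes a tab-separated name/sourcerpm query format. The idea is to pass many package names to a single rpm exec, group the results by source RPM, and then fetch the changelog only once per group.

```python
import subprocess
from collections import defaultdict


def group_by_sourcerpm(query_output):
    """Group package names by source RPM.

    Expects one "name<TAB>sourcerpm" record per line, which is the
    format requested by query_packages() below (an assumed format,
    chosen for this sketch).
    """
    groups = defaultdict(list)
    for line in query_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, sourcerpm = line.partition("\t")
        groups[sourcerpm].append(name)
    return dict(groups)


def query_packages(package_names):
    """Query all the given packages with a single exec of rpm.

    Hypothetical helper: raises CalledProcessError if rpm exits
    non-zero, for example when a package is not installed.
    """
    result = subprocess.run(
        ["rpm", "-q", "--qf", r"%{name}\t%{sourcerpm}\n", *package_names],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Usage would be along the lines of group_by_sourcerpm(query_packages(names)), followed by one changelog query for a single representative of each group. A very long package list might still need to be split into chunks to stay under the command-line length limit, which is what xargs does automatically.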