From corporate_gadfly at hotmail.com Fri Jan 7 22:28:13 2005 From: corporate_gadfly at hotmail.com (Corporate Gadfly) Date: Fri, 07 Jan 2005 22:28:13 +0000 Subject: [cvsspam-devel] question about cvsspam-0.2.8 and addition/removal of binary files Message-ID: Hi, I am hoping someone will be able to expand on the following issue. Thanks for a wonderful product, by the way. I was running 0.2.8 and have since upgraded to 0.2.11 but I didn't see anything in the Changelog to suggest that my problem was addressed in the version upgrade. The problem that I experienced in 0.2.8 is related to removal/addition of binary files. (I have $no_removed_file_diff = true in my conf file, so I wouldn't know about removal but I think it is an identical situation). Whenever a binary file (for example a .pdf file) is added to the repository CVSspam provides the whole contents of the file as a diff instead of just providing the link. As you can see this is probably not the desired behavior. Has anyone else seen the same behavior? File in question: cvs status src/t2202a-flat-fill-03b.pdf =================================================================== File: t2202a-flat-fill-03b.pdf Status: Up-to-date Working revision: 1.2 Repository revision: 1.2 /www/REPO/sis/src/t2202a-flat-fill-03b.pdf,v Sticky Tag: (none) Sticky Date: (none) Sticky Options: -kb As you can see -kb is set. IMHO, when a binary file is added/removed, the correct behavior should be to override whatever was set for the $no_added_file_diff option. Thoughts? Thanks in advance. 
From dave at badgers-in-foil.co.uk Sat Jan 8 11:31:49 2005 From: dave at badgers-in-foil.co.uk (David Holroyd) Date: Sat, 8 Jan 2005 11:31:49 +0000 Subject: [cvsspam-devel] question about cvsspam-0.2.8 and addition/removal of binary files In-Reply-To: References: Message-ID: <20050108113149.GB4316@vhost.badgers-in-foil.co.uk> Hi there, On Fri, Jan 07, 2005 at 10:28:13PM +0000, Corporate Gadfly wrote: > I am hoping someone will be able to expand on the following issue. Thanks > for a wonderful product, by the way. > > I was running 0.2.8 and have since upgraded to 0.2.11 but I didn't see > anything in the Changelog to suggest that my problem was addressed in the > version upgrade. The problem that I experienced in 0.2.8 is related to > removal/addition of binary files. (I have $no_removed_file_diff = true in > my conf file, so I wouldn't know about removal but I think it is an > identical situation). Whenever a binary file (for example a .pdf file) is > added to the repository CVSspam provides the whole contents of the file as > a diff instead of just providing the link. As you can see this is probably > not the desired behavior. Has anyone else seen the same behavior? > > File in question: > > cvs status src/t2202a-flat-fill-03b.pdf > =================================================================== > File: t2202a-flat-fill-03b.pdf Status: Up-to-date > > Working revision: 1.2 > Repository revision: 1.2 /www/REPO/sis/src/t2202a-flat-fill-03b.pdf,v > Sticky Tag: (none) > Sticky Date: (none) > Sticky Options: -kb > > As you can see -kb is set. I would accept that CVSspam can do a better job in this case. The reason we fail to do the correct thing is that CVSspam simply invokes 'cvs diff' for every file changed. We then treat files as binary only when diff reports them as such. Some types of file (uncompressed PDFs being a prime example) fool the heuristic that diff uses, and therefore fool CVSspam too. 
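To illustrate the point about diff's heuristic, here is a rough sketch (not CVSspam's actual code) of reading the keyword-substitution mode out of `cvs status` output like that quoted above, instead of relying on diff alone. The method name `binary_by_sticky_options?` is hypothetical, and the sample text assumes the usual `cvs status` layout.

```ruby
# Hypothetical helper, not part of CVSspam: report whether `cvs status`
# output marks a file as binary via the -kb keyword-substitution mode.
def binary_by_sticky_options?(status_output)
  status_output.each_line do |line|
    # The line of interest looks like "   Sticky Options:   -kb"
    if line =~ /^\s*Sticky Options:\s*(\S+)/
      return $1 == "-kb"
    end
  end
  false # no "Sticky Options" line found: treat as not flagged binary
end

# Sample input modelled on the status output quoted in this thread.
sample = <<EOS
File: t2202a-flat-fill-03b.pdf   Status: Up-to-date

   Working revision:    1.2
   Repository revision: 1.2     /www/REPO/sis/src/t2202a-flat-fill-03b.pdf,v
   Sticky Tag:          (none)
   Sticky Date:         (none)
   Sticky Options:      -kb
EOS

puts binary_by_sticky_options?(sample)
```

A check like this would only be a hint alongside what diff reports; files added without -kb would still need the existing detection.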
It would be reasonable to use the presence of the -kb option as an additional hint to control the handling of the file. I will try to get around to implementing this at some point -- I've added it to the TODO list. Thanks for a great suggestion, dave -- http://david.holroyd.me.uk/ From asousa01 at tufts.edu Wed Jan 12 23:51:08 2005 From: asousa01 at tufts.edu (Alex Sousa) Date: Wed, 12 Jan 2005 18:51:08 -0500 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? Message-ID: <1105573868.9971.38.camel@cronus.phy.tufts.edu> Hi, I maintain a CVS repository for a small project and have been a very happy user of CVSspam for a few months now. As always, the project is growing and more people jumping aboard and I've been asked to include the identity of the user who commits a file or a mod, in the CVSspam messages. I went through the instructions and through the threads and tried adding: --from $USER to CVSROOT/loginfo as follows; ^NueAnalysis /home/cvs/CVSROOT/collect_diffs.rb --from $USER --to nuecvs@minos.phy.tufts.edu %{sVv} where nuecvs is a mailing list, defined in /etc/aliases, containing the users allowed to checkout and commit to the repository. I also have a CVSROOT/users file containing username:email_address entries for each of the users. What am I doing wrong? Thanks very much, Alex From pardinilist at pardini.net Thu Jan 13 02:18:09 2005 From: pardinilist at pardini.net (Ricardo Pardini) Date: Thu, 13 Jan 2005 00:18:09 -0200 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? In-Reply-To: <1105573868.9971.38.camel@cronus.phy.tufts.edu> References: <1105573868.9971.38.camel@cronus.phy.tufts.edu> Message-ID: <41E5DA61.5010503@pardini.net> I got that working using my mailer/MTA (exim). It 'fills in' unqualified local addresses by reading local GECOS field in /etc/passwd ("real" name of the user, among others) and the file /etc/email-addresses. I think you can do that with Postfix using a canonical map. 
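For illustration only, the sketch below resolves a committer's login to an address from `username:email_address` entries like those in the CVSROOT/users file described in the question. Nothing in this thread establishes that CVSspam reads that file; the method names, the sample logins, and the fallback domain are all assumptions.

```ruby
# Hypothetical helper: turn "username:email_address" lines into a Hash.
def parse_users(text)
  users = {}
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    name, address = line.split(":", 2)
    users[name] = address if address && !address.empty?
  end
  users
end

# Hypothetical helper: look up a login, falling back to login@default_domain
# (roughly what an MTA does when it qualifies a bare local address).
def sender_for(login, users, default_domain)
  users.fetch(login) { "#{login}@#{default_domain}" }
end

# Made-up entries for the example.
users = parse_users("alice:alice@example.org\nbob:bob@example.net\n")
puts sender_for("alice", users, "example.com")
puts sender_for("carol", users, "example.com")
```

The fallback branch mirrors the MTA-side approach mentioned above, where an unqualified local name is completed by the mailer rather than by the script.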
Alex Sousa wrote: >Hi, > >I maintain a CVS repository for a small project and have been a very >happy user of CVSspam for a few months now. As always, the project is >growing and more people jumping aboard and I've been asked to include >the identity of the user who commits a file or a mod, in the CVSspam >messages. > >I went through the instructions and through the threads and tried >adding: > > --from $USER > >to CVSROOT/loginfo as follows; > >^NueAnalysis /home/cvs/CVSROOT/collect_diffs.rb --from $USER --to >nuecvs@minos.phy.tufts.edu %{sVv} > >where nuecvs is a mailing list, defined in /etc/aliases, containing the >users allowed to checkout and commit to the repository. >I also have a CVSROOT/users file containing username:email_address >entries for each of the users. > > From dave at badgers-in-foil.co.uk Thu Jan 13 08:57:45 2005 From: dave at badgers-in-foil.co.uk (David Holroyd) Date: Thu, 13 Jan 2005 08:57:45 +0000 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? In-Reply-To: <1105573868.9971.38.camel@cronus.phy.tufts.edu> References: <1105573868.9971.38.camel@cronus.phy.tufts.edu> Message-ID: <20050113085745.GB7817@vhost.badgers-in-foil.co.uk> Hello there, On Wed, Jan 12, 2005 at 06:51:08PM -0500, Alex Sousa wrote: > I maintain a CVS repository for a small project and have been a very > happy user of CVSspam for a few months now. As always, the project is > growing and more people jumping aboard and I've been asked to include > the identity of the user who commits a file or a mod, in the CVSspam > messages. > > I went through the instructions and through the threads and tried > adding: > > --from $USER > > to CVSROOT/loginfo as follows; > > ^NueAnalysis /home/cvs/CVSROOT/collect_diffs.rb --from $USER --to > nuecvs@minos.phy.tufts.edu %{sVv} > > where nuecvs is a mailing list, defined in /etc/aliases, containing the > users allowed to checkout and commit to the repository. 
> I also have a CVSROOT/users file containing username:email_address > entries for each of the users. What user do emails appear to come from? Is it the user who owns the CVS server process? What method do you use to send emails? The default is to invoke a 'sendmail'-like program, but maybe you've configured CVSspam to connect directly to an SMTP server? dave -- http://david.holroyd.me.uk/ From dave at badgers-in-foil.co.uk Thu Jan 13 18:04:57 2005 From: dave at badgers-in-foil.co.uk (David Holroyd) Date: Thu, 13 Jan 2005 18:04:57 +0000 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? In-Reply-To: <1105634245.5202.11.camel@cronus.phy.tufts.edu> References: <1105573868.9971.38.camel@cronus.phy.tufts.edu> <20050113085745.GB7817@vhost.badgers-in-foil.co.uk> <1105634245.5202.11.camel@cronus.phy.tufts.edu> Message-ID: <20050113180457.GA15755@vhost.badgers-in-foil.co.uk> On Thu, Jan 13, 2005 at 11:37:24AM -0500, Alexandre Sousa wrote: > Hi Dave, > > > > I went through the instructions and through the threads and tried > > > adding: > > > > > > --from $USER > > > > > > to CVSROOT/loginfo as follows; > > > > > > ^NueAnalysis /home/cvs/CVSROOT/collect_diffs.rb --from $USER --to > > > nuecvs@minos.phy.tufts.edu %{sVv} > > > > > > where nuecvs is a mailing list, defined in /etc/aliases, containing the > > > users allowed to checkout and commit to the repository. > > > I also have a CVSROOT/users file containing username:email_address > > > entries for each of the users. > > > > What user do emails appear to come from? Is it the user who owns the > > CVS server process? > > Yes, the emails originate from the "cvs" user, sent to the "nuecvs" > mailing list. And yes, this is the user who owns the cvs pserver, which > the authorized users can only access through ssh (public key method). > > > What method do you use to send emails? 
The default is to invoke a > > 'sendmail'-like program, but maybe you've configured CVSspam to connect > > directly to an SMTP server? > > I'm using just the server's local sendmail. In fact, so far I left the > cvsspam.conf file untouched as the defaults seemed to work fine. Can you try a little test on your CVS server to see if this user is allowed to set the sender address: 1) switch to the cvs user (you need to be root, I assume), su - cvs 2) Try to send an email, setting a From address, (change your_address to an account where you collect mail), echo -e "From:nemo\nTo:your_address\n\nTest!" | /usr/sbin/sendmail -t What does the resulting email look like? dave -- http://david.holroyd.me.uk/ From asousa at minos.phy.tufts.edu Thu Jan 13 18:35:08 2005 From: asousa at minos.phy.tufts.edu (Alexandre Sousa) Date: Thu, 13 Jan 2005 13:35:08 -0500 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? In-Reply-To: <20050113180457.GA15755@vhost.badgers-in-foil.co.uk> References: <1105573868.9971.38.camel@cronus.phy.tufts.edu> <20050113085745.GB7817@vhost.badgers-in-foil.co.uk> <1105634245.5202.11.camel@cronus.phy.tufts.edu> <20050113180457.GA15755@vhost.badgers-in-foil.co.uk> Message-ID: <1105641308.5202.23.camel@cronus.phy.tufts.edu> Hi Dave, > > 1) switch to the cvs user (you need to be root, I assume), > > su - cvs > > > 2) Try to send an email, setting a From address, (change your_address to > an account where you collect mail), > > echo -e "From:nemo\nTo:your_address\n\nTest!" | /usr/sbin/sendmail -t Somehow the -e option did not quite work although it's supported in my "echo", so: echo "From:nemo\nTo:asousa@minos.phy.tufts.edu\n\nTest!" | /usr/sbin/sendmail -t produces: ============================================= From: nemo@minos.phy.tufts.edu To: asousa@minos.phy.tufts.edu Date: Thu, 13 Jan 2005 13:26:37 -0500 Test! ============================================= in my inbox. Looks normal, no? 
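Since `echo -e` behaves differently between shells, the same test can be sketched in Ruby, where the newlines in the message are unambiguous. This is a minimal sketch, assuming a sendmail-compatible binary at /usr/sbin/sendmail as used in this thread; the helper names are hypothetical and the actual send is left commented out.

```ruby
# Build a minimal message: header lines, a blank line, then the body.
def build_test_message(from, to, body)
  "From: #{from}\nTo: #{to}\n\n#{body}\n"
end

# Pipe the message to a sendmail-like program. With -t the recipients
# are taken from the message headers rather than the command line.
def send_test_message(message, sendmail = "/usr/sbin/sendmail")
  IO.popen([sendmail, "-t"], "w") { |mail| mail.write(message) }
end

message = build_test_message("nemo", "your_address", "Test!")
puts message
# send_test_message(message)  # run as the cvs user on the CVS server
```

As with the shell version, `your_address` needs replacing with a mailbox you can check.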
Alex From skoehler at upb.de Sat Jan 22 17:46:26 2005 From: skoehler at upb.de (=?ISO-8859-15?Q?Sven_K=F6hler?=) Date: Sat, 22 Jan 2005 18:46:26 +0100 Subject: [cvsspam-devel] [BUG] fileXX anchors are wrong Message-ID: <41F29172.4090901@upb.de> Hi, i'm using cvsspam 0.2.11 + the last patch you sent me (that fixed the unconfirmed fix). The link on the first filename committed ends with "#file1" but the first anchor actually present is "" which is obviously a wrong numbering starting at 2 instead of 1. Thx Sven From dave at badgers-in-foil.co.uk Sat Jan 22 23:19:53 2005 From: dave at badgers-in-foil.co.uk (David Holroyd) Date: Sat, 22 Jan 2005 23:19:53 +0000 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? In-Reply-To: <1105641308.5202.23.camel@cronus.phy.tufts.edu> References: <1105573868.9971.38.camel@cronus.phy.tufts.edu> <20050113085745.GB7817@vhost.badgers-in-foil.co.uk> <1105634245.5202.11.camel@cronus.phy.tufts.edu> <20050113180457.GA15755@vhost.badgers-in-foil.co.uk> <1105641308.5202.23.camel@cronus.phy.tufts.edu> Message-ID: <20050122231953.GA5195@vhost.badgers-in-foil.co.uk> --sm4nu43k4a2Rpi4c Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hello, sorry for the delayed reply, On Thu, Jan 13, 2005 at 01:35:08PM -0500, Alexandre Sousa wrote: > > 2) Try to send an email, setting a From address, (change your_address to > > an account where you collect mail), > > > > echo -e "From:nemo\nTo:your_address\n\nTest!" | /usr/sbin/sendmail -t > > Somehow the -e option did not quite work although it's supported in my > "echo", so: > > echo "From:nemo\nTo:asousa@minos.phy.tufts.edu\n\nTest!" | /usr/sbin/sendmail -t > > produces: > > ============================================= > From: nemo@minos.phy.tufts.edu > To: asousa@minos.phy.tufts.edu > Date: Thu, 13 Jan 2005 13:26:37 -0500 > > Test! > > ============================================= > > in my inbox. Looks normal, no? It does look normal. 
I can't think of a reason for the problem, so maybe you could try making the attached change to your copy of cvsspam.rb so that it displays the value we attempt to include in the message? You'll need to change the loginfo line that invokes collect_diffs.rb to add the --debug option. Then, see what output you get for a test commit. For instance, I see something like this: ... cvsspam.rb: invoking '/usr/sbin/sendmail -t -oi' cvsspam.rb: Mail From: ... dave -- http://david.holroyd.me.uk/ --sm4nu43k4a2Rpi4c Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="debug_sender-patch1.diff" Index: cvsspam.rb =================================================================== RCS file: /var/lib/cvs/cvsspam/cvsspam.rb,v retrieving revision 1.61 diff -u -r1.61 cvsspam.rb --- cvsspam.rb 9 Dec 2004 23:51:31 -0000 1.61 +++ cvsspam.rb 22 Jan 2005 23:06:40 -0000 @@ -1621,6 +1621,11 @@ IO.popen(cmd, "w") do |mail| ctx = MailContext.new(mail) ctx.header("To", recipients.join(',')) + if from + blah("Mail From: <#{from}>") + else + blah("Mail From not set") + end ctx.header("From", from) if from yield ctx end @@ -1660,6 +1665,7 @@ smtp.ready(from, recipients) do |mail| ctx = MailContext.new(IOAdapter.new(mail)) ctx.header("To", recipients.join(',')) + blah("Mail From: <#{from}>") ctx.header("From", from) if from ctx.header("Date", Time.now.utc.strftime(DATE_HEADER_FORMAT)) yield ctx --sm4nu43k4a2Rpi4c-- From dave at badgers-in-foil.co.uk Sun Jan 23 00:06:11 2005 From: dave at badgers-in-foil.co.uk (David Holroyd) Date: Sun, 23 Jan 2005 00:06:11 +0000 Subject: [cvsspam-devel] [BUG] fileXX anchors are wrong In-Reply-To: <41F29172.4090901@upb.de> References: <41F29172.4090901@upb.de> Message-ID: <20050123000608.GB5195@vhost.badgers-in-foil.co.uk> --Bn2rw/3z4jIqBvZU Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Sat, Jan 22, 2005 at 06:46:26PM +0100, Sven K?hler wrote: > Hi, > > i'm using cvsspam 0.2.11 + the last patch you sent me 
(that fixed the > unconfirmed fix). The link on the first filename committed ends with > "#file1" but the first anchor actually present is "" > which is obviously a wrong numbering starting at 2 instead of 1. Looks like I should have changed something when I re-ordered some of the HTML generating code. The attached patch should fix the bug, I think? dave -- http://david.holroyd.me.uk/ --Bn2rw/3z4jIqBvZU Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="anchor_numbering_oboe-patch1.diff" Index: cvsspam.rb =================================================================== RCS file: /var/lib/cvs/cvsspam/cvsspam.rb,v retrieving revision 1.61 diff -u -r1.61 cvsspam.rb --- cvsspam.rb 9 Dec 2004 23:51:31 -0000 1.61 +++ cvsspam.rb 22 Jan 2005 23:58:44 -0000 @@ -930,7 +930,7 @@ # start the diff output, using the given lines as the 'preamble' bit def start_output(*lines) - println("
") + println("
") case $file.type when "A" print("") @@ -1621,6 +1621,11 @@ IO.popen(cmd, "w") do |mail| ctx = MailContext.new(mail) ctx.header("To", recipients.join(',')) + if from + blah("Mail From: <#{from}>") + else + blah("Mail From not set") + end ctx.header("From", from) if from yield ctx end @@ -1660,6 +1665,7 @@ smtp.ready(from, recipients) do |mail| ctx = MailContext.new(IOAdapter.new(mail)) ctx.header("To", recipients.join(',')) + blah("Mail From: <#{from}>") ctx.header("From", from) if from ctx.header("Date", Time.now.utc.strftime(DATE_HEADER_FORMAT)) yield ctx --Bn2rw/3z4jIqBvZU-- From dave at badgers-in-foil.co.uk Sun Jan 23 00:10:48 2005 From: dave at badgers-in-foil.co.uk (David Holroyd) Date: Sun, 23 Jan 2005 00:10:48 +0000 Subject: [cvsspam-devel] [BUG] fileXX anchors are wrong In-Reply-To: <20050123000608.GB5195@vhost.badgers-in-foil.co.uk> References: <41F29172.4090901@upb.de> <20050123000608.GB5195@vhost.badgers-in-foil.co.uk> Message-ID: <20050123001045.GC5195@vhost.badgers-in-foil.co.uk> --5G06lTa6Jq83wMTw Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Second try, On Sun, Jan 23, 2005 at 12:06:11AM +0000, David Holroyd wrote: > On Sat, Jan 22, 2005 at 06:46:26PM +0100, Sven K?hler wrote: > > i'm using cvsspam 0.2.11 + the last patch you sent me (that fixed the > > unconfirmed fix). The link on the first filename committed ends with > > "#file1" but the first anchor actually present is "" > > which is obviously a wrong numbering starting at 2 instead of 1. > > Looks like I should have changed something when I re-ordered some of the > HTML generating code. The attached patch should fix the bug, I think? Oops, too many changes in that last patch. This one only changes the line in error... 
dave -- http://david.holroyd.me.uk/ --5G06lTa6Jq83wMTw Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="anchor_numbering_oboe-patch2.diff" Index: cvsspam.rb =================================================================== RCS file: /var/lib/cvs/cvsspam/cvsspam.rb,v retrieving revision 1.61 diff -u -r1.61 cvsspam.rb --- cvsspam.rb 9 Dec 2004 23:51:31 -0000 1.61 +++ cvsspam.rb 22 Jan 2005 23:58:44 -0000 @@ -930,7 +930,7 @@ # start the diff output, using the given lines as the 'preamble' bit def start_output(*lines) - println("
") + println("
") case $file.type when "A" print("") --5G06lTa6Jq83wMTw-- From asousa at minos.phy.tufts.edu Mon Jan 24 21:44:51 2005 From: asousa at minos.phy.tufts.edu (Alexandre Sousa) Date: Mon, 24 Jan 2005 16:44:51 -0500 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? In-Reply-To: <20050122231953.GA5195@vhost.badgers-in-foil.co.uk> References: <1105573868.9971.38.camel@cronus.phy.tufts.edu> <20050113085745.GB7817@vhost.badgers-in-foil.co.uk> <1105634245.5202.11.camel@cronus.phy.tufts.edu> <20050113180457.GA15755@vhost.badgers-in-foil.co.uk> <1105641308.5202.23.camel@cronus.phy.tufts.edu> <20050122231953.GA5195@vhost.badgers-in-foil.co.uk> Message-ID: <1106603092.27769.22.camel@cronus.phy.tufts.edu> Hi Dave, > You'll need to change the loginfo line that invokes collect_diffs.rb to > add the --debug option. Then, see what output you get for a test > commit. For instance, I see something like this: > > ... > cvsspam.rb: invoking '/usr/sbin/sendmail -t -oi' > cvsspam.rb: Mail From: > ... Following your suggestion, I modified loginfo thus: ^NueAnalysis /home/cvs/CVSROOT/collect_diffs.rb --from $USER --to nuecvs@minos.phy.tufts.edu %{sVv} --debug After a test commit I get: collect_diffs.rb: CVSROOT is /home/cvs collect_diffs.rb: ARGV is collect_diffs.rb: about to run cvs -nq diff -Nu -r1.6 -r1.7 test.txt collect_diffs.rb: sending spam. (I am /home/cvs/CVSROOT/collect_diffs.rb) cvsspam.rb: Using config '/home/cvs/CVSROOT/cvsspam.conf' cvsspam.rb: invoking '/usr/sbin/sendmail -t -oi' cvsspam.rb: leaving file /tmp/#cvsspam.3822.506-32314704/logfile.emailtmp collect_diffs.rb: leaving file /tmp/#cvsspam.3822.506-32314704/logfile So, --from is not being used at all. I'm guessing $USER must be empty or something. Do I need to change some sendmail config in order to $USER to be set? 
Thanks, Alex > > dave From dave at badgers-in-foil.co.uk Tue Jan 25 00:14:47 2005 From: dave at badgers-in-foil.co.uk (David Holroyd) Date: Tue, 25 Jan 2005 00:14:47 +0000 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? In-Reply-To: <1106603092.27769.22.camel@cronus.phy.tufts.edu> References: <1105573868.9971.38.camel@cronus.phy.tufts.edu> <20050113085745.GB7817@vhost.badgers-in-foil.co.uk> <1105634245.5202.11.camel@cronus.phy.tufts.edu> <20050113180457.GA15755@vhost.badgers-in-foil.co.uk> <1105641308.5202.23.camel@cronus.phy.tufts.edu> <20050122231953.GA5195@vhost.badgers-in-foil.co.uk> <1106603092.27769.22.camel@cronus.phy.tufts.edu> Message-ID: <20050125001446.GA14076@vhost.badgers-in-foil.co.uk> --2oS5YaxWCcQjTEyO Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Mon, Jan 24, 2005 at 04:44:51PM -0500, Alexandre Sousa wrote: > Following your suggestion, I modified loginfo thus: > > ^NueAnalysis /home/cvs/CVSROOT/collect_diffs.rb --from $USER --to > nuecvs@minos.phy.tufts.edu %{sVv} --debug > > > After a test commit I get: > > collect_diffs.rb: CVSROOT is /home/cvs > collect_diffs.rb: ARGV is > collect_diffs.rb: about to run cvs -nq diff -Nu -r1.6 -r1.7 test.txt > collect_diffs.rb: sending spam. (I > am /home/cvs/CVSROOT/collect_diffs.rb) > cvsspam.rb: Using config '/home/cvs/CVSROOT/cvsspam.conf' > cvsspam.rb: invoking '/usr/sbin/sendmail -t -oi' > cvsspam.rb: leaving > file /tmp/#cvsspam.3822.506-32314704/logfile.emailtmp > collect_diffs.rb: leaving file /tmp/#cvsspam.3822.506-32314704/logfile > > So, --from is not being used at all. I'm guessing $USER must be empty or > something. Do I need to change some sendmail config in order to $USER to > be set? I don't think that you also made the change to cvsspam.rb that I suggested. I've attached a modified version of cvsspam.rb that adds extra debug info. 
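For reference, the sender check added by the earlier debug patch can be reduced to the sketch below. `blah` follows the helper of the same name in cvsspam.rb; `sender_debug_line` is a hypothetical wrapper introduced only for this example. One detail worth keeping in mind when reading the output: an empty string is truthy in Ruby, so an empty sender value would still take the first branch and print `Mail From: <>`.

```ruby
$debug = true

# Same shape as the blah() helper in cvsspam.rb: prints only when $debug is set.
def blah(text)
  $stderr.puts("cvsspam.rb: #{text}") if $debug
end

# Hypothetical wrapper around the if/else from the debug patch.
def sender_debug_line(from)
  if from
    "Mail From: <#{from}>"
  else
    "Mail From not set"
  end
end

blah(sender_debug_line(nil))      # no --from value reached the script
blah(sender_debug_line("alice"))  # made-up login for the example
blah(sender_debug_line(""))       # empty value: still reported as set
```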
The $USER variable should be handled by CVS, and transformed into the username of the committing user before collect_diffs.rb gets invoked. With this changed file, I would expect to see, just after the 'invoking sendmail' line, either: cvsspam.rb: Mail From: or cvsspam.rb: Mail From not set If you have time, give it a go, and we can see if the problem lies with CVSspam or the MTA config. dave -- http://david.holroyd.me.uk/ --2oS5YaxWCcQjTEyO Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="cvsspam.rb" #!/usr/bin/ruby -w # Part of CVSspam # http://www.badgers-in-foil.co.uk/projects/cvsspam/ # Copyright (c) David Holroyd # collect_diffs.rb expects to find this script in the same directory as it # # TODO: exemplify syntax for 'cvs admin -m' when log message is missing # TODO: make max-line limit on diff output configurable # TODO: put more exact max size limit on whole email # TODO: support non-html mail too (text/plain, multipart/alternative) # If you want another 'todo keyword' (TODO & FIXME are highlighted by default) # you could add # $task_keywords << "KEYWORD" << "MAYBEANOTHERWORD" # to your cvsspam.conf $version = "0.2.11" $maxSubjectLength = 200 $maxLinesPerDiff = 1000 $maxDiffLineLength = 1000 # may be set to nil for no limit $charset = nil # nil implies 'don't specify a charset' $mailSubject = '' def blah(text) $stderr.puts("cvsspam.rb: #{text}") if $debug end def min(a, b) a0 && start.length+match.length>@right_margin io.puts(start) start = " " match.sub!(/^\s+/, "") # strip existing leading-whitespace end start << match end io.puts(start) end UNDERSCORE = chr("_") SPACE = chr(" ") TAB = chr("\t") # encode a header value according to the RFC-2047 quoted-printable spec, # allowing non-ASCII characters to appear in header values, and wrapping # long values with header continuation lines as needed def rfc2047_encode_quoted(io, start, rest) raise "no charset" if @charset.nil? code_begin = "=?#{@charset}?#{@encoding}?" 
start << code_begin rest.each_byte do |b| code = if b>126 || b==UNDERSCORE || b==TAB sprintf("=%02x", b) elsif b == SPACE "_" else b.chr end if start.length+code.length+2 > @right_margin io.puts(start + "?=") start = " " + code_begin end start << code end io.puts(start + "?=") end # test to see if the given string contains non-ASCII characters def requires_rfc2047?(word) (word =~ /[\177-\377]/) != nil end end # Provides access to the datafile previously created by collect_diffs.rb. # Each call to getLines() will return an object that will read lines of the # same 'type' (e.g. lines of commit log comment) from the file, and stop when # lines of a different type (e.g. line giving the next file's name) are # encountered. class LogReader def initialize(logIO) @io = logIO advance end def currentLineCode ; @line[1,1] end class ConstrainedIO def initialize(reader) @reader = reader @linecode = reader.currentLineCode end def each return if @reader == nil while true yield @reader.currentLine break unless @reader.advance && currentValid? end @reader = nil end def gets return nil if @reader == nil line = @reader.currentLine return nil if line==nil || !currentValid? @reader.advance return line end def currentValid? @linecode == @reader.currentLineCode end end def getLines ConstrainedIO.new(self) end def eof ; @line==nil end def advance @line = @io.gets return false if @line == nil unless @line[0,1] == "#" raise "#{$logfile}:#{@io.lineno} line did not begin with '#': #{@line}" end return true end def currentLine @line==nil ? nil : @line[3, @line.length-4] end end # returns a copy of the given string with instances of the HTML special # characters '&', '<' and '>' encoded as their HTML entity equivalents. def htmlEncode(text) text.gsub(/./) do case $& when "&" then "&amp;" when "<" then "&lt;" when ">" then "&gt;" else $& end end end # Encodes characters that would otherwise be special in a URL using the # "%XX" syntax (where XX are hex digits). 
# actually, allows '/' to appear def urlEncode(text) text.gsub(/[^a-zA-Z0-9\-,.*_\/]/) do "%#{sprintf('%02X', $&[0])}" end end # Represents a top-level directory under the $CVSROOT (which is properly called # a module -- this class is named incorrectly). Collects a list of # all #FileEntry objects that are 'in' this repository. Class methods provide # a list of all repositories (ick!) class Repository @@repositories = Hash.new def initialize(name) @name = name @common_prefix = nil @all_tags = Hash.new end # records that the given branch tag name was used for some file that was # committed to this repository. The argument nil is taken to signify the # MAIN branch, or 'trunk' of the project. def add_tag(tag_name) if @all_tags[tag_name] @all_tags[tag_name] += 1 else @all_tags[tag_name] = 1 end end # true, if #add_tag has been passed more than one distinct value def has_multiple_tags @all_tags.length > 1 end # iterate over the tags that have been recorded against this Repository def each_tag @all_tags.each_key do |tag| yield tag end end # true if the only tag that has been recorded against this repository was # the 'trunk', i.e. no branch tags at all def trunk_only? @all_tags.length==1 && @all_tags[nil]!=nil end # true if the files committed to this Repository have been of more than one # branch (not a common situation, I've only seen it in real life when things # are b0rked in someone's working directory). def mixed_tags? @all_tags.length>1 end # returns the number of tags seen during the commit to this Repository def tag_count @all_tags.length end # calculate the path prefix shared by all files committed to this # repository def merge_common_prefix(path) if @common_prefix == nil @common_prefix = path.dup else path = path.dup until @common_prefix == path if @common_prefix.size>path.size if @common_prefix.sub!(/(.*)\/.*$/, '\1').nil? raise "unable to merge '#{path}' in to '#{@common_prefix}': prefix totally different" end else if path.sub!(/(.*)\/.*$/, '\1').nil? 
raise "unable to merge '#{path}' in to '#{@common_prefix}': prefix totally different" end end end end end attr_reader :name, :common_prefix # gets the Repository object for the first component of the given path def Repository.get(name) name =~ /^[^\/]+/ name = $& rep = @@repositories[name] if rep.nil? rep = Repository.new(name) @@repositories[name] = rep end rep end # returns the total number of top-level directories seen during this commit def Repository.count @@repositories.size end # iterate over all the Repository objects created for this commit def Repository.each @@repositories.each_value do |rep| yield rep end end # returns an array of all the repository objects seen during this commit def Repository.array @@repositories.values end # get a string representation of the repository to appear in email subjects. # This will be the repository name, plus (possibly) the name of the branch # on which the commit occurred. If the commit was to multiple branches, the # text '..' is used, rather than a branch name def to_s if trunk_only? @name elsif mixed_tags? "#{@name}@.." 
else "#{@name}@#{@all_tags.keys[0]}" end end end # Records properties of a file that was changed during this commit class FileEntry def initialize(path) @path = path @lineAdditions = @lineRemovals = 0 @repository = Repository.get(path) @repository.merge_common_prefix(basedir()) @isEmpty = @isBinary = false @has_diff = nil end # the full path and filename within the repository attr_accessor :path # the type of change committed 'M'=modified, 'A'=added, 'R'=removed attr_accessor :type # records number of 'addition' lines in diff output, once counted attr_accessor :lineAdditions # records number of 'removal' lines in diff output, once counted attr_accessor :lineRemovals # records whether 'cvs diff' reported this as a binary file attr_accessor :isBinary # records if diff output (and therefore the added file) was empty attr_accessor :isEmpty # file version number before the commit attr_accessor :fromVer # file version number after the commit attr_accessor :toVer # works out the filename part of #path def file @path =~ /.*\/(.*)/ $1 end # set the branch on which this change was committed, and add it to the list # of branches for which we've seen commits (in the #Repository) def tag=(name) @tag = name @repository.add_tag(name) end # gives the branch on which this change was committed def tag @tag end # works out the directory part of #path def basedir @path =~ /(.*)\/.*/ $1 end # gives the Repository object this file was automatically associated with # on construction def repository @repository end # gets the part of #path that comes after the prefix common to all files # in the commit to #repository def name_after_common_prefix @path.slice(@repository.common_prefix.size+1,@path.size-@repository.common_prefix.size-1) end # was this file removed during the commit? def removal? @type == "R" end # was this file added during the commit? def addition? @type == "A" end # was this file simply modified during the commit? def modification? 
@type == "M" end # passing true, this object remembers that a diff will appear in the email, # passing false, this object remembers that no diff will appear in the email. # Once the value is set, it will not be changed def has_diff=(diff) # TODO: this 'if @has_diff.nil?' is counterintuitive; remove! @has_diff = diff if @has_diff.nil? end # true if this file has had a diff recorded def has_diff? @has_diff end # true only if this file's diff (if any) should be included in the email, # taking into account global diff-inclusion settings. def wants_diff_in_mail? !($no_diff || removal? && $no_removed_file_diff || addition? && $no_added_file_diff) end end # Superclass for things that eat lines of input, and turn them into output # for our email. The 'input' will be provided by #LogReader # Subclasses of LineConsumer will be registered in the global $handlers later # on in this file. class LineConsumer # passes each line from 'lines' to the consume() method (which must be # implemented by subclasses). def handleLines(lines, emailIO) @emailIO = emailIO @lineCount = 0 setup lines.each do |line| @lineCount += 1 consume(line) end teardown end # Template method called by handleLines to do any subclass-specific setup # required. Default implementation does nothing def setup end # Template method called by handleLines to do any subclass-specific cleanup # required. 
Default implementation does nothing def teardown end # Returns the number of lines handleLines() has seen so far def lineno @lineCount end # adds a line to the output def println(text) @emailIO.puts(text) end # adds a string to the current output line def print(text) @emailIO.print(text) end end # TODO: consolidate these into a nicer framework, mailSub = proc { |match| "#{match}" } urlSub = proc { |match| "#{match}" } bugzillaSub = proc { |match| match =~ /([0-9]+)/ "#{match}" } jiraSub = proc { |match| "#{match}" } ticketSub = proc { |match| match =~ /([0-9]+)/ "#{match}" } commentSubstitutions = { '(?:mailto:)?[\w\.\-\+\=]+\@[\w\-]+(?:\.[\w\-]+)+\b' => mailSub, '\b(?:http|https|ftp):[^ \t\n<>"]+[\w/]' => urlSub} # outputs commit log comment text supplied by LogReader as preformatted HTML class CommentHandler < LineConsumer def initialize @lastComment = nil end def setup @haveBlank = false @comment = "" end def consume(line) if line =~ /^\s*$/ @haveBlank = true else if @haveBlank @comment += "\n" @haveBlank = false end $mailSubject = line unless $mailSubject.length > 0 @comment += line += "\n" end end def teardown unless @comment == @lastComment println("
")
      encoded = htmlEncode(@comment)
      $commentEncoder.gsub!(encoded)
      println(encoded)
      println("
")
      @lastComment = @comment
    end
  end
end

# Handle lines from LogReader that represent the name of the branch tag for
# the next file in the log.  When files are committed to the trunk, the log
# will not contain a line specifying the branch tag name, and getLastTag
# will return nil.
class TagHandler < LineConsumer
  def initialize
    @tag = nil
  end

  def consume(line)
    # TODO: check there is only one line
    @tag = line
  end

  # returns the last tag name this object recorded, and resets the record, such
  # that a subsequent call to this method will return nil
  def getLastTag
    tmp = @tag
    @tag = nil
    tmp
  end
end

# records, from the log file, a line specifying the old and new revision numbers
# for the next file to appear in the log.  The values are recorded in the global
# variables $fromVer and $toVer
class VersionHandler < LineConsumer
  def consume(line)
    # TODO: check there is only one line
    $fromVer,$toVer = line.split(/,/)
  end
end

# Reads a line giving the path and name of the current file being considered
# from our log of all files changed in this commit.  Subclasses make different
# records depending on whether this commit adds, removes, or just modifies this
# file
class FileHandler < LineConsumer
  def setTagHandler(handler)
    @tagHandler = handler
  end

  def consume(line)
    $file = FileEntry.new(line)
    if $diff_output_limiter.choose_to_limit?
      $file.has_diff = false
    end
    $fileEntries << $file
    $file.tag = getTag
    handleFile($file)
  end

 protected

  def getTag
    @tagHandler.getLastTag
  end
end

# A do-nothing superclass for objects that know how to create hyperlinks to
# web CVS interfaces (e.g. CVSweb).  Subclasses override these methods to
# wrap HTML link tags around the text that this class's methods generate.
class NoFrontend
  # Just returns an HTML-encoded version of the 'path' argument.  Subclasses
  # should turn this into a link to a webpage view of this CVS directory
  def path(path, tag)
    htmlEncode(path)
  end

  # Just returns the value of the 'version' argument.
  # Subclasses should change this into a link to the given version of the file.
  def version(path, version)
    version
  end

  # Generates a little 'arrow' that subclasses may turn into links that will
  # give an alternative 'diff' view of a change.
  def diff(file)
    '->'
  end
end

# Superclass for objects that can link to CVS frontends on the web (ViewCVS,
# Chora, etc.).
class WebFrontend < NoFrontend

  attr_accessor :repository_name

  def initialize(base_url)
    @base_url = base_url
    @repository_name = nil
  end

  def path(path, tag)
    path_for_href = ""
    result = ""
    path.split("/").each do |component|
      unless result == ""
        result << "/"
        path_for_href << "/"
      end
      path_for_href << component
      # The link is split over two lines so that long paths don't create
      # huge HTML source-lines in the resulting email.  This is an attempt to
      # avoid having to produce a quoted-printable message (so that long lines
      # can be dealt with properly),
      result << "#{htmlEncode(component)}"
    end
    result
  end

  def version(path, version)
    "#{version}"
  end

  def diff(file)
    "#{super(file)}"
  end

 protected

  def add_repo(url)
    if @repository_name
      if url =~ /\?/
        "#{url}&cvsroot=#{urlEncode(@repository_name)}"
      else
        "#{url}?cvsroot=#{urlEncode(@repository_name)}"
      end
    else
      url
    end
  end
end

# Link to ViewCVS
class ViewCVSFrontend < WebFrontend
  def initialize(base_url)
    super(base_url)
  end

  def path_url(path, tag)
    if tag == nil
      add_repo(@base_url + urlEncode(path))
    else
      add_repo("#{@base_url}#{urlEncode(path)}?only_with_tag=#{urlEncode(tag)}")
    end
  end

  def version_url(path, version)
    add_repo("#{@base_url}#{urlEncode(path)}?rev=#{version}&content-type=text/vnd.viewcvs-markup")
  end

  def diff_url(file)
    add_repo("#{@base_url}#{urlEncode(file.path)}.diff?r1=#{file.fromVer}&r2=#{file.toVer}")
  end
end

# Link to Chora, from the Horde framework
class ChoraFrontend < WebFrontend
  def path_url(path, tag)
    # TODO: can we pass the tag somehow?
"#{@base_url}/cvs.php/#{urlEncode(path)}" end def version_url(path, version) "#{@base_url}/co.php/#{urlEncode(path)}?r=#{version}" end def diff_url(file) "#{@base_url}/diff.php/#{urlEncode(file.path)}?r1=#{file.fromVer}&r2=#{file.toVer}" end end # Link to CVSweb class CVSwebFrontend < WebFrontend def path_url(path, tag) if tag == nil add_repo(@base_url + urlEncode(path)) else add_repo("#{@base_url}#{urlEncode(path)}?only_with_tag=#{urlEncode(tag)}") end end def version_url(path, version) add_repo("#{@base_url}#{urlEncode(path)}?rev=#{version}&content-type=text/x-cvsweb-markup") end def diff_url(file) add_repo("#{@base_url}#{urlEncode(file.path)}.diff?r1=text&tr1=#{file.fromVer}&r2=text&tr2=#{file.toVer}&f=h") end end # in need of refactoring... # Note when LogReader finds record of a file that was added in this commit class AddedFileHandler < FileHandler def handleFile(file) file.type="A" file.toVer=$toVer end end # Note when LogReader finds record of a file that was removed in this commit class RemovedFileHandler < FileHandler def handleFile(file) file.type="R" file.fromVer=$fromVer end end # Note when LogReader finds record of a file that was modified in this commit class ModifiedFileHandler < FileHandler def handleFile(file) file.type="M" file.fromVer=$fromVer file.toVer=$toVer end end # Used by UnifiedDiffHandler to record the number of added and removed lines # appearing in a unidiff. class UnifiedDiffStats def initialize @diffLines=3 # the three initial lines in the unidiff end def diffLines @diffLines end def consume(line) @diffLines += 1 case line[0,1] when "+" then $file.lineAdditions += 1 when "-" then $file.lineRemovals += 1 end end end # TODO: change-within-line colourisation should really be comparing the # set of lines just removed with the set of lines just added, but # it currently considers just a single line # Used by UnifiedDiffHandler to produce an HTML, 'highlighted' version of # the input unidiff text. 
class UnifiedDiffColouriser < LineConsumer def initialize @currentState = "@" @currentStyle = "info" @lineJustDeleted = nil @lineJustDeletedSuperlong = false @truncatedLineCount = 0 end def output=(io) @emailIO = io end def consume(line) initial = line[0,1] superlong_line = false if $maxDiffLineLength && line.length > $maxDiffLineLength+1 line = line[0, $maxDiffLineLength+1] superlong_line = true @truncatedLineCount += 1 end if initial != @currentState prefixLen = 1 suffixLen = 0 if initial=="+" && @currentState=="-" && @lineJustDeleted!=nil # may be an edit, try to highlight the changes part of the line a = line[1,line.length-1] b = @lineJustDeleted[1,@lineJustDeleted.length-1] prefixLen = commonPrefixLength(a, b)+1 suffixLen = commonPrefixLength(a.reverse, b.reverse) # prevent prefix/suffux having overlap, suffixLen = min(suffixLen, min(line.length,@lineJustDeleted.length)-prefixLen) deleteInfixSize = @lineJustDeleted.length - (prefixLen+suffixLen) addInfixSize = line.length - (prefixLen+suffixLen) oversize_change = deleteInfixSize*100/@lineJustDeleted.length>33 || addInfixSize*100/line.length>33 if prefixLen==1 && suffixLen==0 || deleteInfixSize<=0 || oversize_change print(htmlEncode(@lineJustDeleted)) else print(htmlEncode(@lineJustDeleted[0,prefixLen])) print("") print(formatChange(@lineJustDeleted[prefixLen,deleteInfixSize])) print("") print(htmlEncode(@lineJustDeleted[@lineJustDeleted.length-suffixLen,suffixLen])) end if superlong_line println("[...]") else println("") end @lineJustDeleted = nil end if initial=="-" @lineJustDeleted=line @lineJustDeletedSuperlong = superlong_line shift(initial) # we'll print it next time (fingers crossed) return elsif @lineJustDeleted!=nil print(htmlEncode(@lineJustDeleted)) if @lineJustDeletedSuperlong println("[...]") else println("") end @lineJustDeleted = nil end shift(initial) if prefixLen==1 && suffixLen==0 || addInfixSize<=0 || oversize_change encoded = htmlEncode(line) else encoded = htmlEncode(line[0,prefixLen]) + "" 
+ formatChange(line[prefixLen,addInfixSize]) + "" + htmlEncode(line[line.length-suffixLen,suffixLen]) end else encoded = htmlEncode(line) end if initial=="-" unless @lineJustDeleted==nil print(htmlEncode(@lineJustDeleted)) if @lineJustDeletedSuperlong println("[...]") else println("") end @lineJustDeleted=nil end end if initial=="+" $task_keywords.each do |task| if line =~ /\b(#{task}\b.*)/ $task_list << $1 encoded.sub!(/\b#{task}\b/, "#{task}") encoded = "" + encoded break end end end print(encoded) if superlong_line println("[...]") else println("") end end def teardown unless @lineJustDeleted==nil print(htmlEncode(@lineJustDeleted)) if @lineJustDeletedSuperlong println("[...]") else println("") end @lineJustDeleted = nil end shift(nil) if @truncatedLineCount>0 println("[Note: Some over-long lines of diff output only partialy shown]") end end # start the diff output, using the given lines as the 'preamble' bit def start_output(*lines) println("
") case $file.type when "A" print("") print($frontend.path($file.basedir, $file.tag)) println("
") println("
#{htmlEncode($file.file)} added at #{$frontend.version($file.path,$file.toVer)}
") when "R" print("") print($frontend.path($file.basedir, $file.tag)) println("
") println("
#{htmlEncode($file.file)} removed after #{$frontend.version($file.path,$file.fromVer)}
") when "M" print("") print($frontend.path($file.basedir, $file.tag)) println("
") println("
#{htmlEncode($file.file)} #{$frontend.version($file.path,$file.fromVer)} #{$frontend.diff($file)} #{$frontend.version($file.path,$file.toVer)}
") end print("
")
    lines.each do |line|
      println(htmlEncode(line))
    end
  end

 private

  def formatChange(text)
    return '^M' if text=="\r"
    htmlEncode(text).gsub(/ /, '&nbsp;')
  end

  def shift(nextState)
    unless @currentState == nil
      if @currentStyle == "info"
        print("
") else print("") end @currentStyle = case nextState when "\\" then "info" # as in '\ No newline at end of file' when "@" then "info" when " " then "context" when "+" then "added" when "-" then "removed" end unless nextState == nil if @currentStyle=='info' print("
")
        else
          print("
")
        end
      end
    end
    @currentState = nextState
  end

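  # returns the number of leading characters that strings a and b share.
  # One-character slices are compared (rather than a[i] against b[i]) so that
  # the result is the same whether String#[] yields a byte or a string.
  def commonPrefixLength(a, b)
    length = 0
    while length < a.length && length < b.length && a[length, 1] == b[length, 1]
      length += 1
    end
    length
  end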
end


# Handle lines from LogReader that are the output from 'cvs diff -u' for the
# particular file under consideration
class UnifiedDiffHandler < LineConsumer
  def setup
    @stats = UnifiedDiffStats.new
    @colour = UnifiedDiffColouriser.new
    @colour.output = @emailIO
    @lookahead = nil
  end

  def consume(line)
    case lineno()
     when 1
      @diffline = line
     when 2
      @lookahead = line
     when 3
      if $file.wants_diff_in_mail?
        @colour.start_output(@diffline, @lookahead, line)
      end
     else
      @stats.consume(line)
      if $file.wants_diff_in_mail?
        if @stats.diffLines < $maxLinesPerDiff
          @colour.consume(line)
        elsif @stats.diffLines == $maxLinesPerDiff
          @colour.consume(line)
          @colour.teardown
        end
      end
    end
  end

  def teardown
    if @lookahead == nil
      $file.isEmpty = true
    elsif @lookahead  =~ /Binary files .* and .* differ/
      $file.isBinary = true
    else
      if $file.wants_diff_in_mail?
        if @stats.diffLines > $maxLinesPerDiff
          println("
")
          println("[truncated at #{$maxLinesPerDiff} lines; #{@stats.diffLines-$maxLinesPerDiff} more skipped]")
        else
          @colour.teardown
        end
        println("
") # end of "file" div
        $file.has_diff = true
      end
    end
  end
end

# a filter that counts the number of characters output to the underlying object
class OutputCounter
  # TODO: This should probably be a subclass of IO
  # TODO: assumes unix end-of-line convention

  def initialize(io)
    @io = io
    # TODO: use real number of chars representing end of line (for platform)
    @eol_size = 1
    @count = 0
  end

  def puts(text)
    @count += text.length
    @count += @eol_size unless text =~ /\n$/
    @io.puts(text)
  end

  def print(text)
    @count += text.length
    @io.print(text)
  end

  attr_reader :count
end

# a filter that can be told to stop outputting data to the underlying object
class OutputDropper
  def initialize(io)
    @io = io
    @drop = false
  end

  def puts(text)
    @io.puts(text) unless @drop
  end

  def print(text)
    @io.print(text) unless @drop
  end

  attr_accessor :drop
end

# TODO: the current implementation of the size-limit continues to generate
# HTML-ified diff output, but doesn't add it to the email.  This means we
# can report 'what you would have won', but is less efficient than turning
# off the diff highlighting code.  Does this matter?

# Counts the amount of data written, and when choose_to_limit? is called,
# checks this count against the configured limit, discarding any further
# output if the limit is exceeded.  We aren't strict about the limit because
# we don't want to chop-off the end of a tag and produce invalid HTML, etc.
class OutputSizeLimiter
  def initialize(io, limit)
    @dropper = OutputDropper.new(io)
    @counter = OutputCounter.new(@dropper)
    @limit = limit
    @written_count = nil
  end

  def puts(text)
    @counter.puts(text)
  end

  def print(text)
    @counter.print(text)
  end

  def choose_to_limit?
    return true if @dropper.drop
    if @counter.count >= @limit
      @dropper.drop = true
      @written_count = @counter.count
      return true
    end
    return false
  end

  def total_count
    @counter.count
  end

  def written_count
    if @written_count.nil?
total_count else @written_count end end end cvsroot_dir = "#{ENV['CVSROOT']}/CVSROOT" $config = "#{cvsroot_dir}/cvsspam.conf" $users_file = "#{cvsroot_dir}/users" $debug = false $recipients = Array.new $sendmail_prog = "/usr/sbin/sendmail" $no_removed_file_diff = false $no_added_file_diff = false $no_diff = false $task_keywords = ['TODO', 'FIXME'] $bugzillaURL = nil $jiraURL = nil $ticketURL = nil $viewcvsURL = nil $choraURL = nil $cvswebURL = nil $from_address = nil $subjectPrefix = nil $files_in_subject = false; $smtp_host = nil $repository_name = nil # 2MiB limit on attached diffs, $mail_size_limit = 1024 * 1024 * 2 require 'getoptlong' opts = GetoptLong.new( [ "--to", "-t", GetoptLong::REQUIRED_ARGUMENT ], [ "--config", "-c", GetoptLong::REQUIRED_ARGUMENT ], [ "--debug", "-d", GetoptLong::NO_ARGUMENT ], [ "--from", "-u", GetoptLong::REQUIRED_ARGUMENT ] ) opts.each do |opt, arg| $recipients << arg if opt=="--to" $config = arg if opt=="--config" $debug = true if opt=="--debug" $from_address = arg if opt=="--from" end if ARGV.length != 1 if ARGV.length > 1 $stderr.puts "extra arguments not needed: #{ARGV[1, ARGV.length-1].join(', ')}" else $stderr.puts "missing required file argument" end puts "Usage: cvsspam.rb [ --to ] [ --config ] " exit(-1) end $logfile = ARGV[0] $additionalHeaders = Array.new $problemHeaders = Array.new # helper function called from the 'config file' def addHeader(name, value) if name =~ /^[!-9;-~]+$/ $additionalHeaders << [name, value] else $problemHeaders << [name, value] end end # helper function called from the 'config file' def addRecipient(email) $recipients << email end # 'constant' used from the 'config file' class GUESS end if FileTest.exists?($config) blah("Using config '#{$config}'") load $config else blah("Config file '#{$config}' not found, ignoring") end if $recipients.empty? 
fail "No email recipients defined" end if $viewcvsURL != nil $viewcvsURL << "/" unless $viewcvsURL =~ /\/$/ $frontend = ViewCVSFrontend.new($viewcvsURL) elsif $choraURL !=nil $frontend = ChoraFrontend.new($choraURL) elsif $cvswebURL !=nil $cvswebURL << "/" unless $cvswebURL =~ /\/$/ $frontend = CVSwebFrontend.new($cvswebURL) else $frontend = NoFrontend.new end if $viewcvsURL != nil || $cvswebURL !=nil if $repository_name == GUESS # use the last component of the repository path as the name ENV['CVSROOT'] =~ /([^\/]+$)/ $frontend.repository_name = $1 elsif $repository_name != nil $frontend.repository_name = $repository_name end end if $bugzillaURL != nil commentSubstitutions['\b[Bb][Uu][Gg]\s*#?[0-9]+'] = bugzillaSub end if $jiraURL != nil commentSubstitutions['\b[a-zA-Z]+-[0-9]+\b'] = jiraSub end if $ticketURL != nil commentSubstitutions['\b[Tt][Ii][Cc][Kk][Ee][Tt]\s*#?[0-9]+\b'] = ticketSub end $commentEncoder = MultiSub.new(commentSubstitutions) tagHandler = TagHandler.new $handlers = Hash[">" => CommentHandler.new, "U" => UnifiedDiffHandler.new, "T" => tagHandler, "A" => AddedFileHandler.new, "R" => RemovedFileHandler.new, "M" => ModifiedFileHandler.new, "V" => VersionHandler.new] $handlers["A"].setTagHandler(tagHandler) $handlers["R"].setTagHandler(tagHandler) $handlers["M"].setTagHandler(tagHandler) $fileEntries = Array.new $task_list = Array.new $allTags = Hash.new File.open("#{$logfile}.emailtmp", File::RDWR|File::CREAT|File::TRUNC) do |mail| $diff_output_limiter = OutputSizeLimiter.new(mail, $mail_size_limit) File.open($logfile) do |log| reader = LogReader.new(log) until reader.eof handler = $handlers[reader.currentLineCode] if handler == nil raise "No handler file lines marked '##{reader.currentLineCode}'" end handler.handleLines(reader.getLines, $diff_output_limiter) end end end if $subjectPrefix == nil $subjectPrefix = "[CVS #{Repository.array.join(',')}]" end if $files_in_subject all_files = "" $fileEntries.each do |file| name = 
htmlEncode(file.name_after_common_prefix) if all_files != "" all_files = all_files + ";" + name else all_files = name end end $mailSubject = all_files + ":" + $mailSubject end mailSubject = "#{$subjectPrefix} #{$mailSubject}" if mailSubject.length > $maxSubjectLength mailSubject = mailSubject[0, $maxSubjectLength] end $encoder = HeaderEncoder.new $encoder.charset = $charset.nil? ? "ISO-8859-1" : $charset # generate the email header (and footer) having already generated the diffs # for the email body to a temp file (which is simply included in the middle) def make_html_email(mail) mail.puts(< HEAD unless ($problemHeaders.empty?) mail.puts("Bad header format in '#{$config}':
    ") $stderr.puts("Bad header format in '#{$config}':") $problemHeaders.each do |header| mail.puts("
  • #{htmlEncode(header[0])}
  • ") $stderr.puts(" - #{header[0]}") end mail.puts("
") end mail.puts("") haveTags = false Repository.each do |repository| haveTags |= repository.has_multiple_tags end filesAdded = 0 filesRemoved = 0 filesModified = 0 totalLinesAdded = 0 totalLinesRemoved = 0 file_count = 0 lastPath = "" last_repository = nil $fileEntries.each do |file| unless file.repository == last_repository last_repository = file.repository mail.print("") end file_count += 1 if (file_count%2==0) mail.print("") else mail.print("") end if file.addition? filesAdded += 1 elsif file.removal? filesRemoved += 1 elsif file.modification? filesModified += 1 end name = htmlEncode(file.name_after_common_prefix) slashPos = name.rindex("/") if slashPos==nil prefix = "" else thisPath = name[0,slashPos] name = name[slashPos+1,name.length] if thisPath == lastPath prefix = " "*(slashPos) + "/" else prefix = thisPath + "/" end lastPath = thisPath end if file.addition? name = "#{name}" elsif file.removal? name = "#{name}" end if file.has_diff? mail.print("") else mail.print("") end if file.isEmpty mail.print("") elsif file.isBinary mail.print("") else if file.lineAdditions>0 totalLinesAdded += file.lineAdditions mail.print("") else mail.print("") end if file.lineRemovals>0 totalLinesRemoved += file.lineRemovals mail.print("") else mail.print("") end end if last_repository.has_multiple_tags if file.tag mail.print("") else mail.print("") end elsif haveTags mail.print("") end if file.addition? mail.print("") elsif file.removal? mail.print("") elsif file.modification? mail.print("") end mail.puts("") end if $fileEntries.size>1 && (totalLinesAdded+totalLinesRemoved)>0 # give total number of lines added/removed accross all files mail.print("") if totalLinesAdded>0 mail.print("") else mail.print("") end if totalLinesRemoved>0 mail.print("") else mail.print("") end mail.print("") if haveTags mail.puts("") end mail.puts("
") if last_repository.has_multiple_tags mail.print("Mixed-tag commit") else mail.print("Commit") end mail.print(" in #{htmlEncode(last_repository.common_prefix)}") if last_repository.trunk_only? mail.print(" on MAIN") else mail.print(" on ") tagCount = 0 last_repository.each_tag do |tag| tagCount += 1 if tagCount > 1 mail.print tagCountMAIN" end end mail.puts("
#{prefix}#{name}#{prefix}#{name}[empty][binary]+#{file.lineAdditions}-#{file.lineRemovals}#{htmlEncode(file.tag)}MAINadded #{$frontend.version(file.path,file.toVer)}#{$frontend.version(file.path,file.fromVer)} removed#{$frontend.version(file.path,file.fromVer)} #{$frontend.diff(file)} #{$frontend.version(file.path,file.toVer)}
+#{totalLinesAdded}-#{totalLinesRemoved}
") totalFilesChanged = filesAdded+filesRemoved+filesModified if totalFilesChanged > 1 mail.print("") changeKind = 0 if filesAdded>0 mail.print("#{filesAdded} added") changeKind += 1 end if filesRemoved>0 mail.print(" + ") if changeKind>0 mail.print("#{filesRemoved} removed") changeKind += 1 end if filesModified>0 mail.print(" + ") if changeKind>0 mail.print("#{filesModified} modified") changeKind += 1 end mail.print(", total #{totalFilesChanged}") if changeKind > 1 mail.puts(" files
") end if $task_list.size > 0 task_count = 0 mail.puts("
") end File.open("#{$logfile}.emailtmp") do |input| input.each do |line| mail.puts(line.chomp) end end if $diff_output_limiter.choose_to_limit? mail.puts("

[Reached #{$diff_output_limiter.written_count} bytes of diffs.") mail.puts("Since the limit is about #{$mail_size_limit} bytes,") mail.puts("a further #{$diff_output_limiter.total_count-$diff_output_limiter.written_count} were skipped.]

") end if $debug blah("leaving file #{$logfile}.emailtmp") else File.unlink("#{$logfile}.emailtmp") end mail.puts("
CVSspam #{$version}
")
  mail.puts("")
end

# Tries to look up an 'alias' email address for the given string in the
# CVSROOT/users file, if the file exists.  The argument is returned unchanged
# if no alias is found.
def sender_alias(address)
  if File.exist?($users_file)
    File.open($users_file) do |io|
      io.each_line do |line|
        if line =~ /^([^:]+)\s*:\s*([^\n\r]+)/
          if address == $1
            return $2
          end
        end
      end
    end
  end
  address
end

# A handle for code that needs to add headers and a body to an email being
# sent.  This wraps an underlying IO object, and is responsible for doing
# sensible header formatting, and for ensuring that the body is separated
# from the message headers by a blank line (as it is required to be).
class MailContext
  def initialize(io)
    @done_headers = false
    @io = io
  end

  # add a header to the email.  raises an exception if #body has already been
  # called
  def header(name, value)
    raise "headers already committed" if @done_headers
    if name == "Subject"
      $encoder.encode_header(@io, "Subject", value)
    else
      @io.puts("#{name}: #{value}")
    end
  end

  # yields an IO that should be used to write the message body
  def body
    @done_headers = true
    @io.puts
    yield @io
  end
end

# provides a send() method for sending email by invoking the 'sendmail'
# command-line program
class SendmailMailer
  def send(from, recipients)
    # The -t option causes sendmail to take message headers, as well as the
    # message body, from its input.  The -oi option stops a dot on a line on
    # its own from being interpreted as the end of the message body (so
    # messages that have such a line don't fail part-way through sending),
    cmd = "#{$sendmail_prog} -t -oi"
    blah("invoking '#{cmd}'")
    IO.popen(cmd, "w") do |mail|
      ctx = MailContext.new(mail)
      ctx.header("To", recipients.join(','))
      if from
        blah("Mail From: <#{from}>")
      else
        blah("Mail From not set")
      end
      ctx.header("From", from) if from
      yield ctx
    end
  end
end

# provides a send() method for sending email by connecting to an SMTP server
# using the Ruby Net::SMTP package.
class SMTPMailer def initialize(smtp_host) @smtp_host = smtp_host end class IOAdapter def initialize(mail) @mail = mail end def puts(text="") @mail.write(text) @mail.write("\r\n") end def print(text) @mail.write(text) end end def send(from, recipients) if from == nil from = ENV['USER'] || ENV['USERNAME'] || 'cvsspam' end unless from =~ /@/ from = "#{from}@#{ENV['HOSTNAME']||'localhost'}" end smtp = Net::SMTP.new(@smtp_host) blah("connecting to '#{@smtp_host}'") smtp.start() smtp.ready(from, recipients) do |mail| ctx = MailContext.new(IOAdapter.new(mail)) ctx.header("To", recipients.join(',')) blah("Mail From: <#{from}>") ctx.header("From", from) if from ctx.header("Date", Time.now.utc.strftime(DATE_HEADER_FORMAT)) yield ctx end end end if $smtp_host require 'net/smtp' mailer = SMTPMailer.new($smtp_host) else mailer = SendmailMailer.new end $from_address = sender_alias($from_address) unless $from_address.nil? mailer.send($from_address, $recipients) do |mail| mail.header("Subject", mailSubject) mail.header("MIME-Version", "1.0") mail.header("Content-Type", "text/html" + ($charset.nil? ? "" : "; charset=\"#{$charset}\"")) if ENV['REMOTE_HOST'] # TODO: I think this will always be an IP address. If a hostname is # possible, it may need encoding of some kind, mail.header("X-Originating-IP", "[#{ENV['REMOTE_HOST']}]") end unless ($additionalHeaders.empty?) $additionalHeaders.each do |header| mail.header(header[0], header[1]) end end mail.header("X-Mailer", "CVSspam #{$version} ") mail.body do |body| make_html_email(body) end end --2oS5YaxWCcQjTEyO-- From asousa at minos.phy.tufts.edu Tue Jan 25 05:11:03 2005 From: asousa at minos.phy.tufts.edu (Alexandre Sousa) Date: Tue, 25 Jan 2005 00:11:03 -0500 Subject: [cvsspam-devel] How to show who committed a file in CVSspam? 
In-Reply-To: <20050125001446.GA14076@vhost.badgers-in-foil.co.uk>
References: <1105573868.9971.38.camel@cronus.phy.tufts.edu>
 <20050113085745.GB7817@vhost.badgers-in-foil.co.uk>
 <1105634245.5202.11.camel@cronus.phy.tufts.edu>
 <20050113180457.GA15755@vhost.badgers-in-foil.co.uk>
 <1105641308.5202.23.camel@cronus.phy.tufts.edu>
 <20050122231953.GA5195@vhost.badgers-in-foil.co.uk>
 <1106603092.27769.22.camel@cronus.phy.tufts.edu>
 <20050125001446.GA14076@vhost.badgers-in-foil.co.uk>
Message-ID: <1106629863.5194.18.camel@cronus.phy.tufts.edu>

Hi Dave,

>
> I don't think that you also made the change to cvsspam.rb that I
> suggested.  I've attached a modified version of cvsspam.rb that adds
> extra debug info.

Yes, you're right! I missed your attached patch in the previous e-mail...

> With this changed file, I would expect to see, just after the 'invoking
> sendmail' line, either:
>
>   cvsspam.rb: Mail From:
> or
>   cvsspam.rb: Mail From not set

Now I get something:

cvsspam.rb: Using config '/home/cvs/CVSROOT/cvsspam.conf'
cvsspam.rb: invoking '/usr/sbin/sendmail -t -oi'
cvsspam.rb: Mail From: <cvs>
cvsspam.rb: leaving file /tmp/#cvsspam.4343.506-40349066/logfile.emailtmp

So, $USER just seems set to the user who sends the e-mails and owns the
repository instead of the user who commits.
Not really sure if it's some setting in sendmail.cf that is preventing
CVSspam from changing the "From:" field.

Just for reference, the mailing list is defined in /etc/aliases with a
link in /etc/mail/aliases, and /etc/mail/aliases.db is created from:

makemap hash /etc/mail/aliases.db < /etc/mail/aliases

Thanks for your continued interest in this,
Alex

From dave at badgers-in-foil.co.uk  Tue Jan 25 12:28:30 2005
From: dave at badgers-in-foil.co.uk (David Holroyd)
Date: Tue, 25 Jan 2005 12:28:30 +0000
Subject: [cvsspam-devel] How to show who committed a file in CVSspam?
In-Reply-To: <1106629863.5194.18.camel@cronus.phy.tufts.edu>
References: <1105573868.9971.38.camel@cronus.phy.tufts.edu>
 <20050113085745.GB7817@vhost.badgers-in-foil.co.uk>
 <1105634245.5202.11.camel@cronus.phy.tufts.edu>
 <20050113180457.GA15755@vhost.badgers-in-foil.co.uk>
 <1105641308.5202.23.camel@cronus.phy.tufts.edu>
 <20050122231953.GA5195@vhost.badgers-in-foil.co.uk>
 <1106603092.27769.22.camel@cronus.phy.tufts.edu>
 <20050125001446.GA14076@vhost.badgers-in-foil.co.uk>
 <1106629863.5194.18.camel@cronus.phy.tufts.edu>
Message-ID: <20050125122830.GA24034@vhost.badgers-in-foil.co.uk>

On Tue, Jan 25, 2005 at 12:11:03AM -0500, Alexandre Sousa wrote:
> Now I get something:
>
> cvsspam.rb: Using config '/home/cvs/CVSROOT/cvsspam.conf'
> cvsspam.rb: invoking '/usr/sbin/sendmail -t -oi'
> cvsspam.rb: Mail From: <cvs>
> cvsspam.rb: leaving file /tmp/#cvsspam.4343.506-40349066/logfile.emailtmp
>
> So, $USER just seems set to the user who sends the e-mails and owns the
> repository instead of the user who commits.
> Not really sure if it's some setting in sendmail.cf that is preventing
> CVSspam from changing the "From:" field.

That debug info shows that CVSspam is explicitly telling your MTA to set
the sender address to 'cvs'.  There's not much your mail config can do
to change things, because the problem occurs earlier in the chain of
events.

So, how do you actually commit files to this repository?

 - Do you use pserver, local repository, etc.?
 - Does each user of the system have a separate account, or do they all
   explicitly give the username 'cvs'?

Note that if your CVSROOT looks like...

  :pserver:cvs@hostname:/path/to/cvsroot

...then CVS has no idea who your users are, so I don't think CVSspam can be
made to differentiate them either.

Whereas, if your CVSROOT looks like...

  :pserver:@hostname:/path/to/cvsroot

...or, implicitly using the current logged in user's name (I think)...

  :pserver:hostname:/path/to/cvsroot

...then I'd suspect some bug in CVSspam (or *maybe* in the CVS config).
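
To make the distinction between those CVSROOT forms concrete, here is a minimal sketch, not part of cvsspam.rb. The helper name cvsroot_user and its regular expression are assumptions made purely for illustration; it only covers strings of the form :method:[user@]host:/path and returns the user name when one is spelled out.

```ruby
# Hypothetical helper, NOT part of cvsspam.rb: pull the user name (if any) out
# of a CVSROOT string of the form ":method:[user@]host:/path".
def cvsroot_user(cvsroot)
  # \A:[^:]+:       the access method, e.g. ":pserver:" or ":ext:"
  # (?:([^@:]+)@)?  an optional "user@" part; the name lands in group 1
  # [^:]+:          the host name, up to the colon before the path
  match = /\A:[^:]+:(?:([^@:]+)@)?[^:]+:/.match(cvsroot)
  match && match[1]
end

[":pserver:cvs@hostname:/path/to/cvsroot",
 ":pserver:hostname:/path/to/cvsroot"].each do |root|
  user = cvsroot_user(root)
  if user
    puts "#{root}: everybody connects as '#{user}'"
  else
    puts "#{root}: no user name given in CVSROOT"
  end
end
```

When every committer shares one account name in CVSROOT, the helper returns that same name for all of them, which is the only identity the server side gets to see.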
dave

-- 
http://david.holroyd.me.uk/

From asousa at minos.phy.tufts.edu  Tue Jan 25 18:21:40 2005
From: asousa at minos.phy.tufts.edu (Alexandre Sousa)
Date: Tue, 25 Jan 2005 13:21:40 -0500
Subject: [cvsspam-devel] How to show who committed a file in CVSspam?
In-Reply-To: <20050125122830.GA24034@vhost.badgers-in-foil.co.uk>
References: <1105573868.9971.38.camel@cronus.phy.tufts.edu>
 <20050113085745.GB7817@vhost.badgers-in-foil.co.uk>
 <1105634245.5202.11.camel@cronus.phy.tufts.edu>
 <20050113180457.GA15755@vhost.badgers-in-foil.co.uk>
 <1105641308.5202.23.camel@cronus.phy.tufts.edu>
 <20050122231953.GA5195@vhost.badgers-in-foil.co.uk>
 <1106603092.27769.22.camel@cronus.phy.tufts.edu>
 <20050125001446.GA14076@vhost.badgers-in-foil.co.uk>
 <1106629863.5194.18.camel@cronus.phy.tufts.edu>
 <20050125122830.GA24034@vhost.badgers-in-foil.co.uk>
Message-ID: <1106677301.5577.27.camel@cronus.phy.tufts.edu>

Hi Dave,

> That debug info shows that CVSspam is explicitly telling your MTA to set
> the sender address to 'cvs'.  There's not much your mail config can do
> to change things, because the problem occurs earlier in the chain of
> events.
>
> So, how do you actually commit files to this repository?
>
>  - Do you use pserver, local repository, etc.?
>  - Does each user of the system have a separate account, or do they all
>    explicitly give the username 'cvs'?

Well, I tell my users to:

setenv CVS_RSH ssh
setenv CVSROOT :ext:cvs@minos.phy.tufts.edu:/home/cvs

and they have read/write privileges. So I guess I fall in the category:

> Note that if your CVSROOT looks like...
>
>   :pserver:cvs@hostname:/path/to/cvsroot
>
> ...then CVS has no idea who your users are, so I don't think CVSspam can be
> made to differentiate them either.

I set this up in a similar way to our experiment's central repository, where
one does:

setenv CVSROOT :ext:minoscvs@minoscvs.fnal.gov:/cvs/minoscvs/rep1

For this repository some home-brewed "spamming" system is used, which I
find almost useless, but it does tell you who committed to the
repository. From all your help, I deduce that the system must use
something other than CVS internals to get the user who committed. I'll
contact the repository admins and see if I can find out how they do it.

Thanks,
Alex