[cvsspam-devel] RuntimeError => line did not begin with '#'

David Holroyd dave at badgers-in-foil.co.uk
Sun Feb 1 00:41:23 GMT 2009


Giuliano,

Looking back on this problem, I realise that trying to launch
collect_diffs.rb in the background will potentially cause several
instances to run concurrently, and this could be the cause of your
problem.

CVS will execute the script mentioned in 'loginfo' once for each folder
containing files being committed.  Normally, each invocation of
collect_diffs.rb gets a chance to update the tmp file which will be read
by cvsspam.rb.

So it could be you now have a situation where two or more
collect_diffs.rb processes are trying to update the same tmp file at
once, resulting in the corruption you've spotted.

I guess you either need locking, or a different approach :)


On Fri, Nov 07, 2008 at 09:45:53AM -0500, Piergiuliano Bossi wrote:
> On Wed, Nov 5, 2008 at 8:45 AM, David Holroyd <dave at badgers-in-foil.co.uk>wrote:
> 
> > I don't remember ever having seen something like this.
> >
> > Could it somehow be related to end-of-line-sequence handling (i.e.  are
> > extra windows-style carriage-return characters at the end of the line in
> > question)?
> >
> 
> It doesn't seem to be the case.
> 
> 
> > I seem to recall that the Ruby 'puts' method doesn't append a newline to
> > the string if there's one there already -- maybe windows line endings
> > cause this logic to foul up in one of the code paths?
> 
> 
> I'm not saying it's impossible, but it doesn't seem to me to be related:
> basically a line from one file gets mingled in the diff of another file.
> I'm more and more inclined to think that this is because of concurrent
> execution in background, but it's not clear to me how.
> 
> 
> > Would be good to see the actual data, if possible?  (If you want to send
> > a private mail, I can treat it in confidence).
> 
> 
> I'll send you some data privately.
> 
> Here I can show the changes that I've made to collect_diffs.rb and
> cvsspam.rb, together with the shell script that is invoked when a commit is
> made (triggered by a configuration line in loginfo).
> 
> 
> --- cvsspam-0.2.12/collect_diffs.rb.original    2008-10-17
> 12:07:37.000000000 -0400
> +++ collect_diffs.rb    2008-10-31 12:29:07.000000000 -0400
> @@ -147,7 +147,7 @@
> 
>    raise "missing data dir (#{$tmpdir}/#{$dirtemplate}-XXXXXX)" if
> $datadir==nil
> 
> -  line = $stdin.gets
> +  line = $inputfile.gets
>    unless line =~ /^Update of (.+)/
>      fail "Log preamble looks suspect (doesn't start 'Update of ...')"
>    end
> @@ -167,7 +167,7 @@
>    end
> 
>    # look for the start of the user's comment
> -  $stdin.each do |line|
> +  $inputfile.each do |line|
>      break if line =~ /^Log Message/
>    end
> 
> @@ -176,14 +176,14 @@
>    end
> 
>    File.open("#{$datadir}/logfile", File::WRONLY|File::CREAT|File::APPEND)
> do |file|
> -    $stdin.each do |line|
> +    $inputfile.each do |line|
>        # remove any trailing whitespace; we don't want the split() below to
>        # produce empty trailing items due to '\r' at the end of the line
>        # (if input is 'DOS' style),
>        line.sub!(/\s*$/, "")
> 
>        # 'Mac' clients sending logs to a unix server may denote end-of-line
> with
> -      # a carriage-return, defeating the $stdin.each above (i.e. the whole
> log
> +      # a carriage-return, defeating the $inputfile.each above (i.e. the
> whole log
>        # message will appear as a single line containing '\r's).  We handle
> this
>        # case explicitly here, so that cvsspam.rb's Subject header
> generation
>        # doesn't break,
> @@ -264,8 +264,19 @@
>      end
>    end
> 
> +  remove_inputfile()
>  end
> 
> +def remove_inputfile()
> +  if $inputfile != $stdin
> +    blah "Closing inputfile"
> +    $inputfile.close
> +  end
> +  unless $filename.nil?
> +    blah "If I wasn't debugging I'd have removed #{$filename}"
> +    File.delete($filename) unless $debug
> +  end
> +end
> 
>  # sometimes, CVS would exit with an error like,
>  #
> @@ -334,7 +345,9 @@
> 
>  $config = nil
>  $cvs_prog = "cvs"
> -$debug = false
> +$debug = true
> +$inputfile = $stdin
> +$filename = nil
>  $diff_ignore_keywords = false
>  $task_keywords = []
> 
> @@ -342,7 +355,6 @@
>    fail "$CVSROOT not defined.  It should be when I am invoked from
> CVSROOT/loginfo"
>  end
> 
> -
>  def handle_operation?(args)
>    # The CVS 1.12.x series pass an argument with the value "- New directory"
>    # whereas previous versions passed "some/path - New directory".  The
> newer
> @@ -360,12 +372,6 @@
>  end
> 
> 
> -unless handle_operation?(ARGV)
> -  consume_stdin()
> -  exit
> -end
> -
> -
>  require 'getoptlong'
> 
>  opts = GetoptLong.new(
> @@ -373,7 +379,8 @@
>    [ "--config", "-c", GetoptLong::REQUIRED_ARGUMENT ],
>    [ "--debug",  "-d", GetoptLong::NO_ARGUMENT ],
>    [ "--from",   "-u", GetoptLong::REQUIRED_ARGUMENT ],
> -  [ "--charset",      GetoptLong::REQUIRED_ARGUMENT ]
> +  [ "--charset",      GetoptLong::REQUIRED_ARGUMENT ],
> +  [ "--file",   "-f", GetoptLong::OPTIONAL_ARGUMENT ]
>  )
> 
>  # arguments to pass though to 'cvsspam.rb'
> @@ -387,6 +394,17 @@
>    end
>    $config = arg if opt=="--config"
>    $debug = true if opt == "--debug"
> +  if opt == "--file"
> +    $filename = arg
> +    $inputfile = File.new(arg)
> +    blah "inputfile is #{$inputfile.path}"
> +  end
> +end
> +
> +unless handle_operation?(ARGV)
> +  consume_stdin()
> +  remove_inputfile()
> +  exit
>  end
> 
>  blah("CVSROOT is #{ENV['CVSROOT']}")
> 
> 
> 
> 
> 
> 
> 
> 
> 
> --- cvsspam-0.2.12/cvsspam.rb.original  2008-10-17 12:40:22.000000000 -0400
> +++ cvsspam.rb  2008-10-31 12:36:36.000000000 -0400
> @@ -226,7 +226,9 @@
>      @line = @io.gets
>      return false if @line == nil
>      unless @line[0,1] == "#"
> -      raise "#{$logfile}:#{@io.lineno} line did not begin with '#':
> #{@line}"
> +      raise "#{$logfile}:#{@io.lineno} line did not begin with '#':
> #{@line}" unless $debug
> +      blah  "#{$logfile}:#{@io.lineno} line did not begin with '#' =>
> SKIPPING: #{@line}"
> +      return advance
>      end
>      return true
>    end
> @@ -1281,7 +1283,6 @@
>    $arg_charset = arg if opt == "--charset"
>  end
> 
> -
>  if ARGV.length != 1
>    if ARGV.length > 1
>      $stderr.puts "extra arguments not needed: #{ARGV[1,
> ARGV.length-1].join(', ')}"
> @@ -1761,10 +1762,11 @@
>      unless from.address =~ /@/
>        from.address = "#{from.address}@#{$hostname}"
>      end
> -    smtp = Net::SMTP.new(@smtp_host)
> +    smtp = Net::SMTP.new(@smtp_host, 25)
>      blah("connecting to '#{@smtp_host}'")
> -    smtp.start()
> -    smtp.ready(from.address, recipients.map{|addr| addr.address}) do |mail|
> +    #smtp.set_debug_output $stderr
> +    smtp.start('__omissis__', '__omissis__', '__omissis__', :login)
> +    smtp.ready('__omissis__', recipients.map{|addr| addr.address}) do
> |mail|
>        ctx = MailContext.new(IOAdapter.new(mail))
>        ctx.header("To", recipients.map{|addr| addr.encoded}.join(','))
>        blah("Mail From: <#{from}>")
> @@ -1772,6 +1774,7 @@
>        ctx.header("Date", Time.now.utc.strftime(DATE_HEADER_FORMAT))
>        yield ctx
>      end
> +    smtp.finish
>    end
>  end
> 
> 
> 
> 
> 
> 
> 
> [cvsspam]# cat collect_diffs_in_background.sh
> #!/bin/bash
> date >> $CVSROOT/collect_diffs.log
> echo $USER >> $CVSROOT/collect_diffs.log
> echo $1 >> $CVSROOT/collect_diffs.log
> 
> tmpfile="/tmp/tempfile.$$.${RANDOM%1000}"
> cat > "$tmpfile"
> echo "$tmpfile" >> $CVSROOT/collect_diffs.log
> 
> nohup /usr/local/lib/cvsspam/collect_diffs.rb --debug --file "$tmpfile"
> --from $USER "$@" >> $CVSROOT/collect_diffs.log 2>&1 &
> [cvsspam]#
> 
> 
> 
> So as you can see the differences are:
> *) Upon commit the script collect_diffs_in_background.sh is called (the
> relevant line in loginfo looks like the following)
>                ALL /usr/local/lib/cvsspam/collect_diffs_in_background.sh
> %{sVv}
> *) I use SMTP with recipients configured in the conf file (I have omitted
> the credentials in the diff above, of course)
> *) In order to collect input from cvs and pass it to collect_diffs.rb in
> background I use a temporary file with a random name ==> collect_diffs.rb
> then reads from the file rather than from stdin
> *) In debugging mode, if a line doesn't begin with '#' I simply skip to the
> next one
> 
> After avoiding to raise an exception and turning debugging on the issue
> happened again once. I'll send you the relevant files privately.
> 
> Thanks
> Giuliano

> _______________________________________________
> cvsspam-devel mailing list
> cvsspam-devel at lists.badgers-in-foil.co.uk
> http://lists.badgers-in-foil.co.uk/mailman/listinfo/cvsspam-devel


-- 
http://david.holroyd.me.uk/



More information about the cvsspam-devel mailing list