[cvsspam-devel] tagging hooks for use with cvsspam

Haroon Rafique haroon.rafique at utoronto.ca
Fri Dec 15 16:15:19 UTC 2006


On Nov 28 at 10:49pm, DH=>David Holroyd <dave at badgers-in-foil.co.uk> wrote:

DH> 
DH> Do your scripts use the same kind of logic as CVSspam's 
DH> commit-processing, or did they make the tagging hooks a bit less odd?
DH> 

Hi David,

Sorry for the late reply, I just got around to catching up with some of my 
emails.

Believe me, there is nothing "less odd" about the tagging hooks. I have 
used the same idea as the original cvsspam scripts: the taginfo hook keeps 
overwriting a record of the last tagged directory, and when the posttag hook 
sees that it is really processing that last directory, it fires off the email.
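
The record-and-compare idea described above can be sketched roughly as follows. This is a minimal illustration, not the attached scripts: the helper names record_last_dir and last_dir? are hypothetical, and it assumes CVS runs the taginfo hook for every directory before the first posttag run.

```ruby
require 'tmpdir'

# Hypothetical helpers; the attached scripts keep the same state in a
# per-user directory under $TMPDIR.

# taginfo side: called once per directory, overwrites the marker each time.
def record_last_dir(datadir, repository_dir)
  File.open(File.join(datadir, "lastdir"), "w") { |f| f.write(repository_dir) }
end

# posttag side: true only when the directory just processed is the one
# that taginfo recorded last, i.e. the whole tag operation is finished.
def last_dir?(datadir, repository_dir)
  marker = File.join(datadir, "lastdir")
  File.exist?(marker) && File.read(marker) == repository_dir
end

Dir.mktmpdir("cvsspam-sketch") do |datadir|
  dirs = ["proj/src", "proj/doc"]
  # Assumed ordering: taginfo for every directory first ...
  dirs.each { |d| record_last_dir(datadir, d) }
  # ... then posttag for every directory; only the last one triggers mail.
  dirs.each do |d|
    puts "#{d}: #{last_dir?(datadir, d) ? 'send email' : 'keep collecting'}"
  end
end
```

The marker file is deliberately overwritten rather than appended to, so whichever directory taginfo saw last is the only one that can trigger the email.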


DH> I've thought of having a cron job or background process of some sort 
DH> to try and batch events where CVS simply just doesn't provide the info 
DH> to link them as they happen (e.g. addition of directories, IIRC).  I 
DH> could never be bothered to implement that though.
DH> 
DH> It would be great to see your work!  Sorry about the slow reply!
DH> 
DH> 
DH> ta,
DH> dave
DH> 
DH> 

In my CVSROOT/taginfo I have entries similar to:

^haroondotfiles /home/haroon/bin/record_last_tagged_dir.rb %t %o %p %{sv}

In my CVSROOT/posttag I have entries similar to (one line):

^haroondotfiles /home/haroon/bin/collect_tags.rb --config 
/home/haroon/etc/cvsspam-haroondotfiles.conf %b %t %o %p %{sVv}
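
For orientation, here is a rough sketch of how the posttag format strings might arrive in the script's ARGV. It rests on assumptions not stated in this mail: that %b is the tag type, %t the tag name, %o the operation (add, mov or del), %p the directory relative to the repository root, and that %{sVv} expands to one "file, old revision, new revision" triple per file. parse_posttag_args and TagChange are hypothetical names, not part of the attached scripts.

```ruby
# Hypothetical structure for one tagged file.
TagChange = Struct.new(:file, :from_ver, :to_ver)

# args: what is left of ARGV once options such as --config are removed.
def parse_posttag_args(args)
  tag_type, tag, operation, dir, *rest = args
  # Group the remaining words into (file, old revision, new revision).
  changes = rest.each_slice(3).map { |file, from, to| TagChange.new(file, from, to) }
  # Moving a tag onto the revision it already marks changes nothing.
  changes.reject! { |c| operation == "mov" && c.from_ver == c.to_ver }
  { :tag_type => tag_type, :tag => tag, :operation => operation,
    :dir => dir, :changes => changes }
end

info = parse_posttag_args(%w[N rel-1-0 add haroondotfiles/bin filea NONE 1.1 fileb NONE 1.4])
puts "#{info[:operation]} #{info[:tag]} in #{info[:dir]}"
info[:changes].each { |c| puts "  #{c.file}: #{c.from_ver} -> #{c.to_ver}" }
```

As the attached collect_tags.rb itself notes, this expansion cannot be parsed unambiguously (a file name containing spaces would shift the triples), so the grouping is best-effort.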

Attached are all the related scripts and one sample output (edited to keep 
prying eyes away from the web front-end for viewcvs).

Regards,
--
Haroon Rafique
<haroon.rafique at utoronto.ca>
-------------- next part --------------
#!/home/haroon/bin/ruby -w

# Part of CVSspam
#   http://www.badgers-in-foil.co.uk/projects/cvsspam/
# Copyright (c) David Holroyd

$debug = false

def blah(msg)
  if $debug
    $stderr.puts "taginfo: #{msg}"
  end
end

blah("ARGV is <#{ARGV.join('>, <')}>")

$repositorydir = ARGV[2]

$tmpdir = ENV["TMPDIR"] || "/tmp"

# try to pick a name to avoid collisions with other people's commits
$dirtemplate = "#cvsspam.#{Process.getpgrp}.#{Process.uid}"

def find_data_dir
  Dir["#{$tmpdir}/#{$dirtemplate}-*"].each do |dir|
    stat = File.stat(dir)
    return dir if stat.owned?
  end
  nil
end

$datadir = find_data_dir()

if $datadir==nil
  $datadir = "#{$tmpdir}/#{$dirtemplate}-#{rand(99999999)}"
  Dir.mkdir($datadir, 0700)
end

# Record the directory currently being committed to.
# 
# This script (and collect_tags.rb) will be run just for the files in a
# single directory.
# 
# A commit to files in multiple directories will therefore produce multiple
# invocations of these scripts.  To send the email only when the whole commit
# is done, each run overwrites the 'lastdir' file; collect_tags.rb will
# later inspect the value it contains to work out if it needs to generate the
# email yet.

File.open("#{$datadir}/lastdir", "w") { |file|
	file.write $repositorydir
}
-------------- next part --------------
#!/home/haroon/bin/ruby -w

# Part of CVSspam
#   http://www.badgers-in-foil.co.uk/projects/cvsspam/
# Copyright (c) David Holroyd

# collect_tags.rb expects to find this script in the same directory as it
#

$version = "0.2.12"


$maxSubjectLength = 200
$charset = nil          # nil implies 'don't specify a charset'
$mailSubject = ''

def blah(text)
  $stderr.puts("#{$0}: #{text}") if $debug
end

def min(a, b)
  a<b ? a : b
end

# NB must ensure the time is UTC
# (the Ruby Time object's strftime() doesn't supply a numeric timezone)
DATE_HEADER_FORMAT = "%a, %d %b %Y %H:%M:%S +0000"

# returns the character-code of the given character
def chr(txt)
  txt[0]
end

# Limited support for encoding non-US_ASCII characters in mail headers
class HeaderEncoder
  def initialize
    @right_margin = 78
    @encoding = 'q' # quoted-printable, base64 not supported
    @charset = nil # TODO: some better default?
  end

  # character set to be used if any encoding is required.  defaults to nil,
  # which will cause an exception if encoding is attempted without another
  # value being specified
  attr_accessor :charset

  # write an encoded version of the header name/value to the given io
  def encode_header(io, name, value)
    name = name + ": "
    if requires_rfc2047?(value)
      rfc2047_encode_quoted(io, name, value)
    else
      wrap_basic_header(io, name, value)
    end
  end


 private
  # word wrap long headers, putting a space at the beginning of wrapped lines
  # (i.e. SMTP header continuations)
  def wrap_basic_header(io, start, rest)
    rest.scan(/\s*\S+/) do |match|
      if start.length>0 && start.length+match.length>@right_margin
        io.puts(start)
        start = " "
        match.sub!(/^\s+/, "") # strip existing leading-whitespace
      end
      start << match
    end
    io.puts(start)
  end

  UNDERSCORE = chr("_")
  SPACE = chr(" ")
  TAB = chr("\t")

  # encode a header value according to the RFC-2047 quoted-printable spec,
  # allowing non-ASCII characters to appear in header values, and wrapping
  # long values with header continuation lines as needed
  def rfc2047_encode_quoted(io, start, rest)
    raise "no charset" if @charset.nil?
    code_begin = marker_start_quoted
    start << code_begin
    each_char_encoded(rest) do |code|
      if start.length+code.length+2 > @right_margin
        io.puts(start + marker_end_quoted)
        start = " " + code_begin
      end
      start << code
    end
    io.puts(start + marker_end_quoted)
  end

  # return a string representing the given character-code in quoted-printable
  # format
  def quoted_encode_char(b)
    if b>126 || b==UNDERSCORE || b==TAB
      sprintf("=%02x", b)
    elsif b == SPACE
      "_"
    else
      b.chr
    end
  end

  public

  # yields a quoted-printable version of each byte in the given string
  def each_char_encoded(text)
    text.each_byte do |b|
      yield quoted_encode_char(b)
    end
  end

  # gives the string "?=", which is used to mark the end of a quoted-printable
  # character sequence
  def marker_end_quoted
    "?="
  end

  # gives a string starting "=?", and including a charset specification, that
  # marks the start of a quoted-printable character sequence
  def marker_start_quoted
    "=?#{@charset}?#{@encoding}?"
  end

  # test to see if the given string contains non-ASCII characters
  def requires_rfc2047?(word)
    (word =~ /[\177-\377]/) != nil
  end
end


# Provides access to the datafile previously created by collect_tags.rb.
# Each call to getLines() will return an object that will read lines of the
# same 'type' (e.g. lines describing tag) from the file, and stop when
# lines of a different type (e.g. line giving the
# addition/deletion/modification of tags) are encountered.
class LogReader
  def initialize(logIO)
    @io = logIO
    advance
  end

  def currentLineCode ; @line[1,1]  end


  class ConstrainedIO
    def initialize(reader)
      @reader = reader
      @linecode = reader.currentLineCode
    end

    def each
      return if @reader == nil
      while true
        yield @reader.currentLine
        break unless @reader.advance && currentValid?
      end
      @reader = nil
    end

    def gets
      return nil if @reader == nil
      line = @reader.currentLine
      return nil if line==nil || !currentValid?
      @reader.advance
      return line
    end

    def currentValid?
      @linecode == @reader.currentLineCode
    end
  end

  def getLines
    ConstrainedIO.new(self)
  end

  def eof ; @line==nil  end

  def advance
    @line = @io.gets
    return false if @line == nil
    unless @line[0,1] == "#"
      raise "#{$logfile}:#{@io.lineno} line did not begin with '#': #{@line}"
    end
    return true
  end

  def currentLine
    @line==nil ? nil : @line[3, @line.length-4]
  end
end


# returns a copy of the given string with instances of the HTML special
# characters '&', '<' and '>' encoded as their HTML entity equivalents.
def htmlEncode(text)
  text.gsub(/./) do
    case $&
      when "&" then "&amp;"
      when "<" then "&lt;"
      when ">" then "&gt;"
      else $&
    end
  end
end

# Encodes characters that would otherwise be special in a URL using the
# "%XX" syntax (where XX are hex digits).
# actually, allows '/' to appear
def urlEncode(text)
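  # gsub (not sub) so that every special character is encoded, and %02X so
  # that character codes below 0x10 are zero-padded rather than space-padded
  text.gsub(/[^a-zA-Z0-9\-,.*_\/]/) do
    "%#{sprintf('%02X', $&[0])}"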
  end
end


# Represents a top-level directory under the $CVSROOT (which is properly called
# a module -- this class is named incorrectly).  Collects a list of
# all #FileEntry objects that are 'in' this repository.  Class methods provide
# a list of all repositories (ick!)
class Repository
  @@repositories = Hash.new

  def initialize(name)
    @name = name
    @common_prefix = nil
    @all_tags = Hash.new
  end

  # calculate the path prefix shared by all files committed to this
  # repository
  def merge_common_prefix(path)
    if path == nil
      path = ""
    end
    if @common_prefix == nil
      @common_prefix = path.dup
    else
      path = path.dup
      until @common_prefix == path
        if @common_prefix.size>path.size
          if @common_prefix.sub!(/(.*)\/.*$/, '\1').nil?
            raise "unable to merge '#{path}' in to '#{@common_prefix}': prefix totally different"
          end
        else
          if path.sub!(/(.*)\/.*$/, '\1').nil?
            raise "unable to merge '#{path}' in to '#{@common_prefix}': prefix totally different"
          end
        end
      end
    end
  end

  attr_reader :name, :common_prefix

  # gets the Repository object for the first component of the given path
  def Repository.get(name)
    # Leading './' is ignored (for peeps who have done 'cvs checkout .')
    # Trailing '/' ensures no match for files in root (we just want dirs)
    name =~ /^(?:\.\/)?([^\/]+)\//  
    name = $1
    name = "/" if name.nil?  # file at top-level?  fake up a name for repo
    rep = @@repositories[name]
    if rep.nil?
      rep =  Repository.new(name)
      @@repositories[name] = rep
    end
    rep
  end

  # returns the total number of top-level directories seen during this commit
  def Repository.count
    @@repositories.size
  end

  # iterate over all the Repository objects created for this commit
  def Repository.each
    @@repositories.each_value do |rep|
      yield rep
    end
  end

  # returns an array of all the repository objects seen during this commit
  def Repository.array
    @@repositories.values
  end

  # get a string representation of the repository to appear in email subjects.
  # This will be the repository name, plus (possibly) the name of the branch
  # on which the commit occurred.  If the commit was to multiple branches, the
  # text '..' is used, rather than a branch name
  def to_s
    @name
  end
end

# Records properties of a file that were changed during this commit
class FileEntry
  def initialize(path)
    @path = path
    @repository = Repository.get(path)
    @repository.merge_common_prefix(basedir())
  end

  # the full path and filename within the repository
  attr_accessor :path
  # the type of change committed 'M'=modified, 'A'=added, 'R'=removed
  attr_accessor :type
  # file version number before the commit
  attr_accessor :fromVer
  # file version number after the commit
  attr_accessor :toVer

  # works out the filename part of #path
  def file
    @path =~ /.*\/(.*)/
    $1
  end

  # works out the directory part of #path
  def basedir
    @path =~ /(.*)\/.*/
    $1
  end

  # gives the Repository object this file was automatically associated with
  # on construction
  def repository
    @repository
  end

  # gets the part of #path that comes after the prefix common to all files
  # in the commit to #repository
  def name_after_common_prefix
    @path.slice(@repository.common_prefix.size+1, @path.size-@repository.common_prefix.size-1)
  end

  # was tag removed?
  def removal?
    @type == "R"
  end

  # was tag added?
  def addition?
    @type == "A"
  end

  # was tag simply moved?
  def modification?
    @type == "M"
  end

end

# Superclass for things that eat lines of input, and turn them into output
# for our email.  The 'input' will be provided by #LogReader
# Subclasses of LineConsumer will be registered in the global $handlers later
# on in this file.
class LineConsumer
  # passes each line from 'lines' to the consume() method (which must be
  # implemented by subclasses).
  def handleLines(lines)
    @lineCount = 0
    setup
    lines.each do |line|
      @lineCount += 1
      consume(line)
    end
    teardown
  end

  # Template method called by handleLines to do any subclass-specific setup
  # required.  Default implementation does nothing
  def setup
  end

  # Template method called by handleLines to do any subclass-specific cleanup
  # required.  Default implementation does nothing
  def teardown
  end

  # Returns the number of lines handleLines() has seen so far
  def lineno
    @lineCount
  end

end


# Handle lines from LogReader that represent the name of the tag for
# the next file(s) in the log.
class TagHandler < LineConsumer
  def initialize
    @tag = nil
    @path = nil
    @type = nil
  end

  def consume(line)
    # TODO: check there is only one line
    @tag,@path,@type = line.split(/,/)
    $tag = @tag
    $module_path = @path
    $tag_type = @type
  end
end

# A do-nothing superclass for objects that know how to create hyperlinks to
# web CVS interfaces (e.g. CVSweb).  Subclasses override these methods to
# wrap HTML link tags around the text that this class's methods generate.
class NoFrontend
  # Just returns an HTML-encoded version of the 'path' argument.  Subclasses
  # should turn this into a link to a webpage view of this CVS directory
  def path(path, tag)
    htmlEncode(path)
  end

  # Just returns the value of the 'version' argument.  Subclasses should change
  # this into a link to the given version of the file.
  def version(path, version)
    version
  end
end

# Superclass for objects that can link to CVS frontends on the web (ViewCVS,
# Chora, etc.).
class WebFrontend < NoFrontend

  attr_accessor :repository_name

  def initialize(base_url)
    @base_url = base_url
    @repository_name = nil
  end

  def path(path, tag)
    path_for_href = ""
    result = ""
    path.split("/").each do |component|
      unless result == ""
        result << "/"
        path_for_href << "/"
      end
      path_for_href << component
      # The link is split over two lines so that long paths don't create
      # huge HTML source-lines in the resulting email.  This is an attempt to
      # avoid having to produce a quoted-printable message (so that long lines
      # can be dealt with properly).
      result << "<a\n"
      result << "href=\"#{path_url(path_for_href, tag)}\">#{htmlEncode(component)}</a>"
    end
    result
  end

  def version(path, version)
    if version == "NONE"
      version
    else
      "<a href=\"#{version_url(path, version)}\">#{version}</a>"
    end
  end

 protected
  def add_repo(url)
    if @repository_name
      if url =~ /\?/
        "#{url}&amp;cvsroot=#{urlEncode(@repository_name)}"
      else
        "#{url}?cvsroot=#{urlEncode(@repository_name)}"
      end
    else
      url
    end
  end
end

# Link to ViewCVS
class ViewCVSFrontend < WebFrontend
  def initialize(base_url)
    super(base_url)
  end

  def path_url(path, tag)
    if tag == nil
      add_repo(@base_url + urlEncode(path))
    else
      add_repo("#{@base_url}#{urlEncode(path)}?only_with_tag=#{urlEncode(tag)}")
    end
  end

  def version_url(path, version)
    add_repo("#{@base_url}#{urlEncode(path)}?rev=#{version}&amp;content-type=text/vnd.viewcvs-markup")
  end

end

# Link to Chora, from the Horde framework
class ChoraFrontend < WebFrontend
  def path_url(path, tag)
    # TODO: can we pass the tag somehow?
    "#{@base_url}/cvs.php/#{urlEncode(path)}"
  end

  def version_url(path, version)
    "#{@base_url}/co.php/#{urlEncode(path)}?r=#{version}"
  end

end

# Link to CVSweb
class CVSwebFrontend < WebFrontend
  def path_url(path, tag)
    if tag == nil
      add_repo(@base_url + urlEncode(path))
    else
      add_repo("#{@base_url}#{urlEncode(path)}?only_with_tag=#{urlEncode(tag)}")
    end
  end

  def version_url(path, version)
    add_repo("#{@base_url}#{urlEncode(path)}?rev=#{version}&amp;content-type=text/x-cvsweb-markup")
  end

end


# in need of refactoring...

# Note when LogReader finds record of a file that had a tag added in this
# commit
class AddedFileHandler < LineConsumer
  def consume(line)
    path,toVer = line.split(/,/)
    $file = FileEntry.new($module_path + "/" + path)
    $file.toVer = toVer
    $file.type = "A"
    $fileEntries << $file
  end
end

# Note when LogReader finds record of a file that had a tag removed in this
# commit
class RemovedFileHandler < LineConsumer
  def consume(line)
    path,fromVer = line.split(/,/)
    $file = FileEntry.new($module_path + "/" + path)
    $file.fromVer = fromVer
    $file.type = "R"
    $fileEntries << $file
  end
end

# Note when LogReader finds record of a file that had a tag modified in this
# commit
class ModifiedFileHandler < LineConsumer
  def consume(line)
    path,fromVer,toVer = line.split(/,/)
    $file = FileEntry.new($module_path + "/" + path)
    $file.fromVer = fromVer
    $file.toVer = toVer
    $file.type = "M"
    $fileEntries << $file
  end
end


# an RFC 822 email address
class EmailAddress
  def initialize(text)
    if text =~ /^\s*([^<]+?)\s*<\s*([^>]+?)\s*>\s*$/
      @personal_name = $1
      @address = $2
    else
      @personal_name = nil
      @address = text
    end
  end

  attr_accessor :personal_name, :address

  def has_personal_name?
    return !@personal_name.nil?
  end

  def encoded
    if has_personal_name?
      "#{encoded_personal_name} <#{address}>"
    else
      @address
    end
  end

  def to_s
    if has_personal_name?
      "#{personal_name} <#{address}>"
    else
      @address
    end
  end

  private

  def encoded_personal_name
    personal_name.split(" ").map{|word| encode_word(word)}.join(" ")
  end

  # rfc2047 encode the word, if it contains non-ASCII characters
  def encode_word(word)
    if $encoder.requires_rfc2047?(word)
      encoded = $encoder.marker_start_quoted
      $encoder.each_char_encoded(word) do |code|
        encoded << code
      end
      encoded << $encoder.marker_end_quoted
      return encoded
    end
    word
  end
end


cvsroot_dir = "#{ENV['CVSROOT']}/CVSROOT"
$config = "#{cvsroot_dir}/cvsspam.conf"
$users_file = "#{cvsroot_dir}/users"

$debug = false
$recipients = Array.new
$sendmail_prog = "/usr/sbin/sendmail"
$hostname = ENV['HOSTNAME'] || 'localhost'
$viewcvsURL = nil
$choraURL = nil
$cvswebURL = nil
$from_address = nil
$subjectPrefix = nil
$files_in_subject = false;
$smtp_host = nil
$repository_name = nil
# 2MiB limit on attached diffs,
$mail_size_limit = 1024 * 1024 * 2
$arg_charset = nil

blah("ARGV is <#{ARGV.join('>, <')}>")

require 'getoptlong'

opts = GetoptLong.new(
  [ "--to",     "-t", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--config", "-c", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--debug",  "-d", GetoptLong::NO_ARGUMENT ],
  [ "--from",   "-u", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--charset",      GetoptLong::REQUIRED_ARGUMENT ]
)

opts.each do |opt, arg|
  $recipients << EmailAddress.new(arg) if opt=="--to"
  $config = arg if opt=="--config"
  $debug = true if opt=="--debug"
  $from_address = EmailAddress.new(arg) if opt=="--from"
  # must use a different variable, as the config is read later.
  $arg_charset = arg if opt == "--charset"
end


if ARGV.length != 1
  if ARGV.length > 1
    $stderr.puts "extra arguments not needed: #{ARGV[1, ARGV.length-1].join(', ')}"
  else
    $stderr.puts "missing required file argument"
  end
  puts "Usage: #{$0} [ --to <email> ] [ --config <file> ] <collect_tags file>"
  exit(-1)
end

$logfile = ARGV[0]


$additionalHeaders = Array.new
$problemHeaders = Array.new

# helper function called from the 'config file'
def addHeader(name, value)
  if name =~ /^[!-9;-~]+$/
    $additionalHeaders << [name, value]
  else
    $problemHeaders << [name, value]
  end
end
# helper function called from the 'config file'
def addRecipient(email)
  $recipients << EmailAddress.new(email)
end
# 'constant' used from the 'config file'
class GUESS
end

if FileTest.exists?($config)
  load $config
else
  blah("Config file '#{$config}' not found, ignoring")
end

unless $arg_charset.nil?
  $charset = $arg_charset
end

if $recipients.empty?
  fail "No email recipients defined"
end

if $viewcvsURL != nil
  $viewcvsURL << "/" unless $viewcvsURL =~ /\/$/
  $frontend = ViewCVSFrontend.new($viewcvsURL)
elsif $choraURL !=nil
  $frontend = ChoraFrontend.new($choraURL)
elsif $cvswebURL !=nil
  $cvswebURL << "/" unless $cvswebURL =~ /\/$/
  $frontend = CVSwebFrontend.new($cvswebURL)
else
  $frontend = NoFrontend.new
end

if $viewcvsURL != nil || $cvswebURL !=nil
  if $repository_name == GUESS
    # use the last component of the repository path as the name
    ENV['CVSROOT'] =~ /([^\/]+$)/
    $frontend.repository_name = $1
  elsif $repository_name != nil
    $frontend.repository_name = $repository_name
  end
end



$handlers = Hash["T" => TagHandler.new,
     "A" => AddedFileHandler.new,
     "R" => RemovedFileHandler.new,
     "M" => ModifiedFileHandler.new]

$fileEntries = Array.new
$module_path = nil


File.open($logfile) do |log|
  reader = LogReader.new(log)

  until reader.eof
    handler = $handlers[reader.currentLineCode]
    if handler == nil
      raise "No handler for lines marked '##{reader.currentLineCode}'"
    end
    handler.handleLines(reader.getLines)
  end
end

if $fileEntries.length == 0
  blah("No tagging operation performed")
  exit
end

if $subjectPrefix == nil
  $subjectPrefix = "[CVS #{Repository.array.join(',')}] tag operation"
end

if $files_in_subject
  all_files = ""
  $fileEntries.each do |file|
    name = htmlEncode(file.name_after_common_prefix)
    if all_files != ""
      all_files = all_files + ";" + name
    else
      all_files = name
    end
  end
  $mailSubject = all_files + ":" + $tag
end

mailSubject = "#{$subjectPrefix} #{$mailSubject}"
if mailSubject.length > $maxSubjectLength
  mailSubject = mailSubject[0, $maxSubjectLength]
end

$encoder = HeaderEncoder.new
# TODO: maybe we should use the system-default value instead of ISO Latin 1?
$encoder.charset = $charset.nil? ? "ISO-8859-1" : $charset


# generate the email header (and footer) for the email body to a temp file
# (which is simply included in the middle)
def make_html_email(mail)
  mail.puts(<<HEAD)
<html>
<head>
<style><!--
  body {background-color:#ffffff;}
  #summary {border:1px solid #eeeeee;margin-top:1em;margin-bottom:1em;}
  #branch {background-color:#ffe4c4;}
  tr.alt {background-color:#eeeeee;}
  .added {background-color:#ddffdd;}
  tr.alt .added {background-color:#ccf7cc;}
  .removed {background-color:#ffdddd;}
  tr.alt .removed {background-color:#f7cccc;}
  .modified {background-color:#bde8e6;}
  tr.alt .modified {background-color:#add8e6;}
  #info {color:#888888;}
  td {padding-left:.3em;padding-right:.3em;}
  tr.head {border-bottom-width:1px;border-bottom-style:solid;}
  tr.head td {padding:.5em;}
  .error {color:red;}
--></style>
</head>
<body>
HEAD

  unless ($problemHeaders.empty?)
    mail.puts("<strong class=\"error\">Bad header format in '#{$config}':<ul>")
    $stderr.puts("Bad header format in '#{$config}':")
    $problemHeaders.each do |header|
      mail.puts("<li><pre>#{htmlEncode(header[0])}</pre></li>")
      $stderr.puts(" - #{header[0]}")
    end
    mail.puts("</ul></strong>")
  end
  mail.puts("<div id=\"summary\">")
  mail.puts("<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" rules=\"cols\">")

  filesAdded = 0
  filesRemoved = 0
  filesModified  = 0
  file_count = 0
  lastPath = ""
  last_repository = nil
  branch = $tag_type == 'T' ? ' <span id="branch">(branch)</span>' : '';
  $fileEntries.each do |file|
    unless file.repository == last_repository
      last_repository = file.repository
      mail.print(<<EOT)
<tr class="head"><td colspan="2">
Tag operation in
#{$frontend.path(last_repository.common_prefix,$tag)}
with tag <b>#{$tag}</b>.#{branch}</td></tr>
EOT
    end
    file_count += 1
    if (file_count%2==0)
      mail.print("<tr class=\"alt\">")
    else
      mail.print("<tr>")
    end
    if file.addition?
      filesAdded += 1
    elsif file.removal?
      filesRemoved += 1
    elsif file.modification?
      filesModified += 1
    end
    name = htmlEncode(file.name_after_common_prefix)
    slashPos = name.rindex("/")
    if slashPos==nil
      prefix = ""
    else
      thisPath = name[0,slashPos]
      name = name[slashPos+1,name.length]
      if thisPath == lastPath
        prefix = "&nbsp;"*(slashPos) + "/"
      else 
        prefix = thisPath + "/"
      end
      lastPath = thisPath
    end
    if file.addition?
      name = "<span class=\"added\">#{name}</span>"
    elsif file.removal?
      name = "<span class=\"removed\">#{name}</span>"
    elsif file.modification?
      name = "<span class=\"modified\">#{name}</span>"
    end
    mail.print("<td><tt>#{prefix}#{name}</tt></td>")
    if file.addition?
      mail.print("<td nowrap=\"nowrap\" align=\"right\">tag added to #{$frontend.version(file.path,file.toVer)}</td>")
    elsif file.removal?
      mail.print("<td nowrap=\"nowrap\">tag removed from #{$frontend.version(file.path,file.fromVer)}</td>")
    elsif file.modification?
      mail.print("<td nowrap=\"nowrap\" align=\"center\">tag moved from #{$frontend.version(file.path,file.fromVer)} to #{$frontend.version(file.path,file.toVer)}</td>")
    end

    mail.puts("</tr>")
  end
  
  mail.puts("</table>")
  mail.puts("</div>")

  totalFilesChanged = filesAdded+filesRemoved+filesModified
  if totalFilesChanged > 1
    mail.print("<small id=\"info\">")
    changeKind = 0
    if filesAdded>0
      mail.print("#{filesAdded} files with tags added")
      changeKind += 1
    end
    if filesRemoved>0
      mail.print(" + ") if changeKind>0
      mail.print("#{filesRemoved} files with tags removed")
      changeKind += 1
    end
    if filesModified>0
      mail.print(" + ") if changeKind>0
      mail.print("#{filesModified} files with tags moved")
      changeKind += 1
    end
    mail.print(", total #{totalFilesChanged}") if changeKind > 1
    mail.puts("</small><br />")
  end

  mail.puts("<center><small><a href=\"http://www.badgers-in-foil.co.uk/projects/cvsspam/\" title=\"tag -&gt; email\">CVSspam</a> #{$version}</small></center>")

  mail.puts("</body></html>")

end

# Tries to look up an 'alias' email address for the given string in the
# CVSROOT/users file, if the file exists.  The argument is returned unchanged
# if no alias is found.
def sender_alias(email)
  if File.exists?($users_file)
    File.open($users_file) do |io|
      io.each_line do |line|
        if line =~ /^([^:]+)\s*:\s*(['"]?)([^\n\r]+)(\2)/
          if email.address == $1
            return EmailAddress.new($3)
          end
        end
      end
    end
  end
  email
end

# A handle for code that needs to add headers and a body to an email being
# sent.  This wraps an underlying IO object, and is responsible for doing
# sensible header formatting, and for ensuring that the body is separated
# from the message headers by a blank line (as it is required to be).
class MailContext
  def initialize(io)
    @done_headers = false
    @io = io
  end

  # add a header to the email.  raises an exception if #body has already been
  # called
  def header(name, value)
    raise "headers already committed" if @done_headers
    if name == "Subject"
      $encoder.encode_header(@io, "Subject", value)
    else
      @io.puts("#{name}: #{value}")
    end
  end

  # yields an IO that should be used to write the message body
  def body
    @done_headers = true
    @io.puts
    yield @io
  end
end

# provides a send() method for sending email by invoking the 'sendmail'
# command-line program
class SendmailMailer
  def send(from, recipients)
    # The -t option causes sendmail to take message headers, as well as the
    # message body, from its input.  The -oi option stops a dot on a line on
    # its own from being interpreted as the end of the message body (so
    # messages that have such a line don't fail part-way through sending).
    cmd = "#{$sendmail_prog} -t -oi"
    blah("invoking '#{cmd}'")
    IO.popen(cmd, "w") do |mail|
      ctx = MailContext.new(mail) 
      ctx.header("To", recipients.map{|addr| addr.encoded}.join(','))
      if from
        blah("Mail From: <#{from}>")
      else
        blah("Mail From not set")
      end
      ctx.header("From", from.encoded) if from
      yield ctx
    end
  end
end

# provides a send() method for sending email by connecting to an SMTP server
# using the Ruby Net::SMTP package.
class SMTPMailer
  def initialize(smtp_host)
    @smtp_host = smtp_host
  end

  class IOAdapter
    def initialize(mail)
      @mail = mail
    end
    def puts(text="")
      @mail.write(text)
      @mail.write("\r\n")
    end
    def print(text)
      @mail.write(text)
    end
  end

  def send(from, recipients)
    if from == nil
      from = EmailAddress.new(ENV['USER'] || ENV['USERNAME'] || 'cvsspam')
    end  
    unless from.address =~ /@/
      from.address = "#{from.address}@#{$hostname}"
    end
    smtp = Net::SMTP.new(@smtp_host)
    blah("connecting to '#{@smtp_host}'")
    smtp.start()
    smtp.ready(from.address, recipients.map{|addr| addr.address}) do |mail|
      ctx = MailContext.new(IOAdapter.new(mail))
      ctx.header("To", recipients.map{|addr| addr.encoded}.join(','))
      blah("Mail From: <#{from}>")
      ctx.header("From", from.encoded) if from
      ctx.header("Date", Time.now.utc.strftime(DATE_HEADER_FORMAT))
      yield ctx
    end
  end
end


def make_msg_id(localpart, hostpart)
  "<cvsspam-#{localpart}@#{hostpart}>"
end


# replaces control characters, and a selection of other characters that
# may not appear unquoted in an RFC822 'word', with underscores.  (It
# doesn't actually zap '.' though.)
def zap_header_special_chars(text)
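  # the special characters must sit inside one character class; written as a
  # plain sequence, the pattern would almost never match anything
  text.gsub(/[<>()\[\]@,;:\\\000-\037\177]/, "_")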
end


# Mail clients will try to 'thread' together a conversation over
# several email messages by inspecting the In-Reply-To and References headers,
# which should refer to previous emails in the conversation by mentioning
# the value of the previous message's Message-Id header.  This function invents
# values for these headers so that, in the special case where a *single* file
# is committed to repeatedly, the emails giving notification of these commits
# can be threaded together automatically by the mail client.
def inject_threading_headers(mail)
  return unless $fileEntries.length == 1
  file = $fileEntries[0]
  name = zap_header_special_chars(file.path)
  unless file.fromVer == "NONE"
    mail.header("References", make_msg_id("#{name}.#{file.fromVer}", $hostname))
  end
  unless file.toVer == "NONE"
    mail.header("Message-ID", make_msg_id("#{name}.#{file.toVer}", $hostname))
  end
end


if $smtp_host
  require 'net/smtp'
  mailer = SMTPMailer.new($smtp_host)
else
  mailer = SendmailMailer.new
end

if $from_address == nil
  $from_address = EmailAddress.new(ENV['USER'] || ENV['USERNAME'] || 'cvsspam')
end
$from_address = sender_alias($from_address)

mailer.send($from_address, $recipients) do |mail|
  mail.header("Subject", mailSubject)
  inject_threading_headers(mail)
  mail.header("MIME-Version", "1.0")
  mail.header("Content-Type", "text/html" + ($charset.nil? ? "" : "; charset=\"#{$charset}\""))
  if ENV['REMOTE_HOST']
    # TODO: I think this will always be an IP address.  If a hostname is
    # possible, it may need encoding of some kind,
    mail.header("X-Originating-IP", "[#{ENV['REMOTE_HOST']}]")
  end
  unless ($additionalHeaders.empty?)
    $additionalHeaders.each do |header|
      mail.header(header[0], header[1])
    end
  end
  mail.header("X-Mailer", "CVSspam #{$version} <http://www.badgers-in-foil.co.uk/projects/cvsspam/>")

  mail.body do |body|
    make_html_email(body)
  end
end
-------------- next part --------------
#!/home/haroon/bin/ruby -w

# Part of CVSspam
#   http://www.badgers-in-foil.co.uk/projects/cvsspam/
# Copyright (c) David Holroyd

#  ARGV is 'tagname,tagaction,foo/dir,filea,NONE,1.1,fileb,1.3,1.4'
#  


# Assumptions
# - file names do not contain newlines or single quotes


$tmpdir = ENV["TMPDIR"] || "/tmp"
$dirtemplate = "#cvsspam.#{Process.getpgrp}.#{Process.uid}"

def find_data_dir
  Dir["#{$tmpdir}/#{$dirtemplate}-*"].each do |dir|
    stat = File.stat(dir)
    return dir if stat.owned?
  end
  nil
end


def blah(msg)
  if $debug
    $stderr.puts "#{$0}: #{msg}"
  end
end


class ChangeInfo
  def initialize(file, fromVer, toVer, tagOp)
    @file, @fromVer, @toVer, @tagOp = file, fromVer, toVer, tagOp
    if fromVer == toVer
      unless tagOp == 'del' || tagOp == 'mov'
        fail "'from' and 'to' versions should be different ('#{fromVer}')"
      end
    end
  end
  attr_reader :file, :fromVer, :toVer, :tagOp
  def to_s
    "<ChangeInfo \"#{@file}\" tag: \"#{@tagOp}\" #{@toVer}<--#{@fromVer}>"
  end

  def isAddition ; tagOp == 'add' end

  def isRemoval ; tagOp == 'del' end

  def isModification ; tagOp == 'mov' end

  def isRedundant ; isModification && fromVer == toVer end
end


# cvs_info comes from the command line, ultimately as the expansion of the
# %{sVv} in $CVSROOT/posttag.  It isn't possible to parse this value
# unambiguously, but we make an effort to get it right in as many cases as
# possible.
def collect_modern_style_args(cvs_info, operation)
  changes = Array.new
  i = 0
  while i < cvs_info.length
    change = ChangeInfo.new(cvs_info[i], cvs_info[i+=1], cvs_info[i+=1], operation)
    changes << change unless change.isRedundant
    i+=1
  end
  return changes
end

# Replace multiple adjacent forward slashes with a single slash.
def sanitise_path(path)
  path.gsub(/\/+/, "/")
end

def process_log(cvs_info)
  cvsroot = sanitise_path(ENV['CVSROOT'])

  $datadir = find_data_dir()

  raise "missing data dir (#{$tmpdir}/#{$dirtemplate}-XXXXXX)" if $datadir==nil

  tag = cvs_info.shift
  operation = cvs_info.shift
  $repository_path = cvs_info.shift

  if $use_modern_argument_list
    changes = collect_modern_style_args(cvs_info, operation)
  end

  File.open("#{$datadir}/tagfile", File::WRONLY|File::CREAT|File::APPEND) do |file|

    if changes.length != 0
      # record tag information
      file.puts "#T #{tag},#{$repository_path},#{$tag_type}"
    end

    changes.each do |change|

      if change.isAddition
        file.puts "#A #{change.file},#{change.toVer}"
      elsif change.isRemoval
        file.puts "#R #{change.file},#{change.fromVer}"
      else
        file.puts "#M #{change.file},#{change.fromVer},#{change.toVer}"
      end

    end
  end
end



def mailtest
  lastdir = nil
  File.open("#{$datadir}/lastdir") do |file|
    lastdir = sanitise_path(file.gets)
  end
  if $repository_path == lastdir
    blah("sending spam.  (I am #{$0})")
    # REVISIT: $0 will not contain the path to this script on all systems
    cmd = File.dirname($0) + "/cvsspam_tag.rb"
    unless system(cmd, "#{$datadir}/tagfile", *$passthroughArgs)
      # $? holds the child's exit status; $! is not set by a failed system()
      fail "problem running '#{cmd}' (#{$?})"
    end
    if $debug
      blah("leaving file #{$datadir}/tagfile")
    else 
      File.unlink("#{$datadir}/tagfile")
    end
    if $debug
      blah("leaving file #{$datadir}/lastdir")
    else 
      File.unlink("#{$datadir}/lastdir")
    end
    Dir.rmdir($datadir) unless $debug
  else
    blah("not spam time yet, #{$repository_path}!=#{lastdir}")
  end
end


class CVSConfig
  def initialize(filename)
    @data = Hash.new
    File.open(filename) do |io|
      read(io)
    end
  end

  def read(io)
    io.each do |line|
      parse_line(line)
    end
  end

  def parse_line(line)
    # strip any comment (assumes values can't contain '#')
    line.sub!(/#.*$/, "")
    if line =~ /^\s*(.*?)\s*=\s*(.*?)\s*$/
      @data[$1] = $2
    end
  end

  def [](key)
    @data[key]
  end
end

$config = nil
$debug = false

unless ENV.has_key?('CVSROOT')
  fail "$CVSROOT not defined.  It should be when I am invoked from CVSROOT/posttag"
end





require 'getoptlong'

opts = GetoptLong.new(
  [ "--to",     "-t", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--config", "-c", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--debug",  "-d", GetoptLong::NO_ARGUMENT ],
  [ "--from",   "-u", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--charset",      GetoptLong::REQUIRED_ARGUMENT ]
)

# arguments to pass through to 'cvsspam_tag.rb'
$passthroughArgs = Array.new
opts.each do |opt, arg|
  if ["--to", "--config", "--from", "--charset"].include?(opt)
    $passthroughArgs << opt << arg
  end
  if ["--debug"].include?(opt)
    $passthroughArgs << opt
  end
  $config = arg if opt=="--config"
  $debug = true if opt == "--debug"
end

blah("CVSROOT is #{ENV['CVSROOT']}")
blah("ARGV is <#{ARGV.join('>, <')}>")

cvsroot_dir = "#{ENV['CVSROOT']}/CVSROOT"

if $config == nil
  if FileTest.exist?("#{cvsroot_dir}/cvsspam.conf")
    $config = "#{cvsroot_dir}/cvsspam.conf"
  elsif FileTest.exist?("/etc/cvsspam/cvsspam.conf")
    $config = "/etc/cvsspam/cvsspam.conf"
  end

  if $config != nil
    $passthroughArgs << "--config" << $config
  end
end


$use_modern_argument_list = false

cvs_config_filename = "#{cvsroot_dir}/config"

if FileTest.exist?(cvs_config_filename)
  cvs_config = CVSConfig.new(cvs_config_filename)

  $use_modern_argument_list = cvs_config["UseNewInfoFmtStrings"] == "yes"
end

if $config != nil
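  if FileTest.exist?($config)
    # The config file is Ruby code that is also passed on to cvsspam_tag.rb;
    # define no-op stand-ins for the directives it may call so that loading
    # it here does not fail.
    def addHeader(name,val)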
    end
    def addRecipient(who)
    end
    class GUESS
    end
    load $config
  else
    blah("Config file '#{$config}' not found, ignoring")
  end
end

if $use_modern_argument_list
  $tag_type = ARGV.shift
  if ARGV.length % 3 != 0
    $stderr.puts "Expected 3 arguments for each file"
  end
  process_log(ARGV)
end
mailtest
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.badgers-in-foil.co.uk/pipermail/cvsspam-devel/attachments/20061215/bb665658/sample-0001.html

