[cvsspam-devel] encoding from header

David Holroyd dave at badgers-in-foil.co.uk
Fri Feb 16 23:50:56 UTC 2007


On Fri, Feb 16, 2007 at 06:12:48PM +0200, Elan Ruusamäe wrote:
> On Monday 11 July 2005 19:35:45 David Holroyd wrote:
> > On Fri, Jul 08, 2005 at 04:49:56PM +0300, Elan Ruusamäe wrote:
> > > but today i noticed some problem, particularly, i have patch which
> > > allows you to specify --charset argument to cvsspam [1], which will
> > > override charset sent to outgoing email. it is used because different
> > > locations in cvs have files in different encodings. but the problem is
> > > that CVSROOT/users file is in constant encoding (iso8859-1 in my case),
> > > and as a side effect the From header gets wrongly encoded.
> > >
> > > cvsspam called with --charset utf-8 parameter produces mail header:
> > > From: Elan =?utf-8?q?Ruusam=e4e?= <glen at delfi.ee>
> > >
> > > perhaps you could look into this issue?
> > >
> > > [1] http://cvs.pld-linux.org/SOURCES/cvsspam-charset-arg.patch
> >
> > Would it be reasonable to try and detect the system character encoding,
> > and use that in preference to the --charset argument?  Any idea how that
> > detection should work?
> >
> > I've sometimes seen a charset appended to $LANG, but this is not the
> > case on most of the systems I use.
> >
> > I guess that there could simply be an extra option to specify the
> > encoding, but it would be nice for this to 'just work' without the need
> > for extra config.
> 
> sorry for bringing up this old issue, how about adding config option which 
> defines the charset of CVSROOT/users file? and if the variable is not 
> defined, just default to --charset arg or 'iso8859-1'.
> 
> it just has to set proper encoding when encoding From header, no encoding 
> conversion is necessary.
> 
> currently sent out line:
> From: Elan =?utf-8?q?Ruusam=e4e?= <glen at delfi.ee>
> 
> should be just:
> From: Elan =?iso8859-1?q?Ruusam=e4e?= <glen at delfi.ee>

How about something like the attached (untested) change?


ta,
dave

-- 
http://david.holroyd.me.uk/
-------------- next part --------------
Index: cvsspam.rb
===================================================================
--- cvsspam.rb	(revision 254)
+++ cvsspam.rb	(working copy)
@@ -162,8 +162,9 @@
 
   # gives a string starting "=?", and including a charset specification, that
   # marks the start of a quoted-printable character sequence
-  def marker_start_quoted
-    "=?#{@charset}?#{@encoding}?"
+  def marker_start_quoted(charset=nil)
+    charset = @charset if charset.nil?
+    "=?#{charset}?#{@encoding}?"
   end
 
   # test to see of the given string contains non-ASCII characters
@@ -1243,7 +1244,7 @@
 
 # an RFC 822 email address
 class EmailAddress
-  def initialize(text)
+  def initialize(text, charset=nil)
     if text =~ /^\s*([^<]+?)\s*<\s*([^>]+?)\s*>\s*$/
       @personal_name = $1
       @address = $2
@@ -1251,9 +1252,10 @@
       @personal_name = nil
       @address = text
     end
+    @charset=charset
   end
 
-  attr_accessor :personal_name, :address
+  attr_accessor :personal_name, :address, :charset
 
   def has_personal_name?
     return !@personal_name.nil?
@@ -1284,7 +1286,7 @@
   # rfc2047 encode the word, if it contains non-ASCII characters
   def encode_word(word)
     if $encoder.requires_rfc2047?(word)
-      encoded = $encoder.marker_start_quoted
+      encoded = $encoder.marker_start_quoted(@charset)
       $encoder.each_char_encoded(word) do |code|
 	encoded << code
       end
@@ -1299,6 +1301,7 @@
 cvsroot_dir = "#{ENV['CVSROOT']}/CVSROOT"
 $config = "#{cvsroot_dir}/cvsspam.conf"
 $users_file = "#{cvsroot_dir}/users"
+$users_file_charset = nil
 
 $debug = false
 $recipients = Array.new
@@ -1762,7 +1765,7 @@
       io.each_line do |line|
         if line =~ /^([^:]+)\s*:\s*(['"]?)([^\n\r]+)(\2)/
           if email.address == $1
-            return EmailAddress.new($3)
+            return EmailAddress.new($3, $users_file_charset)
           end
         end
       end
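
For illustration only, here is a minimal standalone sketch of the behaviour
the patch is aiming for: the bytes read from CVSROOT/users are Q-encoded as-is
and only the charset label in the encoded-word changes. SketchEncoder and its
encode_word are hypothetical, simplified stand-ins and not the real cvsspam
classes; only marker_start_quoted mirrors the patched method. The sample name
is assumed to be stored as ISO-8859-1 bytes, as in the report above.

```ruby
# Hypothetical, simplified stand-in for cvsspam's header encoder.
class SketchEncoder
  def initialize(charset)
    @charset = charset  # charset of the mail body (the --charset argument)
    @encoding = "q"     # RFC 2047 "Q" encoding
  end

  # Same fallback as the patch: use the override if given, else @charset.
  def marker_start_quoted(charset = nil)
    charset = @charset if charset.nil?
    "=?#{charset}?#{@encoding}?"
  end

  # Q-encode the raw bytes of a word. The bytes are only labelled with a
  # charset, never converted. Pure-ASCII words are returned unchanged.
  def encode_word(word, charset = nil)
    return word if word.bytes.all? { |b| b < 0x80 }
    encoded = marker_start_quoted(charset)
    word.each_byte do |b|
      if b.chr =~ /\A[A-Za-z0-9]\z/
        encoded << b.chr               # ASCII letters and digits pass through
      else
        encoded << format("=%02x", b)  # everything else becomes =xx
      end
    end
    encoded << "?="
  end
end

# "Ruusamäe" as ISO-8859-1 bytes (0xE4 is a-umlaut), as stored in the users file.
name = "Ruusam\xE4e".b

encoder = SketchEncoder.new("utf-8")
puts encoder.encode_word(name)               # mislabelled: =?utf-8?q?Ruusam=e4e?=
puts encoder.encode_word(name, "iso8859-1")  # wanted:      =?iso8859-1?q?Ruusam=e4e?=
```

The patch only declares $users_file_charset with a nil default, so the
--charset value is still used unless something sets it. Presumably it would be
assigned in cvsspam.conf, but that is an assumption and not shown in the diff.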

