[cvsspam-devel] encoding from header
David Holroyd
dave at badgers-in-foil.co.uk
Fri Feb 16 23:50:56 UTC 2007
On Fri, Feb 16, 2007 at 06:12:48PM +0200, Elan Ruusamäe wrote:
> On Monday 11 July 2005 19:35:45 David Holroyd wrote:
> > On Fri, Jul 08, 2005 at 04:49:56PM +0300, Elan Ruusamäe wrote:
> > > but today i noticed a problem: i have a patch which
> > > allows you to specify a --charset argument to cvsspam [1], which will
> > > override the charset sent in outgoing email. it is used because different
> > > locations in cvs have files in different encodings. but the problem is
> > > that the CVSROOT/users file is in a fixed encoding (iso8859-1 in my case),
> > > and as a side effect the From header gets wrongly encoded.
> > >
> > > cvsspam called with --charset utf-8 parameter produces mail header:
> > > From: Elan =?utf-8?q?Ruusam=e4e?= <glen at delfi.ee>
> > >
> > > perhaps you could look into this issue?
> > >
> > > [1] http://cvs.pld-linux.org/SOURCES/cvsspam-charset-arg.patch
> >
> > Would it be reasonable to try and detect the system character encoding,
> > and use that in preference to the --charset argument? Any idea how that
> > detection should work?
> >
> > I've sometimes seen a charset appended to $LANG, but this is not the
> > case on most of the systems I use.
> >
> > I guess that there could simply be an extra option to specify the
> > encoding, but it would be nice for this to 'just work' without the need
> > for extra config.
>
> sorry for bringing up this old issue, how about adding a config option which
> defines the charset of the CVSROOT/users file? and if the variable is not
> defined, just default to the --charset arg or 'iso8859-1'.
>
> it just has to set the proper charset when encoding the From header; no
> encoding conversion is necessary.
>
> currently sent out line:
> From: Elan =?utf-8?q?Ruusam=e4e?= <glen at delfi.ee>
>
> should be just:
> From: Elan =?iso8859-1?q?Ruusam=e4e?= <glen at delfi.ee>
How about something like the attached (untested) change?
ta,
dave
--
http://david.holroyd.me.uk/
-------------- next part --------------
Index: cvsspam.rb
===================================================================
--- cvsspam.rb (revision 254)
+++ cvsspam.rb (working copy)
@@ -162,8 +162,9 @@
   # gives a string starting "=?", and including a charset specification, that
   # marks the start of a quoted-printable character sequence
-  def marker_start_quoted
-    "=?#{@charset}?#{@encoding}?"
+  def marker_start_quoted(charset=nil)
+    charset = @charset if charset.nil?
+    "=?#{charset}?#{@encoding}?"
   end
   # test to see of the given string contains non-ASCII characters
@@ -1243,7 +1244,7 @@
 # an RFC 822 email address
 class EmailAddress
-  def initialize(text)
+  def initialize(text, charset=nil)
     if text =~ /^\s*([^<]+?)\s*<\s*([^>]+?)\s*>\s*$/
       @personal_name = $1
       @address = $2
@@ -1251,9 +1252,10 @@
       @personal_name = nil
       @address = text
     end
+    @charset=charset
   end
-  attr_accessor :personal_name, :address
+  attr_accessor :personal_name, :address, :charset
   def has_personal_name?
     return !@personal_name.nil?
@@ -1284,7 +1286,7 @@
   # rfc2047 encode the word, if it contains non-ASCII characters
   def encode_word(word)
     if $encoder.requires_rfc2047?(word)
-      encoded = $encoder.marker_start_quoted
+      encoded = $encoder.marker_start_quoted(@charset)
       $encoder.each_char_encoded(word) do |code|
         encoded << code
       end
@@ -1299,6 +1301,7 @@
 cvsroot_dir = "#{ENV['CVSROOT']}/CVSROOT"
 $config = "#{cvsroot_dir}/cvsspam.conf"
 $users_file = "#{cvsroot_dir}/users"
+$users_file_charset = nil
 $debug = false
 $recipients = Array.new
@@ -1762,7 +1765,7 @@
     io.each_line do |line|
       if line =~ /^([^:]+)\s*:\s*(['"]?)([^\n\r]+)(\2)/
         if email.address == $1
-          return EmailAddress.new($3)
+          return EmailAddress.new($3, $users_file_charset)
         end
       end
     end
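
The intent of the change above can be checked in isolation with a small standalone sketch. This is not code from cvsspam.rb: q_encode_word and from_header are hypothetical helpers, glen@example.org is a placeholder address, and the sketch assumes the name is read from CVSROOT/users as raw ISO-8859-1 bytes. It only shows that the bytes pass through unconverted and that the charset label alone changes, falling back to the --charset value when no users-file charset is given.

```ruby
# Hypothetical helper (not in cvsspam.rb): Q-encode one word as an RFC 2047
# encoded-word, labelled with the given charset. No charset conversion is
# done; the raw bytes are emitted as-is.
def q_encode_word(word, charset)
  bytes = word.bytes
  return word if bytes.all? { |b| b < 0x80 } # pure ASCII needs no encoding

  body = bytes.map do |b|
    # Keep ASCII letters and digits literal; hex-escape every other byte.
    safe = (48..57).cover?(b) || (65..90).cover?(b) || (97..122).cover?(b)
    safe ? b.chr : format('=%02X', b)
  end.join
  "=?#{charset}?q?#{body}?="
end

# Hypothetical helper: build a From header. A nil users_file_charset falls
# back to the default (the --charset value), mirroring the fallback in the
# patched marker_start_quoted.
def from_header(personal_name, address, default_charset, users_file_charset = nil)
  charset = users_file_charset || default_charset
  words = personal_name.split(' ').map { |w| q_encode_word(w, charset) }
  "From: #{words.join(' ')} <#{address}>"
end

# Assumed input: the name as raw ISO-8859-1 bytes (0xE4 is a-umlaut).
name = "Elan Ruusam\xE4e".b

puts from_header(name, 'glen@example.org', 'utf-8', 'iso8859-1')
puts from_header(name, 'glen@example.org', 'utf-8')
```

The first call labels the word iso8859-1, which is what the thread asks for; the second reproduces the mislabelled utf-8 header. The sketch emits uppercase hex digits, whereas the headers quoted in the thread show lowercase.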