[cvsspam-devel] diffs not character safe

Elan Ruusamäe glen at delfi.ee
Tue Mar 6 10:35:36 UTC 2007


It appears that when --charset utf-8 is passed to collect_diffs, the diffs are 
computed bytewise rather than characterwise.

And since cvsspam appears to colour the changed part of a line darker, this 
breaks multibyte characters.

So if the diff is:
-	'map_tab_label'			=> 'карта',
+	'map_tab_label'			=> 'Карта',

cvsspam starts the highlight after the first byte of the letter 'к', because 
the first byte of its UTF-8 encoding (0xD0) is the same in both lines.
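
To illustrate the effect, here is a minimal sketch in Python. It is not taken from cvsspam; `common_prefix_len` is a hypothetical helper, and I'm only assuming the highlighter looks for the unchanged prefix of the two lines. Comparing the encoded bytes puts the boundary in the middle of the first Cyrillic letter, while comparing decoded characters puts it right after the opening quote.

```python
def common_prefix_len(a, b):
    """Length of the longest common prefix of two sequences (str or bytes)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# The changed values from the diff above (hypothetical variable names).
old = "'карта'"
new = "'Карта'"

# Character-wise comparison: only the opening quote is shared.
char_prefix = common_prefix_len(old, new)

# Byte-wise comparison on the UTF-8 encoding: the quote AND the lead byte
# 0xD0 of the first letter are shared, so the split lands inside the letter.
old_b = old.encode("utf-8")
new_b = new.encode("utf-8")
byte_prefix = common_prefix_len(old_b, new_b)

print(char_prefix, byte_prefix)
print(old_b[:byte_prefix], old_b[byte_prefix:byte_prefix + 1])
print(new_b[:byte_prefix], new_b[byte_prefix:byte_prefix + 1])
```

Decoding the lines with the given charset before looking for the changed region would keep the highlight on character boundaries.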

I've attached the mail fragment, as it can't be displayed properly in this 
UTF-8-encoded email.

-- 
glen
-------------- next part --------------
</pre><pre class="diff" id="removed">-	'map_tab_label'			=&gt; 'Ð<span id="removedchars">º</span>арта',
</pre><pre class="diff" id="added">+	'map_tab_label'			=&gt; 'Ð<span id="addedchars">š</span>арта',

