Neil Turner's Blog

Blogging about technology and randomness since 2002

Blocking ASCII Comments

Last Saturday I was hit by comment spam that used the wrong character set. Well, it now appears that I can block ASCII comments: comments made using the ISO-8859-1 character set would be blocked, whereas comments made using Unicode UTF-8 would be permitted. As UTF-8 is what the comment form insists on, this would, in theory, stop some of the comment spam getting through. I haven’t implemented it yet, but I may consider it.
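
For what it's worth, a check along these lines could be sketched roughly as follows. This is only an illustration in Python, not the plugin's actual code; the function name classify_comment and the labels it returns are made up for the example. One thing it makes plain is that text containing only ASCII characters is byte-for-byte identical in ISO-8859-1 and UTF-8, so the two can only be told apart when a comment contains non-ASCII characters.

```python
def classify_comment(raw: bytes) -> str:
    """Roughly classify the encoding of a submitted comment body.

    Hypothetical helper for illustration only.
    """
    # Pure ASCII is valid in both ISO-8859-1 and UTF-8, so the bytes
    # alone cannot tell us which character set the sender intended.
    if raw.isascii():
        return "ascii"
    # Non-ASCII bytes that decode cleanly are treated as UTF-8.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Typically ISO-8859-1 or another legacy single-byte encoding.
        return "not-utf8"
    return "utf-8"


if __name__ == "__main__":
    for body in (b"hello", "h\u00e9llo".encode("utf-8"), "h\u00e9llo".encode("latin-1")):
        print(body, classify_comment(body))
```

A filter built on this could only reject the "not-utf8" case with any confidence; rejecting "ascii" would also turn away ordinary English comments.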


  1. Erm, doesn’t that plugin require that there be at least one multi-byte character? All it seems to be doing is ensuring that the comment doesn’t match the regex /^[\x00-\xff]+$/, which seems quite useful if you have a site in Japanese or some other multibyte language, but perhaps less useful here, unless you are going to make hëävy mëtäl ümläüt use mandatory.

  2. Oh, wait, those wouldn’t get past that regex, either: ë, ä and ü take two bytes each in UTF-8, but they still fall within the \x00-\xff range. Requiring that all comments be made in Japanese or Chinese or another non-western language would probably cut all your comment problems down to size, though.

  3. I was actually waiting for someone like you to tell me that this wasn’t the plugin I wanted. Evidently I don’t know enough about character encoding.

  4. This plug-in completely broke commenting on my site! Do not use this plug-in!
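
To make the regex from the first comment concrete, here is a small Python sketch. It is not the plugin's own code, and the helper names are hypothetical. How the plugin actually applies the pattern is not stated in this thread, so the sketch shows both possibilities: matching against decoded characters and matching against raw bytes.

```python
import re

# The pattern quoted in the comments, applied to decoded text:
# it matches when every character is a code point from U+0000 to U+00FF.
CHAR_PATTERN = re.compile(r"^[\x00-\xff]+$")

# The same pattern applied to raw bytes: the class covers every
# possible byte value, so any non-empty byte string matches.
BYTE_PATTERN = re.compile(rb"^[\x00-\xff]+$")


def matches_chars(text: str) -> bool:
    """True if the comment would match the pattern character by character."""
    return CHAR_PATTERN.match(text) is not None


def matches_bytes(raw: bytes) -> bool:
    """True if the comment would match the pattern byte by byte."""
    return BYTE_PATTERN.match(raw) is not None


if __name__ == "__main__":
    for sample in ("heavy metal", "h\u00eb\u00e4vy m\u00ebt\u00e4l", "\u65e5\u672c\u8a9e"):
        print(sample, matches_chars(sample), matches_bytes(sample.encode("utf-8")))
```

Read character by character, the pattern matches plain English and umlauted text alike, and only fails to match once a character above U+00FF (Japanese, for example) appears. Read byte by byte, it matches every non-empty comment, whatever the language.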