Revision of Laconica Word Filter Plugin (Wordfilter) from 06/15/2009

07 Jan Tagged Laconica, microblogging, php, plugin, profanity

The TWiT Network has pretty strict rules about profanity across all of its channels, including the netcasts, chatrooms and the TWiT Army Canteen. There are usually moderators lurking in IRC and on the microblog, but once in a while some profanity slips through the cracks, so I wrote a little word filter to preempt it. I'm not sure if they'll use it, but it was fun to write. Sorry about the profanity in this post, but it's sort of necessary.

Installation

Save the plugin to plugins/Wordfilter.php and add the following to config.php:

# Wordfilter plugin.
require_once('plugins/Wordfilter.php');

# Aggressive filtering replaces ALL occurrences of search terms even
# if the search term appears in the middle of a word.  This increases
# false positives but also prevents circumventing the filter by wrapping
# curse words like "-shit-"
$config['wordfilter']['aggressive'] = true;

# List search and replace terms here.  A few have been included as examples.
# Replacements should be no longer than the terms they replace, so a
# filtered notice never grows longer than the original.

/* alternate list syntax
$config['wordfilter']['search'] = array('fuck', 'shit', 'bitch');
$config['wordfilter']['replace'] = array('frak', 'poop', 'dog');
*/


$config['wordfilter']['search'][] = 'blatherskite'; // for testing so you don't have to swear on your site.
$config['wordfilter']['replace'][] = 'blatherin';

$config['wordfilter']['search'][] = 'fuck';
$config['wordfilter']['replace'][] = 'frak';

$config['wordfilter']['search'][] = 'shit';
$config['wordfilter']['replace'][] = 'poop';

$config['wordfilter']['search'][] = 'bitch';
$config['wordfilter']['replace'][] = 'dog';

$wordfilter = new Wordfilter();
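
Because the search and replace lists are parallel arrays, a missing or extra entry silently pairs every later term with the wrong replacement. Here is a minimal sketch of a sanity check you could run against your lists. It is not part of the plugin, and the function name wordfilter_check_lists is made up for illustration. It reports mismatched list lengths and replacements that are longer than their search term.

```php
<?php
// Hypothetical helper (not part of the plugin): sanity-check the parallel
// search/replace lists before relying on them.
function wordfilter_check_lists(array $search, array $replace)
{
    $problems = array();

    // Each search term needs exactly one replacement at the same index.
    if (count($search) !== count($replace)) {
        $problems[] = sprintf('%d search terms but %d replacements',
                              count($search), count($replace));
    }

    // Flag replacements that are longer than the term they replace.
    $pairs = min(count($search), count($replace));
    for ($i = 0; $i < $pairs; $i++) {
        if (strlen($replace[$i]) > strlen($search[$i])) {
            $problems[] = sprintf('"%s" is longer than "%s"',
                                  $replace[$i], $search[$i]);
        }
    }

    return $problems;
}

// The example lists from the configuration above.
$search  = array('blatherskite', 'fuck', 'shit', 'bitch');
$replace = array('blatherin', 'frak', 'poop', 'dog');
print_r(wordfilter_check_lists($search, $replace));
```

An empty result means every term has a replacement and none of the replacements is longer than its term.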

Plugin source code

<?php if (!defined('LACONICA')) exit(1);
/**
 * Wordfilter Plugin
 *
 * @category Plugin
 * @package  Laconica
 * @author   Kyle Hasegawa  @kylehase
 * @license  http://www.fsf.org/licensing/licenses/agpl-3.0.html GNU Affero General Public License version 3.0
 * @version  Wordfilter.php,v 0.2 2009/06/15 00:40:25 +0900
 *
 */


class Wordfilter extends Plugin
{
    function __construct()
    {
        parent::__construct();
    }
       
    // Hook StartNoticeSave: filter a notice before it is saved
    function onStartNoticeSave($notice){
        $search = common_config('wordfilter','search');
        $replace = common_config('wordfilter','replace');
        // Aggressive filtering replaces all occurrences of the search terms
        if(common_config('wordfilter','aggressive')) {
            $notice->rendered = str_ireplace($search, $replace, $notice->rendered);
            $notice->content = str_ireplace($search, $replace, $notice->content);
        }
        // Less aggressive filtering only replaces search terms at the beginning or end of words
        else {
            // Pad with spaces so terms at the very start or end of the notice also match
            $notice->rendered = ' '.$notice->rendered.' ';
            $notice->content = ' '.$notice->content.' ';
            $count = count($search);
            // Use < rather than <= so the loop stops at the last search term
            for($i=0; $i<$count; $i++) {
                // Quote the term so regex metacharacters in it are matched literally
                $term = preg_quote($search[$i], '/');
                $pattern = '/(\s+)'.$term.'|'.$term.'(\s+)/i';
                // ${1} and ${2} keep a replacement that starts with a digit
                // from being read as part of the group number
                $replacement = '${1}'.$replace[$i].'${2}';
                $notice->rendered = preg_replace($pattern, $replacement, $notice->rendered);
                $notice->content = preg_replace($pattern, $replacement, $notice->content);
            }
            $notice->rendered = trim($notice->rendered);
            $notice->content = trim($notice->content);
        }
        // Return true so any other StartNoticeSave handlers still run
        return true;
    }
}
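
To see how the two modes differ without touching a live site, here is a rough stand-alone sketch of the same logic. The function names filter_aggressive and filter_edges are hypothetical and do not appear in the plugin, and the less aggressive version escapes each term with preg_quote, which is my own choice. It uses the harmless test term from the configuration.

```php
<?php
// Hypothetical stand-alone versions of the plugin's two modes, so they can
// be tried from the command line without a Laconica install.

// Aggressive mode: case-insensitive replacement anywhere in the text.
function filter_aggressive($text, array $search, array $replace)
{
    return str_ireplace($search, $replace, $text);
}

// Less aggressive mode: replace a term only when whitespace (or the start
// or end of the text) sits directly before or after it.
function filter_edges($text, array $search, array $replace)
{
    $text = ' ' . $text . ' ';   // padding lets terms at either end match
    foreach ($search as $i => $term) {
        $quoted = preg_quote($term, '/');   // treat the term as literal text
        $text = preg_replace(
            '/(\s+)' . $quoted . '|' . $quoted . '(\s+)/i',
            '${1}' . $replace[$i] . '${2}',
            $text
        );
    }
    return trim($text);
}

$search  = array('blatherskite');
$replace = array('blatherin');

// Wrapped term is caught: what -blatherin- this is
echo filter_aggressive('what -blatherskite- this is', $search, $replace), "\n";
// Wrapped term slips through: what -blatherskite- this is
echo filter_edges('what -blatherskite- this is', $search, $replace), "\n";
// Terms next to whitespace are replaced: blatherin and more blatherin
echo filter_edges('Blatherskite and more blatherskite', $search, $replace), "\n";
```

The second call shows the trade-off described in the config comments: the less aggressive mode leaves a term wrapped in punctuation alone, while aggressive mode catches it at the cost of also rewriting any longer word that happens to contain the term.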

Update

Regarding longer replacements, thefrogman reports:

"longer words show up in their entirety on army [web interface] but get cutoff in twhirl [clients]. Didn't seem to break anything"

So it's not a major problem if the replacement string is longer than the original.

All code on this site is free for use at your own risk and provided as-is under the WTFPL license unless otherwise stated. Attribution is appreciated but not required.
Blog content, with the exception of externally quoted material, is licensed under the Creative Commons Attribution 3.0 license