Thread: Incredibly fast advanced chat filter

  1. #1 Incredibly fast advanced chat filter 
    Jire

    Requires Java 8 or newer, plus the fastutil library and either OpenHFT zero-allocation-hashing or chronicle-core.

    Code:
    compile group: 'net.openhft', name: 'zero-allocation-hashing', version: '0.8'
    compile group: 'it.unimi.dsi', name: 'fastutil', version: '8.2.2'
    This was originally created by Pim, but its performance was only just good enough. I have improved the performance by over an order of magnitude while also eliminating all allocations.
    It is now fast enough to be called hundreds of thousands of times per tick without any worry. Checking a message against the blacklist creates no allocations at all.
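
    To illustrate where the saved allocations come from, here is a minimal sketch of the buffer-reuse idea the class relies on: one StringBuilder per thread, cleared with setLength(0) rather than recreated on every call. The names ReusableBufferSketch, acquire and countLetters are made up for this example and are not part of the posted class.

```java
final class ReusableBufferSketch {

    // One buffer per thread; the initial capacity of 256 is an arbitrary assumption.
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(() -> new StringBuilder(256));

    /** Returns this thread's buffer, emptied but with its capacity retained. */
    static StringBuilder acquire() {
        StringBuilder sb = BUFFER.get();
        sb.setLength(0); // resets the length without freeing the backing array
        return sb;
    }

    /** Counts the letters in the input by filling the shared buffer instead of building a new String. */
    static int countLetters(CharSequence input) {
        StringBuilder sb = acquire();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetter(c)) {
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.length();
    }
}
```

    Once the buffer has grown to fit the longest message seen so far, later calls on the same thread should not need to allocate for it again.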

    Every word is filtered in all of its leetspeak combinations. For example, all of these variants of "idiot" are filtered, and this list does not even cover the further special characters and substitutions that could be added:
    Spoiler for Blocked combos for "idiot", not including special chars:
    Code:
    idiot
    1diot
    !diot
    i)iot
    i}iot
    i]iot
    i0iot
    id1ot
    id!ot
    idi0t
    idio+
    idio7
    1)iot
    !)iot
    !}iot
    !]iot
    !0iot
    !d1ot
    !d!ot
    !di0t
    !dio+
    !dio7
    1}iot
    1]iot
    10iot
    1d1ot
    i}1ot
    1d!ot
    1)1ot
    i}!ot
    1di0t
    i}i0t
    1)!ot
    1dio+
    1)i0t
    1dio7
    1)io+
    i)1ot
    1)io7
    i}io+
    i}io7
    i)!ot
    i]1ot
    1)10t
    i]!ot
    1}1ot
    1)1o+
    1}!ot
    i]i0t
    1)1o7
    1}i0t
    i]io+
    1}io+
    i]io7
    1}io7
    i01ot
    i0!ot
    i0i0t
    i0io+
    i0io7
    i)i0t
    id!0t
    id10t
    i)io+
    i)io7
    idi0+
    idi07
    id1o+
    id!o+
    1]1ot
    id!o7
    101ot
    1d1o+
    1d10t
    !d1o+
    i)1o+
    1d1o7
    id1o7
    1d!0t
    !d!0t
    !}1ot
    i)!0t
    !}!ot
    i}!0t
    !}i0t
    i]!0t
    !}io+
    i0!0t
    !}io7
    id!0+
    id!07
    1d!o+
    1]!ot
    !d!o+
    1]i0t
    !)1ot
    i)!o+
    1]io+
    i}!o+
    1]io7
    i]!o+
    i)10t
    i0!o+
    10!ot
    i)1o7
    10i0t
    10io+
    10io7
    !01ot
    10!0t
    !0!ot
    10!o+
    !0i0t
    10!o7
    !0io+
    !0io7
    !]1ot
    !d10t
    !d1o7
    1d!o7
    !)!ot
    !)i0t
    !]!ot
    !]i0t
    !d!o7
    !]io+
    !]io7
    1}1o+
    1]1o+
    1010t
    101o+
    1d10+
    10i0+
    10i07
    !)io+
    !)io7
    !di0+
    !)1o+
    !di07
    1)!0t
    !}1o+
    1)!o+
    !]1o+
    1)!o7
    !01o+
    !d10+
    i}10t
    i}1o+
    i]1o+
    i}1o7
    1)i0+
    1)i07
    i}!o7
    i)10+
    i}i0+
    i}i07
    101o7
    i)!o7
    1di0+
    i)i0+
    1}10t
    i}10+
    i)i07
    1}1o7
    i01o+
    id10+
    1]10t
    1}10+
    1]1o7
    1}107
    1)10+
    1}!0t
    1}i0+
    1di07
    1}i07
    i]10t
    i]!o7
    i]1o7
    1}!o+
    1}!o7
    1)107
    i0!o7
    1]10+
    1]107
    1]!0t
    1d107
    1]!o+
    1]!o7
    !0!o7
    i0!07
    1]i0+
    i010t
    1]i07
    i01o7
    i]i0+
    i]i07
    !]!o+
    !]i0+
    id107
    !]1o7
    !]!o7
    1010+
    !]i07
    10107
    !}10t
    !}!0t
    !}1o7
    !}!o+
    !}!o7
    10!0+
    !}i0+
    !}i07
    !)!0t
    1d!0+
    1d!07
    !]!0t
    !0!0t
    !d!0+
    !d!07
    10!07
    !)1o7
    !)!o7
    !)i07
    !)!o+
    !0i07
    !d107
    !0!o+
    i}107
    i0i0+
    i0i07
    i010+
    i)!0+
    i]10+
    i]!0+
    !]10t
    i0!0+
    i]107
    i)!07
    !010t
    !01o7
    !0i0+
    !]10+
    i}!0+
    i}!07
    !0!07
    !)10t
    !)i0+
    1]!0+
    1]!07
    !0!0+
    !)10+
    !)107
    !]!0+
    1}!0+
    1}!07
    !)!0+
    i)107
    i]!07
    !]107
    !]!07
    1)!0+
    1)!07
    !}107
    i0107
    !0107
    !}!0+
    !}!07
    !}10+
    !)!07
    !010+
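
    For anyone wondering how variants like these collapse into one word, here is a minimal sketch of the normalization step on its own. It uses a subset of the substitution table from the class below; the class name LeetNormalizerSketch and the method normalize are made up for illustration.

```java
final class LeetNormalizerSketch {

    // Subset of the leetspeakToNormal table from the BadWords class in this post.
    private static final char[][] LEET_TO_NORMAL = {
            {'1', 'i'}, {'!', 'i'}, {'0', 'o'}, {'7', 't'}, {')', 'd'}, {'+', 't'}
    };

    /** Lowercases letters, maps known leetspeak characters, and drops everything else. */
    static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetter(c)) {
                out.append(Character.toLowerCase(c));
                continue;
            }
            for (char[] conversion : LEET_TO_NORMAL) {
                if (c == conversion[0]) {
                    out.append(conversion[1]);
                    break;
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("1d10t")); // idiot
        System.out.println(normalize("!)io+")); // idiot
        System.out.println(normalize("i}iot")); // iiot
    }
}
```

    Characters that are not in the table are simply dropped, so with the table as posted a variant like i}iot normalizes to iiot rather than idiot. Adding pairs such as {'}', 'd'} and {']', 'd'} would presumably be needed for those variants to match.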


    To use:

    First, load the bad words once at startup:
    Code:
    BadWords.loadBadWords();
    Then, in your chat code or wherever you'd like:
    Code:
    if (BadWords.containsBadWord(chatMessage)) {
        player.sendMessage("Don't say bad words!");
        return;
    }
    Code:
    import it.unimi.dsi.fastutil.longs.Long2ObjectMap;
    import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;
    import net.openhft.hashing.LongHashFunction;
    
    /**
     * Originally created by Pim De Witte.
     * <p>
     * Performance drastically improved by over an order of magnitude by Thomas G. P. Nappo (Jire).
     * Garbage production has been eliminated as well.
     */
    public final class BadWords {
    	
    	private static final String[] emptyComboWords = new String[]{};
    	private static final Long2ObjectMap<String[]> words = new Long2ObjectOpenHashMap<>();
    	private static int largestWordLength = 0;
    	
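    	/**
    	 * Registers a blacklisted word together with the phrases that excuse it.
    	 * Both are compared against the normalized message (lowercase letters only, leetspeak mapped, all other characters dropped),
    	 * so entries containing '.' or digits, such as ".com", "rsps2" or "07scape", cannot match unless they are written in that normalized form.
    	 */
    	public static void flag(String word, String[] ignoreComboWords) {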
    		if (word.length() > largestWordLength) {
    			largestWordLength = word.length();
    		}
    		words.put(LongHashFunction.xx().hashChars(word), ignoreComboWords);
    	}
    	
    	public static void loadBadWords() {
    		for (String blacklisted : blacklist) {
    			flag(blacklisted, emptyComboWords);
    		}
    		flag("scape", new String[]{
    				"runescape",
    				"landscape",
    				"machinescape",
    				"fashionscape",
    				"07scape",
    				"2007scape",
    				"osrscape",
    				"osrsscape",
    				"moparscape",
    				"didyscape",
    				"scapeing",
    		});
    	}
    	
    	private static final String[] blacklist = {
    			/* Bad words */
    			"nigger",
    			
    			/* Advertisement basics */
    			"www",
    			".com",
    			".org",
    			".net",
    			".io",
    			".ps",
    			".tk",
    			"dotcom",
    			"dotorg",
    			"dotnet",
    			"dottk",
    			
    			/* Individual server names */
    			"kratos",
    			"atlas",
    			"osscape",
    			"alora",
    			"elkoy",
    			"osrune",
    			"guthixp",
    			"dawntained",
    			"locopk",
    			"imagineps",
    			"nearreal",
    			"pkhonor",
    			"dreamsc",
    			"manicps",
    			"draganoth",
    			"alosps",
    			"rsps2",
    			"lostisle",
    			"necrotic",
    			"redrune",
    			"deathwish",
    			"pkowned",
    			"osbase",
    			"beastpk",
    			"roatpk",
    			"rsgenesis",
    			"trinityps",
    			"boxrune",
    			"runique",
    			"furiousp",
    			"novus",
    			"ikov",
    			"joinmy",
    			"atarax",
    			"nardahp",
    			"illerai",
    			"letspk",
    			"ratedpixel",
    			"cloudnine",
    			"viceos",
    			"deprivedr",
    			"exoria",
    			"simplicityp",
    			"cruxp",
    			"ospkz",
    			"scapewar",
    			"amberp",
    			"diviner",
    			"osunity",
    			"amulius",
    			"zenyteps",
    			"zenyteosrs"
    	};
    	
    	private static final char[][] leetspeakToNormal = {
    			{'1', 'i'},
    			{'!', 'i'},
    			{'3', 'e'},
    			{'4', 'a'},
    			{'@', 'a'},
    			{'5', 's'},
    			{'7', 't'},
    			{'0', 'o'},
    			{'9', 'g'},
    			
    			/* Additional leetspeak support that Jire added. */
    			{'6', 'g'},
    			{'$', 's'},
    			{'&', 'a'},
    			{'(', 'c'},
    			{')', 'd'},
    			{'+', 't'}
    	};
    	
    	private static final ThreadLocal<StringBuilder> sb = ThreadLocal.withInitial(StringBuilder::new); // make this regular if you don't need thread safety.
    	
    	/**
    	 * Iterates over a String input and checks whether a cuss word was found in a list, then checks if the word should be ignored (e.g. bass contains the word *ss).
    	 */
    	public static boolean containsBadWord(String input) {
    		if (input == null) {
    			return false;
    		}
    		
    		StringBuilder sb = BadWords.sb.get();
    		sb.setLength(0);
    		
    		removeLeetspeak:
    		for (int i = 0; i < input.length(); i++) {
    			char c = input.charAt(i);
    			if (Character.isLetter(c)) {
    				sb.append(Character.toLowerCase(c));
    			} else {
    				for (char[] conversion : leetspeakToNormal) {
    					if (c == conversion[0]) {
    						sb.append(conversion[1]);
    						continue removeLeetspeak;
    					}
    				}
    			}
    		}
    		
    		// iterate over each start position in the normalized message
    		for (int start = 0; start < sb.length(); start++) {
    			// from each position, try every substring length until either the end of the message or the longest blacklisted word length is reached (inclusive, so the longest word can match too).
    			for (int offset = 1; offset <= sb.length() - start && offset <= largestWordLength; offset++) {
    				long hash = LongHashFunction.xx().hashChars(sb, start, offset);
    				// single lookup: the map returns null when the hash is not blacklisted
    				String[] ignoreCheck = words.get(hash);
    				if (ignoreCheck != null) {
    					// for example, if you want to say the word bass, that should be possible.
    					boolean ignore = false;
    					for (int s = 0; s < ignoreCheck.length; s++) {
    						if (indexOf(sb, ignoreCheck[s]) >= 0) {
    							ignore = true;
    							break;
    						}
    					}
    					if (!ignore) {
    						return true;
    					}
    				}
    			}
    		}
    		
    		return false;
    	}
    	
    	private static int indexOf(CharSequence source, CharSequence target) {
    		int sourceCount = source.length();
    		int targetCount = target.length();
    		int sourceOffset = 0;
    		int targetOffset = 0;
    		
    		if (0 >= sourceCount) {
    			return (targetCount == 0 ? sourceCount : -1);
    		}
    		if (targetCount == 0) {
    			return 0;
    		}
    		
    		char first = target.charAt(targetOffset);
    		int max = sourceOffset + (sourceCount - targetCount);
    		
    		for (int i = sourceOffset; i <= max; i++) {
    			/* Look for first character. */
    			if (source.charAt(i) != first) {
    				while (++i <= max && source.charAt(i) != first) ;
    			}
    			
    			/* Found first character, now look at the rest of target */
    			if (i <= max) {
    				int j = i + 1;
    				int end = j + targetCount - 1;
    				for (int k = targetOffset + 1; j < end && source.charAt(j)
    						== target.charAt(k); j++, k++)
    					;
    				
    				if (j == end) {
    					/* Found whole string. */
    					return i - sourceOffset;
    				}
    			}
    		}
    		return -1;
    	}
    	
    }
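
    The scan in containsBadWord boils down to trying every substring, up to the longest blacklisted length, at every start position. Below is a simplified sketch of that idea which also runs the blacklist entries and exception phrases through the same normalization as the message, so that an entry like "07scape" is stored in the form it will actually be compared against. This is not the posted class: SubstringScanSketch is a made-up name, and a plain HashMap keyed by String stands in for the hash-keyed fastutil map, so this version does allocate.

```java
import java.util.HashMap;
import java.util.Map;

final class SubstringScanSketch {

    // Subset of the leetspeak table from the post.
    private static final char[][] LEET = {
            {'1', 'i'}, {'!', 'i'}, {'3', 'e'}, {'4', 'a'}, {'@', 'a'},
            {'5', 's'}, {'7', 't'}, {'0', 'o'}, {'9', 'g'}
    };

    // Normalized word -> normalized exception phrases.
    private final Map<String, String[]> words = new HashMap<>();
    private int longest = 0;

    /** Lowercases letters, maps known leetspeak characters, and drops everything else. */
    static String normalize(CharSequence input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetter(c)) {
                out.append(Character.toLowerCase(c));
                continue;
            }
            for (char[] conversion : LEET) {
                if (c == conversion[0]) {
                    out.append(conversion[1]);
                    break;
                }
            }
        }
        return out.toString();
    }

    /** Stores the word and its exceptions in normalized form. */
    void flag(String word, String... exceptions) {
        String key = normalize(word);
        String[] normalized = new String[exceptions.length];
        for (int i = 0; i < exceptions.length; i++) {
            normalized[i] = normalize(exceptions[i]);
        }
        words.put(key, normalized);
        longest = Math.max(longest, key.length());
    }

    /** Tries every substring of length 1..longest (inclusive) at every start position. */
    boolean containsBadWord(String message) {
        String text = normalize(message);
        for (int start = 0; start < text.length(); start++) {
            int maxLen = Math.min(longest, text.length() - start);
            for (int len = 1; len <= maxLen; len++) {
                String[] exceptions = words.get(text.substring(start, start + len));
                if (exceptions != null && !containsAny(text, exceptions)) {
                    return true;
                }
            }
        }
        return false;
    }

    private static boolean containsAny(String text, String[] phrases) {
        for (String phrase : phrases) {
            if (text.contains(phrase)) {
                return true;
            }
        }
        return false;
    }
}
```

    One caveat with normalizing entries this way: ".com" would be stored as "com" and would then match inside ordinary words such as "welcome", so entries like that would probably need exception phrases or a separate rule.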


  3. #2  




    Scu11
    Consider a LongObjectHashMap from the [Only registered and activated users can see links. ] library. It may be more performant than the one from fastutil

  5. #3  
    Jire

    Quote Originally Posted by Scu11 View Post
    Consider a LongObjectHashMap from the [Only registered and activated users can see links. ] library. It may be more performant than the one from fastutil
    The HPPC-RT fork looks like quite an improvement over the regular: [Only registered and activated users can see links. ]
    Thanks for mentioning it


  6. #4  
    Jordan Belfort
    Your signature looks like it’s in dire need of this snippet.
    [Only registered and activated users can see links. ]
