Thread: Incredibly fast advanced chat filter

  1. #1 Incredibly fast advanced chat filter 
    Jire

    Requires Java 8 or newer, plus the fastutil library and either OpenHFT zero-allocation-hashing or chronicle-core.

    Code:
    compile group: 'net.openhft', name: 'zero-allocation-hashing', version: '0.8'
    compile group: 'it.unimi.dsi', name: 'fastutil', version: '8.2.2'
    This was originally created by Pim, but its performance was only just good enough. I have improved the performance by over an order of magnitude while also eliminating all allocations.
    It is now fast enough to be called hundreds of thousands of times per tick without any worry. Checking a message against the blacklist performs no allocations.
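
    To illustrate the no-allocation claim, here is a minimal sketch of the buffer-reuse idea, assuming a per-thread scratch StringBuilder like the `sb` field in the snippet below. The class and method names (ScratchBufferDemo, lowerCaseLetters) are made up for this example and are not part of the posted code.

```java
// Hypothetical illustration only; not the code from the post.
final class ScratchBufferDemo {

	// One reusable buffer per thread, similar to the `sb` field in the snippet.
	private static final ThreadLocal<StringBuilder> SCRATCH =
			ThreadLocal.withInitial(StringBuilder::new);

	/** Writes the lower-cased letters of input into the reused buffer and returns that buffer. */
	static StringBuilder lowerCaseLetters(String input) {
		StringBuilder sb = SCRATCH.get();
		sb.setLength(0); // resets the length but keeps the backing array
		for (int i = 0; i < input.length(); i++) {
			char c = input.charAt(i);
			if (Character.isLetter(c)) {
				sb.append(Character.toLowerCase(c));
			}
		}
		return sb;
	}

	public static void main(String[] args) {
		StringBuilder first = lowerCaseLetters("Hello, World");
		StringBuilder second = lowerCaseLetters("Again");
		System.out.println(first == second); // true: the same buffer object is reused on this thread
		System.out.println(second);          // again
	}
}
```

    The buffer only grows when a message is longer than anything seen before on that thread, so steady-state calls do not create new objects. The caller must finish with the returned buffer before calling again, since the next call overwrites it.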

    Every word is filtered in all of its leetspeak combinations. For example, all of these variants of "idiot" are filtered, and that is without counting special characters and other substitutions that could be added:
    Spoiler for Blocked combos for "idiot", not including special chars:
    Code:
    idiot
    1diot
    !diot
    i)iot
    i}iot
    i]iot
    i0iot
    id1ot
    id!ot
    idi0t
    idio+
    idio7
    1)iot
    !)iot
    !}iot
    !]iot
    !0iot
    !d1ot
    !d!ot
    !di0t
    !dio+
    !dio7
    1}iot
    1]iot
    10iot
    1d1ot
    i}1ot
    1d!ot
    1)1ot
    i}!ot
    1di0t
    i}i0t
    1)!ot
    1dio+
    1)i0t
    1dio7
    1)io+
    i)1ot
    1)io7
    i}io+
    i}io7
    i)!ot
    i]1ot
    1)10t
    i]!ot
    1}1ot
    1)1o+
    1}!ot
    i]i0t
    1)1o7
    1}i0t
    i]io+
    1}io+
    i]io7
    1}io7
    i01ot
    i0!ot
    i0i0t
    i0io+
    i0io7
    i)i0t
    id!0t
    id10t
    i)io+
    i)io7
    idi0+
    idi07
    id1o+
    id!o+
    1]1ot
    id!o7
    101ot
    1d1o+
    1d10t
    !d1o+
    i)1o+
    1d1o7
    id1o7
    1d!0t
    !d!0t
    !}1ot
    i)!0t
    !}!ot
    i}!0t
    !}i0t
    i]!0t
    !}io+
    i0!0t
    !}io7
    id!0+
    id!07
    1d!o+
    1]!ot
    !d!o+
    1]i0t
    !)1ot
    i)!o+
    1]io+
    i}!o+
    1]io7
    i]!o+
    i)10t
    i0!o+
    10!ot
    i)1o7
    10i0t
    10io+
    10io7
    !01ot
    10!0t
    !0!ot
    10!o+
    !0i0t
    10!o7
    !0io+
    !0io7
    !]1ot
    !d10t
    !d1o7
    1d!o7
    !)!ot
    !)i0t
    !]!ot
    !]i0t
    !d!o7
    !]io+
    !]io7
    1}1o+
    1]1o+
    1010t
    101o+
    1d10+
    10i0+
    10i07
    !)io+
    !)io7
    !di0+
    !)1o+
    !di07
    1)!0t
    !}1o+
    1)!o+
    !]1o+
    1)!o7
    !01o+
    !d10+
    i}10t
    i}1o+
    i]1o+
    i}1o7
    1)i0+
    1)i07
    i}!o7
    i)10+
    i}i0+
    i}i07
    101o7
    i)!o7
    1di0+
    i)i0+
    1}10t
    i}10+
    i)i07
    1}1o7
    i01o+
    id10+
    1]10t
    1}10+
    1]1o7
    1}107
    1)10+
    1}!0t
    1}i0+
    1di07
    1}i07
    i]10t
    i]!o7
    i]1o7
    1}!o+
    1}!o7
    1)107
    i0!o7
    1]10+
    1]107
    1]!0t
    1d107
    1]!o+
    1]!o7
    !0!o7
    i0!07
    1]i0+
    i010t
    1]i07
    i01o7
    i]i0+
    i]i07
    !]!o+
    !]i0+
    id107
    !]1o7
    !]!o7
    1010+
    !]i07
    10107
    !}10t
    !}!0t
    !}1o7
    !}!o+
    !}!o7
    10!0+
    !}i0+
    !}i07
    !)!0t
    1d!0+
    1d!07
    !]!0t
    !0!0t
    !d!0+
    !d!07
    10!07
    !)1o7
    !)!o7
    !)i07
    !)!o+
    !0i07
    !d107
    !0!o+
    i}107
    i0i0+
    i0i07
    i010+
    i)!0+
    i]10+
    i]!0+
    !]10t
    i0!0+
    i]107
    i)!07
    !010t
    !01o7
    !0i0+
    !]10+
    i}!0+
    i}!07
    !0!07
    !)10t
    !)i0+
    1]!0+
    1]!07
    !0!0+
    !)10+
    !)107
    !]!0+
    1}!0+
    1}!07
    !)!0+
    i)107
    i]!07
    !]107
    !]!07
    1)!0+
    1)!07
    !}107
    i0107
    !0107
    !}!0+
    !}!07
    !}10+
    !)!07
    !010+
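
    The reason one blacklist entry can cover so many variants is that the message is normalized before lookup. The following is a simplified sketch of that step, assuming the leetspeakToNormal table from the snippet in this post; LeetNormalizeDemo and normalize are hypothetical names, and unlike the real filter this version allocates a new String for readability.

```java
// Hypothetical illustration only; the table is copied from the snippet in this post.
final class LeetNormalizeDemo {

	private static final char[][] LEET = {
			{'1', 'i'}, {'!', 'i'}, {'3', 'e'}, {'4', 'a'}, {'@', 'a'},
			{'5', 's'}, {'7', 't'}, {'0', 'o'}, {'9', 'g'}, {'6', 'g'},
			{'$', 's'}, {'&', 'a'}, {'(', 'c'}, {')', 'd'}, {'+', 't'}
	};

	/** Lower-cases letters, maps known leetspeak characters, and drops everything else. */
	static String normalize(String input) {
		StringBuilder sb = new StringBuilder(input.length());
		for (int i = 0; i < input.length(); i++) {
			char c = input.charAt(i);
			if (Character.isLetter(c)) {
				sb.append(Character.toLowerCase(c));
				continue;
			}
			for (char[] pair : LEET) {
				if (c == pair[0]) {
					sb.append(pair[1]);
					break;
				}
			}
			// characters that are neither letters nor in the table are dropped
		}
		return sb.toString();
	}

	public static void main(String[] args) {
		for (String s : new String[]{"idiot", "1d10t", "!)!07", "I D I O +"}) {
			System.out.println(s + " -> " + normalize(s)); // each line ends in "-> idiot"
		}
	}
}
```

    With the table as posted, a character that has no entry (for example `}` or `]`) is simply removed, so supporting more substitutions means adding more pairs to the table.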



    To use:

    First, load the bad words once at startup with:
    Code:
    BadWords.loadBadWords();
    Then, in your chat code or wherever you'd like:
    Code:
    if (BadWords.containsBadWord(chatMessage)) {
        player.sendMessage("Don't say bad words!");
        return;
    }
    Code:
    import it.unimi.dsi.fastutil.longs.Long2ObjectMap;
    import it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap;
    import net.openhft.hashing.LongHashFunction;
    
    /**
     * Originally created by Pim De Witte.
     * <p>
     * Performance drastically improved by over an order of magnitude by Thomas G. P. Nappo (Jire).
     * Garbage production has been eliminated as well.
     */
    public final class BadWords {
    	
    	private static final String[] emptyComboWords = new String[]{};
    	private static Long2ObjectMap<String[]> words = new Long2ObjectOpenHashMap<>();
    	private static int largestWordLength = 0;
    	
    	public static void flag(String word, String... ignoreComboWords) {
    		if (word.length() > largestWordLength) {
    			largestWordLength = word.length();
    		}
    		words.put(LongHashFunction.xx().hashChars(word), ignoreComboWords);
    	}
    	
    	public static void loadBadWords() {
    		for (String blacklisted : blacklist) {
    			flag(blacklisted, emptyComboWords);
    		}
    		flag("scape",
    				"runescape",
    				"landscape",
    				"machinescape",
    				"fashionscape",
    				"07scape",
    				"2007scape",
    				"osrscape",
    				"osrsscape",
    				"moparscape",
    				"didyscape",
    				"scapeing"
    		);
    	}
    	
    	private static final String[] blacklist = {
    			/* Bad words */
    			"nigger",
    			
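    			/* Advertisement basics. Note: '.' is dropped during normalization, so the entries starting with '.' only match if normalization is changed to keep it. */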
    			"www",
    			".com",
    			".org",
    			".net",
    			".io",
    			".ps",
    			".tk",
    			"dotcom",
    			"dotorg",
    			"dotnet",
    			"dottk",
    			
    			/* Individual server names */
    			"kratos",
    			"atlas",
    			"osscape",
    			"alora",
    			"elkoy",
    			"osrune",
    			"guthixp",
    			"dawntained",
    			"locopk",
    			"imagineps",
    			"nearreal",
    			"pkhonor",
    			"dreamsc",
    			"manicps",
    			"draganoth",
    			"alosps",
    			"rsps2",
    			"lostisle",
    			"necrotic",
    			"redrune",
    			"deathwish",
    			"pkowned",
    			"osbase",
    			"beastpk",
    			"roatpk",
    			"rsgenesis",
    			"trinityps",
    			"boxrune",
    			"runique",
    			"furiousp",
    			"novus",
    			"ikov",
    			"joinmy",
    			"atarax",
    			"nardahp",
    			"illerai",
    			"letspk",
    			"ratedpixel",
    			"cloudnine",
    			"viceos",
    			"deprivedr",
    			"exoria",
    			"simplicityp",
    			"cruxp",
    			"ospkz",
    			"scapewar",
    			"amberp",
    			"diviner",
    			"osunity",
    			"amulius",
    			"zenyteps",
    			"zenyteosrs"
    	};
    	
    	private static final char[][] leetspeakToNormal = {
    			{'1', 'i'},
    			{'!', 'i'},
    			{'3', 'e'},
    			{'4', 'a'},
    			{'@', 'a'},
    			{'5', 's'},
    			{'7', 't'},
    			{'0', 'o'},
    			{'9', 'g'},
    			
    			/* Additional leetspeak support that Jire added. */
    			{'6', 'g'},
    			{'$', 's'},
    			{'&', 'a'},
    			{'(', 'c'},
    			{')', 'd'},
    			{'+', 't'}
    	};
    	
    	private static final ThreadLocal<StringBuilder> sb = ThreadLocal.withInitial(StringBuilder::new); // use a plain StringBuilder field instead if you don't need thread safety.
    	
    	/**
    	 * Iterates over a String input and checks whether a cuss word was found in a list, then checks if the word should be ignored (e.g. bass contains the word *ss).
    	 */
    	public static boolean containsBadWord(String input) {
    		if (input == null) {
    			return false;
    		}
    		
    		StringBuilder sb = BadWords.sb.get();
    		sb.setLength(0);
    		
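    		// letters are lower-cased, leetspeak characters are mapped to letters, and everything else (spaces, punctuation, unmapped digits) is dropped.
    		removeLeetspeak: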
    		for (int i = 0; i < input.length(); i++) {
    			char c = input.charAt(i);
    			if (Character.isLetter(c)) {
    				sb.append(Character.toLowerCase(c));
    			} else {
    				for (char[] conversion : leetspeakToNormal) {
    					if (c == conversion[0]) {
    						sb.append(conversion[1]);
    						continue removeLeetspeak;
    					}
    				}
    			}
    		}
    		
    		// iterate over each start position in the normalized message
    		for (int start = 0; start < sb.length(); start++) {
    			// from each position, keep going to find bad words until either the end of the message is reached, or the max word length is reached.
    			// both bounds are inclusive: offset is a length, so the longest flagged word has to be tried as well.
    			for (int offset = 1; offset <= (sb.length() - start) && offset <= largestWordLength; offset++) {
    				long hash = LongHashFunction.xx().hashChars(sb, start, offset);
    				if (words.containsKey(hash)) {
    					// for example, if you want to say the word bass, that should be possible.
    					String[] ignoreCheck = words.get(hash);
    					boolean ignore = false;
    					for (int s = 0; s < ignoreCheck.length; s++) {
    						if (indexOf(sb, ignoreCheck[s]) >= 0) {
    							ignore = true;
    							break;
    						}
    					}
    					if (!ignore) {
    						return true;
    					}
    				}
    			}
    		}
    		
    		return false;
    	}
    	
    	private static int indexOf(CharSequence source, CharSequence target) {
    		int sourceCount = source.length();
    		int targetCount = target.length();
    		int sourceOffset = 0;
    		int targetOffset = 0;
    		
    		if (0 >= sourceCount) {
    			return (targetCount == 0 ? sourceCount : -1);
    		}
    		if (targetCount == 0) {
    			return 0;
    		}
    		
    		char first = target.charAt(targetOffset);
    		int max = sourceOffset + (sourceCount - targetCount);
    		
    		for (int i = sourceOffset; i <= max; i++) {
    			/* Look for first character. */
    			if (source.charAt(i) != first) {
    				while (++i <= max && source.charAt(i) != first) ;
    			}
    			
    			/* Found first character, now look at the rest of target */
    			if (i <= max) {
    				int j = i + 1;
    				int end = j + targetCount - 1;
    				for (int k = targetOffset + 1; j < end && source.charAt(j)
    						== target.charAt(k); j++, k++)
    					;
    				
    				if (j == end) {
    					/* Found whole string. */
    					return i - sourceOffset;
    				}
    			}
    		}
    		return -1;
    	}
    	
    }
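
    For anyone who wants to experiment with the matching logic without pulling in the libraries, here is a simplified sketch of the same sliding-window idea using only the JDK. It keys a java.util.HashMap by the substring itself instead of by a 64-bit xxHash, so it allocates and is not a replacement for the class above. WindowMatchDemo, containsFlagged, and the sample words in main are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; a JDK-only stand-in for the hash-based lookup.
final class WindowMatchDemo {

	private static final Map<String, String[]> FLAGGED = new HashMap<>();
	private static int longest = 0;

	/** Flags a word; a message containing any of allowedContexts is let through for this word. */
	static void flag(String word, String... allowedContexts) {
		longest = Math.max(longest, word.length());
		FLAGGED.put(word, allowedContexts);
	}

	/** Expects text that has already been normalized (lower-case letters only). */
	static boolean containsFlagged(String text) {
		for (int start = 0; start < text.length(); start++) {
			int maxLen = Math.min(longest, text.length() - start);
			for (int len = 1; len <= maxLen; len++) { // inclusive, so the longest word is tried too
				String[] allowed = FLAGGED.get(text.substring(start, start + len));
				if (allowed == null) {
					continue; // this window is not a flagged word
				}
				boolean ignore = false;
				for (String context : allowed) {
					if (text.contains(context)) {
						ignore = true;
						break;
					}
				}
				if (!ignore) {
					return true;
				}
			}
		}
		return false;
	}

	public static void main(String[] args) {
		flag("scape", "runescape", "landscape");
		flag("ikov");
		System.out.println(containsFlagged("joinfooscape"));   // true: "scape" with no allowed context
		System.out.println(containsFlagged("iplayrunescape")); // false: message contains "runescape"
		System.out.println(containsFlagged("comejoinikov"));   // true: "ikov"
	}
}
```

    Keying by the hash, as the posted class does, avoids creating a substring for every window, at the cost of treating two different strings with the same 64-bit hash as the same word.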


  2. #2  




    Scu11
    Consider a LongObjectHashMap from the HPPC library. It may be more performant than the one from fastutil.



  4. #3  
    Jire

    Quote Originally Posted by Scu11
    Consider a LongObjectHashMap from the HPPC library. It may be more performant than the one from fastutil.
    The HPPC-RT fork looks like quite an improvement over the regular HPPC.
    Thanks for mentioning it.

  5. #4  
    Krator || OSRSTM || Modeler

    Jordan Belfort
    Your signature looks like it’s in dire need of this snippet.


  7. #5  
    Registered Member
    @Jire - You are a legend! Thank you, @Professor Oak, for your comment pointing me towards this snippet.

  8. #6  
    nbness2#5894

    nbness2
    fucking hell now i cant advertise my clan: "!)!07"

  9. #7  
    Donator

    `Michael
    good contribution won't use but nice

  10. #8  
    Owner of Ghreborn

    Sgsrocks
    thank you.


