Profanity Filter Tutorial: Censor Bad Words using Python

Introduction

Have you ever wondered how movies and videos manage to censor offensive language without completely removing the words? It’s often accomplished through manual editing. In this tutorial, however, we’ll look at an automated solution: profanity filters in Python. We’ll use the profanity-filter library to detect and censor bad words in text data.

Profanity filters are essential tools for maintaining a clean and respectful online environment. They play a crucial role in websites, social media platforms, and more, by identifying and censoring inappropriate language within user-generated content. Administrators can customize these filters to target specific categories of inappropriate language, including hate speech, sexual content, swear words, and other undesirable terms.

In Python, there are two prominent tools for implementing profanity filters: the better-profanity package and the profanity-filter library. In this tutorial, we’ll focus on profanity-filter. It’s a powerful choice because it offers several key features:

  • Flexibility: Censor entire sentences or target specific words.
  • Multilingual Support: Handles both English and Russian, even in mixed-language text.
  • Deep Analysis: Goes beyond exact matches to catch cleverly disguised offensive words.
  • Partial Censoring: Option to mask only part of a word (e.g., “h***”).
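
To make the partial-censoring idea concrete, here is a minimal pure-Python sketch. It is not how profanity-filter implements the feature; the helper name mask_partially and its keep parameter are made up for illustration.

```python
def mask_partially(word: str, keep: int = 1, censor_char: str = "*") -> str:
    """Keep the first `keep` characters of `word` and mask the rest."""
    # Clamp `keep` so short words and negative values are handled safely.
    keep = max(0, min(keep, len(word)))
    return word[:keep] + censor_char * (len(word) - keep)


print(mask_partially("hell"))          # h***
print(mask_partially("hell", keep=2))  # he**
```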

By the end of this journey, you’ll be equipped to write Python programs that effectively censor profanity, creating a more positive and inclusive online experience.

Setting Up the Environment

To get started, let’s make sure you have the right tools. Here’s what you’ll need:

  1. Install profanity-filter: pip install profanity-filter
  2. Download Language Resources: python -m spacy download en
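
Before running the examples, it can help to confirm that the packages are importable. The following is a small sketch using only the standard library. The helper name missing_packages is hypothetical, and it checks importability only, not whether the spaCy language model was downloaded.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# profanity_filter and spacy are the import names of the packages used here.
missing = missing_packages(["profanity_filter", "spacy"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are importable.")
```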

Censoring Bad Words from a Text

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()
print(pf.censor("You've stolen my money, you bastard!"))

# Censor individual words
print(pf.censor_word('lesbian'))

Output

You've stolen my money, you *******!
*******
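
Conceptually, censoring with a word list comes down to finding words and replacing the listed ones with a run of asterisks of the same length. The sketch below illustrates that idea in plain Python with exact, case-insensitive matching. It is a simplified model, not the actual implementation inside profanity-filter, and censor_text and its tiny word set are invented for the example.

```python
import re

# Runs of letters (Unicode-aware), so punctuation and digits are left untouched.
WORD_RE = re.compile(r"[^\W\d_]+")


def censor_text(text, bad_words, censor_char="*"):
    """Mask every word of `text` that appears in `bad_words` (case-insensitive)."""
    lowered = {word.casefold() for word in bad_words}

    def mask(match):
        word = match.group(0)
        if word.casefold() in lowered:
            return censor_char * len(word)
        return word

    return WORD_RE.sub(mask, text)


print(censor_text("You've stolen my money, you bastard!", {"bastard"}))
# You've stolen my money, you *******!
```

Exact matching like this misses misspelled or derived forms, which is the gap the library’s deep analysis feature is meant to cover.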


Censoring Profanity in Russian Texts

The profanity-filter library also supports Russian. Here’s how you can filter obscene words in Russian texts:

from profanity_filter import ProfanityFilter

# Select Russian Language
pf = ProfanityFilter(languages=['ru'])

# Блядь is a Russian swear word, roughly equivalent to F**k
print(pf.censor("Блядь"))

Output

*****

Customizing Censorship

You can control how bad words are censored. By default they are replaced with asterisks (*), but you can set censor_char to another character, such as $ or #.

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

pf.censor_char = '$'
print(pf.censor("You look like a shit"))

Output

You look like a $$$$
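
The censor_char example above uses one repeated character. If you would rather have a comic-strip style mix such as &%$#@, one option is to build the replacement string yourself. This is a standalone sketch, not a profanity-filter feature, and grawlix is a hypothetical helper.

```python
from itertools import cycle, islice


def grawlix(length, symbols="&%$#@"):
    """Return `length` characters taken from `symbols`, repeating as needed."""
    return "".join(islice(cycle(symbols), length))


print(grawlix(4))  # &%$#
print(grawlix(7))  # &%$#@&%
```

You could then replace a flagged word with grawlix(len(word)) so the masked text keeps its original length.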

Checking Whether a String Contains Any Swear Words

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

bad_text = "You are a bitch!"

# Check whether the string is free of profanity, and whether it contains any
print("Is clean:", pf.is_clean(bad_text))
print("Is profane:", pf.is_profane(bad_text))

Output

Is clean: False
Is profane: True
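
A common use of these checks is moderating a batch of user comments. The sketch below separates clean comments from flagged ones. To keep it runnable without the library, it accepts any predicate. The names split_comments and toy_is_profane are assumptions for illustration, and in real code you would pass pf.is_profane instead of the toy predicate.

```python
def split_comments(comments, is_profane):
    """Partition `comments` into (clean, flagged) using the given predicate."""
    clean, flagged = [], []
    for comment in comments:
        (flagged if is_profane(comment) else clean).append(comment)
    return clean, flagged


# Toy stand-in for pf.is_profane, used only so the example runs on its own.
def toy_is_profane(text):
    return "bitch" in text.lower()


clean, flagged = split_comments(["Nice post!", "You are a bitch!"], toy_is_profane)
print(clean)    # ['Nice post!']
print(flagged)  # ['You are a bitch!']
```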

Censoring Obscene Words with a Custom Word List

First, we’ll supply our own custom word list. There is a catch, though: assigning custom_profane_word_dictionaries replaces the default dictionary, so the filter censors only the words in the custom list. Let’s see an example:

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

pf.custom_profane_word_dictionaries = {'en': {'want', 'marry'}}
print(pf.censor("I want to marry her., shit"))

Output

I **** to ***** her., shit

In this example, we’ll solve the above problem. First, we’ll restore the default profane word dictionaries, and then assign our custom words to the extra_profane_word_dictionaries attribute, which adds them to the defaults instead of replacing them.

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

pf.custom_profane_word_dictionaries = {'en': {'want', 'marry'}}
print(pf.censor("I want to marry her., shit"))

# Restore the default profane word dictionaries
pf.restore_profane_word_dictionaries()

# A simple change in the code, custom -> extra
pf.extra_profane_word_dictionaries = {'en': {'want', 'marry'}}
print(pf.censor("I want to marry her., shit"))

Output

I **** to ***** her., shit
I **** to ***** her., ****
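
The difference between the two attributes can be pictured with plain sets: a custom dictionary replaces the defaults, while an extra dictionary is merged into them. The sketch below models only that behavior as this tutorial describes it. The function effective_words and the one-word default set are hypothetical and say nothing about the library’s internals.

```python
def effective_words(default, custom=None, extra=None):
    """Model which words get censored for a given configuration."""
    # A custom set replaces the defaults; an extra set is added on top.
    base = set(custom) if custom is not None else set(default)
    return base | set(extra or ())


default = {"shit"}  # stand-in for the built-in English dictionary
print(sorted(effective_words(default, custom={"want", "marry"})))
# ['marry', 'want']
print(sorted(effective_words(default, extra={"want", "marry"})))
# ['marry', 'shit', 'want']
```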

Summary

Congratulations! In this tutorial, you’ve learned how to use the profanity-filter library in Python to censor obscene words from your text data. We’ve explored several programming examples to get you started.

The profanity-filter library can actually detect distorted and derivative words, not just exact matches. However, this “deep analysis” feature requires additional libraries and dictionaries specific to your chosen language. Since it’s a more complex setup, I can create a separate tutorial if there’s enough interest. Just let me know in the comments below!

Thanks for reading! I hope you found this tutorial helpful.

Subhankar Rakshit

Hey there! I’m Subhankar Rakshit, the brains behind PySeek. I’m a Post Graduate in Computer Science. PySeek is where I channel my love for Python programming and share it with the world through engaging and informative blogs.
