Profanity Filter Tutorial: Censor Bad Words using Python

Introduction

Have you ever wondered how movies and videos censor offensive language without removing the words entirely? It’s often done through manual editing. In this tutorial, we’ll look at an automated solution for text: profanity filters in Python. We’ll use the profanity-filter library to detect and censor bad words in text data.

Profanity filters are essential tools for maintaining a clean and respectful online environment. They play an important role in website comment sections, social media platforms, and anywhere else that accepts user-generated content, where they identify and censor inappropriate language.

Administrators can also customize these filters to target specific categories of language, such as hate speech, sexual content, swear words, and other undesirable terms.

In Python, there are two prominent tools for implementing profanity filters: the better-profanity package and the profanity-filter library. In this tutorial, we’ll focus on profanity-filter because it offers several key features:

  • Flexibility: Censor entire sentences or target specific words.
  • Multilingual Support: Handles both English and Russian, even in mixed-language text.
  • Deep Analysis: Goes beyond exact matches to catch cleverly disguised offensive words.
  • Partial Censoring: Option to mask only part of a word (e.g., “h***”).
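
To make the idea of partial censoring concrete, here is a minimal pure-Python sketch. It is not how profanity-filter implements the feature; the function name mask_partially and its parameters are made up for illustration.

```python
def mask_partially(word: str, keep: int = 1, censor_char: str = "*") -> str:
    """Keep the first `keep` characters of a word and mask the rest."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    visible = word[:keep]
    return visible + censor_char * (len(word) - len(visible))


print(mask_partially("heck"))                           # h***
print(mask_partially("heck", keep=2, censor_char="#"))  # he##
```

The first letter stays readable, so the reader can still tell which word was masked.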

By the end of this tutorial, you’ll be able to write Python programs that can effectively censor profane words from a given text. So, let’s get started.

Setting Up the Environment

Before we start, let’s make sure you have the right tools. Here’s what you’ll need:

  1. Install profanity-filter: pip install profanity-filter
  2. Download Language Resources: python -m spacy download en
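
If you’d like to confirm the installation before moving on, a small check like the one below may help. It is only a sketch that uses the standard library: missing_modules is a helper name invented here, and it checks that the modules can be found, not that the language model was downloaded.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names from `names` that Python cannot find."""
    return [name for name in names if find_spec(name) is None]


# profanity_filter is the import name used throughout this tutorial;
# spacy is the package that provides the language resources.
missing = missing_modules(["profanity_filter", "spacy"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```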

Censoring Bad Words from a Text

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()
print(pf.censor("You've stolen my money, you bastard!"))

# Censor individual words
print(pf.censor_word('lesbian'))

Output

You've stolen my money, you *******!
*******
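
To build some intuition for what a call like censor() does, here is a simplified sketch using only the re module. It is not the library’s implementation: SAMPLE_WORDS is a tiny stand-in word list and censor_text is a hypothetical function written for this example.

```python
import re

# Hypothetical stand-in word list; the real library ships its own dictionaries.
SAMPLE_WORDS = {"darn", "heck"}


def censor_text(text, words=SAMPLE_WORDS, censor_char="*"):
    """Replace each listed word with censor_char repeated to the word's length."""
    if not words:
        return text
    # \b keeps the match on whole words, so "checkers" is left alone.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in sorted(words)) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: censor_char * len(m.group(0)), text)


print(censor_text("Well, heck! That darn cat."))
# Well, ****! That **** cat.
```

Each match is replaced by as many censor characters as the word has letters, which is why the censored text keeps its original length.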


Censoring Profanity in Russian Texts

Profanity-Filter extends its support to the Russian language as well. Here’s how you can filter obscene words in Russian texts:

from profanity_filter import ProfanityFilter

# Select Russian Language
pf = ProfanityFilter(languages=['ru'])

# "Блядь" is a Russian swear word, roughly "F**k"
print(pf.censor("Блядь"))

Output

*****

Customizing Censorship

You can control how bad words are censored. By default they are replaced with asterisks (*), but you can set censor_char to another character, such as $ or #.

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

pf.censor_char = '$'
print(pf.censor("You look like a shit"))

Output

You look like a $$$$

Checking Whether a String Contains Any Swear Words

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

bad_text = "You are a bitch!"

# Check if the string contains any bad words or not
print("Not bad words:", pf.is_clean(bad_text))
print("Bad words:", pf.is_profane(bad_text))

Output

Not bad words: False
Bad words: True
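
A check like this is handy for moderating a batch of comments. The sketch below is one possible way to do it: split_comments is a hypothetical helper, and demo_is_clean is a stand-in predicate so the example runs without the library installed.

```python
def split_comments(comments, is_clean):
    """Partition comments into (clean, flagged) using the is_clean predicate."""
    clean, flagged = [], []
    for comment in comments:
        (clean if is_clean(comment) else flagged).append(comment)
    return clean, flagged


# Stand-in predicate for demonstration only.
def demo_is_clean(text):
    return "heck" not in text.lower()


clean, flagged = split_comments(["Nice post!", "What the heck?"], demo_is_clean)
print(clean)    # ['Nice post!']
print(flagged)  # ['What the heck?']
```

In a real program you would pass pf.is_clean instead of demo_is_clean.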

Censoring Obscene Words with a Custom Word List

First, we’ll add our own custom word list. There is a catch, though: a custom list replaces the default dictionary, so the program censors only the words in our list and lets the usual swear words through. Let’s see an example:

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

pf.custom_profane_word_dictionaries = {'en': {'want', 'marry'}}
print(pf.censor("I want to marry her., shit"))

Output

I **** to ***** her., shit

Let’s solve the above problem. First, we’ll restore the default profane word dictionaries, and then we’ll assign our words to the extra_profane_word_dictionaries property instead of custom_profane_word_dictionaries. Extra words are added on top of the defaults rather than replacing them.

from profanity_filter import ProfanityFilter

pf = ProfanityFilter()

pf.custom_profane_word_dictionaries = {'en': {'want', 'marry'}}
print(pf.censor("I want to marry her., shit"))

# Restore the default profane word dictionaries
pf.restore_profane_word_dictionaries()

# A simple change in the code, custom -> extra
pf.extra_profane_word_dictionaries = {'en': {'want', 'marry'}}
print(pf.censor("I want to marry her., shit"))

Output

I **** to ***** her., shit
I **** to ***** her., ****
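
The difference between the two properties can be pictured with plain Python sets. This is only a conceptual model based on the outputs above, not the library’s internals; DEFAULT_WORDS is a tiny stand-in for the built-in dictionary and effective_words is a hypothetical helper.

```python
DEFAULT_WORDS = {"darn", "heck"}  # hypothetical stand-in for the built-in dictionary
MY_WORDS = {"want", "marry"}


def effective_words(default, custom=None, extra=None):
    """Custom words replace the defaults; extra words are added to them."""
    base = set(custom) if custom is not None else set(default)
    return base | set(extra or ())


print(sorted(effective_words(DEFAULT_WORDS, custom=MY_WORDS)))
# ['marry', 'want']
print(sorted(effective_words(DEFAULT_WORDS, extra=MY_WORDS)))
# ['darn', 'heck', 'marry', 'want']
```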

Summary

Congratulations! In this tutorial, you’ve learned how to use the profanity-filter library in Python to censor obscene/profane words from any text data. We’ve explored several programming examples.

The profanity-filter library can actually detect distorted and derivative words, not just exact matches. However, this “deep analysis” feature requires additional libraries and dictionaries specific to your chosen language. Since it’s a more complex setup, I can create a separate tutorial if there’s enough interest. Just let me know in the comments below!
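
To give a rough feel for what catching a disguised word involves, here is a much simpler technique than the library’s deep analysis: undoing common character substitutions before looking a word up. The substitution table and function names below are assumptions made for this sketch.

```python
# Hypothetical substitution table; extend it for your own data.
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(word):
    """Lowercase the word and undo common character substitutions."""
    return word.lower().translate(LEET_MAP)


def is_disguised_match(word, words):
    """Return True if the normalized word appears in the word set."""
    return normalize(word) in words


print(normalize("H3ck"))                             # heck
print(is_disguised_match("d4rn", {"darn", "heck"}))  # True
```

This only handles one-to-one character swaps; derivative forms and misspellings need the heavier machinery mentioned above.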

Level up your Python skills! Visit our dedicated page packed with various Creative Python programs you can try next.

Thanks for reading! I hope you found this tutorial helpful.

Subhankar Rakshit

Hey there! I’m Subhankar Rakshit, the brains behind PySeek. I’m a Post Graduate in Computer Science. PySeek is where I channel my love for Python programming and share it with the world through engaging and informative blogs.
