
Introduction
Have you ever played the game “Six Degrees of Kevin Bacon”? It’s a fun and intriguing way to connect any actor or actress to Kevin Bacon within six movies. But did you know that the game is rooted in a broader idea about how closely people are connected? In this article, we will look at the origins of the game, its rules, and some fascinating examples of how actors and actresses can be linked to Kevin Bacon. Then we will build a project named Six Degrees of Kevin Bacon in Python.
In this project, we will use a classic Artificial Intelligence search technique to find the shortest chain between two actors or actresses, where each shared movie counts as one step of distance.
The Origins of Six Degrees of Kevin Bacon
The concept of Six Degrees of Kevin Bacon emerged in the mid-1990s when a group of college students at Albright College in Pennsylvania started playing a game based on the idea that any actor could be connected to Kevin Bacon through six or fewer co-starring roles. It was inspired by the popular theory of six degrees of separation, which suggests that any two people in the world can be connected through a chain of acquaintances in six steps or less.
The idea is usually traced to a conversation between Kevin Bacon and a journalist, in which Bacon mentioned that he had worked with almost everyone in Hollywood or with someone who had worked with them. The resulting “Six Degrees of Kevin Bacon” game quickly gained popularity and became a cultural phenomenon.
The Rules of the Game
The rules of the game are simple. The objective is to connect any actor or actress to Kevin Bacon within six movies. Each movie connection counts as one degree of separation. For example, if an actor has worked directly with Kevin Bacon in a film, they have a Bacon number of one. If they have worked with someone who has worked with Kevin Bacon, their Bacon number is two, and so on.
It’s important to note that the game is not limited to Kevin Bacon alone. The concept can be applied to any actor or actress in the film industry. However, Kevin Bacon is often used as the central figure due to his extensive filmography and the diverse range of actors he has worked with.
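To make the Bacon number concrete, here is a minimal sketch that computes it on a tiny, made-up set of cast lists. The movie titles, the actor names other than Kevin Bacon, and the `bacon_numbers` helper are all invented for illustration and are not part of the project code.

```python
from collections import deque

# Hypothetical cast lists, invented purely for illustration
casts = {
    "Movie A": {"Kevin Bacon", "Actor 1"},
    "Movie B": {"Actor 1", "Actor 2"},
    "Movie C": {"Actor 2", "Actor 3"},
    "Movie D": {"Actor 4"},  # shares no cast member with any other movie
}


def bacon_numbers(casts, center="Kevin Bacon"):
    """Return {actor: number of movie links separating them from `center`}."""
    numbers = {center: 0}
    queue = deque([center])
    while queue:
        actor = queue.popleft()
        for cast in casts.values():
            if actor not in cast:
                continue
            for co_star in cast:
                # The first time we reach someone is via a shortest chain
                if co_star not in numbers:
                    numbers[co_star] = numbers[actor] + 1
                    queue.append(co_star)
    return numbers


numbers = bacon_numbers(casts)
print(numbers.get("Actor 1"))  # 1: appeared with Kevin Bacon directly
print(numbers.get("Actor 3"))  # 3: Bacon -> Actor 1 -> Actor 2 -> Actor 3
print(numbers.get("Actor 4"))  # None: no chain of shared movies exists
```

Passing a different `center` gives the same kind of number for any other actor, which is the sense in which the game is not limited to Kevin Bacon.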
The Project Details
There are a ton of movies worldwide, featuring numerous actors and actresses. The goal of this project is to find the shortest chain of connections between any two actors or actresses. As we discussed earlier, a shared movie serves as a connection here.
The project folder follows this hierarchy:
- six-degrees-of-kevin-bacon
  - large
    - people.csv
    - movies.csv
    - stars.csv
  - small
    - people.csv
    - movies.csv
    - stars.csv
  - degrees.py
  - util.py
The large folder contains a huge amount of data stored in separate CSV files, each containing over a million data entries. In the small folder, you’ll find the same CSV files, but they contain a smaller dataset intended for pre-testing purposes.
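As a rough sketch of how the three files relate, the snippet below parses tiny in-memory stand-ins for them. The column names are the keys that degrees.py reads from each file; the rows themselves (ids, names, titles, years) are made up and do not come from the real dataset.

```python
import csv
import io

# Column names match the keys degrees.py reads; every row is invented.
people_csv = io.StringIO(
    "id,name,birth\n"
    "1,Example Actor,1970\n"
    "2,Example Actress,1980\n"
)
movies_csv = io.StringIO(
    "id,title,year\n"
    "10,Example Movie,2000\n"
)
stars_csv = io.StringIO(
    "person_id,movie_id\n"
    "1,10\n"
    "2,10\n"
)

people_rows = list(csv.DictReader(people_csv))
movie_rows = list(csv.DictReader(movies_csv))
star_rows = list(csv.DictReader(stars_csv))

# Each stars.csv row links one person to one movie,
# so the cast of a movie is every person_id paired with its id.
cast = {row["person_id"] for row in star_rows if row["movie_id"] == "10"}

print(people_rows[0]["name"])
print(sorted(cast))
```

Note that csv.DictReader returns every value as a string, so ids such as "10" are compared as text rather than as numbers.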
Real-life Examples
Let’s explore some real-life examples that can be solved using this Artificial Intelligence Project.
Example 1
Suppose we want to find the connection between Tom Cruise and Tom Hanks. Our project yields the following optimal result:
2 degrees of separation.
1: Tom Hanks and Bill Paxton starred in Apollo 13
2: Bill Paxton and Tom Cruise starred in Edge of Tomorrow

It’s important to note that, because the program searches through millions of entries and several equally short chains often exist between the same two people, the result may differ from run to run. For instance, with the same input as above, the program may also return the following result:
2 degrees of separation.
1: Tom Cruise and Craig T. Nelson starred in All the Right Moves
2: Craig T. Nelson and Tom Hanks starred in Turner & Hooch

The main point is that, although the distance stays the same every time, the connections might be different.
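The effect can be reproduced on a toy graph. The sketch below is not the project code: the graph, the `bfs_path` function, and its `reverse_neighbors` flag are illustrative assumptions. It shows that a breadth-first search always returns a path of the same length, while the exact path depends on the order in which neighbors are visited. In the real program the neighbors live in a Python set, whose iteration order is not guaranteed, which is the most likely reason the output changes between runs.

```python
from collections import deque

# Toy co-star graph with two equally short routes from A to D (placeholder names)
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}


def bfs_path(graph, source, target, reverse_neighbors=False):
    """Breadth-first search that returns one shortest path as a list of nodes."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            # Walk back through the parents to rebuild the path
            path = []
            while current is not None:
                path.append(current)
                current = parents[current]
            return path[::-1]
        neighbors = graph[current]
        if reverse_neighbors:
            neighbors = list(reversed(neighbors))
        for neighbor in neighbors:
            if neighbor not in parents:
                parents[neighbor] = current
                queue.append(neighbor)
    return None


first = bfs_path(graph, "A", "D")
second = bfs_path(graph, "A", "D", reverse_neighbors=True)
print(first)   # ['A', 'B', 'D']
print(second)  # ['A', 'C', 'D']
```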
Example 2
Let’s find a connection between Bollywood and Hollywood using our Python AI Project. This time we will try to find the minimum connections between renowned Bollywood star Shah Rukh Khan and Hollywood icon Tom Cruise. Check out one possible result:
3 degrees of separation.
1: Tom Cruise and Samantha Morton starred in Minority Report
2: Samantha Morton and Om Puri starred in Code 46
3: Om Puri and Shah Rukh Khan starred in Don 2

Set Up the Environment
Download the zip file of the dataset and extract its contents. After that, create two Python files in the unzipped folder: degrees.py and util.py. The code you need is provided in the following sections.
degrees.py
import csv
import sys

from util import Node, StackFrontier, QueueFrontier

# Maps names to a set of corresponding person_ids
names = {}

# Maps person_ids to a dictionary of: name, birth, movies (a set of movie_ids)
people = {}

# Maps movie_ids to a dictionary of: title, year, stars (a set of person_ids)
movies = {}


def load_data(directory):
    """
    Load data from CSV files into memory.
    """
    # Load people
    with open(f"{directory}/people.csv", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            people[row["id"]] = {
                "name": row["name"],
                "birth": row["birth"],
                "movies": set()
            }
            if row["name"].lower() not in names:
                names[row["name"].lower()] = {row["id"]}
            else:
                names[row["name"].lower()].add(row["id"])

    # Load movies
    with open(f"{directory}/movies.csv", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            movies[row["id"]] = {
                "title": row["title"],
                "year": row["year"],
                "stars": set()
            }

    # Load stars
    with open(f"{directory}/stars.csv", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            try:
                people[row["person_id"]]["movies"].add(row["movie_id"])
                movies[row["movie_id"]]["stars"].add(row["person_id"])
            except KeyError:
                # Skip rows that refer to an unknown person or movie
                pass


def main():
    if len(sys.argv) > 2:
        sys.exit("Usage: python degrees.py [directory]")
    directory = sys.argv[1] if len(sys.argv) == 2 else "large"

    # Load data from files into memory
    print("Loading data...")
    load_data(directory)
    print("Data loaded.")

    source = person_id_for_name(input("Name: "))
    if source is None:
        sys.exit("Person not found.")
    target = person_id_for_name(input("Name: "))
    if target is None:
        sys.exit("Person not found.")

    path = shortest_path(source, target)

    if path is None:
        print("Not connected.")
    else:
        degrees = len(path)
        print(f"{degrees} degrees of separation.")
        path = [(None, source)] + path
        for i in range(degrees):
            person1 = people[path[i][1]]["name"]
            person2 = people[path[i + 1][1]]["name"]
            movie = movies[path[i + 1][0]]["title"]
            print(f"{i + 1}: {person1} and {person2} starred in {movie}")


def shortest_path(source, target):
    """
    Returns the shortest list of (movie_id, person_id) pairs
    that connect the source to the target.

    If no possible path, returns None.
    """
    # Source and target are the same person: zero degrees of separation
    if source == target:
        return []

    # Breadth-first search: the queue frontier visits people in order of distance
    start = Node(state=source, parent=None, action=None)
    frontier = QueueFrontier()
    frontier.add(start)

    # Person ids that have already been placed on the frontier
    explored = {source}

    while not frontier.empty():
        # Choose the oldest node from the frontier
        node = frontier.remove()

        for movie_id, person_id in neighbors_for_person(node.state):
            # Skip people already reached by a path that is at least as short
            if person_id in explored:
                continue
            explored.add(person_id)
            child = Node(state=person_id, parent=node, action=movie_id)

            # Checking for the target as soon as a child is created
            # avoids expanding one more full level of the search
            if person_id == target:
                solution = []
                while child.parent is not None:
                    solution.append((child.action, child.state))
                    child = child.parent
                solution.reverse()
                return solution

            frontier.add(child)

    # The frontier ran out without reaching the target
    return None


def person_id_for_name(name):
    """
    Returns the IMDB id for a person's name,
    resolving ambiguities as needed.
    """
    person_ids = list(names.get(name.lower(), set()))
    if len(person_ids) == 0:
        return None
    elif len(person_ids) > 1:
        print(f"Which '{name}'?")
        for person_id in person_ids:
            person = people[person_id]
            name = person["name"]
            birth = person["birth"]
            print(f"ID: {person_id}, Name: {name}, Birth: {birth}")
        try:
            person_id = input("Intended Person ID: ")
            if person_id in person_ids:
                return person_id
        except ValueError:
            pass
        return None
    else:
        return person_ids[0]


def neighbors_for_person(person_id):
    """
    Returns (movie_id, person_id) pairs for people
    who starred with a given person.
    """
    movie_ids = people[person_id]["movies"]
    neighbors = set()
    for movie_id in movie_ids:
        for person_id in movies[movie_id]["stars"]:
            neighbors.add((movie_id, person_id))
    return neighbors


if __name__ == "__main__":
    main()
This is the runner program. It finds the shortest connections between two people using data from CSV files. It works with information about actors/actresses, movies, and the relationships between them. Here’s a simplified breakdown:
Key Components
- Data Structures: The program creates three dictionaries, names, people, and movies, to store information about people, movies, and their connections.
- Loading Data: The load_data function loads information from CSV files into memory, organizing data into dictionaries.
- Main Function: The main function prompts users to input the names of two people and then calculates the shortest path (connections) between them in terms of the movies they starred in.
- Shortest Path Function: It determines the shortest path between two people using a breadth-first search algorithm.
- Person Id for Name: The person_id_for_name function plays a crucial role in resolving ambiguities related to individuals’ names.
  - When a user inputs a person’s name, this function converts the name to the corresponding IMDB ID.
  - It utilizes the names dictionary, which maps lowercase names to sets of person IDs.
  - If there’s more than one person with the same name, the function prompts the user to choose the intended person.
- Neighbors for Person: The neighbors_for_person function identifies the movie connections of a given person.
  - Given a person’s ID, the function looks up their associated movies from the people dictionary, which contains information about individuals and the movies they’ve been in.
  - It then goes through each of those movies and identifies the other people who starred in them.
  - For each movie, it retrieves the person IDs of co-stars from the movies dictionary.
  - The function forms pairs of (movie_id, person_id) to represent the connections between the given person and their co-stars in specific movies.
Important Note: Our program can handle cases where there are multiple people with the same name and offers the user options to choose the intended person.
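To see how the three dictionaries work together, here is a small self-contained sketch with toy data. The ids, names, and titles are invented, and `candidate_ids` and `co_stars` are hypothetical helpers written for this example rather than functions from degrees.py. Unlike neighbors_for_person, `co_stars` leaves the person out of their own neighbor set.

```python
# Toy versions of the three dictionaries; ids and names are invented
names = {
    "alex smith": {"1", "2"},  # two different people share this name
    "jo lee": {"3"},
}
people = {
    "1": {"name": "Alex Smith", "birth": "1960", "movies": {"10"}},
    "2": {"name": "Alex Smith", "birth": "1985", "movies": {"11"}},
    "3": {"name": "Jo Lee", "birth": "1975", "movies": {"10", "11"}},
}
movies = {
    "10": {"title": "First Film", "year": "1990", "stars": {"1", "3"}},
    "11": {"title": "Second Film", "year": "2010", "stars": {"2", "3"}},
}


def candidate_ids(name):
    """Hypothetical helper: every person id stored under a name, ignoring case."""
    return names.get(name.lower(), set())


def co_stars(person_id):
    """Same idea as neighbors_for_person, minus the person themselves."""
    return {
        (movie_id, other_id)
        for movie_id in people[person_id]["movies"]
        for other_id in movies[movie_id]["stars"]
        if other_id != person_id
    }


print(sorted(candidate_ids("ALEX SMITH")))  # ['1', '2'] -> the user must pick one
print(sorted(co_stars("3")))                # [('10', '1'), ('11', '2')]
```

Because the lookup key is lowercased, "ALEX SMITH" and "Alex Smith" resolve to the same set of ids, and a set with more than one id is exactly the case where the program asks the user to choose.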
util.py
class Node():
    def __init__(self, state, parent, action):
        self.state = state
        self.parent = parent
        self.action = action


class StackFrontier():
    def __init__(self):
        self.frontier = []

    def add(self, node):
        self.frontier.append(node)

    def contains_state(self, state):
        return any(node.state == state for node in self.frontier)

    def empty(self):
        return len(self.frontier) == 0

    def remove(self):
        if self.empty():
            raise Exception("empty frontier")
        else:
            # Take the most recently added node (last in, first out)
            node = self.frontier[-1]
            self.frontier = self.frontier[:-1]
            return node


class QueueFrontier(StackFrontier):

    def remove(self):
        if self.empty():
            raise Exception("empty frontier")
        else:
            # Take the oldest node (first in, first out)
            node = self.frontier[0]
            self.frontier = self.frontier[1:]
            return node
In this code, we have three classes to help us navigate through possible paths:
- Node Class: Represents a point in our exploration. Each node has a state (here, a person ID), a parent (the node we came from), and an action (here, the movie ID that links the parent to this person).
- StackFrontier Class: Manages a stack (last in, first out) of nodes waiting to be explored. We can add nodes, check whether a state is already in the frontier, check if it’s empty, and remove the last node.
- QueueFrontier Class (inherits from StackFrontier): A variation that works like a queue (first in, first out). It’s similar to StackFrontier but removes nodes from the front of the queue instead of the end. This is the frontier degrees.py uses, which is what makes the search breadth-first.
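The difference between the two removal orders is easy to see in isolation. The sketch below uses a hypothetical `DequeFrontier` class built on collections.deque instead of the classes from util.py. It is only an illustration of last-in-first-out versus first-in-first-out behavior, not a drop-in replacement.

```python
from collections import deque


class DequeFrontier:
    """Hypothetical frontier that can behave as a stack or as a queue."""

    def __init__(self, fifo):
        self.fifo = fifo
        self.items = deque()

    def add(self, item):
        self.items.append(item)

    def empty(self):
        return len(self.items) == 0

    def remove(self):
        if self.empty():
            raise Exception("empty frontier")
        # popleft() takes the oldest item (queue); pop() takes the newest (stack)
        return self.items.popleft() if self.fifo else self.items.pop()


def drain(frontier, items):
    """Add every item, then remove them all and report the removal order."""
    for item in items:
        frontier.add(item)
    order = []
    while not frontier.empty():
        order.append(frontier.remove())
    return order


queue_order = drain(DequeFrontier(fifo=True), ["a", "b", "c"])
stack_order = drain(DequeFrontier(fifo=False), ["a", "b", "c"])
print(queue_order)  # ['a', 'b', 'c']
print(stack_order)  # ['c', 'b', 'a']
```

One design note: a deque removes items from either end in constant time, whereas slicing a list copies the remaining elements on every removal. On the large dataset that difference may be noticeable, although this tutorial does not measure it.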
Output
Loading data...
Data loaded.
Name: Tom Cruise
Name: Shah Rukh Khan
3 degrees of separation.
1: Tom Cruise and Samantha Morton starred in Minority Report
2: Samantha Morton and Om Puri starred in Code 46
3: Om Puri and Shah Rukh Khan starred in Don 2
Summary
The “Six Degrees of Kevin Bacon” game is a captivating way to explore the interconnection of the film industry. It showcases how actors and actresses from different generations, genres, and backgrounds can be linked through a relatively small number of connections.
The idea is not limited to the film industry. The game is based on the theory of six degrees of separation, which suggests that any two people in the world can be connected through a chain of acquaintances in six steps or less.
In this tutorial, we built a project called Six Degrees of Kevin Bacon in Python. We worked with a very large dataset, with millions of entries, that holds information about actors, actresses, and the movies they’ve been in from all over the world. The project uses breadth-first search, a classic Artificial Intelligence search technique, to find the shortest chain of connections between two actors or actresses, where each shared movie counts as one connection.
As our ‘Six Degrees of Kevin Bacon in Python’ journey wraps up, know that connections are endless. Grab your metaphorical popcorn, because in the vast universe of cinema, every ending is just the beginning of a new story.
If you have any burning questions about this Python project, drop them in the comments below. I’m here and ready to help you out.