Create a Weather Prediction AI Model using Python

Weather prediction is one of the most practical applications of machine learning. Using historical weather data, we can build models that forecast temperature, rainfall, and other weather conditions.

In this article, we’ll learn how to create a simple weather prediction AI model using Python. Here, we’ll build two separate models:

One to predict the maximum temperature using linear regression.
Another to predict whether it will rain tomorrow using logistic regression.

Don’t worry, this tutorial is completely beginner-friendly. We will explain each part of the code to help you understand the logic behind this AI weather prediction model.

Requirements

Before we start coding, make sure the following are installed on your system:

Python 3.8+
A weather dataset (a CSV file)
The pandas, scikit-learn, and matplotlib libraries, which you can install with the following command:

pip install pandas scikit-learn matplotlib

Also, keep a weather dataset named weather.csv in the same directory as your script. The code in this tutorial expects the columns MinTemp, MaxTemp, Humidity9am, Pressure9am, WindSpeed9am, Rainfall, and RainTomorrow.

🌡️ Temperature Prediction Model (Linear Regression)

Here, we will create a model that will predict the maximum temperature based on some key weather features.

Import the Required Libraries

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

pandas: It is used for reading and processing the dataset.
LinearRegression: We’ll use this algorithm to predict temperature.
train_test_split: It helps us split the data into training and testing sets.

Load and Clean the Dataset

Real-world data often has missing values. scikit-learn's LinearRegression does not accept them, so the simplest fix is to drop every row that contains one:

data = pd.read_csv("weather.csv").dropna()

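We load the weather.csv file and remove any rows with missing values using the .dropna() method. With no arguments, it drops a row if any column is empty, including columns the model never uses, so check how many rows remain.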
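To see what .dropna() actually removes, here is a minimal sketch on a small made-up DataFrame. The values and the Sunshine column are assumptions for illustration, not taken from weather.csv. It compares dropping rows with any missing value against dropping only rows that are missing a column the model needs.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for weather.csv (values are made up)
sample = pd.DataFrame({
    "MinTemp": [8.0, 14.0, np.nan, 11.5],
    "MaxTemp": [24.3, 26.9, 23.4, 15.5],
    "Humidity9am": [68.0, 80.0, 82.0, 62.0],
    "Sunshine": [6.3, np.nan, 3.3, np.nan],  # a column the model never uses
})

# Count missing values per column before deciding what to drop
print(sample.isna().sum())

# Drop a row if ANY column is missing
strict = sample.dropna()

# Drop a row only if one of the columns the model needs is missing
needed = ["MinTemp", "MaxTemp", "Humidity9am"]
targeted = sample.dropna(subset=needed)

print(len(strict), len(targeted))  # 1 3
```

On data with many sparse columns, the targeted version can keep far more rows for training.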

Select Features and Target

X = data[['MinTemp', 'Humidity9am', 'Pressure9am', 'WindSpeed9am']]
y = data['MaxTemp']

X includes the input features that affect temperature.
y is the target value — the maximum temperature we want to predict.

Split Data into Training and Testing Sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Here, we divide the dataset into two parts using the train_test_split function:

80% for training the model,
20% for testing how well it performs on unseen data.
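The split can be checked on a small synthetic dataset. In this sketch the 100 rows are randomly generated (an assumption for illustration), and the variable names are hypothetical. It shows the resulting shapes and that a fixed random_state reproduces the same split.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 100-row dataset; column names mirror the tutorial's
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "MinTemp": rng.uniform(0, 20, 100),
    "Humidity9am": rng.uniform(30, 100, 100),
    "MaxTemp": rng.uniform(10, 35, 100),
})
X_demo = demo[["MinTemp", "Humidity9am"]]
y_demo = demo["MaxTemp"]

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
print(X_tr.shape, X_te.shape)  # (80, 2) (20, 2)

# The same random_state selects the same rows again
X_tr2, X_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
print(X_te.index.equals(X_te2.index))  # True
```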

Train the Linear Regression Model

model = LinearRegression()
model.fit(X_train, y_train)

In the above code, we create a linear regression model and train it using the training data. The model learns how the input features affect the output temperature.
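What the model "learns" is one coefficient per feature plus an intercept. The sketch below fits LinearRegression on synthetic data generated from a known made-up formula (an assumption, not real weather) so we can see the fitted coefficients land close to the values used to generate it.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
min_temp = rng.uniform(0, 20, 200)
humidity = rng.uniform(30, 100, 200)

# Made-up relationship: MaxTemp = 12 + 1.0*MinTemp - 0.05*Humidity9am + noise
max_temp = 12 + 1.0 * min_temp - 0.05 * humidity + rng.normal(0, 0.5, 200)

X_demo = pd.DataFrame({"MinTemp": min_temp, "Humidity9am": humidity})
demo_model = LinearRegression().fit(X_demo, max_temp)

# coef_ holds one weight per input column, in column order
for name, coef in zip(X_demo.columns, demo_model.coef_):
    print(f"{name}: {coef:+.3f}")
print(f"intercept: {demo_model.intercept_:.3f}")
```

On the real dataset, the same two attributes (coef_ and intercept_) show how much each feature moves the predicted maximum temperature.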

Evaluate the Model

print(f"R² Score: {model.score(X_test, y_test):.2f}")

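The R² score (coefficient of determination) measures the share of the variance in MaxTemp that the model explains on the test set. A score of 1 means perfect predictions, 0 means the model does no better than always predicting the mean, and the score can be negative for a model that does worse than that.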

Output

R² Score: 0.74
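For regressors, model.score() returns R². A minimal sketch of the calculation, using five made-up temperature pairs (not results from the tutorial's model), shows where the number comes from. It also computes the mean absolute error, which reads directly in °C.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Made-up actual and predicted maximum temperatures in °C
y_true = np.array([22.0, 25.5, 19.0, 30.0, 27.5])
y_hat = np.array([23.0, 24.0, 20.5, 28.0, 27.0])

# R² = 1 - (squared error of the model) / (squared error of predicting the mean)
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

mae = mean_absolute_error(y_true, y_hat)

print(f"R² by hand:   {r2_manual:.3f}")
print(f"R² (sklearn): {r2_score(y_true, y_hat):.3f}")
print(f"MAE:          {mae:.2f} °C")
```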

Temperature Prediction

new_data = pd.DataFrame([[10.0, 60, 1015, 15]], columns=X_train.columns)
print(f"Predicted MaxTemp: {model.predict(new_data)[0]:.1f}°C")

Here, we use a new sample input to predict the maximum temperature. You can change the values to experiment with different weather conditions.

Output

R² Score: 0.74
Predicted MaxTemp: 23.0°C

🌧️ Rain Prediction Model (Logistic Regression)

Now we will create the rain prediction model using logistic regression.

Import Required Libraries

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

From the above code, we’ll use:

LogisticRegression to build a binary classification model.
accuracy_score and classification_report to evaluate how well our model predicts rain.
train_test_split() to split our dataset into two parts:
- Training Set – The part of the data used to train our machine learning model.
- Testing Set – The part of the data used to test how well our model performs on unseen data.

Load and Preprocess the Data

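# Columns the rain model needs; rows missing any of them are dropped
required = ['MinTemp', 'MaxTemp', 'Humidity9am', 'Rainfall', 'Pressure9am', 'RainTomorrow']
data = pd.read_csv("weather.csv").dropna(subset=required)
data['RainTomorrow'] = data['RainTomorrow'].map({'Yes': 1, 'No': 0})

In the above code:

We remove all the rows that are missing RainTomorrow or any of the feature columns used below, because scikit-learn's LogisticRegression does not accept missing values.
Then, we convert the 'Yes'/'No' labels into 1 and 0, which are easier for machine learning algorithms to work with.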
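One thing to watch with .map(): any label that is not exactly 'Yes' or 'No' silently becomes NaN. The sketch below uses a made-up label column with inconsistent spelling (an assumption, since the tutorial's dataset may already be clean) to show the effect and one possible way to normalize the text first.

```python
import pandas as pd

# Hypothetical labels with inconsistent case, stray whitespace, and a missing value
labels = pd.Series(["Yes", "No", "yes", "No ", None])

# Direct mapping: anything not exactly 'Yes' or 'No' turns into NaN
mapped = labels.map({"Yes": 1, "No": 0})
print(mapped.isna().sum())  # 3

# Normalize the text before mapping, so only the truly missing value stays NaN
cleaned = labels.str.strip().str.capitalize().map({"Yes": 1, "No": 0})
print(cleaned.isna().sum())  # 1
```

Checking isna().sum() after mapping is a cheap way to catch unexpected labels before training.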

Select Features and Target

features = ['MinTemp', 'MaxTemp', 'Humidity9am', 'Rainfall', 'Pressure9am']
X = data[features]
y = data['RainTomorrow']

Here:

X contains the features that we believe affect rain (temperature, humidity, rainfall, etc.).
y is the target column: whether it will rain tomorrow (1 = Yes, 0 = No).

Split Data into Training and Testing Sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

As in the temperature model, we divide the dataset into two parts: 80% for training and 20% for testing.
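Rainy days are usually rarer than dry ones, so a purely random split can leave the test set with very few of them. A possible refinement, not used in the tutorial's code, is the stratify argument of train_test_split, which keeps the class ratio the same in both parts. The labels below are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 dry days (0) and 20 rainy days (1)
y_demo = pd.Series([0] * 80 + [1] * 20, name="RainTomorrow")
X_demo = pd.DataFrame({"Humidity9am": np.linspace(30, 100, 100)})

# Share of each class in the full dataset
print(y_demo.value_counts(normalize=True))

# stratify=y_demo preserves the 80/20 class ratio in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)
print(f"rainy share in train: {y_tr.mean():.2f}, in test: {y_te.mean():.2f}")
```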

Train the Logistic Regression Model

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

Using the above code, we train our model to classify the next day as rainy or not. max_iter=1000 raises the solver's iteration limit from the default of 100, which gives it more room to converge on unscaled features. It does not guarantee convergence.
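Another common way to help the solver converge is to standardize the features before fitting. The sketch below is an alternative to raising max_iter, not what the tutorial's code does. It trains on synthetic data generated from a made-up rule, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300
demo = pd.DataFrame({
    "Humidity9am": rng.uniform(30, 100, n),
    "Pressure9am": rng.uniform(995, 1035, n),
})

# Made-up rule: rain is more likely with high humidity and low pressure
signal = 0.08 * (demo["Humidity9am"] - 65) - 0.15 * (demo["Pressure9am"] - 1015)
rain = (signal + rng.normal(0, 1, n) > 0).astype(int)

# The scaler puts every feature on a comparable scale before the classifier sees it
scaled_model = make_pipeline(StandardScaler(), LogisticRegression())
scaled_model.fit(demo, rain)

# n_iter_ reports how many iterations the solver actually needed
print(scaled_model.named_steps["logisticregression"].n_iter_)
print(f"training accuracy: {scaled_model.score(demo, rain):.2f}")
```

Because the scaler lives inside the pipeline, the same scaling is applied automatically when calling predict() on new data.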

Evaluate the Model

y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred))

In the above code:

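Accuracy is the fraction of test samples the model predicts correctly.
The classification report lists precision, recall, and F1-score for each class. These matter most when one class (here, rainy days) is much rarer than the other, because accuracy alone can hide poor performance on the rare class.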

Predict if It Will Rain Tomorrow

new_data = pd.DataFrame([[15.0, 25.0, 80, 0.0, 1015]], columns=features)
print("Will it rain tomorrow?", "Yes" if model.predict(new_data)[0] == 1 else "No")

We have given a sample input to test our model. You can adjust the input values to simulate various weather conditions and see if it predicts rain or not.
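predict() only returns a hard Yes/No. If we want to know how confident the model is, LogisticRegression also provides predict_proba(). The sketch below uses synthetic single-feature data and a hypothetical threshold of 0.3. Both are assumptions for illustration, not part of the tutorial's code.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 400
demo = pd.DataFrame({"Humidity9am": rng.uniform(30, 100, n)})

# Made-up rule: rain becomes likely once humidity (plus noise) exceeds 75
rain = (demo["Humidity9am"] + rng.normal(0, 10, n) > 75).astype(int)

clf = LogisticRegression(max_iter=1000).fit(demo, rain)

sample = pd.DataFrame({"Humidity9am": [50.0, 90.0]})

# Column 1 of predict_proba is the probability of class 1 (rain)
proba = clf.predict_proba(sample)[:, 1]
print(proba.round(2))

# Hypothetical cut-off below the default 0.5, to flag more days as rainy
threshold = 0.3
print((proba >= threshold).astype(int))
```

Lowering the threshold trades some false alarms for fewer missed rainy days.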

Output

Accuracy: 0.82
              precision    recall  f1-score   support

           0       0.82      1.00      0.90        58
           1       1.00      0.19      0.32        16

    accuracy                           0.82        74
   macro avg       0.91      0.59      0.61        74
weighted avg       0.86      0.82      0.77        74

Will it rain tomorrow? No
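The report above is worth a closer look. Its rounded figures are consistent with the confusion counts used below (58 dry days all predicted correctly, and only 3 of 16 rainy days caught). These counts are inferred from the report, not stated in the original output. The sketch recomputes the metrics from them and compares accuracy with a baseline that always predicts "no rain".

```python
# Counts inferred from the rounded classification report, not printed by the model
tn, fp, fn, tp = 58, 0, 13, 3
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision_rain = tp / (tp + fp)   # of the days predicted rainy, how many were rainy
recall_rain = tp / (tp + fn)      # of the rainy days, how many were predicted
f1_rain = 2 * precision_rain * recall_rain / (precision_rain + recall_rain)

# Baseline: always predict "no rain"
baseline_accuracy = (tn + fp) / total

print(f"accuracy:          {accuracy:.2f}")           # 0.82
print(f"precision (rain):  {precision_rain:.2f}")     # 1.00
print(f"recall (rain):     {recall_rain:.2f}")        # 0.19
print(f"F1 (rain):         {f1_rain:.2f}")            # 0.32
print(f"baseline accuracy: {baseline_accuracy:.2f}")  # 0.78
```

So the 0.82 accuracy is only slightly better than always guessing "No", and the model misses most rainy days. Recall for class 1 is the number to improve.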

Summary

In this tutorial, we learned how to build a Weather Prediction AI Model using Python. We created two separate models using scikit-learn:

A Linear Regression model to predict the maximum temperature.
A Logistic Regression model to predict whether it will rain tomorrow.

We also learned how to preprocess data, select features, split datasets, train models, and make predictions on new inputs, all using just a few lines of Python code.

Remember, this is just the beginning. You can enhance these models using more real-world data, feature engineering, and advanced algorithms for even better predictions.

For any queries related to this topic, contact me at contact@pyseek.com.

Happy Coding!

Frequently Asked Questions

Q: Where can I find weather datasets?

A: Try Kaggle.

Q: Why is my accuracy low?

A: Try collecting more data or adding better features.

Q: Can I predict temperature instead?

A: Yes! Use LinearRegression instead of LogisticRegression and set the target to a numeric column such as MaxTemp, as in the first model in this tutorial.

Subhankar Rakshit
