This project is an Intrusion Detection System (IDS) using machine learning (ML) and deep learning (DL) to detect network intrusions. It leverages the CICIDS2018 dataset to classify traffic as normal or malicious. Key features include data preprocessing, model training, hyperparameter tuning, and Docker containerization for scalable deployment.


Intrusion Detection System (IDS): The "Accuracy Paradox" Case Study

Python · TensorFlow · Docker · License: MIT


🚀 MAJOR UPDATE – November 2025

This repository now hosts the code for our paper:
"A Reproducible and Explainable Intrusion Detection System: A Case Study on the 'Accuracy Paradox'".

🔑 Key Findings

  • XGBoost outperforms CNN+LSTM by 19.5× in throughput
  • Accuracy Paradox: 98.43% Accuracy but 0% Recall on critical attacks like Infiltration
  • Full Reproducibility: The entire pipeline now runs from end to end with a single script

Stay tuned!


Table of Contents

  • Overview
  • System Architecture
  • Project Structure
  • Key Features
  • Performance & The Accuracy Paradox
  • Installation
  • Usage & Reproducibility
  • Docker Setup
  • Future Work
  • Contributing
  • License

Overview

This project implements an end-to-end, reproducible MLOps pipeline for Network Intrusion Detection.
We benchmark two models on the CICIDS2018 dataset:

  • XGBoost (tabular ML, CPU)
  • CNN+LSTM (deep learning, GPU)

Unlike typical IDS implementations, this work focuses on:

  • Reproducibility
  • Explainability

We integrate SHAP (Shapley Additive Explanations) to diagnose how models fail on minority classes even when reporting 98%+ accuracy.
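
To make the XAI workflow concrete, the snippet below sketches how SHAP values can be computed for a tree model. It is a minimal, self-contained example on synthetic data, not the repository's run_xai.py; the toy dataset, model hyperparameters, and plot choice are placeholders.

# Minimal SHAP sketch on synthetic data (illustrative only; run_xai.py is authoritative).
import shap
import xgboost as xgb
from sklearn.datasets import make_classification

# Toy imbalanced dataset standing in for preprocessed flow features.
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=42)

model = xgb.XGBClassifier(n_estimators=100, max_depth=4)
model.fit(X, y)

# TreeExplainer is efficient for tree ensembles such as XGBoost.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Summary plot: global feature importance and the direction of each feature's effect.
shap.summary_plot(shap_values, X, show=False)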

System Architecture


Project Structure

Note: Data and trained models are generated automatically via the pipeline and are not stored in the repository.

├── Dockerfile                    # Reproducible container environment
├── requirements.txt              # Dependencies
├── setup.py                      # Package installer
├── create_master_dataset.py      # Generates stratified CICIDS2018 sample (CRITICAL)
├── run_benchmarks.py             # Latency & Throughput benchmarking
├── run_xai.py                    # SHAP explainability plots
├── Fig1_XGBoost_Matrix_NEW.png   # Results image
├── FigF2_CNN_Matrix_NEW.png      # Results image
│
├── src/
│   ├── components/
│   │   ├── data_ingestion.py         # Train/Test split
│   │   ├── data_transformation.py    # Scaling, Encoding, LDA Feature Selection
│   │   ├── model_trainer.py          # XGBoost + CNN-LSTM training
│   │   └── optuna_tuner.py           # Hyperparameter optimization
│   ├── utils.py
│   ├── logger.py
│   └── exception.py
│
├── artifacts/                   # Generated models (*.pkl, *.keras)
├── dataset/                     # Generated sampled CSV
└── logs/                        # Runtime logs

Key Features

  • Statistically Valid Sampling
    create_master_dataset.py generates a memory-safe, stratified sample of the full CICIDS2018 dataset (a minimal sketch follows this list).

  • Comparative Benchmarking
    run_benchmarks.py automates latency and throughput comparisons for CPU vs GPU models.

  • Explainable AI (XAI)
    Integrated SHAP plots reveal feature-level reasoning and model bias.

  • Strict MLOps Principles
    Modular code, typed exceptions, logging, and full Dockerization.
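
The stratified sampling mentioned in the first feature can be sketched with plain pandas as shown below. The input path, the "Label" column name, and the 10% fraction are assumptions for illustration; the real logic lives in create_master_dataset.py.

# Stratified-sampling sketch (illustrative; create_master_dataset.py is authoritative).
import pandas as pd

df = pd.read_csv("dataset/cicids2018_merged.csv")  # hypothetical merged input CSV

# Sample the same fraction from every class so rare attacks keep their relative frequency.
sampled = df.groupby("Label").sample(frac=0.10, random_state=42)

sampled.to_csv("dataset/train_data_SAMPLED.csv", index=False)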


Performance & The Accuracy Paradox

Our "Golden Run" results highlight the risk of using accuracy as the primary IDS metric.


1. XGBoost Performance – Champion Model

Metric              Value
------------------  --------------------
Accuracy            98.43%
F1 Score            97.96%
Balanced Accuracy   78.11%
Throughput          185,680 samples/sec

⚠️ The Accuracy Paradox:
Despite 98.43% overall accuracy, the model achieves 0% Recall on the critical Infiltration attack class.
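
A tiny synthetic example makes the arithmetic of the paradox concrete. The numbers below are made up for illustration and are not the paper's results.

# Synthetic illustration of the accuracy paradox: a classifier that never predicts the
# rare class still scores ~98% accuracy. Numbers are illustrative only.
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score, recall_score

y_true = np.array([0] * 9800 + [1] * 200)   # 2% minority class (an Infiltration-like attack)
y_pred = np.zeros_like(y_true)              # "predict Benign for everything"

print(accuracy_score(y_true, y_pred))             # 0.98
print(recall_score(y_true, y_pred, pos_label=1))  # 0.0 -> the attack is never detected
print(balanced_accuracy_score(y_true, y_pred))    # 0.5 -> exposes the failure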

Fig 1: XGBoost confusion matrix – complete failure on minority attacks.


2. CNN+LSTM Performance – Baseline

  • Accuracy: 96.31%
  • Balanced Accuracy: 55.16% (catastrophic)
  • Throughput: 9,522 samples/sec (19.5× slower than XGBoost)

Fig 2: CNN+LSTM confusion matrix – collapses into "predict Benign for everything."


Installation

1. Clone the Repository

git clone https://github.com/MohammedSaim-Quadri/Intrusion_Detection-System.git
cd Intrusion_Detection-System

2. Create a Virtual Environment

python -m venv venv

# Windows
venv\Scripts\activate

# Mac/Linux
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

Usage & Reproducibility

To reproduce the Golden Run, execute the pipeline in this order:


Step 1 – Generate the Master Dataset

Download all CICIDS2018 CSV files and run:

python create_master_dataset.py

Outputs:
dataset/train_data_SAMPLED.csv


Step 2 – Data Ingestion

python src/components/data_ingestion.py

Outputs:
artifacts/train.csv
artifacts/test.csv
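
For orientation, a stratified split that would produce files like these could look as follows. The 80/20 ratio and the "Label" column name are assumptions; src/components/data_ingestion.py remains the source of truth.

# Illustrative train/test split (assumed 80/20 ratio and "Label" column).
import os
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("dataset/train_data_SAMPLED.csv")
train_df, test_df = train_test_split(df, test_size=0.2, stratify=df["Label"], random_state=42)

os.makedirs("artifacts", exist_ok=True)
train_df.to_csv("artifacts/train.csv", index=False)
test_df.to_csv("artifacts/test.csv", index=False)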


Step 3 – Transform and Train

python src/components/data_transformation.py
python src/components/model_trainer.py

Outputs:
artifacts/model_trained.pkl (XGBoost)
artifacts/model_trained.keras (CNN+LSTM)
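
As a rough sketch of the XGBoost half of this step, the snippet below trains and pickles a model on synthetic data. The hyperparameters are placeholders, and the CNN+LSTM branch and Optuna tuning are omitted; model_trainer.py and optuna_tuner.py contain the real logic.

# Minimal XGBoost training-and-saving sketch (illustrative only).
import os
import pickle
import xgboost as xgb
from sklearn.datasets import make_classification

X_train, y_train = make_classification(n_samples=5000, n_features=20, random_state=42)

model = xgb.XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1)
model.fit(X_train, y_train)

os.makedirs("artifacts", exist_ok=True)
with open("artifacts/model_trained.pkl", "wb") as f:
    pickle.dump(model, f)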


Step 4 – Benchmarks & XAI

python run_benchmarks.py
python run_xai.py
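
The reported latency and throughput figures come from run_benchmarks.py; the snippet below only sketches the measurement idea on a toy model, with synthetic data and arbitrary model settings.

# Throughput/latency measurement sketch (illustrative; run_benchmarks.py is authoritative).
import time
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=50000, n_features=20, random_state=42)
model = xgb.XGBClassifier(n_estimators=100).fit(X, y)

start = time.perf_counter()
model.predict(X)
elapsed = time.perf_counter() - start

print(f"Mean latency: {elapsed / len(X) * 1e6:.2f} microseconds/sample")
print(f"Throughput:   {len(X) / elapsed:,.0f} samples/sec")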

Docker Setup

Build the Docker Image

docker build -t ids-system .

Run the Container

docker run --rm ids-system

This executes the complete pipeline inside an isolated environment.


Future Work

  • Address Class Imbalance
    Implement SMOTE/GAN-based synthetic oversampling to recover minority-class recall (see the sketch after this list).

  • Real-Time Deployment
    Connect prediction to live packet capture with CICFlowMeter.

  • Ensemble Stacking
    Combine XGBoost + CNN to capture complementary patterns.
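
A possible starting point for the class-imbalance item above is SMOTE from imbalanced-learn. The sketch below uses synthetic data and is not part of the current pipeline.

# SMOTE oversampling sketch with imbalanced-learn (a possible future direction, not in the repo).
# Oversample only the training split so synthetic samples never leak into evaluation.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X_train, y_train = make_classification(n_samples=10000, n_features=20,
                                        weights=[0.98, 0.02], random_state=42)

X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

print(np.bincount(y_train))  # heavily imbalanced
print(np.bincount(y_res))    # classes balanced after oversampling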


Contributing

Contributions are welcome!
Please fork the repository and submit a pull request.

git checkout -b feature/MyFeature
git commit -m "Add MyFeature"
git push origin feature/MyFeature

License

This project is licensed under the MIT License.
See the LICENSE file for details.
