This repository now hosts the code for our paper:
"A Reproducible and Explainable Intrusion Detection System: A Case Study on the 'Accuracy Paradox'".
- XGBoost outperforms CNN+LSTM by 19.5× in throughput
- Accuracy Paradox: 98.43% Accuracy but 0% Recall on critical attacks like Infiltration
- Full Reproducibility: The entire pipeline now runs from end to end with a single script
Stay tuned!
## Table of Contents

- Overview
- Project Structure
- Key Features
- Performance & The Accuracy Paradox
- Installation
- Usage & Reproducibility
- Docker Setup
- Future Work
- Contributing
- License
## Overview

This project implements an end-to-end, reproducible MLOps pipeline for Network Intrusion Detection. On the CICIDS2018 dataset, we benchmark:

- XGBoost (tabular ML, CPU)
- CNN+LSTM (deep learning, GPU)
Unlike typical IDS implementations, this work focuses on:
- Reproducibility
- Explainability
We integrate SHAP (SHapley Additive exPlanations) to diagnose how the models fail on minority classes even while reporting 98%+ accuracy.
Note: Data and trained models are generated automatically via the pipeline and are not stored in the repository.
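The pipeline itself uses SHAP, but the same kind of feature-level diagnosis can be sketched with scikit-learn's permutation importance, a model-agnostic stand-in for SHAP-style attribution. The model and synthetic features below are toy placeholders, not the paper's:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy stand-in for the IDS feature matrix (illustrative only).
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time and measure the
# drop in test score -- a model-agnostic attribution, similar in spirit
# to the SHAP plots produced by run_xai.py.
result = permutation_importance(model, X_test, y_test, n_repeats=5,
                                random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
print("features ranked by importance:", ranking)
```

Features whose shuffling barely moves the score are ones the model effectively ignores, which is exactly the failure mode to look for on minority attack classes.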
## Project Structure

```
├── Dockerfile                     # Reproducible container environment
├── requirements.txt               # Dependencies
├── setup.py                       # Package installer
├── create_master_dataset.py       # Generates stratified CICIDS2018 sample (CRITICAL)
├── run_benchmarks.py              # Latency & throughput benchmarking
├── run_xai.py                     # SHAP explainability plots
├── Fig1_XGBoost_Matrix_NEW.png    # Results image
├── FigF2_CNN_Matrix_NEW.png       # Results image
│
├── src/
│   ├── components/
│   │   ├── data_ingestion.py      # Train/test split
│   │   ├── data_transformation.py # Scaling, encoding, LDA feature selection
│   │   ├── model_trainer.py       # XGBoost + CNN+LSTM training
│   │   └── optuna_tuner.py        # Hyperparameter optimization
│   ├── utils.py
│   ├── logger.py
│   └── exception.py
│
├── artifacts/                     # Generated models (*.pkl, *.keras)
├── dataset/                       # Generated sampled CSV
└── logs/                          # Runtime logs
```
## Key Features

- **Statistically Valid Sampling**: `create_master_dataset.py` generates a memory-safe, stratified sample of the full CICIDS2018 dataset.
- **Comparative Benchmarking**: `run_benchmarks.py` automates latency and throughput comparisons for the CPU vs. GPU models.
- **Explainable AI (XAI)**: integrated SHAP plots reveal feature-level reasoning and model bias.
- **Strict MLOps Principles**: modular code, typed exceptions, logging, and full Dockerization.
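The stratified-sampling idea can be sketched with pandas. The toy frame and 50% fraction below are hypothetical; the real script operates over the full CICIDS2018 CSVs:

```python
import pandas as pd

# Toy flow table standing in for CICIDS2018; 'Label' is the attack class.
df = pd.DataFrame({
    "flow_duration": range(100),
    "Label": ["Benign"] * 80 + ["Infiltration"] * 10 + ["DDoS"] * 10,
})

# Stratified 50% sample: each class keeps its original proportion,
# so rare attacks are not silently dropped by a naive random sample.
sample = df.groupby("Label").sample(frac=0.5, random_state=42)

print(sample["Label"].value_counts())
```

A plain `df.sample(frac=0.5)` could easily lose most of a class with only a handful of rows; sampling per group guarantees every class survives in proportion.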
## Performance & The Accuracy Paradox

Our "Golden Run" results highlight the risk of using accuracy as the primary IDS metric.

### XGBoost
| Metric | Value |
|---|---|
| Accuracy | 98.43% |
| F1 Score | 97.96% |
| Balanced Accuracy | 78.11% |
| Throughput | 185,680 samples/sec |
Despite 98.43% overall accuracy, the model achieves 0% Recall on the critical Infiltration attack class.

Fig 1: XGBoost confusion matrix – complete failure on minority attacks.
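The paradox is easy to reproduce with scikit-learn. On a hypothetical 99:1 benign-to-attack split (not the paper's exact class ratio), a model that predicts "Benign" for everything still scores 99% accuracy:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             recall_score)

# 990 benign flows (0), 10 attack flows (1) -- hypothetical counts.
y_true = np.array([0] * 990 + [1] * 10)
# A degenerate model that predicts "Benign" for everything:
y_pred = np.zeros_like(y_true)

print(accuracy_score(y_true, y_pred))           # looks excellent
print(recall_score(y_true, y_pred))             # misses every attack
print(balanced_accuracy_score(y_true, y_pred))  # exposes the failure
```

This is why the tables above report balanced accuracy and per-class recall alongside accuracy: only the latter two reveal the 0% recall on Infiltration.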
### CNN+LSTM

- Accuracy: 96.31%
- Balanced Accuracy: 55.16% (catastrophic)
- Throughput: 9,522 samples/sec (19.5× slower than XGBoost)

Fig 2: CNN+LSTM confusion matrix – collapses into "predict Benign for everything."
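Throughput in the style of `run_benchmarks.py` can be sketched as samples predicted per second of wall-clock inference time. The logistic-regression model below is a lightweight stand-in for XGBoost / CNN+LSTM:

```python
import time

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in data and model; the real script times the trained artifacts.
X = np.random.default_rng(0).normal(size=(10_000, 20))
y = (X[:, 0] > 0).astype(int)
model = LogisticRegression().fit(X, y)

# Throughput = samples predicted / wall-clock seconds of inference.
start = time.perf_counter()
model.predict(X)
elapsed = time.perf_counter() - start
throughput = len(X) / elapsed
print(f"{throughput:,.0f} samples/sec")
```

For honest comparisons, the batch size and hardware must be held constant across models, since GPU models amortize launch overhead over large batches.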
## Installation

```bash
git clone https://github.com/MohammedSaim-Quadri/Intrusion_Detection-System.git
cd Intrusion_Detection-System

python -m venv venv

# Windows
venv\Scripts\activate
# Mac/Linux
source venv/bin/activate

pip install -r requirements.txt
```

## Usage & Reproducibility

To reproduce the Golden Run, execute the pipeline in this order:
1. **Create the master dataset.** Download all CICIDS2018 CSV files and run:

   ```bash
   python create_master_dataset.py
   ```

   Outputs: `dataset/train_data_SAMPLED.csv`

2. **Ingest the data.**

   ```bash
   python src/components/data_ingestion.py
   ```

   Outputs: `artifacts/train.csv`, `artifacts/test.csv`

3. **Transform the data and train the models.**

   ```bash
   python src/components/data_transformation.py
   python src/components/model_trainer.py
   ```

   Outputs: `artifacts/model_trained.pkl` (XGBoost), `artifacts/model_trained.keras` (CNN+LSTM)

4. **Benchmark and explain.**

   ```bash
   python run_benchmarks.py
   python run_xai.py
   ```

## Docker Setup

```bash
docker build -t ids-system .
docker run --rm ids-system
```

This executes the complete pipeline inside an isolated environment.
## Future Work

- **Address Class Imbalance**: implement SMOTE/GAN-based synthetic oversampling to fix minority-class recall.
- **Real-Time Deployment**: connect prediction to live packet capture with CICFlowMeter.
- **Ensemble Stacking**: combine XGBoost + CNN to capture complementary patterns.
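As a baseline for the class-imbalance item, naive random oversampling can be sketched with NumPy. SMOTE goes further by interpolating new synthetic minority points rather than duplicating existing ones; the class counts below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Imbalanced toy data: 95 benign (0), 5 attack (1) -- hypothetical counts.
X = rng.normal(size=(100, 4))
y = np.array([0] * 95 + [1] * 5)

# Naive random oversampling: duplicate minority rows until classes match.
minority = np.where(y == 1)[0]
extra = rng.choice(minority, size=95 - 5, replace=True)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

print(np.bincount(y_bal))  # classes are now balanced: [95 95]
```

Duplication alone risks overfitting the few minority rows, which is why SMOTE-style interpolation (e.g. via the imbalanced-learn package) is the planned upgrade.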
## Contributing

Contributions are welcome! Please fork the repository and submit a pull request.

```bash
git checkout -b feature/MyFeature
git commit -m "Add MyFeature"
git push origin feature/MyFeature
```

## License

This project is licensed under the MIT License. See the LICENSE file for details.
