Questions tagged [gpu]
In the context of machine learning, questions about Graphics Processing Units (GPUs) typically concern hardware requirements, design considerations, or the level of parallelization involved in implementing and running machine learning algorithms.
171 questions
5 votes · 1 answer · 72 views
Windows Hyper-V full GPU passthrough
I'm trying to fully pass through my GPU to a Hyper-V VM.
However, all guides and tutorials only partition it, resulting in the GPU not appearing as a GPU in the VM's Task Manager performance tab. My ...
2 votes · 1 answer · 135 views
XGBoost GPU version not outperforming CPU on small dataset despite parameter tuning – suggestions needed
I'm currently working on a Parallel and Distributed Computing project where I'm comparing the performance of both XGBoost and CatBoost when trained on CPU vs GPU. The goal is to demonstrate how GPU ...
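A minimal sketch of the kind of CPU-vs-GPU timing comparison the question describes, assuming XGBoost 2.0+ (where the device parameter selects the backend); the synthetic make_classification data is a stand-in, not the asker's dataset:

```python
# Minimal CPU-vs-GPU timing sketch for XGBoost (assumes xgboost >= 2.0,
# where device="cuda" selects the GPU; older versions use tree_method="gpu_hist").
import time

import xgboost as xgb
from sklearn.datasets import make_classification

# Synthetic data stands in for the project's dataset; on small datasets the CPU
# often wins because GPU kernel-launch and transfer overhead dominates.
X, y = make_classification(n_samples=50_000, n_features=50, random_state=0)
dtrain = xgb.DMatrix(X, label=y)

for device in ("cpu", "cuda"):
    params = {"tree_method": "hist", "device": device, "max_depth": 6}
    start = time.perf_counter()
    xgb.train(params, dtrain, num_boost_round=200)
    print(f"{device}: {time.perf_counter() - start:.2f}s")
```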
6 votes · 0 answers · 65 views
Poor availability on Google Cloud Platform
I am trying to set up a VM on GCP, but every time I try to create an instance in Compute Engine, there is an error message saying that the configuration I asked for is not currently available in the ...
0 votes · 0 answers · 18 views
Confusion Matrix Not Synchronized Properly in DDP with PyTorch Lightning
I am working on a typical classification task using the MNIST dataset and training with PyTorch Lightning and DDP. I am encountering an issue where the row sums in the confusion matrix are not ...
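One common pattern for DDP-safe confusion matrices is to let torchmetrics aggregate state across ranks instead of summing per-rank matrices by hand; a sketch under that assumption (the LightningModule and the 10-class MNIST setup are illustrative):

```python
# Sketch of DDP-safe confusion-matrix accumulation with torchmetrics
# (assumes torchmetrics is installed; compute() aggregates state across ranks).
import pytorch_lightning as pl
from torchmetrics.classification import MulticlassConfusionMatrix


class LitClassifier(pl.LightningModule):
    def __init__(self, model):
        super().__init__()
        self.model = model
        # Registered as a submodule, so Lightning moves it to the right device
        # and torchmetrics syncs its state across DDP ranks at compute() time.
        self.val_confmat = MulticlassConfusionMatrix(num_classes=10)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        preds = self.model(x).argmax(dim=1)
        self.val_confmat.update(preds, y)

    def on_validation_epoch_end(self):
        confmat = self.val_confmat.compute()  # aggregated over all ranks
        if self.trainer.is_global_zero:
            print(confmat)                    # row sums should now match counts
        self.val_confmat.reset()
```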
1 vote · 0 answers · 79 views
How to solve the issue with getting free ports in Pytorch DDP?
I am facing issues with getting a free port in the DDP setup block of PyTorch for parallelizing my deep learning training job across multiple GPUs on a Linux HPC cluster.
I am trying to submit a deep ...
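A frequently used workaround is to let the OS pick an unused port before initializing the process group; a sketch assuming a standard torch.distributed setup (the function names are illustrative, not from the question):

```python
# Sketch: let the OS pick a free port for the DDP rendezvous instead of
# hard-coding MASTER_PORT.
import os
import socket

import torch.distributed as dist


def find_free_port() -> int:
    # Binding to port 0 makes the kernel choose an unused port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))
        return s.getsockname()[1]


def setup(rank: int, world_size: int, port: int):
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = str(port)  # every rank must use the same port
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
```

The port has to be chosen once (for example in the launching process, before spawning workers) and passed to every rank, since all ranks must rendezvous on the same address and port.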
1 vote · 0 answers · 107 views
How to efficiently run a large language model with a 60k+ token context window across multiple GPUs?
I'm working with a large language model (LLM) that requires a large context window of 60,000 to 70,000 tokens for my application. My setup includes five GPUs, with three 16GB GPUs and two 8GB GPUs. I'...
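One approach for mixed-size GPUs is Hugging Face Accelerate's device_map="auto" with an explicit per-device memory budget; a sketch where the checkpoint name and GiB limits are placeholders, and note that the KV cache for a 60k-token context often dominates the budget:

```python
# Sketch: shard an LLM across mixed-size GPUs with Accelerate's device_map
# (the checkpoint name and memory budgets are placeholders, not from the question).
from transformers import AutoModelForCausalLM, AutoTokenizer

max_memory = {
    0: "15GiB", 1: "15GiB", 2: "15GiB",   # the three 16 GB cards
    3: "7GiB", 4: "7GiB",                 # the two 8 GB cards
    "cpu": "64GiB",                       # CPU offload as a fallback
}

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",           # placeholder checkpoint
    device_map="auto",                    # let Accelerate place layers per device
    max_memory=max_memory,
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
```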
1 vote · 0 answers · 140 views
EfficientNetV2-M ONNX model infers significantly slower on small input
When I convert an EfficientNetV2-M model from PyTorch to ONNX with differently sized inputs, I notice strange and unexplained behavior. I was hoping to find an explanation for my observations from ...
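A sketch of how one might time the exported model at two input resolutions with ONNX Runtime to isolate the effect; the file name, shapes, and run counts are assumptions:

```python
# Sketch: time an exported ONNX model at two input sizes with onnxruntime.
# Warm-up runs keep one-time CUDA/cuDNN setup (e.g. algorithm selection)
# out of the measurement.
import time

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("efficientnetv2_m.onnx",      # placeholder file name
                            providers=["CUDAExecutionProvider"])
input_name = sess.get_inputs()[0].name

for size in (128, 384):
    x = np.random.rand(1, 3, size, size).astype(np.float32)
    for _ in range(5):                                     # warm-up
        sess.run(None, {input_name: x})
    start = time.perf_counter()
    for _ in range(50):
        sess.run(None, {input_name: x})
    print(f"{size}x{size}: {(time.perf_counter() - start) / 50 * 1e3:.2f} ms/run")
```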
2 votes · 1 answer · 2k views
Advice on deep learning PC build using dual 4090s
I’m an engineering grad student, and I’ve been tasked with finding parts for building a shared workstation for my lab. Our work includes deep learning, computer vision, network analysis, reinforcement ...
1 vote · 0 answers · 132 views
GPU requirements for training vs inference
How do I estimate GPU requirements for model inference vs. model training/fine-tuning?
If they differ, then by roughly what ratio, just as a rule of thumb?
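A commonly quoted back-of-the-envelope (not from the question): fp16 inference needs about 2 bytes per parameter for the weights, while Adam-based mixed-precision training needs roughly 16 bytes per parameter before activations, so training memory is often several times inference memory:

```python
# Back-of-the-envelope GPU-memory estimate (rule of thumb only; it ignores
# activations, KV cache, and framework overhead, which can dominate).
def estimate_gib(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1024**3

n_params = 7e9  # e.g. a 7B-parameter model (illustrative)

# fp16 inference: 2 bytes per parameter for the weights alone.
print(f"inference (fp16 weights): ~{estimate_gib(n_params, 2):.0f} GiB")

# Mixed-precision training with Adam: fp16 weights + fp16 grads
# + fp32 master weights + two fp32 optimizer states ~ 16 bytes/param.
print(f"training (Adam, mixed precision): ~{estimate_gib(n_params, 16):.0f} GiB")
```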
0 votes · 0 answers · 69 views
Why can't I increase my GPU utilization?
I have a simple UNet model (~1M params) written in Keras 3.0.1, running with a torch backend. My CUDA version is ...
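A generic first check for low utilization is whether the input pipeline is starving the GPU; a PyTorch-style timing sketch (the model and loader are placeholders, not the asker's Keras code):

```python
# Sketch: check whether the input pipeline, not the GPU, is the bottleneck.
import time

import torch


def profile_epoch(model, loader, device="cuda"):
    data_wait, compute = 0.0, 0.0
    t0 = time.perf_counter()
    for x, y in loader:
        t1 = time.perf_counter()
        data_wait += t1 - t0                 # time spent waiting on the dataloader
        x = x.to(device, non_blocking=True)
        loss = model(x).float().mean()       # stand-in for the real loss
        loss.backward()
        torch.cuda.synchronize()             # make sure GPU work has finished
        t0 = time.perf_counter()
        compute += t0 - t1
    print(f"data wait: {data_wait:.1f}s  compute: {compute:.1f}s")
```

If the data-wait time dominates, more dataloader workers, prefetching, or a larger batch size usually raises utilization more than changing the model does.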
0 votes · 1 answer · 1k views
Transformers Trainer: "RuntimeError: module must have its parameters ... on device cuda:6 (device_ids[0]) but found one of them on device: cuda:0"
I am asking this because I could not fix it with the help of:
Stack Overflow: RuntimeError: module must have its parameters and buffers on device cuda:1 (device_ids[0]) but found one of them on device: cuda:2 ...
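A workaround often suggested for this class of error is to expose only the intended card to the process, so the Trainer sees it as cuda:0; a sketch (this must happen before CUDA is initialized, and it is an assumption about the cause rather than a confirmed fix):

```python
# Sketch: restrict the process to one physical GPU so the Trainer's
# DataParallel/device logic sees it as cuda:0. Equivalent to launching with
# CUDA_VISIBLE_DEVICES=6 python train.py.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "6"   # physical GPU 6 becomes cuda:0

import torch  # imported after the env var so CUDA picks it up

print(torch.cuda.device_count())           # -> 1; the Trainer will now use cuda:0
```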
2 votes · 1 answer · 209 views
"model.to('cuda:6')" becomes (nvidia-smi) GPU 4, same with any other "cuda:MY_GPU", only "cuda:0" becomes GPU 0. How do I get rid of this mapping?
Strange mapping: example
In the following example, the first column is the device chosen in the code; the second column is the GPU that actually does the work instead:
0:0 1234 MiB
1:2 1234 MiB
2:7 1234 MiB
3:5 2341 MiB
4:1 ...
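The usual explanation for this mismatch is that CUDA enumerates devices "fastest first" while nvidia-smi lists them in PCI-bus order; forcing PCI-bus ordering typically makes the two agree. A sketch, assuming the environment variable is set before any CUDA initialization:

```python
# Sketch: make PyTorch's cuda:N indices match nvidia-smi's GPU numbers.
# By default CUDA may enumerate devices "fastest first", while nvidia-smi
# lists them in PCI-bus order; forcing PCI_BUS_ID ordering aligns the two.
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"   # must be set before CUDA init

import torch

device = torch.device("cuda:6")                  # now refers to nvidia-smi GPU 6
print(torch.cuda.get_device_name(device))
```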
1 vote · 1 answer · 425 views
Holding batch size constant, will a bigger dataset consume more GPU memory?
If you hold (mini) batch size constant (as well as everything else) but increase the number of examples (and therefore the number of training iterations), should you expect a (significant) increase in ...
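The answer usually given is "no": only the resident mini-batch (plus model, gradients, and optimizer state) occupies GPU memory, so dataset size mostly affects host RAM and training time. A toy sketch to check this, with an illustrative model and synthetic data:

```python
# Sketch: peak GPU memory depends on the batch that is resident, not on how
# many batches the DataLoader will eventually serve (toy model and data).
import torch
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Linear(1024, 10).cuda()

for n_examples in (10_000, 100_000):
    data = TensorDataset(torch.randn(n_examples, 1024),
                         torch.randint(0, 10, (n_examples,)))
    loader = DataLoader(data, batch_size=64)
    torch.cuda.reset_peak_memory_stats()
    for x, y in loader:
        loss = torch.nn.functional.cross_entropy(model(x.cuda()), y.cuda())
        loss.backward()
        break                                  # one step is enough to see the peak
    peak = torch.cuda.max_memory_allocated() / 2**20
    print(f"{n_examples} examples -> peak {peak:.1f} MiB")   # roughly identical
```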
0 votes · 1 answer · 258 views
How to run our Python scripts utilizing our device's GPU?
My laptop has an NVIDIA GeForce GTX 1650 GPU. I want to utilize this GPU to run my Python script. Any help in the form of code would be really helpful. I have tried researching this so much, but I couldn't ...
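A script is not run "on the GPU" as a whole; a GPU-aware library moves specific tensors or models there. A minimal PyTorch sketch, assuming a CUDA-enabled PyTorch install for the GTX 1650:

```python
# Minimal PyTorch sketch: a GPU-aware library (PyTorch here) places tensors and
# models on the GPU; plain Python code keeps running on the CPU.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using:", device)

model = torch.nn.Linear(1000, 10).to(device)   # parameters now live on the GPU
x = torch.randn(64, 1000, device=device)       # so does this batch
y = model(x)                                   # the matmul executes on the GTX 1650
print(y.shape)
```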
1 vote · 0 answers · 201 views
Using GPU-accelerated libSVM in Python
I have been using libSVM in a Python notebook to classify my dataset; one run takes approximately 5 hours, and 5-fold cross-validation would take almost a day or more.
I am planning to ...
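libSVM itself is CPU-only; GPU implementations with a similar fit/predict interface include ThunderSVM and RAPIDS cuML. A sketch using cuML's SVC, treating the synthetic data and hyperparameters as placeholders (cuML availability depends on the CUDA setup):

```python
# Sketch: replace libSVM/scikit-learn's SVC with a GPU implementation that
# keeps the same fit/predict interface (RAPIDS cuML shown; ThunderSVM's SVC
# is similar). Installing cuML needs a matching CUDA setup; illustrative only.
import numpy as np
from cuml.svm import SVC as cuSVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the notebook's dataset.
X, y = make_classification(n_samples=20_000, n_features=100, random_state=0)
X = X.astype(np.float32)                           # cuML prefers float32
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

clf = cuSVC(kernel="rbf", C=1.0, gamma="scale")    # same hyperparameters as sklearn's SVC
clf.fit(X_train, y_train)                          # trains on the GPU
print("accuracy:", (clf.predict(X_test) == y_test).mean())
```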