Unanswered Questions

283 questions with no upvoted or accepted answers
6 votes
0 answers
618 views

Adversarial Learning for Semantic Segmentation

I am incorporating adversarial training for semantic segmentation, following "Adversarial Learning for Semi-Supervised Semantic Segmentation". The idea is this: the discriminator takes as input a ...
3 votes
0 answers
44 views

Loss not decreasing while fine-tuning a transformer-based pose estimation model

I am trying to fine-tune a transformer/encoder-based pose estimation model (https://huggingface.co/docs/transformers/en/model_doc/vitpose). When passing the "labels" attribute to ...
3 votes
1 answer
122 views

How to train a next-token-prediction text generation model using PyTorch Transformer classes?

For learning purposes, I have tried to train a text generation model at a tiny scale in this notebook using an RNN/LSTM model. But I am not able to take it further and use a transformer model. Can anyone ...
3 votes
1 answer
399 views

How is padding masking considered in the Attention Head of a Transformer?

For purely educational purposes, my goal is to implement a basic Transformer architecture from scratch. So far I have focused on the encoder for classification tasks and assumed that all samples in a batch ...
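A minimal sketch of the usual padding-mask idea, in case it helps frame the question: scores for padded key positions are set to negative infinity before the softmax, so those positions receive zero attention weight. The function name `masked_attention_weights` and the list-based layout are assumptions made for illustration (a single query, plain Python rather than tensors), not taken from the question.

```python
import math

def masked_attention_weights(scores, key_is_padding):
    """Softmax over one query's scores, ignoring padded key positions.

    scores: raw attention scores (q . k / sqrt(d)) for a single query.
    key_is_padding: bools, True where the key token is padding.
    Assumes at least one key is a real (non-padding) token.
    """
    # Padded keys get -inf, which exp() maps to exactly 0.0.
    masked = [-math.inf if pad else s for s, pad in zip(scores, key_is_padding)]
    # Subtract the largest real score for numerical stability.
    m = max(masked)
    exps = [math.exp(s - m) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Four keys, the last two are padding.
weights = masked_attention_weights([2.0, 1.0, 0.5, 3.0], [False, False, True, True])
```

The weights still sum to 1 over the real tokens, so the padded positions contribute nothing to the weighted sum of values.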
3 votes
0 answers
263 views

Cluster tabular data with text in some columns

Let's say I have the following features in my dataframe: user_id (integer), user_age (integer), is_student (binary), is_graduate (binary), salary (integer), resume (text, up to 1000 symbols). And also a few more ...
3 votes
0 answers
312 views

Struggling to understand/implement Transformer Decoder

I'm struggling to understand the decoder in a Transformer model, specifically with regards to some aspects of its architecture as well as how it actually handles the data during training. What I have ...
3 votes
0 answers
893 views

What exactly do negative/positive values of Captum's Integrated Gradients mean?

I use Captum's Integrated Gradients to interpret my PyTorch neural network. I know that the GitHub page and the original paper mention that ... A positive attribution score means that the input in that ...
3 votes
0 answers
1k views

PyTorch: Train without dataloader (loop through dataframe instead)

I was wondering whether it is bad practice to loop through each row of a pandas DataFrame instead of using built-in tools such as DataLoader. Let's say I am doing text classification and my training loop looks ...
3 votes
1 answer
167 views

How to specify version for dependencies so that each one is compatible and stays within a size limit?

I am trying to deploy a web app to Heroku. The free tier is limited to 500 MB. I am using my resnet34 model as a .pkl file. I create the model from it using the fastai ...
3 votes
0 answers
160 views

AlexNet research paper vs. PyTorch and TensorFlow implementations

I'm making my way through Deep Learning research papers, starting with AlexNet, and I found differences in the PyTorch and TensorFlow implementations that I can't explain. In the research paper, ...
3 votes
0 answers
760 views

Explain FastText model using SHAP values

I have trained a fastText model and a fully connected network built on its embeddings. I figured out how to use LIME on it: a complete example can be found in Natural Language Processing Is Fun Part 3: ...
3 votes
1 answer
376 views

Policy Gradient not "learning"

I'm attempting to implement the policy gradient from the "Hands-On Machine Learning" book by Géron, which can be found here. The notebook uses TensorFlow and I'm attempting to do it with PyTorch. ...
3 votes
1 answer
288 views

Is it possible to solve Rubik's cube using DQN?

I'm trying to solve Rubik's cube using deep learning and I came across DQN, so I decided to give it a try. I developed all the code and started training, but I got these results: Loss goes up and ...
3 votes
0 answers
871 views

Understanding depthwise convolution vs convolution with group parameters in PyTorch

In the MobileNet-v1 network, depthwise conv layers are used, and I understand them as follows. For an input feature map of (C_in, F_in, F_in), we take only 1 ...
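One way to see the relationship is through weight counts. In PyTorch, the weight of `nn.Conv2d` has shape `(out_channels, in_channels // groups, kH, kW)`, so setting `groups=in_channels` with `out_channels == in_channels` gives one single-channel filter per input channel, which is a depthwise convolution. The sketch below only does the arithmetic; the helper name `conv2d_weight_count` and the sizes (32 channels, 3x3 kernel) are assumptions chosen for illustration.

```python
def conv2d_weight_count(c_in, c_out, kernel_size, groups=1):
    """Number of weights (bias excluded) in a square-kernel 2D convolution.

    Mirrors the weight shape (c_out, c_in // groups, k, k).
    """
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("c_in and c_out must both be divisible by groups")
    return c_out * (c_in // groups) * kernel_size * kernel_size

c_in, k = 32, 3

# Standard convolution: every output channel sees all 32 input channels.
standard = conv2d_weight_count(c_in, c_in, k)

# Depthwise convolution: groups == c_in, so each filter sees 1 input channel.
depthwise = conv2d_weight_count(c_in, c_in, k, groups=c_in)

print(standard, depthwise)
```

With these sizes the standard layer has 32 * 32 * 9 weights and the depthwise layer has 32 * 1 * 9.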
3 votes
0 answers
912 views

How can I get testing accuracy using tensorboard for Detectron2?

I'm learning to use Detectron2. I've followed this link to create a custom object detector. My training code - ...
