Questions tagged [computer-vision]


Questions related to image representation, segmentation, visual object categorization and image processing algorithms in general.


479 questions


0 votes

0 answers

40 views

Looking for suggestion on point cloud analysis for computer vision

I have a point cloud consisting of room-like blocks, where one wall of each rectangular box is left open to serve as an entrance. The point cloud is extracted from the CAD ...

computer-vision

Encipher

185

asked Nov 20 at 16:23

0 votes

0 answers

27 views

Is it a bad idea to use Transformer models on long-tailed datasets?

I’m working on a video classification task with a long-tailed dataset where a few classes have many samples while most classes have very few. More specifically, my dataset has around 9k samples and 3....

machine-learning
neural-networks
classification
computer-vision
transformers

Olivia

191

asked Nov 1 at 1:36

1 vote

0 answers

18 views

Assembling boxes on pallets (YOLOv8n), problems with detection

I'm developing a computer vision solution to count boxes on fractional pallets. Problem: inconsistent detections. I don't know if it's due to a lack of data, annotations, architecture, or ...

machine-learning
python
computer-vision
yolo

Lucas Anael

11

asked Oct 1 at 18:00

1 vote

0 answers

40 views

Why aren't Data Augmentations used when training autoencoder or neural-based compressors for images / video?

I was curious if anyone happens to know why data augmentations (like color jitter, random cropping, etc) appear to not be always used when training autoencoder or neural-based compressors for images ...

autoencoders
computer-vision
compression

thisIsAUsername

11

asked Sep 13 at 19:03

0 votes

0 answers

51 views

I want to build a few-shot object detector. Is YOLO a good choice?

I am currently working on the problem of creating a virtual library of toys. The preferred flow: the user uploads a short video or series of photos and then, from the next session, the model can certainly ...

machine-learning
computer-vision
object-detection
yolo

Soham Bhaumik

111

asked Sep 2 at 20:44

0 votes

0 answers

51 views

Kernel to detect straight lines and edges in Canny edge images

I have run the Canny edge detector on an image that contains doors, beds, etc., whose perimeters I am interested in. In the Canny outputs, I can clearly see the edges of these objects. ...

image-processing
computer-vision
image-segmentation

Sameer Kulkarni

101

asked Aug 21 at 10:40

1 vote

0 answers

33 views

Trying to classify sequences of video frames as a particular posture for quadrupeds

I am fairly new to machine learning and I've been tinkering with transformers for a short while now. I have written a transformer architecture that should in my opinion be able to understand why this ...

computer-vision
transformers

Soham Bhaumik

111

asked Jul 27 at 21:50

1 vote

0 answers

37 views

Is it possible to train a neural network to reconstruct the full image of an object from a partial image? [closed]

Let's say that I want to train a network where the input is an image of a small part of an object. For example: an image of a building with corners, some part of the exterior walls, and some part of the roof. I want ...

neural-networks
image-processing
computer-vision

user146290

121

asked Jun 6 at 6:07

0 votes

0 answers

35 views

Is Cochran's Q test appropriate for comparing a few VLMs on a dataset with more than one question per image?

I have a dataset of N images, k questions per image, Y/N answer for each question. I want to compare the accuracy of m VLMs (Vision Language Models) over this dataset: these are models, similar to the ...

multiple-comparisons
group-differences
computer-vision
cochran-q
llm

DeltaIV

18.6k

asked May 31 at 9:24

1 vote

0 answers

104 views

Reconstruct images with CLIP image embedding

I recently started working on a project that solely uses the semantic knowledge of image embedding that is encoded from a CLIP-based model (e.g., SigLIP) to reconstruct a semantically similar image. ...

machine-learning
computer-vision
gan
embeddings
stable-distribution

Beothuk

11

asked Mar 20 at 14:19

0 votes

0 answers

61 views

Model comparison and experimentation for a thesis result

We are conducting a study to compare the accuracy of two computer vision models: Model A: Trained on a non-augmented dataset of 11,200 real-world images. Model B: Trained on an augmented dataset ...

machine-learning
statistical-significance
t-test
computer-vision
data-augmentation

markcalendario

101

asked Dec 2, 2024 at 11:44

0 votes

0 answers

65 views

Why does the image classification model perform worse when augmenting only the minority class?

I have a problem of data imbalance (1:10 ratio) for image classification tasks. To cope with it, I tried different imbalance training strategies, including weighted loss function, different loss ...

machine-learning
neural-networks
convolutional-neural-network
computer-vision
data-augmentation

Yuju Ahn

1

asked Nov 18, 2024 at 9:39

4 votes

1 answer

272 views

Does shuffling the training data cause information leakage in a time-series model with image sequences?

I am working on a predictive model for solar power production based on image sequences captured at 10-minute intervals. A single example my model receives as input consists of a sequence of images. My ...

machine-learning
time-series
lstm
image-processing
computer-vision

schefflaa

51

asked Nov 13, 2024 at 9:43

0 votes

0 answers

100 views

Captioning an average of image set

I'm looking for a captioning model that would be able to describe a group of images in a single sentence. Alternatively, I need a way to conceptually average a group of images before feeding that ...

machine-learning
image-processing
computer-vision
text-generation

Seedmanc

101

asked Nov 10, 2024 at 14:15

1 vote

0 answers

41 views

How to perform user assisted image segmentation using Gaussian Mixture Models?

I have a general idea of Gaussian Mixture Models. My understanding: GMM is a way of clustering data points which, unlike K means clustering, soft assigns them under different distributions by ...

gaussian-mixture-distribution
expectation-maximization
computer-vision
image-segmentation

DeadAsDuck

11

asked Sep 17, 2024 at 23:36


Site design / logo © 2025 Stack Exchange Inc; user contributions licensed under CC BY-SA.