Does this mean I'm actually able to try object detection in OpenCV now? I mean I ...

wongarsu · 2026-06-09T07:49:49 1780991389

YOLO has basically solved that for my use cases for a couple of years now. If you want labels that are not among the pretrained ones, it's also easy to fine-tune, provided you're willing to label 200 or so images.

If you need something less restricted to existing labels (say, all the red apples, or all cardboard signs), SAM3 is great, as the sibling comment says.
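For the fine-tuning route, most of the work after labelling is dataset bookkeeping. Here is a minimal sketch of splitting roughly 200 labelled images into train and validation sets and writing a dataset description. It assumes an Ultralytics-style trainer that reads a YAML file with `path`, `train`, `val` and `names` keys; the file names, the `datasets/custom` root and the `red_apple` class are made up for illustration.

```python
import random


def split_dataset(image_names, val_fraction=0.2, seed=0):
    """Shuffle deterministically and split into (train, val) lists."""
    names = sorted(image_names)  # sort first so the result ignores input order
    random.Random(seed).shuffle(names)
    n_val = max(1, round(len(names) * val_fraction))
    return names[n_val:], names[:n_val]


def dataset_yaml(root, class_names):
    """Render a minimal dataset file (assumed Ultralytics-style layout)."""
    lines = [f"path: {root}", "train: images/train", "val: images/val", "names:"]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical file names standing in for ~200 hand-labelled images.
images = [f"img_{i:03d}.jpg" for i in range(200)]
train, val = split_dataset(images)
print(len(train), len(val))  # 160 40
print(dataset_yaml("datasets/custom", ["red_apple"]))
```

Fixing the seed keeps the validation set stable between runs, so metrics from different fine-tuning attempts stay comparable.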

IanCal · 2026-06-09T08:27:50 1780993670

> provided you're willing to label 200 or so images

A quick note to say that this is also a task you can hand to things like Gemini.

dekhn · 2026-06-09T23:28:29 1781047709

Yep, this is what I do. I use a high-quality VLM to generate labelled boxes (in my case, around tardigrades in a microscope image), do some light editing to fix the small number of errors, and then train YOLO26 on the result. Works great, saved me tens of hours of labelling. It's a bit scary that there is a VLM that works as well as my fine-tuned model (although much slower).
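The glue step in that workflow is turning the VLM's boxes into YOLO label files, which hold one `class x_center y_center width height` line per object with coordinates normalised to 0..1. A rough sketch follows, assuming the VLM returns pixel-space `[x_min, y_min, x_max, y_max]` boxes. The `vlm_boxes` structure and the image size are hypothetical; real VLMs differ in coordinate order and scale, so check the output format of the one you use.

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    # Clamp to the image so slightly out-of-bounds boxes stay valid.
    x_min, x_max = max(0, x_min), min(img_w, x_max)
    y_min, y_max = max(0, y_min), min(img_h, y_max)
    if x_max <= x_min or y_max <= y_min:
        raise ValueError(f"degenerate box: {box}")
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# Hypothetical VLM output for one 1000x800 image (not a real API response).
vlm_boxes = [
    {"label": "tardigrade", "box": [100, 200, 300, 400]},
    {"label": "tardigrade", "box": [900, 50, 1100, 250]},  # spills past the right edge
]
class_ids = {"tardigrade": 0}

label_lines = [
    to_yolo_line(class_ids[d["label"]], d["box"], img_w=1000, img_h=800)
    for d in vlm_boxes
]
print("\n".join(label_lines))
# 0 0.200000 0.375000 0.200000 0.250000
# 0 0.950000 0.187500 0.100000 0.250000
```

Each image gets a `.txt` file with these lines next to it, which is also a convenient place to do the light manual editing before training.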

globalnode · 2026-06-10T00:49:06 1781052546

That's a fantastic strategy, thank you, and thanks to all the other helpful posters here as well. Do you have any tips for how to choose the base YOLO model? Or will just any generic one do?

IX-103 · 2026-06-09T17:11:01 1781025061

How do you handle object disambiguation with YOLO? All the examples I've played with have the problem where, if two "cars" get too close to each other, the tracking IDs keep switching between them, meaning we'd need an additional motion (kinematic) model for disambiguation.
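To make the ID-switching problem concrete, here is a toy sketch of why a motion model helps. It is not what any particular YOLO tracker does: it uses box centres only, a constant-velocity prediction, and brute-force assignment, and it assumes the same number of detections as tracks. Trackers in the SORT family use a Kalman filter and Hungarian assignment for the same two roles.

```python
from itertools import permutations
from math import dist


def predict(track):
    """Constant-velocity prediction of a track's next centre."""
    (x, y), (vx, vy) = track["pos"], track["vel"]
    return (x + vx, y + vy)


def assign(tracks, detections, use_motion=True):
    """Return {track_id: detection_index} minimising total centre distance.

    Brute force over permutations, so only suitable for a handful of
    objects; assumes len(detections) == len(tracks).
    """
    ids = list(tracks)
    anchors = [predict(tracks[i]) if use_motion else tracks[i]["pos"] for i in ids]
    best = min(
        permutations(range(len(detections))),
        key=lambda perm: sum(dist(a, detections[j]) for a, j in zip(anchors, perm)),
    )
    return dict(zip(ids, best))


# Two cars passing each other in adjacent lanes (made-up coordinates).
tracks = {
    1: {"pos": (40, 0), "vel": (20, 0)},   # heading right
    2: {"pos": (60, 2), "vel": (-20, 0)},  # heading left
}
# Next frame: car 1 is now at x=60 and car 2 at x=40.
detections = [(60, 0), (40, 2)]

print(assign(tracks, detections, use_motion=False))  # {1: 1, 2: 0}  IDs swap
print(assign(tracks, detections, use_motion=True))   # {1: 0, 2: 1}  IDs kept
```

Matching against last known positions swaps the IDs as the cars pass, while matching against predicted positions keeps them. Cars that stay side by side at similar speeds would still need an appearance cue on top of this.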

fnands · 2026-06-09T07:39:48 1780990788

That seems to be the way things are going.

Large general models have taken over in NLP, and (outside of embedded/low latency applications) it seems like they are coming for CV next.

So you should soon be able to have a large generic model that can detect whatever you want.

It's already pretty much possible with open-vocabulary detectors like SAM3, where you could just prompt it with "Apple": https://ai.meta.com/research/sam3/

Npovview · 2026-06-09T16:36:29 1781022989

Roboflow is your friend.

shenberg · 2026-06-09T08:04:44 1780992284

moondream is a beast