AI Object Detector
Upload any photo and let AI identify the objects it recognizes. See bounding boxes, labels, and confidence scores drawn directly on your image. Everything runs locally in your browser.
Drop a photo here or click to upload
Supports JPG, PNG, WebP
Detected Objects
How AI Object Detection Works
This tool uses DETR (DEtection TRansformer), an object detection model developed by Facebook AI Research (now Meta AI). It combines a convolutional neural network backbone (ResNet-50) with a Transformer encoder-decoder to locate and classify objects in images. The model is trained on the COCO dataset: its label space has 91 class IDs, of which 80 correspond to actual object categories.
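To make the "locate and classify" step more concrete, here is a minimal sketch of how DETR-style raw outputs are commonly decoded. It assumes each decoder query yields class logits with one extra trailing "no object" slot, plus a box in normalized (center x, center y, width, height) form. The names `RawQuery`, `Detection`, and `decodeQueries` are hypothetical and used only for illustration; this is not the tool's actual code.

```typescript
// Hypothetical shapes for illustration only.
interface RawQuery {
  logits: number[]; // one value per class, plus a final "no object" slot
  box: [number, number, number, number]; // normalized [cx, cy, w, h] in 0..1
}

interface Detection {
  label: string;
  score: number;
  box: { xmin: number; ymin: number; xmax: number; ymax: number }; // pixels
}

// Numerically stable softmax over one query's logits.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

function decodeQueries(
  queries: RawQuery[],
  labels: string[], // labels[i] names class i; logits has labels.length + 1 entries
  imageWidth: number,
  imageHeight: number,
  threshold = 0.9,
): Detection[] {
  const detections: Detection[] = [];
  for (const q of queries) {
    const probs = softmax(q.logits);
    // Pick the best real class, ignoring the trailing "no object" slot.
    let best = 0;
    for (let i = 1; i < labels.length; i++) {
      if (probs[i] > probs[best]) best = i;
    }
    // Queries dominated by "no object" end up with a low score and are dropped.
    if (probs[best] < threshold) continue;
    const [cx, cy, w, h] = q.box;
    detections.push({
      label: labels[best],
      score: probs[best],
      // Convert the normalized center format to pixel corner coordinates.
      box: {
        xmin: (cx - w / 2) * imageWidth,
        ymin: (cy - h / 2) * imageHeight,
        xmax: (cx + w / 2) * imageWidth,
        ymax: (cy + h / 2) * imageHeight,
      },
    });
  }
  return detections;
}
```

Because the softmax includes the "no object" slot, a query that matches nothing gets low scores for every real class and falls below the threshold. That is why the model can emit a fixed number of queries and still report only a handful of objects.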
The model runs entirely in your browser using Transformers.js, which executes an ONNX export of the PyTorch model with ONNX Runtime on WebAssembly or WebGPU. The model is downloaded once (about 100 MB) and cached for offline use.
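As a rough illustration of the browser-side flow, the sketch below formats results shaped like the output of the Transformers.js object-detection pipeline (a list of objects with `score`, `label`, and `box`). The model id, the threshold, and the `formatDetections` helper are assumptions made for this example and may differ from what this tool uses.

```typescript
// Shape of one result from the Transformers.js object-detection pipeline.
// Box values are in pixels unless the `percentage` option is set.
interface Detection {
  score: number;
  label: string;
  box: { xmin: number; ymin: number; xmax: number; ymax: number };
}

// In the browser, detections would come from something like the following
// (model id and threshold are assumptions, not this tool's confirmed settings):
//   import { pipeline } from "@huggingface/transformers";
//   const detector = await pipeline("object-detection", "Xenova/detr-resnet-50");
//   const detections = await detector(imageUrl, { threshold: 0.9 });

// Hypothetical helper: turn detections into lines for a results list,
// most confident first.
function formatDetections(detections: Detection[], minScore = 0.5): string[] {
  return detections
    .filter((d) => d.score >= minScore) // filter() copies, so the input is not mutated
    .sort((a, b) => b.score - a.score)
    .map((d) => {
      const w = Math.round(d.box.xmax - d.box.xmin);
      const h = Math.round(d.box.ymax - d.box.ymin);
      return `${d.label} ${(d.score * 100).toFixed(1)}% (${w}x${h} px)`;
    });
}
```

Keeping the formatting separate from the model call means the same helper can be exercised with hand-written detections, without downloading the model.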
What Objects Can Be Detected
- People: individuals, including each person in group photos and crowds (the model labels people, not faces)
- Vehicles: cars, trucks, buses, motorcycles, bicycles, airplanes, boats
- Animals: dogs, cats, birds, horses, cows, sheep, elephants, bears
- Food and kitchenware: bananas, apples, pizza, sandwiches, cakes, bottles
- Furniture: chairs, couches, beds, dining tables
- Electronics: TVs, laptops, cell phones, keyboards, remotes
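The grouping above could be applied to detection results with a small lookup table, sketched below. The `CATEGORY_GROUPS` table and the `groupCounts` helper are hypothetical. The label strings follow COCO naming as commonly exposed by DETR checkpoints, which is an assumption about this tool's model, and only labels known to exist in COCO are included.

```typescript
// Hypothetical grouping table keyed by display group.
// Label strings are assumed to follow COCO naming (e.g. "dining table", "cell phone").
const CATEGORY_GROUPS: Record<string, string[]> = {
  People: ["person"],
  Vehicles: ["car", "truck", "bus", "motorcycle", "bicycle", "airplane", "boat"],
  Animals: ["dog", "cat", "bird", "horse", "cow", "sheep", "elephant", "bear"],
  // "bottle" is grouped here to mirror the list above, although it is not food.
  Food: ["banana", "apple", "pizza", "sandwich", "cake", "bottle"],
  Furniture: ["chair", "couch", "bed", "dining table"],
  Electronics: ["tv", "laptop", "cell phone", "keyboard", "remote"],
};

// Count detections per display group; unknown labels fall into "Other".
function groupCounts(labels: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const label of labels) {
    const match = Object.entries(CATEGORY_GROUPS).find(([, members]) =>
      members.includes(label),
    );
    const group = match ? match[0] : "Other";
    counts[group] = (counts[group] ?? 0) + 1;
  }
  return counts;
}
```

Labels outside the table, such as "umbrella", are counted under "Other" rather than discarded, so the summary still accounts for every detection.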
Fun and Educational Uses
Object detection is one of the most visual and intuitive AI tasks. Try uploading photos from your camera roll and see what the AI finds. It is great for learning how computer vision works — you can see exactly where the model places bounding boxes and how confident it is about each detection.
Teachers can use this to demonstrate AI concepts in classrooms. Photographers can analyze scene composition. Developers can prototype vision features without writing backend code. Since everything runs locally, there is no API cost and no rate limit.
Tips for Better Detection
Use clear, well-lit photos in which each object takes up a reasonable share of the frame. DETR is known to be less accurate on very small objects. The model also works best when objects are not heavily occluded. Outdoor scenes, room interiors, street scenes, and nature photos tend to produce the most interesting results with multiple detections.