
Overview

RTSP Human Capture supports multiple detection models with automatic fallback. The system tries to load models in this order:
  1. YOLOv4 (recommended for best accuracy)
  2. YOLOv3 (fallback if YOLOv4 not found)
  3. HOG (OpenCV built-in, no files required)
The HOG detector runs automatically when no YOLO model files are found. While less accurate than YOLO, it requires no additional downloads and works immediately.
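The fallback order above can be previewed before starting the application. The sketch below is a hypothetical helper, not part of the project: it only checks which files exist and assumes each YOLO model needs both its .weights and .cfg file. The real loader can still fall back for other reasons, such as a corrupted file.

```python
from pathlib import Path


def expected_detector(model_dir: str = "model") -> str:
    """Guess which detector would be selected, based only on file presence.

    Hypothetical helper: mirrors the documented order YOLOv4 -> YOLOv3 -> HOG.
    """
    root = Path(model_dir)
    for stem, label in (("yolov4", "YOLOv4"), ("yolov3", "YOLOv3")):
        weights = root / f"{stem}.weights"
        cfg = root / f"{stem}.cfg"
        # Both files must be present for a YOLO model to be a candidate.
        if weights.is_file() and cfg.is_file():
            return label
    # No complete YOLO file pair found: the built-in HOG detector is used.
    return "HOG"


if __name__ == "__main__":
    print(expected_detector("model"))
```

Running the script from the project root prints the detector you should expect to see reported at startup.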

Model Directory Structure

By default, model files should be placed in the model/ directory:
rtsp-human-capture/
├── model/
│   ├── yolov4.weights      # Main model weights
│   ├── yolov4.cfg          # Model configuration
│   └── coco.names          # Class labels (80 classes)
├── main.py
└── config.cfg
You can change the model directory location in config.cfg by setting model_dir under the [paths] section.
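As an illustration of how a model_dir setting under [paths] can be read, here is a minimal sketch using Python's standard configparser. Whether the application itself uses configparser is an assumption, and read_model_dir is a hypothetical name. Note that configparser keeps a trailing "# ..." as part of the value unless inline_comment_prefixes is set, so the sketch enables it.

```python
import configparser


def read_model_dir(config_text: str, default: str = "model") -> str:
    """Return [paths] model_dir from INI text, or `default` if it is absent.

    Hypothetical helper for illustration; not taken from the project source.
    """
    # Allow "key = value  # comment" lines; by default configparser would
    # treat the trailing comment as part of the value.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(config_text)
    return parser.get("paths", "model_dir", fallback=default)


if __name__ == "__main__":
    sample = "[paths]\nmodel_dir = /opt/models\n"
    print(read_model_dir(sample))
```

The fallback argument mirrors the documented default of model/ when the key or section is missing.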

Downloading YOLOv4 Files

YOLOv4 provides the best detection accuracy. Download these three files:
1. Download yolov4.weights

Download the pre-trained weights file (245 MB):
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights -P model/
Or download the file manually from the same URL and save it to model/.
2. Download yolov4.cfg

Download the model configuration file:
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4.cfg -P model/
Or download the file manually from the same URL and save it to model/.
3. Download coco.names

Download the COCO class labels file:
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/data/coco.names -P model/
Or download the file manually from the same URL and save it to model/.

Downloading YOLOv3 Files (Alternative)

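If you prefer YOLOv3 or want it as a fallback, note that the loader expects both yolov3.weights and yolov3.cfg in the model directory. Download the weights with: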
wget https://pjreddie.com/media/files/yolov3.weights -P model/

HOG Fallback Detector

The Histogram of Oriented Gradients (HOG) detector is OpenCV’s built-in person detection method.

When HOG is Used

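HOG activates automatically only when neither YOLO model can be loaded, that is, when both of the following are true:
  • yolov4.weights (or yolov4.cfg) is missing or fails to load
  • yolov3.weights (or yolov3.cfg) is missing or fails to load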

HOG Characteristics

Advantages

  • No model files required
  • Lightweight and fast
  • Works immediately after installation

Limitations

  • Lower accuracy than YOLO
  • More false positives/negatives
  • Less robust to varying poses

HOG Implementation

From person_detector.py:53-60:
except:
    print("Warning: YOLO weights not found. Using OpenCV's built-in HOG person detector as fallback.")
    self.net = None
    self.hog = cv2.HOGDescriptor()
    # Convert to numpy array for setSVMDetector
    default_people_detector = np.array(
        cv2.HOGDescriptor.getDefaultPeopleDetector(), dtype=np.float32)
    self.hog.setSVMDetector(default_people_detector)

Verifying Model Installation

1. Check file presence

Verify all required files exist:
ls -lh model/
You should see:
yolov4.weights  (~245 MB)
yolov4.cfg      (~12 KB)
coco.names      (~1 KB)
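The same check can be scripted. The sketch below is a hypothetical helper that compares each file against the approximate sizes listed above and flags anything missing or much smaller than expected, which usually points to an interrupted download. The 50% tolerance is an arbitrary choice for this example, not a project setting.

```python
from pathlib import Path

# Approximate sizes from the listing above, in bytes.
EXPECTED_BYTES = {
    "yolov4.weights": 245 * 1024 * 1024,
    "yolov4.cfg": 12 * 1024,
    "coco.names": 1 * 1024,
}


def check_model_files(model_dir="model", expected=EXPECTED_BYTES, tolerance=0.5):
    """Map each expected file name to 'ok', 'missing', or 'too small'."""
    report = {}
    for name, size in expected.items():
        path = Path(model_dir) / name
        if not path.is_file():
            report[name] = "missing"
        elif path.stat().st_size < size * tolerance:
            # Far below the expected size: likely a truncated download.
            report[name] = "too small"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_model_files("model").items():
        print(f"{name}: {status}")
```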
2. Test with an image

Run a quick test to verify the model loads correctly:
uv run main.py --test-image test.jpg --save image
Look for this output:
Loading person detection model...
Model loaded: YOLOv4
Confidence threshold: 0.5
Person area threshold: 1000 pixels
3. Check GPU detection (optional)

If CUDA is installed and your OpenCV build includes CUDA support, you should see:
CUDA available, using GPU for inference
If not:
CUDA not available, using CPU for inference
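To check outside the application whether your OpenCV build can see a CUDA device, a minimal sketch follows. It relies on cv2.cuda.getCudaEnabledDeviceCount(), which returns 0 when OpenCV was built without CUDA. The function name inference_device is hypothetical, and this is not necessarily how the application performs its own check. A non-zero count is necessary but not sufficient: the DNN module must also be built with the CUDA backend.

```python
def inference_device(cv_module=None) -> str:
    """Return 'GPU' if the OpenCV module reports a CUDA device, else 'CPU'.

    `cv_module` is injectable for testing; by default cv2 is imported lazily.
    """
    if cv_module is None:
        try:
            import cv2 as cv_module
        except ImportError:
            # OpenCV is not installed at all, so there is nothing to query.
            return "CPU"
    try:
        count = cv_module.cuda.getCudaEnabledDeviceCount()
    except Exception:
        # Builds without the cuda bindings, or driver errors, count as none.
        count = 0
    return "GPU" if count > 0 else "CPU"


if __name__ == "__main__":
    print(f"OpenCV would run inference on: {inference_device()}")
```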

Model Loading Logic

The PersonDetector class attempts to load models in this sequence (from person_detector.py:42-60):
# Try to load YOLO model files
try:
    self.net = cv2.dnn.readNet(
        f"{model_dir}/yolov4.weights", f"{model_dir}/yolov4.cfg")
    model_name = "YOLOv4"
except:
    try:
        self.net = cv2.dnn.readNet(
            f"{model_dir}/yolov3.weights", f"{model_dir}/yolov3.cfg")
        model_name = "YOLOv3"
    except:
        print("Warning: YOLO weights not found. Using OpenCV's built-in HOG person detector as fallback.")
        self.net = None
        self.hog = cv2.HOGDescriptor()
        default_people_detector = np.array(
            cv2.HOGDescriptor.getDefaultPeopleDetector(), dtype=np.float32)
        self.hog.setSVMDetector(default_people_detector)
        model_name = "HOG"

COCO Classes

The coco.names file contains 80 object classes. RTSP Human Capture only detects class 0: person. From person_detector.py:130-131:
# Only detect persons (class_id = 0 in COCO)
if class_id == 0 and confidence > self.confidence_threshold:
The first ten entries in coco.names are listed below. Class IDs are zero-based, so the first entry, person, has class ID 0:
  1. person
  2. bicycle
  3. car
  4. motorbike
  5. aeroplane
  6. bus
  7. train
  8. truck
  9. boat
  10. traffic light … (and 70 more)
Only “person” (class 0) is used for detection in this application.
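The person-only filter can be sketched in isolation. The example below applies the class and confidence test quoted above, plus a minimum bounding-box area. Treating the "Person area threshold" from the startup log as width times height in pixels is an assumption, and the Detection type and keep_people function are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple

PERSON_CLASS_ID = 0  # "person" is the first entry in coco.names


class Detection(NamedTuple):
    class_id: int
    confidence: float
    width: int   # bounding-box width in pixels
    height: int  # bounding-box height in pixels


def keep_people(
    detections: List[Detection],
    confidence_threshold: float = 0.5,
    min_area: int = 1000,
) -> List[Detection]:
    """Keep only confident, sufficiently large person detections."""
    kept = []
    for det in detections:
        is_person = det.class_id == PERSON_CLASS_ID
        confident = det.confidence > confidence_threshold
        # Assumed meaning of the area threshold: box width * height.
        big_enough = det.width * det.height >= min_area
        if is_person and confident and big_enough:
            kept.append(det)
    return kept


if __name__ == "__main__":
    sample = [
        Detection(0, 0.90, 50, 100),   # person, confident, large: kept
        Detection(2, 0.95, 200, 100),  # car: dropped
        Detection(0, 0.40, 50, 100),   # person, low confidence: dropped
        Detection(0, 0.80, 20, 20),    # person, too small: dropped
    ]
    print(keep_people(sample))
```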

Troubleshooting

Symptoms:
Warning: YOLO weights not found. Using OpenCV's built-in HOG person detector as fallback.
Solutions:
  • Verify files are in the correct directory (default: model/)
  • Check file names are exactly: yolov4.weights, yolov4.cfg, coco.names
  • Ensure files aren’t corrupted (check file sizes)
  • Verify read permissions on model files

Symptoms:
Error: OpenCV(4.x.x) ... DNN module is not built with CUDA backend
Solution: Despite the "Error" prefix, this message is non-fatal. The system automatically falls back to CPU inference. See GPU Acceleration for CUDA setup.

Symptoms: coco.names is missing from the model directory.
Behavior: The system uses a fallback class list (from person_detector.py:79-81):
except:
    # Default COCO classes if file not found
    self.classes = ["person", "bicycle", "car",
                    "motorbike", "aeroplane", "bus", "train", "truck"]
Impact: Detection still works, but only the first 8 classes are named.
Solution: Download coco.names as shown above.

Symptoms: Files exist but the system reports they're not found.
Solution: Check your config.cfg file:
[paths]
# Make sure this matches your directory
model_dir = model
Or pass the config file explicitly at runtime:
python main.py --config config.cfg --rtsp "rtsp://..." --save image

Performance Comparison

Model    Accuracy    Speed (CPU)          Speed (GPU)        File Size
YOLOv4   Excellent   ~100-300 ms/frame    ~10-30 ms/frame    245 MB
YOLOv3   Very Good   ~80-250 ms/frame     ~8-25 ms/frame     248 MB
HOG      Fair        ~50-150 ms/frame     N/A                0 MB
Times are approximate and vary based on image resolution, hardware, and scene complexity.
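Because these figures depend so heavily on hardware and input, it can be worth measuring on your own setup. The sketch below is a generic timing helper, not part of the project: median_ms_per_frame is a hypothetical name, and detect stands in for whatever callable runs detection on a single frame.

```python
import time
from statistics import median


def median_ms_per_frame(detect, frames, warmup=1):
    """Median wall-clock milliseconds that detect(frame) takes.

    `frames` must be a non-empty list. The first `warmup` frames are run
    once untimed so one-off setup cost does not skew the result.
    """
    for frame in frames[:warmup]:
        detect(frame)
    timings = []
    for frame in frames:
        start = time.perf_counter()
        detect(frame)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


if __name__ == "__main__":
    # Stand-in workload: summing a list instead of running a real detector.
    fake_frames = [list(range(10_000)) for _ in range(20)]
    print(f"{median_ms_per_frame(sum, fake_frames):.3f} ms/frame")
```

The median is used rather than the mean so that an occasional slow frame does not dominate the reported number.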

Next Steps