Model Setup - RTSP Human Capture

Overview

RTSP Human Capture supports multiple detection models with automatic fallback. The system tries to load models in this order:

YOLOv4 (recommended for best accuracy)
YOLOv3 (fallback if YOLOv4 not found)
HOG (OpenCV built-in, no files required)

The HOG detector runs automatically when no YOLO model files are found. While less accurate than YOLO, it requires no additional downloads and works immediately.

Model Directory Structure

By default, model files should be placed in the model/ directory:

rtsp-human-capture/
├── model/
│   ├── yolov4.weights      # Main model weights
│   ├── yolov4.cfg          # Model configuration
│   └── coco.names          # Class labels (80 classes)
├── main.py
└── config.cfg

You can change the model directory location in config.cfg by setting model_dir under the [paths] section.
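
As a minimal sketch of how that setting could be read, assuming config.cfg is an INI-style file parsed with Python's standard configparser module (the application's own parsing code is not shown on this page). The helper name read_model_dir and the example path are hypothetical:

```python
import configparser


def read_model_dir(config_text: str, default: str = "model") -> str:
    """Return [paths] model_dir from INI text, or the default if it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback covers both a missing [paths] section and a missing option
    return parser.get("paths", "model_dir", fallback=default)


example = """
[paths]
model_dir = /opt/models/yolo
"""
print(read_model_dir(example))        # value taken from the [paths] section
print(read_model_dir("[paths]\n"))    # option missing, so the default is used
```

Falling back to "model" mirrors the default directory described above.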

Downloading YOLOv4 Files

YOLOv4 provides the best detection accuracy. Download these three files:

Download yolov4.weights

Download the pre-trained weights file (245 MB):

wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights -P model/

Or download manually from: yolov4.weights

Download yolov4.cfg

Download the model configuration file:

wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4.cfg -P model/

Or download manually from: yolov4.cfg

Download coco.names

Download the COCO class labels file:

wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/data/coco.names -P model/

Or download manually from: coco.names
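
If wget is not available, the same three downloads can be scripted. The following is a sketch only: the URLs are the ones listed above, while download_missing and its fetch parameter are hypothetical names introduced for illustration. It skips files that already exist so the large weights file is not fetched twice.

```python
import os
import urllib.request

# URLs copied from the wget commands above
MODEL_URLS = {
    "yolov4.weights": "https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights",
    "yolov4.cfg": "https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4.cfg",
    "coco.names": "https://raw.githubusercontent.com/AlexeyAB/darknet/master/data/coco.names",
}


def download_missing(model_dir="model", fetch=urllib.request.urlretrieve):
    """Download each file in MODEL_URLS that is not already in model_dir.

    Returns the list of file names that were downloaded.
    """
    os.makedirs(model_dir, exist_ok=True)
    downloaded = []
    for name, url in MODEL_URLS.items():
        target = os.path.join(model_dir, name)
        if not os.path.exists(target):
            fetch(url, target)  # fetch(url, filename), like urlretrieve
            downloaded.append(name)
    return downloaded
```

Calling download_missing() with no arguments would place the files in model/, matching the default directory.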

Downloading YOLOv3 Files (Alternative)

If you prefer YOLOv3 or want it as a fallback:

YOLOv3 Weights

wget https://pjreddie.com/media/files/yolov3.weights -P model/

YOLOv3 Config

wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3.cfg -P model/

COCO Names

wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/data/coco.names -P model/

HOG Fallback Detector

The Histogram of Oriented Gradients (HOG) detector is OpenCV’s built-in person detection method.

When HOG is Used

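HOG activates automatically when neither YOLO model can be loaded, which happens when:

yolov4.weights and yolov3.weights are both missing
The YOLO files are present but model loading fails (for example, after a corrupted download)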
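
To predict which detector a given directory will produce before starting the application, a file-presence check along these lines may help. It is a sketch with a hypothetical helper name (expected_detector), and it assumes each YOLO model needs both its .weights and .cfg file. It cannot detect files that exist but fail to load.

```python
import os


def expected_detector(model_dir: str = "model") -> str:
    """Predict the selected detector based only on which files exist."""
    candidates = (("YOLOv4", "yolov4"), ("YOLOv3", "yolov3"))
    for label, stem in candidates:
        weights = os.path.join(model_dir, f"{stem}.weights")
        cfg = os.path.join(model_dir, f"{stem}.cfg")
        # Assumption: a model is usable only when both files are present
        if os.path.isfile(weights) and os.path.isfile(cfg):
            return label
    return "HOG"


print(expected_detector("model"))
```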

HOG Characteristics

Advantages

No model files required
Lightweight and fast
Works immediately after installation

Limitations

Lower accuracy than YOLO
More false positives/negatives
Less robust to varying poses

HOG Implementation

From person_detector.py:53-60:

except:
    print("Warning: YOLO weights not found. Using OpenCV's built-in HOG person detector as fallback.")
    self.net = None
    self.hog = cv2.HOGDescriptor()
    # Convert to numpy array for setSVMDetector
    default_people_detector = np.array(
        cv2.HOGDescriptor.getDefaultPeopleDetector(), dtype=np.float32)
    self.hog.setSVMDetector(default_people_detector)

Verifying Model Installation

Check file presence

Verify all required files exist:

ls -lh model/

You should see:

yolov4.weights  (~245 MB)
yolov4.cfg      (~12 KB)
coco.names      (~1 KB)
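
The same check can be scripted, which is useful for catching truncated downloads. This is a sketch: check_model_files is a hypothetical helper, and the minimum sizes are assumed lower bounds chosen loosely from the approximate sizes listed above, not values used by the application.

```python
import os

# Assumed lower bounds in bytes, well below the approximate sizes listed above
MIN_SIZES = {
    "yolov4.weights": 200 * 1024 * 1024,
    "yolov4.cfg": 1024,
    "coco.names": 100,
}


def check_model_files(model_dir="model", min_sizes=MIN_SIZES):
    """Map each expected file name to 'ok', 'missing', or 'too small'."""
    report = {}
    for name, min_size in min_sizes.items():
        path = os.path.join(model_dir, name)
        if not os.path.isfile(path):
            report[name] = "missing"
        elif os.path.getsize(path) < min_size:
            report[name] = "too small"
        else:
            report[name] = "ok"
    return report


for name, status in check_model_files("model").items():
    print(f"{name}: {status}")
```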

Test with an image

Run a quick test to verify the model loads correctly:

uv run main.py --test-image test.jpg --save image

Look for this output:

Loading person detection model...
Model loaded: YOLOv4
Confidence threshold: 0.5
Person area threshold: 1000 pixels

Check GPU detection (optional)

If CUDA is installed and your OpenCV build includes the CUDA backend, you should see:

CUDA available, using GPU for inference

If not:

CUDA not available, using CPU for inference
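
How the application itself decides between GPU and CPU is not shown on this page. One common way to check an OpenCV build independently is cv2.cuda.getCudaEnabledDeviceCount(), sketched below with a hypothetical helper name (inference_device). It reports "CPU" when OpenCV is missing or was built without CUDA support.

```python
def inference_device() -> str:
    """Return 'GPU' if OpenCV reports a CUDA-enabled device, else 'CPU'."""
    try:
        import cv2
    except ImportError:
        # OpenCV is not installed in this environment
        return "CPU"
    try:
        count = cv2.cuda.getCudaEnabledDeviceCount()
    except (AttributeError, cv2.error):
        # Build without the cuda module, or the CUDA runtime query failed
        return "CPU"
    return "GPU" if count > 0 else "CPU"


print(inference_device())
```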

Model Loading Logic

The PersonDetector class attempts to load models in this sequence (from person_detector.py:42-60):

# Try to load YOLO model files
try:
    self.net = cv2.dnn.readNet(
        f"{model_dir}/yolov4.weights", f"{model_dir}/yolov4.cfg")
    model_name = "YOLOv4"
except:
    try:
        self.net = cv2.dnn.readNet(
            f"{model_dir}/yolov3.weights", f"{model_dir}/yolov3.cfg")
        model_name = "YOLOv3"
    except:
        print("Warning: YOLO weights not found. Using OpenCV's built-in HOG person detector as fallback.")
        self.net = None
        self.hog = cv2.HOGDescriptor()
        default_people_detector = np.array(
            cv2.HOGDescriptor.getDefaultPeopleDetector(), dtype=np.float32)
        self.hog.setSVMDetector(default_people_detector)
        model_name = "HOG"
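
The nested try/except chain above can also be expressed as a loop over candidate loaders. The following is a restructured sketch for illustration, not the code in person_detector.py: load_first_available and the stand-in loaders are hypothetical, and it catches Exception instead of using a bare except so that Ctrl+C still interrupts the program.

```python
def load_first_available(loaders, fallback):
    """Try each (name, loader) pair in order and return the first that works.

    loaders:  list of (name, zero-argument callable) pairs
    fallback: (name, model) pair returned if every loader raises
    """
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # KeyboardInterrupt is not caught here
            print(f"Could not load {name}: {exc}")
    return fallback


def broken_loader():
    # Stand-in for a loader whose weights file is missing
    raise FileNotFoundError("weights missing")


name, model = load_first_available(
    [("YOLOv4", broken_loader), ("YOLOv3", lambda: "yolov3-net")],
    fallback=("HOG", None),
)
print(name)
```

In a real setup, each loader would wrap the corresponding cv2.dnn.readNet call and the fallback would carry the HOG detector.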

COCO Classes

The coco.names file contains 80 object classes. RTSP Human Capture only detects class 0: person. From person_detector.py:130-131:

# Only detect persons (class_id = 0 in COCO)
if class_id == 0 and confidence > self.confidence_threshold:
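
To illustrate the effect of that condition on a batch of results, here is a small sketch using plain Python data. The Detection tuple and keep_persons helper are hypothetical and do not reflect the application's internal data structures. The default threshold of 0.5 matches the value printed in the startup output above.

```python
from typing import List, NamedTuple, Tuple

PERSON_CLASS_ID = 0  # "person" is the first entry in coco.names


class Detection(NamedTuple):
    class_id: int
    confidence: float
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def keep_persons(detections: List[Detection],
                 confidence_threshold: float = 0.5) -> List[Detection]:
    """Keep only person detections strictly above the confidence threshold."""
    return [
        d for d in detections
        if d.class_id == PERSON_CLASS_ID and d.confidence > confidence_threshold
    ]


sample = [
    Detection(0, 0.91, (10, 20, 50, 120)),   # person, confident: kept
    Detection(2, 0.95, (200, 40, 90, 60)),   # car: dropped
    Detection(0, 0.30, (300, 10, 40, 100)),  # person, low confidence: dropped
]
print(keep_persons(sample))
```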

Full COCO Class List

The COCO dataset includes 80 classes:

person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
… (and 70 more)

Only “person” (class 0) is used for detection in this application.

Troubleshooting

Model files not loading

Symptoms:

Warning: YOLO weights not found. Using OpenCV's built-in HOG person detector as fallback.

Solutions:

Verify files are in the correct directory (default: model/)
Check file names are exactly: yolov4.weights, yolov4.cfg, coco.names
Ensure files aren’t corrupted (check file sizes)
Verify read permissions on model files

OpenCV DNN errors

Symptoms:

Error: OpenCV(4.x.x) ... DNN module is not built with CUDA backend

Solution: This is just a warning. The system will automatically fall back to CPU inference. See GPU Acceleration for CUDA setup.

coco.names not found

Behavior: System uses fallback class list (from person_detector.py:79-81):

except:
    # Default COCO classes if file not found
    self.classes = ["person", "bicycle", "car",
                    "motorbike", "aeroplane", "bus", "train", "truck"]

Impact: Detection still works, but only the first 8 classes are named.

Solution: Download coco.names as shown above.
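
The fallback behavior can be sketched as a standalone function. This is an illustrative variant, not the code in person_detector.py: load_class_names is a hypothetical name, and it catches OSError (which covers a missing or unreadable file) in place of the bare except shown in the excerpt.

```python
# Same eight names as the fallback list in the excerpt above
FALLBACK_CLASSES = ["person", "bicycle", "car", "motorbike",
                    "aeroplane", "bus", "train", "truck"]


def load_class_names(path):
    """Read one class name per line, or return the fallback list on failure."""
    try:
        with open(path, encoding="utf-8") as f:
            # Skip blank lines so indexes line up with class ids
            return [line.strip() for line in f if line.strip()]
    except OSError:
        return list(FALLBACK_CLASSES)


classes = load_class_names("model/coco.names")
print(len(classes), classes[0])
```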

Wrong model directory

Symptoms: Files exist but the system reports they’re not found.

Solution: Check your config.cfg file:

[paths]
# Make sure this matches your directory
model_dir = model

Or pass the config file explicitly at runtime:

python main.py --config config.cfg --rtsp "rtsp://..." --save image

Performance Comparison

| Model | Accuracy | Speed (CPU) | Speed (GPU) | File Size |
|---|---|---|---|---|
| YOLOv4 | Excellent | ~100-300 ms/frame | ~10-30 ms/frame | 245 MB |
| YOLOv3 | Very Good | ~80-250 ms/frame | ~8-25 ms/frame | 248 MB |
| HOG | Fair | ~50-150 ms/frame | N/A | 0 MB |

Times are approximate and vary based on image resolution, hardware, and scene complexity.
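
Because the figures depend so heavily on hardware, it can be worth measuring per-frame time on your own machine. The sketch below uses only the standard library. median_ms_per_frame is a hypothetical helper, and detect stands for any callable that runs detection on one frame (the stand-in here just sleeps).

```python
import statistics
import time


def median_ms_per_frame(detect, frames, warmup=1):
    """Return the median time in milliseconds that detect(frame) takes."""
    # Warm-up calls are not timed, since the first inference is often slower
    for frame in frames[:warmup]:
        detect(frame)
    timings = []
    for frame in frames:
        start = time.perf_counter()
        detect(frame)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def fake_detect(frame):
    # Stand-in detector that simulates roughly 10 ms of work
    time.sleep(0.01)


print(f"{median_ms_per_frame(fake_detect, [None] * 5):.1f} ms/frame")
```

The median is used here because it is less sensitive than the mean to an occasional slow frame.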

Next Steps

Configuration

Configure detection thresholds and paths

GPU Acceleration

Enable CUDA for faster inference

Single Stream

Process your first RTSP stream

Multi-Stream

Monitor multiple cameras simultaneously