Overview
Single stream mode processes one RTSP camera feed with real-time person detection. When a person enters the frame, the system can:
- Save an annotated JPEG snapshot
- Record an MP4 video clip of their presence
- Display a live annotated window
- Print detection events to console
Single stream mode uses a dedicated display window (1280×720 resolution). For multiple cameras, see Multi-Stream Processing.
Basic Usage
Minimal Example
- Connects to the RTSP stream
- Runs person detection on every 15th frame (default)
- Saves JPEG snapshots when persons are detected
- Shows a live display window
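The steps above can be sketched as a simplified processing loop. This is a hypothetical sketch, not the real implementation: `detect_persons` and `save_snapshot` are stand-ins for the detector and snapshot writer in stream_processor.py.

```python
# Simplified sketch of the single-stream loop: run detection on every
# Nth frame and save a snapshot when a person first appears.
# detect_persons() and save_snapshot() are hypothetical stand-ins.
def process_stream(frames, detect_persons, save_snapshot, frame_skip=15):
    person_present = False
    entries = 0
    for i, frame in enumerate(frames):
        if i % frame_skip != 0:
            continue  # skip frames to reduce CPU load
        detected = bool(detect_persons(frame))
        if detected and not person_present:
            entries += 1                 # new entry event
            save_snapshot(frame, entries)
        person_present = detected
    return entries
```

The key design point is that an entry event fires only on the transition from "no person" to "person present", so a person standing still produces one snapshot, not one per frame.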
With uv (Recommended)
Command Structure
The basic command structure for single stream processing takes two arguments:
- RTSP stream URL to process
- Save mode: image for snapshots or video for clips
Display Options
Control whether to show a live display window during processing.
With Display (Default)
By default, single stream mode shows a live annotated window:
- Resized to 1280×720 for consistent viewing
- Green bounding boxes around detected persons
- Confidence scores displayed above boxes
- Person count and entry counter in top-left
- Press ‘q’ to quit
(Source: stream_processor.py:363-396)
Without Display (Headless)
Disable the display window for headless servers or background processing. The --no-display flag disables the live display window (single stream only). Useful for:
- Running on servers without GUI
- Background processing
- Reduced resource usage
- Remote/SSH sessions
Comparison:
- With Display: a live OpenCV window appears
- Without Display: no window is shown; detection events still print to the console
Save Modes
Image Mode (Snapshots)
Captures a single annotated JPEG when a person first enters the frame.
- One snapshot per person entry event
- Annotated with bounding boxes and confidence scores
- Saved immediately when person detected
- Filename includes timestamp and entry counter
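The timestamp-plus-counter naming described above might look like the following sketch. The exact filename pattern is an assumption for illustration; the real format is defined in stream_processor.py.

```python
from datetime import datetime

# Hypothetical sketch of a timestamped snapshot filename; the real
# pattern lives in stream_processor.py.
def snapshot_filename(entry_counter, now=None):
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # e.g. 20240102_030405
    return f"person_{stamp}_entry{entry_counter}.jpg"
```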
(Sources: stream_processor.py:321-327, stream_processor.py:416-434)
Video Mode (Clips)
Records an MP4 clip for the entire duration a person is present in the frame.
- Recording starts when person enters frame
- Continues while person is present
- Stops after 3 consecutive frames without detection
- All frames written at source stream FPS
(Sources: stream_processor.py:329-339, stream_processor.py:344-355)
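The recording lifecycle above can be sketched as a small state machine. This is a simplified sketch under the stated rules (start on detection, stop after 3 consecutive misses); the real logic is in stream_processor.py.

```python
# Sketch of the clip-recording lifecycle: start recording on first
# detection, stop after `exit_threshold` consecutive detection frames
# with no person present.
class ClipRecorder:
    def __init__(self, exit_threshold=3):
        self.exit_threshold = exit_threshold
        self.recording = False
        self.misses = 0
        self.clips = 0

    def update(self, person_detected):
        if person_detected:
            self.misses = 0
            if not self.recording:
                self.recording = True    # person entered: start a clip
        elif self.recording:
            self.misses += 1
            if self.misses >= self.exit_threshold:
                self.recording = False   # person gone: finalize the clip
                self.misses = 0
                self.clips += 1
        return self.recording
```

Note that brief misses (fewer than the threshold) keep the clip open, which is what prevents short occlusions from splitting one visit into many files.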
Exit threshold: 3 consecutive frames without detection.
With frame_skip=15 on a 30fps stream:
- Detection runs every 0.5 seconds
- 3 misses = ~1.5 seconds of no detection
- Prevents premature clip termination from brief occlusions
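That timing works out as a quick arithmetic sketch:

```python
# Exit timeout in seconds = (frames between detection runs / fps) x misses
def exit_timeout_seconds(fps, frame_skip, misses=3):
    detection_interval = frame_skip / fps   # seconds between detection runs
    return detection_interval * misses

# With frame_skip=15 on a 30 fps stream: 0.5 s per run, 1.5 s timeout
```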
Save Mode Comparison
Image Mode
Pros:
- Minimal storage
- Fast to review
- One file per event
- Good for counting
Cons:
- No temporal context
- Misses behavior details
- Single frame only
Video Mode
Pros:
- Full context
- Review behavior
- Continuous recording
- Better for security
Cons:
- Large file sizes
- More storage needed
- Slower to review
Detection Flow
Understanding how detection works in single stream mode: detection runs every Nth frame (stream_processor.py:303-305).
- Default: every 15th frame
- Throttled to max 2 fps
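The combination of frame skipping and the 2 fps cap can be sketched with a hypothetical helper (the real check is around stream_processor.py:303-305):

```python
# Run detection only on every Nth frame AND at most twice per second,
# whichever is the tighter constraint.
def should_detect(frame_index, now, last_detection_time,
                  frame_skip=15, max_detections_per_sec=2):
    if frame_index % frame_skip != 0:
        return False                     # frame skipping
    min_interval = 1.0 / max_detections_per_sec
    return (now - last_detection_time) >= min_interval  # fps throttle
```

The throttle matters on high-FPS streams: at 60 fps, frame_skip=15 alone would still run detection 4 times per second.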
Console Output
Understanding the console output during processing. Messages are printed for:
- Model loading
- Connection status
- Detection events
- Exit tracking
Configuration Overrides
Override config.cfg values for single stream processing.
Automatic Reconnection
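As a sketch, such overrides might look like the fragment below. The key names are illustrative assumptions only; check your actual config.cfg for the real ones.

```ini
; Hypothetical config.cfg fragment - key names are illustrative only
[detection]
confidence_threshold = 0.6   ; minimum detection confidence
area_threshold = 5000        ; minimum bounding-box area in pixels
frame_skip = 15              ; run detection every Nth frame
```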
Single stream mode includes automatic reconnection on connection loss (stream_processor.py:282-296).
Reconnection behavior:
- Max 5 retry attempts
- 2 second delay between attempts
- Resets counter on successful read
- Exits after exhausting retries
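The retry policy above can be sketched as follows. This is a simplified sketch: `read_frame` and `reconnect` are hypothetical stand-ins, and the real loop is at stream_processor.py:282-296.

```python
import time

# Sketch of the reconnection policy: up to 5 retries, 2 s apart,
# with the retry counter reset on every successful frame read.
def run_with_reconnect(read_frame, reconnect, max_retries=5, delay=2.0,
                       sleep=time.sleep):
    retries = 0
    while True:
        frame = read_frame()
        if frame is not None:
            retries = 0          # successful read resets the counter
            yield frame
            continue
        retries += 1
        if retries > max_retries:
            return               # exhausted retries: give up
        sleep(delay)             # wait before attempting to reconnect
        reconnect()
```

Resetting the counter on success is the important design choice: only 5 *consecutive* failures end the run, so occasional dropped frames on a flaky link do not accumulate toward the limit.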
Real-World Examples
Example 1: Store Entrance Monitoring
Goal: Count customers entering a store
- Image mode: one snapshot per customer
- Higher confidence: reduce false positives
- Larger area threshold: only detect close persons (actually entering)
- No display: runs in background
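In config terms, this setup might look like the fragment below. As before, the key names are illustrative assumptions, not the verified contents of config.cfg.

```ini
; Hypothetical overrides for store-entrance counting
[detection]
confidence_threshold = 0.8   ; higher: fewer false positives
area_threshold = 15000       ; larger: only near/entering persons
```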
Example 2: Security Incident Recording
Goal: Record full video of any activity in restricted area
- Video mode: capture full behavior
- Lower confidence: don’t miss any detections
- More frequent checking: faster detection response
- With display: monitor in real-time
Example 3: Parking Lot Wide-Angle
Goal: Detect people in large parking area
- Image mode: event logging
- Lower confidence: better for distant detection
- Low area threshold: detect small/distant persons
- Less frequent: acceptable for slow-moving subjects
Troubleshooting
Cannot connect to stream
Possible causes:
- Incorrect RTSP URL
- Network connectivity issues
- Camera authentication required
- Firewall blocking connection
Solutions:
- Verify the URL with VLC or ffplay
- Check network connectivity
- Add credentials to the URL (rtsp://user:password@host/path)
Display window not showing
Issue: No OpenCV window appears
Possible causes:
- --no-display flag set
- Headless environment (no X server)
- Display environment variable not set
Solutions:
- Remove the --no-display flag
- For SSH: enable X11 forwarding
- Set the DISPLAY variable
No detections happening
Issue: Stream works but no persons detected
Debugging steps:
1. Check the model is loaded
2. Lower the confidence threshold
3. Lower the area threshold
4. Test with an image first
Frequent reconnections
Issue: The stream repeatedly drops and reconnects.
Possible causes:
- Unstable network
- Camera stream issues
- Network bandwidth limitations
- Router/switch problems
Solutions:
- Check network stability
- Reduce stream quality at camera
- Use wired connection instead of WiFi
- Check camera logs for issues
High CPU/GPU usage
Issue: System resources maxed out
Solutions:
1. Increase frame_skip
2. Disable the display
3. Use HOG instead of YOLO:
   - Move YOLO weights out of model directory
   - HOG is faster but less accurate
4. Enable GPU if available
Performance Tips
Optimize frame_skip
Start with frame_skip=30 and decrease until detection responsiveness is acceptable.
Use GPU acceleration
CUDA-enabled OpenCV provides 5-10x speedup. See GPU setup.
Adjust resolution
Configure camera to stream at lower resolution (e.g., 1280×720 instead of 1920×1080).
Tune thresholds
Higher confidence/area thresholds = fewer detections = less processing.