Stream ESP32CAM video
Capture ESP32CAM Video Stream in Python on a Raspberry Pi 5
29 December 2023, 5 minute read
By Kevin McAleer
The ESP32Cam is a tiny module that lets you stream video from its camera to a web browser. It’s a great way to easily add a POV capability to your robotics project. The firmware that comes loaded on the ESP32Cam has a web server, a self-hosted WiFi hotspot, and a built-in video streamer.
We’ll use the video streamer to capture the video stream in Python on a Raspberry Pi 5, and then process that video in real time.
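The ESP32Cam video streamer is often described as using RTSP to stream the video. Strictly speaking, the stream URL we use below (`http://192.168.4.1:81/stream`) is a plain HTTP address, and the stock camera firmware typically serves Motion JPEG over HTTP there rather than true RTSP. RTSP is still worth understanding, as many IP cameras use it and OpenCV opens both kinds of URL in the same way.
RTSP stands for Real-Time Streaming Protocol. It’s a network protocol, a set of rules, used for controlling the streaming of audio and video data over the Internet in real time. Think of it as a ‘remote control’ for live video feeds.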
Connection: First, your device (like your computer or smartphone) contacts the server (where the video is stored or being broadcast from) using RTSP. It’s like dialing a phone number to start a call.
Control Commands: Once connected, RTSP allows you to send control commands to the server. You can tell the server to do things like ‘play the video’, ‘pause’, ‘rewind’, or ‘fast forward’ – similar to how you use a remote control with your TV.
Streaming: Unlike downloading a file, where you wait for the entire file to download before viewing, RTSP allows the video or audio to be played as it’s being transmitted. This is known as streaming.
Separate Data Transport: RTSP itself doesn’t send the video or audio data. Instead, it works alongside other protocols (like RTP - Real-Time Transport Protocol) that handle the actual transmission of the audio and video data.
Live Control: RTSP is great for situations where you need real-time control over streaming, like in security camera feeds, live broadcasts, or video conferencing.
Efficiency: It’s efficient for streaming live content because it reduces delay and allows for interactive control over the stream.
Flexibility: RTSP supports various media types and can be used with different kinds of networks and devices.
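To make the 'remote control' idea a little more concrete, here is a minimal sketch of what RTSP control messages look like on the wire. RTSP requests are plain text, much like HTTP. The helper name `build_rtsp_request`, the camera URL and the session ID are all made up for illustration; a real client would also open a socket, send `DESCRIBE` and `SETUP` first, and read the session ID from the server's reply.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format a single RTSP/1.0 request as text (illustrative helper)."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Lines are separated by CRLF and the request ends with a blank line.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical camera URL and session ID, for illustration only.
url = "rtsp://camera.example/stream"
print(build_rtsp_request("OPTIONS", url, 1))
print(build_rtsp_request("PLAY", url, 2, {"Session": "12345678"}))
print(build_rtsp_request("PAUSE", url, 3, {"Session": "12345678"}))
```

Each request carries an increasing `CSeq` number so replies can be matched to requests. The video itself does not travel in these messages; it arrives separately over RTP.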
We can use the `cv2` library in Python to capture the video stream from the ESP32Cam. We pass the URL of the ESP32Cam video stream to the `cv2.VideoCapture()` function, and it returns a video capture object that we can use to read the video frames.
```python
import cv2

# Replace with your ESP32Cam stream URL
rtsp_url = 'http://192.168.4.1:81/stream'

# Capture the video stream
cap = cv2.VideoCapture(rtsp_url)
```
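`cv2.VideoCapture()` does not raise an error if the camera can't be reached, so it is worth checking `isOpened()` before reading frames. The sketch below retries a few times, which can help if the Pi has only just joined the ESP32Cam hotspot. The `open_stream` helper, its `make_capture` parameter and the retry values are assumptions made for this example, not part of OpenCV.

```python
import time


def open_stream(url, make_capture, attempts=3, delay_s=1.0):
    """Try to open a stream a few times; return the capture object or None.

    make_capture is any callable that takes a URL and returns an object with
    isOpened() and release(), for example cv2.VideoCapture.
    """
    for attempt in range(1, attempts + 1):
        cap = make_capture(url)
        if cap.isOpened():
            return cap
        cap.release()
        print(f"Attempt {attempt} of {attempts} failed")
        time.sleep(delay_s)
    return None


def main():
    import cv2  # imported here so open_stream() itself does not need OpenCV

    cap = open_stream("http://192.168.4.1:81/stream", cv2.VideoCapture)
    if cap is None:
        raise SystemExit("Could not open the ESP32Cam stream.")
    ret, frame = cap.read()
    print("Got a frame:", ret)
    cap.release()
```

Call `main()` on the Pi once it is connected to the ESP32Cam's WiFi network.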
Before we write a simple program to capture the stream and process it, we need to set up a new Python environment on the Raspberry Pi 5. We’ll use the `cvzone` module to detect objects in the video stream. The `cvzone` module is a wrapper around the `cv2` library (its face detector also relies on `mediapipe`, which is why we install that too), and makes it easier to detect objects in the video stream.
```bash
python3 -m venv venv
source venv/bin/activate
pip3 install cvzone mediapipe opencv-python
```
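A quick way to confirm the environment is ready is to check that each module can be found without actually importing it. This is only a sketch using the standard library; the `missing_modules` helper is a name invented for this example. Note that the `opencv-python` package is imported as `cv2`.

```python
import importlib.util


def missing_modules(names):
    """Return the names from the list that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# opencv-python installs under the import name cv2
required = ["cv2", "cvzone", "mediapipe"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

Run it with the virtual environment activated; if anything is listed as missing, repeat the `pip3 install` step.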
We can use the `cv2` library to detect objects in the video stream. To detect faces, we can use the `cvzone` module and its `FaceDetector` class. The `FaceDetector` class detects faces in each frame and returns a list of the faces it has found. The `cv2.imshow()` function then displays the video stream in a window on the screen, with a green box around each detected face, along with a percentage confidence score.
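Once you have that list of faces, you will often want to pick one to act on. The sketch below assumes each entry is a dictionary with a `"bbox"` of `(x, y, width, height)` and a `"score"`, which is how recent `cvzone` versions appear to report faces; print one entry from your own install to confirm before relying on it. The helper names are invented for this example.

```python
def largest_face(faces):
    """Return the face with the biggest bounding-box area, or None if empty."""
    if not faces:
        return None
    # Assumed layout: face["bbox"] == (x, y, width, height)
    return max(faces, key=lambda face: face["bbox"][2] * face["bbox"][3])


def confidence_percent(face):
    """Convert a face's score to a whole-number percentage."""
    score = face["score"]
    try:
        value = score[0]  # the score may arrive as a one-element sequence
    except TypeError:
        value = score  # or as a plain float
    return round(value * 100)


# Made-up detections, for illustration only.
sample_faces = [
    {"bbox": (10, 20, 40, 40), "score": [0.5]},
    {"bbox": (100, 50, 80, 90), "score": [0.75]},
]
face = largest_face(sample_faces)
print(face["bbox"], confidence_percent(face))
```

The largest box is usually the closest face, which makes it a reasonable default target for a robot.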
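```python
# cap and face_detector are created as in the full listing further down
while True:
    ret, frame = cap.read()
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break

    # Process the frame with OpenCV here
    frame, list_faces = face_detector.findFaces(frame)
    cv2.imshow("Face Detection", frame)

    # waitKey() lets the window refresh; press 'q' to quit
    if cv2.waitKey(1) == ord('q'):
        break
```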
The ESP32Cam is low power, simple to set up and configure, and pretty low cost too (£0.57 each on AliExpress, plus shipping). By offloading the video processing to the Raspberry Pi 5, we don’t need to change the ESP32Cam firmware and can build on the image processing capabilities of the Pi 5.
We can use the image data to make decisions about how to control the robot remotely, making it move towards objects or look at a face.
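As a rough sketch of that idea, the horizontal position of a detected face can be turned into a steering decision. Everything here is an assumption for illustration: the function names, the command strings and the 15% dead band are made up, and the mapping assumes the camera image is not mirrored.

```python
def steering_offset(face_center_x, frame_width):
    """Horizontal offset of the face from the frame centre, from -1.0 to 1.0."""
    half_width = frame_width / 2
    return (face_center_x - half_width) / half_width


def steering_command(offset, deadband=0.15):
    """Map an offset to a simple command; small offsets count as centred."""
    if offset < -deadband:
        return "turn_left"
    if offset > deadband:
        return "turn_right"
    return "forward"


# Example: a face centred at x=240 in a 320-pixel-wide frame
offset = steering_offset(240, 320)
print(offset, steering_command(offset))
```

The dead band stops the robot twitching left and right when the face is already roughly centred. How the command reaches the motors depends on your robot and is not covered here.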
| Item | Description | Price per item | Qty | Cost |
|---|---|---|---|---|
| ESP32CAM | ESP32CAM Module | £0.57 | 1 | £0.57 |
```python
import cv2
from cvzone.FaceDetectionModule import FaceDetector

face_detector = FaceDetector()

# Replace with your ESP32Cam stream URL
rtsp_url = 'http://192.168.4.1:81/stream'

# Capture the video stream
cap = cv2.VideoCapture(rtsp_url)

while True:
    ret, frame = cap.read()
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break

    # Process the frame with OpenCV here
    frame, list_faces = face_detector.findFaces(frame)
    cv2.imshow("Face Detection", frame)

    # Press 'q' in the video window to quit
    if cv2.waitKey(1) == ord('q'):
        break

# Release everything if job is finished
cap.release()
cv2.destroyAllWindows()
```
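To see whether the Pi really keeps up in real time, you can measure how many frames per second the loop processes. This is a small sketch using only the standard library; the `FpsMeter` class and its 30-frame window are assumptions made for this example.

```python
import time
from collections import deque


class FpsMeter:
    """Estimate frames per second over the most recent frames."""

    def __init__(self, window=30):
        # Only the last `window` timestamps are kept
        self.timestamps = deque(maxlen=window)

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate."""
        self.timestamps.append(time.monotonic() if now is None else now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        if elapsed <= 0:
            return 0.0
        # N timestamps span N - 1 frame intervals
        return (len(self.timestamps) - 1) / elapsed
```

Create one `FpsMeter()` before the `while` loop, call `fps = meter.tick()` once per frame, and print the value or draw it on the frame with `cv2.putText()`.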