Is it feasible to run MobileNetV2 for image recognition on an ESP32 with a 320x240 camera feed?

Hello, I have a question about the feasibility of running an image recognition model on an ESP32 to recognize a single specific object, such as a person, from a camera's live video feed. The frames the model will process are 320x240 pixels. I'm considering a lightweight model like MobileNetV2. I'm also curious whether the ESP32 has any hardware acceleration for image processing, or whether the ESP-NN library can be used for optimized neural network inference. The power budget for the ESP32 is quite limited. Some delay in recognition is acceptable, but it should not be significant. Given these constraints and requirements, is it realistic to achieve this goal with the ESP32, or should I consider a different approach or hardware?
5 Replies
RED HAT, 4mo ago
Running an image recognition model like MobileNetV2 on an ESP32 is a challenging task, especially with a live video feed at 320x240 resolution. The ESP32 has limited processing power and memory, which makes it difficult to run complex models efficiently.
RED HAT, 4mo ago
While it’s possible to run lightweight models on the ESP32 using frameworks like TensorFlow Lite for Microcontrollers, you’ll need to heavily optimize the model to fit within the ESP32’s constraints. This might involve reducing the model size or simplifying the architecture, which could affect accuracy.
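To make the "fit within the ESP32's constraints" point concrete, here is a rough back-of-envelope sketch. The parameter counts are approximate figures for 224x224 MobileNetV2 variants, the 4 MB flash size is an assumption about a typical module, and the helper names are made up for illustration. It only compares weight storage against memory sizes; activation buffers and the firmware itself need space too.

```python
# Back-of-envelope check: how large are MobileNetV2 weights compared with
# ESP32 memory? All figures are approximate assumptions, not measurements.

ESP32_SRAM_BYTES = 520 * 1024           # internal SRAM on the original ESP32
ASSUMED_FLASH_BYTES = 4 * 1024 * 1024   # assumed module flash size; check your board

# Approximate parameter counts (assumed, rounded).
APPROX_PARAMS = {
    "mobilenet_v2_1.0": 3_500_000,
    "mobilenet_v2_0.35": 1_700_000,
}


def weight_bytes(num_params: int, bytes_per_weight: int) -> int:
    """Storage needed for the weights alone (no activations, no code)."""
    return num_params * bytes_per_weight


def fits(num_bytes: int, budget_bytes: int) -> bool:
    """True if the given size is within the given memory budget."""
    return num_bytes <= budget_bytes


for name, params in APPROX_PARAMS.items():
    # float32 uses 4 bytes per weight; int8 quantization uses 1 byte.
    for label, width in (("float32", 4), ("int8", 1)):
        size = weight_bytes(params, width)
        print(
            f"{name} {label}: {size / 1024:.0f} KiB, "
            f"fits SRAM: {fits(size, ESP32_SRAM_BYTES)}, "
            f"fits flash: {fits(size, ASSUMED_FLASH_BYTES)}"
        )
```

Under these assumptions, even the int8 weights of the smaller variant exceed internal SRAM, so they would have to live in flash (or external PSRAM, if the module has it), which is why int8 quantization and a reduced width multiplier usually go together.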
RED HAT, 4mo ago
My point is that while it’s not impossible to do basic image recognition on an ESP32, it’s not the ideal platform for your requirements, especially with a live video feed. You might want to explore alternative hardware if performance and accuracy are key considerations. @wafa_ath
Dtynin, 4mo ago
@wafa_ath Running an image recognition model on an ESP32 is doable but comes with some challenges. The ESP32 has limited processing power and memory, so you'll need to optimize models like MobileNetV2 quite a bit. While it can handle a 320x240 resolution, getting it to work in real-time without hardware acceleration is tough.
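One common way to ease the 320x240 load is to downscale each frame before inference. The sketch below is plain Python for illustration only (on the device this would be C/C++). The 96x96 grayscale target is an assumption borrowed from small person-detection models, and `frame_bytes` and `downscale_nearest` are hypothetical helper names.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one uncompressed frame buffer in bytes."""
    return width * height * bytes_per_pixel


def downscale_nearest(pixels, src_w, src_h, dst_w, dst_h):
    """Nearest-neighbour resize of a flat, row-major grayscale image."""
    out = []
    for y in range(dst_h):
        src_y = y * src_h // dst_h          # source row for this output row
        row_start = src_y * src_w
        for x in range(dst_w):
            src_x = x * src_w // dst_w      # source column for this output column
            out.append(pixels[row_start + src_x])
    return out


# Synthetic 320x240 grayscale frame standing in for a camera capture.
frame = [(x + y) % 256 for y in range(240) for x in range(320)]
small = downscale_nearest(frame, 320, 240, 96, 96)

print(frame_bytes(320, 240, 2))  # RGB565 frame: 153600 bytes
print(frame_bytes(320, 240, 1))  # grayscale frame: 76800 bytes
print(len(small))                # 96x96 grayscale input: 9216 bytes
```

A full RGB565 frame already takes a large share of the ESP32's 520 KB of internal SRAM, while the downscaled grayscale input is a small fraction of that, at the cost of detail and a slightly distorted aspect ratio.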
wafa_ath, 4mo ago
Are there any specific libraries that could help improve performance? Any tips on balancing accuracy and speed would be greatly appreciated.