Anhaj Uwaisulkarni

Building a Live Bus Tracker with ESP32-CAM, GPS, and Cellular Data (Part 1)

Public transportation has a massive data problem. Commuters constantly face unpredictable arrival times and have no idea how crowded a bus is until it pulls up.

I decided to fix this by building a self-contained IoT device for buses. It tracks live GPS coordinates and uses an onboard camera to capture the interior.

This is Part 1 of my case study. I will break down the hardware node, the C++ firmware, and how I managed to reliably transmit image data over a 2G cellular network.

The Hardware Stack

The goal was to keep the unit low-cost but capable of handling network failovers and image processing.

  • ESP32-CAM: The central brain of the operation. It handles the logic, captures the JPEG image, and manages the network connections.
  • NEO-6M GPS: Connected via serial to constantly pull latitude, longitude, and speed data (a quick read-loop sketch follows after this list).
  • SIM800L Module: Handles the cellular data transmission. Getting images over a 2G connection is tough, but necessary for mobile transit tracking.
  • LM2596 Buck Converter: Steps the volatile 12V/24V from the bus battery down to a clean 5V to keep the modules from frying.
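
To give a sense of the GPS side, here is a minimal read-loop sketch using the TinyGPS++ library. The library choice, the UART pins, and the 9600 baud rate are my assumptions for illustration, not a wiring spec from the finished device.

C++
  // Minimal NEO-6M read loop with TinyGPS++ (library and pins are assumptions).
  #include <TinyGPS++.h>

  TinyGPSPlus gps;
  HardwareSerial gpsSerial(1);                    // UART1 on the ESP32

  void setupGps() {
    gpsSerial.begin(9600, SERIAL_8N1, 14, 15);    // RX=14, TX=15 (hypothetical pins)
  }

  void pollGps() {
    while (gpsSerial.available()) {
      gps.encode(gpsSerial.read());               // feed raw NMEA bytes to the parser
    }
    if (gps.location.isValid()) {
      Serial.printf("Lat %.6f, Lng %.6f, %.1f km/h\n",
                    gps.location.lat(), gps.location.lng(), gps.speed.kmph());
    }
  }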

The Firmware: Solving Network Drops

I wrote the C++ firmware to handle network failovers automatically. The ESP32 first attempts a WiFi connection (useful for debugging at the terminal). If that fails, it instantly restarts the SIM module and falls back to a GPRS cellular connection.
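
For context, the failover check below runs right after a WiFi attempt. A minimal version of that attempt might look like this; the credentials and the roughly 8-second timeout are placeholders, not the exact values from my build.

C++
  // Hypothetical WiFi attempt that runs just before the failover check below.
  WiFi.begin(ssid, password);                       // credentials defined elsewhere
  unsigned long wifiStart = millis();
  while (WiFi.status() != WL_CONNECTED && millis() - wifiStart < 8000) {
    delay(250);                                     // poll every 250 ms, ~8 s budget
    Serial.print(".");
  }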

C++
  // Network Failover Logic
  if (WiFi.status() == WL_CONNECTED) {
    Serial.println("\n✅ WiFi Connected!");
    activeClient = &wifiClient;
    connected = true;
  } else {
    Serial.println("\n❌ WiFi Failed. Trying SIM800L...");
    modem.restart();

    // Attempt a GPRS connection for up to 10 seconds
    unsigned long gsmStart = millis();
    while (!modem.isGprsConnected() && millis() - gsmStart < 10000) {
      modem.gprsConnect(apn, gprsUser, gprsPass);
    }

    if (modem.isGprsConnected()) {
      Serial.println("✅ GPRS Connected!");
      activeClient = &gsmClient;  // TinyGsmClient instance declared elsewhere
      connected = true;
    }
  }

The Firmware: Chunking Image Data

The biggest headache in IoT development is memory management. The ESP32-CAM does not have the RAM to load a massive HTTP POST request into memory all at once.

If you try to send the entire JPEG buffer and the HTTP headers in a single client.print() command, the board will crash and reboot.

To solve this, I structured the HTTP request as multipart/form-data. I injected the GPS coordinates into the custom headers, and then sent the actual image binary in strict 1024-byte chunks.
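
Before any image bytes go out, the request line, the GPS headers, and the multipart preamble have to be written first. The snippet below is only a rough sketch of that stage, picking up right after a frame has been captured: the header names (X-GPS-Lat / X-GPS-Lng), the boundary string, the /upload path, and serverHost are placeholders rather than the exact values from my firmware, and the GPS reads reuse the hypothetical TinyGPS++ setup from earlier.

C++
  // Sketch of the request preamble; header names, boundary, path, and host
  // are placeholders. The chunked JPEG body is streamed right after this.
  camera_fb_t *fb = esp_camera_fb_get();    // grab the latest JPEG frame

  String boundary = "----BusTrackerBoundary";
  String head = "--" + boundary + "\r\n";
  head += "Content-Disposition: form-data; name=\"image\"; filename=\"frame.jpg\"\r\n";
  head += "Content-Type: image/jpeg\r\n\r\n";
  String tail = "\r\n--" + boundary + "--\r\n";
  size_t contentLength = head.length() + fb->len + tail.length();

  activeClient->print("POST /upload HTTP/1.1\r\n");
  activeClient->print("Host: " + String(serverHost) + "\r\n");
  activeClient->print("X-GPS-Lat: " + String(gps.location.lat(), 6) + "\r\n");
  activeClient->print("X-GPS-Lng: " + String(gps.location.lng(), 6) + "\r\n");
  activeClient->print("Content-Type: multipart/form-data; boundary=" + boundary + "\r\n");
  activeClient->print("Content-Length: " + String(contentLength) + "\r\n\r\n");
  activeClient->print(head);
  // ...the JPEG body is now sent in 1024-byte chunks (next snippet)...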

C++
      // Sending the whole JPEG buffer in chunks to prevent crashes
      uint8_t *fbBuf = fb->buf;
      size_t fbLen = fb->len;
      size_t sent = 0;
      const size_t CHUNK_SIZE = 1024;

      while (sent < fbLen) {
        size_t toSend = CHUNK_SIZE;
        if (fbLen - sent < CHUNK_SIZE) {
          toSend = fbLen - sent;  // Last chunk
        }

        activeClient->write(fbBuf + sent, toSend);
        sent += toSend;
        delay(50);  // Buffer breathing room
      }

This 50ms delay between chunks ensures the SIM800L module does not get overwhelmed and drop packets over the slow 2G network.
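
For completeness, the request still has to be closed out after the last chunk. Continuing the hypothetical preamble sketch from above, that would look roughly like this:

C++
  // After the final chunk: close the multipart body and release the frame buffer.
  activeClient->print(tail);        // closing boundary from the preamble sketch
  esp_camera_fb_return(fb);         // hand the JPEG buffer back to the camera driver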

What's Next?

Getting the hardware to reliably capture and transmit data from a moving vehicle is only half the battle.

In Part 2, I will break down the cloud infrastructure. I will show how I deployed a Python backend to Hugging Face, used the Google Gemini API to analyze the images for crowd density, and synced it all in real-time to a React frontend.
