Enterprise · 5 months · 2023

IoT Edge Data Pipelines

Impact: Real-time IoT processing
Edge Devices: 500+
Processing: Real-time
Health Monitoring: Automated

Overview

Designed and implemented a complete data pipeline for IoT edge devices, handling real-time data ingestion, cleaning, transformation, and storage. The system monitors device health and manages data loads across the infrastructure.

The Challenge

Managing and processing high-volume data streams from hundreds of IoT edge devices with varying data quality and formats.

The Solution

Built Azure Functions to receive device data and queue it to EventHub, stored the raw, cleaned, and transformed data in MongoDB, and ran PySpark on Azure Databricks for large-scale data processing.
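The cleaning stage between raw and cleaned storage might look roughly like the sketch below. It is an illustration only, written in plain Python rather than PySpark so it runs anywhere. The field names (`device_id`, `deviceId`, `value`, `timestamp`, `ts`) and the function `clean_reading` are assumptions, not taken from the project.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def _first_present(raw: dict[str, Any], *keys: str) -> Any:
    """Return the first non-None value found under any of the given keys."""
    for key in keys:
        if raw.get(key) is not None:
            return raw[key]
    return None


def clean_reading(raw: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Normalize one raw device message, or return None if it is unusable.

    Hypothetical schema: devices may send either snake_case or camelCase
    keys, numeric values as strings, and timestamps as epoch seconds or
    ISO 8601 strings.
    """
    device_id = _first_present(raw, "device_id", "deviceId")
    value = _first_present(raw, "value")
    ts = _first_present(raw, "timestamp", "ts")
    if device_id is None or value is None or ts is None:
        return None  # a required field is missing

    try:
        value = float(value)
        if isinstance(ts, (int, float)):
            when = datetime.fromtimestamp(ts, tz=timezone.utc)
        else:
            when = datetime.fromisoformat(str(ts))
            if when.tzinfo is None:
                # Assumption: naive timestamps are already in UTC.
                when = when.replace(tzinfo=timezone.utc)
    except (TypeError, ValueError, OverflowError, OSError):
        return None  # unparseable value or timestamp

    return {
        "device_id": str(device_id),
        "value": value,
        "timestamp": when.astimezone(timezone.utc).isoformat(),
    }
```

Records that return `None` could be counted or routed to a rejects collection instead of being silently dropped. In a Databricks job the same rules would more likely be expressed as DataFrame column operations.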

Key Results

  • Processed data from 500+ edge devices

  • Real-time device health monitoring

  • Automated data quality checks
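The page does not say how device health is determined. One common approach, shown here purely as an assumption, is to flag any device that has not reported within a silence window. The function name `find_stale_devices` and the five-minute default are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_stale_devices(
    last_seen: dict[str, datetime],
    now: datetime,
    max_silence: timedelta = timedelta(minutes=5),  # assumed threshold
) -> list[str]:
    """Return the IDs of devices whose last message is older than max_silence."""
    return sorted(
        device_id
        for device_id, seen in last_seen.items()
        if now - seen > max_silence
    )


# Usage with a fixed clock so the result is reproducible.
now = datetime(2023, 6, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "edge-001": now - timedelta(minutes=1),   # recently active
    "edge-002": now - timedelta(minutes=30),  # silent for too long
}
stale = find_stale_devices(last_seen, now)
```

A scheduled job, such as one of the cron jobs listed in the tech stack, could run a check like this against the latest timestamp stored per device and raise an alert for each ID returned.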

Tech Stack

Azure Functions, Azure EventHub, MongoDB, PySpark, Azure Databricks, Cron Jobs

Categories

Azure, IoT, PySpark, Databricks