Embedded ML

Machine learning deployed on smartphones, smartwatches, and IoT devices has transformed how we interact with technology. Yet the engineering challenges of implementing AI algorithms within the severe constraints of embedded hardware—limited memory, minimal power budgets, and real-time requirements—remain formidable. This laboratory component provides hands-on experience with these challenges, bridging theoretical understanding with practical embedded system implementation.

Why Embedded ML for ML Systems Education?

Traditional machine learning education often focuses on algorithmic development in unconstrained cloud environments with abundant computational resources. While this approach builds important theoretical foundations, it can obscure the engineering realities that define most real-world AI deployments. Embedded machine learning provides a uniquely effective pedagogical framework for several key reasons, organized here from foundational requirements through learning effectiveness to real-world application:

Economic Accessibility: Professional-grade development boards cost $20-50, making hands-on learning accessible without requiring expensive cloud computing credits or specialized laboratory infrastructure. Students can own the hardware and continue experimenting with it beyond the formal course.

Immediate, Tangible Feedback: Physical interactions—LEDs indicating classification results, buzzers responding to audio events, motors controlled by gesture recognition—transform abstract algorithmic concepts into concrete, observable behaviors. This immediate feedback accelerates learning and debugging.

End-to-End System Understanding: Unlike cloud-based exercises where infrastructure is abstracted away, embedded systems require students to understand the complete pipeline from sensor data acquisition through inference to actuator control. This comprehensive view reveals the interdependencies that characterize real-world ML systems.

Resource Constraints Drive Engineering Excellence: Working within 2MB of RAM and 1MB of flash storage forces students to confront optimization decisions typically hidden in cloud deployments. Every design choice—from model architecture to data preprocessing—has immediate, measurable consequences for system performance.

Interdisciplinary Skill Development: Embedded ML bridges computer science, electrical engineering, and systems design, preparing students for the increasingly interdisciplinary nature of modern technology development.

Industry Relevance: The majority of deployed AI systems operate on edge devices rather than in data centers. Skills developed in embedded contexts directly transfer to mobile applications, IoT deployments, and autonomous systems.

Prerequisites and Preparation

Mathematical Background: Students should possess working knowledge of linear algebra, basic probability theory, and differential calculus. While advanced mathematical sophistication is not required, comfort with matrix operations and elementary optimization concepts will enhance learning outcomes.

Programming Competency: Proficiency in Python programming is essential. Familiarity with C/C++ accelerates progress but is not strictly required, as the introductory exercises provide adequate scaffolding.

Hardware Experience: No prior embedded systems experience is assumed. Laboratory exercises include comprehensive setup procedures and troubleshooting guidance appropriate for students new to hardware development.

Learning Objectives

Upon completion of the laboratory sequence, students will demonstrate competency in:

  1. Resource-Constrained Optimization: Deploy ML models within 2MB RAM constraints while achieving real-time inference performance on microcontroller hardware (a code sketch following this list illustrates the pattern).

  2. Power-Efficient System Design: Implement always-on sensing applications with battery life measured in months, not hours, through proper power management techniques.

  3. Multi-Modal Data Processing: Integrate vision, audio, and sensor data streams in unified embedded systems while maintaining performance constraints.

  4. Professional Development Workflows: Use industry-standard toolchains including TensorFlow Lite, Edge Impulse, and embedded debugging environments for complete development cycles.
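
To make the first objective concrete, the sketch below shows the skeleton of a TensorFlow Lite Micro deployment, where a statically allocated tensor arena turns the RAM budget into an explicit, compile-time decision. The model symbol (g_model_data), arena size, and operator list are illustrative assumptions, not values from any specific lab exercise.

```cpp
// Minimal TensorFlow Lite Micro inference skeleton (illustrative sketch).
#include <cstddef>
#include <cstring>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

extern const unsigned char g_model_data[];  // hypothetical model compiled in as a C array

// The tensor arena is the model's entire scratch memory. On a microcontroller
// it must fit in RAM alongside everything else, so its size is a hard budget.
constexpr size_t kArenaSize = 100 * 1024;
alignas(16) static uint8_t tensor_arena[kArenaSize];

static tflite::MicroInterpreter* interpreter = nullptr;

bool model_setup() {
  const tflite::Model* model = tflite::GetModel(g_model_data);

  // Register only the operators the model actually uses; this keeps flash usage down.
  static tflite::MicroMutableOpResolver<4> resolver;
  resolver.AddConv2D();
  resolver.AddDepthwiseConv2D();
  resolver.AddFullyConnected();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter static_interpreter(model, resolver,
                                                     tensor_arena, kArenaSize);
  interpreter = &static_interpreter;

  // Fails if the arena cannot hold the model's activations: the classic
  // resource-constraint error, fixed by shrinking the model or its inputs.
  return interpreter->AllocateTensors() == kTfLiteOk;
}

int8_t classify(const int8_t* input, size_t len) {
  std::memcpy(interpreter->input(0)->data.int8, input, len);
  interpreter->Invoke();
  return interpreter->output(0)->data.int8[0];  // quantized score for class 0
}
```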

Laboratory Exercise Categories

To achieve these learning objectives, the curriculum is organized into specific exercise categories, each targeting different aspects of embedded AI system development.

Computer Vision Applications

The computer vision laboratory sequence addresses the fundamental challenge of processing high-dimensional visual data on resource-constrained hardware, demonstrating the optimization techniques required for real-time performance within tight memory and compute budgets.

Image Classification Systems: Students implement object recognition algorithms that demonstrate trade-offs between model complexity and inference speed, paralleling computational challenges in smartphone cameras and autonomous vehicle systems.

Object Detection and Localization: Advanced exercises extend beyond classification to spatial object localization, implementing detection algorithms similar to those in security systems and industrial automation.

Vision-Language Integration: Cutting-edge exercises combine visual processing with natural language understanding, demonstrating how advanced AI functionality can be deployed on edge devices.

Audio and Temporal Data Processing

Audio processing laboratories focus on continuous data stream analysis while maintaining minimal power consumption, particularly relevant for always-on sensing applications where battery life is paramount.

Keyword Spotting Systems: Students implement voice interface systems demonstrating the engineering challenges of continuous audio monitoring while preserving battery life, paralleling approaches in commercial voice assistants.
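
The core engineering pattern here can be sketched in a few lines. The helper functions below (read_audio_block, run_kws_model, light_sleep_until_audio_ready) are hypothetical stand-ins for platform-specific APIs; the point is the ring-buffer-plus-duty-cycle structure, not any particular board's calls.

```cpp
// Illustrative always-on keyword-spotting loop (helper functions are hypothetical).
#include <cstdint>
#include <cstring>

constexpr int kSampleRate = 16000;           // 16 kHz mono audio (assumed)
constexpr int kWindowSamples = kSampleRate;  // one-second analysis window
constexpr int kBlockSamples = 512;           // samples fetched per wakeup

static int16_t window[kWindowSamples];

// Hypothetical platform hooks; each lab board provides its own equivalents.
extern int read_audio_block(int16_t* dst, int n);           // blocks until n samples arrive
extern float run_kws_model(const int16_t* samples, int n);  // returns keyword score in [0, 1]
extern void light_sleep_until_audio_ready();                // low-power wait

void kws_loop() {
  for (;;) {
    // Slide the window left by one block and append the newest samples.
    memmove(window, window + kBlockSamples,
            (kWindowSamples - kBlockSamples) * sizeof(int16_t));
    read_audio_block(window + kWindowSamples - kBlockSamples, kBlockSamples);

    if (run_kws_model(window, kWindowSamples) > 0.9f) {
      // Confident detection: wake the rest of the system here.
    }

    light_sleep_until_audio_ready();  // keep the CPU duty cycle, and power draw, low
  }
}
```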

Motion and Activity Recognition: Time-series analysis exercises using inertial measurement data teach pattern extraction from continuous sensor streams, mirroring functionality in fitness tracking and health monitoring devices.
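
A representative first step in these exercises is reducing the raw sensor stream to per-window features. The sketch below computes two classic time-domain features over a fixed window of accelerometer data; the sample rate and window length are assumptions chosen for illustration.

```cpp
// Windowed feature extraction for IMU-based activity recognition (illustrative).
#include <cmath>
#include <cstddef>

constexpr int kSampleHz = 100;                 // assumed accelerometer sample rate
constexpr int kWindowSamples = 2 * kSampleHz;  // two-second window, a common choice

struct ImuSample { float ax, ay, az; };  // acceleration along each axis

// Mean magnitude and RMS over one window. A small classifier (decision tree,
// compact neural network) then consumes these features instead of raw samples.
void extract_features(const ImuSample* w, size_t n, float* mean_mag, float* rms) {
  float sum = 0.0f;
  float sum_sq = 0.0f;
  for (size_t i = 0; i < n; ++i) {
    const float mag = std::sqrt(w[i].ax * w[i].ax + w[i].ay * w[i].ay + w[i].az * w[i].az);
    sum += mag;
    sum_sq += mag * mag;
  }
  *mean_mag = sum / static_cast<float>(n);
  *rms = std::sqrt(sum_sq / static_cast<float>(n));
}
```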

Audio Event Classification: Advanced exercises extend beyond speech recognition to general acoustic event detection for security monitoring and environmental sensing applications.

Laboratory Platform Compatibility

These exercise categories are implemented across multiple hardware platforms, each offering different capabilities and constraints. This platform diversity ensures students experience the full spectrum of embedded AI deployment scenarios.

Table 1 outlines the laboratory exercise categories alongside the hardware platforms that support them, enabling curriculum planners to design learning sequences appropriate for available resources.

Table 1: Laboratory Exercise Compatibility Matrix

Exercise categories: Getting Started, Image Classification, Object Detection, Keyword Spotting, Motion Classification, No-Code Applications, Large Language Models, Vision Language Models, DSP/Feature Engineering

Supported platforms: Arduino Nicla, XIAOML Kit, Grove Vision AI V2, Raspberry Pi

Core Data Modalities

The laboratory exercises described above are organized around three fundamental data modalities that represent the majority of embedded AI applications. Understanding these modalities provides important theoretical context for the engineering challenges students will encounter:

Visual Data Processing: Image and video analysis demands the highest computational resources, requiring optimization techniques to process high-dimensional data within severe memory constraints while maintaining real-time performance.
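
A back-of-envelope calculation shows why. The frame size below assumes a QVGA camera, a common embedded resolution; exact figures vary by board.

```cpp
// Why a single camera frame strains a microcontroller's RAM (worked arithmetic).
constexpr int kWidth = 320, kHeight = 240;  // QVGA (assumed resolution)
constexpr int kBytesPerPixel = 3;           // RGB888
constexpr int kFrameBytes = kWidth * kHeight * kBytesPerPixel;
static_assert(kFrameBytes == 230400, "one raw frame is exactly 225 KB");
// Against a 2MB RAM budget, one raw frame plus a model's tensor arena can claim
// a large share of memory, which is why the labs lean on downscaling, cropping,
// and int8 quantization before inference.
```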

Temporal Audio Analysis: Audio processing and time-series sensor analysis require continuous data stream processing while maintaining ultra-low power consumption, demonstrating critical trade-offs between computational complexity and energy efficiency.

Sensor Fusion and Multi-Modal Systems: Advanced applications combine multiple data sources to achieve functionality impossible with single-modality approaches, managing increased system complexity while maintaining embedded performance constraints.

Getting Started

With this foundation of learning objectives, exercise categories, platform options, and theoretical understanding in place, students are ready to begin their embedded ML journey.

Students should begin by consulting the Hardware Kits chapter to understand platform capabilities and select appropriate hardware based on their learning objectives and budget constraints.

The IDE Setup chapter provides comprehensive setup procedures, software installation guidance, and troubleshooting resources for all supported platforms.

Next Steps

After completing hardware selection and development environment setup, you’re ready to begin the laboratory exercises. The setup process varies by platform and typically takes 30-60 minutes to complete.

For detailed platform-specific setup instructions, refer to the individual setup guides:

- XIAOML Kit Setup
- Arduino Nicla Vision Setup
- Grove Vision AI V2 Setup
- Raspberry Pi Setup
