What is the M5Stack Echo Pyramid Smart Speaker Base? It is a smart speaker base designed for voice interaction applications, compatible with the M5Atom, M5AtomS3, and M5AtomS3R series controllers. The pyramid-shaped base combines an ES8311 audio codec, an ES7210 audio ADC with AEC (Acoustic Echo Cancellation) support, a MEMS microphone, and an AW87559 Class-D amplifier, making it a practical platform for building smart speakers, voice assistants, and IoT voice gateways.
Product Overview
The M5Stack Echo Pyramid (SKU: M5-A167) gives developers and makers in the M5Stack ecosystem a dedicated audio platform for voice-enabled applications. Unlike simpler speaker modules, it splits the audio path across specialized chips: a codec for playback, a separate ADC for voice capture, a Class-D amplifier, and a programmable clock generator.
The device features a distinctive pyramid shape measuring 83.6 x 83.6 x 56.7mm and weighing just 100.7g. This compact form factor houses an impressive array of audio components while maintaining an aesthetically pleasing design suitable for desktop use. The official M5Stack documentation provides comprehensive technical details and programming guides for this innovative device.
At the heart of the Echo Pyramid's intelligent control is the STM32G030F6P6 auxiliary microcontroller, which manages the touch interface and RGB LED lighting independently from the main controller. This architecture ensures responsive touch controls and smooth LED animations without burdening the primary ESP32 processor running your applications.
Technical Specifications
| Specification | Value |
|---|---|
| Product Name | M5Stack Echo Pyramid Smart Speaker Base |
| SKU | M5-A167 |
| Compatible Controllers | Atom, AtomS3, AtomS3R series |
| Auxiliary MCU | STM32G030F6P6 (touch/RGB management) |
| Audio Codec | ES8311 (24-bit, 16kHz-64kHz) |
| Audio ADC | ES7210 with AEC support |
| Microphone | LMA3729T381-0Y3S MEMS |
| Amplifier | AW87559 Class-D speaker driver |
| Clock Generator | Si5351 programmable low-jitter MCLK |
| LEDs | 28x WS2812 RGB LEDs (4 strips, 7 per strip) |
| Touch Interface | Dual-side capacitive (4 detection points) |
| Input Power | DC 5V |
| Standby Current | 14.92mA (without controller) |
| Operating Current | 578.47mA (max volume with controller) |
| Expansion Port | 1x HY2.0-4P Grove (I2C) |
| Operating Temperature | -10°C ~ 60°C |
| Dimensions | 83.6 x 83.6 x 56.7mm |
| Weight | 100.7g |
Hardware Features and Core Components
ES8311 High-Performance Audio Codec
The ES8311 audio codec is a critical component that enables the Echo Pyramid's impressive audio capabilities. This professional-grade codec provides 24-bit audio resolution with sampling rates ranging from 16kHz to 64kHz, ensuring high-fidelity sound reproduction for both playback and recording applications.
The ES8311 communicates over I2S, the standard serial interface for carrying digital audio between chips. The codec supports full-duplex operation, meaning it can handle audio input and output streams simultaneously. This is crucial for real-time voice interaction applications such as voice assistants and communication systems.
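To give a feel for the numbers behind an I2S link, the sketch below computes the bit clock and raw PCM payload rate for a few sample rates inside the 16kHz to 64kHz range quoted above. The 32-bit slot width and two-slot frame are common I2S conventions assumed here for illustration; the actual configuration used with the Echo Pyramid is not stated in this article.

```python
def i2s_bit_clock_hz(sample_rate_hz: int, slot_bits: int = 32, slots: int = 2) -> int:
    """Bit clock needed to shift `slots` slots of `slot_bits` bits per audio frame.

    slot_bits=32 and slots=2 are assumed defaults, not values taken from the product docs.
    """
    return sample_rate_hz * slot_bits * slots


def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw PCM payload rate, with each sample padded up to a whole number of bytes."""
    bytes_per_sample = (bits_per_sample + 7) // 8
    return sample_rate_hz * bytes_per_sample * channels


# Print bit clock and mono 24-bit payload rate for a few rates in the quoted range.
for rate in (16_000, 32_000, 48_000, 64_000):
    print(rate, i2s_bit_clock_hz(rate), pcm_bytes_per_second(rate, 24))
```

Because the link is full duplex, the same bit clock carries the playback and capture streams at the same time on separate data lines.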
ES7210 Audio ADC with Acoustic Echo Cancellation Support
Voice capture quality is paramount for smart speaker applications. The Echo Pyramid pairs an LMA3729T381-0Y3S MEMS microphone with the ES7210 audio ADC, which is specified with Acoustic Echo Cancellation (AEC) support, a requirement for far-field voice capture in voice assistant applications.
AEC technology eliminates the echo that occurs when the microphone picks up sound from the device's own speaker, enabling clear voice recognition even when the device is playing audio. This makes the Echo Pyramid suitable for hands-free voice control scenarios where users need to issue commands while music or other audio is playing.
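The idea behind echo cancellation can be illustrated with a small adaptive filter. The sketch below is a generic normalized LMS (NLMS) canceller in plain Python, run on synthetic data. It is not the Echo Pyramid's implementation, and the function names, tap count, and simulated echo path are all assumptions chosen for the demonstration.

```python
import random


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of `ref` from `mic` (generic NLMS sketch)."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference sample first
    cleaned = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = m - echo_estimate  # what remains after removing the estimated echo
        norm = sum(x * x for x in history) + eps
        step = mu * error / norm
        weights = [w + step * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def mean_power(samples):
    """Average squared amplitude of a block of samples."""
    return sum(s * s for s in samples) / len(samples)


# Synthetic demo: the "speaker" signal is white noise, and the "microphone"
# hears it through a made-up two-path echo (delays of 2 and 5 samples).
rng = random.Random(0)
ref = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
mic = [
    0.6 * (ref[n - 2] if n >= 2 else 0.0) + 0.3 * (ref[n - 5] if n >= 5 else 0.0)
    for n in range(len(ref))
]
cleaned = nlms_echo_cancel(mic, ref)
print("echo power before:", mean_power(mic[-1000:]))
print("echo power after: ", mean_power(cleaned[-1000:]))
```

In a real device the microphone signal would also contain the user's voice, which the filter leaves in the output because it is uncorrelated with the speaker reference.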
AW87559 Class-D Amplifier
The AW87559 Class-D amplifier provides efficient audio amplification with minimal power dissipation. Class-D amplifiers are known for their high efficiency, typically over 90%, which means less heat generation and, in battery-powered applications, longer runtime. The amplifier drives the built-in speaker, delivering clear sound output suitable for voice prompts, music playback, and alert notifications.
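The link between efficiency and heat is simple arithmetic, sketched below. The 1 W output level and the 50% comparison figure are hypothetical values chosen for illustration; the article does not state the speaker's output power.

```python
def amplifier_heat_w(output_power_w: float, efficiency: float) -> float:
    """Power lost as heat for a given audio output power and amplifier efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    input_power_w = output_power_w / efficiency
    return input_power_w - output_power_w


# Hypothetical 1 W of audio output at two efficiency levels.
print("90% efficient:", amplifier_heat_w(1.0, 0.90), "W of heat")
print("50% efficient:", amplifier_heat_w(1.0, 0.50), "W of heat")
```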
Si5351 Programmable Clock Generator
Audio quality depends heavily on precise timing, and the Si5351 programmable clock generator provides the low-jitter master clock (MCLK) required by the ES8311 codec. This chip can generate multiple clock frequencies, allowing the system to support various audio sampling rates while maintaining signal integrity and minimizing audio distortion.
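As a rough illustration of why a programmable clock source is useful, the sketch below computes the master clock needed for several sample rates. The 256x MCLK-to-sample-rate ratio is a common audio convention assumed here, not a value confirmed for the Echo Pyramid; the real ratio depends on how the codec is configured.

```python
def mclk_hz(sample_rate_hz: int, ratio: int = 256) -> int:
    """Master clock frequency for a sample rate, assuming MCLK = ratio * fs."""
    return sample_rate_hz * ratio


# Different sample rates need different, non-harmonically related master clocks,
# which is what a programmable generator can supply from one reference.
for rate in (16_000, 44_100, 48_000):
    print(f"{rate} Hz -> MCLK {mclk_hz(rate) / 1e6:.4f} MHz")
```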
STM32G030F6P6 Auxiliary Controller
The STM32G030F6P6 microcontroller serves as a dedicated co-processor responsible for managing the touch interface and RGB LED lighting. This ARM Cortex-M0+ based MCU handles real-time tasks independently, freeing the main ESP32 processor to focus on application logic and network communication. The STM32 communicates with the main controller via I2C, providing touch event notifications and accepting LED control commands.
RGB LED Indicators
The Echo Pyramid features 28 WS2812 RGB LEDs arranged in four strips with seven LEDs per strip. These individually addressable LEDs can display various colors and animations to indicate device status, audio levels, or custom visual effects. The LED placement on the pyramid faces provides 360-degree visibility, making the device visually appealing from any angle.
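The four-by-seven layout maps naturally onto a flat buffer of 28 colors. The sketch below builds an audio level meter frame on the host side. The strip-major ordering and the function names are assumptions for illustration; sending the frame to the LEDs would go through whatever driver the M5Stack libraries provide, which is not shown here.

```python
STRIPS = 4
LEDS_PER_STRIP = 7


def led_index(strip: int, pos: int) -> int:
    """Flat buffer index for LED `pos` on `strip`, assuming strip-major order."""
    if not (0 <= strip < STRIPS and 0 <= pos < LEDS_PER_STRIP):
        raise IndexError("strip or position out of range")
    return strip * LEDS_PER_STRIP + pos


def level_frame(level: float, on=(0, 64, 0), off=(0, 0, 0)):
    """Return 28 RGB tuples showing the same bar-graph level on every strip."""
    level = min(max(level, 0.0), 1.0)
    lit = int(level * LEDS_PER_STRIP + 0.5)  # round half up
    frame = [off] * (STRIPS * LEDS_PER_STRIP)
    for strip in range(STRIPS):
        for pos in range(lit):
            frame[led_index(strip, pos)] = on
    return frame


frame = level_frame(0.5)
print(sum(1 for color in frame if color != (0, 0, 0)), "of", len(frame), "LEDs lit")
```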
Capacitive Touch Controls
Dual-side capacitive touch-slider zones provide intuitive control without physical buttons. With four detection points (two per side), users can control volume, navigate menus, or trigger custom functions through simple touch gestures. The touch interface is managed by the STM32 co-processor for responsive and reliable operation.
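One way to turn two detection points per side into a volume control is to look at the order in which they are touched. The sketch below assumes a hypothetical event format, an ordered list of point ids where 0 is the lower point and 1 is the upper point on one side; the actual data reported by the STM32 co-processor is not described in this article.

```python
def swipe_direction(points) -> int:
    """Return +1 for an upward swipe, -1 for a downward swipe, 0 for a tap or no touch.

    `points` is a hypothetical ordered list of touched point ids (0 = lower, 1 = upper).
    """
    if len(points) < 2:
        return 0
    first, last = points[0], points[-1]
    if last > first:
        return 1
    if last < first:
        return -1
    return 0


def apply_swipe(volume: int, points, step: int = 10) -> int:
    """Adjust a 0-100 volume by one step in the swipe direction, clamped to range."""
    return min(100, max(0, volume + step * swipe_direction(points)))


print(apply_swipe(50, [0, 1]))  # swipe up
print(apply_swipe(50, [1, 0]))  # swipe down
```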
Key Features
- Professional Audio Quality: ES8311 codec delivers 24-bit high-fidelity audio with support for multiple sampling rates
- Advanced Voice Capture: MEMS microphone and ES7210 audio ADC with AEC support enable clear far-field voice recognition
- Efficient Amplification: AW87559 Class-D amplifier provides powerful sound output with minimal heat
- Visual Feedback: 28 programmable RGB LEDs for status indication and visual effects
- Touch Control: Dual-side capacitive touch zones for intuitive interaction
- Precise Timing: Si5351 clock generator ensures low-jitter audio performance
- Expandable: Grove I2C port for connecting additional sensors and modules
- Compact Design: Pyramid form factor optimized for desktop use
- Ecosystem Compatible: Works seamlessly with Atom/AtomS3/AtomS3R controllers
Compatibility with M5Stack Controllers
The Echo Pyramid is designed to work with the entire Atom series ecosystem. The base features a socket that accommodates the compact M5Atom (ESP32-PICO-D4), M5AtomS3 (ESP32-S3), and M5AtomS3R controllers, providing the processing power and wireless connectivity required for smart speaker applications.
The M5AtomS3 with its ESP32-S3 processor is particularly well-suited for voice applications, offering AI acceleration capabilities for on-device wake word detection and voice recognition. The original M5Atom provides a cost-effective option for basic voice control projects, while the M5AtomS3R offers enhanced peripherals and improved performance.
Related M5Stack products that complement the Echo Pyramid include the M5StickV for computer vision applications, the M5Paper for e-ink display integration, and various Grove modules that can be connected via the expansion port to add sensors and actuators to your voice projects.
Applications
Smart Speakers and Desktop Voice Assistants
The primary application for the Echo Pyramid is building custom smart speakers and voice assistants. With its professional audio components, developers can create devices that rival commercial products in sound quality while maintaining full control over the software and privacy. The combination of ESP32's wireless capabilities and the Echo Pyramid's audio system enables integration with popular voice platforms like Alexa, Google Assistant, or custom AI models.
Voice Control Hubs and IoT Voice Gateways
The Echo Pyramid serves as an excellent central hub for voice-controlled smart home systems. By connecting to the M5Stack ecosystem of sensors and controllers via WiFi, Bluetooth, or the Grove I2C port, the device can become a voice gateway that controls lights, appliances, and other IoT devices throughout the home.
Local or Cloud-Based Voice Interaction Prototyping
For developers and researchers working on voice technology, the Echo Pyramid provides a robust hardware platform for prototyping. The M5Stack documentation includes examples for integrating with cloud speech recognition services as well as implementing local wake word detection using TensorFlow Lite on the ESP32-S3.
Educational Projects
The Echo Pyramid is an excellent educational tool for teaching audio processing, embedded systems, and IoT development. Students can learn about digital audio protocols (I2S), audio signal processing, and voice interface design while building functional projects. The M5AtomS3 platform supports both Arduino IDE and MicroPython, making it accessible to beginners while providing depth for advanced users.
I/O and Expansion Capabilities
The Echo Pyramid includes a HY2.0-4P Grove port that provides I2C connectivity for expansion modules. This allows users to connect a wide range of sensors and actuators from the Grove ecosystem, including temperature sensors, motion detectors, displays, and relay modules. The I2C bus is shared between the expansion port and the internal STM32 co-processor, enabling seamless integration of additional hardware.
Popular expansion options include connecting infrared modules for controlling traditional appliances, IMU sensors for gesture control, or relay modules for controlling high-power devices.
Pros and Cons
Pros
- Professional-grade audio components deliver excellent sound quality
- Integrated AEC enables effective far-field voice capture
- Dedicated STM32 co-processor ensures responsive touch and LED control
- Compatible with multiple Atom controller variants
- Programmable RGB LEDs provide visual feedback options
- Grove expansion port allows sensor integration
- Compact pyramid design fits well on desks and shelves
- Comprehensive documentation and examples
Cons
- Requires separate Atom controller (not included)
- 5V power requirement may limit portable battery operation
- Limited to mono audio output (no stereo capability)
- Touch controls may require calibration for optimal sensitivity
- Premium audio components result in higher cost than basic speaker modules
Getting Started
To begin using the M5Stack Echo Pyramid, you will need an Atom series controller such as the M5Atom, M5AtomS3, or M5AtomS3R. The controller connects to the base via the exposed pins, and the entire assembly is powered through the USB-C port on the Atom.
The official M5Stack documentation provides Arduino IDE examples, MicroPython libraries, and UiFlow block-based programming support. For voice assistant projects, M5Stack offers integration guides for popular platforms including OpenAI's Realtime API and open-source alternatives like XiaoZhi.
Conclusion
The M5Stack Echo Pyramid Smart Speaker Base is a thoughtfully designed platform that brings dedicated audio hardware to the maker and developer community. By integrating specialized components like the ES8311 codec, the ES7210 audio ADC with AEC support, and the AW87559 amplifier, M5Stack has created a device aimed at commercial smart speaker audio quality while keeping the flexibility and openness of the M5Stack ecosystem.
Whether you are building a custom voice assistant, developing IoT voice control systems, or prototyping the next generation of smart home devices, the Echo Pyramid provides the audio foundation you need. Its compatibility with the Atom series ensures access to powerful ESP32 processors with WiFi and Bluetooth connectivity, while the Grove expansion port and RGB LED indicators offer endless possibilities for customization and extension.
For those interested in exploring voice technology, the combination of the M5AtomS3's AI capabilities and the Echo Pyramid's professional audio system creates a platform that can handle everything from simple voice commands to complex natural language processing applications. The comprehensive documentation and growing community support make it easier than ever to get started with voice-enabled projects.
Frequently Asked Questions
What is M5Stack Echo Pyramid and what does it do?
The M5Stack Echo Pyramid is a smart speaker base designed for voice interaction applications. It provides professional audio capabilities including high-fidelity sound playback, far-field voice capture with echo cancellation, and touch controls. When paired with an Atom series controller, it becomes a complete smart speaker platform suitable for building voice assistants, IoT voice gateways, and interactive audio devices.
Do I need to purchase a separate controller for the Echo Pyramid?
Yes, the Echo Pyramid is a base/station that requires a separate Atom series controller to function. It is compatible with the M5Atom, M5AtomS3, and M5AtomS3R controllers, which provide the processing power and wireless connectivity. The controller connects to the base and sits in the top socket of the pyramid.
What makes the Echo Pyramid different from the Atomic Echo Base?
While both devices use the ES8311 audio codec, the Echo Pyramid is a larger, pyramid-shaped base with additional features including 28 RGB LEDs, dual-side capacitive touch controls, and a more powerful speaker system. The Atomic Echo Base is a smaller, rectangular module designed for compact applications, while the Echo Pyramid is optimized for desktop use with enhanced visual feedback and control options.
Can I use the Echo Pyramid for music playback?
Yes, the Echo Pyramid's ES8311 audio codec supports sampling rates from 16kHz to 64kHz with 24-bit resolution, providing high-fidelity audio suitable for music playback. The AW87559 Class-D amplifier and built-in speaker deliver clear, powerful sound. However, note that the Echo Pyramid is designed for mono audio output, not stereo.
What programming platforms are supported?
The Echo Pyramid supports multiple development platforms through the Atom controller. These include Arduino IDE, ESP-IDF, UiFlow (MicroPython), and PlatformIO. M5Stack provides libraries and example code for audio playback, recording, and voice recognition integration on each platform.
How does the Acoustic Echo Cancellation (AEC) work?
The Echo Pyramid's ES7210 audio ADC is specified with Acoustic Echo Cancellation support. Echo cancellation uses the audio being sent to the speaker as a reference and subtracts its estimated echo from the microphone input in real time. The result is clean voice capture even when the device is playing sound, which is essential for voice assistant applications where users might speak while audio is playing.
What can I connect to the Grove expansion port?
The Grove I2C port allows you to connect a wide range of sensors and modules from the Grove ecosystem. Popular connections include temperature and humidity sensors for environmental monitoring, PIR motion sensors for presence detection, OLED displays for visual output, and relay modules for controlling external devices. The port supplies 5V power alongside the I2C data lines.
Is the Echo Pyramid suitable for battery-powered projects?
The Echo Pyramid requires a 5V DC power supply and consumes approximately 578mA at maximum volume with a controller attached. While it can technically be powered from a battery, the relatively high current consumption makes it better suited for stationary, mains-powered applications. For portable voice projects, consider the M5StickV or M5StickC with external audio modules.
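A back-of-the-envelope runtime estimate shows why the current draw matters. The sketch below uses the 578.47mA maximum-volume figure from the specifications; the 2000mAh 3.7V battery and the 85% boost converter efficiency are hypothetical values picked for illustration.

```python
def runtime_hours(battery_mah: float, battery_v: float, load_ma: float,
                  load_v: float = 5.0, converter_efficiency: float = 0.85) -> float:
    """Estimated runtime when a battery feeds a 5V load through a boost converter."""
    usable_energy_mwh = battery_mah * battery_v * converter_efficiency
    load_power_mw = load_ma * load_v
    return usable_energy_mwh / load_power_mw


# Worst case: maximum volume with a controller attached (578.47mA at 5V).
print(f"{runtime_hours(2000, 3.7, 578.47):.2f} hours at maximum volume")
```

Typical use at lower volume would draw less than the worst-case figure, so this is a lower bound under the stated assumptions.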
Where can I find documentation and examples?
Complete documentation for the Echo Pyramid is available on the M5Stack official documentation site. This includes hardware specifications, pinout diagrams, Arduino libraries, MicroPython examples, and tutorials for building voice assistants. The M5Stack GitHub repository also contains example code and community-contributed projects.
What are the RGB LEDs used for?
The 28 WS2812 RGB LEDs arranged around the pyramid faces serve multiple purposes. They can display device status (connected, listening, processing), visualize audio levels during playback, provide ambient lighting effects, or show custom animations programmed by the user. The LEDs are individually addressable, allowing for complex lighting patterns controlled by the STM32 co-processor.
Ready to build your own smart speaker? Explore the M5Stack ecosystem and discover the possibilities with the M5Stack Echo Pyramid Smart Speaker Base today.
