This article explains how to port a custom ALSA-based audio codec driver in a Linux BSP, using the WM8960 audio codec and NXP’s IMX7 processor as the example. The source code for this Linux BSP is available here under the GNU General Public License.

Advanced Linux Sound Architecture (ALSA)

Advanced Linux Sound Architecture (ALSA) is a software framework and part of the Linux kernel that provides an application programming interface (API) for sound card device drivers. Some of the goals of the ALSA project at its inception were automatic configuration of sound-card hardware and graceful handling of multiple sound devices in a system. ALSA is released under the GNU General Public License (GPL) and the GNU Lesser General Public License (LGPL).

The overall project goal of the ALSA System on Chip (ASoC) layer is to provide better ALSA support for embedded system-on-chip processors (e.g. pxa2xx, au1x00, iMX, etc.) and portable audio codecs.

Typically, ALSA supports up to eight cards, numbered 0 through 7; each card is a physical or logical kernel device capable of input and output. Furthermore, each card may also be addressed by its ID, which is an explanatory string such as “Headset” or “mic”. A card has devices, numbered starting at 0; a device may be of playback type, meaning it outputs sound from the computer, or some other type such as capture, control, timer, or sequencer. Device number 0 is used by default when no particular device is specified.

A device may have subdevices, numbered starting at 0; a subdevice represents some relevant sound endpoint for the device, such as a speaker pair. If the subdevice is not specified, or if subdevice number −1 is specified, then any available subdevice is used.
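The card, device and subdevice numbering above is what ALSA device names of the form hw:card,device,subdevice encode. The following is a minimal sketch of how such a name breaks down, applying the defaults described above (device 0, subdevice -1 for "any"). The function name parse_hw_name is hypothetical and is not part of any ALSA library.

```python
def parse_hw_name(name):
    """Split an ALSA-style name such as 'hw:0,0' into its parts.

    Hypothetical helper for illustration only. The card is kept as a
    string because it may be an index ("0") or a card ID ("Headset").
    """
    prefix, _, args = name.partition(":")
    if prefix not in ("hw", "plughw"):
        raise ValueError("unsupported device name: " + name)
    parts = args.split(",") if args else []
    if len(parts) > 3:
        raise ValueError("too many fields in: " + name)
    card = parts[0] if parts else "0"
    device = int(parts[1]) if len(parts) > 1 else 0       # default device 0
    subdevice = int(parts[2]) if len(parts) > 2 else -1   # -1 means "any"
    return card, device, subdevice


print(parse_hw_name("hw:1,0"))
```

A name with no subdevice field therefore resolves to subdevice -1, which matches the "any available subdevice" behaviour described above.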

An ALSA stream is a data flow representing sound. The most common stream format is PCM, which must be produced so that it matches the parameters of the hardware: sampling rate, sample width, sample encoding, number of channels, and so on. The section Basics of Audio below explains these terms in more detail.

Basics of Audio

An audio signal is a representation of sound, typically as an electrical voltage. Audio signals have frequencies in the audio frequency range of roughly 20 to 20,000 Hz (the limits of human hearing). Audio signals are of two basic types:

  • Analog: Analog refers to audio recorded using methods that replicate the original sound waves. Example: human speech, sound from musical instruments.
  • Digital: Digital audio is recorded by taking samples of the original sound wave at a specified rate. Example: CDs, MP3 files.

Conversion of analog audio signals to digital audio is usually done by pulse-code modulation (PCM). An analog waveform, such as a sine wave, is sampled at regular intervals and each sample is quantized to a numeric value; this process is called modulation or analog-to-digital conversion (ADC). The reverse process is called demodulation or digital-to-analog conversion (DAC). Here are a few PCM terms and concepts that are used in any audio device driver.

Samples: In PCM audio, both input and output are represented as samples. A single sample represents the amplitude of sound at a certain point in time. To represent an actual audio signal, a lot of individual samples are needed. For example, VoIP/SIP/telephone audio uses 8000 samples per second, CD audio uses 44100 samples per second, and DVD audio typically uses 48000 samples per second.
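To make sampling and quantization concrete, here is a small sketch that samples a sine wave and quantizes each sample to a signed 16-bit integer. The function name pcm_encode and the chosen tone (1 kHz sampled at the telephone rate of 8000 samples per second) are assumptions made for illustration.

```python
import math


def pcm_encode(freq_hz, rate_hz, n_samples, bits=16):
    """Sample a unit-amplitude sine wave and quantize it to signed integers."""
    full_scale = 2 ** (bits - 1) - 1          # 32767 for 16-bit samples
    samples = []
    for n in range(n_samples):
        t = n / rate_hz                       # time of the n-th sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)
        samples.append(round(amplitude * full_scale))
    return samples


# One full cycle of a 1 kHz tone at 8000 samples/sec is 8 samples.
samples = pcm_encode(freq_hz=1000, rate_hz=8000, n_samples=8)
print(samples)
```

Each list entry is one sample. A higher sampling rate gives more entries per second, and a larger bit width gives finer amplitude steps.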

Channels: An audio channel is a single audio signal path in a storage device or audio system, used in operations such as multi-track recording and sound reinforcement. Channels are generally classified into two types:

  • MONO channel: Uses a single signal to feed all the speakers.
  • STEREO channels: Uses more than one (usually two) signals to feed different speakers with different signals.

Frame: A frame holds exactly one sample for every channel, all taken at the same point in time. For mono sound, a frame consists of a single sample, whereas for stereo (2 channels) each frame consists of two samples.

Frame size: Frame size is the size in bytes of each frame, that is, the number of channels multiplied by the number of bytes per sample. For example, with 8-bit samples a mono frame is one byte (8 bits), while a stereo frame of 16-bit samples is four bytes.

Sample Rate: Sample rate, or simply “rate”, is the number of samples taken per second on each channel, which is also the number of frames per second. PCM sound consists of a flow of sound frames.

Data rate: Data rate is the number of bytes that must be recorded or provided per second at a given frame size and rate.
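The relationship between these quantities is simple arithmetic. The sketch below assumes each sample occupies a whole number of bytes; the helper names are hypothetical.

```python
def frame_size_bytes(channels, sample_bits):
    """Bytes in one frame: one sample per channel."""
    return channels * sample_bits // 8


def data_rate_bytes_per_sec(channels, sample_bits, rate_hz):
    """Bytes that must be produced or consumed every second."""
    return frame_size_bytes(channels, sample_bits) * rate_hz


# CD audio: stereo, 16-bit samples, 44100 frames per second.
print(frame_size_bytes(2, 16))                 # 4
print(data_rate_bytes_per_sec(2, 16, 44100))   # 176400

# Telephone-quality audio: mono, 8-bit samples, 8000 frames per second.
print(data_rate_bytes_per_sec(1, 8, 8000))     # 8000
```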

Period: A period is the chunk of frames processed between two consecutive hardware interrupts. The period time is the interval between those interrupts.

Period size: This is the size of each period, expressed in frames (or in bytes). These concepts are very useful when you are working on an Audio HAL for different hardware devices such as HDMI audio, Bluetooth SCO audio devices, etc.
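As a rough illustration of how period size relates to time and memory, the sketch below converts a period given in frames into milliseconds and bytes. The figures used (1024 frames per period, 48000 Hz, 16-bit stereo) are assumed example values, not settings taken from the WM8960 driver.

```python
def period_time_ms(period_frames, rate_hz):
    """Time between two period interrupts, in milliseconds."""
    return 1000.0 * period_frames / rate_hz


def period_bytes(period_frames, channels, sample_bits):
    """Memory occupied by one period."""
    return period_frames * channels * sample_bits // 8


# Assumed example: 1024-frame periods of 16-bit stereo audio at 48 kHz.
print(round(period_time_ms(1024, 48000), 2))   # 21.33
print(period_bytes(1024, 2, 16))               # 4096
```

Smaller periods mean more frequent interrupts and lower latency; larger periods reduce interrupt load at the cost of latency.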

ALSA System on Chip (ASoC) layer

The ALSA System on Chip (ASoC) layer is designed for SoC audio. Its overall goal is to provide better ALSA support for embedded system-on-chip processors and portable audio CODECs. The ASoC layer also provides the following features:

  • Codec independence. Allows reuse of codec drivers on other platforms and machines.
  • Easy I2S/PCM audio interface setup between codec and SoC. Each SoC interface and codec registers its audio interface capabilities with the core, and these are subsequently matched and configured when the application hw params are known.
  • Dynamic Audio Power Management (DAPM). DAPM automatically sets the codec to its minimum power state at all times. This includes powering up/down internal power blocks depending on the internal codec audio routing and any active streams.
  • Pop and click reduction. Pops and clicks can be reduced by powering the codec up/down in the correct sequence (including using digital mute). ASoC signals the codec when to change power states.
  • Machine specific controls: Allow machines to add controls to the sound card. e.g. volume control for speaker amp.
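The capability matching mentioned in the list above can be pictured as an intersection of what each side supports. The following is a toy model only: the DaiCaps class, the match_caps function and the capability values are all hypothetical and do not reflect the kernel's actual data structures or the real WM8960 and SAI limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DaiCaps:
    """Toy description of what one audio interface supports."""
    rates: frozenset
    formats: frozenset
    channels_min: int
    channels_max: int


def match_caps(cpu, codec):
    """Return the capabilities both sides share, or None if they cannot link."""
    ch_min = max(cpu.channels_min, codec.channels_min)
    ch_max = min(cpu.channels_max, codec.channels_max)
    rates = cpu.rates & codec.rates
    formats = cpu.formats & codec.formats
    if ch_min > ch_max or not rates or not formats:
        return None
    return DaiCaps(rates, formats, ch_min, ch_max)


# Hypothetical capability sets, chosen only to show the intersection.
cpu = DaiCaps(frozenset({8000, 16000, 44100, 48000, 96000}),
              frozenset({"S16_LE", "S24_LE", "S32_LE"}), 1, 2)
codec = DaiCaps(frozenset({8000, 44100, 48000}),
                frozenset({"S16_LE", "S24_LE"}), 1, 2)

common = match_caps(cpu, codec)
print(sorted(common.rates), sorted(common.formats))
```

An application's requested hw params then have to fall inside the shared set for the stream to be opened.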

To achieve all this, ASoC basically splits an embedded audio system into 3 components:

  • Codec driver: The codec driver is platform independent and contains audio controls, audio interface capabilities, codec dapm definition and codec IO functions. Example audio codec : wm8750, wm8960, sgtl5000 etc.

  • Platform driver: The platform driver contains the audio DMA engine and audio interface drivers (e.g. I2S, AC97, PCM) for that platform.

  • Machine driver: The machine driver handles any machine specific controls and audio events, e.g. turning on an amp at the start of playback. The machine driver glues together the Platform and Codec drivers.

Codec driver

Codec drivers are responsible for configuring the audio codec. Currently, the stereo CODEC (WM8958, WM8960, and WM8962), 7.1 CODEC (cs42888), and AM/FM CODEC (si4763) drivers are implemented using the ASoC architecture. These sound card drivers are built independently of one another. The stereo sound card supports stereo playback and capture. Codec drivers are present under the sound/soc/codecs directory.

  • wm8960.c - Codec driver file
  • wm8960.h - Header file for stereo CODEC driver

Platform driver

The platform driver contains the audio DMA engine and audio interface drivers (e.g. I2S, AC97, PCM) for that platform. On another SoC family, the McASP driver for Sitara would fall into this category.

The stereo audio CODEC is controlled through the I2C interface. Audio data is transferred between the user data buffer and the SSI FIFO through a DMA channel, which is selected according to the audio sample bits. AUDMUX is used to set up the path between the SSI port and the output port that connects to the CODEC. The CODEC works in master mode and provides the BCLK and LRCLK, which can be configured according to the audio sample rate.
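For a plain I2S link, the dependence of BCLK and LRCLK on the sample rate can be sketched as follows. This assumes two slots per frame and a slot width equal to the sample width; the real WM8960 driver derives its clocks from its own divider and PLL settings, so treat the helper names and numbers as illustrative only. The 12288000 Hz figure is the MCLK rate used in the device tree example in this article.

```python
def i2s_clocks(rate_hz, slot_bits, channels=2):
    """Return (bclk_hz, lrclk_hz) for a simple I2S frame layout."""
    lrclk = rate_hz                          # one LRCLK cycle per frame
    bclk = rate_hz * slot_bits * channels    # one BCLK cycle per bit
    return bclk, lrclk


def mclk_fs_ratio(mclk_hz, rate_hz):
    """Integer MCLK/fs ratio, or None if MCLK is not a multiple of the rate."""
    ratio, remainder = divmod(mclk_hz, rate_hz)
    return ratio if remainder == 0 else None


print(i2s_clocks(48000, 16))             # (1536000, 48000)
print(mclk_fs_ratio(12288000, 48000))    # 256
print(mclk_fs_ratio(12288000, 44100))    # None
```

The last line shows why the 44.1 kHz family of rates generally needs a different master clock (or a PLL) than the 48 kHz family.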

These files are present under the linux/sound/soc/fsl/ directory.

  • imx-pcm-dma.c - Platform layer for PCM driver in IMX
  • imx-pcm.h - Header file for PCM driver.
  • fsl_ssi.c - SSI CPU DAI driver
  • fsl_ssi.h - Header file for SSI CPU DAI driver and SSI register definitions
  • fsl_sai.c - SAI CPU DAI driver
  • fsl_sai.h - Header file for SAI CPU DAI driver & SAI register definitions

Machine driver

The machine driver handles any machine specific controls and audio events. These files are under the linux/sound/soc/fsl directory.

  • imx-wm8960.c - Machine layer for IMX and audio codec (CODEC as I2S Master)

Source code

All three layers are usually configured through the Device Tree for a specific board. The code block below is an example for the WM8960 audio codec on the IMX7 SabreSD board.

Device tree entry for wm8960 in IMX7 platform

/ {

    /* Sound card node: binds the machine driver to the CPU DAI and the codec */
    sound {
                compatible = "fsl,imx7d-evk-wm8960";
                model = "wm8960-audio";
                cpu-dai = <&sai1>;
                audio-codec = <&codec>;
                /* JD2: hp detect high for headphone*/
                hp-det = <2 0>;
                hp-det-gpios = <&gpio2 28 0>;
                /* Each pair is "sink", "source" */
                audio-routing =
                        "Headphone Jack", "HP_L",
                        "Headphone Jack", "HP_R",
                        "Ext Spk", "SPK_LP",
                        "Ext Spk", "SPK_LN",
                        "Ext Spk", "SPK_RP",
                        "Ext Spk", "SPK_RN",
                        "LINPUT1", "Main MIC",
                        "Main MIC", "MICB";
                assigned-clocks = <&clks IMX7D_AUDIO_MCLK_ROOT_SRC>,
                                  <&clks IMX7D_AUDIO_MCLK_ROOT_CLK>;
                assigned-clock-parents = <&clks IMX7D_PLL_AUDIO_POST_DIV>;
                assigned-clock-rates = <0>, <12288000>;
    };
};

/* CPU DAI: SAI1 carries the audio data to and from the codec */
&sai1 {
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_sai1>;
        assigned-clocks = <&clks IMX7D_SAI1_ROOT_SRC>,
                          <&clks IMX7D_SAI1_ROOT_CLK>;
        assigned-clock-parents = <&clks IMX7D_PLL_AUDIO_POST_DIV>;
        assigned-clock-rates = <0>, <36864000>;
        status = "okay";
};

/* Control interface: the WM8960 sits on I2C4 at address 0x1a */
&i2c4 {
        clock-frequency = <100000>;
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_i2c4>;
        status = "okay";

        codec: wm8960@1a {
                compatible = "wlf,wm8960";
                reg = <0x1a>;
                clocks = <&clks IMX7D_AUDIO_MCLK_ROOT_CLK>;
                clock-names = "mclk";
        };
};

To enable WM8960 audio codec driver support on the IMX7, the corresponding options have to be enabled in the kernel defconfig.


Useful commands

Below are a few useful commands for testing an ALSA-based audio codec on a Linux platform.

  • To know the ALSA driver version
$ cat /proc/asound/version
Advanced Linux Sound Architecture Driver Version k4.9.11.
  • To view the registered sound cards (audio codec information),
$ cat /proc/asound/cards 
 0 [wm8960audio ]: wm8960-audio - wm8960-audio wm8960-audio
 1 [sii902xaudio ]: sii902x-audio - sii902x-audio sii902x-audio
  • To see the list of available capture devices,
$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 0: wm8960audio [wm8960-audio], device 0: HiFi wm8960-hifi-0 []
  Subdevices: 1/1
  Subdevice #0: subdevice #0
  • To view the list of playback devices,
$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: wm8960audio [wm8960-audio], device 0: HiFi wm8960-hifi-0 []
Subdevices: 1/1
Subdevice #0: subdevice #0
card 1: sii902xaudio [sii902x-audio], device 0: sii902x hdmi snd-soc-dummy-dai-0 []
Subdevices: 1/1
Subdevice #0: subdevice #0
  • To play any .wav file,
$ aplay sample_audio_8k.wav
  • To record using the microphone,
$ arecord -t wav -c 1 -d 4 -v tmp_file.wav 

In the above examples, the card ID and device ID can be given as arguments. If they are not given, the default card ID and device ID are used.

  • One can specify the card ID and device ID as below. The default is 0,0.
$ aplay -Dhw:0,1 sample_audio_8k.wav
$ arecord -Dhw:1,0 -t wav -f S16_LE -c 2 -d 10 -v usb_file.wav

Where hw:X,Y comes from this mapping of your hardware – in this case, X is the card number, while Y is the device number.
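As a small convenience, the hw:X,Y names can be derived from the aplay -l listing itself. The sketch below parses the sample output shown earlier in this section; the function name list_hw_devices and the regular expression are assumptions based on that output format, not an official interface.

```python
import re

# Matches lines such as:
# card 0: wm8960audio [wm8960-audio], device 0: HiFi wm8960-hifi-0 []
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+): (.*)$")


def list_hw_devices(aplay_output):
    """Return (hw_name, card_id) pairs found in 'aplay -l' style text."""
    devices = []
    for line in aplay_output.splitlines():
        match = CARD_LINE.match(line.strip())
        if match:
            card, card_id, _name, device, _desc = match.groups()
            devices.append(("hw:%s,%s" % (card, device), card_id))
    return devices


SAMPLE = """**** List of PLAYBACK Hardware Devices ****
card 0: wm8960audio [wm8960-audio], device 0: HiFi wm8960-hifi-0 []
Subdevices: 1/1
Subdevice #0: subdevice #0
card 1: sii902xaudio [sii902x-audio], device 0: sii902x hdmi snd-soc-dummy-dai-0 []
Subdevices: 1/1
Subdevice #0: subdevice #0
"""

for hw_name, card_id in list_hw_devices(SAMPLE):
    print(hw_name, card_id)
```

On a real board the same text could be obtained by capturing the output of aplay -l, and each resulting name can be passed to aplay or arecord with the -D option.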

  • To increase the volume or MIC gain, one can use the amixer command as below.
$ amixer sset Headphone 127
$ amixer sset "Mic Boost" 127
$ amixer sset PCM 225