User:DanielDrake/MMP audio: Difference between revisions

Latest revision as of 15:02, 27 March 2013

MMP audio is split up into several drivers that operate the audio-related functionality on the SoC, and a codec driver. This page aims to document my understanding of each software component.

Driver components

sound/soc/pxa/mmp-sspa.c

This file drives the Synchronous Serial Port for Audio (SSPA). We program the SSPA registers to request playback or recording, to start or stop a stream, and to set parameters such as format and sample rate.

sound/soc/pxa/mmp-pcm.c

This driver coordinates the transfer of PCM (i.e. audio) data from the codec into memory, or the equivalent transfer in the opposite direction.

sound/soc/codecs/rt5631.c

This is the codec driver, which turns digital PCM data into audible, analog sounds for playback, and the opposite for recording.

drivers/dma/mmp_tdma.c

This is a DMA engine backend which does the actual transfer of PCM data from the codec into memory (audio SRAM) by driving the audio DMA hardware present on the SoC.

Data transfer and DMA

Taking playback as an example:

PCM audio data is written directly into the audio SRAM (this is the way that the driver is currently set up). The ASoC core calls into the sspa, mmp-pcm and codec layers as appropriate so that things get set up for playback.

The SSPA layer has previously communicated some DMA parameters to the mmp-pcm layer. Specifically, it has set the DMA device address (dev_addr) in a structure shared between the two via snd_soc_dai_set_dma_data(). In the playback case, dev_addr is set to SSPA register 0x80 (SSPA_TX_DATA).

The mmp-pcm layer then uses the snd-dmaengine API to request that the audio data be copied (via the tdma driver) from audio SRAM to the write port of the SSPA, passing on the dev_addr that was set by the SSPA layer.

The above description might suggest that audio data could alternatively be reproduced by writing data to that register, but Mitch Bradley's investigation on MMP2 shows that this register simply cannot be used in this way, and also that the value of dev_addr is actually completely unused:
Reading from the SSPA RX FIFO via the SSPA_RX_DATA register and writing to the TX FIFO via the SSPA_TX_DATA register doesn't appear to be possible -- the only way to move data to/from the FIFOs is by using the DMA engines.
On channel 0, which is the playback channel, all DMA source addresses are interpreted as being in audio SRAM, and all DMA destination addresses are ignored entirely, as channel 0 is hardwired to feed into the corresponding SSPA unit's TX FIFO. Consequently, whether destination address increment or destination address hold mode is chosen makes no difference at all. (Similar story for the capture channel.)
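The behaviour described above can be illustrated with a toy model. This is only a sketch of my reading of those findings, not of the real hardware or of the mmp_tdma driver: the function name `adma_playback`, the byte-array stand-in for audio SRAM and the list standing in for the TX FIFO are all made up for illustration.

```python
def adma_playback(asram, src_offset, length, dst_addr, tx_fifo):
    """Toy model of playback channel 0 (hypothetical, for illustration only).

    The source offset is always interpreted as an offset into audio SRAM.
    dst_addr is accepted but never consulted, because the channel is
    hardwired to feed the TX FIFO.
    """
    tx_fifo.extend(asram[src_offset:src_offset + length])


# Stand-in for a small piece of audio SRAM holding PCM bytes.
asram = bytes(range(16))

fifo_a, fifo_b = [], []
# Same source region, two very different "destination addresses".
adma_playback(asram, 4, 8, 0x80, fifo_a)
adma_playback(asram, 4, 8, 0xDEADBEEF, fifo_b)

print(fifo_a == fifo_b)
```

In this model, the value passed as the destination has no influence on what reaches the FIFO, which is the sense in which dev_addr is described as unused.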

The data is encoded as a digital electronic signal and is sent to the codec, which then turns it into an analog signal sent to the speakers.

Memory management

Audio SRAM

This system revolves around the audio SRAM. This is an area of memory, CPU-accessible via regular memory access instructions, with the special property that the ADMA engine can be used to transfer data between audio SRAM and the codec. It also appears that this is the only option for this particular type of data transfer, as the documented PIO registers do not seem to work (not that having the CPU send every word to a register is a desirable option).

Audio SRAM comes from a larger SRAM pool, which is shared between audio, video, etc. The device tree defines the allocation between the various components. On the Linux side, arch/arm/mach-mmp/sram.c makes the individual SRAM allocations available to drivers as genpools. In the case of XO-4, the firmware allocates 128 KB of the SRAM to audio ("asram").

The mmp-pcm driver reserves two regions of asram, one as a recording buffer and one as a playback buffer. These buffer sizes are also determined by device tree information (marvell,buffer-sizes property). Big buffers are desirable: in the recording case, for example, a big buffer allows a good amount of recorded PCM data to be "queued up" in memory while the system is busy doing something else.
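To get a feel for how much audio a buffer can queue up, here is a rough calculation. It is a sketch under assumptions that are mine rather than measured: a 48000 Hz stereo stream with 16-bit samples, and a buffer of 62 KB (the XO-4 figure quoted on this page) counted as 62 * 1024 bytes. The helper name `buffer_seconds` is made up for illustration.

```python
def buffer_seconds(buffer_bytes, rate_hz, channels, bytes_per_sample):
    """Return how many seconds of PCM audio fit in a buffer of the given size."""
    bytes_per_second = rate_hz * channels * bytes_per_sample
    return buffer_bytes / bytes_per_second


# Assumed stream parameters: 48000 Hz, 2 channels, 2 bytes per sample.
# Assumed buffer size: 62 KB, taken as 62 * 1024 bytes.
seconds = buffer_seconds(62 * 1024, 48000, 2, 2)
print(f"{seconds:.3f} seconds of audio fit in the buffer")
```

With these assumptions the buffer holds roughly a third of a second of audio; lower sample rates or mono streams would stretch that proportionally.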

The mmp_tdma driver also reserves a small region of asram, which it needs in order to pass the ADMA hardware the descriptors that cause the DMA transfers to happen. In a quick playback test here, the tdma driver reserved 256 bytes for descriptors. So even though we aim for big buffers, we must be careful not to eat up the entire asram allocation in mmp-pcm.

On XO-4 (with 128 KB of SRAM assigned to asram), we assign 62 KB to the recording buffer, 62 KB to the playback buffer, and 1 KB to DMA descriptors in each direction.
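The XO-4 split can be checked with a little arithmetic. The sketch below is not driver code; `asram_budget` and the dictionary keys are hypothetical names, and 1 KB is assumed to mean 1024 bytes.

```python
KIB = 1024  # assumption: the sizes on this page are in units of 1024 bytes


def asram_budget(total, allocations):
    """Return the bytes left after all allocations; raise if they do not fit."""
    used = sum(allocations.values())
    if used > total:
        raise ValueError(
            f"allocations need {used} bytes but only {total} are available"
        )
    return total - used


# The XO-4 split described above (key names are illustrative only).
xo4 = {
    "capture_buffer": 62 * KIB,
    "playback_buffer": 62 * KIB,
    "capture_descriptors": 1 * KIB,
    "playback_descriptors": 1 * KIB,
}

spare = asram_budget(128 * KIB, xo4)
print(spare // KIB, "KiB of asram left unassigned")
```

Under these assumptions the four regions add up to 126 KB, so 2 KB of the asram allocation stays free. Asking for two 64 KB buffers instead would leave no room for descriptors, and the helper raises an error in that case.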

Is this DMA?

Even though some form of DMA has undeniably happened (e.g. recorded PCM data appeared in CPU-accessible memory without the CPU's direct involvement), this is not exactly what other systems would regard as true DMA. In other circumstances, DMA would be defined as such data arriving in main memory, even if that data was already CPU-accessible in (e.g.) PCI mapped memory.

Now that the recorded data has arrived in audio SRAM, the CPU must retrieve it word by word, likely copying it into main memory somewhere. And the audio SRAM is very slow. According to measurements by Mitch Bradley on MMP2, a copy from main memory to audio SRAM goes at 34 MB/sec (compared to an uncached write-combining copy from main memory to framebuffer memory going at 205 MB/sec). So we are losing CPU cycles performing slow word-by-word memory accesses on all audio data coming in and going out.
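As a rough way to put those throughput figures in context, the sketch below estimates what fraction of each second the CPU would spend copying a single stream. The assumptions are mine: a 48000 Hz stereo 16-bit stream, MB taken as 1,000,000 bytes, the measured main-memory-to-SRAM rate applying in both directions, and the CPU doing nothing else during the copy. `copy_time_fraction` is a made-up helper name.

```python
def copy_time_fraction(stream_bytes_per_sec, copy_bytes_per_sec):
    """Fraction of each second spent copying one stream at the given copy rate."""
    return stream_bytes_per_sec / copy_bytes_per_sec


MB = 1_000_000  # assumption: the measurements use decimal megabytes

# Assumed stream: 48000 Hz, 2 channels, 2 bytes per sample.
stream_rate = 48000 * 2 * 2

asram_fraction = copy_time_fraction(stream_rate, 34 * MB)
framebuffer_fraction = copy_time_fraction(stream_rate, 205 * MB)

print(f"audio SRAM copy:   {asram_fraction * 100:.3f}% of CPU time")
print(f"framebuffer-speed: {framebuffer_fraction * 100:.3f}% of CPU time")
print(f"slowdown factor:   {asram_fraction / framebuffer_fraction:.1f}x")
```

The absolute numbers depend heavily on the assumed stream format, but the ratio between the two cases is simply 205/34, i.e. the audio SRAM copy costs about six times as much CPU time as a copy at framebuffer speed would.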

The ADMA engine is documented as only being able to copy into SRAM. However, there is a separate MDMA engine which is documented as capable of copying between SRAM and main memory. A future enhancement would therefore be to engage the MDMA interface after ADMA, in order to make the recorded PCM data appear in (fast) main memory without having burdened the CPU in the process.

Simultaneous capture/playback

Simultaneous active capture/playback streams are supported, but there are some constraints.

Sample rate

There is a requirement within the codec (maybe SSPA too?) that simultaneous capture and playback streams must run at the same rate. This is because there is only one register that configures the sample rate for both directions. This is expressed in the codec driver via the symmetric_rates ASoC functionality: the codec communicates this requirement to the higher levels, which attempt to enforce rate symmetry when parallel streams are created.

This also means that there must be symmetry in the rates we offer for capture and playback: they must be identical. Otherwise, we can end up in a sticky situation like we did in Trac ticket #12498:

  • 44100 and 48000 are offered as playback rates, but 48000 is the only available capture rate
  • An application opens a playback stream at rate 44100 and leaves it running
  • An application opens a capture stream. Even before it has had a chance to request a specific rate, the system rejects this (returning failure from snd_pcm_open), because it realises that it cannot record at the only available rate (48000) while also meeting the symmetry requirement (44100).
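The scenario in the list above can be modelled in a few lines. This is a simplified sketch of the symmetry rule as I understand it, not the ASoC implementation; `open_stream` and its arguments are hypothetical.

```python
def open_stream(offered_rates, active_rate=None):
    """Return the rates a newly opened stream may still use.

    offered_rates: rates advertised for this direction.
    active_rate:   rate of an already running stream in the other
                   direction, or None if there is none.
    Raises ValueError when the symmetry rule leaves no usable rate,
    modelling the failure at open time.
    """
    if active_rate is None:
        return set(offered_rates)
    usable = set(offered_rates) & {active_rate}
    if not usable:
        raise ValueError(
            f"no offered rate matches the active stream's {active_rate} Hz"
        )
    return usable


playback_rates = {44100, 48000}
capture_rates = {48000}  # the asymmetric offering from the scenario above

# Playback opens first and settles on 44100 Hz.
assert 44100 in open_stream(playback_rates)

# Capture is then opened while playback runs at 44100 Hz.
try:
    open_stream(capture_rates, active_rate=44100)
    capture_opened = True
except ValueError:
    capture_opened = False

print("capture stream opened:", capture_opened)
```

If the capture direction offered the same two rates as playback, the same call would succeed and leave 44100 Hz as the only choice, which is why identical rate lists avoid the failure.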

Format

There is also a requirement (within SSPA at least?) that both streams have the same format. I still need to check whether this is represented anywhere in the driver.