MI_CpuCopy*

Syntax

#include <nitro/mi.h>

void MI_CpuCopy8( const void* src, void* dest, u32 size );
void MI_CpuCopy16( const void* src, void* dest, u32 size );
void MI_CpuCopy32( const void* src, void* dest, u32 size );
void MI_CpuCopyFast( const void* src, void* dest, u32 size );
void MI_CpuCopy( const void* src, void* dest, u32 size );

Arguments

src	The transfer source address.
dest	The transfer destination address.
size	The transfer size, in bytes.

Return Values

None.

Description

Uses the CPU to copy memory.

MI_CpuCopy8() selects the most efficient copy method based on the transfer source and destination addresses, and carries out the copy in 16-bit and 32-bit units as appropriate. The source and destination addresses do not need to be aligned. Single-byte access is never performed.

MI_CpuCopy16() copies in 16-bit units. Both the transfer source address and the transfer destination address must be 2-byte aligned.

MI_CpuCopy32() copies in 32-bit units. Both the transfer source address and the transfer destination address must be 4-byte aligned.

MI_CpuCopyFast() copies at high speed in 32-bit units. Both the transfer source address and the transfer destination address must be 4-byte aligned. The transfer size must be an integral multiple of 4 bytes, but it does not have to be an integral multiple of 32 bytes. After the data has been transferred in 32-byte units, the remainder is handled by the same process as MI_CpuCopy32().
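The two-phase behavior described above can be modeled in portable C. This is only a sketch of the idea (whole 32-byte blocks first, then the remaining 32-bit words), not the SDK implementation. The name sketch_copy_fast and the use of <stdint.h> types in place of u32 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Portable model of "32-byte blocks, then a 32-bit tail".
   sketch_copy_fast is a hypothetical name, not an SDK function. */
static void sketch_copy_fast(const void *src, void *dest, uint32_t size)
{
    const uint32_t *s = (const uint32_t *)src;
    uint32_t *d = (uint32_t *)dest;
    uint32_t blocks = size / 32u;          /* whole 32-byte blocks */
    uint32_t tail = (size % 32u) / 4u;     /* leftover 32-bit words */
    uint32_t b;
    uint32_t i;

    /* Same preconditions the text gives for MI_CpuCopyFast(). */
    assert(((uintptr_t)src & 3u) == 0u);
    assert(((uintptr_t)dest & 3u) == 0u);
    assert((size & 3u) == 0u);

    /* Bulk phase: eight words (32 bytes) per iteration. */
    for (b = 0u; b < blocks; b++) {
        for (i = 0u; i < 8u; i++) {
            d[i] = s[i];
        }
        s += 8;
        d += 8;
    }

    /* Tail phase: one word at a time, as a plain 32-bit copy would. */
    for (i = 0u; i < tail; i++) {
        d[i] = s[i];
    }
}
```

With a size of 44 bytes, for example, the model runs one bulk iteration (32 bytes) and then copies three tail words (12 bytes).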

Therefore, MI_CpuCopyFast() and MI_CpuCopy32() run the same transfer code when the transfer size is less than 32 bytes. However, MI_CpuCopyFast() first checks whether the remaining size is smaller than 32 bytes, and that check adds overhead. In this case MI_CpuCopy32() is slightly faster. For large transfer sizes, MI_CpuCopyFast() is faster.

Based on these considerations, you could implement the following code to transfer data efficiently using one function:

static inline void myCpuCopy32( const void *src, void *dest, u32 size )
{
    if ( size >= 0x20 )
    {
        MIi_CpuCopyFast( src, dest, size );
    }
    else
    {
        MIi_CpuCopy32( src, dest, size );
    }
}

However, treat 32 bytes as a theoretical guideline only. The size at which a difference in speed actually appears is not necessarily 32 bytes, because it depends on the cache state of the regions involved in the transfer and on the transfer addresses.

Like MI_CpuCopy8(), MI_CpuCopy() selects the most efficient method possible based on the transfer destination address. In addition to copying in 16-bit and 32-bit units, it also copies in 32-byte units where appropriate. There are no restrictions on the alignment of the source or destination address or on the transfer size. If the alignment and transfer size are known in advance, it is recommended that you call MIi_CpuCopyFast() or MIi_CpuCopy32() directly, as appropriate.
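The selection rules on this page can be summarized as a small decision function. The following is a sketch under stated assumptions, not the logic MI_CpuCopy() actually uses. The names SketchCopyChoice and sketch_choose_copy are hypothetical. Requiring the size to match the access width for the 16-bit and 32-bit cases is a conservative assumption, since the text states a size rule only for MI_CpuCopyFast(). The 0x20 threshold is the theoretical value discussed above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the routine a caller might pick. */
typedef enum {
    SKETCH_COPY_FAST,   /* MI_CpuCopyFast(): 4-byte aligned, size >= 32 */
    SKETCH_COPY32,      /* MI_CpuCopy32():   4-byte aligned, small size */
    SKETCH_COPY16,      /* MI_CpuCopy16():   2-byte aligned             */
    SKETCH_COPY_ANY     /* MI_CpuCopy8() or MI_CpuCopy(): no alignment  */
} SketchCopyChoice;

/* Picks a routine from the addresses and the size only; copies nothing. */
static SketchCopyChoice sketch_choose_copy(const void *src, const void *dest,
                                           uint32_t size)
{
    /* OR the values so that one mask tests all three at once. */
    uintptr_t bits = (uintptr_t)src | (uintptr_t)dest | (uintptr_t)size;

    if ((bits & 3u) == 0u) {
        return (size >= 0x20u) ? SKETCH_COPY_FAST : SKETCH_COPY32;
    }
    if ((bits & 1u) == 0u) {
        return SKETCH_COPY16;
    }
    return SKETCH_COPY_ANY;
}

/* Usage example with made-up addresses (never dereferenced). */
static void sketch_choose_copy_example(void)
{
    const void *a = (const void *)(uintptr_t)0x02000000u;
    const void *b = (const void *)(uintptr_t)0x02000102u;

    assert(sketch_choose_copy(a, a, 0x40u) == SKETCH_COPY_FAST);
    assert(sketch_choose_copy(a, b, 0x40u) == SKETCH_COPY16);
}
```

When the alignment and size are fixed at build time, the same decision can be made once by the programmer, which is the case the recommendation above addresses.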

Internal Operation

Processing is done by the CPU only and does not use the DMA controller. It does not use a system call. The MI_CpuCopy8 function copies only in units of 16 or 32 bits, so accessing VRAM directly will not cause problems.
The MI_CpuCopy function sometimes copies in 8-bit units depending on the target address, so do not use it to access VRAM directly. (It can be used in TWL mode if extended VRAM has been configured.)
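One common way to change a single byte without issuing an 8-bit access is a 16-bit read-modify-write, sketched below. This illustrates the general technique only and makes no claim about how MI_CpuCopy8() is implemented. The name sketch_store_byte16 is hypothetical, and the sketch assumes the lower byte of each 16-bit cell is the one at the even address (little-endian order).

```c
#include <assert.h>
#include <stdint.h>

/* Stores `value` at byte offset `offset` in a buffer of 16-bit cells,
   using one 16-bit read and one 16-bit write and no 8-bit access.
   Hypothetical helper; assumes little-endian byte order in each cell. */
static void sketch_store_byte16(volatile uint16_t *base, uint32_t offset,
                                uint8_t value)
{
    volatile uint16_t *cell;
    uint16_t old;

    /* The buffer itself must be 2-byte aligned. */
    assert(((uintptr_t)base & 1u) == 0u);

    cell = base + (offset >> 1);   /* the cell that holds the byte */
    old = *cell;                   /* 16-bit read */

    if (offset & 1u) {
        /* Odd offset: replace the upper byte, keep the lower byte. */
        *cell = (uint16_t)((old & 0x00FFu) | ((uint16_t)value << 8));
    } else {
        /* Even offset: replace the lower byte, keep the upper byte. */
        *cell = (uint16_t)((old & 0xFF00u) | value);
    }
}
```

A copy routine built this way needs the extra read only at an odd first or last byte. The interior of the transfer can still proceed in plain 16-bit or 32-bit units.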

Revision History

2007/12/10 Added the MI_CpuCopy function.
2005/07/07 Added section about the speed of MI_CpuCopy32 and MI_CpuCopyFast.
2004/04/29 Added a description of MI_CpuCopy8.
2004/03/29 Described that system calls are not used.
2003/12/01 Initial version.
