+ Add a small cleanup script (build/tools/cleanup-apps.sh) + Minor formatting of hello-gl2 sample.
128 lines
4.1 KiB
Plaintext
128 lines
4.1 KiB
Plaintext
Android NDK & ARM NEON instruction set extension support
|
|
--------------------------------------------------------
|
|
|
|
Introduction:
|
|
-------------
|
|
|
|
Android NDK r3 added support for the new 'armeabi-v7a' ARM-based ABI
|
|
that allows native code to use two useful instruction set extenstions:
|
|
|
|
- Thumb-2, which provides performance comparable to 32-bit ARM
|
|
instructions with similar compactness to Thumb-1
|
|
|
|
- VFPv3, which provides hardware FPU registers and computations,
|
|
to boost floating point performance significantly.
|
|
|
|
More specifically, by default 'armeabi-v7a' only supports
|
|
VFPv3-D16 which only uses/requires 16 hardware FPU 64-bit registers.
|
|
|
|
More information about this can be read in docs/CPU-ARCH-ABIS.TXT
|
|
|
|
The ARMv7 Architecture Reference Manual also defines another optional
|
|
instruction set extension known as "ARM Advanced SIMD", nick-named
|
|
"NEON". It provides:
|
|
|
|
- A set of interesting scalar/vector instructions and registers
|
|
(the latter are mapped to the same chip area than the FPU ones),
|
|
comparable to MMX/SSE/3DNow! in the x86 world.
|
|
|
|
- VFPv3-D32 as a requirement (i.e. 32 hardware FPU 64-bit registers,
|
|
instead of the minimum of 16).
|
|
|
|
Not all ARMv7-based Android devices will support NEON, but those that
|
|
do may benefit in significant ways from the scalar/vector instructions.
|
|
|
|
The NDK supports the compilation of modules or even specific source
|
|
files with support for NEON. What this means is that a specific compiler
|
|
flag will be used to enable the use of GCC ARM Neon intrinsics and
|
|
VFPv3-D32 at the same time. The intrinsics are described here:
|
|
|
|
http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html
|
|
|
|
|
|
LOCAL_ARM_NEON:
|
|
---------------
|
|
|
|
Define LOCAL_ARM_NEON to 'true' in your module definition, and the NDK
|
|
will build all its source files with NEON support. This can be useful if
|
|
you want to build a static or shared library that specifically contains
|
|
NEON code paths.
|
|
|
|
|
|
Using the .neon suffix:
|
|
-----------------------
|
|
|
|
When listing sources files in your LOCAL_SRC_FILES variable, you now have
|
|
the option of using the .neon suffix to indicate that you want to
|
|
corresponding source(s) to be built with Neon support. For example:
|
|
|
|
LOCAL_SRC_FILES := foo.c.neon bar.c
|
|
|
|
Will only build 'foo.c' with NEON support.
|
|
|
|
Note that the .neon suffix can be used with the .arm suffix too (used to
|
|
specify the 32-bit ARM instruction set for non-NEON instructions), but must
|
|
appear after it.
|
|
|
|
In other words, 'foo.c.arm.neon' works, but 'foo.c.neon.arm' does NOT.
|
|
|
|
|
|
Build Requirements:
|
|
------------------
|
|
|
|
Neon support only works when targetting the 'armeabi-v7a' ABI, otherwise the
|
|
NDK build scripts will complain and abort. It is important to use checks like
|
|
the following in your Android.mk:
|
|
|
|
# define a static library containing our NEON code
|
|
ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
|
|
include $(CLEAR_VARS)
|
|
LOCAL_MODULE := mylib-neon
|
|
LOCAL_SRC_FILES := mylib-neon.c
|
|
LOCAL_ARM_NEON := true
|
|
include $(BUILD_STATIC_LIBRARY)
|
|
endif # TARGET_ARCH_ABI == armeabi-v7a
|
|
|
|
|
|
Runtime Detection:
|
|
------------------
|
|
|
|
As said previously, NOT ALL ARMv7-BASED ANDROID DEVICES WILL SUPPORT NEON !
|
|
It is thus crucial to perform runtime detection to know if the NEON-capable
|
|
machine code can be run on the target device.
|
|
|
|
To do that, use the 'cpufeatures' library that comes with this NDK. To lean
|
|
more about it, see docs/CPU-FEATURES.TXT.
|
|
|
|
You should explicitely check that android_getCpuFamily() returns
|
|
ANDROID_CPU_FAMILY_ARM, and that android_getCpuFeatures() returns a value
|
|
that has the ANDROID_CPU_ARM_FEATURE_NEON flag set, as in:
|
|
|
|
#include <cpu-features.h>
|
|
|
|
...
|
|
...
|
|
|
|
if (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
|
|
(android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0)
|
|
{
|
|
// use NEON-optimized routines
|
|
...
|
|
}
|
|
else
|
|
{
|
|
// use non-NEON fallback routines instead
|
|
...
|
|
}
|
|
|
|
...
|
|
|
|
Sample code:
|
|
------------
|
|
|
|
Look at the source code for the "hello-neon" sample in this NDK for an example
|
|
on how to use the 'cpufeatures' library and Neon intrinsics at the same time.
|
|
|
|
This implements a tiny benchmark for a FIR filter loop using a C version, and
|
|
a NEON-optimized one for devices that support it.
|