-O3: reduce code size.
-DTF_LITE_STATIC_MEMORY: causes bugs on some cores.
+DTFLITE_EMULATE_FLOAT: emulating float calculation with fixed-point is more robust.
Signed-off-by: jihandong <jihandong@xiaomi.com>
The second argument of vgetq_lane_s32(__a, __b) must be a compile-time constant, so unroll the for loop. Also correct the passed parameters.
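
A minimal sketch of the unrolling, not the code from this patch. The helper name sum_lanes_s32 is hypothetical, and the scalar fallback is only there so the sketch also builds on hosts without NEON:

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Sum the four lanes of a 128-bit vector (hypothetical helper). */
static int32_t sum_lanes_s32(const int32_t src[4])
{
  int32x4_t v = vld1q_s32(src);

  /* A loop such as
   *   for (i = 0; i < 4; i++) sum += vgetq_lane_s32(v, i);
   * is typically rejected because the lane index i is not a
   * constant expression, so each lane is read with a literal index.
   */
  return vgetq_lane_s32(v, 0) + vgetq_lane_s32(v, 1) +
         vgetq_lane_s32(v, 2) + vgetq_lane_s32(v, 3);
}
#else
/* Scalar fallback for targets without NEON. */
static int32_t sum_lanes_s32(const int32_t src[4])
{
  return src[0] + src[1] + src[2] + src[3];
}
#endif
```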
Signed-off-by: xinhaiteng <xinhaiteng@xiaomi.com>
The complete implementation now lives separately in mLearning/tflite-micro/operators/neon, so delete this part.
Signed-off-by: xinhaiteng <xinhaiteng@xiaomi.com>
Cortex-A compilation options are added to tflite-micro and cmsis-nn, and new operator compilation environments are configured.
Signed-off-by: xinhaiteng <xinhaiteng@xiaomi.com>
VELAPLATFO-25411
Building on CMSIS-NN, the Add operator was optimized with NEON: each loop iteration applies the offsets and performs the addition for eight input and output elements.
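
A simplified scalar sketch of the "eight elements per loop" structure, not the NEON code from this patch. The function name add_s8_offsets is hypothetical, and the fixed-point rescaling that a real quantized Add performs between the offset and the clamp is deliberately omitted:

```c
#include <stddef.h>
#include <stdint.h>

/* Saturate a 32-bit intermediate to the int8 range. */
static int8_t clamp_s8(int32_t v)
{
  if (v < -128) return -128;
  if (v > 127) return 127;
  return (int8_t)v;
}

/* Hypothetical helper: out = clamp((in1 + off1) + (in2 + off2) + out_off).
 * Requantization (multiplier/shift) is left out of this sketch.
 */
static void add_s8_offsets(const int8_t *in1, const int8_t *in2,
                           int32_t in1_offset, int32_t in2_offset,
                           int32_t out_offset, int8_t *out, size_t n)
{
  size_t i = 0;

  /* Main loop: one block of eight elements per iteration. A NEON
   * version would handle this block with vector loads and adds.
   */
  for (; i + 8 <= n; i += 8) {
    for (size_t k = 0; k < 8; k++) {
      int32_t sum = (in1[i + k] + in1_offset) + (in2[i + k] + in2_offset);
      out[i + k] = clamp_s8(sum + out_offset);
    }
  }

  /* Tail loop: the remaining n % 8 elements, one at a time. */
  for (; i < n; i++) {
    int32_t sum = (in1[i] + in1_offset) + (in2[i] + in2_offset);
    out[i] = clamp_s8(sum + out_offset);
  }
}
```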
Signed-off-by: xinhaiteng <xinhaiteng@xiaomi.com>
Based on CMSIS-NN, the Conv operator was optimized. With NEON acceleration, 8 input values are multiplied by 8 filter values in a single loop iteration. With im2col, the input data is rearranged into a matrix, and each iteration of the outer loop processes 2 rows of input data against 4 rows of filter data to produce a 2x4 block of output data.
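
A minimal scalar sketch of the 2x4 blocking described above, assuming the im2col buffer already holds two patches of k_len values each and the filter holds four rows of k_len values each. The name mat_mult_2x4_s8 and the buffer layout are assumptions for illustration; offsets, bias and requantization are omitted:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 2x4 micro-kernel.
 * cols: 2 im2col rows, each k_len int8 values, stored back to back.
 * filt: 4 filter rows, each k_len int8 values, stored back to back.
 * acc : acc[c][r] is the dot product of im2col row c and filter row r.
 */
static void mat_mult_2x4_s8(const int8_t *filt, const int8_t *cols,
                            size_t k_len, int32_t acc[2][4])
{
  for (size_t c = 0; c < 2; c++) {
    for (size_t r = 0; r < 4; r++) {
      acc[c][r] = 0;
    }
  }

  /* Each loaded input value is reused for 4 filter rows and each
   * loaded filter value for 2 input rows, which is the point of
   * the 2x4 blocking.
   */
  for (size_t k = 0; k < k_len; k++) {
    for (size_t c = 0; c < 2; c++) {
      int32_t in = cols[c * k_len + k];
      for (size_t r = 0; r < 4; r++) {
        acc[c][r] += in * filt[r * k_len + k];
      }
    }
  }
}
```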
Signed-off-by: xinhaiteng <xinhaiteng@xiaomi.com>
This option, which resolves to -w when CONFIG_CYGWIN_WINTOOL is
configured, is now appended to INCDIR in tools/Config.mk.
See git commit # 5eae32577e5d5226e5d3027c169eeb369f83f77d in the main
Darknet is an open source neural network framework written
in C and CUDA. It is fast, easy to install, and supports
CPU and GPU computation.
You Only Look Once (YOLO) is a state-of-the-art,
real-time object detection system.
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
only one .c needed for each function group
add -flax-vector-conversions to avoid build error on gcc && M55
Signed-off-by: Peter Bee <bijunda1@xiaomi.com>
NNABLA_RT should compile as a module to provide the necessary support
for the dnn test application
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
- support float version of convolution
- support the CHW tensor layout
The following function prototypes are added:
- arm_convolve_CHW_f32_basic_nonsquare()
- arm_convolve_CHW_q15_basic_nonsquare()
- arm_convolve_CHW_q7_basic_nonsquare()
- arm_nn_CHW_mat_mult_kernel_q7_q15()
NOTE: this patch will be contributed to CMSIS and later reverted from NuttX
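
For reference, a small sketch of how a CHW tensor is indexed compared with the channel-last HWC layout. The helper names idx_chw and idx_hwc are hypothetical and are not part of the added prototypes:

```c
#include <stddef.h>

/* CHW: all pixels of channel 0 first, then channel 1, and so on. */
static size_t idx_chw(size_t c, size_t h, size_t w,
                      size_t height, size_t width)
{
  return (c * height + h) * width + w;
}

/* HWC: the channels of one pixel are stored next to each other. */
static size_t idx_hwc(size_t c, size_t h, size_t w,
                      size_t width, size_t channels)
{
  return (h * width + w) * channels + c;
}
```

With CHW, stepping along w within one channel touches consecutive memory; with HWC, stepping along c does.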
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
The CMSIS NN software library is a collection of efficient neural
network kernels developed to maximize the performance and minimize
the memory footprint of neural networks on Cortex-M processor cores.
Project https://github.com/ARM-software/CMSIS_5
The library is divided into a number of functions each covering
a specific category:
Convolution Functions
Activation Functions
Fully-connected Layer Functions
SVDF Layer Functions
Pooling Functions
Softmax Functions
Basic math Functions
The library has separate functions for operating on different weight
and activation data types including 8-bit integers (q7_t) and 16-bit
integers (q15_t). The kernels are described in the individual
function descriptions.
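
A minimal sketch of the two data types, assuming the usual CMSIS convention that q7_t and q15_t are int8_t and int16_t holding fractional values scaled by 2^7 and 2^15. The conversion helpers float_to_q7 and float_to_q15 are hypothetical; real kernels may use other per-layer scalings:

```c
#include <stdint.h>

typedef int8_t q7_t;   /* 8-bit value, here scaled by 2^7 */
typedef int16_t q15_t; /* 16-bit value, here scaled by 2^15 */

/* Scale, round to nearest, then saturate to [lo, hi]. */
static int32_t scale_round_sat(float x, float scale, int32_t lo, int32_t hi)
{
  float scaled = x * scale + (x >= 0.0f ? 0.5f : -0.5f);
  int32_t v = (int32_t)scaled; /* truncates toward zero */

  if (v < lo) return lo;
  if (v > hi) return hi;
  return v;
}

static q7_t float_to_q7(float x)
{
  return (q7_t)scale_round_sat(x, 128.0f, -128, 127);
}

static q15_t float_to_q15(float x)
{
  return (q15_t)scale_round_sat(x, 32768.0f, -32768, 32767);
}
```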
More information
https://www.keil.com/pack/doc/CMSIS/NN/html/index.html
Project license : Apache 2.0 License
https://github.com/ARM-software/CMSIS_5/blob/develop/LICENSE.txt
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>
This is a runtime library for running inference on neural networks
created with Neural Network Libraries.
Project git: https://github.com/sony/nnabla-c-runtime
It is almost independent of external libraries (it depends only on the
C standard math library) and is written in pure C (C99).
It has been developed with priority on readability rather than
performance, making it ideal for learning and porting.
It adopts an extensible architecture, so applications that need
performance can substitute functions they implement themselves.
Project license : Apache 2.0 License
https://github.com/sony/nnabla-c-runtime/blob/master/LICENSE
Signed-off-by: Alin Jerpelea <alin.jerpelea@sony.com>