APULPF3.h
/*
 * Copyright (c) 2024-2025, Texas Instruments Incorporated
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * * Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * * Redistributions in binary form must reproduce the above copyright
 *   notice, this list of conditions and the following disclaimer in the
 *   documentation and/or other materials provided with the distribution.
 *
 * * Neither the name of Texas Instruments Incorporated nor the names of
 *   its contributors may be used to endorse or promote products derived
 *   from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
 * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
 * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
 * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
 * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
 * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
/*!*****************************************************************************
 * @file APULPF3.h
 * @brief <b>PRELIMINARY</b> APU driver interface
 *
 * <b>WARNING</b> These APIs are <b>PRELIMINARY</b>, and subject to
 * change in the next few months.
 *
 * @anchor ti_drivers_APU_Overview
 * # Overview
 * The APU driver allows you to interact with the Algorithm Processing Unit
 * (APU), a linear algebra accelerator peripheral that operates on 64-bit
 * complex numbers. Complex numbers are represented by the _float complex_
 * type from _complex.h_, where each number is made up of two 32-bit floats.
 * The APU works with complex numbers in either Cartesian or polar format. In
 * Cartesian format, the lower 32 bits are the real part and the upper
 * 32 bits are the imaginary part. In polar format, the lower 32 bits are the
 * absolute value and the upper 32 bits are the angle, represented as pi
 * radians.
 * As long as all arguments to an APU function use the same format, the result
 * is also in that format. Mixing formats produces wrong results. The
 * driver provides basic data types for vectors and matrices constructed from
 * _float complex_, as well as many common linear algebra operations on these
 * data types.
 *
 * @anchor ti_drivers_APU_DataManagement
 * ## Data management
 * All vector and matrix operations accept input and output pointers that
 * are inside or outside APU RAM. If all pointers provided to an operation are
 * inside APU RAM, the APU operates in <b>scratchpad mode</b>.
 * In this mode the driver assumes that the input is already in APU memory
 * and that the input and the result do not overlap each other. Therefore, no
 * data is copied, and the function output is placed inside APU memory. This
 * is the most efficient way to use the APU and is ideal for algorithms with
 * multiple operations that feed into each other, such as MUSIC
 * (https://en.wikipedia.org/wiki/MUSIC_(algorithm)). When chaining
 * operations, make sure as many as possible use vectors and matrices that
 * are already in APU memory, to prevent unnecessary copying and overhead.
 *
 * If any of the pointers are outside APU memory, the driver copies the input
 * data to the start of APU memory, places the result immediately after it,
 * and then copies the output back to the provided pointer. This may overwrite
 * data that was already at the start of APU memory.
 *
 * The APU driver uses uDMA channel 8 for data transfers involving the APU
 * data memory.
 *
 * @warning Due to erratum SYS_211, the APU memory has strict access rules. Use
 * the provided APU functions to move data safely.
 * To load data into APU memory: #APULPF3_loadArgMirrored() or
 * #APULPF3_loadTriangular().
 * To directly perform a memory access: #APULPF3_dataMemTransfer().
 * Copying data back from APU memory is handled automatically by the
 * driver, and happens in an interrupt when the result pointer is
 * outside APU memory.
 *
 * @warning Due to erratum SYS_211, no other SW DMA transactions can occur
 * while the APU is being used.
 *
 * @warning Due to erratum SYS_211, the APU cannot be used at the same time as
 * I2S.
 *
 * The primary purpose of this driver is executing the MUSIC algorithm for
 * distance estimation in Bluetooth Channel Sounding. An implementation of
 * MUSIC using the APU can be found in the apu_music example.
 *
 * @anchor ti_drivers_APU_Usage
 * # Usage
 * This section covers driver usage.
 *
 * @anchor ti_drivers_APU_Synopsis
 * ## Synopsis
 * @anchor ti_drivers_APU_Synopsis_Code
 * @code
 *
 * APULPF3_init();
 *
 * float complex *apuMem = (float complex *)APULPF3_MEM_BASE;
 * float complex argA[10];
 * float complex argB[10];
 *
 * APULPF3_ComplexVector vecA = {.data = argA, .size = 10};
 * APULPF3_ComplexVector vecB = {.data = argB, .size = 10};
 * APULPF3_ComplexVector resultVec = {.data = apuMem, .size = 10};
 *
 * // Get control of APU
 * APULPF3_startOperationSequence();
 *
 * // Perform element-wise product without conjugation,
 * // placing the result in resultVec, which is inside APU memory
 * APULPF3_vectorMult(&vecA, &vecB, false, &resultVec);
 *
 * // Perform non-conjugated dot product inside of APU memory, which
 * // reduces overhead.
 * APULPF3_dotProduct(&resultVec, &resultVec, false, apuMem);
 *
 * // Give up control of APU
 * APULPF3_stopOperationSequence();
 *
 * @endcode
 ***************************************************************************
 */
#ifndef ti_drivers_APULPF3__include
#define ti_drivers_APULPF3__include

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <complex.h>

#include <ti/drivers/dpl/HwiP.h>

#include <ti/devices/DeviceFamily.h>
#include DeviceFamily_constructPath(inc/hw_memmap.h)
#include DeviceFamily_constructPath(driverlib/apu.h)
#include DeviceFamily_constructPath(driverlib/udma.h)

#ifdef __cplusplus
extern "C" {
#endif

/*
 * ========= APU datatype structs =========
 * These structs define the core data types used by APU functions,
 * which are vectors, full matrices and upper triangle matrices of
 * complex numbers. Matrices are column-major.
 */

typedef struct
{
    complex float *data;
    uint16_t size;
} APULPF3_ComplexVector;

typedef struct
{
    complex float *data;
    uint16_t rows;
    uint16_t cols;
} APULPF3_ComplexMatrix;

typedef struct
{
    complex float *data;
    uint16_t size;

} APULPF3_ComplexTriangleMatrix;

typedef enum
{
    APULPF3_R2COp_R2C   = APU_OP_R2C,
    APULPF3_R2COp_R2CC  = APU_OP_R2CC,
    APULPF3_R2COp_R2CA  = APU_OP_R2CA,
    APULPF3_R2COp_R2CAA = APU_OP_R2CCA,
    APULPF3_R2COp_RA    = APU_OP_RA,
    APULPF3_R2COp_IMA   = APU_OP_IMA,
    APULPF3_R2COp_ABS   = APU_OP_ABS,
} APULPF3_R2COp;

typedef enum
{

typedef enum
{


typedef struct
{
    void *object;
    void const *hwAttrs;

typedef struct
{
    unsigned int intPriority;

    APULPF3_OperationMode operationMode;

    APULPF3_SchedulingMode schedulingMode;

    volatile uDMAControlTableEntry *dmaTableEntry;

    uint32_t dmaChannelMask;


#define APULPF3_STATUS_SUCCESS 0

#define APULPF3_STATUS_ERROR 1

#define APULPF3_STATUS_RESOURCE_UNAVAILABLE 2

#define APULPF3_RESULT_INPLACE 0

#define APULPF3_MEM_BASE APURAM_DATA0_BASE

#define APULPF3_MEM_SIZE_MIRRORED APURAM_DATA0_SIZE

void APULPF3_init(void);

void APULPF3_startOperationSequence();

void APULPF3_stopOperationSequence();

/* Vector functions */
int_fast16_t APULPF3_dotProduct(APULPF3_ComplexVector *vecA,
                                APULPF3_ComplexVector *vecB,
                                bool conjugate,
                                float complex *result);

int_fast16_t APULPF3_vectorMult(APULPF3_ComplexVector *vecA,
                                APULPF3_ComplexVector *vecB,
                                bool conjugate,
                                APULPF3_ComplexVector *result);

int_fast16_t APULPF3_vectorScalarMult(APULPF3_ComplexVector *vecA,
                                      float complex *scalar,
                                      APULPF3_ComplexVector *result);

int_fast16_t APULPF3_vectorScalarSum(APULPF3_ComplexVector *vecA,
                                     float complex *scalar,
                                     bool subtraction,
                                     APULPF3_ComplexVector *result);

int_fast16_t APULPF3_vectorSum(APULPF3_ComplexVector *vecA,
                               APULPF3_ComplexVector *vecB,
                               bool subtraction,
                               APULPF3_ComplexVector *result);

int_fast16_t APULPF3_vectorR2C(APULPF3_ComplexVector *vecA,
                               APULPF3_ComplexVector *vecB,
                               APULPF3_R2COp operator,
                               APULPF3_ComplexVector *result);

int_fast16_t APULPF3_cartesianToPolarVector(APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result);

int_fast16_t APULPF3_polarToCartesianVector(APULPF3_ComplexVector *vec,
                                            float complex *temp,
                                            APULPF3_ComplexVector *result);

int_fast16_t APULPF3_sortVector(APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result);

int_fast16_t APULPF3_covMatrixSpatialSmoothing(APULPF3_ComplexVector *vec,
                                               uint16_t covMatrixSize,
                                               bool fbAveraging,
                                               APULPF3_ComplexTriangleMatrix *result);

int_fast16_t APULPF3_computeFFT(APULPF3_ComplexVector *vec, bool inverse, APULPF3_ComplexVector *result);

int_fast16_t APULPF3_vectorMaxMin(APULPF3_ComplexVector *vec,
                                  float scalarThreshold,
                                  bool min,
                                  APULPF3_ComplexVector *result);

/* Matrix functions */
int_fast16_t APULPF3_matrixMult(APULPF3_ComplexMatrix *matA,
                                APULPF3_ComplexMatrix *matB,
                                APULPF3_ComplexMatrix *result);

int_fast16_t APULPF3_matrixSum(APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB, APULPF3_ComplexMatrix *result);

int_fast16_t APULPF3_matrixScalarSum(APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result);

int_fast16_t APULPF3_matrixScalarMult(APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result);

int_fast16_t APULPF3_matrixNorm(APULPF3_ComplexMatrix *mat, float complex *result);

/* Matrix algorithms */
int_fast16_t APULPF3_jacobiEVD(APULPF3_ComplexTriangleMatrix *mat,
                               uint16_t maxIter,
                               float stopThreshold,
                               float epsTol,
                               APULPF3_ComplexVector *result);

int_fast16_t APULPF3_gaussJordanElim(APULPF3_ComplexMatrix *mat, float zeroThreshold, APULPF3_ComplexMatrix *result);

/* Utility functions */

int_fast16_t APULPF3_unitCircle(uint16_t numPoints,
                                uint16_t constant,
                                uint16_t phase,
                                bool conjugate,
                                APULPF3_ComplexVector *result);

int_fast16_t APULPF3_HermLo(APULPF3_ComplexTriangleMatrix *mat, APULPF3_ComplexTriangleMatrix *result);

void APULPF3_prepareResult(uint16_t resultSize, uint16_t inputSize, complex float *resultBuffer);

uint16_t APULPF3_prepareVectors(APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB);

uint16_t APULPF3_prepareMatrices(APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB);

void *APULPF3_loadTriangular(APULPF3_ComplexMatrix *mat, uint16_t offset);

void *APULPF3_loadArgMirrored(uint16_t argSize, uint16_t offset, float complex *src);

void APULPF3_dataMemTransfer(const float *src, float *dst, size_t length);

#ifdef __cplusplus
}
#endif

#endif /* ti_drivers_APULPF3__include */
APULPF3_ComplexVector
APULPF3 Vector Struct.
APULPF3_ComplexMatrix
APULPF3 Matrix Struct.
APULPF3_ComplexTriangleMatrix
APULPF3 Upper Triangle Matrix Struct.
APULPF3_R2COp
Modes for converting between real and complex vectors.
APULPF3_OperationMode
Define the APU memory operation modes, which are the ways the APU expects data to be stored in its me...
APULPF3_SchedulingMode
Define the APU memory scheduling modes, which are the ways APU memory operations are scheduled and pi...
APULPF3 Global configuration.
APULPF3 Hardware attributes.

void APULPF3_init(void)
APU init function.
void APULPF3_startOperationSequence()
APU function to prepare the start of an operation chain.
void APULPF3_stopOperationSequence()
APU function to finish an operation chain.
int_fast16_t APULPF3_dotProduct(APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, bool conjugate, float complex *result)
APU function for calculating the dot product of two vectors, with the option to perform the complex c...
int_fast16_t APULPF3_vectorMult(APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, bool conjugate, APULPF3_ComplexVector *result)
APU function for calculating the element-wise product of two vectors, with the option to perform the ...
int_fast16_t APULPF3_vectorScalarMult(APULPF3_ComplexVector *vecA, float complex *scalar, APULPF3_ComplexVector *result)
APU function for calculating the product of a vector and a scalar.
int_fast16_t APULPF3_vectorSum(APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, bool subtraction, APULPF3_ComplexVector *result)
APU function for calculating the sum of two vectors.
int_fast16_t APULPF3_vectorR2C(APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB, APULPF3_R2COp operator, APULPF3_ComplexVector *result)
APU function for converting vectors back and forth between real and complex number formats...
int_fast16_t APULPF3_cartesianToPolarVector(APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result)
APU function for converting a complex vector in Cartesian format to polar format. ...
int_fast16_t APULPF3_polarToCartesianVector(APULPF3_ComplexVector *vec, float complex *temp, APULPF3_ComplexVector *result)
APU function for converting a complex vector in polar format to Cartesian format. ...
int_fast16_t APULPF3_sortVector(APULPF3_ComplexVector *vec, APULPF3_ComplexVector *result)
APU function for sorting the real parts of a complex vector in descending order. This function ignore...
int_fast16_t APULPF3_covMatrixSpatialSmoothing(APULPF3_ComplexVector *vec, uint16_t covMatrixSize, bool fbAveraging, APULPF3_ComplexTriangleMatrix *result)
APU function for covariance matrix computation using spatial smoothing and optionally forward-backwar...
int_fast16_t APULPF3_computeFFT(APULPF3_ComplexVector *vec, bool inverse, APULPF3_ComplexVector *result)
APU function for computing the Discrete Fourier transform (DFT) of a complex vector using the Fast Fo...
int_fast16_t APULPF3_vectorMaxMin(APULPF3_ComplexVector *vec, float scalarThreshold, bool min, APULPF3_ComplexVector *result)
APU function for computing max/min of the real part of a vector and a real value scalar APU accelerat...

int_fast16_t APULPF3_matrixMult(APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB, APULPF3_ComplexMatrix *result)
APU function for multiplying two matrices. The number of rows in the first matrix must be equal to th...
int_fast16_t APULPF3_matrixSum(APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB, APULPF3_ComplexMatrix *result)
APU function for adding two matrices. The matrices must be exactly the same size.
int_fast16_t APULPF3_matrixScalarSum(APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result)
APU function for adding a scalar to each of a matrix's elements.
int_fast16_t APULPF3_matrixScalarMult(APULPF3_ComplexMatrix *mat, float complex *scalar, APULPF3_ComplexMatrix *result)
APU function for multiplying each of a matrix's elements by a scalar.
int_fast16_t APULPF3_matrixNorm(APULPF3_ComplexMatrix *mat, float complex *result)
Compute the Frobenius norm of a matrix.
int_fast16_t APULPF3_jacobiEVD(APULPF3_ComplexTriangleMatrix *mat, uint16_t maxIter, float stopThreshold, float epsTol, APULPF3_ComplexVector *result)
APU function to compute the Jacobi Eigen-Decomposition (EVD) of a triangular Hermitian Matrix...
int_fast16_t APULPF3_gaussJordanElim(APULPF3_ComplexMatrix *mat, float zeroThreshold, APULPF3_ComplexMatrix *result)
Reduce the input matrix A[MxN] to reduced echelon form using Gauss-Jordan Elimination.

int_fast16_t APULPF3_unitCircle(uint16_t numPoints, uint16_t constant, uint16_t phase, bool conjugate, APULPF3_ComplexVector *result)
Generate points evenly distributed on a unit circle. The APU generates a unit circle as follows: exp(-...
int_fast16_t APULPF3_HermLo(APULPF3_ComplexTriangleMatrix *mat, APULPF3_ComplexTriangleMatrix *result)
APU function for converting a Hermitian upper-triangular matrix to lower-triangular.
void APULPF3_prepareResult(uint16_t resultSize, uint16_t inputSize, complex float *resultBuffer)
Configure the APU pointers for temporary (in APU memory) and final results for an APU operation...
uint16_t APULPF3_prepareVectors(APULPF3_ComplexVector *vecA, APULPF3_ComplexVector *vecB)
Configure the APU for vector operation inputs. If not in scratchpad mode, the vectors are loaded one ...
uint16_t APULPF3_prepareMatrices(APULPF3_ComplexMatrix *matA, APULPF3_ComplexMatrix *matB)
Configure the APU for matrix operation inputs. If not in scratchpad mode, the matrices are loaded one...
void *APULPF3_loadTriangular(APULPF3_ComplexMatrix *mat, uint16_t offset)
Loads the upper triangular part of a full matrix into APU memory.
void *APULPF3_loadArgMirrored(uint16_t argSize, uint16_t offset, float complex *src)
Load operation arguments into APU memory, assuming the APU is in mirrored mode. This means memory is ...
void APULPF3_dataMemTransfer(const float *src, float *dst, size_t length)
Transfer data to or from the APU data memory.
© Copyright 1995-2025, Texas Instruments Incorporated. All rights reserved.