ac6-training, a department of Ac6 SAS
 


RC0 VFP programming

This course explains how to use VFP instructions to boost multimedia algorithms

Objectives
  • This course has been designed for programmers who want to develop algorithms based on hardware floating point calculations.
  • Each instruction family is detailed, first at assembly level, and then at C level using macros.
  • Several tricky uses of vector instructions are presented.
  • The underlying cache operation as well as the preload mechanisms (instruction and hardware prefetch) are detailed to explain how processing can be pipelined.
  • The course shows how typical DSP algorithms such as FIR and FFT can be vectorized and then optimized for execution on the VFP unit.

  • THIS COURSE IS OFFERED EITHER AS AN INSTRUCTOR-LED COURSE OR AS E-LEARNING.

  • ACSYS has developed an optimized VFP-based FFT written in assembly language
    • performance for 1024 complex single-precision floating point samples is 220,000 core clock cycles (ARM11)
    • for any information contact training@ac6-training.com
Labs are run under RVDS
A more detailed course description is available on request at training@ac6-training.com
  • Knowledge of the ARM 4T / V5TE instruction set.
  • Theoretical course
    • PDF course material (in English) supplemented by a printed version for face-to-face courses.
    • Online courses are delivered using the Teams video-conferencing system.
    • The trainer answers trainees' questions during the training and provides technical and pedagogical assistance.
  • At the start of each session the trainer will interact with the trainees to ensure the course fits their expectations and adjust it if needed.
  • Any embedded systems engineer or technician with the above prerequisites.
  • The prerequisites indicated above are assessed before the training by the trainee's technical supervisor within his or her company, or by the trainee in the exceptional case of an individual trainee.
  • Trainee progress is assessed by quizzes offered at the end of various sections to verify that the trainees have assimilated the points presented.
  • At the end of the training, each trainee receives a certificate attesting that they have successfully completed the course.
    • If a problem caused by a trainee's lack of prerequisites is discovered during the course, a different or additional training is offered to them, generally to reinforce those prerequisites, in agreement with their company manager if applicable.

Course Outline

  • Floating point number coding
  • Denormalized numbers
  • NaN utilization
  • Rounding modes
  • VFP FPEXC register
  • Register bank, D registers, S registers
  • Instruction coding, either ARM or Thumb-2
  • Related system registers
  • Alignment issues
  • Context switching
  • Length / Stride combinations
  • Scalar operations
  • Vector operations
  • Mixed operations
  • Addressing modes
  • Floating point load / store
  • Floating point load / store multiple
  • Processor acceleration mechanisms: store merging buffers
  • Add / subtract / absolute value instructions
  • Multiply and multiply accumulate instructions
  • Divide instruction
  • Square root instruction
  • Compare instructions
  • Integer to FP and FP to integer convert instructions
  • FIR filter
    • Converting the scalar algorithm into a vector algorithm
    • Finding the VFP instructions to encode the vector algorithm
    • Optimizing the code
  • FFT (DFT)
    • Converting the scalar algorithm into a vector algorithm, understanding how circle properties can be used to process 4 angles concurrently
    • Finding the VFP instructions to encode the vector algorithm
    • Optimizing the code
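
To give an idea of the "converting the scalar algorithm into a vector algorithm" step listed for the FIR filter, here is a minimal C sketch. It is not taken from the course material: the function names `fir_scalar` and `fir_vec4`, the group size of four outputs, and the convention that the input holds `n_out + taps - 1` samples are all assumptions made for illustration. The second function computes four outputs per pass so that each coefficient is loaded once and applied to four consecutive samples, which is the kind of regular data flow a short-vector unit can exploit.

```c
#include <stddef.h>

/* Scalar reference: y[n] = sum over k of h[k] * x[n + k].
 * Assumption: x holds n_out + taps - 1 samples. */
void fir_scalar(const float *x, const float *h, float *y,
                size_t n_out, size_t taps)
{
    for (size_t n = 0; n < n_out; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps; k++)
            acc += h[k] * x[n + k];
        y[n] = acc;
    }
}

/* Vector-style form (hypothetical): four outputs are accumulated together,
 * so each coefficient h[k] is read once and reused for four products. */
void fir_vec4(const float *x, const float *h, float *y,
              size_t n_out, size_t taps)
{
    size_t n = 0;

    for (; n + 4 <= n_out; n += 4) {
        float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
        for (size_t k = 0; k < taps; k++) {
            float c = h[k];             /* one coefficient, four samples */
            acc0 += c * x[n + k];
            acc1 += c * x[n + k + 1];
            acc2 += c * x[n + k + 2];
            acc3 += c * x[n + k + 3];
        }
        y[n]     = acc0;
        y[n + 1] = acc1;
        y[n + 2] = acc2;
        y[n + 3] = acc3;
    }

    /* Remaining outputs (fewer than four) fall back to the scalar loop. */
    for (; n < n_out; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps; k++)
            acc += h[k] * x[n + k];
        y[n] = acc;
    }
}
```

Both functions accumulate each output in the same order, so they should produce the same results. Mapping the four accumulators onto actual VFP vector registers and choosing the Length / Stride settings is the part the course covers at assembly level.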