CLIENT: Imagination Technologies
April 2, 2006: AudioDesignLine
INSIDE THE IMAGINATION TECHNOLOGIES META PROCESSOR, PART 2
PC unit
The PC unit holds two program counter registers per thread, PC and PCX, along with a simple functional unit that handles most of the system's program flow manipulations, such as branches. During the execution of normal background code, PC is the current execution address for the thread, while PCX holds the address from which the next interrupt on that thread will execute. When an interrupt occurs, the contents of PC and PCX swap: PC then holds the execution address for the interrupt handler, while PCX holds the address from which background code execution will resume when the interrupt completes. This swap happens transparently whenever interrupt level is entered or exited.
Branches, jumps and calls should supply the address of the first instruction to be executed, because no further processing is applied to this value before it is copied into a thread’s PC register.
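The PC/PCX behavior described above can be illustrated with a small software model. This is only a sketch: the class and method names (`ThreadPCUnit`, `enter_interrupt`, `exit_interrupt`, `branch`) are hypothetical, and the model assumes that entering and leaving interrupt level does nothing more than exchange the two registers, which is all the text states.

```python
from dataclasses import dataclass


@dataclass
class ThreadPCUnit:
    """Toy model of one thread's PC/PCX pair (all names are hypothetical)."""

    pc: int                     # address the thread is currently executing from
    pcx: int                    # address the other level (interrupt/background) will use
    in_interrupt: bool = False  # True while the thread is at interrupt level

    def _swap(self) -> None:
        # Exchange the contents of PC and PCX.
        self.pc, self.pcx = self.pcx, self.pc

    def enter_interrupt(self) -> None:
        # On interrupt entry PC takes the handler address held in PCX,
        # and PCX keeps the background resume address.
        if self.in_interrupt:
            raise RuntimeError("already at interrupt level")
        self._swap()
        self.in_interrupt = True

    def exit_interrupt(self) -> None:
        # On interrupt exit the registers swap back, so background code
        # resumes from the address that was parked in PCX.
        if not self.in_interrupt:
            raise RuntimeError("not at interrupt level")
        self._swap()
        self.in_interrupt = False

    def branch(self, target: int) -> None:
        # The target is copied into PC unmodified, so it must already be
        # the address of the first instruction to execute.
        self.pc = target
```

For example, a thread created with `ThreadPCUnit(pc=0x1000, pcx=0x8000)` would run its handler from `0x8000` after `enter_interrupt()` and return to `0x1000` after `exit_interrupt()`.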
Trigger unit
The trigger unit provides a mechanism for detecting and synchronizing with various types of system events. It provides independent systems for two levels of event handling – background and interrupt. Background control provides voluntary synchronization with zero overhead. Interrupt control is based on a conventional model and carries some overhead. The META core supports two distinct types of trigger source via the trigger unit: hardware triggers, and kicks generated by software events.
Hardware triggers are sourced both from hardware outside the META core itself, such as coprocessors, and from internal sources such as timers. Each hardware trigger provides a simple on/off flag. A set of registers controls the routing of internally generated and externally supplied triggers. Kicks are caused by a software action, such as writing to a controlling register somewhere in the system. Kicks also differ from hardware triggers in that a count of kicks received is accumulated over time and atomically decremented by one each time the thread responds to a kick-sourced trigger. This counter can be used to implement simple software or hardware request queues, using shared memory or a coprocessor interface FIFO as the storage area for the request data. All software inter-thread events should be communicated using kicks.
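The kick counter lends itself to a simple request-queue pattern, sketched below in software. This is an illustration under stated assumptions, not the META programming interface: `KickQueue`, `post`, `respond` and `pending` are hypothetical names, a `deque` stands in for the shared-memory or FIFO storage area, and a lock stands in for the atomic decrement performed by the hardware.

```python
import threading
from collections import deque


class KickQueue:
    """Toy model of a kick-counted request queue (all names are hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stands in for the hardware's atomicity
        self._kicks = 0                # count of kicks received but not yet handled
        self._requests = deque()       # stands in for shared memory or a FIFO

    def post(self, request) -> None:
        # Producer side: store the request data, then "kick" the consumer.
        with self._lock:
            self._requests.append(request)
            self._kicks += 1

    def respond(self):
        # Consumer side: if a kick is pending, decrement the count by one
        # and take the matching request; otherwise return None.
        with self._lock:
            if self._kicks == 0:
                return None
            self._kicks -= 1
            return self._requests.popleft()

    @property
    def pending(self) -> int:
        # Number of kicks accumulated and not yet responded to.
        with self._lock:
            return self._kicks
```

Because the count accumulates, a producer can post several requests before the consumer runs, and the consumer simply calls `respond()` until it returns `None`.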
Control unit
The control unit is a simple unit that contains only a register file; it has no associated ALU. These registers hold all of the control state that cannot be put in the memory-mapped I/O space. This includes all the control registers that are needed within the core itself, such as individual thread on/off controls, DSP mode switches, hardware loop controls and repeat counters.
Input/output ports
The META core connects to internal and external data sources/sinks via multiple memory ports. DSP performance is achieved by reducing the direct load on the memory interface using either separate instruction and data caches, or on-chip RAM. For example, when a program is operating in a tight core loop, there may be no instruction fetch activity on the memory bus as all requests are serviced from the contents of the instruction cache or on-chip RAMs.
Coprocessor ports
Up to eight read and/or write coprocessor interfaces may be used in a specific instance of the core to allow threads to operate synchronously with arbitrary hardware. The coprocessor interface module lets data be transferred to and from any application-specific hardware modules, for example real-time data feeds such as digital audio. This interface allows transfers of up to 64 bits to a coprocessor per cycle and supports flow control of the I/O feed. Many hardware functions, such as memory-mapped peripherals, require shared access. Typically, such peripherals are interfaced using SoC interconnect buses and access is governed using interrupts. By interfacing such peripherals to the META coprocessor ports under the control of threads, inter-thread locks and the hardware scheduler may be used to control shared access. The ability to switch threads without any software overhead allows real-time control of I/O – essential for complex multi-function products.
System bus
The system bus can carry a number of simultaneous transactions from each thread, so that threads can operate independently of one another even when the memory-mapped hardware they access has differing response latencies.
Threads and thread scheduling
The META core supports two to four independent hardware threads that share the processor's core resources such as register execution units and memory bandwidth. A fine-grained instruction scheduler switches between the thread contexts on a cycle-by-cycle basis. The instruction scheduler manages multiple threads by extracting a list of required resources from the next pending instruction for each thread. Resource requests are matched to resource availability via an interlocking process that yields the set of instructions that could be issued. From this set, a variable priority scheduler chooses one instruction to issue in the current cycle. Each thread can use different processor resources at the same time, or one thread can use all of the processor’s resources. To support multiple threads and DSP functionality, the META core has internal RAM, register execution units and external interface ports. All major functional units of META, including caches and the MMU, are thread-aware.
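The issue decision described above (match each thread's resource requests against availability, then pick one candidate by priority) can be sketched as follows. This is a simplified illustration, not the actual META scheduler: the function names, the resource labels and the tie-break rule (lowest thread number wins) are all assumptions, and priorities are passed in per call to reflect that they may vary.

```python
from typing import Dict, List, Optional, Set


def issuable(pending: Dict[int, Set[str]], free: Set[str]) -> List[int]:
    """Return the threads whose next instruction needs only free resources.

    `pending` maps a thread number to the set of resources required by
    that thread's next pending instruction (resource names are hypothetical).
    """
    return [thread for thread, needs in pending.items() if needs <= free]


def pick_thread(
    pending: Dict[int, Set[str]],
    free: Set[str],
    priority: Dict[int, int],
) -> Optional[int]:
    """Choose one thread to issue this cycle, or None if none can issue.

    The highest priority value wins; ties go to the lowest thread number
    (an assumed rule, chosen only to make the sketch deterministic).
    """
    candidates = issuable(pending, free)
    if not candidates:
        return None
    return max(candidates, key=lambda thread: (priority[thread], -thread))
```

With `pending = {0: {"alu0", "mem"}, 1: {"alu1"}, 2: {"mem"}}` and `free = {"alu1", "mem"}`, thread 0 is interlocked because `alu0` is busy, and the choice between threads 1 and 2 is then decided by the priorities supplied for that cycle.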
The META pipeline is composed of three stages (post-decode) and the instruction set has three main types of instructions: