This chapter introduces the Barracuda project. It gives only enough detail
to provide an overview of the functionality of the STAR12 MCU.
Section 1 lists the overall
functionality of the module and general user information. Section 2 contains
design-related information, including design-specific targets and source code
samples.
The Barracuda MCU architecture consists of two main blocks: a CORE
with a 16-bit CPU12, and a peripheral interface block with 2 x Serial
Communication Interface (SCI), a Serial Peripheral Interface (SPI), I2C, BDLC
(network interface), Pulse Width Modulator (PWM), Enhanced Capture Timer (ECT),
Clock and Reset Generator (CRG), Keyboard Wakeup Unit (KWU) and 4 x CAN interface
(MSCAN).
A STAR12 MCU bus connects the two blocks. Transfers
between interface modules are performed via an IPbus. An IPbridge converts
signals between the STAR12 bus and the IPbus.
The ports of the peripheral
interface modules do not lead directly to pins. A dedicated Port
Integration Module (PIM) contains mainly analogue functionality to
control driver strength, detect plugging or enable pull-up resistors. In
this way, analogue port functions have been completely separated from the interface
modules.
Additionally, the Barracuda architecture includes on-chip memory
(Flash memory, EEPROM, RAM) and two Analog-to-Digital Converters (ADC).
The Barracuda is a 16-bit MCS912D-Family member Microcontroller Unit (MCU) that consists of a 16-bit central processing unit (CPU12) with 256K bytes of on-chip Flash EEPROM, 16K bytes of RAM, 2K bytes of EEPROM, two asynchronous serial communications interfaces (SCI), a serial peripheral interface (SPI), an IIC-bus, an enhanced capture timer (ECT), two 8-channel, 10-bit analog-to-digital converters (ADC), an eight-channel pulse-width modulator (PWM), a BDLC (J1850) interface and up to four CAN modules. The Barracuda interfaces to 16-bit memory and can operate in 8-bit narrow mode to interface to 8-bit wide memory and thereby reduce system costs. An on-chip PLL allows power consumption and performance to be adjusted to suit operational requirements. Furthermore, Keyboard Wakeup Logic is available for 12 I/O ports. Table 1 lists the features of the Barracuda MCU.
Module | Features |
---|---|
CORE (CPU, FSC, VSC) | 16-bit CPU12, compatible with the M68HC11 instruction set, with a 20-bit ALU, instruction queue and indexed addressing |
2 x SCI | The asynchronous serial communications interfaces allow full-duplex operation with standard mark/space non-return-to-zero (NRZ) format, 13-bit baud rate selection and a programmable 8-bit or 9-bit data format. Further features are: separately enabled transmitter and receiver, programmable transmitter output polarity, two receiver wakeup methods, receiver framing error detection, hardware parity checking, 1/16 bit-time noise detection |
SPI | The serial peripheral interface module allows full-duplex, synchronous, serial communication between the MCU and peripheral devices. Software can poll the SPI status flags or the SPI operation can be interrupt driven. Other features are: Master mode and slave mode, Bi-directional mode, Slave select output, Mode fault error flag with CPU interrupt capability, Double-buffered operation, Serial clock with programmable polarity and phase, Control of SPI operation during wait mode. |
I2C | The Inter-IC Bus (IIC or I2C) is a two-wire, bidirectional serial bus that provides a two-wire data exchange between devices. The interface is designed to operate up to 100kbps with maximum bus loading and timing. The device is capable of operating at higher baud rates, up to a maximum of clock/20, with reduced bus loading. Features are: Multi-master operation, Software programmable for one of 256 different serial clock frequencies, Software selectable acknowledge bit, Interrupt driven byte-by-byte data transfer, Arbitration lost interrupt with automatic mode switching from master to slave, Calling address identification interrupt, Start and stop signal generation/detection, Repeated start signal generation, Acknowledge bit generation/detection, Bus busy detection. |
BDLC | The J1850 interface is a serial communication module that allows the user to send and receive messages across a Society of Automotive Engineers (SAE) J1850 serial communication network. The user's software handles each transmitted or received message on a byte-by-byte basis, while the BDLC performs all of the network access, arbitration, message framing and error detection duties. Features include: 10.4 Kbps Variable Pulse Width (VPW) bit format, digital noise filter, collision detection |
PWM | The Pulse-Width Modulator has eight channels, each with a programmable period and duty cycle as well as a dedicated counter. A flexible clock select scheme allows four different clock sources to be used with the counters. Each of the modulators can create independent continuous waveforms with software-selectable duty rates from 0% to 100%. The PWM outputs can be programmed as left-aligned or center-aligned outputs. |
ECT | The Enhanced Capture Timer features a 16-bit buffer register for four Input Capture (IC) channels, four 8-bit pulse accumulators with 8-bit buffer registers associated with the four buffered IC channels, a 16-bit modulus down-counter with 4-bit prescaler, and four user-selectable delay counters to increase input noise immunity. The timer is configurable as two 16-bit pulse accumulators and supports only 16-bit access on the IPbus. |
CRG | The Clock and Reset Generator features a crystal oscillator, a Phase Locked Loop (PLL) frequency multiplier, a System Clock Generator (CGEN), a system clock switch, system clocks off during WAIT mode, and a System Reset Generator (RGEN) with Power-on Reset (POR) and a Computer Operating Properly (COP) watchdog timer with time-out clear window. |
2 x KWU | The Keyboard Wakeup Unit controls two ports H and J. Data and DDR registers allow access as a 16-bit port. There are 16 Key Wake-Up (KWU) channels to wake-up the chip from STOP mode. For each pin, which has an interrupt enabled, an active edge brings the part out of STOP. Digital filtering is included to prevent pulses shorter than a specified value from waking the part from STOP. |
4 x CAN | The CAN modules are CAN 2.0 A, B software compatible, with four receive and three transmit buffers, a flexible identifier filter programmable as 2 x 32 bit, 4 x 16 bit or 8 x 8 bit, four separate interrupt channels for Rx, Tx, error and wake-up, a low-pass filter wake-up function and loop-back for self-test operation |
2 x ADC | The 12-channel, 10-bit Analog-to-Digital Converter also works as a peripheral interface module. It does not require external sample and hold circuitry. The 12 analog input channels are multiplexed internally. It features: minimum 7 µsec 10-bit single conversion time, internal transfer buffer amplifier, programmable sample time, left justified / unsigned result data and conversion completion interrupt generation. |
PIM | The Port Integration Module establishes the interface between the peripheral modules and the I/O pads for all ports of the interface modules. Each I/O pin can be configured by several register bits that allow input/output selection, drive strength reduction, enabling and selection of pull resistors, and interrupt enable and status flags. |
Memory | On-chip memory will be available in different configurations: 32K, 58K, 128K byte Flash EEPROM; 1K, 2K byte EEPROM; 2K, 4K, 8K and 16K byte RAM |
The first step is to analyze the design that is to be emulated on FPGAs.
The module structure must be known in order to substitute the analogue modules and to
translate the modules into a gate model based on the component library of the
FPGA vendor.
The modules used for evaluation (modules from the JUPITER
project) were designed to be mapped to a Motorola logic cell library. FPGAs
provide only a subset of logic cells and, above all, provide neither hard macros
nor analogue features. To map the same design onto an FPGA without
changing its functionality, unsupported cells have to be replaced by cells
available in the FPGA library.
In the course of this thesis, the complete
design has been analyzed using the Verilog RTL code to identify cells that
cannot be synthesized. Most of them were hard-instantiated cells, e.g. drivers,
inverters, buffers and flip-flops. They have been replaced by simple Verilog modules
as shown in table 2.
unsupported cell | substituted by RTL model |
---|---|
driver | wire |
hard instantiated cells (inverter, DFF, gated-clock cells, RS-FF) | RTL code with the same functionality |
internal SRAM | SRAM Megafunction of the FPGA that emulates memory using its internal dual port RAM |
analogue modules | moved to top level |
One major constraint of the Barracuda preSilicon Emulation project
was not to change the source code of the modules, since we worked on old modules
that were to be replaced by the final Barracuda modules. Therefore, the author
searched for generic approaches that allow all actions to be repeated
within minutes.
For example, an analogue sub-module has to be
moved to top-level. Normally, the interfaces of all modules that are higher in the
hierarchy than the module to be substituted have to be changed, including the
top-level module. An alternative is to use synthesis script commands to perform
the same action with the help of the Synopsys synthesis tool.
Example 1 shows
a method, developed in the course of this thesis, that allows moving submodules
to top-level without changes in RTL code using Synopsys synthesis script
commands.
First, the module to be moved, scg_cus, is ungrouped, keeping the names
of all submodules. Second, the content of the module is substituted by a dummy
module and the hierarchy is flattened to move all modules to the top-level. At
top-level, module scg_cus is removed and replaced by the dummy module.
Afterwards, all modules that were not originally at top-level have to be
grouped again to restore the hierarchy.
    current_design vsc_kd128_1_0
    ungroup -simple_names "MMC"
    /* ungroup -simple_names "KEEPERS" */
    group {PI, REG, BUF, CORE} -design_name "mmv_kd1298_1_0" -cell_name "MMC"

    /* substitute content of module "vsc_kd128_1_0/scg_1_0/scg_cus" */
    /* by content of module "vsc_kd128_1_0/scg_1_0/scg_cus_dummy" */
    /* and move module "vsc_kd128_1_0/scg_1_0/scg_cus" to top-level */
    /* ------------------------------------------------------------------------- */
    remove_design -hier scg_cus
    rename_design scg_cus_dummy scg_cus
    current_design vsc_kd128_1_0
    ungroup -simple_names "SCG"
    ungroup -simple_names "CUS"
For synthesis, the Synopsys Design Compiler has been used. To reduce
work and simplify design changes, a set of fully generic synthesis scripts
has been developed in the course of this thesis. For every step in
synthesis, a dedicated design-independent script performs a standard procedure.
Design-specific information, e.g. path names and module names, has been completely
separated from these scripts. The Synopsys Design Compiler is configured using a global setup
script. An example of the synthesis scripts:
    /* A N A L Y Z E */
    sh test -f SYNMODEL
    if (dc_shell_status == 0) {
        remove_variable vlist > /dev/null
        vlist = execute(-s, sh echo `cat SYNMODEL `)
    }
    analyze -f verilog -lib WORK vlist > reports + "/" + DESIGN_TOP_preROUTE + "_anal.rpt"
    if (dc_shell_status == 0) {
        echo "Error - Analyze Failed"
        quit
    }

    /* E L A B O R A T E */
    sh test -f DESIGN
    if (dc_shell_status == 0) {
        remove_variable dlist > /dev/null
        dlist = execute(-s, sh echo `cat DESIGN `)
    }
    foreach (dsn, dlist) {
        elaborate dsn -arch "verilog" -lib WORK -update > reports + "/" + DESIGN_TOP_preROUTE + "_elab.rpt"
        if (dc_shell_status == 0) {
            list dsn
            echo "Error: specified DESIGN not found"
        }
    }
First, to make the scripts design independent, a method to separate
module-specific names had to be found.
The execute(-s, sh echo `cat SYNMODEL `) command reads in the content
of the file fpga4.list, which is a simple list of file names that represent
the modules to be processed.
    fpga4.v mscan.v clkgate.v bidirecpad8.v

Design-specific information is stored in the variable SYNMODEL. Some exception handling has proved very helpful for debugging and for catching invalid inputs. The following command tests for the existence of the necessary variables and gives a warning if an error occurs during execution of the scripts.
    sh test -f SYNMODEL
    if (dc_shell_status == 0) {
        commands...
    }

The desired action is then performed in turn for each module listed in the file fpga4.list.
    foreach (dsn, vlist) {
        elaborate dsn -arch "verilog" -lib WORK -update > reports + "/" + DESIGN_TOP_preROUTE + "_elab.rpt"
        if (dc_shell_status == 0) {
            echo "Error: specified DESIGN not found"
        }
    }
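
The pattern used by these scripts (check that the list file exists, read its entries, apply one generic action to each entry and report failures) can be restated outside dc_shell. The following Python sketch is only an illustration of that control flow, not a replacement for the synthesis scripts; the function names `read_list_file` and `run_step` and the stand-in action are hypothetical.

```python
from pathlib import Path
import tempfile


def read_list_file(path):
    """Return the whitespace-separated entries of a list file, or None if it is missing."""
    p = Path(path)
    if not p.is_file():            # plays the role of `sh test -f SYNMODEL`
        return None
    return p.read_text().split()   # plays the role of echo `cat SYNMODEL`


def run_step(list_file, action):
    """Apply `action` to every entry of the list file.

    `action` is a hypothetical callable that returns True on success.
    Returns (entries, failed), where `failed` holds the entries that did not succeed.
    """
    entries = read_list_file(list_file)
    if entries is None:
        raise FileNotFoundError(f"list file not found: {list_file}")
    failed = [name for name in entries if not action(name)]
    return entries, failed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        list_file = Path(tmp) / "fpga4.list"
        list_file.write_text("fpga4.v mscan.v clkgate.v bidirecpad8.v\n")
        # Stand-in action: accept only Verilog source files.
        entries, failed = run_step(list_file, lambda name: name.endswith(".v"))
        print("processed:", entries)
        print("failed:", failed)
```

Because the step itself never names a module, switching to another design only means pointing it at a different list file, which is the property the generic scripts rely on.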
One approach to FPGA emulation is to implement all modules in
one FPGA to test the functionality.
Another approach is to partition the design and implement each part in a different FPGA.
To find the best approach, the author first analyzed all modules to determine the number of
logic cells required for each module. The Barracuda project also requires diagnosis
of internal bus signals. Therefore, the modules of the project have
been distributed over multiple devices. It turned out that there was no alternative
in any case, because the number of logic elements required by the Barracuda design
exceeds by far the number of logic cells available even in the biggest Flex10k FPGA.
After all self-contained top-level modules of the design had passed the
ALTERA FPGA design tool successfully, it turned out that the CORE module fitted in
one Flex10kE200 FPGA and the interface modules (SCI, SPI, I2C, and BDLC) in one
Flex10kE100. Large parts of the FPGA resources were still free, so we decided to
add all other modules (KWU, TIMER, and PWM) to the design.
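
The fitting question described above can be sketched as a small bin-packing check. The Python example below is a simplified illustration only: the real partitioning also took bus diagnosis and pin counts into account, the per-module cell counts are hypothetical placeholders (the measured values are not given here), and the device capacities are passed in as parameters; the values 9984 and 4992 are assumed nominal logic element counts for the two device types.

```python
def first_fit_decreasing(module_cells, capacities):
    """Assign modules to devices, largest module first.

    module_cells: {module name: required logic cells}
    capacities:   list of available logic cells per device
    Returns (placement, free), where placement maps a device index to the
    list of module names placed on it and free lists the remaining cells.
    Raises ValueError if some module does not fit on any device.
    """
    free = list(capacities)
    placement = {i: [] for i in range(len(capacities))}
    by_size = sorted(module_cells.items(), key=lambda kv: kv[1], reverse=True)
    for name, cells in by_size:
        for i, room in enumerate(free):
            if cells <= room:
                free[i] -= cells
                placement[i].append(name)
                break
        else:
            raise ValueError(f"{name} ({cells} cells) does not fit on any device")
    return placement, free


if __name__ == "__main__":
    # Hypothetical cell counts, for illustration only.
    modules = {"CORE": 9800, "TIMER": 1200, "BDLC": 1000, "PWM": 700,
               "SCI": 600, "I2C": 500, "SPI": 400, "KWU": 300}
    placement, free = first_fit_decreasing(modules, [9984, 4992])
    print(placement)
    print(free)
```

With a single device in `capacities`, the same call raises `ValueError` for these placeholder numbers, which corresponds to the situation where the design does not fit into one FPGA.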
The architecture of the Barracuda MCU includes an IPbus interface
between the CPU and the interface modules.
The second major task of this thesis was the development of an IPbus interface for each interface module,
since the modules taken from the old (JUPITER) project had a STAR12 MCU bus interface.
The basic function of the IPbridge is to convert signals (e.g.
the readwrite signal is converted to read and write) and to latch the data and
address busses.
For each interface module of the Barracuda design (SCI,
SPI, I2C, BDLC, KWU, CRG, PIM, ECT) an IPbridge has been designed in the course
of this thesis so that the modules can be put together. After analyzing each
module, a pin table has been created that shows which signal of the old STAR12
bus interface has to be transformed into which signal of the IPbus interface.
Table 3 shows the STAR MCU bus <=> IPbus signal conversion, i.e. the input and
output signals of the IP-bridge module. The signal names and descriptions are
given. The primary function of each signal is described first, followed by the
secondary function if applicable. After the signal conversion had been worked
out theoretically, the Verilog RTL code of each IPbus interface was
developed and synthesized.
The complete list of IPbus interfaces is shown
in appendix B.
STAR 12 Bus | Bit | Description | IP-bus | Signal | Direction |
---|---|---|---|---|---|
IPbus-CLOCKS & RESETS | | | | | |
core_clk34 | 1 | System Clock 34 | module clock | clk34 | I |
core_clk41 | 1 | System Clock 41 | module clock | bus_clk | I |
core_rst_t3 | 1 | Hardware Reset | asynchronous reset | hard_rst_b | I |
 | | Software Reset | synchronous reset | soft_rst_b | I |
IPbus-DATA BUSSES | | | | | |
rdb_t2 | 16 | Read Data Bus | Output data bus; always driven | data_rd | O |
core_wdb_t4 | 16 | Write Data Bus | Input data bus | data_wr | I |
IPbus-DATA BUSSES | | | | | |
core_ab_t2 | 4 | Address Bus | System address bus | addr | I |
core_sz8_t2 | 1 | Size 8 Signal (for 8-bit accesses) | enable byte accesses. One bit for each byte in the data buses | byte_en_7_0_b, byte_en_15_8_b | I |
core_rw_t2 | | Read Write Signal | read signal | read_en_b | I |
 | | | write signal | write_en_b | I |
IPbus-MODE SIGNALS | | | | | |
core_stop_t2 | | | module should enter doze mode | doze_mode | I |
 | | | module should enter freeze mode | freeze_mode | I |
core_stop_t2 | 1 | Stop Signal | module should enter stop mode | stop_mode | I |
 | | | module should enter supervisor mode | supervisor_mode | I |
core_bdmact_t2 | 1 | Background Debug Mode active | module should enter test mode | test_mode | I |
core_wait_t2 | 1 | Wait Signal | module should enter wait mode | wait_mode | I |
core_smod_t2 | 1 | Special Mode | | smodT4 | I |
core_scanmod | 1 | Scanmode Signal | | scanmod | I |
IPbus-INTERRUPT SIGNALS | | | | | |
ffxx | | | Interrupt acknowledge | int_ack, int_vector, rd_int_vector_b | I |
Module Plug SIGNALS | | | | | |
dlc_puerst_plug | 1 | Determines reset state of DLCPUE1 bit of DLCSCR register | | | I |
core_en2drv | 1 | Enable second driver for rdb_t2 | | | I |
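
The control-signal part of Table 3 can be illustrated with a small behavioural model. The Python sketch below is not the Verilog RTL of the IPbridges. It only mirrors the idea of splitting the read/write signal into read_en_b and write_en_b, deriving the two byte enables from the size signal, and latching the address and data busses. Several details are assumptions that the table does not state: that a high read/write level means a read, that a module select input (`sel`, hypothetical) gates the enables, and that an 8-bit access to an even address uses data bits 15:8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPBusControl:
    """Active-low IPbus control outputs (0 = asserted)."""
    read_en_b: int
    write_en_b: int
    byte_en_15_8_b: int
    byte_en_7_0_b: int


def convert_control(sel, rw, sz8, addr0):
    """Combinational conversion of STAR12-style controls to IPbus-style controls.

    Assumptions (not taken from Table 3): sel=1 selects the module, rw=1 is a
    read, and for sz8=1 an even address (addr0=0) selects data bits 15:8.
    """
    read_en_b = 0 if (sel and rw) else 1
    write_en_b = 0 if (sel and not rw) else 1
    if not sel:
        hi_b, lo_b = 1, 1          # module not addressed: no byte enabled
    elif sz8:
        hi_b = 0 if addr0 == 0 else 1
        lo_b = 0 if addr0 == 1 else 1
    else:
        hi_b, lo_b = 0, 0          # 16-bit access: both bytes enabled
    return IPBusControl(read_en_b, write_en_b, hi_b, lo_b)


class Latch:
    """Register model: stores the masked input value whenever clock() is called."""

    def __init__(self, width):
        self.mask = (1 << width) - 1
        self.q = 0

    def clock(self, d, enable=True):
        if enable:
            self.q = d & self.mask
        return self.q


if __name__ == "__main__":
    addr = Latch(4)     # core_ab_t2 is 4 bits wide in Table 3
    data = Latch(16)    # data busses are 16 bits wide in Table 3
    addr.clock(0x3)
    data.clock(0xBEEF)
    print(convert_control(sel=1, rw=0, sz8=1, addr0=addr.q & 1), hex(data.q))
```

Keeping the conversion purely combinational and the latches separate reflects the two functions named for the IPbridge; how the real bridges time the latching relative to the bus clocks is not modelled here.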
Another major constraint of this project was to make all pins of
the Barracuda design visible on the board. It turned out, however, that this exceeds by far the
number of pins available on the ALTERA FPGAs used.
The author's proposal
not to implement port signals that control analogue port functions,
e.g. input buffer enable or pull-up enable signals, proved to be the right approach. More
than 50 pins could be saved by leaving these signals unconnected.
After the
pin-out of the STAR MCU bus and the interface modules had been determined, the
third major task of this thesis was the development of wrapper modules for each
FPGA that have the final pin-out of the FPGAs, see figures 2 to 5.
The pin-out of FPGA1 includes the STAR-MCU bus and a port interface.
Originally, a constraint of the project was to keep some pins reserved for
future improvements of the design, e.g. additional interrupt signals that have
to be routed on the board from FPGA2/3/4 to FPGA1 (CORE). However, all pins of FPGA1
have been used for MCU-bus, memory and port interface signals.
One way to
solve this problem would have been to implement the CORE in two FPGAs. The
submodules of the CORE would then have been separated at some boundary of the module's
internal hierarchy. It turned out that the interfaces of the submodules had considerably more
signals than the interface of the top-level module. Moreover, the CORE module
was timing critical, which means that an implementation in two FPGAs would cause
additional delay, since formerly internal signals would have to be
routed on the prototype board.
At this stage of the project, all signals
known after analyzing the old JUPITER modules could be implemented in the FPGA
while keeping around 20 reserved pins free for future use. We decided to use only one
FPGA to implement the CORE module, but we had to keep in mind the problem of
too few pins being available for future use.
It proved to be impossible to implement all peripheral interface
modules in one FPGA since the number of required logic cells exceeds the size
of one FPGA.
The Port Integration Module (PIM) contains the port functionality
of each interface module. If some interface modules are located in a different FPGA
than the PIM module, the question arises of where to route the port signals of
these modules.
One option would be to split the PIM module or to leave it out,
as this module consists mainly of analogue components that cannot be implemented
in an FPGA.
Another option would be to route the port signals between FPGAs.
The author's proposal to keep the PIM module as it is and to implement it in
FPGA2 has been realized, because the PIM module includes some registers and has
its own IPbus interface. This made some intermodule signals between FPGA2 and
FPGA3/4 necessary. FPGA3 and FPGA4 have no peripheral ports. The port signals
can_ind, can_dout and can_oen from each of the CAN0/1/2/3 modules (FPGA3/4) lead to the PIM
module, located in FPGA2.
Figure 6 shows the module layout of the
Barracuda ProtoBoard. Each FPGA is connected to the STAR12 MCU bus.