Abstract | The development of a digital system is generally based on a specification of the system's functional requirements and on a description of that specification at several levels of abstraction. This leads to the final selection of implementation components and the definition of their mutual relationships within the system. The emergence and development of programmable platforms for embedded systems, and of the accompanying software tools, advance the automation of system design. On the specification side, progress has been achieved through the development and use of special-purpose programming languages and of translators from widely known high-level programming languages into hardware description languages. As a result, the system specification has a more direct influence on the final implementation, which can thus be better tailored to the desired application. This work analyzes the selection and configuration of the components of an application-specific processor with the aim of improving the performance of the intended application. The proposed processor models are also evaluated with respect to the properties of the final hardware implementation. For a specification written algorithmically, as code in the C programming language, the datapath of the processor architecture is generated on the basis of an analysis of operand and operation usage. According to the requirements of the code, divided into control flow blocks and scheduled over the states of a finite state machine, register, register file and functional unit components are instantiated together with their accompanying interconnect. Such a fully custom datapath shows an advantage over a conventional datapath in its ability to achieve greater parallelism, but lags behind an implementation obtained by high-level synthesis with fully custom control logic. |
Abstract (english) | Technology constantly improves, and the capabilities available for digital embedded system design rise rapidly. Designer productivity is not sufficient to benefit from technology advances at their pace; this is the hardware design productivity gap. Efforts to close the gap are supported by design automation software tools that work at higher levels of design abstraction. However, the production of such software falls far short of expectations, so one can speak of an even larger system design gap. This work, like many others, is motivated by the task of closing the system design gap by shortening design time. It is inspired by the system-level design approach, as it assumes a high-level specification in C, one of the most popular programming languages. The design is implemented on an FPGA platform, the most widely available prototyping platform in the designer community. The key point of the design flow is the production of a processor architecture datapath as an intermediate abstraction of the solution that is understandable to a common user. Knowing how datapath components map to the target platform lets the designer influence the selection of component types and gives insight into the occupation of implementation resources. In that way, the idea of an area-speed design trade-off is supported. Also, the approach is not primarily focused on handling either control-intensive or data-processing-intensive applications, but is intended for applications of any nature or size. Unlike the traditional hardware design approaches applied in well-known high-level synthesis, this work does not use any compiler or data flow optimizations of the specification. It focuses on processor datapath optimizations, where any change in the design specification is reflected in the datapath architecture. In that way, control over hardware resource occupation during design is emphasized as a key point of every hardware design. 
The introduction to the thesis is provided in Chapter 1, which presents the motivation and approach of the work, an overview of the accomplished scope and a short review of related work. Chapter 2 discusses embedded systems, which are ubiquitous in almost all domains of human life and especially in high-end technological solutions. It also presents the complexity and development scale of the supporting software, with a focus on system-level design. Within this topic, high-level synthesis is elaborated as a custom hardware design approach. It is closely related to the topic of this work in its concept of high-level specification, its design methodology and its selection of tools, languages and implementation platforms. As another important aspect of this work, the concept of processor architecture is introduced, with its fundamental features and an overview of the types implemented in hardware. A special point of interest is the concept of embedded processors, which are available as soft and hard processor cores optimized for implementation on FPGA platforms. Chapter 3 begins the elaboration of the implemented methodology. The pre-processing of the C code specification is described as the initial step of the flow. It consists of transforming the C code into a control and data flow graph (CDFG), with special interest in the extraction of control and data dependencies. The CDFG is constructed from three-address code statements, which map clearly onto datapath functional unit constructs and form an appropriate basis for constructing the processor architecture. The No-Instruction-Set Computer concept is chosen as the target processor model. This concept assumes a fully custom datapath and no predefined instruction set or format; the instruction format is formed according to the contents of the datapath. This work therefore produces a new processor datapath and instruction format for any C code specification. The custom datapath construction methodology is presented in Chapter 4. 
It is based on scheduling the CDFG three-address statements, which is undertaken independently for every control flow block. After that, the analysis of operation and operand usage per scheduled state is described. This analysis allows optimization of the datapath components, i.e. a reduction in the number of instantiated components without consequences for the three-address code schedules. The allocation and binding of three-address code to components is thus explicit. The provisional datapaths of all control flow blocks are therefore designed with an optimal assignment of operands to storage components and of operations to functional units. The synthesis of the final datapath is the topic of Chapter 5, which highlights the key characteristics of the datapath design. The datapath is formed from the contributions of all control flow blocks, expressed through their optimized provisional datapaths. The control flow blocks are ordered by their significance, i.e. their percentage share of the total number of scheduled execution states. The demand for operations is analyzed per operation type and reported to the designer as a potential design constraint. The final datapath is formed by incrementally adding functional unit components from the provisional datapaths in order of their significance. The significance criterion is applied with the goal of better adapting the datapath to the more demanding application functionalities when implementation resources are limited, and of preserving the execution cycle count. The datapath is then extended with the components that accompany the functional units: storage components (registers and register files) and arbitration components (multiplexers). Chapter 6 presents the experimental results. There are three real-life test cases and variations of their C code, characterized by their control flow structures. The evaluation is performed by measuring the design time, by comparison with two other processor concepts and a high-level synthesis tool, and by comparing the characteristics of the resulting datapaths. 
Chapter 7 concludes the work with a review of the research accomplishments and scientific contributions. The presented research evaluates the custom hardware design approach with respect to design time, performance and resource demands. A high-level procedural language familiar to a broad spectrum of users is assumed as the specification of the application. The processor architecture abstraction, which is comprehensible to a common user, is assumed as the middle point of the design flow. An FPGA implementation is achieved as the final product. The results show a low design time, in the range of seconds, and performance in the range of a RISC-type processor model: better than a typical embedded processor, but worse than an exemplary high-level synthesis tool. The designer can influence the result through insight into the platform descriptions of the processor components and through the specification of functional unit constraints. The implemented flow does not take into consideration any type of pipelining or functional unit chaining, which is recognized as an area of future performance improvement. With regard to controlling the usage of implementation resources, the greedy constructive nature of the applied datapath synthesis proved fast, but considerably inefficient. This exposes the need for iterative refinement of the datapath. |
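
The significance-ordered datapath synthesis summarized for Chapter 5 can be illustrated with a small Python sketch. It is not the thesis implementation: the function names (`fu_needs`, `merge_datapaths`), the representation of a schedule as a list of states holding operation types, the block names, and the single `fu_budget` limit on the total number of functional units are all assumptions made for illustration.

```python
from collections import Counter


def fu_needs(schedule):
    """Peak number of functional units per operation type over all states.

    `schedule` is a list of states; each state lists the operation types
    issued in that state (an assumed, simplified representation).
    """
    needs = Counter()
    for state in schedule:
        for op, count in Counter(state).items():
            needs[op] = max(needs[op], count)
    return needs


def merge_datapaths(blocks, fu_budget):
    """Greedily build a final functional-unit set from per-block needs.

    Blocks are visited by significance (share of all scheduled states).
    Units are added one at a time until `fu_budget` (hypothetical overall
    resource limit) is reached.
    """
    total_states = sum(len(schedule) for schedule in blocks.values())
    order = sorted(blocks, key=lambda name: len(blocks[name]), reverse=True)
    final = Counter()
    for name in order:
        for op, need in sorted(fu_needs(blocks[name]).items()):
            while final[op] < need and sum(final.values()) < fu_budget:
                final[op] += 1
    significance = {name: len(blocks[name]) / total_states for name in order}
    return final, significance


# Hypothetical control flow blocks with already scheduled states.
blocks = {
    "loop_body": [["add", "mul"], ["add", "add"], ["mul"], ["add"]],
    "init": [["add"]],
    "exit": [["sub", "sub"], ["add"]],
}

final, significance = merge_datapaths(blocks, fu_budget=4)
print(list(significance))  # ['loop_body', 'exit', 'init']
print(dict(final))         # {'add': 2, 'mul': 1, 'sub': 1}
```

With a budget of four units, the most significant block, `loop_body`, receives its two adders and one multiplier first, so `exit` gets only one of the two subtractors it could use. Raising `fu_budget` to five or more would satisfy every block in this example.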