Virtual Platform for Heterogeneous 3D-MPSoC Design Exploration

September 12, 2017 | Author: Find Programmer | Category: Cpu Cache, Arm Architecture, Supercomputer, Graphics Processing Unit, Mips Instruction Set

UMR CNRS 6082 – FOTON - Fonctions Optiques pour les Technologies de l'informatiON
INRIA Rennes – Bretagne Atlantique – IRISA UMR CNRS 6074
UMR CNRS 5270 - Institut des Nanotechnologies de Lyon, Ecole Centrale de Lyon

* Goals of the project
* Requirements of the framework
* Options of frameworks
* Roadmap
* Gem5
* Research discussion

[Figure: framework overview. Architecture layers: Optical Layer, Memory Layer, Processing Layer, NoC. Computation: HO-COMP, HE-COMP. Communication: HO-COMM, HE-COMM. Supporting blocks: Monitors, Application, Benchmark, Analysis tools.]

HO/HE-COMP: Homogeneous/heterogeneous computation structure
HO/HE-COMM: Homogeneous/heterogeneous communication structure
SM-DM: Shared and distributed memory architecture

Models:
* Flexible (parameterized)
* Support different computation, memory and communication components:
-> Processors, accelerators
-> Memory organizations (shared, distributed)
-> Electrical NoC, bus, crossbar, optical NoC
* Support reconfigurable architectures

Simulation:
* Scalable
* Fast
* Parallel

Framework:
* Documented
* Accepted by the scientific community
* Acceptable complexity
* Free if possible

Tool | Language | Type of model | Components | Memory architecture | Documented | Availability
Simics | C++, Fortran | Functional-accurate | Processors (ALPHA, ARM, MIPS, Power PC, SPARC, x86); OS (Linux, VxWorks, Solaris, FreeBSD, QNX, RTEMS) | Local and shared | Yes (***) | Wind River Systems, licensed
SimpleScalar | C | Cycle-accurate | Processors (ALPHA, ARM, Power PC, x86); OS (Linux) | Local, shared and distributed | No longer | Open
OVP | SystemC | Instruction-accurate | Processors (Open Cores, RISC, ARM, MIPS, Power PC, MicroBlaze, ...); OS (Linux) | Local and shared | Yes (**) | Imperas, open/licensed
Rabbits | SystemC, QEMU | Cycle-accurate | Processors (ARM, x86, Power PC) | Local and shared | Yes (*) | Open
Gem5 | C++, Python | Cycle-accurate | Processors (ALPHA, ARM, Power PC, MIPS, SPARC, x86); OS (Linux, Android, ...) | Local and shared | Yes (***) | Open

COMM-design
Model 1: HO-COMP, HO-COMM, SM
Model 1b: HO-COMP, HO-COMM, DM
Model 3: HE-COMP, HO-COMM
Model 3b: HE-COMP, HO-COMM, DM
Model 5: HE-COMP (GPU), HO-COMM
Model 5b: HE-COMP (GPU), HO-COMM, DM

COMP-design
Model 2: HO-COMP, HE-COMM
Model 4: HE-COMP, HE-COMM
Model 6: HE-COMP (GPU), HE-COMM
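The roadmap models can be captured as plain data so that a design-space script can filter them. The sketch below is only an illustration: the class name PlatformModel, its field names and the select helper are hypothetical, and the memory field is left as None wherever the slide does not state SM or DM.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlatformModel:
    name: str
    computation: str        # "HO" (homogeneous) or "HE" (heterogeneous)
    communication: str      # "HO" or "HE"
    memory: Optional[str]   # "SM", "DM", or None when the slide does not say
    gpu: bool = False


# Transcription of the roadmap; nothing is filled in beyond what the slide lists.
MODELS = [
    PlatformModel("1", "HO", "HO", "SM"),
    PlatformModel("1b", "HO", "HO", "DM"),
    PlatformModel("2", "HO", "HE", None),
    PlatformModel("3", "HE", "HO", None),
    PlatformModel("3b", "HE", "HO", "DM"),
    PlatformModel("4", "HE", "HE", None),
    PlatformModel("5", "HE", "HO", None, gpu=True),
    PlatformModel("5b", "HE", "HO", "DM", gpu=True),
    PlatformModel("6", "HE", "HE", None, gpu=True),
]


def select(computation=None, communication=None, memory=None):
    """Return the names of the models matching every criterion that is given."""
    return [
        m.name
        for m in MODELS
        if (computation is None or m.computation == computation)
        and (communication is None or m.communication == communication)
        and (memory is None or m.memory == memory)
    ]
```

For example, select(communication="HE") lists the models with a heterogeneous communication structure, and select(memory="DM") lists the distributed-memory variants.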

Existing models

Gem5
* Discrete-event-driven simulator platform
* Modular design
* Multiple CPU models
* Two execution modes: System-call Emulation and Full-system
* Two memory system models: Classic and Ruby
* Ruby allows flexible memory system modelling
* Used by the scientific community in MPSoC and optical network design
* Extension with accelerators (GPUs): Gem5-GPU
* Extension with NoC simulators (TOPAZ)
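To make the term "discrete event driven" concrete, here is a minimal sketch of such a simulation loop. It is a toy written for illustration, not gem5 code: the EventQueue class, its methods and the example callbacks are all hypothetical. Time only advances when the next scheduled event is popped, which is what distinguishes this style from stepping every component on every cycle.

```python
import heapq
import itertools


class EventQueue:
    """Toy discrete-event scheduler: events run in timestamp order."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker for equal timestamps
        self.now = 0

    def schedule(self, delay, callback):
        heapq.heappush(self._heap, (self.now + delay, next(self._order), callback))

    def run(self):
        while self._heap:
            # Jump straight to the time of the next event.
            self.now, _, callback = heapq.heappop(self._heap)
            callback()


def demo():
    eq = EventQueue()
    log = []

    def cache_respond():
        log.append((eq.now, "cache responds"))

    def cpu_issue():
        log.append((eq.now, "cpu issues load"))
        eq.schedule(3, cache_respond)  # assumed 3-tick cache latency

    eq.schedule(1, cpu_issue)
    eq.schedule(2, lambda: log.append((eq.now, "timer tick")))
    eq.run()
    return log
```

Calling demo() returns the events in the order they fired, each tagged with the simulated time at which it ran.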

2014
P. Grani. “From Hybrid Electro-Photonic to All-Optical On-chip Interconnections for Future CMPs”. In Proc. International Conference on High Performance Computing & Simulation (HPCS), 2014.
P. Grani, S. Bartolini. “Simultaneous Optical Path Setup for Reconfigurable Photonic Networks in Tiled CMPs”. In Proc. International Conference on High Performance Computing & Simulation (HPCS), 2014.
Z. Li, A. Qouneh, M. Joshi, W. Zhang, X. Fu, T. Li. “Aurora: A Cross-Layer Solution for Thermally Resilient Photonic Network-on-Chip”. In IEEE Transactions on Very Large Scale Integration (VLSI) Systems, February 2014.
R. W. Morris, A. Karanth, A. Louri, R. D. Whaley. “Three-Dimensional Stacked Nanophotonic Network-on-Chip Architecture with Minimal Reconfiguration”. In IEEE Transactions on Computers, Vol. 63, January 2014.

2013
Laer, T. Jones, P. Watts. “Full System Simulation of Optically Interconnected Chip Multiprocessors Using gem5”. In Proc. Optical Fiber Communication Conference, 2013.
M. Glick, S. Rumley, R. Hendry, K. Bergman, R. Dutt. “Modeling and Simulation Environment for Photonic Interconnection Networks in High Performance Computing”. Report, University of Columbia, 2013.
H. Chung. “Optimal Network Topologies and Resource Mappings for Heterogeneous Networks-on-Chip”. Ph.D. thesis, Portland State University, 2013.
S. Bartolini, P. Grani. “Co-tuning of a Hybrid Electronic-Optical Network for Reducing Energy Consumption in Embedded CMPs”. In Proc. MES, 2013.

2012
M. Zhang, L. He, D. Fan. “Self-Correction Trace Model: A Full-System Simulator for Optical Network-on-Chip”. In Proc. IEEE 26th International Parallel and Distributed Processing Symposium, 2012.
P. Abad, P. Prieto, L. Menezo, A. Colaso, V. Puente, J. Gregorio. “TOPAZ: An Open-Source Interconnection Network Simulator for Chip Multiprocessors and Supercomputers”. In Proc. NOCS, 2012.
H. Chung, C. Teuscher, P. Pande. “Design and Evaluation of Technology-Agnostic Heterogeneous Networks-on-Chip”. In JETC, 2012.

* Bbench-gem5: Android-based web browser application
* Moby: Android-based mobile applications (web browser, social network, on-line shopping, email, audio, video, document, map, games)
* DaCapo: Java-based applications
* SPLASH: Linux-based applications
* Rodinia, Parboil: CUDA

Processing:
* Computation: call of different processors
* Memory mapping

Communication:
* New optical components (garnet/simple network)
* Modification in protocols (Ruby management)
* 3D latency and bandwidth model (link and router parameters)

Memory:
* Distributed memory management
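A latency and bandwidth model driven by link and router parameters could look like the sketch below. This is a hypothetical first-order estimate, not the model used in the project or in gem5: it assumes dimension-ordered routing on a 3D mesh, a fixed per-router delay, separate parameters for planar and vertical links, no contention, and serialization limited by the narrowest link on the path. All names (LinkParams, zero_load_latency) are invented for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkParams:
    latency_cycles: int  # cycles to cross one link
    width_bits: int      # bits transferred per cycle


def hop_counts(src, dst):
    """Planar (x, y) and vertical (z) hops between two (x, y, z) coordinates."""
    planar = abs(src[0] - dst[0]) + abs(src[1] - dst[1])
    vertical = abs(src[2] - dst[2])
    return planar, vertical


def zero_load_latency(src, dst, packet_bits, router_cycles, planar, vertical):
    """Estimated cycles for one packet on an otherwise idle network."""
    n_planar, n_vertical = hop_counts(src, dst)
    routers = n_planar + n_vertical + 1  # every hop plus the source router
    head = (
        routers * router_cycles
        + n_planar * planar.latency_cycles
        + n_vertical * vertical.latency_cycles
    )
    # Serialization is bounded by the narrowest link type actually used.
    widths = [
        link.width_bits
        for link, hops in ((planar, n_planar), (vertical, n_vertical))
        if hops > 0
    ]
    serialization = math.ceil(packet_bits / min(widths)) if widths else 0
    return head + serialization
```

With a 512-bit packet from (0, 0, 0) to (2, 1, 1), 2-cycle routers, 128-bit planar links and 64-bit vertical links (1 cycle each), the estimate is 5 routers × 2 + 4 link cycles + 8 serialization cycles = 22 cycles.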

Ruby: implements the model of the memory subsystem (cache hierarchies, replacement policies, coherence protocol, interconnection networks, memory controllers, sequencers).

SLICC: high-level specification language (specifies the functionality of the memory controllers).
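A controller specification of this kind is essentially a table of (state, event) pairs mapped to a next state and a list of actions. The sketch below shows that idea in Python; it is not SLICC syntax and not a protocol shipped with gem5. The three states, the events and the action names are assumptions chosen for a toy single-owner protocol.

```python
# (current state, event) -> (next state, actions to perform)
TRANSITIONS = {
    ("I", "Load"): ("IM", ["issue_request"]),
    ("I", "Store"): ("IM", ["issue_request"]),
    ("IM", "Data"): ("M", ["write_data_to_cache", "notify_sequencer"]),
    ("M", "Load"): ("M", ["notify_sequencer"]),
    ("M", "Store"): ("M", ["notify_sequencer"]),
    ("M", "Replacement"): ("I", ["writeback"]),
}


class CacheLineController:
    """Toy table-driven controller for a single cache line."""

    def __init__(self):
        self.state = "I"  # I = invalid, IM = waiting for data, M = modified
        self.actions_log = []

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition for state {self.state!r} on {event!r}")
        next_state, actions = TRANSITIONS[key]
        self.actions_log.extend(actions)
        self.state = next_state
        return next_state
```

Keeping the protocol in a table makes it easy to explore variants: a new protocol is a different table, while the dispatch code stays unchanged.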

3D design
* Topology exploration (islands, stacked, clusters)
* Thermal management
* Routing (crosstalk avoidance, adaptive)
* Mapping (performance, cost and thermal)
Complexity: (*), (***), (**), (**)

Protocol
* Protocol exploration
* Services for optical networks (QoS and security)
Complexity: (**), (**)

Applications
* Traffic patterns
* Benchmarking (*/**/***)
* Other applications (neural coding)
Complexity: (*), (*), (*/**/***)

Thermal issues?
Distributed memory?

[Figure: Ruby memory request path. Blocks shown: Ruby Port, Sequencer, Mandatory queue (delay), L1 cache controller (L1 latency), Message Buffers, Directory, Ruby memory controller, memory controller.]

1. A memory request from another PE is received through the RubyPort function RubyPort::recvTiming (src/mem/ruby/system/RubyPort.hh-cc).

2. The RubyPort converts the request into a RubyRequest object (compliant with the Ruby components). Two tasks are performed:
* Verify that the request is valid
* Send the request to the proper sequencer interface (pointed to by Ruby): Sequencer::makeRequest

3. The Sequencer performs the following tasks:
* Resource allocation
* Push the request into the Ruby cache hierarchy -> the request is placed on the Mandatory_queue (m_mandatory_q_ptr)

4. The L1 cache controller dequeues the request from the Mandatory_queue and looks up the cache:
* Hit:
-> Reviews the coherence permissions
-> L1 informs the sequencer of the hit: readCallback / writeCallback
* Miss:
-> Pushes the request to the next level of the hierarchy
-> Uses the MessageBuffer class (src/mem/ruby/buffers/MessageBuffer.hh-cc)

MessageBuffer: entry point of coherence messages to the NoC; the buffers are connected according to the NoC topology.

5. The Sequencer calls the RubyPort and returns the result: RubyPort::ruby_hit_callback
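The five steps above can be traced with a small Python mock-up. It mirrors the names from the slides (recvTiming, makeRequest, the mandatory queue, readCallback/writeCallback, ruby_hit_callback) in snake_case, but everything else is an assumption made for illustration: the classes are not gem5's C++ classes, timing is ignored, a plain deque stands in for the MessageBuffer, and the miss path stops once the request has been forwarded.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RubyRequest:
    addr: int
    is_write: bool


class Sequencer:
    def __init__(self, port):
        self.port = port
        self.mandatory_queue = deque()
        self.outstanding = {}

    def make_request(self, req):
        self.outstanding[req.addr] = req   # step 3: resource allocation
        self.mandatory_queue.append(req)   # step 3: push to the cache hierarchy

    def read_callback(self, req):
        self._complete(req)

    def write_callback(self, req):
        self._complete(req)

    def _complete(self, req):
        del self.outstanding[req.addr]
        self.port.ruby_hit_callback(req)   # step 5: return the result


class RubyPort:
    def __init__(self):
        self.sequencer = Sequencer(self)
        self.completed = []

    def recv_timing(self, addr, is_write):
        if addr < 0:                       # step 2: reject an invalid request
            return False
        self.sequencer.make_request(RubyRequest(addr, is_write))
        return True

    def ruby_hit_callback(self, req):
        self.completed.append(req.addr)


class L1Controller:
    def __init__(self, sequencer, cached_lines):
        self.sequencer = sequencer
        self.cached_lines = set(cached_lines)
        self.to_next_level = deque()       # stands in for a MessageBuffer

    def wakeup(self):
        # Step 4: drain the mandatory queue and look up each address.
        while self.sequencer.mandatory_queue:
            req = self.sequencer.mandatory_queue.popleft()
            if req.addr in self.cached_lines:
                if req.is_write:
                    self.sequencer.write_callback(req)
                else:
                    self.sequencer.read_callback(req)
            else:
                self.to_next_level.append(req)


def demo():
    port = RubyPort()
    l1 = L1Controller(port.sequencer, cached_lines=[0x40])
    port.recv_timing(0x40, is_write=False)  # will hit
    port.recv_timing(0x80, is_write=True)   # will miss
    l1.wakeup()
    return port.completed, [r.addr for r in l1.to_next_level]
```

In demo(), the read to 0x40 hits and completes through the port, while the write to 0x80 misses and is left waiting in the buffer toward the next level.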
