Virtual Platform for Heterogeneous 3D-MPSoC Design Exploration
Description
UMR CNRS 6082 – FOTON - Fonctions Optiques pour les Technologies de l'informatiON
INRIA Rennes – Bretagne Atlantique – IRISA UMR CNRS 6074
UMR CNRS 5270 - Institut des Nanotechnologies de Lyon, Ecole Centrale de Lyon
* Goals of the project
* Requirements of the framework
* Options of frameworks
* Roadmap
* Gem5
* Research discussion
[Figure: platform overview. Layers: optical layer, memory layer, processing layer. Computation: HO-COMP, HE-COMP. Communication (NoC): HO-COMM, HE-COMM. Also shown: monitors, application benchmarks, analysis tools.]
HO/HE-COMP: Homogeneous/heterogeneous computation structure
HO/HE-COMM: Homogeneous/heterogeneous communication structure
SM-DM: Shared and distributed memory architecture
Models: flexible (parameterized)
* Support different computation, memory and communication components
  * Processors, accelerators
  * Memory organizations (shared, distributed)
  * Electrical NoC, bus, crossbar, optical NoC
* Support reconfigurable architectures
Simulation: scalable, fast, parallel
Framework: documented, accepted by the scientific community, acceptable complexity, free if possible
Tool | Language | Type of model | Components | Memory architecture | Documented | Availability
Simics | C++, Fortran | Functional-accurate | Processors (ALPHA, ARM, MIPS, Power PC, SPARC, x86); OS (Linux, VxWorks, Solaris, FreeBSD, QNX, RTEMS) | Local and shared | Yes (***) | Wind River Systems, licensed
SimpleScalar | C | Cycle-accurate | Processors (ALPHA, ARM, Power PC, x86); OS (Linux) | Local, shared and distributed | No longer | Open
OVP | SystemC | Instruction-accurate | Processors (Open Cores, RISC, ARM, MIPS, Power PC, MicroBlaze, ...); OS (Linux) | Local and shared | Yes (**) | Imperas, open/licensed
Rabbits | SystemC, QEMU | Cycle-accurate | Processors (ARM, x86, Power PC) | Local and shared | Yes (*) | Open
Gem5 | C++, Python | Cycle-accurate | Processors (ALPHA, ARM, Power PC, MIPS, SPARC, x86); OS (Linux, Android, ...) | Local and shared | Yes (***) | Open
COMM-design
Model 1 HO-COMP HO-COMM SM
Model 1b HO-COMP HO-COMM DM
Model 3 HE-COMP HO-COMM
Model 3b HE-COMP HO-COMM DM
Model 5 HE-COMP (GPU) HO-COMM
Model 5b HE-COMP (GPU) HO-COMM DM
COMP-design
Model 2 HO-COMP HE-COMM
Model 4 HE-COMP HE-COMM
Model 6 HE-COMP (GPU) HE-COMM
Existing models
* Discrete-event-driven simulator platform
* Modular design
* Multiple CPU models
* Two execution modes: System-call Emulation and Full-system
* Two memory system models: Classic and Ruby
* Ruby allows flexible memory system modelling
* Used by the scientific community in MPSoC and optical network design
* Extension with accelerators (GPUs): Gem5-GPU
* Extension with NoC simulators (TOPAZ)
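To make the term "discrete-event-driven" concrete, here is a minimal sketch of such a simulation loop. It is not Gem5 code: the `EventQueue` class, the `simulate_memory_access` helper and the latency values are all hypothetical and only illustrate the general principle that simulated time jumps from one scheduled event to the next.

```python
import heapq
from typing import Callable, List, Tuple


class EventQueue:
    """Toy discrete-event scheduler (hypothetical, not the Gem5 API)."""

    def __init__(self) -> None:
        self.now = 0      # current simulated tick
        self._seq = 0     # tie-breaker: equal-tick events keep insertion order
        self._heap: List[Tuple[int, int, Callable[[], None]]] = []

    def schedule(self, delay: int, action: Callable[[], None]) -> None:
        """Schedule `action` to run `delay` ticks after the current tick."""
        heapq.heappush(self._heap, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self) -> None:
        """Process events in timestamp order until the queue is empty."""
        while self._heap:
            self.now, _, action = heapq.heappop(self._heap)
            action()


def simulate_memory_access(cache_latency: int,
                           memory_latency: int) -> List[Tuple[int, str]]:
    """Toy scenario: a cache miss that triggers a memory access."""
    eq = EventQueue()
    log: List[Tuple[int, str]] = []

    def memory_response() -> None:
        log.append((eq.now, "memory response"))

    def cache_miss() -> None:
        log.append((eq.now, "cache miss"))
        # The miss schedules a later event instead of blocking.
        eq.schedule(memory_latency, memory_response)

    eq.schedule(cache_latency, cache_miss)
    eq.run()
    return log
```

Calling `simulate_memory_access(2, 100)` logs the miss at tick 2 and the response at tick 102; no work is done for the idle ticks in between, which is what makes this style of simulation efficient.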
2014
* P. Grani. “From Hybrid Electro-Photonic to All-Optical On-chip Interconnections for Future CMPs”. In Proc. International Conference on High Performance Computing & Simulation (HPCS), 2014.
* P. Grani, S. Bartolini. “Simultaneous Optical Path Setup for Reconfigurable Photonic Networks in Tiled CMPs”. In Proc. International Conference on High Performance Computing & Simulation (HPCS), 2014.
* Z. Li, A. Qouneh, M. Joshi, W. Zhang, X. Fu, T. Li. “Aurora: A Cross-Layer Solution for Thermally Resilient Photonic Network-on-Chip”. In IEEE Transactions on Very Large Scale Integration (VLSI) Systems, February 2014.
* R. W. Morris, A. Karanth, A. Louri, R. D. Whaley. “Three-Dimensional Stacked Nanophotonic Network-on-Chip Architecture with Minimal Reconfiguration”. In IEEE Transactions on Computers, Vol. 63, January 2014.
2013
* Laer, T. Jones, P. Watts. “Full System Simulation of Optically Interconnected Chip Multiprocessors Using gem5”. In Proc. Optical Fiber Communication Conference, 2013.
* M. Glick, S. Rumley, R. Hendry, K. Bergman, R. Dutt. “Modeling and Simulation Environment for Photonic Interconnection Networks in High Performance Computing”. Report, University of Columbia, 2013.
* H. Chung. “Optimal Network Topologies and Resource Mappings for Heterogeneous Networks-on-Chip”. Ph.D. Thesis, Portland State University, 2013.
* S. Bartolini, P. Grani. “Co-tuning of a Hybrid Electronic-Optical Network for Reducing Energy Consumption in Embedded CMPs”. In Proc. MES, 2013.
2012
* M. Zhang, L. He, D. Fan. “Self-Correction Trace Model: A Full-System Simulator for Optical Network-on-Chip”. In Proc. IEEE 26th International Parallel and Distributed Processing Symposium, 2012.
* P. Abad, P. Prieto, L. Menezo, A. Colaso, V. Puente, J. Gregorio. “TOPAZ: An Open-Source Interconnection Network Simulator for Chip Multiprocessors and Supercomputers”. In Proc. NOCS, 2012.
* H. Chung, C. Teuscher, P. Pande. “Design and Evaluation of Technology-Agnostic Heterogeneous Networks-on-Chip”. In JETC, 2012.
* Bbench-gem5: Android-based web browser application
* Moby: Android-based mobile applications (web browser, social network, online shopping, email, audio, video, document, map, games)
* DaCapo: Java-based applications
* SPLASH: Linux-based applications
* Rodinia, Parboil: CUDA-based applications
Processing:
* Computation: call of different processors, memory mapping
* Communication: new optical components (garnet/simple network), modification in protocols (Ruby management), 3D latency and bandwidth model (link and router parameters)
* Memory: distributed memory management
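As an illustration of what a latency and bandwidth model built from link and router parameters could look like, the sketch below computes a first-order zero-load packet latency for a 3D NoC. This is an assumption for illustration only, not the model used in the project: the `LinkParams` and `RouterParams` types, the split into planar and vertical hops, and all numeric values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkParams:
    latency_cycles: int   # cycles to traverse one link
    width_bits: int       # bits transferred per cycle (bandwidth parameter)


@dataclass(frozen=True)
class RouterParams:
    latency_cycles: int   # cycles spent in one router


def zero_load_latency(packet_bits: int,
                      planar_hops: int,
                      vertical_hops: int,
                      router: RouterParams,
                      planar_link: LinkParams,
                      vertical_link: LinkParams) -> int:
    """First-order latency in cycles, ignoring contention.

    A packet crossing h links traverses h + 1 routers. Serialization is
    bounded by the narrowest link type actually used on the path.
    """
    routers_traversed = planar_hops + vertical_hops + 1
    head_latency = (routers_traversed * router.latency_cycles
                    + planar_hops * planar_link.latency_cycles
                    + vertical_hops * vertical_link.latency_cycles)

    # Only link types that appear on the path limit the bandwidth.
    widths = []
    if planar_hops > 0:
        widths.append(planar_link.width_bits)
    if vertical_hops > 0:
        widths.append(vertical_link.width_bits)
    serialization = math.ceil(packet_bits / min(widths)) if widths else 0

    return head_latency + serialization


# Hypothetical parameter values, chosen only for the example.
ROUTER = RouterParams(latency_cycles=2)
PLANAR = LinkParams(latency_cycles=1, width_bits=128)
VERTICAL = LinkParams(latency_cycles=1, width_bits=64)
example = zero_load_latency(512, planar_hops=2, vertical_hops=1,
                            router=ROUTER, planar_link=PLANAR,
                            vertical_link=VERTICAL)
```

With these values the path crosses four routers (8 cycles) and three links (3 cycles), and the 64-bit vertical link serializes the 512-bit packet in 8 cycles, giving 19 cycles in total. A contention-aware model would add queueing delay on top of this lower bound.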
Ruby: implements the model of the memory subsystem (cache hierarchies, replacement policies, coherence protocol, interconnection networks, memory controllers, sequencers).
SLICC: high-level specification language (specifies the functionality of the memory controller).
Research topics and complexity (from * to ***)
3D design
* Topology exploration (islands, stacked, clusters): (*)
* Thermal management: (***)
* Routing (crosstalk avoidance, adaptive): (**)
* Mapping (performance, cost and thermal): (**)
Protocol
* Protocol exploration: (**)
* Services for optical networks (QoS and security): (**)
Applications
* Traffic patterns
* Benchmarking (*/**/***)
* Other applications (neural coding)
Complexity: (*), (*), (*/**/***)
Open questions: Thermal issues? Distributed memory?
[Figure: path of a memory request through Ruby. Components shown: Ruby Port, Sequencer, Mandatory queue (delay), L1 cache controller (L1 latency), Message Buffers, Directory, Ruby memory controller, memory controller.]
1. A memory request from another PE is received through the RubyPort function RubyPort::recvTiming (src/mem/ruby/system/RubyPort.hh-cc).
2. The RubyPort converts the request into a RubyRequest object (compliant with Ruby components). Two tasks are performed:
   * Verify that the request is correct
   * Send the request to the proper sequencer interface (pointed to by Ruby): Sequencer::makeRequest
3. The sequencer performs the following tasks:
   * Resource allocation
   * Push the request to the Ruby cache hierarchy by enqueuing it on the mandatory queue (m_mandatory_q_ptr)
4. The L1 cache controller dequeues the request from the mandatory queue and looks up the cache:
   * Hit: the coherence permissions are checked and L1 informs the sequencer of the hit (readCallback/writeCallback)
   * Miss: the request is pushed to the next level of the hierarchy using the MessageBuffer class (src/mem/ruby/buffers/MessageBuffer.hh-cc). MessageBuffer is the entry point of coherence messages to the NoC and is connected according to the NoC topology.
5. The sequencer calls the RubyPort and returns the result: RubyPort::ruby_hit_callback
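The five steps above can be mimicked with a small toy model. The sketch below is not Gem5 code and does not reproduce the real class interfaces: `ToyRubyPort`, `ToySequencer` and `ToyL1Controller` are hypothetical names, coherence checks are omitted, and the validity test is a placeholder. It only shows the order in which the components hand the request to each other.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    addr: int
    is_write: bool = False


class ToyL1Controller:
    """Stands in for the L1 cache controller (step 4)."""

    def __init__(self, cached_addrs):
        self.cached = set(cached_addrs)
        self.mandatory_queue = deque()  # filled by the sequencer
        self.message_buffer = deque()   # entry point to the NoC on a miss

    def wakeup(self, sequencer):
        """Dequeue one request and look it up in the cache."""
        req = self.mandatory_queue.popleft()
        if req.addr in self.cached:
            sequencer.hit_callback(req)      # hit: notify the sequencer
        else:
            self.message_buffer.append(req)  # miss: forward to next level


class ToySequencer:
    """Stands in for the sequencer (steps 3 and 5)."""

    def __init__(self, l1, port):
        self.l1 = l1
        self.port = port
        self.outstanding = set()

    def make_request(self, req):
        """Allocate a slot and push the request to the mandatory queue."""
        if req.addr in self.outstanding:
            return False                     # resource conflict
        self.outstanding.add(req.addr)
        self.l1.mandatory_queue.append(req)
        return True

    def hit_callback(self, req):
        """Release the slot and hand the result back to the port."""
        self.outstanding.discard(req.addr)
        self.port.completed.append(req)


class ToyRubyPort:
    """Stands in for the RubyPort (steps 1 and 2)."""

    def __init__(self, cached_addrs):
        self.completed = []
        self.l1 = ToyL1Controller(cached_addrs)
        self.sequencer = ToySequencer(self.l1, self)

    def recv_timing(self, addr, is_write=False):
        """Validate the request, convert it and forward it."""
        if addr < 0:                         # placeholder validity check
            raise ValueError("invalid address")
        return self.sequencer.make_request(Request(addr, is_write))
```

For example, with `port = ToyRubyPort(cached_addrs={0x40})`, a call to `port.recv_timing(0x40)` followed by `port.l1.wakeup(port.sequencer)` ends with the request in `port.completed`, while the same sequence for address `0x80` leaves the request in `port.l1.message_buffer`, waiting to enter the network.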