Pa2 Jay Pres

December 22, 2017 | Author: Kirtesh Tiwari | Category: Digital Electronics, Electrical Engineering, Computer Engineering, Electronic Engineering, Technology

Ultra High Speed (5 GHz) Block Custom Physical Design Flow with ICC

Prakash Jayasekharan Senior PD Engineer

Suman Musunuru Senior Design Engineer

Maxim Integrated Products


Agenda

• Challenges in High speed Physical Design
  - Design Constraints, Library and Design issues
• Custom solutions with Synopsys ICC flow
  - Matrix re-characterization, Synthesis improvements, placement sensitive flow, CTS waveform balancing, Signal EM, power
• Timing/STA correlation results
  - Star-RC vs Calibre, ICC vs PT-SI
• Conclusion/Takeaways
• Appendix A
• Appendix B (scripts)


Design Constraints

• 65nm SOC design
  - 2.4 Million gate
  - Block A and Block B @5GHz (200ps period)
  - 5% late, 10% early Derating (both clock and data), 5% Jitter
  - Target skew ~15ps, Transition ~20ps, Pulse width ~80ps
  - IR < 3% Peak (or 30 mV Weff)
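As a rough illustration of how tight these constraints are, the sketch below works out the data path budget for one clock period. It assumes, for simplicity, that jitter and target skew are subtracted directly from the period and that the late derate scales the whole data path; setup time and the early derate on the capture clock are ignored. The function names are illustrative only.

```python
def period_ps(freq_hz):
    """Clock period in picoseconds for a frequency in Hz."""
    return 1e12 / freq_hz


def max_nominal_path_ps(period, jitter_frac, skew_ps, late_derate):
    """Largest un-derated data path delay that still fits in one period.

    Simplified model (an assumption, not the STA tool's exact math):
    period - jitter - skew must cover late_derate * nominal delay.
    """
    jitter_ps = jitter_frac * period
    return (period - jitter_ps - skew_ps) / late_derate


if __name__ == "__main__":
    T = period_ps(5e9)                                   # 5 GHz
    budget = max_nominal_path_ps(T, 0.05, 15.0, 1.05)    # slide values
    print(f"period = {T:.0f} ps, nominal path budget = {budget:.1f} ps")
```

Under these assumptions the slide's numbers leave roughly 167 ps of nominal launch-to-capture delay, before any setup time is taken out.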


Library Issues

• Re-characterization of timing libraries
  - Traditional library tables produce pessimism in timing delay calculation (setup/delays worse by at least 10ps)

[Figure: .lib vs. SPICE comparison]


...Library issues

• Extra pessimism not tolerable because
  - 10ps for each cell gets added to become significant
  - Paths become too tight to fix
• Library is mostly made of weak drive strength buffers, complex gates.
  - Realistic fanout, higher switching power => higher insertion delay
  - Weak clock tree cells cause more insertion delay
• > 70% of the logic is sequential. Setup (reg2reg) timing is critical
• Decap cells for peak IR released late in the flow
  - could not be added in block A


...Design Issues

• Small coupling caps (1fF) due to size of design
  - Small nets in the design do not get extracted and can be dropped. Use coupling_abs_threshold to reduce the threshold
• 4 corners for IR/EM, 3 corners for Timing
  - highV, high Temp added finally for IR/EM

Voltage   Temp    Tag      Description
0.9       125.0   WCCOM    Traditional worst case timing
1.1       -40     LTCOM    Traditional best case timing
0.9       -40     WCLCOM   Temp inversion corner
1.1       125     MLCOM    Worst EM/IR/Leakage

[Figure: corner diagram, Temp (-40C to 125C) vs. Voltage (0.9 to 1.1)]


Matrix re-characterization

BEFORE (3x3)

    timing() {
      related_pin : "cp" ;
      timing_type : setup_rising ;
      fall_constraint(cnst_ctin_rtin_3x3) {
        index_1("0.003, 0.2019, 0.9");
        index_2("0.003, 0.2019, 0.9");
        values("0.00995, 0.0199, 0.06965",\
               "0.08955, 0.1095, 0.2089",\
               "0.2189, 0.1791, 0.3184");
      }
    }

10x10 reduces extra pessimism

AFTER (10x10)

    timing() {
      related_pin : "cp" ;
      timing_type : setup_rising ;
      fall_constraint(cnst_ctin_rtin_10x10) {
        index_1("0.003, 0.009191, 0.03092, 0.07243, 0.1371, \
                 0.2278, 0.3472, 0.4976, 0.6812, 0.9");
        index_2("0.003, 0.009191, 0.03092, 0.07243, 0.1371, \
                 0.2278, 0.3472, 0.4976, 0.6812, 0.9");
        values("0.00995, 0.00995, 0.00995, 0.00995, 0.00995, 0.0199, 0.02985, 0.0398, 0.04975, 0.06965",\
               "0.0199, 0.0199, 0.00995, 0.0199, 0.0199, 0.0199, 0.02985, 0.0398, 0.0597, 0.06965",\
               "0.02985, 0.02985, 0.02985, 0.02985, 0.02985, 0.0398, 0.04975, 0.0597, 0.06965, 0.08955",\
               "0.04975, 0.04975, 0.0398, 0.04975, 0.0597, 0.06965, 0.0796, 0.08955, 0.1095, 0.1194",\
               "0.06965, 0.06965, 0.0597, 0.06965, 0.0796, 0.0995, 0.1095, 0.1293, 0.1492, 0.1691",\
               "0.08955, 0.08955, 0.0796, 0.08955, 0.0995, 0.1194, 0.1393, 0.1691, 0.199, 0.2288",\
               "0.1194, 0.1095, 0.0995, 0.1095, 0.1194, 0.1393, 0.1592, 0.189, 0.2288, 0.2686",\
               "0.1492, 0.1393, 0.1194, 0.1293, 0.1393, 0.1492, 0.1791, 0.2089, 0.2487, 0.2885",\
               "0.1791, 0.1791, 0.1492, 0.1492, 0.1592, 0.1691, 0.189, 0.2288, 0.2587, 0.3085",\
               "0.2189, 0.2189, 0.1791, 0.1791, 0.1791, 0.189, 0.2089, 0.2388, 0.2786, 0.3184");
      }
    }
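To see why table density matters, the sketch below interpolates the 3x3 setup table at a point that is an exact grid point of the 10x10 table and compares the two numbers. It assumes plain bilinear interpolation between neighbouring table entries, which may differ from what a given timing tool does; the helper names are illustrative, and only the table values come from the slide.

```python
from bisect import bisect_right

# 3x3 setup table copied from the BEFORE slide (values in library time units).
INDEX_3 = [0.003, 0.2019, 0.9]
VALUES_3 = [
    [0.00995, 0.0199, 0.06965],
    [0.08955, 0.1095, 0.2089],
    [0.2189, 0.1791, 0.3184],
]


def _segment(axis, x):
    """Return (i, t): the lower grid index and fractional position of x."""
    i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def lookup(index_1, index_2, values, x1, x2):
    """Bilinear interpolation; rows follow index_1, columns follow index_2."""
    i, t = _segment(index_1, x1)
    j, u = _segment(index_2, x2)
    low = values[i][j] * (1 - u) + values[i][j + 1] * u
    high = values[i + 1][j] * (1 - u) + values[i + 1][j + 1] * u
    return low * (1 - t) + high * t


if __name__ == "__main__":
    # Query point chosen to coincide with a 10x10 grid point.
    x1, x2 = 0.003, 0.1371
    coarse = lookup(INDEX_3, INDEX_3, VALUES_3, x1, x2)
    fine = 0.00995  # 10x10 entry at index_1 = 0.003, index_2 = 0.1371
    print(f"3x3 interpolated: {coarse:.5f}, 10x10 entry: {fine:.5f}")
```

At this query point the coarse table reads about 0.0167 against 0.00995 in the 10x10 table, which is several picoseconds of extra setup requirement if the library time unit is nanoseconds (an assumption). The size and sign of the gap vary across the table, so this is one sample point rather than a general bound.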


Synthesis Improvements

• Very slow cells like XOR, 4:1 Mux, AOI gates prohibited
  - some sensitive logic hand instantiated to prevent AOI or XOR selection
• Register Cloning/Fanout optimization to reduce fanout
  - 10-15% increase in sequential area, but helps reduce flop delay
  - set_register_replication (DC) can be used

[Figure: register cloning; Load Cap = C for the original flop, Load Cap = C/2 for each clone]
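The benefit of cloning can be sketched with a first-order linear delay model, in which clock-to-Q delay is an intrinsic term plus a slope times the load capacitance. The model and every number below are assumptions for illustration; they are not taken from the library used in the design.

```python
def clk_to_q_ps(intrinsic_ps, ps_per_ff, load_ff):
    """Linear delay model: intrinsic delay plus slope times load cap."""
    return intrinsic_ps + ps_per_ff * load_ff


if __name__ == "__main__":
    INTRINSIC_PS, PS_PER_FF, LOAD_FF = 40.0, 2.0, 10.0   # made-up values
    before = clk_to_q_ps(INTRINSIC_PS, PS_PER_FF, LOAD_FF)       # load C
    after = clk_to_q_ps(INTRINSIC_PS, PS_PER_FF, LOAD_FF / 2)    # load C/2
    print(f"one flop driving C: {before:.1f} ps")
    print(f"each clone driving C/2: {after:.1f} ps")
```

Only the load-dependent part of the delay is halved, which is why the gain has to be weighed against the 10-15% sequential area increase.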


Placement Sensitive Flow

• Cell placement is closely controlled in all stages
• Bad timing due to:
  - Placement of cells due to loose constraints
  - High buffer insertion to close timing
• Clocks over-constrained by 10%; incremental psynopt runs improve timing
  - Best possible flop placement achieved
• Clock latency set to simulate post-cts derating in placement


Placement.. Default timing flow

Step                                       Result
create_placement + psynopt                 WNS: -0.05, 50 paths
clock_opt (Derating)                       WNS: -0.10, 60 paths
route_opt + route_opt -incr (SI + Wires)   WNS: -0.18, 90 paths


Placement.. PSFlow

[Figure: PSFlow diagram]

Steps: create_placement + psynopt; psynopt(1); psynopt(2); clock_opt -only_cts; route_opt + route_opt -incr; route_opt -incr (reg2reg only)

Annotations: 40 ps uncertainty; Don't upsize, Just Move; Allow buffer resizing; Remove extra uncertainty (24ps); Don't move registers; SI + wires

WNS values shown: -0.05, 50 paths; -0.10, 80 paths; -0.025, 50 paths; -0.08, 20 paths; +0.005, 10 paths; -0.015, 10 paths (waived)


CTS-Waveform Balancing

• Getting around clock cells' asymmetry
  - Decision to use the same non-equal duty cycle inverter back to back to avoid pulse width issues
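The reasoning behind pairing identical inverters can be shown with a small edge-time model. It assumes each inverter has a fixed delay to an output rising edge and a different fixed delay to an output falling edge, independent of load and input slew, which real cells do not strictly obey. The 12 ps and 9 ps delays are made-up values, and the function names are illustrative.

```python
def through_inverter(rise_t, fall_t, d_rise, d_fall):
    """Return (rise, fall) edge times at the output of one inverter.

    An input rising edge produces an output falling edge after d_fall;
    an input falling edge produces an output rising edge after d_rise.
    """
    return fall_t + d_rise, rise_t + d_fall


def high_width(rise_t, fall_t, period):
    """High time within one period for edges at rise_t and fall_t."""
    return (fall_t - rise_t) % period


if __name__ == "__main__":
    PERIOD, D_RISE, D_FALL = 200, 12, 9      # ps; delays are made-up values
    clk = (0, 80)                            # 80 ps high pulse
    one = through_inverter(*clk, D_RISE, D_FALL)
    two = through_inverter(*one, D_RISE, D_FALL)
    print("after 1 inverter:", high_width(*one, PERIOD), "ps high")
    print("after 2 inverters:", high_width(*two, PERIOD), "ps high")
```

In this model a single inverter shifts the pulse width by the difference between its two delays, and the second identical inverter applies the opposite shift, so the original 80 ps pulse is restored after the pair.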


CTS-others

• Register placement is fixed
• Fast transition times help speed up Ck-Q timing
  - Also reduces setup times at the flops
• Final duty cycle tolerance: 40/60%
• Very small skew eliminates hold fixing


Power Analysis

• Both blocks are in a special power domain (not shared by top)
• Target < 3% (i.e. 33 mV)
• IR drop achieved @MLCOM (1.1, 125) is 14 + 17 = 31 mV

[Figure: floorplan showing Pads, block B, block A, Top core]
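The IR budget check on this slide reduces to a couple of lines of arithmetic. The sketch below assumes the 3% limit is taken relative to the 1.1 V supply of the MLCOM corner, which is consistent with the 33 mV figure; the function names are illustrative.

```python
def ir_budget_mv(vdd_v, limit_pct):
    """Allowed IR drop in mV for a supply in volts and a percentage limit."""
    return vdd_v * 1000.0 * limit_pct / 100.0


def within_budget(drops_mv, vdd_v, limit_pct):
    """True when the summed drop components do not exceed the budget."""
    return sum(drops_mv) <= ir_budget_mv(vdd_v, limit_pct)


if __name__ == "__main__":
    budget = ir_budget_mv(1.1, 3.0)
    total = 14 + 17                          # components reported on the slide
    print(f"budget = {budget:.0f} mV, achieved = {total} mV,"
          f" ok = {within_budget([14, 17], 1.1, 3.0)}")
```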


Power EM

• EM, Rj issues due to high current through buses with insufficient Vias (Important run for high speed)
• ICC custom route tool used to add extra Via2, M2

[Figure: added vias (4x2 array) and 2x pin width]


Signal EM

[Figure: signal EM fix flow]

Flow boxes: SAIF based EM; Fix Signal EM (if any); Statistical EM; Fix Signal EM*; Timing clean up (worst func mode for power); Fix minor DRCs/Antennas; STA; Simulate/generate vcd/saif file; Repeat for critical functional modes.

* fix_signal_em (or) script

Labels: Iterations; Reduced Timing Iterations


...Signal EM

• Sample EM fix with repair file (clock widened 2x to 4x)


Correlation

• Bottom up flow to make sure ICC settings are close enough to PrimeTime, Star-RC (SolvNet: IC Compiler Correlation Checklist Trilogy)
• Extraction Settings
  - OPERATING_TEMPERATURE: 25, COUPLE_TO_GROUND: NO, COUPLING_ABS_THRESHOLD: 1e-15, MODE=400, EXTRACT_VIA_CAPS=YES
• Noise / Timing Settings
  - set db_load_ccs_noise_data true, set timing_crpr_threshold_ps 0, set si_filter_accum_aggr_noise_peak_ratio 0.2


Star-RC vs Calibre spef

• Block B: Star-RC within 8% mean


ICC vs PT-SI slack

• Block B: ICC (4ps) slightly pessimistic vs PT (2ps)

[Figure: two slack histograms, # Paths vs. WNS (ns); axis ticks -0.004, 0.000, 0.005, 0.009 and -0.002, 0.000, 0.005, 0.011]


Conclusion / Takeaways

• Fix Library Issues
  - Good range of cells with decent strengths for optimization
  - Cell names must be user friendly to limit use (for better EM/IR)
  - Larger matrices for setup/pulse timing to prevent timing pessimism
  - Symmetric clock cells tagged with special naming
  - Don't use cells should be clearly marked
• Fix Process Corners (e.g. MLcom, WCLcom)
  - Special situations like Temperature inversion for timing, High Temp corners for leakage, peak IR drop should be known well in advance


…Conclusion / Takeaways

• Think Top level
  - Think about next stage, top level
• Correlate (SolvNet: IC Compiler Correlation Checklist Trilogy)
  - Star-RC / ICC extraction should be correlated to device level
  - PT-SI and ICC noise settings should be checked
• Tune ICC to meet requirements (e.g. custom placement, custom cts, custom router, etc…)
  - Get to know all options available
  - Script for Reusability


Thanks…

Synopsys Hotline
• Filed and accepted requests for EM gui and temperature scaling
• Retaining FILLs in soft block after flattening
• Ability to check min grid during zroute verify

Others
1. KhanKap Mounarath - Sr. Scientist, Maxim
2. DSM group / Library, Maxim EDA
3. Bill Sicaras - Synopsys AC


Appendix A

PT-SI and Spice correlation

• Spice level simulation performed on the worst path

    Startpoint: clk_div_0/div_by2_by4_0/sig_i4_reg (rising edge-triggered flip-flop clocked by dac_clk1)
    Endpoint: clk_div_0/div_by2_by4_0/sig_i4_reg (rising edge-triggered flip-flop clocked by dac_clk1)
    Path Group: dac_clk1
    Path Type: max

• ∑ (launch clock delay + CK-Q delay + combinational delay to the Endpoint register) is within 5% for Block B


Appendix B (scripts)

Script used for placement

    ## Source the common settings for placement and optimization
    source common_placement_settings_icc.tcl
    set placer_max_cell_density_threshold 0.68
    ## 15% of the clock period which is 200ps is 30ps
    ## 30ps plus 10ps uncertainty is 40ps overconstraining
    set_timing_derate -late 1.15
    set_clock_uncertainty 0.01 [all_clocks]
    set_critical_range 0.090 cd18_decoder_dac
    ## INITIAL PLACEMENT
    create_placement -effort high -congestion -congestion_effort high
    legalize_placement
    ## FIRST ROUND OF optimizations
    set_dont_touch [get_cells *]
    set_dont_touch [get_nets *]
    psynopt
    ## tighten the output paths
    set_clock_uncertainty 0.015 [all_clocks]
    set_clock_latency 0.200 [get_clocks dac_clk]
    set_clock_latency 0.100 [get_clocks dac_clko]
    psynopt
    ## SECOND ROUND OF optimization
    ## Remove the dont touches and let the tool optimize the
    ## timing more (upsize cells etc.)
    remove_attribute [get_cells -hier *] dont_touch -quiet
    remove_attribute [get_nets -hier *] dont_touch -quiet
    ## do not optimize some sensitive logic
    set_dont_touch [get_cells U*]
    psynopt
    ## save cell and report timing
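The over-constraint quoted in the placement script's comments can be reproduced with a short calculation. Like those comments, the sketch treats the late derate as acting on a path one full period long, which is an approximation because the derate scales the actual path delay. The function name is illustrative.

```python
def overconstraint_ps(period_ps, late_derate, uncertainty_ns):
    """Approximate extra margin, in ps, implied by derate plus uncertainty.

    Treats the late derate as acting on a path one full period long,
    which is the approximation made in the script's own comment.
    """
    derate_ps = (late_derate - 1.0) * period_ps
    return derate_ps + uncertainty_ns * 1000.0


if __name__ == "__main__":
    # 200 ps period, set_timing_derate 1.15, set_clock_uncertainty 0.01 (ns)
    margin = overconstraint_ps(200.0, 1.15, 0.01)
    print(f"placement over-constraint is about {margin:.0f} ps")
```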


Appendix B

Script used for CTS

    # DON'T MOVE CAREFULLY PLACED CELLS
    set_dont_touch_placement [get_cells -hier *_reg*]
    set_attribute [get_cells -hier spr*] is_fixed true
    remove_clock_tree -clock_trees {dac_clk dac_clko} -honor_dont_touch
    reset_clock_tree_references
    define_routing_rule decoder_clk_shield_rule -default_reference_rule -taper_level 0 -multiplier_width 2 -multiplier_spacing 1 -shield
    ## CONTROL TRANSITION FOR CLOCKS
    ## RELAX BUFFER LEVELS TO help fix fanout
    set_clock_tree_options -layer_list $runOption(input,clkRoutelayerList) -routing_rule decoder_clk_shield_rule -use_default_routing_for_sinks 1 -target_skew 0.010 -max_buffer_levels 9 -max_transition .024
    set_clock_tree_options -clock_trees dac_clk -routing_rule decoder_clk_shield_rule \
        -use_default_routing_for_sinks 1 -target_skew 0.010 -max_buffer_levels 9
    set_max_fanout 2 [get_ports dac_clk]
    set_max_fanout 2 [get_ports dac_clko]
    ## Tighter transition on output clk. timing is ok.
    set_clock_tree_options -clock_trees dac_clko -max_buffer_levels 3 -max_transition 0.022
    check_clock_tree -clocks dac_clk
    report_clock_tree -summary -clock_trees dac_clk -level_info
    report_clock_tree -show_all_sinks
    report_clock_tree -settings > clktree/settings.rpt
    update_clock_latency
    ## Turn on removal and recovery check ##
    set enable_recovery_removal_arcs true
    ## Perform clock tree synthesis only
    clock_opt -only_cts -operating_condition min_max
