I. Introduction
Mixed-signal integrated circuits are ubiquitous. While digital design is supported by mature VLSI CAD tools [elfadel2018machine], analog design still relies on experienced human experts. Learning-based design automation tools are therefore in high demand.
Nonetheless, manual design is not an easy task even for seasoned designers because of the long and complicated design pipeline. Designers first analyze the topology and derive equations for the performance metrics. Since analog circuits are highly nonlinear, many simplifications and approximations are necessary during topology analysis. Based on these equations, the initial sizes are calculated. Then, a large number of simulations are performed to fine-tune the parameters until the performance specifications are met. The whole process is highly labor-intensive and time-consuming because of the large design space, slow simulation tools, and sophisticated trade-offs between performance metrics. Therefore, automatic transistor sizing has attracted growing research interest in recent years [lyu2018batch, liao2017parasitic, liu2010enhanced, learncircuits].
With transistors rapidly scaling down, porting existing designs from one technology node to another becomes a common practice. However, although much research effort has focused on transistor sizing for a single circuit, hardly any work has explored transferring knowledge from one topology to another, or from one technology node to another. In this work, we present GCN-RL Circuit Designer (Figure 1) to conduct such knowledge transfer. Inspired by the transfer learning ability of Reinforcement Learning (RL), we first train an RL agent on one circuit and then apply the same agent to size new circuits, or the same circuit in new technology nodes. In this way, we reduce the simulation cost by not designing from scratch.
Moreover, prior works such as Bayesian Optimization (BO) and Evolution Strategy (ES) treated transistor sizing as a black-box optimization problem. Inspired by the simple fact that a circuit is a graph, we propose to open the black box and leverage the topology graph in the optimization loop. To make full use of the graph information, we equip the RL agent with a Graph Convolutional Neural Network (GCN) to process the connectivity between circuit components. With the proposed GCN-RL agent, we consistently achieve better performance than conventional methods such as BO and ES. Remarkably, GCN-RL not only enables knowledge transfer between different technology nodes but also makes knowledge transfer between different topologies possible. Experiments demonstrate that the GCN is necessary for knowledge transfer between topologies.
To our knowledge, we are the first to leverage GCN-equipped RL to transfer knowledge between different technology nodes and different topologies. The contributions of this work are as follows:

Leveraging Topology Graph Information in the optimization loop (open-box optimization). We build a GCN based on the circuit topology graph to effectively open the optimization black box and embed circuit domain knowledge, improving performance.

Reinforcement Learning as the Optimization Algorithm, which consistently achieves better performance than human experts [stanford214Bhuman, stanford214Ahuman], random search, Evolution Strategy (ES) [hansen2016cma], Bayesian Optimization (BO) [snoek2012practical], and MACE [lyu2018batch].

Knowledge Transfer with GCN-RL between different technology nodes and different circuit topologies, reducing the required number of simulations and thus shortening the design cycle.
II. Related Work
Automatic Transistor Sizing. Automatic transistor sizing can be classified into knowledge-based and optimization-based methods. In knowledge-based methods such as TAGUS [Horta2002], circuit experts encode their knowledge into predefined design plans and equations from which the transistor sizes are calculated. However, deriving a general design plan is highly time-consuming, and it requires continuous maintenance to keep up with the latest technology. Optimization-based methods can be further categorized into model-based and simulation-based. Model-based methods such as [deniz2010hierarchical, castro2009multimode] model circuit performance via manual calculation or regression on simulated data, and then optimize the model. Their advantage is an easy-to-obtain global optimum; nevertheless, building an accurate model requires numerous simulations. In simulation-based methods, circuit performance is evaluated with simulators (e.g., SPICE), and optimization algorithms such as BO [snoek2012practical], MACE [lyu2018batch], and ES [hansen2016cma] treat the circuit as a black box. Compared with ours, neither MACE nor ES leverages topology graph information. In addition, BO and MACE have difficulty transferring knowledge between circuits because their output space is fixed, and ES cannot transfer because it keeps good samples in its population without summarizing the design knowledge.

Deep Reinforcement Learning. Recently, deep RL algorithms have been extensively applied to problems such as game playing [silver2017mastering], robotics [levine2016end], and AutoML [He:2018vj]. There are also environment libraries [mao2019park] using RL for system design. Across different task domains, deep RL has proved to be transferable [taylor2009transfer]. In this work, we propose RL-based transistor sizing, which is automated, transferable, and achieves better performance than other methods. Compared with supervised learning, RL can continuously learn in the environment and adjust its policy.
Graph Neural Networks. Graph neural networks (GNNs) [scarselli2009graph] adapt neural networks to process graph data. Several variants have been proposed, including Graph Convolutional Neural Networks (GCNs) [kipf2016semi], Graph Attention Networks [velivckovic2017graph], etc. There are also accelerators [yan2020hygcn, sparch] targeting GNN workloads. In GCN-RL Circuit Designer, we follow [kipf2016semi] to build the GCN, leveraging topology graph information to benefit the optimization. [circuitgnn] used a GNN to replace an EM simulator for distributed circuits. By contrast, our work focuses on analog transistor sizing and exploits RL for knowledge transfer.
III. Methodology
III-A. Problem Definition
We handle the transistor sizing problem, where the topology of the analog circuit is fixed. The problem can be formulated as a bound-constrained optimization:
$\max_{\mathbf{x}}\ \mathrm{FoM}(\mathbf{x}), \quad \text{s.t.}\ \mathbf{x} \in D \subset \mathbb{R}^{N}$  (1)

where $\mathbf{x}$ is the parameter vector, $N$ is the number of parameters to search, and $D$ is the design space. The Figure of Merit (FoM) is the objective we aim to optimize. We define it as the weighted sum of the normalized performance metrics, as shown in Equation 2:

$\mathrm{FoM} = \sum_{i} w_i \times \min\!\left(\frac{m_i - m_i^{\min}}{m_i^{\max} - m_i^{\min}},\ m_i^{\mathrm{bound}}\right)$  (2)
where $m_i$ is the $i$-th measured performance metric, $m_i^{\min}$ and $m_i^{\max}$ are predefined normalizing factors that keep the performance metrics in proper ranges, $m_i^{\mathrm{bound}}$ is a predefined upper bound for performance aspects that do not need to improve further once a requirement is satisfied, and $w_i$ is a weight that adjusts the importance of the metric. Some of the circuit baselines we use come with a performance specification (spec); if the spec is not met, we assign a negative number as the FoM value.
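As an illustration, the FoM computation described above can be sketched in a few lines (argument names, the `spec_met` flag, and the default fail value of −1.0 are our assumptions, not the paper's notation):

```python
import numpy as np

def fom(metrics, m_min, m_max, bounds, weights, spec_met=True, fail_value=-1.0):
    """Weighted sum of normalized, bounded performance metrics (Equation 2 sketch).

    `bounds` caps metrics that need not improve once a requirement is met;
    `fail_value` is the negative FoM assigned when the spec is not met.
    """
    if not spec_met:                             # spec violated: negative FoM
        return fail_value
    m = np.asarray(metrics, dtype=float)
    lo = np.asarray(m_min, dtype=float)
    hi = np.asarray(m_max, dtype=float)
    normalized = (m - lo) / (hi - lo)            # normalize each metric
    capped = np.minimum(normalized, bounds)      # cap at the upper bound
    return float(np.dot(weights, capped))        # weighted sum
```

For example, two metrics normalized to 0.5 each with weights 0.5 give an FoM of 0.5, and a metric far above its bound contributes only the bound.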
III-B. Framework Overview
An overview of the proposed framework is shown in Figure 2. In each iteration: (1) the circuit environment embeds the topology into a graph whose vertices are components and whose edges are wires; (2) the environment generates a state vector for each transistor and passes the graph with the state vectors (the graph with circle nodes at the top) to the RL agent; (3) the RL agent processes each vertex in the graph and generates an action vector for each node, then passes the graph with the node action vectors (the graph with square vertices) to the environment; (4) the environment denormalizes the actions (in the [−1, 1] range) into parameters and refines them — we refine the transistor parameters to guarantee transistor matching, and round and truncate parameters according to the minimum precision and the lower and upper bounds of the technology node; (5) the circuit is simulated; (6) an FoM value is computed and fed back to the RL agent to update its policy. Unlike the human design flow, we do not need initial parameters. The RL agent is detailed in Section III-D.
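The six-step loop above can be sketched with stand-in components (everything here — `StubEnv`, `StubAgent`, the parameter bounds, and the toy objective — is hypothetical scaffolding, not the authors' implementation; the real flow calls a SPICE simulator in step 5):

```python
class StubEnv:
    def states(self):                       # step 2: one state vector per component
        return [[0.1, 0.2], [0.3, 0.4]]

    def denorm_refine(self, actions):       # step 4: map [-1, 1] to legal parameters
        lo, hi = 1.0, 100.0                 # assumed technology bounds
        return [round(lo + (a[0] + 1.0) / 2.0 * (hi - lo), 1) for a in actions]

    def simulate(self, params):             # step 5: stand-in for the SPICE call
        return -sum((p - 50.0) ** 2 for p in params) / 1e4

class StubAgent:
    def act(self, states):                  # step 3: one action vector per node
        return [[0.0] for _ in states]

    def update(self, fom):                  # step 6: policy update on the FoM reward
        self.last_fom = fom

def run_iteration(env, agent):
    """One pass through steps 2-6 of the loop described above."""
    actions = agent.act(env.states())
    params = env.denorm_refine(actions)
    fom_value = env.simulate(params)
    agent.update(fom_value)
    return fom_value
```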
III-C. Reinforcement Learning Formulation
We apply an actor-critic RL agent in GCN-RL. The critic can be considered a differentiable model of the circuit simulator; the actor searches for the points with the best performance according to that model.
State Space. The RL agent processes the circuit graph component by component. For a circuit with $n$ components in topology graph $G$, the state of the $i$-th component is defined as $s_i = (h_i, t_i, f_i)$, where $h_i$ is the one-hot representation of the component index, $t_i$ is the one-hot representation of the component type, and $f_i$ is the selected model feature vector of the component, which further distinguishes different component types. For NMOS and PMOS transistors, the model features are selected transistor model parameters; for capacitors and resistors, we set the model features to zeros. For instance, for a circuit with ten components of four different kinds (NMOS, PMOS, R, C) and a five-dimensional model feature vector, the state vector of the third component (an NMOS transistor) is

$s_3 = [\,\underbrace{0,0,1,0,0,0,0,0,0,0}_{\text{one-hot index}},\ \underbrace{1,0,0,0}_{\text{one-hot type}},\ \underbrace{f_1,\dots,f_5}_{\text{model features}}\,]$  (3)

We normalize each dimension of the observation vector by its mean and standard deviation across the different components.
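The state construction above can be sketched as follows (a minimal illustration; the type ordering with NMOS first and the helper name `state_vector` are our assumptions, not the paper's code):

```python
import numpy as np

def state_vector(index, comp_type, model_feats, n_components=10,
                 types=("NMOS", "PMOS", "R", "C")):
    """Concatenate one-hot index, one-hot type, and model features."""
    idx = np.zeros(n_components)
    idx[index] = 1.0                          # one-hot component index
    typ = np.zeros(len(types))
    typ[types.index(comp_type)] = 1.0         # one-hot component type
    return np.concatenate([idx, typ, np.asarray(model_feats, dtype=float)])
```

With ten components, four types, and five model features this yields the 19-dimensional vector of Equation 3.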
Action Space. The action vector varies across component types because the parameters to search are not the same. For the $i$-th component: if it is an NMOS or PMOS transistor, the action vector is $a_i = [W, L, M]$, where $W$ and $L$ are the width and length of the transistor gate and $M$ is the multiplier; for resistors, $a_i = [R]$, where $R$ is the resistance; for capacitors, $a_i = [C]$, where $C$ is the capacitance.
We use a continuous action space to determine the transistor sizes, even though they are eventually rounded to discrete values. We do not use a discrete action space because it would lose the relative order information and would be intractably large: for a typical operational amplifier with 20 transistors, each with three parameters and 1000 value options per parameter, the discrete space contains about $1000^{60} = 10^{180}$ points.
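Denormalizing a continuous action in [−1, 1] into a legal parameter — linear scaling, then rounding to the technology's minimum precision and clipping to its bounds, as in step (4) of the framework — might look like this (the bound and precision values below are illustrative, not from the paper):

```python
def denormalize(action, lower, upper, precision):
    """Map an action in [-1, 1] to a legal parameter value.

    Linear scaling into [lower, upper], snapping to the minimum
    precision grid, then clipping back into the legal range.
    """
    raw = lower + (action + 1.0) / 2.0 * (upper - lower)   # linear scaling
    snapped = round(raw / precision) * precision           # round to grid
    return min(max(snapped, lower), upper)                 # clip to bounds
```

For instance, with a width range of 0.18-10.0 (in microns, say) and a 0.01 grid, the neutral action 0.0 maps to the midpoint 5.09, and out-of-range actions saturate at the bounds.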
Reward. The reward is the FoM defined in Equation 2, a weighted sum of the normalized performance metrics. In our default setup, all metrics are equally weighted; we also study the effect of assigning different weights to different metrics in the experiments. Our method is flexible enough to accommodate different reward setups.
III-D. Enhancing the RL Agent with a Graph Convolutional Neural Network
To embed the graph adjacency information into the optimization loop, we leverage a GCN [kipf2016semi] to process the topology graph in the RL agent. As shown in Figure 3, one GCN layer computes each transistor's hidden representation by aggregating feature vectors from its neighbors. By stacking multiple layers, a node can receive information from increasingly distant nodes. In our framework, we apply seven GCN layers so that the last layer has a global receptive field over the entire topology graph: the parameters of one component are primarily influenced by nearby components, but are also influenced by farther ones. The GCN layer can be formulated as:

$H^{(l+1)} = \sigma\!\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$  (4)

Here, $\tilde{A} = A + I_N$ is the adjacency matrix $A$ of the topology graph plus the identity matrix $I_N$; adding the identity matrix is common in GCN networks [kipf2016semi]. $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$, and $W^{(l)}$ is a layer-specific trainable weight matrix, echoing the shared weights in Figure 3. $\sigma$ is an activation function such as ReLU [kipf2016semi]. $H^{(l)} \in \mathbb{R}^{N \times D}$ holds the hidden features of the $l$-th layer ($N$: number of nodes, $D$: feature dimension), and $H^{(0)} = S$, the input state vectors for the actor.

The actor and critic models have slightly different architectures (Figure 3). The actor's first layer is an FC layer shared among all components, while the critic's first layer is a shared FC layer with a component-specific encoder that encodes the different actions. The actor's last layer has a component-specific decoder that maps hidden activations to the different actions, while the critic's last layer is a shared FC layer that computes the predicted reward value. We design these component-specific encoder/decoder layers because different components have different kinds of actions (parameters). The output of the actor's last layer is a pre-refined parameter vector in [−1, 1] for each component; we denormalize and refine it to obtain the final parameters.
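A single GCN layer in the form of Equation 4 can be sketched with NumPy (a didactic sketch following [kipf2016semi]; the real agent stacks seven such layers with trained weights):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).

    A: (N, N) adjacency matrix, H: (N, D) node features,
    W: (D, D') trainable weight matrix.
    """
    A_hat = A + np.eye(A.shape[0])        # add self-loops (A + I)
    d = A_hat.sum(axis=1)                 # node degrees of A_hat
    D_inv_sqrt = np.diag(d ** -0.5)       # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)
```

For two connected nodes with identical features, the normalized aggregation leaves the features unchanged, which is a quick sanity check of the normalization.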
For RL agent training, we leverage DDPG [Lillicrap:2016ww], an off-policy actor-critic algorithm for continuous control; the details are illustrated in Algorithm 1, where the agent samples a number of data batches of states and actions in each episode. The baseline is defined as an exponential moving average of all previous rewards in order to reduce the variance of the gradient estimation. Exploration uses truncated normal noise with exponential decay, and training runs for a fixed maximum number of search episodes after a number of warm-up episodes. We implemented two types of RL agents to show the effectiveness of the GCN: the proposed GCN-RL, and a non-GCN RL agent (NG-RL) that skips the aggregation step in Figure 3 and therefore does not use topology information.
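The exponential-moving-average reward baseline used for variance reduction can be sketched as follows (the decay factor is an assumed value, and the class name is ours):

```python
class RewardBaseline:
    """Exponential moving average of past rewards.

    Subtracting the baseline from the current reward reduces the
    variance of the policy-gradient estimate without changing its mean.
    """
    def __init__(self, decay=0.95):
        self.decay = decay
        self.value = None

    def update(self, reward):
        if self.value is None:
            self.value = reward            # initialize on first reward
        else:
            self.value = self.decay * self.value + (1.0 - self.decay) * reward
        return reward - self.value         # advantage fed to the actor update
```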
III-E. Knowledge Transfer
Transfer between Technology Nodes. Rapid transistor scaling makes porting existing designs from one technology node to another a common practice. As shown at the top of Figure 4, human designers first inherit the topology from one node and compute initial parameters, then iteratively tune the parameters, simulate, and analyze performance. In contrast, our method automates this process: we train an RL agent on one technology node and then directly apply the trained agent to search the same circuit under different technology nodes, by virtue of the similar design principles shared across technologies. For instance, the agent can learn to change the gain by tuning the input pair transistors of an amplifier; this knowledge does not vary among technology nodes.
Transfer between Topologies. We can also transfer knowledge between different topologies if they share similar design principles, such as between a two-stage transimpedance amplifier and a three-stage transimpedance amplifier. This is enabled by the GCN, which extracts features from the circuit topologies. Concretely, we slightly modify the state vector from Section III-C: the one-hot index $h_i$ is replaced by a one-dimensional index value. In this way, the dimension of each component's state vector remains the same across different topologies. We will show that without the GCN, knowledge transfer between different topologies cannot be achieved.
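The modification above — a scalar index in place of the one-hot index — can be sketched as follows (the helper name and type ordering are our assumptions); note that the state length no longer depends on the number of components:

```python
def transferable_state(index, comp_type, model_feats,
                       types=("NMOS", "PMOS", "R", "C")):
    """State vector with a scalar component index instead of a one-hot index,
    so circuits with different component counts share one state dimension."""
    typ = [0.0] * len(types)
    typ[types.index(comp_type)] = 1.0       # one-hot component type
    return [float(index)] + typ + list(model_feats)
```

A component in a 10-component circuit and one in a 20-component circuit now map to state vectors of identical length, which is what lets one agent serve both topologies.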
IV. Experiments
IV-A. Comparison Between GCN-RL and Other Methods
To demonstrate the effectiveness of the proposed method, we applied GCN-RL to four real-world circuits (Figure 6): a two-stage transimpedance amplifier (Two-TIA), a two-stage voltage amplifier (Two-Volt), a three-stage transimpedance amplifier (Three-TIA), and a low-dropout regulator (LDO). Two-Volt and LDO were designed in commercial 180nm TSMC technology and simulated with Cadence Spectre; Two-TIA and Three-TIA were also designed in 180nm technology and simulated with Synopsys HSPICE.
We compared the FoMs of GCN-RL with human expert design, random search, non-GCN RL (NG-RL), Evolution Strategy (ES) [hansen2016cma], Bayesian Optimization (BO) [snoek2012practical], and MACE [lyu2018batch]. The human expert designs for Two-TIA and Three-TIA are strong baselines from [stanford214Bhuman, stanford214Ahuman], which received the "Most Innovative Design" award [stanford214Bspec]. The human expert designs for Two-Volt and LDO come from a seasoned designer with five years of experience, designing for six hours. MACE is a parallel BO method with a multi-objective acquisition ensemble. For ES, BO, and MACE, we used open-sourced frameworks to guarantee unbiased implementations. For GCN-RL, NG-RL, ES, and random search, we ran 10000 steps; in these methods, circuit simulation accounts for over 95% of the total runtime. For BO and MACE, running 10000 steps is impractical because of their high computational complexity, so we ran them for the same runtime (around 5 hours) as GCN-RL for fair comparison. We ran each experiment three times to test significance. FoM values were computed with Equation 2; the normalizing factors $m_i^{\min}$ and $m_i^{\max}$ were obtained by randomly sampling 5000 designs and taking the min and max of each metric. We assigned a positive weight to a performance metric for which larger is better, such as gain, and a negative weight to one for which smaller is better, such as power. The experimental results are shown in Table I, and the max FoM values among the three runs for each method are plotted in Figure 5. GCN-RL consistently achieves the highest FoM. GCN-RL also converges faster than NG-RL, benefiting from the fact that a GCN extracts topology features better than a pure FC network, much as a CNN extracts image features better than a pure FC network.
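Estimating the normalizing factors by random sampling, as described above, can be sketched like this (`simulate` and `sample_design` are hypothetical stand-ins for the SPICE flow and a design-space sampler):

```python
import random

def normalizing_factors(simulate, sample_design, n=5000, seed=0):
    """Per-metric min/max over n randomly sampled designs.

    `simulate(design)` returns a tuple of performance metrics;
    `sample_design(rng)` draws one random design from the design space.
    """
    rng = random.Random(seed)                     # reproducible sampling
    metrics = [simulate(sample_design(rng)) for _ in range(n)]
    m_min = [min(col) for col in zip(*metrics)]   # column-wise minimum
    m_max = [max(col) for col in zip(*metrics)]   # column-wise maximum
    return m_min, m_max
```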
Two-Stage Transimpedance Amplifier. Diode-connected input transistors convert the drain current to a voltage at the drain of the output stage. Bandwidth, gain, power, noise, and peaking are selected as metrics. The FoM results are shown in the 'Two-TIA' column of Table I, and the performance metrics in the top part of Table II. All methods satisfy the spec. Since the FoM encourages balanced performance among metrics, the GCN-RL design strikes a good balance among the five metrics, resulting in the highest FoM. Notably, the GBW of GCN-RL is also the highest.
We often need to trade off different performance metrics. To demonstrate the flexibility of GCN-RL under different design focuses, we assigned a 10× larger weight to one metric than to the others: GCN-RL-1 to GCN-RL-5 place the 10× weight on BW, gain, power, noise, and peaking, respectively. Since one spec fits only one design requirement, we did not constrain these five experiments with the spec, to show general effectiveness. Except for GCN-RL-4, all designs achieve the highest performance on the single emphasized metric; GCN-RL-4 achieves the second best.
Table I: FoM values (mean ± std over three runs).

                                              Two-TIA      Two-Volt     Three-TIA    LDO
Human [stanford214Ahuman, stanford214Bhuman]  2.32         2.02         1.15         0.61
Random                                        2.46 ± 0.02  1.74 ± 0.06  0.74 ± 0.03  0.27 ± 0.03
ES [hansen2016cma]                            2.66 ± 0.03  1.91 ± 0.02  1.30 ± 0.03  0.40 ± 0.07
BO [snoek2012practical]                       2.48 ± 0.03  1.85 ± 0.19  1.24 ± 0.14  0.45 ± 0.05
MACE [lyu2018batch]                           2.54 ± 0.01  1.70 ± 0.08  1.27 ± 0.04  0.58 ± 0.04
NG-RL                                         2.59 ± 0.06  1.98 ± 0.12  1.39 ± 0.01  0.71 ± 0.05
GCN-RL                                        2.69 ± 0.03  2.23 ± 0.11  1.40 ± 0.01  0.79 ± 0.02
Table II: Two-TIA performance metrics.

                           BW (GHz)  Gain  Power (mW)  Noise (pA/√Hz)  Peaking (dB)  GBW (THz)  FoM
Spec [stanford214Bspec]    max       7.58  18.0        19.3            1.00          max        -
Human [stanford214Bhuman]  5.95      7.68  8.11        18.6            0.93          4.57       2.32
Random                     1.41      35.9  2.30        9.28            0.16          5.05       2.48
ES [hansen2016cma]         1.18      104   4.25        3.77            0             12.3       2.69
BO [snoek2012practical]    0.16      123   2.76        1.68            0             1.99       2.51
MACE [lyu2018batch]        0.97      83.1  2.74        7.36            0             8.07       2.55
NG-RL                      0.75      156   2.31        3.85            0.068         11.7       2.65
GCN-RL                     1.03      167   3.44        3.72            0.0003        17.2       2.72
GCN-RL-1                   13.6      0.09  41.2        295             0.036         0.13       -
GCN-RL-2                   0.20      266   2.58        5.73            0             5.18       -
GCN-RL-3                   0.42      249   0.58        4.78            0             10.3       -
GCN-RL-4                   0.86      124   3.67        3.64            1.0           10.7       -
GCN-RL-5                   0.57      89.0  0.94        11.7            0             5.10       -
Two-Stage Voltage Amplifier. The amplifier is connected in a closed-loop configuration for a PVT-stable voltage gain, which is set by the capacitor ratio; Miller compensation stabilizes the closed-loop amplifier. Bandwidth, common-mode phase margin (CPM), differential-mode phase margin (DPM), power, noise, and open-loop gain are selected as metrics. The FoM results are shown in the 'Two-Volt' column of Table I, with details in Table III. GCN-RL achieves the highest CPM and DPM, and the second-highest gain and GBW.
Table III: Two-Volt performance metrics.

                         BW (MHz)  CPM (°)  DPM (°)  Power (µW)  Noise (nA/√Hz)  Gain  GBW (THz)  FoM
Human                    242       180      83.9     2.94        47.1            3.94  0.95       2.02
Random                   187       180      2.51     7.85        23.8            8.77  1.64       1.80
ES [hansen2016cma]       27.3      180      4.43     1.46        74.2            50.0  1.37       1.93
BO [snoek2012practical]  151       166      2.77     1.45        46.3            25.0  3.77       2.05
MACE [lyu2018batch]      99.4      180      4.44     8.45        16.1            8.93  0.89       1.76
NG-RL                    96.2      180      23.7     4.02        19.2            17.2  1.66       2.09
GCN-RL                   84.7      180      96.3     2.56        58.7            29.4  2.57       2.33
Three-Stage Transimpedance Amplifier. A common-mode input pair converts the differential source current to a voltage at the drain of the output stage, and the three-stage configuration boosts the I-V conversion gain. Bandwidth, gain, and power are selected as metrics. The FoM results are shown in the 'Three-TIA' column of Table I. GCN-RL achieves the lowest power and the second-highest GBW.
Low-Dropout Regulator. The LDO regulates the output voltage and minimizes the effect of supply and load changes. The metrics are the settling times for load increase, load decrease, supply increase, and supply decrease, plus load regulation (LR), power supply rejection ratio (PSRR), and power. The FoM results are shown in the 'LDO' column of Table I. The proposed GCN-RL method achieves the shortest settling times, and the second-largest LR and PSRR.
IV-B. Knowledge Transfer Between Technology Nodes
We also conducted experiments on knowledge transfer between technology nodes, a key capability enabled by reinforcement learning. Agents trained on 180nm are transferred both to a larger node (250nm) and to smaller nodes (130nm, 65nm, and 45nm) to verify broad effectiveness.
Transfer on Two-Stage Transimpedance Amplifier. We directly applied the RL agent trained in Section IV-A to search parameters for Two-TIA on other technology nodes. We compared the transfer results with training from scratch ("no transfer") after a limited number of training steps (300 in total: 100 warm-up, 200 exploration). As shown in the top part of Table IV, the transfer learning results are much better than those without transfer.
Table IV: FoM of knowledge transfer between technology nodes (300 steps).

                                250nm        130nm        65nm         45nm
Two-TIA, no transfer            2.36 ± 0.05  2.43 ± 0.03  2.36 ± 0.09  2.36 ± 0.06
Two-TIA, transfer from 180nm    2.55 ± 0.01  2.56 ± 0.02  2.52 ± 0.04  2.51 ± 0.04
Three-TIA, no transfer          0.69 ± 0.25  0.65 ± 0.14  0.55 ± 0.03  0.53 ± 0.05
Three-TIA, transfer from 180nm  1.27 ± 0.02  1.29 ± 0.05  1.20 ± 0.09  1.06 ± 0.07
Transfer on Three-Stage Transimpedance Amplifier. We also applied the RL agent from Section IV-A to search parameters on other nodes. The results are shown in the bottom part of Table IV, and the max-FoM learning curves are plotted in Figure 7. We used the same random seeds for both methods, so they have the same FoMs in the warm-up stage. After exploration, the transfer learning results are consistently better than no transfer after the same number of steps.
IV-C. Knowledge Transfer Between Topologies
We can also transfer the knowledge learned from one topology to another. We chose Two-TIA and Three-TIA because both are transimpedance amplifiers and thus share common knowledge. We first trained both GCN-RL and NG-RL agents on Two-TIA for 10000 steps, then directly applied the agents to Three-TIA and trained for only 300 steps. We also conducted the reverse experiment, learning from Three-TIA and transferring to Two-TIA. We compared (1) transfer with GCN-RL, (2) transfer without GCN (NG-RL), and (3) no transfer with GCN-RL, as shown in Table V, with learning curves in Figure 8. GCN-RL consistently achieves higher FoMs than NG-RL. Without the GCN, the FoM of NG-RL is barely at the level of no transfer, which shows that the GCN is critical for extracting knowledge from the graph, and that the graph information it extracts improves knowledge transfer performance.
Table V: FoM of knowledge transfer between topologies.

                 Two-TIA → Three-TIA  Three-TIA → Two-TIA
No Transfer      0.63 ± 0.07          2.37 ± 0.01
NG-RL Transfer   0.62 ± 0.09          2.40 ± 0.07
GCN-RL Transfer  0.78 ± 0.12          2.45 ± 0.02
V. Conclusion
We present GCN-RL Circuit Designer, a transferable automatic transistor sizing method combining a GCN with RL. Benefiting from the transferability of RL, we can transfer knowledge between different technology nodes and even different topologies, which is difficult for other methods. We also use a GCN to incorporate topology information into the RL agent. Extensive experiments demonstrate that our method achieves better FoM than other methods, together with knowledge transfer ability. GCN-RL Circuit Designer thus enables more effective and efficient transistor sizing and design porting.
Acknowledgment
We thank the NSF CAREER Award #1943349, the MIT Center for Integrated Circuits and Systems, Samsung, and MediaTek for supporting this research.