Prosecution Insights
Last updated: April 19, 2026
Application No. 17/679,681

Reinforcement-driven standard cell placement

Non-Final OA: §101, §103, §112
Filed: Feb 24, 2022
Examiner: WHITE, JAY MICHAEL
Art Unit: 2188
Tech Center: 2100 — Computer Architecture & Software
Assignee: NVIDIA Corporation
OA Round: 1 (Non-Final)

Grant Probability: 12% (At Risk)
Predicted OA Rounds: 1-2
Predicted Time to Grant: 3y 3m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 12% (grants only 12% of cases; 1 granted / 8 resolved), -42.5% vs TC avg
Interview Lift: +100.0% (strong lift, across resolved cases with interview)
Avg Prosecution (typical timeline): 3y 3m
Career History: 42 total applications across all art units, 34 currently pending
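A lift figure like the +100.0% above is presumably the relative change in allowance rate between resolved cases with and without an examiner interview. The formula below is an assumption about the tool's methodology, not something the report documents, and the rates in the comment are illustrative only:

```python
# Hypothetical reconstruction of an "interview lift" metric; the rates used
# in any example call are illustrative, not this examiner's actual data.
def interview_lift(allow_rate_with, allow_rate_without):
    """Relative change (%) in allowance rate attributable to an interview."""
    return (allow_rate_with - allow_rate_without) / allow_rate_without * 100.0
```

Under this reading, a +100% lift simply means interviewed cases were allowed at twice the rate of non-interviewed ones.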

Statute-Specific Performance

§101: 32.6% (-7.4% vs TC avg)
§103: 30.3% (-9.7% vs TC avg)
§102: 9.9% (-30.1% vs TC avg)
§112: 24.2% (-15.8% vs TC avg)

Tech Center averages are estimates • Based on career data from 8 resolved cases

Office Action

§101 §103 §112
DETAILED ACTION

This action is responsive to the claims filed on June 5, 2025. Claims 1-35 are pending. Claims 1-8 and 22 are withdrawn/non-elected. Claims 9-21 and 23-35 are under examination. Claim 19 is objected to for informalities. Claims 15, 20, 29, and 34 are rejected under 35 U.S.C. 112(b) as indefinite. Claims 9-21 and 23-35 are rejected under 35 U.S.C. 101 as ineligible. Claims 9-17, 19-21, 23-31, and 33-35 are rejected under 35 U.S.C. 103 as obvious over Liao in view of Goldie. Claims 18 and 32 are rejected under 35 U.S.C. 103 as obvious over Liao in view of Goldie and DEEPLIZARD.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Election/Restrictions

Claims 1-8 and 22 are withdrawn from further consideration pursuant to 37 CFR 1.142(b) as being drawn to a nonelected invention, there being no allowable generic or linking claim. Election was made without traverse in the reply filed on June 5, 2025.

Information Disclosure Statement

The information disclosure statements (IDSs) submitted on March 1, 2022, January 11, 2024, and April 25, 2025, were filed before this action. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Objections

Claim 19 is objected to because of the following informality: claim 19 recites “receive stick depiction images the candidate routed circuit layouts.” This appears to be a typo. The claim will be examined as if it states “stick depiction images of the candidate routed circuit layouts,” as presented in analogous claim 33. Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C.
112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph: The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 15, 20, 29, and 34 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claims 15 and 29 recite, “wherein a number of unrouted terminal pairs is also applied to a fitness function of the genetic routing algorithm.” It is unclear what application of terminal pairs to a fitness function means if the terminal pairs are assessed by the genetic routing algorithm itself. In the interest of compact prosecution, for purposes of examination, this will be interpreted to mean that an objective/fitness/cost function is applied to any pair that is addressed by the genetic routing algorithm.

Claims 20 and 34 recite, “wherein the transformation into the action probabilities is invariant in relation to a width of the stick depiction images.” The term “invariant” is unclear in this context. For example, it may be scale invariant, but the width can apply to a relative dimension that affects the circuit layout, which would affect the probability. This claim appears to be inherently unclear. Regardless of how the elements are being interpreted for purposes of examination, the terms specified must be clarified to overcome the rejections.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 9-21 and 23-35 are rejected as ineligible under 35 U.S.C. 101 for being directed to a judicial exception without significantly more.

As an initial matter, the use of genetic algorithms with reinforcement learning in the manner claimed in the independent and dependent claims is a longstanding practice, as evidenced by the following references of record: Filipic et al. (1999); Shapiro et al. (1999); Moriarty et al. (1999); Qi et al. (2003); Steiner et al. (2017); Alipour et al. (2018); DEEPLIZARD for pooling (2018); Sharma for policy and value networks (2018); Bengio et al. (2020); Feldman et al. (2020); Liao et al. (2020); Matei et al. (2020); Rouhani et al. (2021). This is a non-exhaustive list, as the use of reinforcement models with genetic algorithms has been popular since the turn of the century. This evidence is also presented to demonstrate that the independent claims recite well-understood, routine, and conventional (WURC) activity.

Independent Claims

Claim 9 (Statutory Category – Machine)

Step 2A – Prong 1: Judicial Exception Recited? Yes, the claims recite a mental process and a mathematical concept, which are abstract ideas. Specifically, claim 9 recites: logic that […]: operates […] to generate device placements on a circuit layout; operates […] to generate a plurality of candidate routed circuit layouts; and operates […] to correct design rule constraint errors in the candidate routed circuit layouts.
(Evaluations, Mental Processes, Abstract Ideas: Deploying logic to generate device placements on a circuit layout, generate a plurality of candidate routed circuit layouts, and correct design rule constraint errors in the candidate routed circuit layouts are all practically performable in the mind or with the aid of pen and paper. The Applicant's specification states:

[0003] “Generating standard logic cell layouts in advanced technology nodes is challenging due in part to the exploding number and complexity of design rule constraints (DRCs), especially when the design goal is to minimize cell area. Different technology nodes often utilize different circuit generations and architectures in cell libraries. Previous approaches to generating standard cell layouts in advanced technology nodes leverage mathematical optimization methods such as Satisfiability Problem (SAT) and Mixed Integer Programming (MILP) to identify solutions under imposed constraints. These mathematical optimization methods rely on manual expression of design rules within an optimization framework and computational solvers.”

[0004] “Automating standard cell layout may not only speed up the design process, but also enable Design and Technology Co-Optimization (DTCO), which simultaneously optimizes standard cells and chip designs to achieve better performance. Standard cell layout design automation includes two primary operations: placement and routing. Placement locates devices and assigns pin locations in the layout. Routing connects device terminals and pins based on net connectivity.”

[0008] “Routing tends to be the more challenging of the two operations because routing needs to satisfy a (usually very large) set of configured DRCs. In advanced technology nodes, not only do the number of DRCs greatly expand, but the DRCs tend to be more complex. Much of the new complexity comes from DRCs that involve multiple layout shapes that were previously independent of each other.
Mathematical optimization approaches based on SAT and MILP depend on the assumption that all design rule constraints can be expressed in the forms such as conjunctive normal form for SAT, or linear inequality for MILP. It is challenging or impossible to express all the DRCs efficiently in these forms. A large number of constraints are needed to handle all DRCs, which makes it difficult to scale to larger designs. Furthermore, it is often necessary to reformulate these constraints by hand for every new technology node or standard cell layout template.” – The specification demonstrates that any operations that can be done using machine learning models can be practically performed manually, as they were in the past.)

Accordingly, claim 9 recites a mental process, an abstract idea. Claim 23 recites the method steps of claim 9 and recites an abstract idea for at least the same reasons.

Step 2A – Prong 2: Integrated into a Practical Application? No. The additional limitations:

A system comprising: one or more processors; and […] when applied to the one or more processors: […] a first reinforcement learning model […] […] a genetic routing algorithm […] […] a second reinforcement learning model […]

These elements recite generic computing components at a high level without any specific improvements thereto and, under MPEP 2106.05(f), fail to integrate the abstract idea into a practical application at Step 2A, Prong 2. Further, to the extent the claim recites specific details about the parameters the data represent, those parameters merely limit the abstract idea to a particular technological environment and, under MPEP 2106.05(h), fail to integrate the abstract idea into a practical application at Step 2A, Prong 2. Claim 9 fails to recite any additional limitations that integrate the abstract idea into a practical application. Accordingly, claim 9 is directed to the abstract idea. Claim 23 recites the method steps of claim 9 and is directed to the abstract idea for at least the same reasons.
Step 2B: Claim Provides an Inventive Concept? No. The additional limitations:

A system comprising: one or more processors; and […] when applied to the one or more processors: […] a first reinforcement learning model […] […] a genetic routing algorithm […] […] a second reinforcement learning model […]

These elements recite generic computing components/code at a high level and, under MPEP 2106.05(f), fail to combine with other elements of the claim to provide significantly more than the abstract idea that would confer an inventive concept at Step 2B. Further, to the extent the claim recites specific details about the parameters the data represent, those parameters merely limit the abstract idea to a particular technological environment and, under MPEP 2106.05(h), fail to combine with other elements of the claim to provide significantly more than the abstract idea that would confer an inventive concept at Step 2B. Claim 9 fails to recite any additional limitations that combine with other elements of the claim to provide significantly more than the abstract idea that would confer an inventive concept at Step 2B. Accordingly, claim 9 is ineligible. Claim 23 recites the method steps of claim 9 and is ineligible for at least the same reasons as claim 9.

Dependent Claims

The dependent claims 10-21 and 24-35 are also ineligible for at least the following reasons.

Claims 10 and 24 recite the first reinforcement learning model comprising a graph neural network. This is a generic computing component recited at a high level, so it fails to confer eligibility under MPEP 2106.05(f). Claims 10 and 24 fail to recite any additional limitations that confer eligibility. Claims 10 and 24 are ineligible.

Claims 11 and 25 recite […] generate routing estimations [for use in the logic used to determine the placement]. (Evaluation, Mental Process, Abstract Idea: The estimation of routing to use to determine placement is an evaluation practically performable in the mind or with the aid of pen and paper.
A machine learning model […] first reinforcement learning model. These are generic computing elements recited at a high level and, under MPEP 2106.05(f), fail to confer eligibility. Claims 11 and 25 fail to recite any additional limitations that confer eligibility. Claims 11 and 25 are ineligible.

Claims 12 and 26 recite the machine learning model comprising a convolutional neural network. These are generic computing elements recited at a high level and, under MPEP 2106.05(f), fail to confer eligibility. Also, machine learning models including convolutional neural networks are elements of longstanding practices whose extreme conventionality prevents the limitation from conferring eligibility. Claims 12 and 26 fail to recite any additional limitations that confer eligibility. Claims 12 and 26 are ineligible.

Claims 13 and 27 recite feedback of a number of […] errors to evolve the genetic routing algorithm. Providing error feedback is the generic manner in which all machine learning algorithms, including the recited genetic routing algorithm, are trained. The design rule constraint element merely limits the abstract idea to a particular field of technology and, under MPEP 2106.05(h), fails to confer eligibility. Claims 13 and 27 fail to recite any additional limitations that confer eligibility. Claims 13 and 27 are ineligible.

Claims 14 and 28 recite wherein the design rule constraint errors are applied to a fitness function of the genetic routing algorithm. Applying a fitness function is merely applying an objective function to propagate error through the machine learning algorithm to train it. This is both a generic computing operation and a longstanding practice. The design rule constraint element merely limits the abstract idea to a particular field of technology and, under MPEP 2106.05(h), fails to confer eligibility. Claims 14 and 28 fail to recite any additional limitations that confer eligibility. Claims 14 and 28 are ineligible.
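For concreteness, the fitness function recited across claims 14, 15, 21 and their counterparts (design rule constraint errors and unrouted terminal pairs feeding a fitness function, with claims 21/35 specifying a reciprocal of a weighted sum) can be sketched as follows. The weights and the +1 smoothing term are assumptions added for illustration; the claims themselves recite only the reciprocal of a weighted sum:

```python
# Illustrative sketch (not the Applicant's actual implementation) of a
# fitness function over DRC errors and unrouted terminal pairs.
def fitness(num_drc_errors, num_unrouted_pairs, w_drc=1.0, w_open=1.0):
    """Reciprocal of a weighted sum of DRC errors and unrouted terminal
    pairs; higher fitness indicates a cleaner candidate routed layout."""
    penalty = w_drc * num_drc_errors + w_open * num_unrouted_pairs
    # The +1 term (an assumption) avoids division by zero for a perfect layout.
    return 1.0 / (1.0 + penalty)
```

A layout with no violations scores 1.0, and fitness decays monotonically as violations accumulate, which is what lets a genetic algorithm rank candidate layouts.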
Claims 15 and 29 recite wherein a number of unrouted terminal pairs is also applied to the fitness function of the genetic routing algorithm. This is merely a selection of a type of data for training that limits the abstract idea to a particular field, which, under MPEP 2106.05(h), fails to confer eligibility. Claims 15 and 29 fail to recite any additional limitations that confer eligibility. Claims 15 and 29 are ineligible.

Claims 16 and 30 recite wherein the second reinforcement learning model comprises a convolutional neural network generating embeddings for a plurality of policy neural networks and a state value neural network. Convolutional neural networks that generate embeddings for policy neural networks and a state value neural network are standard elements of a reinforcement learning model, so they are longstanding practices and are also generic computing elements, which, under MPEP 2106.05(f), fail to confer eligibility. Claims 16 and 30 fail to recite any additional limitations that confer eligibility. Claims 16 and 30 are ineligible.

Claims 17 and 31 recite wherein the policy neural network comprises a plurality of fully connected layers and an operation mask. Fully connected layers and masking to populate empty or null values are longstanding techniques in machine learning and are also generic computing elements, which, under MPEP 2106.05(f), fail to confer eligibility. Claims 17 and 31 fail to recite any additional limitations that confer eligibility. Claims 17 and 31 are ineligible.

Claims 18 and 32 recite a pooling layer; and the state value neural network comprising a plurality of fully connected layers coupled to receive an output of the pooling layer. Using fully connected layers in series before and after pooling layers is a longstanding practice in machine learning and also represents generic computing operations, which, under MPEP 2106.05(f), fail to confer eligibility. Claims 18 and 32 fail to recite any additional limitations that confer eligibility.
Claims 18 and 32 are ineligible.

Claims 19 and 33 recite the reinforcement learning model configured to: receive stick depiction images of the candidate routed circuit layouts; and transform the stick depiction images into action probabilities for correcting the design rule constraint errors. The use of input data and output therefrom to determine errors/action probabilities to propagate to train a machine learning model is a longstanding practice and represents a generic computing operation, which, under MPEP 2106.05(f), fails to confer eligibility. The nature of the data being stick depiction images merely limits the abstract idea to a particular field, which, under MPEP 2106.05(h), fails to confer eligibility. Claims 19 and 33 fail to recite any additional limitations that confer eligibility. Claims 19 and 33 are ineligible.

Claims 20 and 34 recite wherein the transformation into the action probabilities is invariant in relation to a width of the stick depiction images. When masking is applied, all images are input at the same size, and the output probability is invariant relative to the size of the content of the images; a probability will be yielded regardless. This amounts to merely using image data as an input, which is a longstanding practice and a generic computing operation, which, under MPEP 2106.05(f), fails to confer eligibility. Claims 20 and 34 fail to recite any additional limitations that confer eligibility. Claims 20 and 34 are ineligible.

Claims 21 and 35 recite the genetic routing algorithm further comprising: a fitness function comprising a reciprocal of a weighted sum of a number of unrouted terminal pairs in the candidate routed circuit layouts and a number of the design rule constraint errors in the candidate routed circuit layouts. This is a mathematical calculation (reciprocal of a weighted sum), which is a mathematical concept, an abstract idea. This fails to provide any additional limitations to confer eligibility.
The nature of the data merely limits the abstract idea to a particular field of endeavor, which, under MPEP 2106.05(h), fails to confer eligibility. Claims 21 and 35 fail to recite any additional limitations that confer eligibility. Claims 21 and 35 are ineligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 9-17, 19-21, 23-31, and 33-35: Liao and Goldie

Claims 9-17, 19-21, 23-31, and 33-35 are rejected under 35 U.S.C. 103 as being unpatentable over NPL: “Track-Assignment Detailed Routing Using Attention-Based Reinforcement Learning” by Liao et al. (Liao) in view of NPL: “Placement Optimization with Deep Reinforcement Learning” by Goldie et al.
(Goldie).

Claims 9 and 23

Regarding claim 9, Liao teaches:

A system comprising: one or more processors; and logic that when applied to the one or more processors: (Liao, Top of Page 1: “Proceedings of the ASME 2020 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference”; Abstract: “Existing actual routers as well as routability predictors either have to resort to expensive approaches that lead to high computational times, or use heuristics that do not generalize well.”; Page 9, Right Column, Last Paragraph: “when applying a genetic router to solve large scale problems, it tends to exhibit high computational cost.”; Page 9, Left Column, Second Paragraph: “All experiments are run on a workstation with an Intel Core i7-6850 CPU.” Also see Algorithm 1 on Page 6 and Algorithms 2 and 3 on Page 8. – Liao teaches using a computer, which includes processors that execute logic.)

operates to generate device placements on a circuit layout; (Liao, Pages 8-9, Experiments: “In order to assess the performance of the attention router and its comparison to the baseline genetic router, both algorithms are applied to detailed routing problems from two problem sets: Small and Large, which are both analog design problems based on commercial advanced node technologies (sub-16 nm technology). To be specific, Small problem set consists of different placement solutions for Comparators and OpAmp, while Large problem set consists of different placement solutions of Analog-to-Digital Converter (ADC). For Small problem set, the number of instTerms for each problem range from 10 to 100, and in Large problem sets, the number of instTerms for each problem range from 100 to 1000.” – Placements are generated.)
operates a genetic routing algorithm on the circuit layout to generate a plurality of candidate routed circuit layouts; and (Liao, Page 5, Left Column, Last Paragraph – Right Column, First Paragraph: “In routing instTerm pairs, GA (for genetic router) and attention-based REINFORCE model (for attention router) are applied to determine the sequence of the instTerm pairs in a problem to be routed. GA Sequencing is executed only once, while Attention Sequencing is firstly trained by solving problems from the training set, and then applied to problems in both training and test sets. During execution of GA Sequencing and Attention Sequencing, the same Pattern Router function is utilized to compute the exact routes connecting each instTerm pairs and to calculate the cost of the solution. The problem solution is a concatenation of the actual routes for all the instTerm pairs and the cost of a solution is defined as a weighted sum of wirelength (WL) and number of openings (#Open), which is the number of instTerms pairs that remain unconnected due to a lack of feasible route for the given problem.” See FIGURE 4 and FIGURE 5 on Page 5. – A genetic routing algorithm is used to generate routes for the attention sequencing reinforcement model.)

operates a second reinforcement learning model to correct design rule constraint errors in the candidate routed circuit layouts. (Liao, Abstract: “Complex design rule constraints are encoded into the routing algorithm and an attention model-based REINFORCE algorithm is applied to solve the most critical step in routing: sequencing device pairs to be routed.
The attention router and its baseline genetic router are applied to solve different commercial advanced technologies analog circuits problem sets.” – The attention model uses the genetic algorithm as its base and provides routes that are corrected using the reinforcement learning model, which applies the constraints to the pattern router to determine a cost for training and to eventually select an appropriate route for the input IC device placement arrangement.)

Liao teaches that device placements are generated as elements of the problem to be solved but appears to fail to specify how those placements are generated. Liao appears to fail to explicitly teach, but Liao in view of Goldie teaches:

operates a first reinforcement learning model to generate device placements on a circuit layout; (Goldie, Abstract: “Placement Optimization is an important problem in systems and chip design, which consists of mapping the nodes of a graph onto a limited set of resources to optimize for an objective, subject to constraints. In this paper, we start by motivating reinforcement learning as a solution to the placement problem. We then give an overview of what deep reinforcement learning is. We next formulate the placement problem as a reinforcement learning problem, and show how this problem can be solved with policy gradient optimization. Finally, we describe lessons we have learned from training deep reinforcement learning policies across a variety of placement optimization problems.”)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the generic generation of placements of Liao with the specific placement generation methods of Goldie, because a person of ordinary skill in the art, motivated by the aim of Liao to have a good placement solution, would look to Goldie, which provides an improved, promising deep reinforcement learning, graph-based approach to device placement on ICs.
(Liao, Page 1, Introduction: “The placement and routing, while applied sequentially, are interdependent: a good placement makes routing simpler, and quantitative routing measures can in turn be used to assess the quality of a placement solution.”; Goldie, Page 7, Right Column, Last Paragraph – Page 8, Left Column, Third Paragraph: “In this paper we discuss placement optimization with deep reinforcement learning. Deep RL is a promising approach for solving combinatorial problems, and enables domain adaptation and direct optimization of non-differentiable objective functions. […] In this work, we provide an overview of deep RL, formulate the placement problem as a RL problem, and discuss strategies for training successful RL agents. We predict a trend towards more effective RL-based domain adaptation techniques, in which graph neural networks will play a key role in enabling both higher sample efficiency and more optimal placements. We also foresee a future in which easy-to-use RL-based placement tools will enable non-ML experts to harness and improve upon these powerful techniques.”)

Regarding claim 23, claim 23 recites the method executed by claim 9 and is rejected for at least the same reasons as claim 9.

Claims 10 and 24

Regarding claim 10, Liao in view of Goldie teaches the features of claim 9 and further teaches:

the first reinforcement learning model comprising a graph neural network. (Goldie, Page 5, 3.2: “As discussed, many placement problems take input in the form of graph. The way in which these input graphs are represented has great impact on the ability of machine learning models to generate high-quality placements. More meaningful representations also help models to learn patterns that generalize to new unseen graphs, as opposed to merely memorizing the graphs that they encounter.
We therefore describe some of the recent advances in neural graph representations, such as Graph Convolutional Networks.” Page 6, Representations: “Finally, the way in which state is represented has significant impact on the performance of the policy and its ability to generalize to unseen instances of the placement problem. For example, in the earlier TensorFlow device placement papers [16,17], we represented the computational graph as a list of adjacencies, indices of the node operations, and sizes of the operations. This approach was effective when the policy was trained from scratch for each new TensorFlow graph, but was unable to generalize or transfer what it learned to new graphs. On the other hand, [28] used graph convolutional neural networks to learn better representations of the computational graph structure, and were able to transfer knowledge across graphs.” – The reinforcement model of Goldie used to determine placements is a graph neural network.)

Regarding claim 24, claim 24 recites the method steps conducted by claim 10 and is rejected for at least the same reasons as claim 10.

Claims 11 and 25

Regarding claim 11, Liao in view of Goldie teaches the features of claim 9 and further teaches:

further comprising a machine learning model to generate routing estimations input to the first reinforcement learning model. (Goldie, Page 5, Left Column, Second Paragraph: “The reward function varies for different problems. For example, for TensorFlow graph placement, we use negative runtime of a training step of the placed deep network model. For ASIC and FPGA netlists, the reward is more complex and should include various metrics related to power and timing (e.g. total wirelength, routability congestion, and cell density).” – The reward function of the machine learning model is used to account for routability congestion as input to train the reinforcement learning model.)

Regarding claim 25, claim 25 recites the method steps conducted by claim 11 and is rejected for at least the same reasons as claim 11.
Claims 12 and 26

Regarding claim 12, Liao in view of Goldie teaches the features of claim 11 and further teaches:

the machine learning model comprising a convolutional neural network. (Goldie, Page 5, Right Column, First Paragraph: “Most graph neural network methods used in systems today are ConvGNNs, which generalize the concept of convolution. After all, images are merely a special case of graphs, in which pixels are nodes connected to the pixels (nodes) immediately surrounding them. ConvGNNs use deep convolutional networks to capture even distant relationships within the graph.” – The first machine learning model includes a convolutional neural network.)

Regarding claim 26, claim 26 recites the method steps conducted by claim 12 and is rejected for at least the same reasons as claim 12.

Claims 13 and 27

Regarding claim 13, Liao in view of Goldie teaches the features of claim 9 and further teaches:

further comprising feedback of a number of the design rule constraint errors to evolve the genetic routing algorithm. (Liao, Page 5, Right Column, First Paragraph: “During execution of GA Sequencing and Attention Sequencing, the same Pattern Router function is utilized to compute the exact routes connecting each instTerm pairs and to calculate the cost of the solution. The problem solution is a concatenation of the actual routes for all the instTerm pairs and the cost of a solution is defined as a weighted sum of wirelength (WL) and number of openings (#Open), which is the number of instTerms pairs that remain unconnected due to a lack of feasible route for the given problem. The cost is given in Eqn. 7.” See also FIGURE 5 on Page 5 (below). – Feedback of constraints is used to evolve the routing algorithm.)

[Image: Liao, Figure 5]

Regarding claim 27, claim 27 recites the method steps conducted by claim 13 and is rejected for at least the same reasons as claim 13.
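The feedback loop Liao and the claim describe, where a cost built from wirelength and unconnected pairs steers the evolution of candidate routing sequences, can be sketched roughly as follows. This is an illustrative toy, not Liao's genetic router: `route()` is a stand-in for a real pattern router, and the weights, truncation selection, and swap mutation are assumptions:

```python
# Toy genetic-router loop (assumed structure, not Liao's implementation).
import random

def cost(wirelength, num_openings, w_wl=1.0, w_open=10.0):
    # Eqn. 7-style cost: weighted sum of wirelength and unconnected pairs.
    return w_wl * wirelength + w_open * num_openings

def evolve(population, route, generations=50, mutation_rate=0.2):
    """Evolve candidate pair orderings; route(seq) -> (wirelength, openings)."""
    for _ in range(generations):
        # Rank orderings by routing cost; keep the best half (elitism).
        scored = sorted(population, key=lambda seq: cost(*route(seq)))
        survivors = scored[: len(scored) // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            if random.random() < mutation_rate and len(child) > 1:
                i, j = random.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]  # swap mutation
            children.append(child)
        population = survivors + children
    return min(population, key=lambda seq: cost(*route(seq)))
```

Because the best ordering always survives, the minimum cost in the population never increases from one generation to the next, which is the sense in which the cost feedback "evolves" the router.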
Claims 14 and 28

Regarding claim 14, Liao in view of Goldie teaches the features of claim 13 and further teaches:

wherein the design rule constraint errors are applied to a fitness function of the genetic routing algorithm. (Liao, Page 11, Conclusions: “We present a new approach to the track-assignment detailed routing using RL. A detailed routing pipeline we call the attention router that takes into account complex design rule constraints is developed and the attention-model based REINFORCE algorithm is applied for the ordering of instTerm pairs. The attention router and a baseline genetic router is tested on different commercial advanced technologies analog circuits problem sets.” Page 6, Left Column, First Paragraph: “For the standard bipartite matching problem, [35] provides a polynomial algorithm, but the problem now becomes NP-complete [18] after introducing the assignment conflict constraint. As instTerm splitting is not allowed, we modified the algorithm presented in [18] to solve the instTerm track assignment problem, and the algorithm is described in Algorithm 2.” Page 8, Left Column, Third Paragraph: “The loss of each problem instance returned by the pattern router is defined as a weighted sum of the total wire length and number of openings using the routing sequence p generated by the attention model, as shown in Eqn. 7.” Also see FIGURE 5 (shown below). – Design rule constraint errors are applied to a fitness function of the genetic algorithm.)

[Image: Liao, Figure 5]

Regarding claim 28, claim 28 recites the method steps conducted by claim 14 and is rejected for at least the same reasons as claim 14.

Claims 15 and 29

Regarding claim 15, Liao in view of Goldie teaches the features of claim 14 and further teaches:

wherein a number of unrouted terminal pairs is also applied to the fitness function of the genetic routing algorithm.
(Liao Page 5, Right Column, Fourth-Fifth Paragraph “Not all tracks can be used to route an instTerm. For illustration, in Fig 6 (a) with three tracks, Track 1 and 2 can only contain gate (G) and source/drain (S/D) terminals, while Track 3 can have both G and S/D terminals. Specifically, in this work, we have seven tracks per row, where the G terminals should be on tracks 1, 2, 6, and 7, S/D terminals on tracks 2, 3, 4, 5, and 6, hence for instTerms composed of both G and S/D, only track 2 and 6 shall be used. Two graphs are used in the track assigner, namely the overlap graph and the assignment graph (Fig 6). The overlap graph models the conflicts between instTerms, where each node represents an instTerm, and an edge exists between two nodes if they belong to different nets and their x-ranges overlap (implying that they cannot be assigned to the same track). This constitutes a horizontal constraint graph [2]. The assignment graph is a weighted bipartite graph, the nodes on one side are the instTerms and the other are the available tracks, if an instTerm is assignable to a track, these two nodes are connected by an edge with a weight as the assignment cost. Because instTerm is constrained to route on the tracks of the containing row, the vertical connection and via costs can be omitted, and we model the cost considering only the horizontal track utilization.” Page 5, Right Column, First Paragraph “GA Sequencing is executed only once, while Attention Sequencing is firstly trained by solving problems from the training set, and then applied to problems in both training and test sets. During execution of GA Sequencing and Attention Sequencing, the same Pattern Router function is utilized to compute the exact routes connecting each instTerm pairs and to calculate the cost of the solution. 
The problem solution is a concatenation of the actual routes for all the instTerm pairs and the cost of a solution is defined as a weighted sum of wirelength (WL) and number of openings (#Open), which is the number of instTerms pairs that remain unconnected due to a lack of feasible route for the given problem. The cost is given in Eqn. 7.” – A number of unrouted terminal pairs is applied to the fitness function of the genetic routing algorithm.) Regarding claim 29, claim 29 recites the method steps conducted by claim 15 and is rejected for at least the same reasons as claim 15. Claims 16 and 30 Regarding claim 16, Liao in view of Goldie teaches the features of claim 9 and further teaches: wherein the second reinforcement learning model comprises a convolutional neural network generating embeddings for a plurality of policy neural networks and a state value neural network. (NOTE: This is a standard feature of reinforcement learning systems. ALSO Liao Page 3, Right Column, Second-Third Paragraphs “In solving such combinatorial problems, the attention model-based REINFORCE use an existing policy gradient RL model: REINFORCE. In a policy gradient RL algorithm, a model is used to learn a policy model p(a|s;ϴ), matching state s of a problem at each time step to a corresponding probability distribution of all actions a by iteratively optimizing the policy model parameters over training samples. The cost that training process aims to minimize is the expectation of reward r collected after certain policy p has been rolled out for an episode […] In REINFORCE, the above gradient of cost function is used to optimize or train the policy model. However, the training process of REINFORCE tends to be unstable due to the delayed reward mechanism of REINFORCE. 
Thus, REINFORCE with baseline is applied to stabilize REINFORCE.” Page 4, Left Column, Second Paragraph “The policy model is realized with attention-based encoder-decoder model which can be considered as a Graph Attention Network [25], as shown in Fig. 3. The encoder is similar to the Transformer architecture [26] encoder. After a first learned linear projection layer, the major part of encoder consists of N attention layers, with each layer consisting of two sublayers: a multi-head attention (MHA) layer and a fully connected feed-forward (FF) layer, skip-connection [27] and batch normalization (BN) [28] are applied at each sublayer,” - The second reinforcement learning model generates embeddings for a plurality of policy neural networks and a state value neural network. See Also FIGURE 3 (shown below) illustrating the network structures) [Liao, Figure 3] Regarding claim 30, claim 30 recites the method steps conducted by claim 16 and is rejected for at least the same reasons as claim 16. Claims 17 and 31 Regarding claim 17, Liao in view of Goldie teaches the features of claim 16 and further teaches: wherein the policy neural network comprises a plurality of fully connected layers and an operation mask. (Liao Page 4, Left Column, Second Paragraph “The policy model is realized with attention-based encoder-decoder model which can be considered as a Graph Attention Network [25], as shown in Fig. 3. The encoder is similar to the Transformer architecture [26] encoder. After a first learned linear projection layer, the major part of encoder consists of N attention layers, with each layer consisting of two sublayers: a multi-head attention (MHA) layer and a fully connected feed-forward (FF) layer, skip-connection [27] and batch normalization (BN) [28] are applied at each sublayer.” – The policy neural network comprises fully connected layers.
Page 3, Right Column, Fourth Paragraph “The input of the policy model is based on the graph structure (layout of cities) of the TSP, and the policy model output a probability distribution […] over all the nodes that are likely to be visited at the next time step n, during which nodes already visited are masked to ensure zero probability to be visited.” - The policy neural network comprises an operation mask.) Regarding claim 31, claim 31 recites the method steps conducted by claim 17 and is rejected for at least the same reasons as claim 17. Claims 19 and 33 Regarding claim 19, Liao in view of Goldie teaches the features of claim 9 and further teaches: the reinforcement learning model configured to: receive stick depiction images the candidate routed circuit layouts; and (Liao Page 6, 3.2 Pin Decomposer “Each net is composed of multiple instTerms. Each instTerm has the coordinate […] In order to simplify the problem, instead of directly working on the sequence of nets, we first decompose each net into multiple two-instTerm pairs, so that after the decomposition, our model will produce the best sequence of these instTerm pairs. Kruskal’s algorithm [36] is utilized to construct a Minimum Spanning Tree (MST) first, as the MST naturally reveals the pin pairs that should be connected as shown in Fig. 7a.” See Also FIG. 6 (shown below) – The reinforcement learning model receives stick depiction images of the candidate routed circuit layouts.) [Liao, Figure 6] transform the stick depiction images into action probabilities for correcting the design rule constraint errors. (Liao Page 3, Right Column, Second Paragraph “In solving such combinatorial problems, the attention model-based REINFORCE use an existing policy gradient RL model: REINFORCE.
In a policy gradient RL algorithm, a model is used to learn a policy model p(a|s;ϴ), matching state s of a problem at each time step to a corresponding probability distribution of all actions a by iteratively optimizing the policy model parameters over training samples. The cost that training process aims to minimize is the expectation of reward r collected after certain policy p has been rolled out for an episode” – The model generates action probabilities for correcting design rule constraint errors.) Regarding claim 33, claim 33 recites the method steps conducted by claim 19 and is rejected for at least the same reasons as claim 19. Claims 20 and 34 Regarding claim 20, Liao in view of Goldie teaches the features of claim 19 and further teaches: wherein the transformation into the action probabilities is invariant in relation to a width of the stick depiction images. (Liao Page 3, Right Column, Second Paragraph “In solving such combinatorial problems, the attention model-based REINFORCE use an existing policy gradient RL model: REINFORCE. In a policy gradient RL algorithm, a model is used to learn a policy model p(a|s;ϴ), matching state s of a problem at each time step to a corresponding probability distribution of all actions a by iteratively optimizing the policy model parameters over training samples. The cost that training process aims to minimize is the expectation of reward r collected after certain policy p has been rolled out for an episode” Page 7, Right Column, Second Paragraph “Since the number of instTerms varies in each routing problem, to ensure that each problem instance s has the same number of nodes n (inst pairs), we perform two possible padding strategies on the problem instance: Pad Random and Pad Empty. In the Pad Random strategy, we uniformly sample xy-coordinates in the domain of all instTerms; in the Pad Empty strategy, we are not concerned with the actual coordinates and instead pad with all-zero nodes in the form of (0;0;0;0;0;0;0).
In each strategy, we set the graph size n to be the maximum graph size of all the problem instances, and pad nodes to problem instances of smaller size.” – The output probabilities do not depend on the width of the stick depiction images because padding enforces consistent input dimensionality. Regarding claim 34, claim 34 recites the method steps conducted by claim 20 and is rejected for at least the same reasons as claim 20. Claims 21 and 35 Regarding claim 21, Liao in view of Goldie teaches the features of claim 9 and further teaches: the genetic routing algorithm further comprising: a fitness function comprising a reciprocal of a weighted sum of a number of unrouted terminal pairs in the candidate routed circuit layouts and a number of the design rule constraint errors in the candidate routed circuit layouts. (Liao Page 8, Left Column, Third Paragraph “The loss of each problem instance returned by the pattern router is defined as a weighted sum of the total wire length and number of openings using the routing sequence p generated by the attention model, as shown in Eqn. 7. We optimize the loss L(θ|s) by gradient descent, using the REINFORCE gradient estimator with rollout baseline b(s) [13]:” – The genetic routing algorithm uses a fitness function comprising a reciprocal of a weighted sum of the number of unrouted terminal pairs and the number of design rule constraint errors in the candidate routed circuit layouts.) Claims 18 and 32: Liao, Goldie, and DEEPLIZARD Claims 18 and 32 are rejected under 35 U.S.C. 103 as being unpatentable over NPL: “Track-Assignment Detailed Routing Using Attention-Based Reinforcement Learning” by Liao et al. (Liao) in view of NPL: “Placement Optimization with Deep Reinforcement Learning” by Goldie et al. (Goldie) and NPL: “Deep Learning Fundamentals – Classic Edition, Max Pooling In convolutional Neural Networks Explained” by DEEPLIZARD.
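The fitness function recited in claim 21, a reciprocal of a weighted sum of unrouted terminal pairs and design rule constraint errors, is arithmetically simple and can be sketched directly. The weights and the epsilon guard below are illustrative choices; the claim does not fix their values.

```python
def fitness(num_unrouted: int, num_drc_errors: int,
            w_unrouted: float = 1.0, w_drc: float = 1.0) -> float:
    """Reciprocal of a weighted sum, as recited in claim 21.

    Weights are hypothetical; a tiny epsilon keeps a layout with zero
    violations from dividing by zero.
    """
    eps = 1e-9
    return 1.0 / (w_unrouted * num_unrouted + w_drc * num_drc_errors + eps)

# Fewer unrouted pairs and fewer constraint errors yield strictly
# higher fitness, which is what drives selection in a genetic router.
clean = fitness(0, 0)
dirty = fitness(3, 2)
```

Because the reciprocal is monotone decreasing in the weighted sum, minimizing a Liao-style cost and maximizing this fitness rank candidates identically.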
Claims 18 and 32 Regarding claim 18, Liao in view of Goldie teaches the features of claim 16 and further teaches: the state value neural network comprising a plurality of fully connected layers coupled to receive an output. (NOTE: Using fully connected and pooling layers has been standard practice for nearly two decades for dimensionality reduction that expedites calculation and for image feature smoothing. This is nothing new. ALSO, Liao Page 4 “After a first learned linear projection layer, the major part of encoder consists of N attention layers, with each layer consisting of two sublayers: a multi-head attention (MHA) layer and a fully connected feed-forward (FF) layer, skip-connection [27] and batch normalization (BN) [28] are applied at each sublayer.” See Also FIGURE 3 (shown below) – The models in Liao include inputting and outputting data to standard fully connected layers.) [Liao, Figure 3] Liao in view of Goldie does not appear to explicitly teach, but Liao in view of Goldie and DEEPLIZARD teaches: a pooling layer; and the state value neural network comprising a plurality of fully connected layers coupled to receive an output of the pooling layer. (DEEPLIZARD Page 2, Introducing Max Pooling “Max pooling is a type of operation that is typically added to CNNs following individual convolutional layers. When added to a model, max pooling reduces the dimensionality of images by reducing the number of pixels in the output from the previous convolutional layer.” Also See the unlabeled code example on Page 4 demonstrating a pooling layer with output to a fully connected layer – This teaches using a pooling layer to output to a fully connected layer.)
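The DEEPLIZARD teaching mapped above, max pooling whose reduced output feeds a fully connected layer, can be sketched in a few lines. The 4x4 feature map and the weights below are arbitrary illustrations; a production model would use a deep learning framework rather than this pure-Python demonstration.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2: keep the most-activated pixel
    in each window, quartering the number of values passed forward."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            window = (feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled

def fully_connected(pooled, weights, bias=0.0):
    # flatten the pooled map, then take one weighted sum (a single unit)
    flat = [x for row in pooled for x in row]
    assert len(flat) == len(weights)
    return sum(x * w for x, w in zip(flat, weights)) + bias

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 9, 2],
        [3, 2, 4, 8]]

pooled = max_pool_2x2(fmap)   # 4x4 -> 2x2: four weights instead of sixteen
out = fully_connected(pooled, weights=[0.1, 0.2, 0.3, 0.4])
```

The dimensionality reduction is the motivation the rejection relies on: the fully connected layer after pooling needs a quarter of the parameters it would need on the raw map.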
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claims to modify the fully connected neural network layers of Liao with the pooling layer of DEEPLIZARD. Given Liao's aim of simplifying and expediting computations while improving accuracy, the person of ordinary skill would have been motivated to look to DEEPLIZARD, which uses a standard max pooling layer to reduce computational complexity and increase model accuracy. (Liao Abstract “Existing actual routers as well as routability predictors either have to resort to expensive approaches that lead to high computational times, or use heuristics that do not generalize well. […] The attention router demonstrates generalization ability to unseen problems and is also able to achieve more than 100x acceleration over the genetic router without severely compromising the routing solution quality. Increasing the number of training problems greatly improves the performance of attention router.”; DEEPLIZARD Page 4, Why use max pooling? “[…] Since max pooling is reducing the resolution of the given output of a convolutional layer, the network will be looking at larger areas of the image at a time going forward, which reduces the amount of parameters in the network and consequently reduces computational load. […] Additionally, max pooling may also help to reduce overfitting. The intuition for why max pooling works is that, for a particular image, our network will be looking to extract some particular features. Maybe, it’s trying to identify numbers from the MNIST dataset, and so it’s looking for edges, and curves, and circles, and such. From the output of the convolutional layer, we can think of the higher valued pixels as being the ones that are the most activated.
With max pooling, as we’re going over each region from the convolutional output, we’re able to pick out the most activated pixels and preserve these high values going forward while discarding the lower valued pixels that are not as activated.”) Conclusion Prior art made of record and not relied upon is considered pertinent to applicant's disclosure and was presented above as WURC/Longstanding Practice Evidence in the 35 USC 101 rejection of the independent claims. Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAY MICHAEL WHITE whose telephone number is (571) 272-7073. The examiner can normally be reached Mon-Fri 11:00-7:00 EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ryan Pitaro can be reached at (571) 272-4071. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 
/J.M.W./Examiner, Art Unit 2188 /RYAN F PITARO/Supervisory Patent Examiner, Art Unit 2188

Prosecution Timeline

Feb 24, 2022
Application Filed
Mar 06, 2026
Non-Final Rejection — §101, §103, §112 (current)


Prosecution Projections

1-2
Expected OA Rounds
12%
Grant Probability
99%
With Interview (+100.0%)
3y 3m
Median Time to Grant
Low
PTA Risk
Based on 8 resolved cases by this examiner. Grant probability derived from career allow rate.
