Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The disclosure is objected to because of the following informalities:
(Text within parentheses indicates missing or corrected information corresponding to the character(s) in bold.)
[0032] The input/output interface 1300 may be controlled by the CPU 1100 (and/or GPU 115(0)) to receive and process user inputs, as well as output information to a user, using user interface devices.
[0047] The action value selector 1224 may select an action value Q(xt, at) corresponding to the action 'aj' input by the action inputted(r) 1223.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sooyong LEE et al., hereinafter LEE (US 20210334444 A1), in view of NAOAKI TSUTSUI et al., hereinafter TSUTSUI (US 20200184137 A1).
Regarding claim 1
LEE discloses parts a, b, c, d, and g of claim 1:
A layout optimization system for correcting a target layout of a semiconductor process, the system comprising:
(LEE, p. 1, [0006] “Example embodiments of inventive concepts provide a method and/or a computing device for generating a layout for manufacturing/fabrication of a semiconductor device, such as semiconductor chip, having improved reliability and/or reduced amount of computation.”)
a deep reinforcement learning (DRL) module
(LEE, p. 1, [0016] “FIG. 6 illustrates an example in which a semiconductor process machine learning module according to some example embodiments of inventive concept executes process proximity correction.”)
A machine learning module is a general term covering modules of all kinds, including a deep reinforcement learning module.
a memory storing instructions
(LEE, p. 2, [0026] “FIG. 1 is a block diagram illustrating a computing device 100 according to some example embodiments of inventive concepts. Referring to FIG. 1, the computing device 100 may include one or more processors 110, a random access memory 120, one or more device drivers 130, one or more storage devices 140, one or more modem 150, and one or more user interfaces 160.”)
(LEE, p. 2, [0027] “... In this case, the at least one processor may load the instructions (or codes) of the semiconductor process machine learning module 200 onto the random access memory 120.”)
and a processor configured to execute the instructions to:
(LEE, p. 2, [0026] “FIG. 1 is a block diagram illustrating a computing device 100 according to some example embodiments of inventive concepts. Referring to FIG. 1, the computing device 100 may include one or more processors 110, a random access memory 120, one or more device drivers 130, one or more storage devices 140, one or more modem 150, and one or more user interfaces 160.”)
receive a target layout.
(LEE, p. 2, [0035] “FIG. 2 illustrates an example in which the semiconductor process machine learning module 200 of FIG. 1 performs generation of a layout. Referring to FIGS. 1 and 2, in operation S110, the semiconductor process machine learning module 200 may receive a first layout.”)
and apply a size correction to at least one pattern of the prediction layout based on the optimal layout
(LEE, p. 4, [0057] “When the inferred ACI image is not acceptable, in operation S260 the semiconductor process machine learning module 200 may adjust the features. For example, the semiconductor process machine learning module 200 may adjust patterns' own features such as sizes and/or shapes of the patterns.”)
LEE does not teach
generate, by the DRL module, a prediction layout by applying a simulation to the target layout
generate, by the DRL module, an optimal layout based on the prediction layout
However, TSUTSUI discloses
generate, by the DRL module, a prediction layout by applying a simulation to the target layout
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning.”)
Here, the reinforcement learning process performs the simulation. Further, the teacher data is considered to be the target layout.
generate, by the DRL module, an optimal layout based on the prediction layout
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning. Thus, an action value function Q(s.sub.t+1,a.sub.t+1) at the next time t+1 is estimated, and r.sub.t+1+maxQ(s.sub.t+1,a.sub.t+1) is regarded as teacher data. The first neural network 520 performs learning by using the teacher data for a loss function.”)
(TSUTSUI, p. 11 [0189] “Results of the inference of the action value function Q(s.sub.t,a.sub.t) by the first neural network 520 and the teacher data generated by the second neural network 530 are used to calculate a loss function. With the use of a stochastic gradient descent (SGD), the weight coefficient of the first neural network 520 is updated such that the value of the loss function becomes small.”)
Further, the teacher data is considered to be the first layout, i.e., the target layout.
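As an illustrative restatement for clarity (not a quotation from either reference), the teacher-data construction that TSUTSUI describes in [0181] corresponds to the standard Q-learning training target and loss:

```latex
% Teacher data (training target), per TSUTSUI [0181]:
y_t = r_{t+1} + \max_{a} Q(s_{t+1}, a)
% Loss function minimized by updating the weight coefficients \theta
% of the first neural network via SGD (cf. TSUTSUI [0189]):
L(\theta) = \bigl( y_t - Q(s_t, a_t; \theta) \bigr)^2
```

Note that a discount factor is commonly included in the Q-learning target, but the quoted passage writes the teacher data as r.sub.t+1+maxQ(s.sub.t+1,a.sub.t+1) without one, so it is omitted here.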
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI, to train the deep reinforcement learning module using simulations iteratively, as iterative simulation-based training is effective in refining the learned model.
Regarding claim 2
LEE teaches all the features of claim 1 as disclosed above.
LEE does not teach
The system of claim 1, wherein the DRL module comprises a deep neural network configured to generate value functions corresponding to a plurality of action inputs.
However, TSUTSUI teaches
The system of claim 1, wherein the DRL module comprises a deep neural network configured to generate value functions corresponding to a plurality of action inputs.
(TSUTSUI, p. 3, [0066] “Reinforcement learning can be used for layout design. As reinforcement learning, TD learning (Temporal Difference Learning), Q learning, or the like can be employed, for example.”
[0067] “It is particularly preferable that a learning algorithm using deep learning be employed in Q learning.”
[0068] “Q learning is a method in which the value of selection of an action a.sub.t by an agent at time t when the environment is in a state s.sub.t is learned. The agent means an agent that takes the action, and the environment means an object subject to the action. By the action a.sub.t by the agent, the environment makes a transition from a state S.sub.t to a state S.sub.t+1 and the agent receives reward r.sub.t+1. In Q learning, the action a.sub.t is learned such that the total amount of the obtained reward is peaked in the end. The value of taking the action a.sub.t in the state s.sub.t can be expressed as an action value function Q(s.sub.t,a.sub.t).”)
(TSUTSUI, p. 3, [0070] “ Deep learning can be used for estimation of the above-described action value function Q(s.sub.t,a.sub.t).”)
(TSUTSUI, p. 3, [0071] “It is particularly preferable that the layout design system that is one embodiment of the present invention use Deep Q-Learning for layout design and the Deep Q-Network have a configuration of a convolutional neural network (CNN).”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI, because the combination would allow the neural network to be configured to generate value functions for all required actions, yielding the predictable result of a well-trained neural network that produces an optimal layout.
Regarding claim 3
LEE and TSUTSUI teach all the features of claim 2 as disclosed above.
LEE further discloses
The system of claim 2, wherein the plurality of action inputs comprises an action corresponding to an adjustment of a size of each pattern of the target layout.
(LEE, p. 5, [0082] “In operation S460, when the inferred ACI image is not acceptable, the correction module 240 may modify the first layout L1. For example, the correction module 240 may adjust the features of the patterns corresponding to pattern dimensions, such as sizes, shapes, etc.”)
(LEE, p. 5, [0083] “In some example embodiments, the adjustment of the features of the patterns may be performed by the inference based on machine learning. The correction module 240 may perform an inference with respect to the difference between the inferred ACI image and the target ACI image to adjust the first layout L1. For example, the correction module 240 may perform an inference with respect to each of the patterns, with respect to a group of patterns, or an image of patterns.”)
Regarding claim 4
LEE and TSUTSUI teach all the features of claim 3 as disclosed above.
LEE further discloses
The system of claim 3, wherein each of the plurality of action inputs corresponds to a size adjustment applied at different times.
(LEE, p.9, [0149] “The correction module 240 may compare the size x1/y1 of the pattern in the ACI image 410 and the size x2/y2 of the pattern in the target image 420 to generate an error value dx/dy as shown in an image 430. For example, during the zero-th iteration (e.g., a first iteration), the size x/y of the pattern in the layout image 400 may be 100/100, the size x1/y1 of the pattern in the ACI image 410 may be 120/122, and the size x2/y2 of the pattern in the target image 420 may be 110/110.”)
(LEE, p. 9, [0150] “For example, during the first iteration, the size x/y of the pattern in the adjusted layout image 400 may be 90/98, and the size x1/y1 of the pattern in the adjusted ACI image 410 may be 108/109.”)
(LEE, p. 9, [0151] “For example, during the second iteration, the size x/y of the pattern in the adjusted layout image 400 may be 92/89, and the size x1/y1 of the pattern in the adjusted ACI image 410 may be 110.2/110.3.”)
(LEE, p. 9, [0152] “For example, during the N-th iteration, the size x/y of the pattern in the adjusted layout image 400 may be 92.2/89.4, and the size x1/y1 of the pattern in the adjusted ACI image 410 may be 110/110.”)
Regarding claim 5
LEE and TSUTSUI teach all the features of claim 2 as disclosed above.
LEE discloses part b of claim 5:
indicating a mutual influence of patterns of the prediction layout.
(LEE, p. 3, [0051] “In operation S220, the semiconductor process machine learning module 200 may extract features of patterns from the image of the first layout. For example, the semiconductor process machine learning module 200 may extract one or more features from each of the patterns. The semiconductor process machine learning module 200 may extract features of the same kind and/or features of different kinds, from the patterns.”)
(LEE, p. 3, [0052] “The features may include a characteristic (e.g., a size and/or a shape) of each of the patterns, along with an influence that each of the patterns experiences in etching from neighboring patterns placed around each pattern.”)
LEE does not teach
The system of claim 2, wherein the deep neural network comprises a convolutional neural network (CNN) trained with weights
However, TSUTSUI further discloses
The system of claim 2, wherein the deep neural network comprises a convolutional neural network (CNN) trained with weights
(TSUTSUI, p. 3, [0071] “It is particularly preferable that the layout design system that is one embodiment of the present invention use Deep Q-Learning for layout design and the Deep Q-Network have a configuration of a convolutional neural network (CNN). Note that in this specification and the like, a Deep Q-Network is sometimes simply referred to as a neural network.”)
(TSUTSUI, p. 11, [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value.”)
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1). For the second neural network 530, a convolutional neural network (CNN) configuration is preferably employed.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI and utilize the convolutional neural network (CNN) trained with weights indicating a mutual influence of patterns to achieve the optimal layout.
Regarding claim 6
LEE and TSUTSUI teach all the features of claim 2 as disclosed above.
LEE does not teach
The system of claim 2, wherein the DRL module comprises: an action value selector configured to select one of the value functions corresponding to one of the plurality of action inputs
and a loss function generator configured to generate a loss function by comparing a value function selected by the action value selector with a true value function based on the target layout.
However, TSUTSUI further discloses
The system of claim 2, wherein the DRL module comprises: an action value selector configured to select one of the value functions corresponding to one of the plurality of action inputs
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1).”)
(TSUTSUI, p. 11, [0183] “… The second neural network 530 illustrated in FIG. 17(B) includes an input layer 531, an intermediate layer 532, and an output layer 533. The image data LIMG.sub.t+1 of the layout, which is the input data, is shown in the input layer 531; an action value function Q(s.sub.t+1,a.sup.1) to an action value function Q(s.sub.t+1,a.sup.m) are shown as output data.”)
and a loss function generator configured to generate a loss function by comparing a value function selected by the action value selector with a true value function based on the target layout.
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning. Thus, an action value function Q(s.sub.t+1,a.sub.t+1) at the next time t+1 is estimated, and r.sub.t+1+maxQ(s.sub.t+1,a.sub.t+1) is regarded as teacher data. The first neural network 520 performs learning by using the teacher data for a loss function.”)
(TSUTSUI, p. 11 [0189] “Results of the inference of the action value function Q(s.sub.t,a.sub.t) by the first neural network 520 and the teacher data generated by the second neural network 530 are used to calculate a loss function. With the use of a stochastic gradient descent (SGD), the weight coefficient of the first neural network 520 is updated such that the value of the loss function becomes small.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI to generate action values, determine a loss function, and eventually update the weight coefficients of the neural network so that the value of the loss function becomes smaller. A well-trained neural network is essential for producing an optimal layout.
It must be noted that the “teacher data” is considered to be the target layout.
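As an illustrative sketch only (the function names and toy numbers are assumptions, not drawn from either reference), the action-value selection and loss-function generation described in TSUTSUI [0181] and [0189] can be outlined as:

```python
# Illustrative sketch: compare the selected action value Q(s_t, a_t) with the
# teacher data r_{t+1} + max_a Q(s_{t+1}, a) and compute a squared-error loss.
# In TSUTSUI, this loss drives the SGD update of the first network's weights.

def select_action_value(q_values, action_index):
    """Action value selector: pick the value function for the chosen action."""
    return q_values[action_index]

def teacher_data(reward, next_q_values):
    """Teacher data per TSUTSUI [0181]: r_{t+1} + max Q(s_{t+1}, a_{t+1})."""
    return reward + max(next_q_values)

def squared_loss(selected_q, target_q):
    """Loss function: squared error between selected value and teacher data."""
    return (selected_q - target_q) ** 2

# Toy example with three candidate actions:
q_t  = [0.2, 0.5, 0.1]   # Q(s_t, a) from the first neural network
q_t1 = [0.3, 0.4, 0.6]   # Q(s_{t+1}, a) from the second (frozen) network
r    = 1.0               # reward r_{t+1}

selected = select_action_value(q_t, 1)     # action a_t = 1 -> 0.5
target   = teacher_data(r, q_t1)           # 1.0 + 0.6 = 1.6
loss     = squared_loss(selected, target)  # (0.5 - 1.6)^2 = 1.21
```

In TSUTSUI's arrangement, the teacher data is produced by the second neural network 530 and the resulting loss is minimized by updating the weight coefficients of the first neural network 520 with stochastic gradient descent.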
Regarding claim 7
LEE and TSUTSUI teach all features of claim 1 as disclosed above.
LEE discloses parts b and c of claim 7:
that reduces a difference between the prediction layout and the target layout based on
(LEE, p. 4, [0056] “In operation S250, the semiconductor process machine learning module 200 may determine whether the inferred ACI image is acceptable. For example, when a difference between the inferred ACI image and a target ACI image (e.g., the image of the first layout) is greater than a threshold, the semiconductor process machine learning module 200 may determine the inferred ACI image as being not acceptable.”)
(LEE, p. 4, [0058] “In some example embodiments, the adjustment of the features may also be performed by the machine learning-based inference. The semiconductor process machine learning module 200 may perform the machine learning-based inference on the difference between the inferred ACI image and the target ACI image and may determine adjustment values of the features.”)
a correction of patterns used as an action input; and the target layout used as a state input
(LEE, p. 4, [0057] “When the inferred ACI image is not acceptable, in operation S260 the semiconductor process machine learning module 200 may adjust the features. For example, the semiconductor process machine learning module 200 may adjust patterns' own features such as sizes and/or shapes of the patterns. As the patterns' own features are adjusted, features of an influence that the patterns exert on neighboring patterns may also be updated.”)
(LEE, p. 4, [0058] “In some example embodiments, the adjustment of the features may also be performed by the machine learning-based inference. The semiconductor process machine learning module 200 may perform the machine learning-based inference on the difference between the inferred ACI image and the target ACI image and may determine adjustment values of the features. For example, the semiconductor process machine learning module 200 may perform an inference for each of the patterns, for each image, and/or for each group of patterns.”)
LEE does not teach
The system of claim 1, wherein the DRL module is configured to perform a reinforcement learning operation
However, TSUTSUI discloses
The system of claim 1, wherein the DRL module is configured to perform a reinforcement learning operation
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI and utilize a deep reinforcement learning neural network and benefit from its capabilities.
Regarding claim 8
LEE and TSUTSUI teach all the features of claim 7 as disclosed above.
LEE does not explicitly teach
The system of claim 7, wherein the optimal layout corresponds to a maximum action value in the reinforcement learning operation or is derived from a learning result having a maximum reward.
However, TSUTSUI discloses
The system of claim 7, wherein the optimal layout corresponds to a maximum action value in the reinforcement learning operation or is derived from a learning result having a maximum reward.
(TSUTSUI, p. 9, [0162] “An ϵ-greedy method may be used for the selection of the movement a. In an ϵ-greedy method, the movement a with the highest Q value is selected with a probability of (1−ϵ) and a movement a is selected randomly with a probability of ϵ (ϵ is larger than 0 and less than or equal to 1). For example, the movement a with the highest Q value can be selected with a probability of 0.95 (95%), and a movement a can be selected randomly with a probability of 0.05 (5%). When an ϵ-greedy method is employed, selected actions a can be prevented from being uneven, and a more optimal action value function Q can be learned.”)
Note that the “movement a” can easily be replaced by “size adjustment.”
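As an illustrative sketch only (the function name is an assumption, not from either reference), the ϵ-greedy selection described in [0162] can be outlined as:

```python
# Illustrative sketch of the epsilon-greedy rule in TSUTSUI [0162]:
# pick the action with the highest Q value with probability (1 - epsilon),
# otherwise pick an action at random.
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index chosen by the epsilon-greedy rule."""
    if rng.random() < epsilon:  # explore with probability epsilon
        return rng.randrange(len(q_values))
    # exploit: index of the highest Q value
    return max(range(len(q_values)), key=q_values.__getitem__)

# With epsilon = 0, the highest-Q action is always chosen.
best = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)  # -> index 1
```

In the context of the claims, each action index would correspond to a candidate size adjustment, consistent with the note above that “movement a” can be read as “size adjustment.”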
(TSUTSUI, p. 11, [0181] “… In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function;”)
(TSUTSUI, p. 3, [0068] “Q learning is a method in which the value of selection of an action a.sub.t by an agent at time t when the environment is in a state s.sub.t is learned. The agent means an agent that takes the action, and the environment means an object subject to the action. By the action a.sub.t by the agent, the environment makes a transition from a state S.sub.t to a state S.sub.t+1 and the agent receives reward r.sub.t+1.”)
(TSUTSUI, p. 11, [0184] “The sum of the highest Q value among the action value function Q(s.sub.t+1,a.sup.1) to the action value function Q(s.sub.t+1,a.sup.m) and the reward r.sub.t+1 can be used as the teacher data for the inference by the first neural network.”)
(TSUTSUI, p. 19, [0322]“As shown in FIG. 32, as the number of episodes increased, the cumulative reward increased. The cumulative reward saturated and the learning converged with approximately 20000 episodes.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI to produce the optimal layout by using the highest Q value among the action value functions corresponding to the saturated reward.
Regarding claim 9
LEE discloses parts a, b, c, and e of claim 9:
A learning method of a layout optimization system, the learning method comprising
(LEE, p. 1, [0007] “According to some example embodiments, a method for fabricating of a semiconductor device includes ...”)
receiving a target layout
(LEE, p. 1, [0007] “According to some example embodiments, a method for fabricating of a semiconductor device includes receiving a first layout including patterns for the fabrication of the semiconductor device.”)
generating a predicted layout based on the target layout,
(LEE, p. 1, “performing machine learning-based process proximity correction (PPC) based on features of the patterns of the first layout to generate a second layout”)
receiving a change to at least one pattern of the predicted layout as an action input;
(LEE, p.4, [0057] “When the inferred ACI image is not acceptable, in operation S260 the semiconductor process machine learning module 200 may adjust the features. For example, the semiconductor process machine learning module 200 may adjust patterns' own features such as sizes and/or shapes of the patterns.”)
LEE does not teach
generating a plurality of action values by performing a simulation on the predicted layout
selecting a first action value of the plurality of action values corresponding to the action input;
and determining a loss function by comparing the selected first action value with a second action value corresponding to the target layout.
However, TSUTSUI discloses
generating a plurality of action values by performing a simulation on the predicted layout
(TSUTSUI, p. 11, [0180] “Step S32 is a step of estimating the action value function Q(s.sub.t+1,a.sub.t+1) by the processing portion 103.”)
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning.”)
Here, the reinforcement learning process performs the simulation.
selecting a first action value of the plurality of action values corresponding to the action input
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1).”)
and determining a loss function by comparing the selected first action value with a second action value corresponding to the target layout.
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning. Thus, an action value function Q(s.sub.t+1,a.sub.t+1) at the next time t+1 is estimated, and r.sub.t+1+maxQ(s.sub.t+1,a.sub.t+1) is regarded as teacher data. The first neural network 520 performs learning by using the teacher data for a loss function.”)
(TSUTSUI, p. 11 [0189] “Results of the inference of the action value function Q(s.sub.t,a.sub.t) by the first neural network 520 and the teacher data generated by the second neural network 530 are used to calculate a loss function.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI to train the deep reinforcement learning module using simulations iteratively, as iterative simulation-based training is effective in refining the learned model, and further to generate action values, determine a loss function, and eventually update the weight coefficients of the neural network so that the value of the loss function becomes smaller. A well-trained neural network is essential for producing an optimal layout.
Regarding claim 10
LEE and TSUTSUI teach all the features of claim 9 as disclosed above.
LEE does not teach
The learning method of claim 9, wherein the generating the plurality of action values is performed using a convolutional neural network (CNN).
However, TSUTSUI further discloses
The learning method of claim 9, wherein the generating the plurality of action values is performed using a convolutional neural network (CNN).
(TSUTSUI, p. 8, [0153] “Step S27 is a step of estimating the action value function Q(s.sub.t,a.sub.t) by the processing portion 103”).
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1). For the second neural network 530, a convolutional neural network (CNN) configuration is preferably employed.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI and utilize a deep learning convolutional neural network to benefit from all its capabilities.
Regarding claim 11
LEE and TSUTSUI teach all the features of claim 10 as disclosed above.
LEE does not teach
The learning method of claim 10, further comprising: receiving, by the CNN, the target layout as an input layer; and outputting, by the CNN, the plurality of action values from an output layer
However, TSUTSUI further teaches
The learning method of claim 10, further comprising: receiving, by the CNN, the target layout as an input layer; and outputting, by the CNN, the plurality of action values from an output layer.
(TSUTSUI, p. 11, [0181] “… In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning. Thus, an action value function Q(s.sub.t+1,a.sub.t+1) at the next time t+1 is estimated, and r.sub.t+1+maxQ(s.sub.t+1,a.sub.t+1) is regarded as teacher data.”)
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1). For the second neural network 530, a convolutional neural network (CNN) configuration is preferably employed.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI and utilize a deep learning convolutional neural network to benefit from all its capabilities.
Note that the teacher data is considered to be the target layout.
Regarding claim 12
LEE and TSUTSUI teach all the features of claim 11 as disclosed above.
LEE does not teach
The learning method of claim 11, wherein the CNN comprises a weight indicating an effect of a change between patterns of the target layout.
However, TSUTSUI further teaches
The learning method of claim 11, wherein the CNN comprises a weight indicating an effect of a change between patterns of the target layout.
(TSUTSUI, p. 3, [0071] “It is particularly preferable that the layout design system that is one embodiment of the present invention use Deep Q-Learning for layout design and the Deep Q-Network have a configuration of a convolutional neural network (CNN). Note that in this specification and the like, a Deep Q-Network is sometimes simply referred to as a neural network.”)
(TSUTSUI, p. 11, [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small.”)
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1). For the second neural network 530, a convolutional neural network (CNN) configuration is preferably employed.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI and utilize a deep learning convolutional neural network to benefit from its capabilities in processing layout image data.
Regarding claim 13
LEE and TSUTSUI teach all the features of claim 9 as disclosed above.
LEE further discloses
The learning method of claim 9, wherein the action input corresponds to a size adjustment of at least one pattern of the target layout.
(LEE, p. 5, [0082] “In operation S460, when the inferred ACI image is not acceptable, the correction module 240 may modify the first layout L1. For example, the correction module 240 may adjust the features of the patterns corresponding to pattern dimensions, such as sizes, shapes, etc.”)
(LEE, p. 5, [0083] “In some example embodiments, the adjustment of the features of the patterns may be performed by the inference based on machine learning. The correction module 240 may perform an inference with respect to the difference between the inferred ACI image and the target ACI image to adjust the first layout L1. For example, the correction module 240 may perform an inference with respect to each of the patterns, with respect to a group of patterns, or an image of patterns.”)
Regarding claim 14
LEE and TSUTSUI teach all the features of claim 13 as disclosed above.
LEE further discloses
The learning method of claim 13, wherein the action input comprises a size adjustment applied a plurality of times at different time points for the at least one pattern of the target layout.
(LEE, p.9, [0149] “The correction module 240 may compare the size x1/y1 of the pattern in the ACI image 410 and the size x2/y2 of the pattern in the target image 420 to generate an error value dx/dy as shown in an image 430. For example, during the zero-th iteration (e.g., a first iteration), the size x/y of the pattern in the layout image 400 may be 100/100, the size x1/y1 of the pattern in the ACI image 410 may be 120/122, and the size x2/y2 of the pattern in the target image 420 may be 110/110.”)
(LEE, p. 9, [0150] “For example, during the first iteration, the size x/y of the pattern in the adjusted layout image 400 may be 90/98, and the size x1/y1 of the pattern in the adjusted ACI image 410 may be 108/109.”)
(LEE, p. 9, [0151] “For example, during the second iteration, the size x/y of the pattern in the adjusted layout image 400 may be 92/89, and the size x1/y1 of the pattern in the adjusted ACI image 410 may be 110.2/110.3.”)
(LEE, p. 9, [0152] “For example, during the N-th iteration, the size x/y of the pattern in the adjusted layout image 400 may be 92.2/89.4, and the size x1/y1 of the pattern in the adjusted ACI image 410 may be 110/110.”)
(LEE, p. 5, [0084] “In operation S440, the machine learning module 220 may perform an inference with respect to the adjusted first layout L1 and generate an ACI image. The machine learning module 220 and the correction module 240 may repeat operations S440 through S460 until the inferred ACI image becomes acceptable.”)
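For illustration only (this sketch is not part of LEE's disclosure), the repeated size adjustment across iterations described in LEE [0149]-[0152] and [0084] can be modeled as a feedback loop: the layout dimension is adjusted at each time point until the simulated ACI dimension approaches the target dimension. The process model `simulate_aci`, the gain factor, and the tolerance are hypothetical assumptions introduced solely to illustrate the iterative behavior:

```python
# Illustrative sketch (not LEE's actual algorithm): a layout dimension is
# adjusted a plurality of times, at different time points, until the
# simulated ACI dimension matches the target (cf. LEE [0149]-[0152]).
# The process model and gain factor below are hypothetical assumptions.

def simulate_aci(layout_size):
    # Hypothetical process model: etching enlarges the pattern by 20%.
    return layout_size * 1.2

def correct_layout(layout_size, target_aci, gain=0.5, tol=0.5, max_iters=50):
    for _ in range(max_iters):
        aci = simulate_aci(layout_size)
        error = aci - target_aci          # error value dx/dy (cf. image 430)
        if abs(error) <= tol:
            break
        layout_size -= gain * error       # adjust the pattern size
    return layout_size, simulate_aci(layout_size)

# Starting from a 100-unit layout dimension with a 110-unit ACI target,
# the loop shrinks the layout until the simulated ACI nears the target.
layout, aci = correct_layout(100.0, 110.0)
```

In this toy loop, the layout dimension decreases over successive iterations (analogous to LEE's 100 → 90 → 92 → 92.2 progression) while the simulated ACI dimension converges toward the 110 target.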
Regarding claim 15
LEE and TSUTSUI teach all the features of claim 9 as disclosed above.
LEE further teaches
The learning method of claim 9, further comprising: receiving a size adjustment as an action input,
(LEE, p. 5, [0082] “In operation S460, when the inferred ACI image is not acceptable, the correction module 240 may modify the first layout L1. For example, the correction module 240 may adjust the features of the patterns corresponding to pattern dimensions, such as sizes, shapes, etc.”)
LEE does not teach
selecting one of the plurality of action values
and determining the loss function in an operation loop
However, TSUTSUI discloses
selecting one of the plurality of action values
(TSUTSUI, p. 11, [0180] “Step S32 is a step of estimating the action value function Q(s.sub.t+1,a.sub.t+1) by the processing portion 103.”)
and determining the loss function in an operation loop.
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small.”)
(TSUTSUI, p. 8, [0153] “Step S27 is a step of estimating the action value function Q(s.sub.t,a.sub.t) by the processing portion 103.”)
(TSUTSUI, p. 8, [0154] “The processing portion 103 includes a first neural network 520 and estimates the action value function Q(s.sub.t,a.sub.t) with the first neural network 520. In the first neural network 520, the input data is the image data LIMG.sub.t of the layout generated in Step S26 and the output data is the action value function Q(s.sub.t,a.sub.t).”)
(TSUTSUI, p. 11, [0181] “… The first neural network 520 performs learning by using the teacher data for a loss function.”)
(TSUTSUI, p. 11 [0189] “Results of the inference of the action value function Q(s.sub.t,a.sub.t) by the first neural network 520 and the teacher data generated by the second neural network 530 are used to calculate a loss function. With the use of a stochastic gradient descent (SGD), the weight coefficient of the first neural network 520 is updated such that the value of the loss function becomes small.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI to determine the loss function for action values corresponding to an action input of size adjustment, for the purpose of updating the weight coefficients.
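For illustration only (this sketch is not part of TSUTSUI's disclosure), the loss-function determination quoted above from TSUTSUI [0181] and [0189] can be summarized as: the teacher data is r.sub.t+1 + max Q(s.sub.t+1, a.sub.t+1), the loss compares the first network's estimate Q(s.sub.t, a.sub.t) against that teacher data, and the weight coefficient is updated by gradient descent so the loss becomes small. The single-weight linear "network" and the learning rate below are hypothetical simplifications introduced solely to illustrate the update:

```python
# Illustrative sketch (not TSUTSUI's actual implementation) of the
# Q-learning loss in TSUTSUI [0181]/[0189]: teacher data is
# r_{t+1} + max Q(s_{t+1}, a_{t+1}); the weight coefficient is updated
# by gradient descent so the loss becomes small. The one-weight linear
# model and learning rate are hypothetical simplifications.

def q_value(weight, state_feature):
    # Stand-in for the first neural network's estimate Q(s_t, a_t).
    return weight * state_feature

def td_loss(weight, state_feature, reward, next_q_values):
    teacher = reward + max(next_q_values)   # r_{t+1} + max Q(s_{t+1}, a_{t+1})
    error = q_value(weight, state_feature) - teacher
    return error ** 2                       # squared-error loss

def sgd_step(weight, state_feature, reward, next_q_values, lr=0.01):
    teacher = reward + max(next_q_values)
    error = q_value(weight, state_feature) - teacher
    return weight - lr * 2 * error * state_feature  # gradient of the squared error

# Repeated updates drive the estimate toward the teacher value
# 3.0 (= reward 1.0 + max(0.5, 2.0)), making the loss small.
w = 0.0
for _ in range(200):
    w = sgd_step(w, state_feature=1.0, reward=1.0, next_q_values=[0.5, 2.0])
```

The loop mirrors the operation loop of the claim: each pass selects the maximum of the candidate action values, determines the loss against the teacher data, and updates the weight so that the value of the loss function becomes small.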
Regarding claim 16
LEE and TSUTSUI teach all the features of claim 15 as disclosed above.
LEE does not teach
The learning method of claim 15, further comprising selecting, in the operation loop, a layout pattern corresponding to an action value that minimizes the loss function as an optimal layout.
TSUTSUI further discloses
The learning method of claim 15, further comprising selecting, in the operation loop, a layout pattern corresponding to an action value that minimizes the loss function as an optimal layout.
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function;”)
(TSUTSUI, p. 11, [0181] “… The first neural network 520 performs learning by using the teacher data for a loss function.”)
(TSUTSUI, p. 11 [0189] “Results of the inference of the action value function Q(s.sub.t,a.sub.t) by the first neural network 520 and the teacher data generated by the second neural network 530 are used to calculate a loss function. With the use of a stochastic gradient descent (SGD), the weight coefficient of the first neural network 520 is updated such that the value of the loss function becomes small.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI for the purpose of obtaining the optimal layout.
Regarding claim 17
LEE teaches parts a and d of claim 17.
A method, comprising: receiving a target layout
(LEE, p. 1 “[0007] According to some example embodiments, a method for fabricating of a semiconductor device includes receiving a first layout including patterns for the fabrication of the semiconductor device.”)
generating a prediction layout
(LEE, p. 1, [0007] “According to some example embodiments, a method for fabricating of a semiconductor device includes receiving a first layout including patterns for the fabrication of the semiconductor device, performing machine learning-based process proximity correction (PPC) based on features of the patterns of the first layout to generate a second layout.”)
and applying a size correction to at least one pattern of the prediction layout based on the optimal layout.
(LEE, p. 4, [0061] “In operation S260 where the features are adjusted, patterns' own features such as sizes and/or shapes of the patterns are adjusted. Accordingly, the second image may be generated from the target ACI image by revising patterns' own portions such as sizes and shapes of the patterns, based on the adjusted features.”)
LEE does not teach
by applying a simulation to the target layout
generating an optimal layout based on the prediction layout
However, TSUTSUI discloses
by applying a simulation to the target layout
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning.”)
Note that the reinforcement learning process is considered to perform the simulation.
generating an optimal layout based on the prediction layout
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning. Thus, an action value function Q(s.sub.t+1,a.sub.t+1) at the next time t+1 is estimated, and r.sub.t+1+maxQ(s.sub.t+1,a.sub.t+1) is regarded as teacher data. The first neural network 520 performs learning by using the teacher data for a loss function.”)
(TSUTSUI, p. 11 [0189] “Results of the inference of the action value function Q(s.sub.t,a.sub.t) by the first neural network 520 and the teacher data generated by the second neural network 530 are used to calculate a loss function. With the use of a stochastic gradient descent (SGD), the weight coefficient of the first neural network 520 is updated such that the value of the loss function becomes small.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI to train the deep reinforcement learning module iteratively using simulations, as iteration is effective in refining trained models, and to utilize a Q-learning system for the purpose of producing the optimal layout by minimizing the loss function.
Further, the teacher data is considered to correspond to the first and target layouts.
Regarding claim 18
LEE and TSUTSUI teach all the features of claim 17 as disclosed above.
LEE discloses part b of claim 18.
indicate a mutual influence of the at least one pattern of the prediction layout.
(LEE, p. 3, [0051] “In operation S220, the semiconductor process machine learning module 200 may extract features of patterns from the image of the first layout. For example, the semiconductor process machine learning module 200 may extract one or more features from each of the patterns. The semiconductor process machine learning module 200 may extract features of the same kind and/or features of different kinds, from the patterns.”
[0052] “The features may include a characteristic (e.g., a size and/or a shape) of each of the patterns, along with an influence that each of the patterns experiences in etching from neighboring patterns placed around each pattern.”)
LEE does not teach
The method of claim 17, wherein the simulation is applied based on a convolutional neural network (CNN) trained with weights that numerically
However, TSUTSUI discloses
The method of claim 17, wherein the simulation is applied based on a convolutional neural network (CNN) trained with weights that numerically
(TSUTSUI, p. 3, [0071] “It is particularly preferable that the layout design system that is one embodiment of the present invention use Deep Q-Learning for layout design and the Deep Q-Network have a configuration of a convolutional neural network (CNN). Note that in this specification and the like, a Deep Q-Network is sometimes simply referred to as a neural network.”)
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value.”)
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1). For the second neural network 530, a convolutional neural network (CNN) configuration is preferably employed.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI and utilize a convolutional neural network (CNN) trained with weights indicating a mutual influence of patterns to achieve the optimal layout.
Regarding claim 19
LEE and TSUTSUI teach all the features of claim 17 as disclosed above.
LEE discloses part b of claim 19.
receiving a size adjustment for the at least one pattern of the prediction layout as an action input;
(LEE, p. 4, [0057] “When the inferred ACI image is not acceptable, in operation S260 the semiconductor process machine learning module 200 may adjust the features. For example, the semiconductor process machine learning module 200 may adjust patterns' own features such as sizes and/or shapes of the patterns.”)
LEE does not teach
The method of claim 17, further comprising performing deep reinforcement learning (DRL)
selecting a first action value of a plurality of action values corresponding to the action input
and determining a loss function by comparing the selected first action value with a second action value corresponding to the target layout
However, TSUTSUI discloses
The method of claim 17, further comprising performing deep reinforcement learning (DRL)
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning.”)
selecting a first action value of a plurality of action values corresponding to the action input
(TSUTSUI, p. 11, [0182] “The processing portion 103 includes a second neural network 530 and estimates the action value function Q(s.sub.t+1,a.sub.t+1) with the second neural network 530. In the second neural network 530, the input data is the image data LIMG.sub.t+1 of the layout generated in Step S31 and the output data is the action value function Q(s.sub.t+1,a.sub.t+1).”)
and determining a loss function by comparing the selected first action value with a second action value corresponding to the target layout.
(TSUTSUI, p. 11 [0181] “In general, in learning of deep learning, the weight coefficient of a neural network is updated such that an error between output data and teacher data becomes small. The update of a weight coefficient is repeated until the error between the output data and the teacher data becomes a certain value. In Q learning, which is a kind of reinforcement learning, the purpose of the learning is to search for the optimal Q function; however, the optimal Q function is not found during the learning. Thus, an action value function Q(s.sub.t+1,a.sub.t+1) at the next time t+1 is estimated, and r.sub.t+1+maxQ(s.sub.t+1,a.sub.t+1) is regarded as teacher data. The first neural network 520 performs learning by using the teacher data for a loss function.”)
(TSUTSUI, p. 11 [0189] “Results of the inference of the action value function Q(s.sub.t,a.sub.t) by the first neural network 520 and the teacher data generated by the second neural network 530 are used to calculate a loss function. With the use of a stochastic gradient descent (SGD), the weight coefficient of the first neural network 520 is updated such that the value of the loss function becomes small.”)
Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine the teachings of LEE and TSUTSUI to utilize a deep reinforcement learning module, such as a Q-learning system, to generate action values and then determine a loss function, eventually updating the weight coefficients of the neural network so that the value of the loss function becomes smaller. A well-trained neural network is essential for producing an optimal layout.
It must be noted that the “teacher data” is considered to correspond to the target layout.
Regarding claim 20
LEE teaches all the features of claim 17 as disclosed above.
LEE further discloses
The method of claim 17, wherein the target layout corresponds to an after cleaning inspection (ACI) critical dimension (CD).
(LEE, p. 2, [0035] “FIG. 2 illustrates an example in which the semiconductor process machine learning module 200 of FIG. 1 performs generation of a layout. Referring to FIGS. 1 and 2, in operation S110, the semiconductor process machine learning module 200 may receive a first layout. For example, the first layout may be a target layout that an operator/technician/engineer wants to or intends to obtain in ACI (after cleaning inspection).”)
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAYAPPU SOUNDRANAYAGAM whose telephone number is (571)272-0629. The examiner can normally be reached Mon-Fri: 8:00AM-5:00PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jack Chiang can be reached at (571) 272-7483. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/R.S./ Examiner, Art Unit 2851
/JACK CHIANG/Supervisory Patent Examiner, Art Unit 2851