Application No. 18/192,954

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-20 have been examined.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claim 14 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 14 includes the following clause: “… such that an updated predicted number of SWAP gates that are to be scheduled during said determining the routing, via the reinforcement learning model, of the quantum gates, wherein: …” This clause appears to be incomplete, as the recited “updated predicted number of SWAP gates” is a noun phrase with no associated predicate. A simplified version of this clause could be “such that an updated number of gates, wherein.” What happens to the updated number of gates? The clause is unclear.
For the purpose of further examination, the limitation will be interpreted as “… such that the predicted number of SWAP gates that are to be scheduled during said determining the routing, via the reinforcement learning model, of the quantum gates is updated, wherein: …”

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2 and 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over “Quantum Compiling by Deep Reinforcement Learning” by Moro et al. (“Moro”) in view of U.S. Patent Application Publication 20210157662 by Heckey et al. (“Heckey”) and U.S. Patent Application Publication 20200394544 by Low et al. (“Low”).

In regard to claim 1, Moro discloses:

1. A system, comprising: one or more computing devices of a … network configured to implement a quantum computing service, wherein the quantum computing service is configured to enable execution of quantum circuits using a plurality of quantum hardware devices; and one or more computing devices of the … network configured to implement a quantum compilation service configured to compile instructions comprising a quantum circuit mapping for executing a logical quantum circuit using a given one of the quantum hardware devices,

See Moro bottom of p.
7, “As a concluding remark, we observe that this approach can be specialized, taking into account any hardware constraints that limit operations, integrating them directly into the environments.” Also middle of p. 8 under “Software and Hardware,” e.g. “Intel Xeon W-2195 and a Nvidia GV100.”

Moro does not expressly disclose: … service provider…. This is taught by Heckey. See Heckey, Fig. 1 element 100 “Service Provider Network.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Heckey’s service provider network with Moro’s devices in order to enable potential users of quantum computers to access quantum computers based on multiple different quantum computing technologies and/or paradigms, without the cost and resources required to build or manage such quantum computers as suggested by Heckey (see ¶ 0071).

Moro also discloses: wherein to implement the quantum compilation service, the one or more computing devices are further configured to: implement a reinforcement-learning-based (RL-based) quantum circuit router,

Moro, pp. 1-2, e.g.: By exploiting a deep reinforcement learning algorithm, it would be in principle possible to train an agent to generalize how to map any unitary operator into a sequence of elementary gates within an arbitrary precision. … In the framework of quantum compiling, the environment consists of a quantum circuit that starts as the identity at the beginning of each episode. It is built incrementally at each time-step by the agent, choosing a gate from B according to the policy π encoded in the deep neural network, as shown in Figure 1. Therefore, the available actions that the agent can perform correspond to the gates in the base B.
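By way of illustration only, the incremental, gate-by-gate circuit construction that Moro describes (an agent choosing gates from a base B until a target unitary is approximated within a tolerance ε) can be sketched as follows. The gate base, the distance measure, and the greedy stand-in policy are illustrative assumptions for this sketch, not Moro’s actual implementation:

```python
import numpy as np

# Illustrative (hypothetical) gate base B: a small set of single-qubit unitaries.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])
BASE = {"H": H, "T": T, "T_dag": T.conj().T}

def fidelity_distance(U, V):
    """Distance from the Hilbert-Schmidt overlap (0 = identical up to phase)."""
    d = U.shape[0]
    return 1.0 - abs(np.trace(U.conj().T @ V)) / d

def run_episode(target, policy, tolerance=0.1, max_steps=200):
    """Build a circuit incrementally: start from the identity and, at each
    time-step, append the gate from the base chosen by the policy."""
    circuit = np.eye(2, dtype=complex)
    gates = []
    for _ in range(max_steps):
        name = policy(circuit, target)   # agent's action: a gate from B
        circuit = BASE[name] @ circuit   # environment applies the gate
        gates.append(name)
        if fidelity_distance(target, circuit) < tolerance:
            break                        # target approximated within tolerance
    return gates, circuit

# Stand-in "policy" (not a trained network): greedily pick the gate that
# most reduces the distance to the target.
def greedy_policy(circuit, target):
    return min(BASE, key=lambda n: fidelity_distance(target, BASE[n] @ circuit))

gates, approx = run_episode(H, greedy_policy, tolerance=0.05)
```

In Moro, the policy π is encoded in a deep neural network trained with DQL or PPO; the greedy rule above merely stands in for it to keep the sketch self-contained.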
wherein the RL-based quantum circuit router is configured to: receive a request to generate the quantum circuit mapping; generate, via a reinforcement learning model, one or more results of the quantum circuit mapping,

Moro, 3rd and 8th paragraphs on p. 2: In this work, we propose a novel approach to quantum compiling, exploiting deep reinforcement learning to approximate, with competitive tolerance, single-qubit unitary operators as circuits made by an arbitrary initial set of elementary quantum gates. … In this work, we ask the agent to approximate any single-qubit unitary matrix U, within a fixed tolerance ε.

wherein the one or more results comprise an ordering of quantum gates and … to be performed to execute the logical quantum circuit using the given one of the quantum hardware devices; and

Moro, pp. 1-2, e.g.: In the framework of quantum compiling, the environment consists of a quantum circuit that starts as the identity at the beginning of each episode. It is built incrementally at each time-step by the agent, choosing a gate from B according to the policy π encoded in the deep neural network, as shown in Figure 1. Therefore, the available actions that the agent can perform correspond to the gates in the base B.

Moro does not expressly disclose: SWAP gates. This is taught by Low. See Low, ¶ 0004, “Via the network of swap gates, the computation of the numerous many-body terms may be parallelized within the quantum computer, resulting in a significantly decreased circuit depth of the computer.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Low’s network of swap gates with Moro’s circuit in order to enable parallelization resulting in a significantly decreased circuit depth of the computer as suggested by Low.

Moro and Heckey also teach: provide the one or more generated results to the quantum computing service,

See Moro 2nd paragraph on p.
4, “At the end of the learning, the agent discovered an approximating circuit made by 76 gates only, within the target tolerance.” Also see Heckey Fig. 1 as cited above.

Moro does not expressly disclose: wherein the one or more computing devices that implement the quantum computing service are further configured to submit the compiled instructions comprising the one or more generated results for use in execution of the logical quantum circuit using the given quantum hardware device. This is taught by Heckey. See Heckey ¶ 0091, “Quantum circuits that have been translated by translation module 112 may be provided to back-end API transport module 110 in order for the translated quantum circuits to be transported to a quantum computer at a respective quantum hardware provider location.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Heckey’s translated circuit submission with Moro’s reinforcement learning results in order to enable potential users of quantum computers to access quantum computers based on multiple different quantum computing technologies and/or paradigms, without the cost and resources required to build or manage such quantum computers as suggested by Heckey (see ¶ 0071).

In regard to claim 2, Moro also discloses:

2. The system of claim 1, wherein to generate, via the reinforcement learning model, the one or more results of the quantum circuit mapping, the RL-based quantum circuit router is further configured to: select an action of a plurality of actions that change a current state of the quantum circuit mapping being generated, … update the current state of the quantum circuit mapping being generated to an updated state of the quantum circuit mapping being generated based, at least in part, on the selected action.

Moro, bottom of p.
2, e.g.: In the framework of quantum compiling, the environment consists of a quantum circuit that starts as the identity at the beginning of each episode. It is built incrementally at each time-step by the agent, choosing a gate from B according to the policy π encoded in the deep neural network, as shown in Figure 1. Therefore, the available actions that the agent can perform correspond to the gates in the base B.

Moro does not expressly disclose: wherein the selected action causes a SWAP gate to be scheduled such that one or more respective ones of the quantum gates of the logical quantum circuit are additionally scheduled; and This is taught by Low. See Low, ¶ 0005, “The qubit swap operations may be implemented by the network of quantum gates.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Low’s network of gates with Moro’s circuit in order to enable parallelization resulting in a significantly decreased circuit depth of the computer as suggested by Low.

In regard to claim 6, Moro and Heckey also teach:

6. The system of claim 1, wherein: the given one of the quantum hardware devices is a quantum hardware device of a quantum hardware provider; the quantum hardware provider is accessible to the quantum compilation service via the service provider network; and physical qubit connectivity information corresponding to the given one of the quantum hardware devices is provided via the service provider network.

See Heckey, Fig. 1, depicting physical connectivity of hardware provided by a service provider network.

In regard to claim 7, Moro also discloses:

7. The system of claim 1, wherein the RL-based quantum circuit router is further configured to: update one or more rewards of the reinforcement learning model of the RL-based quantum circuit router based, at least in part, on the one or more generated results of the quantum circuit mapping.

Moro, p.
2, under “Deep reinforcement learning”: According to a policy function that fully determines its behavior, the former interacts with the latter at discrete time-steps, performing an action based on an observation related to the current state of the environment. Therefore, the environment evolves changing its state and returning a reward signal, that can be interpreted as a measure of the adequateness of the action the agent has performed.

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Moro in view of Heckey and Low as applied above, and further in view of U.S. Patent Application Publication 20210278825 by Wen et al. (“Wen”).

In regard to claim 3, Moro also discloses:

3. The system of claim 1, wherein the one or more computing devices of the service provider network configured to implement the RL-based quantum circuit router or one or more additional computing devices of the service provider network are configured to: train the RL-based quantum circuit router,

Moro, p. 3, e.g. “In this work we exploit Deep Q-Learning (DQL) and Proximal Policy Optimization (PPO) algorithms to train the agents, depending on the reward function.”

… determine loss values corresponding to respective ones of the projected quantum gate scheduling paths, wherein the determined loss values are provided to a value network to update rewards of the reinforcement learning model used by a policy network of the RL-based quantum circuit router.

Moro, in the description of Fig. 1, e.g. “At each time-step n the agent receives the current observation O_n and based on that information it chooses the next gate to apply on the quantum circuit.
Therefore, the environment returns the real-valued reward r_n to the agent.”

Moro does not expressly disclose: wherein to train the RL-based quantum circuit router, said one or more computing devices or the one or more additional computing devices are further configured to: cause a Monte Carlo Tree Search (MCTS) algorithm to be performed, wherein the MCTS algorithm forecasts projected quantum gate scheduling paths based on respective ones of the plurality of actions; and This is taught by Wen. See ¶ 0004, “MCTS.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Moro’s RL-based action scheduling with Wen’s MCTS in order to utilize an efficient and quick real time search to identify the optimal policy from the one or more possible policies as suggested by Wen (see ¶ 0004).

Claims 4-5 are rejected under 35 U.S.C. 103 as being unpatentable over Moro in view of Heckey and Low as applied above, and further in view of U.S. Patent Application Publication 20190018721 by Wallman et al. (“Wallman”).

In regard to claim 4, Moro does not expressly disclose the claimed limitations. The following limitations are taught by Wallman.

4. The system of claim 1, wherein to compile instructions comprising the quantum circuit mapping, the one or more computing devices of the service provider network configured to implement the quantum compilation service are further configured to: generate initial qubit allocation information, wherein logical qubits of the logical quantum circuit are respectively assigned to one or more physical qubits located on the given one of the quantum hardware devices; and provide the request to generate the quantum circuit mapping to the RL-based quantum circuit router, wherein the request comprises the generated initial qubit allocation information.

See Wallman Figure 3C, depicting an initial allocation.
Also see ¶ 0072, “Thus, the modified quantum-logic gate sequence 300C shown in FIG. 3C represents the initial quantum-logic gate sequence 300A shown in FIG. 3A transformed by application of the virtual random gates 308 shown in FIG. 3B.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Wallman’s initial allocation with Moro’s mapping request in order to reduce the effect of noise in a quantum information processor as suggested by Wallman (see ¶ 0061).

In regard to claim 5, Moro and Wallman also teach:

5. The system of claim 4, wherein: the request to generate the quantum circuit mapping further comprises a noise model corresponding to the given one of the quantum hardware devices; and the generated initial qubit allocation is additionally based, at least in part, on the noise model.

Wallman, ¶ 0067, “FIGS. 3A, 3B, and 3C are schematic diagrams showing an example of noise tailoring applied to a quantum-logic gate sequence.”

Claims 8, 11, 13-17 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Moro in view of Low and Wen.

In regard to claim 8, Moro discloses:

8. A method, comprising: See Moro, at least p. 7, under “Methods.”

receiving a request to generate compiled instructions comprising a quantum circuit mapping for executing a logical quantum circuit using a quantum hardware device;

Moro, 3rd and 8th paragraphs on p. 2: In this work, we propose a novel approach to quantum compiling, exploiting deep reinforcement learning to approximate, with competitive tolerance, single-qubit unitary operators as circuits made by an arbitrary initial set of elementary quantum gates. … In this work, we ask the agent to approximate any single-qubit unitary matrix U, within a fixed tolerance ε.

determining a routing, via a reinforcement learning model, of quantum gates of the logical quantum circuit to physical qubits of the quantum hardware device,

Moro, pp.
1-2, e.g.: In the framework of quantum compiling, the environment consists of a quantum circuit that starts as the identity at the beginning of each episode. It is built incrementally at each time-step by the agent, choosing a gate from B according to the policy π encoded in the deep neural network, as shown in Figure 1. Therefore, the available actions that the agent can perform correspond to the gates in the base B.

wherein said determining the routing comprises: determining a plurality of actions that change a current state of the quantum circuit mapping being generated, … selecting an action from the plurality of actions; updating the current state of the quantum circuit mapping being generated to an updated state of the quantum circuit mapping being generated based, at least in part, on the selected action;

Moro, bottom of p. 2, e.g.: In the framework of quantum compiling, the environment consists of a quantum circuit that starts as the identity at the beginning of each episode. It is built incrementally at each time-step by the agent, choosing a gate from B according to the policy π encoded in the deep neural network, as shown in Figure 1. Therefore, the available actions that the agent can perform correspond to the gates in the base B. Also see Fig. 1 on p. 3, depicting the iterative process of reinforcement learning which includes selection of an action associated with an observation of an updated environment.

Moro does not expressly disclose: … wherein respective ones of the actions cause respective SWAP gates to be scheduled such that one or more respective ones of the quantum gates of the logical quantum circuit are additionally scheduled; This is taught by Low.
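As purely technical background (an illustration of qubit routing in general, not drawn from any of the cited references), scheduling SWAP gates so that a two-qubit gate between non-adjacent qubits can execute on constrained hardware may be sketched as follows. The linear coupling map, the logical-to-physical mapping, and the shortest-path heuristic are hypothetical, and the sketch assumes every physical qubit on the path hosts a logical qubit:

```python
from collections import deque

# Hypothetical linear coupling map: physical qubit i connects only to i +/- 1.
COUPLING = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

def shortest_path(coupling, start, goal):
    """Breadth-first search for a shortest path between physical qubits."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in coupling[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    path = [goal]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1]

def route_two_qubit_gate(mapping, coupling, q1, q2):
    """Insert SWAPs (on physical qubits) until logical qubits q1 and q2 sit on
    adjacent physical qubits; `mapping` (logical -> physical) is updated in place."""
    inverse = {p, } if False else {p: l for l, p in mapping.items()}
    path = shortest_path(coupling, mapping[q1], mapping[q2])
    swaps = []
    while len(path) > 2:
        a, b = path[0], path[1]
        la, lb = inverse[a], inverse[b]   # logical qubits on physical a and b
        mapping[la], mapping[lb] = b, a   # exchange their physical positions
        inverse[a], inverse[b] = lb, la
        swaps.append((a, b))
        path = path[1:]
    return swaps

mapping = {0: 0, 1: 1, 2: 2, 3: 3}      # logical -> physical, initially identity
swaps = route_two_qubit_gate(mapping, COUPLING, 0, 3)
```

On this 4-qubit line, a gate between logical qubits 0 and 3 requires two SWAPs before the qubits become adjacent; an RL router of the kind claimed learns which such SWAP insertions to schedule.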
See ¶ 0004, “Via the network of swap gates, the computation of the numerous many-body terms may be parallelized within the quantum computer, resulting in a significantly decreased circuit depth of the computer.” Also ¶ 0005, “The qubit swap operations may be implemented by the network of quantum gates.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Low’s swap gates with Moro’s circuit in order to enable parallelization resulting in a significantly decreased circuit depth of the computer as suggested by Low.

performing a … [search], wherein the … [search] forecasts projected quantum gate scheduling paths corresponding to the selected action and to respective ones of the unselected plurality of actions;

See Moro, at least Fig. 1 at the top of p. 3, depicting a reinforcement learning process. Moro does not expressly disclose: Monte Carlo Tree Search (MCTS), wherein the MCTS forecasts projected quantum gate scheduling paths …; This is taught by Wen. See ¶ 0004, “MCTS.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Moro’s RL-based action scheduling with Wen’s MCTS in order to utilize an efficient and quick real time search to identify the optimal policy from the one or more possible policies as suggested by Wen (see ¶ 0004).

Moro also discloses: determining loss values corresponding to respective ones of the projected quantum gate scheduling paths; and

Moro, in the description of Fig. 1, e.g. “At each time-step n the agent receives the current observation O_n and based on that information it chooses the next gate to apply on the quantum circuit.
Therefore, the environment returns the real-valued reward r_n to the agent.”

repeating, for an additional plurality of actions, said determining the additional plurality of actions, said selecting an additional action from the plurality of additional actions, said updating the updated state such that respective ones of the quantum gates of the logical quantum circuit are routed, said performing the MCTS, and said determining updated loss values;

See Moro, Fig. 1 on p. 3, depicting the iterative nature of a reinforcement learning architecture.

determining a mapping recommendation based, at least in part, on the determined routing; and providing the mapping recommendation.

See Moro 2nd paragraph on p. 4, “At the end of the learning, the agent discovered an approximating circuit made by 76 gates only, within the target tolerance.”

In regard to claim 11, Moro does not expressly disclose:

11. The method of claim 8, wherein said selecting the action from the plurality of actions is based, at least in part, on a forecasting recommendation of the MCTS, wherein the MCTS additionally forecasts the projected quantum gate scheduling paths prior to said selecting the action from the plurality of actions.

This is taught by Wen. See Wen, Fig. 4 and ¶ 0048, “FIG. 4 depicts the workflow for a MCTS. The MCTS includes iteratively building a search tree until a predefined computational budget, for example, a time, memory or iteration constraint is reached, at which point the search is halted and the best performing root action is returned.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Wen’s MCTS with Moro’s RL-based action scheduling in order to utilize an efficient and quick real time search to identify the optimal policy from the one or more possible policies as suggested by Wen (see ¶ 0004).

In regard to claim 13, Moro also discloses:

13.
The method of claim 8, further comprising: respectively allocating logical qubits of the logical quantum circuit to one or more of the physical qubits of the quantum hardware device; and

See Moro, p. 3, under “Training neural networks for approximating a single-qubit gate,” e.g. “decomposing a single-qubit gate U, into a circuit of unitary transformations that can be implemented directly on quantum hardware.”

determining, via the reinforcement learning model, a predicted number of … gates that are to be scheduled during said determining the routing, via the reinforcement learning model, of the quantum gates, wherein said predicted number of … gates is based, at least in part, on said allocating.

Moro, top of p. 4, “At the end of the learning, the agent discovered an approximating circuit made by 76 gates only, within the target tolerance.”

Moro does not expressly disclose: SWAP gates. This is taught by Low. See Low, ¶ 0005, “The qubit swap operations may be implemented by the network of quantum gates.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Low’s network of gates with Moro’s circuit in order to enable parallelization resulting in a significantly decreased circuit depth of the computer as suggested by Low.

In regard to claim 14, Moro and Low also teach:

14.
The method of claim 13, further comprising: re-allocating, responsive to said determining the predicted number of SWAP gates, at least one of the logical qubits to one or more other physical qubits of the quantum hardware device such that [the predicted number of SWAP gates that are to be scheduled during said determining the routing, via the reinforcement learning model, of the quantum gates is updated,] wherein: the updated predicted number of SWAP gates is smaller than the predicted number of SWAP gates; and said determining the routing, via the reinforcement learning model, of the quantum gates is based, at least in part, on said re-allocating.

See Moro, Fig. 3 on p. 6, e.g. “a, The length distributions of the gates sequences discovered by the agents at the end of the learning. The HRC base generates shorter circuits as expected. b, Performance of the agent during training on the tasks.”

In regard to claim 15, Moro also discloses:

15. The method of claim 8, further comprising: updating one or more rewards of the reinforcement learning model based, at least in part, on one or more of the selected action and additional selected actions.

Moro, p. 2 under “Deep reinforcement learning”: “Therefore, the environment evolves changing its state and returning a reward signal, that can be interpreted as a measure of the adequateness of the action the agent has performed. The only purpose of the agent is to learn a policy to maximize the reward over time.”

In regard to claim 16, Moro also discloses:

16. The method of claim 8, further comprising: generating compiled instructions based, at least in part, on the determined routing, wherein: the generated compiled instructions comprise an ordering of the quantum gates and the respective SWAP gates to be performed to execute the logical quantum circuit using the quantum hardware device; and

Moro, pp.
1-2, e.g.: In the framework of quantum compiling, the environment consists of a quantum circuit that starts as the identity at the beginning of each episode. It is built incrementally at each time-step by the agent, choosing a gate from B according to the policy π encoded in the deep neural network, as shown in Figure 1. Therefore, the available actions that the agent can perform correspond to the gates in the base B.

said determining the mapping recommendation is additionally based, at least in part, on the generated compiled instructions.

See Moro 2nd paragraph on p. 4, “At the end of the learning, the agent discovered an approximating circuit made by 76 gates only, within the target tolerance.”

In regard to claim 17, Moro discloses:

17. A non-transitory, computer-readable, medium storing program instructions that, when executed on or across one or more processors, cause the one or more processors to:

See Moro p. 8 under “Software and Hardware,” which discusses implementation using Python, Stable Baseline, GNU parallel, Intel Xeon W-2195 and Nvidia GV100. This software and hardware combination inherently requires the use of a computer-readable medium storing instructions for execution on a processor. All further limitations of claim 17 have been addressed in the above rejection of claim 8.

In regard to claim 19, parent claim 17 is addressed above. All further limitations of claim 19 have been addressed in the above rejection of claim 15.

In regard to claim 20, Moro also discloses:

20. The non-transitory, computer-readable medium of claim 17, wherein the program instructions further cause the one or more processors to: determine that a first projected quantum gate scheduling path of the projected quantum gate scheduling paths, corresponding to the selected action, comprises a number of SWAP gates that is smaller than another number of SWAP gates, corresponding to another action of the plurality of actions; and

See Moro, Fig. 3 on p.
6, “a, The length distributions of the gates sequences discovered by the agents at the end of the learning. The HRC base generates shorter circuits as expected. b, Performance of the agent during training on the tasks.”

determine rewards of the reinforcement learning model, wherein a larger reward is assigned to the number of SWAP gates that is smaller than the other number of SWAP gates.

See Moro, p. 3, 1st paragraph, e.g. “Both reward functions are negative at each time-step, so that the agent will prefer shorter episodes.”

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Moro in view of Low and Wen as applied above, and further in view of U.S. Patent Application Publication 20210073674 by Zadorojniy et al. (“Zadorojniy”).

In regard to claim 9, Moro also teaches:

9. The method of claim 8, wherein said selecting the action from the plurality of actions is based, at least in part, on an action selection recommendation provided via a policy network of the reinforcement learning model,…

See Moro, Fig. 1 on p. 3, depicting a policy network π(a_n | O_n). Also Moro, p. 2, under “Deep reinforcement learning”: According to a policy function that fully determines its behavior, the former interacts with the latter at discrete time-steps, performing an action based on an observation related to the current state of the environment. Therefore, the environment evolves changing its state and returning a reward signal, that can be interpreted as a measure of the adequateness of the action the agent has performed.

Moro does not expressly disclose: wherein the action selection recommendation comprises an indication of probabilities associated to respective ones of the determined plurality of actions. This is taught by Zadorojniy.
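As a purely illustrative aside (not drawn from the disclosures of the cited references), a policy that maps per-action scores to action probabilities, of the kind recited in this limitation, can be sketched as a softmax; the scores and temperature parameter below are hypothetical:

```python
import math

def softmax_policy(action_scores, temperature=1.0):
    """Map per-action scores to a probability distribution over actions:
    each action receives a probability between 0 and 1, and the
    probabilities sum to 1 (the indication of probabilities per action)."""
    exps = [math.exp(s / temperature) for s in action_scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate actions (e.g., candidate SWAPs).
probs = softmax_policy([2.0, 1.0, 0.5])
```

A higher-scoring action receives a proportionally higher probability, which is the sense in which probabilities are inherent in a policy's action selection.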
See ¶ 0035: “The policy is a map which provides a probability P, commonly between 0 and 1, that action u will be taken when the environment in which the model operates indicates a state s: …” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Zadorojniy’s probabilities with Moro’s policy recommendation since probabilities are an inherent part of a policy as taught by Zadorojniy.

Claims 10, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Moro in view of Low and Wen as applied above, and further in view of Wallman.

In regard to claim 10, Moro does not expressly disclose the limitations. Wallman teaches:

10. The method of claim 8, wherein said selecting the action from the plurality of actions is based, at least in part, on a noise model corresponding to the quantum hardware device.

See Wallman Figure 3C, depicting an initial allocation. Also see ¶ 0072, “Thus, the modified quantum-logic gate sequence 300C shown in FIG. 3C represents the initial quantum-logic gate sequence 300A shown in FIG. 3A transformed by application of the virtual random gates 308 shown in FIG. 3B.” Also ¶ 0067, “FIGS. 3A, 3B, and 3C are schematic diagrams showing an example of noise tailoring applied to a quantum-logic gate sequence.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Wallman’s initial allocation with Moro’s mapping request in order to reduce the effect of noise in a quantum information processor as suggested by Wallman (see ¶ 0061).

In regard to claim 12, Moro does not expressly disclose:

12.
The method of claim 8, further comprising: respectively allocating logical qubits of the logical quantum circuit to one or more of the physical qubits of the quantum hardware device, wherein said determining the routing, via the reinforcement learning model, of the quantum gates is based, at least in part, on said allocating.

This is taught by Wallman. See Wallman Figure 3C, depicting an initial allocation. Also see ¶ 0072, “Thus, the modified quantum-logic gate sequence 300C shown in FIG. 3C represents the initial quantum-logic gate sequence 300A shown in FIG. 3A transformed by application of the virtual random gates 308 shown in FIG. 3B.” It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use Wallman’s initial allocation with Moro’s mapping request in order to reduce the effect of noise in a quantum information processor as suggested by Wallman (see ¶ 0061).

In regard to claim 18, parent claim 17 is addressed above. All further limitations of claim 18 have been addressed in the above rejection of claim 12.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

U.S. Patent Application Publication 20210342516 by Ren et al. See Abstract, “A ML method that uses reinforcement learning (RL), such as deep RL, to determine and optimize routing of circuit connections using a game process is provided.”

“Compiler Optimization for Quantum Computing Using Reinforcement Learning” by Quetschlich et al. See Abstract, “In this work, we take advantage of decades of classical compiler optimization and propose a reinforcement learning framework for developing optimized quantum circuit compilation flows.”

“Using Reinforcement Learning to Perform Qubit Routing in Quantum Compilers” by Pozzi et al.
See Abstract, “In this article, we propose a qubit routing procedure that uses a modified version of the deep Q-learning paradigm.”

“Topological Quantum Compiling with Reinforcement Learning” by Zhang et al. See Abstract, “We introduce an efficient algorithm based on deep reinforcement learning that compiles an arbitrary single-qubit gate into a sequence of elementary gates from a finite universal set.”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to James D. Rutten, whose telephone number is (571) 272-3703. The examiner can normally be reached M-F 9:00-5:30 ET.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B. Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/James D. Rutten/
Primary Examiner, Art Unit 2121

/Li B. Zhen/
Supervisory Patent Examiner, Art Unit 2121