Last updated: April 19, 2026

Application No. 17/356,235

MACHINE LEARNING APPROACH FOR SOLVING BEAM ANGLE OPTIMIZATION

Non-Final OA §102 §103

Filed: Jun 23, 2021
Examiner: ALGHAZZY, SHAMCY
Art Unit: 2128
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Siemens Healthineers International AG
OA Round: 5 (Non-Final)

Interview Optional

+0.7% interview lift. This examiner has a relatively high allow rate; a written response may suffice.

Based on 62 resolved cases, 2023–2026

Examiner Intelligence

Grants 48% of resolved cases

Career Allow Rate

30 granted / 62 resolved

-6.6% vs TC avg

Minimal +1% lift

+0.7%

Interview Lift

resolved cases with interview

Typical timeline

3y 11m

Avg Prosecution

25 currently pending

Career history

Total Applications

across all art units

Statute-Specific Performance

§101: 34.9% (-5.1% vs TC avg)
§103: 39.3% (-0.7% vs TC avg)
§102: 11.1% (-28.9% vs TC avg)
§112: 10.0% (-30.0% vs TC avg)

Based on career data from 62 resolved cases.

Office Action

§102 §103

DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Examiner's Note
The Examiner respectfully requests of the Applicant in preparing responses, to fully consider the entirety of the reference(s) as potentially teaching all or part of the claimed invention.  It is noted, REFERENCES ARE RELEVANT AS PRIOR ART FOR ALL THEY CONTAIN.  “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned.  They are part of the literature of the art, relevant for all they contain.”  In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)).  A reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including non-preferred embodiments (see MPEP 2123).  The Examiner has cited particular locations in the reference(s) as applied to the claim(s) above for the convenience of the Applicant.  Although the specified citations are representative of the teachings of the art and are applied to the specific limitations within the individual claim(s), typically other passages and figures will apply as well.

Response to Applicant’s Arguments
Applicant’s arguments, pages 7-9, filed 11/20/2025, regarding the rejection of claims 1-20 under 35 U.S.C. 103 have been fully considered and are moot in light of the new rejection below.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ogunmolu et al (Non-Patent Literature “Deep BOO! Automating Beam Orientation Optimization in Intensity-Modulated Radiation Therapy”), in view of Fong (US20190197244A1), and further in view of Fox (US20210038320A1).

Regarding Claim 1, Ogunmolu et al teaches a computer-implemented method of beam angle optimization comprising: executing, by at least one processor, a machine learning model that receives an input of data associated with a treatment plan to achieve one or more clinical goals comprising a first beam angle for a patient and outputs a revised beam angle for the patient indicating a direction of radiation into the patient, (Ogunmolu et al, Section 1, pg. 339, “The process of choosing what beam angle is best for delivering beamlet intensities is termed beam orientation optimization (BOO), while the process of determining what intensity meets a prescribed fluence profile by a doctor is termed fluence map optimization (FMO)”. )
… and using a training dataset comprising a training treatment plan and a corresponding beam angle, (Ogunmolu et al, Section 3, pg.349, “We start the training process by randomly adding five beam blocks to the state queue as described in Sect. 2. The input planes are then passed through the tower residual network from which probability distributions and value are predicted”. The examiner interprets Ogunmolu’s five beam blocks to be the claimed treatment plan.)
wherein the machine learning model iteratively calculates a reward representing a performance of a possible beam angle with respect to the one or more clinical goals, using a policy, for the possible beam angle for the training treatment plan in the training dataset, (Ogunmolu et al, Section 2.1, pg. 341, “The learning problem is posed within a discrete finite-time horizon, $T$, while a beam angle combination search task can be defined by a reward function, $r(s_t, a_t)$, which can be found by recovering a policy, $p(a_t \mid s_t; \psi)$, that specifies a distribution over actions conditioned on the state, and parameterized by the weights of a neural network, a tensor $\psi$. Without loss of generality, we denote the action conditional $p(a_t \mid s_t, \psi)$ as $\pi_\psi(a_t \mid s_t)$.”)
wherein the machine learning model iteratively increases a summation of rewards (Ogunmolu et al, Section 2.6, pg. 349, “$Q(s, a)$ can be recovered from the cumulative function, $R(s, a)$ and probability transition function, $P(s, a)$”),
until an advantage value associated with the policy satisfies an accuracy threshold; (Ogunmolu et al, Section 2.6, pg. 349, “the frequency with which different actions for $p_1$ and $p_2$ are chosen will converge to the probability distribution that characterized their random strategies”. The examiner interprets the frequency as taught by Ogunmolu to be the claimed advantage value and the probability distribution as taught by Ogunmolu to be the claimed accuracy threshold)
However, Ogunmolu is not relied upon to explicitly teach wherein the machine learning model is trained via a reinforcement learning paradigm that uses asynchronous agents. Ogunmolu is also not relied upon to explicitly teach transmitting, by the at least one processor, the revised beam angle to a second processor executing a plan optimizer computer model that ingests the revised beam angle and generates a revised treatment plan for the patient.
On the other hand, Fong teaches wherein the machine learning model is trained via a reinforcement learning paradigm that uses asynchronous agents ([0042] Embodiments described herein generally relate to using an Asynchronous Advantage Actor-Critic (A3C) reinforcement learning algorithm. The examiner notes that Ogunmolu and Fong are both directed towards the field of data processing and are seen as reasonably pertinent analogous art. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ogunmolu’s training process to incorporate wherein the machine learning model is trained using asynchronous advantage actor critic reinforcement learning as taught by Fong [0042] to help build a payload based on how a system reacts to other synthetically created payloads [0042]).
Furthermore, Fox teaches transmitting, by the at least one processor, the revised beam angle to a second processor executing a plan optimizer computer model that ingests the revised beam angle and generates a revised treatment plan for the patient ([0037] An ultrasound image-guided needle insertion procedure in accordance with the present invention is outlined in FIG. 10. A clinician begins the procedure by positioning a curved array transducer probe on the body of a subject, manipulating the probe until the target anatomy for the procedure is in the field of view. The target anatomy may be a cyst which is to be biopsied using a needle, for instance. With the target anatomy in view in the ultrasound image, the clinician starts inserting the needle in-line with the plane of the image, as stated in step 102. As the insertion proceeds, the curved array transducer transmits un-steered beams over the field of view to image the field and capture the insertion of the needle as stated in step 104. In step 106 the needle location processor of the ultrasound system identifies the line of specular needle reflections in the image, where the radially directed beams from the curved array are intersecting the needle around the most favorable angle. In step 108 the needle location processor identifies the brightest point along the needle reflection line, which identifies the angle of the transmit beam which produced that bright point as stated in step 110. The needle location processor then causes the transmit controller to control the curved array transducer to transmit parallel steered beams toward the needle at the identified beam angle as stated in step 112. Scanning with the parallel steered beams produces the strongest image of the needle, and a needle guide graphic is displayed with the image in step 114, preferably on either side of the location of the needle in the ultrasound image. 
Clutter reduction may then be performed using one of the dual apodization processing techniques explained above. The examiner notes that Fox teaches a needle location processor that identifies the most favorable needle beam angle, which is transmitted to a transmit controller to control the curved array transducer to transmit parallel steered beams toward the needle at the identified beam angle. The examiner further notes that Ogunmolu and Fox are both directed towards the field of data processing and beam angle guidance and are seen as reasonably pertinent analogous art. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ogunmolu’s beam guidance to incorporate transmitting, by the at least one processor, the revised beam angle to a second processor executing a plan optimizer computer model that ingests the revised beam angle and generates a revised treatment plan for the patient as taught by Fox [0037] to allow the clinician to manipulate the needle to avoid piercing blood vessels or work around hard substances in the body [0027]).
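To make the Section 2.6 passage quoted in the Claim 1 discussion above more concrete, the sketch below shows one textbook way a state-action value Q(s, a) can be computed from a reward table R(s, a) and a transition table P(s, a): repeated Bellman backups. It is a toy illustration under stated assumptions, not Ogunmolu's implementation and not the claimed method. The function name `recover_q`, the discount factor `gamma`, the sweep count, and the nested-list table layout are all hypothetical choices made for this example.

```python
from typing import List

Table2D = List[List[float]]          # indexed [state][action]
Table3D = List[List[List[float]]]    # indexed [state][action][next_state]


def recover_q(R: Table2D, P: Table3D, gamma: float = 0.9, sweeps: int = 200) -> Table2D:
    """Approximate Q(s, a) from rewards R(s, a) and transitions P(s, a) -> s'.

    Hypothetical helper: applies the Bellman optimality backup
        Q(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) * max_a' Q(s', a')
    a fixed number of times, starting from an all-zero table.
    """
    n_states = len(R)
    n_actions = len(R[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        # Value of each state under the current Q estimate.
        V = [max(row) for row in Q]
        # Build a fresh table so every backup in a sweep uses the same V.
        Q = [
            [
                R[s][a] + gamma * sum(P[s][a][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
    return Q


if __name__ == "__main__":
    # Toy problem (assumed data): 2 states, 2 candidate "beam angle" actions.
    R = [[1.0, 0.0], [0.0, 2.0]]
    P = [
        [[1.0, 0.0], [0.0, 1.0]],  # from state 0: action 0 stays, action 1 moves
        [[1.0, 0.0], [0.0, 1.0]],  # from state 1: action 0 moves, action 1 stays
    ]
    for s, row in enumerate(recover_q(R, P)):
        print(f"state {s}: {row}")
```

The tables here are small enough to enumerate; a search over discretized beam angles with a neural-network policy, as the reference describes, would not store Q explicitly in this way.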

Regarding Claim 2, Ogunmolu et al teaches the computer-implemented method of Claim 1. Ogunmolu et al further teaches wherein at least one of the treatment plan or the training treatment plan comprise at least one of a medical image, a planning target volume, an organ at risk, a radiation type, a radiation dose, an initial beam angle, or field geometry. (Ogunmolu et al, Section 2, pg. 340, “Concatenation of the target volume mask and beam angle before feeding the input planes to the residual tower neural network”, wherein the target volume mask is a treatment plan that is a medical image and a planning target volume).

Regarding Claim 3, Ogunmolu et al teaches the computer-implemented method of Claim 2. Ogunmolu et al further teaches further comprising executing the machine learning model that receives the input of data associated with the treatment plan for the patient and outputs a dose distribution, (Ogunmolu et al, Section 2.2, pg. 342, “CT scans and their dose matrices”) wherein the machine learning model is trained using a training dataset comprising the training treatment plan and a corresponding dose distribution, (Ogunmolu et al, Section 3, pg.350, “We start the training process by randomly adding five beam blocks to the state queue as described in Sect. 2. The input planes are then passed through the tower residual network from which probability distributions and value are predicted”).

Regarding claim 4, Ogunmolu et al teaches the computer-implemented method according to claim 1. Ogunmolu et al teaches wherein, executing, by at least one processor, a machine learning model that receives a second input of data associated with a second treatment plan and outputs a second revised beam angle for the patient indicating a direction of radiation into the patient, wherein the machine learning model outputs the second revised beam angle independent of the second treatment plan comprising a beam angle. (Ogunmolu et al, Section 3, pg.350, “We start the training process by randomly adding five beam blocks to the state queue as described in Sect. 2. The input planes are then passed through the tower residual network from which probability distributions and value are predicted. We add a random walk sequence to this pure strategy, generating a mixed strategy, and subsequently construct the tree. This mixed strategy guides search for approximately optimal beam angles.”. The examiner notes that Ogunmolu teaches using a machine learning process that takes as input multiple radiation therapy beam angles and outputs an optimal beam angle. The examiner further interprets the 5 random beam angles as taught by Ogunmolu to be associated with five initial treatment plans.)

Regarding Claim 5, Ogunmolu et al teaches the computer-implemented method of Claim 1 (and thus the rejection of Claim 1 is incorporated). Ogunmolu et al further teaches further comprising presenting, by the at least one processor, for display, the revised beam angle. (Ogunmolu et al, Section 2, pg. 340, Lines 6-12, “The search for an approximately optimal beam angle set is performed by optimizing the parameters of a function approximator $\psi$, (here, a deep neural network, with multiple residual blocks as in) that approximates a policy $\pi$. The policy guides simulations of ‘best-first’ beam angle combinations for a sufficiently large number of iterations – essentially a sparse lookout simulation that selectively adjusts beams that contribute the least to an optimal fluence. Successor nodes beneath a terminal node are approximated with a value, $v(s)$, to assure efficient selectivity. We maintain a probability distribution over possible states, based on a set of observation probabilities for the underlying Markov Decision Process (MDP)”. Table 2, pg. 351, “Doses increases in intensity from blue to green to yellow to red. The intersection of beams delivers heavy doses to tumors (center of the slices) while largely spares surrounding tissues”.)

[Image: media_image1.png (greyscale, 399 x 523)]

Regarding claim 6, Ogunmolu teaches the computer-implemented method according to claim 1. However, Ogunmolu is not relied upon to explicitly teach wherein the machine learning model is trained using asynchronous advantage actor critic reinforcement learning.
On the other hand, Fong teaches wherein the machine learning model is trained using asynchronous advantage actor critic reinforcement learning ([0042] Embodiments described herein generally relate to using an Asynchronous Advantage Actor-Critic (A3C) reinforcement learning algorithm. The examiner notes that Ogunmolu and Fong are both directed towards the field of data processing and are seen as reasonably pertinent analogous art. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Ogunmolu’s training process to incorporate wherein the machine learning model is trained using asynchronous advantage actor critic reinforcement learning as taught by Fong [0042] to help build a payload based on how a system reacts to other synthetically created payloads [0042]). 

Regarding Claim 7, Ogunmolu et al teaches the computer-implemented method (and thus the rejection of Claim 1 is incorporated). Ogunmolu et al further teaches wherein the machine learning model is implemented using hybrid graphics processing unit and central processing unit. (Ogunmolu et al, Section 2.3, pg. 343, “(i) a 3D convolution of $64 \times l$ filters, followed by a square kernel of size 1, and a single stride convolution; (ii) a 3D batch normalization layer; (iii) nonlinear rectifiers; (iv) a 3D convolution of $64 \times l$ filters (v) a 3D batch normalization layer; (vi) a skip connection from the input to the block (vii) nonlinear rectifiers; (viii) a fully connected layer that maps the resulting output to the total number of discretized beam angle grids; and (ix) a SoftMax layer then maps the neuron units to logit probabilities $p_i(s \mid a)$ for all beam angles”, where any unit (software, hardware, or module) that performs processing on images and performs the mathematical functions (e.g., SoftMax) is a graphics processing unit and a central processing unit).
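The final step in the quoted layer list, a SoftMax layer that turns one output unit per discretized beam angle into a probability, can be sketched in a few lines. This is a minimal illustration only: the angle grid, the example logits, and the helper names `softmax` and `most_likely_angle` are assumptions made for this example and do not come from the reference or the application.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Map raw scores to probabilities that sum to 1.

    Subtracting the maximum first keeps math.exp from overflowing
    without changing the result.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def most_likely_angle(logits: Sequence[float], angles_deg: Sequence[int]) -> int:
    """Return the grid angle whose softmax probability is highest."""
    probs = softmax(logits)
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return angles_deg[best_index]


if __name__ == "__main__":
    # Hypothetical grid: one output unit per 60-degree step.
    angles = list(range(0, 360, 60))
    # Hypothetical scores from a fully connected layer.
    logits = [0.2, 1.5, -0.3, 0.0, 2.1, 0.7]
    for angle, p in zip(angles, softmax(logits)):
        print(f"{angle:3d} deg: {p:.3f}")
    print("most likely angle:", most_likely_angle(logits, angles))
```

Nothing in this arithmetic depends on whether it runs on a CPU or a GPU, which may be relevant when weighing the examiner's reading of the hybrid GPU and CPU limitation.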

Regarding Claim 8, Ogunmolu et al teaches the computer-implemented method (and thus the rejection of Claim 1 is incorporated). Ogunmolu et al further teaches wherein the machine learning model is optimized with respect to one or more clinical goals received in the treatment plan, the clinical goals including at least one of a dosimetric quality, a robustness measure, metrics based on linear energy transfer, or relative biological effects. (Ogunmolu et al, Section 1, pg. 339, “this aids the network in finding an approximately optimal beam angle candidate set the meet the doctor’s dosimetric requirement”, wherein the dosimetric quality is the clinical goal of the doctor to measure the dosage of radiation administered).

Regarding Claim 9, Ogunmolu et al teaches the computer-implemented method (and thus the rejection of Claim 1 is incorporated). Ogunmolu et al further teaches receiving, by the at least one processor from the second processor, the revised treatment plan, wherein the revised treatment plan, the treatment plan is based on beam angle; (Ogunmolu et al, Section 1, pg. 339, “When new information is presented to the decisionmaker, the subjective probability distribution gets revised. Decisions about the optimal beam angle combination at the current time step are made under uncertainty; so we use a probability model to choose among lotteries (i.e., probability distribution over all discretized beam angles in the setup)”) executing, by the at least one processor, the machine learning model using the revised treatment plan for the patient and outputting the revised beam angle; and transmitting by the at least one processor, the revised beam angle to the second processor. (Ogunmolu et al, Section 1, pg. 339, “The process of choosing what beam angle is best for delivering beamlet intensities is termed beam orientation optimization (BOO), while the process of determining what intensity meets a prescribed fluence profile by a doctor is termed fluence map optimization (FMO)” which is transmitted to the robot manipulator in order to deliver the radiation).

Regarding Claim 10, Ogunmolu et al teaches the computer-implemented method (and thus the rejection of Claim 1 is incorporated). Ogunmolu et al further teaches wherein iteratively calculating the reward, using the policy, for the possible beam angle from the training treatment plan in the training dataset includes iteratively comparing the reward to a baseline. (Ogunmolu et al, Section 2.1, pg. 341, “The learning problem is posed within a discrete finite-time horizon, $T$, while a beam angle combination search task can be defined by a reward function, $r(s_t, a_t)$, which can be found by recovering a policy, $p(a_t \mid s_t, \psi)$, that specifies a distribution over actions conditioned on the state, and parameterized by the weights of a neural network, a tensor $\psi$”, where finding the max of the total reward necessarily compares the total reward to the other possible total sums, e.g. a baseline).
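For readers weighing the "comparing the reward to a baseline" limitation, the sketch below shows one common reinforcement-learning reading of that phrase: subtracting a baseline from a discounted sum of rewards to get an advantage-style quantity. This is offered as an assumption for illustration, not as the applicant's definition or the examiner's interpretation. The function names, the discount factor, and the sample numbers are hypothetical.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of rewards r_0 + gamma * r_1 + gamma^2 * r_2 + ..., computed back to front."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def advantage_vs_baseline(rewards: Sequence[float], baseline: float, gamma: float = 0.99) -> float:
    """How much better (positive) or worse (negative) the observed return is than the baseline."""
    return discounted_return(rewards, gamma) - baseline


if __name__ == "__main__":
    # Hypothetical per-step rewards for one candidate beam angle sequence.
    rewards = [1.0, 1.0, 1.0]
    # Hypothetical baseline, e.g. a critic's value estimate for the start state.
    baseline = 1.0
    print("return:", discounted_return(rewards, gamma=0.5))
    print("advantage:", advantage_vs_baseline(rewards, baseline, gamma=0.5))
```

Under this reading the baseline is a separate reference value, which differs from the examiner's position that taking a maximum over candidate totals is itself a baseline comparison.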

Claims 11-20 recite a system comprising a server comprising a processor to perform the methods of Claims 1-10. Therefore, claims 11-20 are rejected for reasons set forth in the rejections of Claims 1-10, respectively.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
BARKOUSARAIE - A fast deep learning approach for beam orientation optimization
“BARKOUSARAIE teaches a fast beam orientation selection method, based on deep learning neural networks (DNN), capable of developing a plan comparable to those developed by the state-of-the-art column generation (CG) method”
Nakagawa (US20210031057A1)
“Nakagawa teaches an improved x-ray cone-beam CT image reconstruction by end-to-end training of a multi-layered neural network”
Nguyen (US 2021/0339048 A1)
“Nguyen teaches a method for determining a radiotherapy treatment plan”
Hakala (US 2022/0184421 A1)
“Hakala teaches a method for identifying radiation therapy treatment data for patients”
Hibbard (US20220088410A1)
“Hibbard teaches a method for generating fluence maps for a radiotherapy treatment plan that uses machine learning prediction”
Lachaine (US20110009742A1)
“Lachaine teaches a method for delivering radiation treatment to a patient by positioning the patient such that a radiation beam is delivered to a lesion within the patient along a beam-delivery path”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHAMCY ALGHAZZY whose telephone number is (571)272-8824.  The examiner can normally be reached on M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, OMAR FERNANDEZ RIVAS can be reached on (571) 272-2589.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SHAMCY ALGHAZZY/Examiner, Art Unit 2128                                                                                                                                                                                                        
/OMAR F FERNANDEZ RIVAS/Supervisory Patent Examiner, Art Unit 2128

Prosecution Timeline

Jun 23, 2021: Application Filed
Jun 28, 2024: Non-Final Rejection (§102, §103)
Oct 03, 2024: Response Filed
Nov 25, 2024: Final Rejection (§102, §103)
Jan 23, 2025: Interview Requested
Feb 13, 2025: Response after Non-Final Action
Mar 17, 2025: Request for Continued Examination
Mar 19, 2025: Response after Non-Final Action
May 17, 2025: Non-Final Rejection (§102, §103)
Jul 28, 2025: Applicant Interview (Telephonic)
Jul 28, 2025: Examiner Interview Summary
Jul 30, 2025: Response Filed
Sep 11, 2025: Final Rejection (§102, §103)
Nov 20, 2025: Response after Non-Final Action
Jan 16, 2026: Non-Final Rejection (§102, §103)
Apr 03, 2026: Applicant Interview (Telephonic)
Apr 03, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

17/613,773 (Patent 12596925): SINGLE-STAGE MODEL TRAINING FOR NEURAL ARCHITECTURE SEARCH. 2y 5m to grant; granted Apr 07, 2026.

18/612,881 (Patent 12596922): ACCELERATING NEURAL NETWORKS IN HARDWARE USING INTERCONNECTED CROSSBARS. 2y 5m to grant; granted Apr 07, 2026.

19/236,733 (Patent 12579408): ADAPTIVELY TRAINING OF NEURAL NETWORKS VIA AN INTELLIGENT LEARNING MANAGEMENT SYSTEM. 2y 5m to grant; granted Mar 17, 2026.

17/704,176 (Patent 12572847): SYSTEMS AND METHODS FOR RESOURCE-AWARE MODEL RECALIBRATION. 2y 5m to grant; granted Mar 10, 2026.

16/678,038 (Patent 12566966): TRAINING ADAPTABLE NEURAL NETWORKS BASED ON EVOLVABILITY SEARCH. 2y 5m to grant; granted Mar 03, 2026.

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 48%
With Interview (+0.7%): 49%
Median Time to Grant: 3y 11m
PTA Risk: High

Based on 62 resolved cases by this examiner. Grant probability derived from career allow rate.
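The projection figures can be reproduced from the counts given on this page. The short sketch below recomputes the career allow rate from 30 grants out of 62 resolved cases and then applies the +0.7% interview lift. It assumes the lift is added in percentage points and that the page rounds to the nearest whole percent; neither assumption is stated in the source, and the function names are hypothetical.

```python
def allow_rate_pct(granted: int, resolved: int) -> float:
    """Career allow rate as a percentage of resolved cases."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def with_interview_pct(base_pct: float, lift_points: float) -> float:
    """Apply an interview lift expressed in percentage points (assumed additive)."""
    return base_pct + lift_points


if __name__ == "__main__":
    base = allow_rate_pct(granted=30, resolved=62)
    lifted = with_interview_pct(base, lift_points=0.7)
    print(f"grant probability: {base:.1f}% (rounded: {round(base)}%)")
    print(f"with interview:    {lifted:.1f}% (rounded: {round(lifted)}%)")
```

Under these assumptions the rounded values agree with the 48% and 49% shown above.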
