DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Preliminary Amendment
The Preliminary Amendment filed on 07/26/2023 is entered.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 07/26/2023 has been placed in the record and considered by the examiner.
Claim Objections
Claims 2-9 and 13-20 are objected to because of the following informalities:
In each of claims 2-9, line 1: replace “The method according to claim 1” with --The method according to claim 1,--.
In each of claims 13-20, line 1: replace “The network node according to claim 12” with --The network node according to claim 12,--.
Appropriate action required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 10-11 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
Claim 10 recites “A computer program comprising instructions, which when executed by a processor, causes the processor to perform actions according to claim 1.”
The claim could be interpreted as directed to a signal per se, which does not contain at least one structural limitation, has no physical or tangible form, and thus does not fall within any statutory category. See MPEP 2106.03.
The claim can be amended to recite “A non-transitory computer program …” to overcome the rejection.
Claim 11 recites “A carrier comprising the computer program of claim 10, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.”
The claim could be interpreted as directed to a signal per se, which does not contain at least one structural limitation, has no physical or tangible form, and thus does not fall within any statutory category. See MPEP 2106.03.
The claim can be amended to recite “A non-transitory computer program …” or “A non-transitory computer-readable storage medium” to overcome the rejection.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) or 102(a)(2) as being anticipated by Géza Szabó et al. (“Information Gain Regulation In Reinforcement Learning With The Digital Twins’ Level of Realism,” 2020 IEEE 31st Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), 31 August 2020, pages 1-7, XP033837540, DOI: 10.1109/PIMRC48278.2020.9217201; hereinafter “GEZASZABO”; provided in IDS).
Examiner’s note: in what follows, citations refer to GEZASZABO unless otherwise noted.
With respect to the independent claims:
Regarding claim 12, GEZASZABO teaches a network node (the 5G Radio in Fig. 1) comprising a processor and a memory (the 5G radio device must have a processor and memory), wherein said memory comprises instructions executable by said processor whereby said network node (the 5G Radio in Fig. 1) is configured to apply machine learning in a wireless communication network, for training a communication policy controlling radio resources for communication of messages between the network node and a control node (the ARIAC solution) operating a remotely controlled device (the robot arm as the remotely controlled device) (the aforesaid 5G radio has “an architecture in which the radio access control happens automatically to minimize the utilized radio resources while still maximizing the production KPIs of the robot cell”, “To achieve this, we apply Reinforcement Learning (RL) in a simulated environment to explore the environment fast, while the Digital Twin (DT) ensures that the learned policy can be applied on the real world environment as well. We show that the application of Ultra Reliable Low Latency Communication (URLLC) connection can be reduced to approx. 30% of the total radio time while achieving real-world accurate robot control”: Abstract; “The robot arm is connected to the robot controller with an evaluated network setup. The edge cloud consists of the robotic controller, a physical simulator that realizes the robotic scenario, a factory cell scheduler that processes the order, and the robot control that is connected to the real robot arm with the access controller. The access controller can switch between the QoC phases. The performance of the robot cell deployment is evaluated with productivity KPIs.”: Section II, page 2),
[media_image1.png and media_image2.png: greyscale figures reproduced from GEZASZABO]
the network node (the 5G Radio in Fig. 1) further being configured to:
obtain said messages during one or more communication phases communicated when an initial first communication policy is applied for controlling a Quality of Service, QoS, mode in said communication, wherein the QoS mode is adapted to set to one of at least two predefined QoS modes having different levels of QoS for each of said one or more communication phases (the aforesaid 5G Radio with an automatic QoS process, “The manual QoC-tagging is replaced with an automatic process performed in the 5G radio. First, the uplink packets are processed by a Deep Packet Inspection (DPI) module, which ensures that the automatic QoC setup module is aware of the current status of the robot. The status messages are used to feed the ML algorithms whose output sets up the packet scheduler in the 5G radio.”: Section III; “All the 15 ARIAC competition scenarios are considered during the training. At every policy evaluation episode one scenario is randomly selected. The maximum score for this scenario is known apriori and provided to the evaluation environment (max score). At every second the policy is queried based on the current observations, and an action is selected from the discrete action space whether to switch to high or low QoC mode”: Section III),
train a machine learning model based on said messages and the first communication policy (“All the 15 ARIAC competition scenarios are considered during the training”: Section III; “we suggest an implementation example in one of the state-of-the-art Reinforcement Learning (RL) trainers. We choose Proximal Policy Optimization (PPO) [21] and its implementation in Ray [22]. PPO performs comparably or better than state-of-the-art approaches while being much simpler to implement and tune. Its simple implementation enables us to focus on the improvements. Also PPO uses simple synchronous sampling. It means that the trajectories are not buffered for later replay anywhere in the system; the PPO learner uses them once”: Section IV, C),
produce a second communication policy based on the machine learning model, wherein the second communication policy comprises at least one adjusted QoS mode for at least one of the one or more communication phases (“The QoC switching action is realized as in [4] by selecting a low Modulation and Coding Scheme (MCS) (QPSK and ½ rate coding) for the high QoC phase and a high MCS (64-QAM and 2/3 rate coding) for the low QoC phase. … It is a network delay in the end that the robot and the robot controller experience. There can be various options to implement the low and high QoC phases on radio. This one is given as an example. There can be setups in which the robustness of packet delivery is not affected. The overall goal of the two phases is to relax the radio requirements and utilize the network in a use case optimized way.”: Section IV, B),
determine a performance score for the second communication policy in the one or more communication phases based on the radio resources used when communicating using the second communication policy and further based on a reduced operation precision when said one or more communication phases are communicated using the adjusted QoS mode (modifying the PPO policy in Listing 1 with “default_policy”: PPOTrainer_dt = PPOTrainer.with_updates(name="PPOTrainer_dt", default_policy=PPOTFPolicy_dt, make_workers=make_workers_with_dt): Section V; specific scores are provided in Section V),
when the determined performance score indicates a performance exceeding a predetermined performance, apply the second communication policy to said communication between the network node and the control node (“We checked the ARIAC scores and ARIAC Total Processing Times (TPT) [14] and the low QoC ratios for the three cases. Figure 6 shows the results. First it is important to note that we do not teach robot control but influence an existing well-performing robot control over the radio quality switching policy. After successful learning, the ARIAC scores remain maximal in all cases. This is due to the fact that the policy could be learnt and the reward function worked well. Based on the experienced difference between the low QoC ratios in various setups we can deduce the expected situation that the real robot with real network is a noisier environment in total compared to the fully simulated setup. The fully simulated setup (Setup A) can achieve a 90% low QoC ratio, while the fully DT case (Setup C) can achieve about 71% without compromising on the perfect ARIAC scores. The policy with 90% low QoC ratio trained fully in simulation can achieve 39% low QoC ratio in the DT setup (Setup B). The difference in the noise of the observation space is most visible in this case. Also the cost of time spent in the low QoC phase can be observed in the increased TPT. The robot controller had to compensate at certain points more and had to wait for a high QoC phase to do the accurate positioning. Note that this is the default built-in behavior of the robot controller. This is also an expected behavior, in that the accuracy loss of a low quality network can be compensated by reducing the speed of the robot [26].”: Section V).
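For clarity, the “low QoC ratio” metric quoted above (the fraction of radio time spent in the low QoC mode, e.g. the 90%, 71%, and 39% figures) can be sketched as follows; this is an illustrative reconstruction, not code from GEZASZABO, and the function name is hypothetical:

```python
# Illustrative sketch (not from GEZASZABO): the "low QoC ratio" metric,
# i.e. the fraction of one-second radio-time samples spent in low QoC mode.

def low_qoc_ratio(modes: list[str]) -> float:
    """Fraction of one-second samples in which the low QoC mode was active."""
    if not modes:
        raise ValueError("empty mode trace")
    return modes.count("low") / len(modes)

# Example trace: 9 of 10 seconds in low QoC -> ratio 0.9 (cf. Setup A's ~90%)
print(low_qoc_ratio(["low"] * 9 + ["high"]))  # 0.9
```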
Regarding claim 1, GEZASZABO teaches a method in a network node for applying machine learning in a wireless communication network for training a communication policy controlling radio resources for communication of messages between the network node and a control node operating a remotely controlled device, the method comprising:
obtaining said messages during one or more communication phases communicated when an initial first communication policy is applied for controlling a Quality of Service, QoS, mode in said communication, wherein the QoS mode is set to one of at least two predefined QoS modes having different levels of QoS for each of said one or more communication phases,
training a machine learning model based on said messages and the first communication policy,
producing a second communication policy based on the machine learning model, wherein the second communication policy comprises at least one adjusted QoS mode for at least one of the one or more communication phases,
determining a performance score for the second communication policy in the one or more communication phases based on the radio resources used when communicating using the second communication policy and further based on a reduced operation precision when said one or more communication phases are communicated using the adjusted QoS mode,
when the determined performance score indicates a performance exceeding a predetermined performance, applying the second communication policy to said communication between the network node and the control node (Regarding claim 1, the claim is interpreted and rejected for the same reason as set forth in claim 12).
Regarding claim 10, GEZASZABO teaches a computer program comprising instructions, which when executed by a processor, causes the processor to perform actions according to claim 1 (the claim is interpreted and rejected for the same reasons as set forth in claim 1).
Regarding claim 11, GEZASZABO teaches a carrier comprising the computer program of claim 10, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium (the claim is interpreted and rejected for the same reasons as set forth in claim 1).
Regarding claim 2, GEZASZABO teaches the method according to claim 1, wherein said messages comprise a status indication received from the control node and control operations sent to the control node for controlling the remotely controlled device, and wherein applying the second communication policy to said communication between the network node and the control node comprises sending the control operations to the control node and receiving the status indication from the control node using the second communication policy (“the uplink packets are processed by a Deep Packet Inspection (DPI) module, which ensures that the automatic QoC setup module is aware of the current status of the robot. The status messages are used to feed the ML algorithms whose output sets up the packet scheduler in the 5G radio”: Section III).
Regarding claim 3, GEZASZABO teaches the method according to claim 1, wherein determining a performance score for the second communication policy further comprises computing the performance score for the second communication policy based on an intermediate reward for selecting a high level or low level QoS mode for the at least one adjusted QoS mode and further based on an end reward for a change in operation precision caused by said selection (“the uplink packets are processed by a Deep Packet Inspection (DPI) module, which ensures that the automatic QoC setup module is aware of the current status of the robot. The status messages are used to feed the ML algorithms whose output sets up the packet scheduler in the 5G radio.”: Section III; modifying the PPO policy in Listing 1 with “default_policy”: PPOTrainer_dt = PPOTrainer.with_updates(name="PPOTrainer_dt", default_policy=PPOTFPolicy_dt, make_workers=make_workers_with_dt): Section V; specific scores are provided in Section V).
Regarding claim 4, GEZASZABO teaches the method according to claim 1, wherein determining a performance score for the second communication policy comprises any of simulating or measuring the communication performed between the network node and the control node using the second communication policy (modifying the PPO policy in Listing 1 with “default_policy”: PPOTrainer_dt = PPOTrainer.with_updates(name="PPOTrainer_dt", default_policy=PPOTFPolicy_dt, make_workers=make_workers_with_dt): Section V; specific scores are provided in Section V).
Regarding claim 5, GEZASZABO teaches the method according to claim 1, wherein training the machine learning model is further based on a first performance score of the first communication policy (“All the 15 ARIAC competition scenarios are considered during the training”: Section III; “we suggest an implementation example in one of the state-of-the-art Reinforcement Learning (RL) trainers. We choose Proximal Policy Optimization (PPO) [21] and its implementation in Ray [22]. PPO performs comparably or better than state-of-the-art approaches while being much simpler to implement and tune. Its simple implementation enables us to focus on the improvements. Also PPO uses simple synchronous sampling. It means that the trajectories are not buffered for later replay anywhere in the system; the PPO learner uses them once”: Section IV, C).
Regarding claim 6, GEZASZABO teaches the method according to claim 5, wherein the machine learning model is further trained based on a third communication policy, second messages communicated between the network node and the control node using the third communication policy, and a third performance score associated with the third communication policy (“The QoC switching action is realized as in [4] by selecting a low Modulation and Coding Scheme (MCS) (QPSK and ½ rate coding) for the high QoC phase and a high MCS (64-QAM and 2/3 rate coding) for the low QoC phase. … There can be various options to implement the low and high QoC phases on radio. This one is given as an example. There can be setups in which the robustness of packet delivery is not affected. The overall goal of the two phases is to relax the radio requirements and utilize the network in a use case optimized way.”: Section IV, B).
Regarding claim 7, GEZASZABO teaches the method according to claim 1, wherein the at least one adjusted QoS mode is changed from a high level QoS to a low level QoS (“The manual QoC-tagging is replaced with an automatic process performed in the 5G radio. First, the uplink packets are processed by a Deep Packet Inspection (DPI) module, which ensures that the automatic QoC setup module is aware of the current status of the robot. The status messages are used to feed the ML algorithms whose output sets up the packet scheduler in the 5G radio.”: Section III; “The scoring of the ARIAC environment is used together with a reward for the agent after every second to encourage the usage of the low QoC channel. Using the low QoC channel provides 10 points to the agent, while using the high QoC channel is penalized with 1 point after every second.”: Section III).
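The per-second reward scheme quoted above (10 points per second in the low QoC mode, 1-point penalty per second in the high QoC mode) can be sketched as follows; this is an illustrative reconstruction, not code from GEZASZABO, and the function names are hypothetical:

```python
# Illustrative sketch (not from GEZASZABO): the quoted per-second reward
# scheme, i.e. +10 points per second in low QoC and -1 point per second
# in high QoC, which encourages use of the low QoC channel.

def qoc_reward(mode: str) -> int:
    """Per-second reward for the selected QoC mode."""
    if mode == "low":
        return 10   # encourage use of the low QoC channel
    if mode == "high":
        return -1   # penalize use of the high QoC channel
    raise ValueError(f"unknown QoC mode: {mode}")

def episode_reward(modes: list[str]) -> int:
    """Total reward over an episode sampled once per second."""
    return sum(qoc_reward(m) for m in modes)

# Example: 3 seconds in low QoC, 2 seconds in high QoC -> 3*10 - 2*1 = 28
print(episode_reward(["low", "low", "low", "high", "high"]))  # 28
```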
Regarding claim 8, GEZASZABO teaches the method according to claim 1, wherein a high level QoS mode comprises the network node demanding Ultra-Reliable Low-Latency Communication, URLLC, for communicating with the control node (“We showed that the application of URLLC connection can be reduced to approx. 30% of the total radio time while achieving real world accurate robot control.”: Section VIII).
Regarding claim 9, GEZASZABO teaches the method according to claim 1, wherein applying the second communication policy requires the determined performance score to indicate a performance exceeding a predefined performance by a predefined threshold (“The fully simulated setup (Setup A) can achieve a 90% low QoC ratio, while the fully DT case (Setup C) can achieve about 71% without compromising on the perfect ARIAC scores. The policy with 90% low QoC ratio trained fully in simulation can achieve 39% low QoC ratio in the DT setup (Setup B). The difference in the noise of the observation space is most visible in this case.”: Section V, B).
Regarding claims 13-20, the claims are interpreted and rejected for the same reasons as set forth in claims 2-9.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to M MOSTAZIR RAHMAN whose telephone number is (571)272-4785. The examiner can normally be reached 8:30am-5:00pm PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Derrick Ferris can be reached at 571-272-3123. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/M Mostazir Rahman/Examiner, Art Unit 2411
/DERRICK W FERRIS/Supervisory Patent Examiner, Art Unit 2411