Prosecution Insights
Last updated: April 19, 2026
Application No. 18/702,052

REINFORCEMENT LEARNING OF INTERFERENCE-AWARE BEAM PATTERN DESIGN

Non-Final OA §103
Filed
Apr 17, 2024
Examiner
BROCKMAN, ANGEL T
Art Unit
2412
Tech Center
2400 — Computer Networks
Assignee
Arizona Board of Regents
OA Round
1 (Non-Final)
82%
Grant Probability
Favorable
1-2
OA Rounds
2y 9m
To Grant
88%
With Interview

Examiner Intelligence

Grants 82% — above average
82%
Career Allow Rate
593 granted / 726 resolved
+23.7% vs TC avg
+6.5%
Interview Lift
Moderate lift; resolved cases with vs. without interview
Typical timeline
2y 9m
Avg Prosecution
18 currently pending
Career history
744
Total Applications
across all art units

Statute-Specific Performance

§101
7.3%
-32.7% vs TC avg
§103
53.5%
+13.5% vs TC avg
§102
22.9%
-17.1% vs TC avg
§112
4.7%
-35.3% vs TC avg
Black line = Tech Center average estimate • Based on career data from 726 resolved cases

Office Action

§103
DETAILED ACTION

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-12 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (Online Beam Learning with Interference Nulling for Millimeter Wave MIMO Systems, 2022, hereinafter Zhang) in view of Fredj et al. (Distributed Beamforming Techniques for Cell-Free Wireless Networks Using Deep Reinforcement Learning, IEEE, 2021, hereinafter Fredj).

Regarding claim 1, Zhang discloses a method for designing an interference-aware beam pattern, the method comprising (abstract): measuring one or more channels for one or more interfering signals from one or more interference directions (section 2, system model, wherein hk represents channels from interfering transmitters; the received signal includes the desired signal, interference from other transmitters, and noise); and communicating over the one or more channels using the one or more interference-aware beams (section 2, wherein the beamforming vector is applied at the antenna array to communicate with the desired receiver). Zhang does not disclose using reinforcement learning to shape one or more interference-aware beams to reduce interference in one or more directions based on the one or more interfering signals.
Fredj discloses using reinforcement learning to shape one or more interference-aware beams to reduce interference in one or more directions based on the one or more interfering signals (abstract, section 3, wherein deep reinforcement learning is used). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the reinforcement learning in order to improve adaptive beamforming under interference conditions.

Regarding claim 2, Zhang discloses wherein the measuring further comprises measuring, by a base station, a power level of a received signal from a target user equipment of a target user and measuring an interference power level of one or more undesired transmitters (section 2, system model, wherein the received signal includes the desired signal power and interference power from other transmitters).

Regarding claim 3, Zhang discloses wherein measuring, by the base station, the power level of the received signal from the target user equipment of the target user further comprises measuring a power of an interference plus a noise level signal when the target user equipment is not transmitting and measuring a power of a signal plus the interference plus the noise level signal of the target user equipment using a same beam produced by the target user equipment (section 2, received signal model includes noise and signal interference).

Regarding claim 4, Zhang discloses wherein the power of the interference plus the noise level signal when the target user equipment is not transmitting is obtained from a zero power reference signal transmitted by the target user equipment (section 2, baseline/noise/interference modeling and reference signal).
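The two-step measurement cited against claims 3-4 amounts to estimating SINR by differencing: measure interference-plus-noise power while the target UE is silent (during a zero-power reference signal), measure signal-plus-interference-plus-noise power with the same beam while it transmits, and subtract. A minimal sketch with synthetic samples (function name, sample counts, and power levels are hypothetical, not taken from the application or the cited art):

```python
import numpy as np

def estimate_sinr(i_plus_n_samples, s_plus_i_plus_n_samples):
    """Estimate linear SINR from two power measurements taken with the
    same beam: one during a zero-power reference signal (UE silent) and
    one while the UE transmits."""
    p_in = np.mean(np.abs(i_plus_n_samples) ** 2)           # interference + noise
    p_sin = np.mean(np.abs(s_plus_i_plus_n_samples) ** 2)   # signal + interference + noise
    p_signal = max(p_sin - p_in, 0.0)                       # recovered signal power
    return p_signal / p_in

# Synthetic illustration: constant-modulus signal over weaker interference + noise
rng = np.random.default_rng(0)
i_n = 0.2 * (rng.normal(size=1000) + 1j * rng.normal(size=1000))
sig = 0.8 * np.exp(1j * rng.uniform(0, 2 * np.pi, 1000))
sinr = estimate_sinr(i_n, sig + i_n)
print(f"estimated SINR: {10 * np.log10(sinr):.1f} dB")
```

The subtraction is only meaningful when both measurements use the same combining beam, which is exactly the "same beam" limitation recited in claim 3.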
Regarding claim 5, Zhang discloses all subject matter of the claimed invention with the exception of wherein the reinforcement learning comprises an actor-critic-based deep reinforcement learning architecture. Fredj discloses wherein the reinforcement learning comprises an actor-critic-based deep reinforcement learning architecture (section 3, wherein the DRL algorithms, including DDPG and D4PG, are actor-critic DRL methods). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the actor-critic DRL framework in order to improve convergence and policy optimization efficiency in beamforming design.

Regarding claim 6, Zhang discloses all subject matter of the claimed invention with the exception of the actor-critic-based deep reinforcement learning architecture comprises a fully connected (FC) feed-forward neural network. Fredj discloses the actor-critic-based deep reinforcement learning architecture comprises a fully connected (FC) feed-forward neural network (section 3, wherein the DDPG and D4PG are neural network implementations). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the actor-critic DRL framework in order to improve convergence and policy optimization efficiency in beamforming design.

Regarding claim 7, Zhang discloses a beam pattern design system, comprising: a measurement module configured to measure interference on a channel (section 2, channel/interference measurement model); and a beamforming control module configured to apply the beam pattern to communicate with a user device (section 2, beamforming vector applied at antenna array). Zhang does not disclose a learning module configured to use reinforcement learning to learn a beam pattern which reduces interference on the channel.
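As context for the actor-critic limitation in claims 5-6: an actor-critic architecture pairs an actor network mapping a state to an action with a critic network scoring the (state, action) pair, and in the FC feed-forward variant both are plain stacks of fully connected layers. A generic forward-pass sketch with made-up layer sizes, illustrative only and not the architecture of the application or of Fredj:

```python
import numpy as np

rng = np.random.default_rng(1)

def init_fc(sizes):
    """Random weights for a fully connected (FC) feed-forward network."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def fc_forward(x, layers):
    """Forward pass with tanh activations on every layer."""
    for w, b in layers:
        x = np.tanh(x @ w + b)
    return x

state_dim, action_dim = 8, 4                       # hypothetical dimensions
actor = init_fc([state_dim, 32, action_dim])       # state -> action (beam adjustment)
critic = init_fc([state_dim + action_dim, 32, 1])  # (state, action) -> scalar value

state = rng.normal(size=state_dim)                 # e.g., measured interference powers
action = fc_forward(state, actor)
value = fc_forward(np.concatenate([state, action]), critic)
```

Training (replay buffers, target networks, policy-gradient updates as in DDPG/D4PG) is omitted; only the two-network FC structure the claims recite is shown.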
Fredj discloses a learning module configured to use reinforcement learning to learn a beam pattern which reduces interference on the channel (section 3, DRL-based beamforming optimization). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the reinforcement learning in order to improve adaptive beamforming under interference conditions.

Regarding claim 8, Zhang discloses wherein the measurement module is configured to measure, by a base station, a power level of a received signal from a target user equipment of a target user and measuring an interference power level of one or more undesired transmitters (section 2, system model, wherein the received signal includes the desired signal power and interference power from other transmitters).

Regarding claim 9, Zhang discloses wherein the base station measures the power level of the received signal from the target user equipment of the target user by measuring a power of an interference plus a noise level signal when the target user equipment is not transmitting and measuring a power of a signal plus the interference plus the noise level signal of the target user equipment using a same beam produced by the target user equipment (section 2, received signal model includes noise and signal interference).

Regarding claim 10, Zhang discloses wherein the power of the interference plus the noise level signal when the target user equipment is not transmitting is obtained from a zero power reference signal transmitted by the target user equipment (section 2, baseline/noise/interference modeling and reference signal).

Regarding claim 11, Zhang discloses all subject matter of the claimed invention with the exception of wherein the reinforcement learning comprises an actor-critic-based deep reinforcement learning architecture.
Fredj discloses wherein the reinforcement learning comprises an actor-critic-based deep reinforcement learning architecture (section 3, actor-critic DRL framework for beamforming optimization). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the actor-critic DRL framework in order to improve convergence and policy optimization efficiency in beamforming design.

Regarding claim 12, Zhang discloses all subject matter of the claimed invention with the exception of the actor-critic-based deep reinforcement learning architecture comprises a fully connected (FC) feed-forward neural network. Fredj discloses the actor-critic-based deep reinforcement learning architecture comprises a fully connected (FC) feed-forward neural network (section 3, neural network architecture). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the actor-critic DRL framework in order to improve convergence and policy optimization efficiency in beamforming design.

Claims 13-19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang in view of Chiang (US 2018/0375206 A1, hereinafter Chiang).

Regarding claim 13, Zhang discloses a radio frequency (RF) device comprising (abstract, section 2), and use of reinforcement learning to design a beam pattern or beam codebook that reduces the self-interference and optimizes a performance parameter of the RF device (section 2). Zhang does not disclose an RF transmitter; an RF receiver co-located with the RF transmitter; and control circuitry configured to: measure self-interference between the RF transmitter and the RF receiver. Chiang discloses an RF transmitter; an RF receiver co-located with the RF transmitter; and control circuitry configured to: measure self-interference between the RF transmitter and the RF receiver (¶[0031]-¶[0038], figure 7).
Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the RF device architecture into the reinforcement-learning based beam pattern and codebook design of Zhang to provide an RF device capable of sensing interference conditions and adaptively controlling RF transmission performance.

Regarding claim 14, Zhang discloses wherein the performance parameter comprises a power for a desired user (section 2, system model, wherein the received signal includes the desired signal power and interference power from other transmitters).

Regarding claim 15, Zhang discloses wherein the measure further comprises measuring, by a base station, a power level of a received signal from a target user equipment of a target user and measuring an interference power level of one or more undesired transmitters (section 2, system model, wherein the received signal includes the desired signal power and interference power from other transmitters).

Regarding claim 16, Zhang discloses wherein measuring, by the base station, the power level of the received signal from the target user equipment of the target user further comprises measuring a power of an interference plus a noise level signal when the target user equipment is not transmitting and measuring a power of a signal plus the interference plus the noise level signal of the target user equipment using a same beam produced by the target user equipment (section 2, system model, wherein the received signal includes the desired signal power and interference power from other transmitters).

Regarding claim 17, Zhang discloses wherein the power of the interference plus the noise level signal when the target user equipment is not transmitting is obtained from a zero power reference signal transmitted by the target user equipment (section 2, baseline/noise/interference modeling and reference signal).
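The design objective running through these claims, keeping gain toward the desired channel while suppressing power radiated toward interference directions, can be illustrated with a narrow-band sketch in which a closed-form nulling projection stands in for the learned beam (antenna count, interferer count, and channels are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n_ant, n_interf = 16, 3    # receive antennas, interfering transmitters (hypothetical)

h_desired = rng.normal(size=n_ant) + 1j * rng.normal(size=n_ant)
h_interf = rng.normal(size=(n_interf, n_ant)) + 1j * rng.normal(size=(n_interf, n_ant))

# Received signal: desired + interference + noise (the measurement setting of claim 1)
x_i = rng.normal(size=n_interf)
noise = 0.05 * (rng.normal(size=n_ant) + 1j * rng.normal(size=n_ant))
y = h_desired * 1.0 + h_interf.T @ x_i + noise

# Baseline beam: matched to the desired channel, ignoring interference
w_matched = h_desired / np.linalg.norm(h_desired)

# Interference-aware beam: project the desired channel onto the orthogonal
# complement of the interference subspace (span of the interferer channels)
Q, _ = np.linalg.qr(h_interf.T)
w_null = h_desired - Q @ (Q.conj().T @ h_desired)
w_null /= np.linalg.norm(w_null)

def sir(w):
    """Signal-to-interference ratio achieved by combining beam w."""
    sig = np.abs(w.conj() @ h_desired) ** 2
    interf = np.sum(np.abs(w.conj() @ h_interf.T) ** 2)
    return sig / interf

print(f"SIR matched: {sir(w_matched):.2f}, SIR nulling: {sir(w_null):.3g}")
```

The application's approach, per the claims, learns such a pattern via reinforcement learning rather than computing it in closed form; the projection here only shows what an interference-reducing beam accomplishes.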
Regarding claim 18, Zhang discloses all subject matter of the claimed invention with the exception of wherein the reinforcement learning comprises an actor-critic-based deep reinforcement learning architecture. Luong discloses wherein the reinforcement learning comprises an actor-critic-based deep reinforcement learning architecture (section 3, actor-critic DRL framework for beamforming optimization). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the actor-critic DRL framework in order to improve convergence and policy optimization efficiency in beamforming design.

Regarding claim 19, Zhang discloses all subject matter of the claimed invention with the exception of the actor-critic-based deep reinforcement learning architecture comprises a fully connected (FC) feed-forward neural network. Luong discloses the actor-critic-based deep reinforcement learning architecture comprises a fully connected (FC) feed-forward neural network (section 3, neural network architecture). Thus, it would have been obvious to one of ordinary skill in the art at the time of invention to make the proposed modification of the actor-critic DRL framework in order to improve convergence and policy optimization efficiency in beamforming design.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANGEL T BROCKMAN whose telephone number is (571)270-5664. The examiner can normally be reached Monday-Thursday 6:00 AM-4:30 PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Charles Jiang, can be reached at 571-270-7191.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANGEL T BROCKMAN/
Examiner, Art Unit 2412

Prosecution Timeline

Apr 17, 2024
Application Filed
Apr 01, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12593349
COMMUNICATION APPARATUS AND COMMUNICATION METHOD FOR PRIORITIZED TRAFFIC
2y 5m to grant Granted Mar 31, 2026
Patent 12574175
Data Transmission Method, Vehicle-Side Device, and Network Side Device
2y 5m to grant Granted Mar 10, 2026
Patent 12574918
FRAME EXCHANGE SEQUENCE AND NETWORK ALLOCATION VECTOR (NAV) PROTECTION
2y 5m to grant Granted Mar 10, 2026
Patent 12574949
SIDELINK SIGNAL POSITIONING COORDINATION BASED ON USER DEVICE CAPABILITY
2y 5m to grant Granted Mar 10, 2026
Patent 12556436
INSTANTANEOUS AMPLITUDE GAIN SIDE INFORMATION FOR A MULTIPLEXED SIGNAL
2y 5m to grant Granted Feb 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
82%
Grant Probability
88%
With Interview (+6.5%)
2y 9m
Median Time to Grant
Low
PTA Risk
Based on 726 resolved cases by this examiner. Grant probability derived from career allow rate.
