Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 9/18/2025 has been entered.
Remarks
This Office Action is responsive to Applicants' Amendment filed on September 18, 2025, in which claims 1, 6, 16, and 18 are currently amended. Claims 1-10 and 16-18 are currently pending.
Response to Arguments
The previous rejections of claims 1-10 and 16-18 under 35 U.S.C. § 112(a)/(b) in view of the claim limitation “without applying weighting factors to individual loss components” are hereby withdrawn, as necessitated by applicant's amendments and remarks.
The previous rejections of claims 1-10 and 16-18 under 35 U.S.C. § 112(b) in view of the claim limitation “obtains a faster convergence result” are hereby withdrawn, as necessitated by applicant's amendments and remarks.
Applicant’s arguments with respect to the rejection of claims 1-10 and 16-18 under 35 U.S.C. 103 have been fully considered and are persuasive in view of the amendment; however, they are moot in view of the new ground of rejection set forth below.
Claim Objections
Claims 1, 6, and 18 are objected to because of the following informalities: "Converges to a faster result than" should read "converges to a result faster than". Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-10 and 16-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claims 1, 6, and 16, "to maximize a sum secrecy rate that each of one or more respective legitimate users achieves a secrecy rate not less than a respective minimum secrecy rate" is grammatically unclear. It is unclear whether the claim limitation should be read as "to maximize a sum secrecy rate that each of one or more respective legitimate users achieves. A secrecy rate not less than a respective minimum secrecy rate" (wherein the secrecy rate is interpreted as the sum secrecy rate), as "to maximize a sum secrecy rate: each of one or more respective legitimate users achieves a secrecy rate not less than a respective minimum secrecy rate", or as something else altogether. Because these interpretations are mutually contradictory, the scope of the claim cannot be reasonably determined. In the interest of further examination, the claim is interpreted as "to maximize a sum secrecy rate that each of one or more respective legitimate users achieves. A secrecy rate not less than a respective minimum secrecy rate" (wherein the secrecy rate is interpreted as the sum secrecy rate).
Regarding claims 1, 6, and 16, "to utilize channel characteristics, such as" is indefinite. "Such as" represents a non-exhaustive list such that the scope of the claims cannot be reasonably determined (see MPEP 2173.05(d)(B)).
The remaining claims are rejected with respect to their dependence on the rejected claims.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 2, 6, 7, 16, and 18 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Yang (“Deep Reinforcement Learning-Based Intelligent Reflecting Surface for Secure Wireless Communications”, 2020), Huang (“Unsupervised Learning-Based Fast Beamforming Design for Downlink MIMO”, 2018), and Goecks (“Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Dense and Sparse Reward Environments”, 2020).
Regarding claim 1, Yang teaches A learning method for a secure precoder, the learning method comprising:([p. 376] "A novel DRL-based secure beamforming approach is firstly proposed to jointly optimize the beamforming matrix at the BS and the reflecting beamforming matrix (reflection phases) at the IRS in dynamic environments")
to maximize a sum secrecy rate that each of one or more respective legitimate users achieves a secrecy rate not less than a respective minimum secrecy rate when one or more eavesdropper eavesdrops([p. 378] "Since each eavesdropper can eavesdrop any of the K MUs’ signal, according to [14]–[25], the achievable individual secrecy rate from the BS to the k-th MU can be expressed by [Eqn. 10]" See line 1 of Eqn. 11 which maximizes the sum of legitimate user secrecy rates and constraint 11a where R_k^sec is constrained to be greater than minimum secrecy rate R_k^sec,min)
each having a single antenna; and([p. 376] "Initial studies on IRS-aided secure communication systems have reported in [14]–[17], where a simple system model with only a single-antenna legitimate user and a single-antenna eavesdropper was considered in these works" [p. 377] "We consider an IRS-aided secure communication system, as shown in Fig. 1, where the BS is equipped with N antennas to serve K single-antenna legitimate mobile users (MUs) in the presence of M single-antenna eavesdroppers")
within a channel between a base station and the one or more legitimate users, ([p. 377] "Let K = {1, 2,...,K}, M = {1, 2,...,M} and L = {1, 2,...,L} denote the MU set, the eavesdropper set and the IRS reflecting element set, respectively. Let Hbr ∈ CL×N , hH bu,k ∈ C1×N , hH ru,k ∈ C1×L, hH be,m ∈ C1×N , and hH re,m ∈ C1×L denote the channel coefficients from the BS to the IRS, from the BS to the k-th MU, from the IRS to the k-th MU, from the BS to the m-th eavesdropper, and from the IRS to the m-th eavesdropper, respectively. All the above mentioned channel coefficients in the system are assumed to be small-scale fading with path loss")
when a maximum transit power is allocated to the [NOMA] system([p. 378] "The constraint in (11c) is set to satisfy the BS’s maximum power constraint")
without reliance on location of the one or more legitimate users or the one or more eavesdroppers([p. 380] "In detail, the agent utilizes the observed state (i.e, CSI, previous secrecy rate, QoS satisfaction level), the feedback reward from environment as well as the historical experience from the replay buffer to train its learning model. After that, the agent employs the trained model to make decision (beamforming matrices V and Ψ) based on its learned policy. The procedures of the proposed learning based secure beamforming are provided in the following subsections" Yang is not reliant on user or eavesdropper location but rather channel state information (CSI).)
wherein the secure precoder is trained to utilize channel characteristics, such as path loss and fading, to dynamically adjust transmission strategies between the one or more legitimate users and the one or more eavesdroppers to improve the security of the one or more legitimate users,([p. 377] "Let K = {1, 2,...,K}, M = {1, 2,...,M} and L = {1, 2,...,L} denote the MU set, the eavesdropper set and the IRS reflecting element set, respectively. Let Hbr ∈ CL×N , hH bu,k ∈ C1×N , hH ru,k ∈ C1×L, hH be,m ∈ C1×N , and hH re,m ∈ C1×L denote the channel coefficients from the BS to the IRS, from the BS to the k-th MU, from the IRS to the k-th MU, from the BS to the m-th eavesdropper, and from the IRS to the m-th eavesdropper, respectively. All the above mentioned channel coefficients in the system are assumed to be small-scale fading with path loss" [p. 376] "A novel DRL-based secure beamforming approach is firstly proposed to jointly optimize the beamforming matrix at the BS and the reflecting beamforming matrix (reflection phases) at the IRS in dynamic environments [...] we formulate a joint BS’s transmit beamforming and IRS’s reflect beamforming optimization problem with the goal of maximizing the system secrecy rate while considering the QoS requirements of legitimate users" Yang explicitly optimizes the base station transmit beamforming vector through reinforcement learning to improve the security of one or more legitimate users.)
wherein one or more sum loss functions according to performing [supervised] pre-training and performing post-training are defined as a sum of secrecy-rate penalties for each legitimate user([p. 379] "In this paper, the reward function represents the optimization objective, and our objective is to maximize the system secrecy rate of all MUs while guaranteeing their QoS requirements. Thus, the presented QoS-aware reward function is expressed" See Eqn. 14. Yang explicitly teaches a two stage training process.).
However, Yang does not explicitly teach performing supervised pre-training for downlink non-orthogonal multiple access (NOMA) system on a physical layer security scheme of the NOMA system to maximize a sum secrecy rate
performing post-training by fine tuning a neural network learned by the pre-training using unsupervised learning.
and wherein performing supervised pre-training and performing post-training as two stages converges to a faster result than a one-stage training scheme converges to a result.
Huang, in the same field of endeavor, teaches performing supervised pre-training for downlink non-orthogonal multiple access (NOMA) system on a physical layer security scheme of the NOMA system to maximize a sum secrecy rate ([Abstract] "This paper considers a multiple input–multiple output broadcast channel to maximize the weighted sum-rate under the total power constraint" [p. 2 §I] "A beamforming design architecture based on DNN is proposed by redesigning the loss function. Based on the idea of unsupervised learning, the sum-rate can be maximized under the constraint of the total transmit power with slight performance loss compared to the WMMSE algorithm" [p. 2 §I] "The self-encoder in deep learning was applied in [14] to the non-orthogonal multiple access (NOMA) communication system, and the new mechanism of end-to-end communication was realized while optimizing communication system performance" [p. 3 §IIB] "Our main objective is to maximize the weighted sum-rate of all users" [p. 6 §V] "we use the pruning method to reload the pre-trained network model in the training process" All users are interpreted as legitimate users. Beamforming interpreted as synonymous with precoding. Legitimate user information transmission is interpreted as synonymous with a test set (See [p. 7603 §IVA] "The number of training samples and test samples are 50000 and 5000").).
Yang and Huang are both directed towards machine learning for a secure precoder. Therefore, Yang and Huang are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Yang with the teachings of Huang by using the supervised learning in Huang. Huang provides additional motivation for the combination ([p. 7604 §V] "we use a DNN model to design beamforming matrix, which greatly reduces the computational complexity compared to the traditional WMMSE algorithm while ensuring performance").
However, the combination of Yang and Huang does not explicitly teach performing post-training by fine tuning a neural network learned by the pre-training using unsupervised learning.
and wherein performing supervised pre-training and performing post-training as two stages converges to a faster result than a one-stage training scheme converges to a result.
Goecks, in the same field of endeavor, teaches performing post-training by fine tuning a neural network learned by the pre-training using unsupervised learning.([p. 2 §1] "we propose the Cycle-of-Learning (CoL) framework, which uses an actor-critic architecture with a loss function that combines behavior cloning and 1-step Q-learning losses with an off-policy algorithm, and a pre-training step to learn from human demonstrations. Unlike previous approaches to combine BC with RL, such as Rajeswaran et al. [24], our approach uses an actor-critic architecture to learn both a policy and value function from the human demonstration data, which we show, speeds up learning. Additionally, we perform a detailed component analysis of our method to investigate the individual contributions of pre-training")
and wherein performing supervised pre-training and performing post-training as two stages converges to a faster result than a one-stage training scheme converges to a result(See FIG. 2 which shows the two-stage CoL method converging much more quickly than the one-stage training schemes).
The combination of Yang and Huang as well as Goecks are directed towards machine learning. Therefore, the combination of Yang and Huang as well as Goecks are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Yang and Huang with the teachings of Goecks by performing supervised pre-training (Huang) followed by reinforcement learning fine-tuning (Yang). Goecks provides FIG. 3, which shows improved convergence over previous training methods, as additional motivation for combination.
Regarding claim 2, the combination of Yang, Huang, and Goecks teaches The learning method of claim 1, wherein the performing of the pre-training includes: performing the pre-training using a loss function, and(Huang [p. 4 §IIIB] " The chosen loss function is the mean squared error between the label {w (i) } and the network output {bw (i)}. We select Adam as the optimizer, which has exemplary performance in non-convex problems")
wherein the loss function is defined with regard to a probability that a secrecy rate obtained by a secure precoder according to a channel of each of the legitimate users will be less than a secrecy rate each of the legitimate users should ensure and the secrecy rate obtained by the secure precoder.(Yang [p. 377] "Let K = {1, 2,...,K}, M = {1, 2,...,M} and L = {1, 2,...,L} denote the MU set, the eavesdropper set and the IRS reflecting element set, respectively. Let Hbr ∈ CL×N , hH bu,k ∈ C1×N , hH ru,k ∈ C1×L, hH be,m ∈ C1×N , and hH re,m ∈ C1×L denote the channel coefficients from the BS to the IRS, from the BS to the k-th MU, from the IRS to the k-th MU, from the BS to the m-th eavesdropper, and from the IRS to the m-th eavesdropper, respectively. All the above mentioned channel coefficients in the system are assumed to be small-scale fading with path loss").
Regarding claims 6 and 7, claims 6 and 7 are directed towards a device for performing the method of claims 1 and 2, respectively. Therefore, the rejections applied to claims 1 and 2 also apply to claims 6 and 7, respectively.
Regarding claim 16, Yang teaches A learning method for a secure precoder, the learning method comprising:([p. 376] "A novel DRL-based secure beamforming approach is firstly proposed to jointly optimize the beamforming matrix at the BS and the reflecting beamforming matrix (reflection phases) at the IRS in dynamic environments")
to maximize a sum secrecy rate while ensuring secrecy rates of one or more respective legitimate users when one or more eavesdropper eavesdrops, ([p. 376] "A novel DRL-based secure beamforming approach is firstly proposed to jointly optimize the beamforming matrix at the BS and the reflecting beamforming matrix (reflection phases) at the IRS in dynamic environments")
without reliance on location of the one or more legitimate users or the one or more eavesdroppers([p. 380] "In detail, the agent utilizes the observed state (i.e, CSI, previous secrecy rate, QoS satisfaction level), the feedback reward from environment as well as the historical experience from the replay buffer to train its learning model. After that, the agent employs the trained model to make decision (beamforming matrices V and Ψ) based on its learned policy. The procedures of the proposed learning based secure beamforming are provided in the following subsections" Yang is not reliant on user or eavesdropper location but rather channel state information (CSI).)
wherein the secure precoder is trained to utilize channel characteristics, such as path loss and fading, to dynamically adjust transmission strategies between the one or more legitimate users and the one or more eavesdroppers to improve the security of the one or more legitimate users([p. 377] "Let K = {1, 2,...,K}, M = {1, 2,...,M} and L = {1, 2,...,L} denote the MU set, the eavesdropper set and the IRS reflecting element set, respectively. Let Hbr ∈ CL×N , hH bu,k ∈ C1×N , hH ru,k ∈ C1×L, hH be,m ∈ C1×N , and hH re,m ∈ C1×L denote the channel coefficients from the BS to the IRS, from the BS to the k-th MU, from the IRS to the k-th MU, from the BS to the m-th eavesdropper, and from the IRS to the m-th eavesdropper, respectively. All the above mentioned channel coefficients in the system are assumed to be small-scale fading with path loss" [p. 376] "A novel DRL-based secure beamforming approach is firstly proposed to jointly optimize the beamforming matrix at the BS and the reflecting beamforming matrix (reflection phases) at the IRS in dynamic environments [...] we formulate a joint BS’s transmit beamforming and IRS’s reflect beamforming optimization problem with the goal of maximizing the system secrecy rate while considering the QoS requirements of legitimate users" Yang explicitly optimizes the base station transmit beamforming vector through reinforcement learning to improve the security of one or more legitimate users.)
wherein the performing of the post-training includes: performing training using a margin of a secrecy rate of each legitimate user to minimize a probability that a secrecy rate obtained by a secure precoder according to a channel of each of the legitimate users will be less than a secrecy rate each of the legitimate users should ensure,([p. 377] "Let K = {1, 2,...,K}, M = {1, 2,...,M} and L = {1, 2,...,L} denote the MU set, the eavesdropper set and the IRS reflecting element set, respectively. Let Hbr ∈ CL×N , hH bu,k ∈ C1×N , hH ru,k ∈ C1×L, hH be,m ∈ C1×N , and hH re,m ∈ C1×L denote the channel coefficients from the BS to the IRS, from the BS to the k-th MU, from the IRS to the k-th MU, from the BS to the m-th eavesdropper, and from the IRS to the m-th eavesdropper, respectively. All the above mentioned channel coefficients in the system are assumed to be small-scale fading with path loss" [p. 376] "A novel DRL-based secure beamforming approach is firstly proposed to jointly optimize the beamforming matrix at the BS and the reflecting beamforming matrix (reflection phases) at the IRS in dynamic environments [...] we formulate a joint BS’s transmit beamforming and IRS’s reflect beamforming optimization problem with the goal of maximizing the system secrecy rate while considering the QoS requirements of legitimate users" Yang explicitly optimizes the base station transmit beamforming vector through reinforcement learning to improve the security of one or more legitimate users.)
wherein one or more loss functions according to performing [supervised] pre-training and performing post-training are defined as a sum of secrecy-rate penalties for each legitimate user([p. 379] "In this paper, the reward function represents the optimization objective, and our objective is to maximize the system secrecy rate of all MUs while guaranteeing their QoS requirements. Thus, the presented QoS-aware reward function is expressed" See Eqn. 14. Yang explicitly teaches a two stage training process.).
However, Yang does not explicitly teach performing supervised pre-training for a downlink non-orthogonal multiple access (NOMA) system on a physical layer security scheme of the NOMA system to maximize a sum secrecy rate while ensuring secrecy rates of one or more respective legitimate users
each having a single antenna,
within a channel between a base station and the one or more legitimate users, when a maximum transit power is allocated to the NOMA system;
and performing post-training by fine tuning a neural network learned by the pre-training using unsupervised learning.
Huang, in the same field of endeavor, teaches performing supervised pre-training for a downlink non-orthogonal multiple access (NOMA) system on a physical layer security scheme of the NOMA system to maximize a sum secrecy rate while ensuring secrecy rates of one or more respective legitimate users ([Abstract] "This paper considers a multiple input–multiple output broadcast channel to maximize the weighted sum-rate under the total power constraint" [p. 2 §I] "A beamforming design architecture based on DNN is proposed by redesigning the loss function. Based on the idea of unsupervised learning, the sum-rate can be maximized under the constraint of the total transmit power with slight performance loss compared to the WMMSE algorithm" [p. 2 §I] "The self-encoder in deep learning was applied in [14] to the non-orthogonal multiple access (NOMA) communication system, and the new mechanism of end-to-end communication was realized while optimizing communication system performance" [p. 3 §IIB] "Our main objective is to maximize the weighted sum-rate of all users" [p. 6 §V] "we use the pruning method to reload the pre-trained network model in the training process" All users are interpreted as legitimate users. Beamforming interpreted as synonymous with precoding. Legitimate user information transmission is interpreted as synonymous with a test set (See [p. 7603 §IVA] "The number of training samples and test samples are 50000 and 5000").)
each having a single antenna, ([p. 2 §1] "In the literature, mainly power control and single-antenna transceiver communication scenarios have been considered" [p. 2 §II] "We consider a downlink transmission scenario in a typical MIMO system. As shown in Figure 1, a transmitter equipped with P antennas serves K users, each with Q receive antennas")
within a channel between a base station and the one or more legitimate users, when a maximum transit power is allocated to the NOMA system; ([p. 7600 §IIA] "The channel between user k and the BS is denoted as a matrix Hk ∈ C^(Q×P)").
Yang and Huang are both directed towards machine learning for a secure precoder. Therefore, Yang and Huang are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of Yang with the teachings of Huang by using the supervised learning in Huang. Huang provides additional motivation for the combination ([p. 7604 §V] "we use a DNN model to design beamforming matrix, which greatly reduces the computational complexity compared to the traditional WMMSE algorithm while ensuring performance").
However, the combination of Yang and Huang does not explicitly teach performing post-training by fine tuning a neural network learned by the pre-training using unsupervised learning.
Goecks, in the same field of endeavor, teaches performing post-training by fine tuning a neural network learned by the pre-training using unsupervised learning ([p. 2 §1] "we propose the Cycle-of-Learning (CoL) framework, which uses an actor-critic architecture with a loss function that combines behavior cloning and 1-step Q-learning losses with an off-policy algorithm, and a pre-training step to learn from human demonstrations. Unlike previous approaches to combine BC with RL, such as Rajeswaran et al. [24], our approach uses an actor-critic architecture to learn both a policy and value function from the human demonstration data, which we show, speeds up learning. Additionally, we perform a detailed component analysis of our method to investigate the individual contributions of pre-training").
The combination of Yang and Huang as well as Goecks are directed towards machine learning. Therefore, the combination of Yang and Huang as well as Goecks are reasonably pertinent analogous art. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Yang and Huang with the teachings of Goecks by performing supervised pre-training (Huang) followed by reinforcement learning fine-tuning (Yang). Goecks provides FIG. 3, which shows improved convergence over previous training methods, as additional motivation for combination.
Regarding claim 18, the combination of Yang, Huang, and Goecks teaches The method of claim 16, wherein performing supervised pre-training and performing post-training as two stages obtains a faster convergence result than a one-stage training scheme(Goecks See FIG. 2 which shows the two-stage CoL method converging much more quickly than the one-stage training schemes).
Claims 3, 4, 8, and 9 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Yang, Huang, Goecks, and Kang (“Deep Learning-Based MIMO-NOMA With Imperfect SIC Decoding”, 2019).
Regarding claim 3, the combination of Yang, Huang, and Goecks teaches The learning method of claim 1.
However, the combination of Yang, Huang, and Goecks does not explicitly teach wherein a loss function according to the post-training is defined as the following formula, L_post = −R_s1 − R_s2 + c_1(max[G_1 + ϵ_1 − R_s1, 0])^2 + c_2(max[G_2 + ϵ_2 − R_s2, 0])^2, where R_sk denotes an achievable secrecy rate for the secure precoder, c_k denotes the penalty coefficient, and ϵ_k denotes a margin of the secrecy rate of each of the legitimate users, and where k = a first legitimate user 1, the second legitimate user 2, and the artificial noise N.
Kang, in the same field of endeavor, teaches a loss function according to the post-training is defined as the following formula, L_post = −R_s1 − R_s2 + c_1(max[G_1 + ϵ_1 − R_s1, 0])^2 + c_2(max[G_2 + ϵ_2 − R_s2, 0])^2, where R_sk denotes an achievable secrecy rate for the secure precoder, c_k denotes the penalty coefficient, and ϵ_k denotes a margin of the secrecy rate of each of the legitimate users, and where k = a first legitimate user 1, the second legitimate user 2, and the artificial noise N (See Eqn. 9).
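For clarity of record, the penalty-based loss recited above can be sketched numerically as follows. This is a minimal illustration with hypothetical variable names and values (secrecy rates R_s, thresholds G, margins eps, penalty coefficients c mirror the symbols in the formula); it is not drawn from Kang or any other cited reference.

```python
# Hypothetical sketch of L_post = -R_s1 - R_s2
#   + c_1*(max[G_1 + eps_1 - R_s1, 0])^2 + c_2*(max[G_2 + eps_2 - R_s2, 0])^2
# generalized to k legitimate users. All names and values are illustrative.

def post_training_loss(R_s, G, eps, c):
    """Negative sum of secrecy rates plus squared-hinge secrecy-rate penalties."""
    loss = -sum(R_s)  # reward achieved secrecy rates
    for R_sk, G_k, eps_k, c_k in zip(R_s, G, eps, c):
        # Penalize only when a user's rate falls below its threshold plus margin.
        loss += c_k * max(G_k + eps_k - R_sk, 0.0) ** 2
    return loss

# Example: both users exceed G_k + eps_k, so only the -R_sk terms remain.
print(post_training_loss(R_s=[2.0, 1.5], G=[1.0, 1.0], eps=[0.1, 0.1], c=[10.0, 10.0]))
# prints -3.5
```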
The combination of Yang, Huang, and Goecks as well as Kang are directed towards using neural networks for NOMA beamforming/precoding. Therefore, the combination of Yang, Huang, and Goecks as well as Kang are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Yang, Huang, and Goecks with the teachings of Kang by substituting the loss function in the combination of Yang, Huang, and Goecks with the loss function in Kang. Kang provides additional motivation for the combination ([p. 3417 §V] "In this article, considering the practical issue of imperfectness of SIC decoding, we developed the deep leaning based jointly optimal precoding and SIC decoding scheme for the MIMO-NOMA system in the sense of minimizing the total MSE between the users’ desired signals and their decoded signals. The superior performance and effectiveness of the proposed scheme are demonstrated through the numerical results. An important and interesting work to be further investigated is to analyze the performance of the proposed MIMO-NOMA system with deep learning"). This motivation for combination also applies to the remaining claims which depend on this combination.
Regarding claim 4, the combination of Yang, Huang, Goecks, and Kang teaches The learning method of claim 3, wherein the performing of the post-training includes: performing training using the margin of the secrecy rate of each legitimate user to minimize a probability that a secrecy rate obtained by a secure precoder according to a channel of each of the legitimate users will be less than a secrecy rate each of the legitimate users should ensure.(Yang [p. 378] "Since each eavesdropper can eavesdrop any of the K MUs’ signal, according to [14]–[25], the achievable individual secrecy rate from the BS to the k-th MU can be expressed by [Eqn. 10]" See line 1 of Eqn. 11 which maximizes the sum of legitimate user secrecy rates and constraint 11a where R_k^sec is constrained to be greater than minimum secrecy rate R_k^sec,min guarantees the secrecy rate is not less than the minimum secrecy rate.).
Regarding claim 8, claim 8 is directed towards a device for performing the method of claim 3. Therefore, the rejection applied to claim 3 also applies to claim 8.
Regarding claim 9, claim 9 is directed towards a device for performing the method of claim 4. Therefore, the rejection applied to claim 4 also applies to claim 9.
Claims 5, 10, and 17 are rejected under 35 U.S.C. § 103 as being unpatentable over the combination of Yang, Huang, Goecks, and Gui ("Deep Learning for an Effective Nonorthogonal Multiple Access Scheme", 2018).
Regarding claim 5, the combination of Yang, Huang, and Goecks teaches The learning method of claim 1, further comprising:(Huang claim 1).
However, the combination of Yang, Huang, and Goecks does not explicitly teach updating a weight matrix and a bias vector using a stochastic gradient descent (SGD) scheme, when updating the weight matrix and the bias vector in a backpropagation scheme using a loss function according to the pre-training and a loss function according to the post-training.
Gui, in the same field of endeavor, teaches updating a weight matrix and a bias vector using a stochastic gradient descent (SGD) scheme, when updating the weight matrix and the bias vector in a backpropagation scheme using a loss function according to the pre-training and a loss function according to the post-training. ([p. 3 §IIA] "For the sake of enhancing the generalization ability of the network and reducing the dimension of the input data, the RBM is implemented to train the original input of the LSTM. In other words, the RBM is used as the pre-training structure for promoting the performance of the LSTM" [p. 5 §IIIB] "the form of the output oj of the jth layer can be expressed as oj = fsigmoid(Wj oj−1 + bj), (9) where Wj and bj denote the weight parameter and bias parameter, respectively. With the aid of the activation function fsigmoid(·), a nonlinearity transform can be realized" [p. 7 §IIID] "Unfortunately, the typical gradient descent method is not the best choice for optimizing these “min-sum” decoders, as the function computed by the check nodes has non-differentiable kinks at certain points. However, the good characteristic of the SGD is that it can select the subgradient during backpropagation. Thus, the SGD is introduced to train the DL decoder" See also Loss Eqn. (10) on p. 5.).
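The SGD weight-and-bias update via backpropagation described in the Gui passages can be sketched as follows (a minimal sketch, assuming a single sigmoid layer o = sigmoid(Wx + b) and a squared-error loss; the learning rate and all variable names are illustrative, not Gui's):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step(W, b, x, target, lr=0.1):
    """One SGD update of weight matrix W and bias vector b for a
    single sigmoid layer o = sigmoid(W @ x + b), backpropagating a
    squared-error loss L = 0.5 * ||o - target||^2."""
    z = W @ x + b
    o = sigmoid(z)
    # Chain rule: dL/dz = (o - target) * sigmoid'(z), with
    # sigmoid'(z) = o * (1 - o).
    delta = (o - target) * o * (1.0 - o)
    W_new = W - lr * np.outer(delta, x)  # dL/dW = delta x^T
    b_new = b - lr * delta               # dL/db = delta
    return W_new, b_new
```

Repeated application of `sgd_step` over training samples drives the loss down, which is the behavior the rejection relies on Gui for.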
The combination of Yang, Huang, and Goecks, as well as Gui, is directed towards neural network systems for NOMA beamforming/precoding. Therefore, the combination of Yang, Huang, and Goecks and Gui are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Yang, Huang, and Goecks with the teachings of Gui by using the RBM for pre-training and adding the LSTM in Gui as a hidden layer for the network in the combination of Yang, Huang, and Goecks. This could be accomplished trivially (see the similarities between FIG. 2 of the combination of Yang, Huang, and Goecks and FIG. 1 of Gui). Gui provides additional motivation for the combination ([p. 9 §IVA] "Paper [32] came to an initial conclusion that introducing radio transformer networks (RTNs) to integrate expert knowledge into the DL model can further improve the end-to-end training performance. Here, we deploy RTNs in our proposed NOMA system to elevate its performance. In Fig. 10, the BLER of the proposed LSTM scheme, the proposed RTN-aided LSTM scheme, and the typical hard decision coding scheme are investigated under non-line-of-sight (NLOS) propagation"). This motivation for combination also applies to the remaining claims that depend on this combination.
Regarding claim 10, claim 10 is directed towards a device for performing the method of claim 5. Therefore, the rejection applied to claim 5 also applies to claim 10.
Regarding claim 17, the combination of Yang, Huang, and Goecks teaches The method of claim 16.
However, the combination of Yang, Huang, and Goecks does not explicitly teach wherein the performing of the post-training further comprises: updating a weight matrix and a bias vector using a stochastic gradient descent method, when updating the weight matrix and the bias vector in a backpropagation method using a loss function according to pre-training and a loss function according to post-training.
Gui, in the same field of endeavor, teaches the performing of the post-training further comprises: updating a weight matrix and a bias vector using a stochastic gradient descent method, when updating the weight matrix and the bias vector in a backpropagation method using a loss function according to pre-training and a loss function according to post-training. ([p. 3 §IIA] "For the sake of enhancing the generalization ability of the network and reducing the dimension of the input data, the RBM is implemented to train the original input of the LSTM. In other words, the RBM is used as the pre-training structure for promoting the performance of the LSTM" [p. 5 §IIIB] "the form of the output oj of the jth layer can be expressed as oj = fsigmoid(Wj oj−1 + bj), (9) where Wj and bj denote the weight parameter and bias parameter, respectively. With the aid of the activation function fsigmoid(·), a nonlinearity transform can be realized" [p. 7 §IIID] "Unfortunately, the typical gradient descent method is not the best choice for optimizing these “min-sum” decoders, as the function computed by the check nodes has non-differentiable kinks at certain points. However, the good characteristic of the SGD is that it can select the subgradient during backpropagation. Thus, the SGD is introduced to train the DL decoder" See also Loss Eqn. (10) on p. 5.).
The combination of Yang, Huang, and Goecks, as well as Gui, is directed towards neural network systems for NOMA beamforming/precoding. Therefore, the combination of Yang, Huang, and Goecks and Gui are analogous art in the same field of endeavor. It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the combination of Yang, Huang, and Goecks with the teachings of Gui by using the RBM for pre-training and adding the LSTM in Gui as a hidden layer for the network in the combination of Yang, Huang, and Goecks. This could be accomplished trivially (see the similarities between FIG. 2 of the combination of Yang, Huang, and Goecks and FIG. 1 of Gui). Gui provides additional motivation for the combination ([p. 9 §IVA] "Paper [32] came to an initial conclusion that introducing radio transformer networks (RTNs) to integrate expert knowledge into the DL model can further improve the end-to-end training performance. Here, we deploy RTNs in our proposed NOMA system to elevate its performance. In Fig. 10, the BLER of the proposed LSTM scheme, the proposed RTN-aided LSTM scheme, and the typical hard decision coding scheme are investigated under non-line-of-sight (NLOS) propagation"). This motivation for combination also applies to the remaining claims that depend on this combination.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Deng (“Deep Learning-Based Secure MIMO Communications with Imperfect CSI for Heterogeneous Networks”, 2020) is directed towards using deep learning for a secure precoder.
Fritschek (“Deep Learning for the Gaussian Wiretap Channel”, 2019) is also directed towards the use of deep learning for a secure precoder.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SIDNEY VINCENT BOSTWICK whose telephone number is (571)272-4720. The examiner can normally be reached M-F 7:30am-5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached on (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SIDNEY VINCENT BOSTWICK/Examiner, Art Unit 2124