Prosecution Insights
Last updated: April 19, 2026
Application No. 17/193,974

ARCHITECTURES FOR TEMPORAL PROCESSING ASSOCIATED WITH WIRELESS TRANSMISSION OF ENCODED DATA

Status: Final Rejection (§103)
Filed: Mar 05, 2021
Examiner: REYES, CHRISTOPHER ANTHONY
Art Unit: 2475
Tech Center: 2400 — Computer Networks
Assignee: Qualcomm Incorporated
OA Round: 4 (Final)
Grant Probability: 88% (Favorable)
Predicted OA Rounds: 5-6
Predicted Time to Grant: 2y 11m
Grant Probability with Interview: 81%

Examiner Intelligence

Career Allow Rate: 88% (7 granted of 8 resolved; +29.5% vs TC avg; above average)
Interview Lift: -6.3% (minimal) among resolved cases with interview
Typical Timeline: 2y 11m average prosecution
Career History: 60 total applications across all art units; 52 currently pending

Statute-Specific Performance

§101: 3.3% (-36.7% vs TC avg)
§103: 82.8% (+42.8% vs TC avg)
§102: 11.1% (-28.9% vs TC avg)
§112: 2.9% (-37.1% vs TC avg)

Note: comparisons are against a Tech Center average estimate. Based on career data from 8 resolved cases.

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections

Claims 20 and 29-30 are objected to because of the following informalities: the claims read in part, "...and a state vector that represents an output pf a prior temporal processing operation...," with a misspelling of the word "of". Appropriate correction is required.

Response to Arguments

Applicant's arguments with respect to claims 1, 20, and 29-30 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Since the independent claims 1, 20, and 29-30 remain rejected, the rejection of the dependent claims persists.

Claim Rejections - 35 USC § 103

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Claims 1, 20, and 29-30 are rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI et al. (US 20200076841 A1, hereinafter "HAJIMIRSADEGHI") in view of MURUGAN (Pushparaja Murugan, "Learning The Sequential Temporal Information with Recurrent Neural Networks," manuscript, XRVision Research and Development center, Singapore, arXiv:1807.02857v1, July 8, 2018, hereinafter "MURUGAN"), SHAZEER et al. (US 20180341860 A1, hereinafter "SHAZEER"), and WANG et al. (US 20240030980 A1, hereinafter "WANG").
Regarding claim 1, HAJIMIRSADEGHI teaches a transmitting wireless communication device for wireless communication (paragraph 0353; figure 24, 2418: communication interface), comprising: one or more memories (paragraph 0346; figure 24, 2406: main memory); and one or more processors, operatively coupled to the one or more memories, configured to (paragraph 0346; figure 24, 2404: processor): encode a data set using a single shot encoding operation to output a first encoded data set.

HAJIMIRSADEGHI writes, "One-hot encoding is exemplified by bitmap 960 that encodes category 928 into bytes 14-15 of raw feature vector 940. For example if category 928 is a month and has a value of April (i.e. fourth month of year), then bit 4 of bitmap 960 is set and bits 1-3 and 5-12 are clear. A category that is naturally ordered (i.e. sortable), such as month on a calendar, should be encoded as an integer to preserve the ordering. For example, one of twelve months may be encoded as a nibble (i.e. 4-bit integer with 16 possible values). A category that is naturally unordered, such as colors of a palette, should be one-hot encoded" (paragraph 0177). HAJIMIRSADEGHI states the use of one-hot encoding to encode the data set. One-hot encoding may be construed as single-shot encoding.

HAJIMIRSADEGHI further teaches: and transmit the second encoded data set to a receiving wireless communication device (paragraph 0353; figure 24, 2418: communication interface).
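For clarity, the one-hot mapping described in the HAJIMIRSADEGHI passage above can be sketched as follows. This is an illustrative sketch only; the function and variable names are hypothetical and do not appear in the reference or the claims.

```python
# Sketch of one-hot ("single-shot") encoding of an ordered category, per the
# HAJIMIRSADEGHI example: April (4th month) sets bit 4; bits 1-3 and 5-12 clear.
# Names here are illustrative assumptions, not drawn from any reference.

def one_hot_month(month: int, num_months: int = 12) -> list[int]:
    """Return a 12-element bit vector with only the bit for `month` set (1-indexed)."""
    if not 1 <= month <= num_months:
        raise ValueError("month must be in 1..12")
    return [1 if i == month else 0 for i in range(1, num_months + 1)]

april = one_hot_month(4)  # bit 4 set, all other bits clear
```

Exactly one bit is ever set, which is why the Office Action construes the operation as producing the entire encoded vector in a single shot.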
HAJIMIRSADEGHI fails to explicitly disclose information regarding, "perform, using a set of inputs that includes the first encoded data set and a state vector that represents an output of a prior temporal processing operation, a temporal processing operation associated with at least one neural network", "to produce a second encoded data set having a dimensionality that is less than a dimensionality of the first encoded data set," and "wherein a dimensionality of the state vector is greater than the dimensionality of the first encoded data set;"

However, MURUGAN teaches, in analogous art, perform, using a set of inputs that includes the first encoded data set and a state vector that represents an output of a prior temporal processing operation, a temporal processing operation associated with at least one neural network. MURUGAN writes, "Simple Recurrent Network has three layers, input units added with context unit, hidden units and output units. The feedback loop from the hidden units to the context units in the input units allow the network to process the previous stage of any sequential events. Hence, the network will have two sources of inputs such as input information and the state of previous events. Perceiving the stage of previous events is commonly known as the memory of Recurrent Neural Networks" (page 3, section 2, paragraph 2). MURUGAN adds, "The recurrent network unit applies the recurrence formula to the input vector it at time step t and previous stage st−1 of the input vector at the time step (t − 1) and outputs the vector xt+1 with new proposed stage st" (pages 3-4, section 2.1, paragraph 1). MURUGAN continues, "A typical recurrent neural networks is consisted of many recurrent unit in the same network and passes the time dependent information to the successive neural units. This implies that the recurrent neural network are highly related to sequences and lists. In over all, the hidden state as the memory of the networks captures the temporal information of the previous history that lead to computed outputs dependent on the network memory. Unlike the traditional feed-forward networks, the RNN shares same parameters across all the time steps that reduce larger number learning parameters. On the other hand, We are performing same task at each time step but with the different inputs" (page 4, section 2.2, paragraph 2).

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI the use of RNN in processing sequential data as described by MURUGAN to achieve state-of-the-art performance.

HAJIMIRSADEGHI and MURUGAN fail to explicitly disclose information regarding, "to produce a second encoded data set having a dimensionality that is less than a dimensionality of the first encoded data set," and "wherein a dimensionality of the state vector is greater than the dimensionality of the first encoded data set;"

However, SHAZEER teaches, in analogous art, to produce a second encoded data set having a dimensionality that is less than a dimensionality of the first encoded data set. SHAZEER writes, "In some cases, the learned transformations applied by the attention sub-layer reduce the dimensionality of the original keys and values and, optionally, the queries. For example, when the dimensionality of the original keys, values, and queries is d and there are h attention layers in the sub-layer, the sub-layer may reduce the dimensionality of the original keys, values, and queries to d/h" (paragraph 0072). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention and method of HAJIMIRSADEGHI and MURUGAN to include aspects of the method and apparatus described by SHAZEER that "relates to transducing sequences using neural networks."
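The recurrence MURUGAN describes (new state from the current input and the previous state, with the same parameters reused at every time step) can be sketched minimally as follows. All weights and dimensions below are arbitrary placeholders chosen for illustration, not values from any cited reference; the state is made wider than the input only to mirror the claimed relationship in which the state vector's dimensionality exceeds that of the first encoded data set.

```python
import numpy as np

# Minimal sketch of a simple-RNN step: s_t = tanh(W_in @ i_t + W_rec @ s_{t-1}).
# The same W_in and W_rec are shared across all time steps, as MURUGAN notes.
# Sizes and weights are illustrative assumptions only.

rng = np.random.default_rng(0)
input_dim, state_dim = 8, 32                      # state_dim > input_dim
W_in = rng.normal(size=(state_dim, input_dim))    # input-to-state weights
W_rec = rng.normal(size=(state_dim, state_dim))   # recurrent (state-to-state) weights

def rnn_step(i_t: np.ndarray, s_prev: np.ndarray) -> np.ndarray:
    """One temporal processing step combining the current input and prior state."""
    return np.tanh(W_in @ i_t + W_rec @ s_prev)

s = np.zeros(state_dim)                # initial state
for t in range(5):                     # same task, same parameters, new input each step
    s = rnn_step(rng.normal(size=input_dim), s)
```

The loop shows the two input sources MURUGAN identifies: the fresh input vector and the state carried over from the previous step.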
SHAZEER provides motivation for modification of the invention, stating, "The use of attention mechanisms allows the sequence transduction neural network to effectively learn dependencies between distant positions during training, improving the accuracy of the sequence transduction neural network on various transduction tasks, e.g., machine translation" (paragraph 0008). SHAZEER adds, "The sequence transduction neural network can also exhibit improved performance over conventional machine translation neural networks without task-specific tuning through the use of the attention mechanism" (paragraph 0008).

HAJIMIRSADEGHI, MURUGAN, and SHAZEER fail to explicitly disclose information regarding, "wherein a dimensionality of the state vector is greater than the dimensionality of the first encoded data set;"

However, WANG teaches, in analogous art, wherein a dimensionality of the state vector is greater than the dimensionality of the first encoded data set. WANG writes, "Performing dimension reduction processing on the first information to generate the second information is performing dimension reduction processing on the channel state information to generate dimension-reduced channel state information. The dimension-reduced channel state information is encoded based on the first neural network, to generate encoded channel state information to be sent" (paragraph 0074). WANG adds, "In an embodiment, the access network device performs the first encoding on the CSI-RS to generate the first information, the terminal device determines the second information based on the first information and the CSI-RS, the terminal device performs the second encoding on the second information to generate the third information..." (paragraph 0233). WANG informs the reader of performing dimension reduction processing on the first information to generate the second information, in which the dimension reduction processing is performed on the channel state information.
WANG notes that the dimension-reduced channel state information is encoded based on the first neural network. WANG continues, noting that the access network device performs the first encoding on the CSI-RS to generate the first information and the terminal device performs the second encoding on the second information. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention and method of HAJIMIRSADEGHI, MURUGAN, and SHAZEER to include aspects of the method and apparatus described by WANG that "relates to the field of wireless communication, and more specifically, to a data transmission method and apparatus." WANG provides motivation for modification of the invention, stating, "Therefore, in this application, the access network device can decode, by using a fused structure, information that is encoded twice, improving accuracy of decoding" (paragraph 0014). WANG adds, "In addition, compared with a manner of using only compression encoding, the composite encoding manner combining compression encoding and neural network encoding can help significantly improve quality of the channel information" (paragraph 0068).

Regarding claim 20, HAJIMIRSADEGHI teaches a receiving wireless communication device for wireless communication (paragraph 0355; figure 24, 2430: server), comprising: one or more memories (paragraph 0346; figure 24, 2406: main memory); and one or more processors, operatively coupled to the one or more memories, configured to (paragraph 0346; figure 24, 2404: processor): receive an encoded data set from a transmitting wireless communication device (paragraph 0353; figure 24, 2418: communication interface); and decode the first decoded data set using a single shot decoding operation to produce a second decoded data set.

HAJIMIRSADEGHI writes, "One-hot encoding is exemplified by bitmap 960 that encodes category 928 into bytes 14-15 of raw feature vector 940. For example if category 928 is a month and has a value of April (i.e. fourth month of year), then bit 4 of bitmap 960 is set and bits 1-3 and 5-12 are clear. A category that is naturally ordered (i.e. sortable), such as month on a calendar, should be encoded as an integer to preserve the ordering. For example, one of twelve months may be encoded as a nibble (i.e. 4-bit integer with 16 possible values). A category that is naturally unordered, such as colors of a palette, should be one-hot encoded" (paragraph 0177). HAJIMIRSADEGHI states the use of one-hot encoding to encode the data set. One-hot encoding may be construed as single-shot encoding. If the decoder works in a reverse manner to the encoder with the same components, the decoder will also contain a single-shot decoding operation.

HAJIMIRSADEGHI fails to explicitly disclose information regarding, "perform, using a set of inputs that includes the encoded data set and a state vector, a temporal processing operation associated with at least one neural network to produce a first decoded data set having a dimensionality that is greater than the dimensionality of the encoded data set, wherein a dimensionality of the state vector is greater than the dimensionality of the first decoded data set;", "to produce a first decoded data set having a dimensionality that is greater than the dimensionality of the encoded data set," and "wherein a dimensionality of the state vector is greater than the dimensionality of the first decoded data set;"
However, MURUGAN teaches, in analogous art, perform, using a set of inputs that includes the encoded data set and a state vector, a temporal processing operation associated with at least one neural network to produce a first decoded data set having a dimensionality that is greater than the dimensionality of the encoded data set, wherein a dimensionality of the state vector is greater than the dimensionality of the first decoded data set.

MURUGAN writes, "Simple Recurrent Network has three layers, input units added with context unit, hidden units and output units. The feedback loop from the hidden units to the context units in the input units allow the network to process the previous stage of any sequential events. Hence, the network will have two sources of inputs such as input information and the state of previous events. Perceiving the stage of previous events is commonly known as the memory of Recurrent Neural Networks" (page 3, section 2, paragraph 2). MURUGAN adds, "The recurrent network unit applies the recurrence formula to the input vector it at time step t and previous stage st−1 of the input vector at the time step (t − 1) and outputs the vector xt+1 with new proposed stage st" (pages 3-4, section 2.1, paragraph 1). MURUGAN continues, "A typical recurrent neural networks is consisted of many recurrent unit in the same network and passes the time dependent information to the successive neural units. This implies that the recurrent neural network are highly related to sequences and lists. In over all, the hidden state as the memory of the networks captures the temporal information of the previous history that lead to computed outputs dependent on the network memory. Unlike the traditional feed-forward networks, the RNN shares same parameters across all the time steps that reduce larger number learning parameters. On the other hand, We are performing same task at each time step but with the different inputs" (page 4, section 2.2, paragraph 2).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI the use of RNN in processing sequential data as described by MURUGAN to achieve state-of-the-art performance.

HAJIMIRSADEGHI and MURUGAN fail to explicitly disclose information regarding, "to produce a first decoded data set having a dimensionality that is greater than the dimensionality of the encoded data set," and "wherein a dimensionality of the state vector is greater than the dimensionality of the first decoded data set;"

However, SHAZEER teaches, in analogous art, to produce a first decoded data set having a dimensionality that is greater than the dimensionality of the encoded data set. SHAZEER writes, "In some cases, the learned transformations applied by the attention sub-layer reduce the dimensionality of the original keys and values and, optionally, the queries. For example, when the dimensionality of the original keys, values, and queries is d and there are h attention layers in the sub-layer, the sub-layer may reduce the dimensionality of the original keys, values, and queries to d/h" (paragraph 0072). It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention and method of HAJIMIRSADEGHI and MURUGAN to include aspects of the method and apparatus described by SHAZEER that "relates to transducing sequences using neural networks." SHAZEER provides motivation for modification of the invention, stating, "The use of attention mechanisms allows the sequence transduction neural network to effectively learn dependencies between distant positions during training, improving the accuracy of the sequence transduction neural network on various transduction tasks, e.g., machine translation" (paragraph 0008).
SHAZEER adds, "The sequence transduction neural network can also exhibit improved performance over conventional machine translation neural networks without task-specific tuning through the use of the attention mechanism" (paragraph 0008).

HAJIMIRSADEGHI, MURUGAN, and SHAZEER fail to explicitly disclose information regarding, "wherein a dimensionality of the state vector is greater than the dimensionality of the first decoded data set;"

However, WANG teaches, in analogous art, wherein a dimensionality of the state vector is greater than the dimensionality of the first decoded data set. WANG writes, "Performing dimension reduction processing on the first information to generate the second information is performing dimension reduction processing on the channel state information to generate dimension-reduced channel state information. The dimension-reduced channel state information is encoded based on the first neural network, to generate encoded channel state information to be sent" (paragraph 0074). WANG adds, "In an embodiment, the access network device performs the first encoding on the CSI-RS to generate the first information, the terminal device determines the second information based on the first information and the CSI-RS, the terminal device performs the second encoding on the second information to generate the third information..." (paragraph 0233). WANG informs the reader of performing dimension reduction processing on the first information to generate the second information, in which the dimension reduction processing is performed on the channel state information. WANG notes that the dimension-reduced channel state information is encoded based on the first neural network. WANG continues, noting that the access network device performs the first encoding on the CSI-RS to generate the first information and the terminal device performs the second encoding on the second information.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention and method of HAJIMIRSADEGHI, MURUGAN, and SHAZEER to include aspects of the method and apparatus described by WANG that "relates to the field of wireless communication, and more specifically, to a data transmission method and apparatus." WANG provides motivation for modification of the invention, stating, "Therefore, in this application, the access network device can decode, by using a fused structure, information that is encoded twice, improving accuracy of decoding" (paragraph 0014). WANG adds, "In addition, compared with a manner of using only compression encoding, the composite encoding manner combining compression encoding and neural network encoding can help significantly improve quality of the channel information" (paragraph 0068).

Claims 29-30 are method claims corresponding to the apparatus claims 1 and 20 that have already been rejected above. The applicant's attention is directed to the rejection of claims 1 and 20. Claims 29-30 are rejected under the same rationale as claims 1 and 20.

Claims 2-3, 21, and 31-33 are rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG as applied to claims 1 and 29 above, and further in view of WEN (C.-K. Wen, W.-T. Shih, and S. Jin, "Deep learning for massive MIMO CSI feedback," IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748-751, Oct. 2018, hereinafter "WEN").

Regarding claim 2, HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG teach the transmitting wireless communication device of claim 1. HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG fail to explicitly disclose information regarding, "wherein the data set is based at least in part on sampling of one or more reference signals." However, WEN teaches, in analogous art, wherein the data set is based at least in part on sampling of one or more reference signals.
WEN writes, "...deep learning (DL). DL attempts to mimic the human brain to accomplish a specific task by training large multilayered neural networks with vast numbers of training samples. Our developed CSI sensing (or encoder) and recovery (or decoder) network is hereafter called CsiNet. CsiNet has the following features. Encoder: Rather than using random projection, CsiNet learns a transformation from original channel matrices to compress representations (codewords) through training data. The algorithm is agnostic to human knowledge on channel distribution and instead directly learns to effectively use the channel structure from training data" (page 748, column 2, paragraph 1). WEN indicates that training samples are used to train the neural networks. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG the use of training samples to aid the neural network in learning the specific task of encoding and decoding to improve performance.

Regarding claim 3, HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG teach the transmitting wireless communication device of claim 1. Additionally, HAJIMIRSADEGHI teaches wherein the one or more processors, to transmit the second encoded data set to the receiving wireless communication device (paragraph 0353; figure 24, 2404: processor), are configured to: HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG fail to explicitly disclose information regarding, "transmit channel state information feedback to the receiving wireless communication device." However, WEN teaches, in analogous art, transmit channel state information feedback to the receiving wireless communication device. WEN writes, "Let H̃... be the [channel state information (CSI)] stacked in the spatial frequency domain. In the FDD system, the UE should return H̃ to the BS through feedback links" (page 749, column 1, paragraph 1).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG to transmit channel state information to the BS through feedback links. WEN provides the motivation for this action, explaining in the abstract of the prior art, "In frequency division duplex mode, the downlink channel state information (CSI) should be sent to the base station through feedback links so that the potential gains of a massive multiple-input multiple-output can be exhibited" (page 748, abstract).

Claim 21 is an apparatus claim and claims 31-32 are method claims corresponding to the apparatus claim 3 that has already been rejected above. The applicant's attention is directed to the rejection of claim 3. Claims 21 and 31-32 are rejected under the same rationale as claim 3. Claim 33 is a method claim corresponding to the apparatus claim 2 that has already been rejected above. The applicant's attention is directed to the rejection of claim 2. Claim 33 is rejected under the same rationale as claim 2.

Claims 6-11 and 34 are rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG as applied to claims 1 and 29 above, and further in view of LU et al. (C. Lu, W. Xu, H. Shen, J. Zhu and K. Wang, "MIMO Channel Information Feedback Using Deep Recurrent Network," IEEE Communications Letters, vol. 23, no. 1, pp. 188-191, Jan. 2019, doi: 10.1109/LCOMM.2018.2882829, hereinafter "LU").
Regarding claim 6, HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG teach the transmitting wireless communication device of claim 1. HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG fail to explicitly disclose information regarding, "wherein the prior temporal processing operation is associated with an encoder of the transmitting wireless communication device." However, LU teaches wherein the prior temporal processing operation is associated with an encoder of the transmitting wireless communication device. LU writes, "The compression and uncompression modules are proposed using the long short-term memory (LSTM) network [8], which has the memory function and thus can capture and extract inherent correlations, e.g. temporal correlations within input sequences" (page 189, column 2, paragraph 3; figure 2, Encoder). LU specifies that the compression and uncompression modules both use the LSTM network for temporal correlations, where the compression module is associated with the encoder and the uncompression module is associated with the decoder. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG the NN architectures proposed by LU to achieve better performance in terms of both CSI compression and recovery accuracy.

Claim 7 is an apparatus claim and claim 34 is a method claim corresponding to the apparatus claim 6 that has already been rejected above. The applicant's attention is directed to the rejection of claim 6. Claims 7 and 34 are rejected under the same rationale as claim 6.
Regarding claim 8, HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG teach the transmitting wireless communication device of claim 1. Additionally, HAJIMIRSADEGHI teaches wherein the one or more processors, to encode the data set using the temporal processing operation (paragraph 0346; figure 24, 2404: processor). HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG fail to explicitly disclose information regarding, "are configured to perform the temporal processing operation using a temporal processing block." However, LU teaches are configured to perform the temporal processing operation using a temporal processing block. LU states, "The compression and uncompression modules are proposed using the long short-term memory (LSTM) network [8], which has the memory function and thus can capture and extract inherent correlations, e.g. temporal correlations within input sequences" (page 189, column 2, paragraph 3; figure 2). LU explains that the LSTM network is used to perform temporal correlations. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG the NN architectures proposed by LU to achieve better performance in terms of both CSI compression and recovery accuracy.

Regarding claim 9, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU teach the transmitting wireless communication device of claim 8. Additionally, LU teaches wherein the temporal processing block comprises a recurrent neural network (RNN) bank that includes one or more RNNs. LU writes, "In this letter, we propose a new [Neural Network (NN)] by incorporating recurrent neural network (RNN) to catch the temporal channel correlation" (page 188, column 2, paragraph 1). LU specifies that a recurrent neural network (RNN) is incorporated to catch the temporal channel correlation. As LU explained earlier, the temporal correlation is captured and extracted in the LSTM network.
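The "temporal processing block" structure at issue in claims 8-9 (a bank of recurrent cells feeding an output generator) can be sketched minimally as below. This is not LU's actual RecCsiNet: simple RNN cells stand in for LU's LSTM units, a plain matrix stands in for the fully connected output layer, and all class names, dimensions, and weights are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a temporal processing block in the spirit of LU's
# LSTM + FCN arrangement: an "RNN bank" of recurrent cells whose concatenated
# outputs feed a fully connected output generator that compresses the input.
# Every name and size here is a placeholder, not taken from any reference.

rng = np.random.default_rng(1)

class SimpleRNNCell:
    """A basic RNN cell standing in for an LSTM unit."""
    def __init__(self, in_dim: int, hid_dim: int):
        self.W = rng.normal(size=(hid_dim, in_dim))   # input weights
        self.U = rng.normal(size=(hid_dim, hid_dim))  # recurrent weights
        self.h = np.zeros(hid_dim)                    # cell state (memory)

    def step(self, x: np.ndarray) -> np.ndarray:
        self.h = np.tanh(self.W @ x + self.U @ self.h)
        return self.h

in_dim, hid_dim, out_dim = 16, 24, 4                  # out_dim < in_dim: compression
bank = [SimpleRNNCell(in_dim, hid_dim) for _ in range(2)]    # the RNN bank
W_fc = rng.normal(size=(out_dim, hid_dim * len(bank)))       # FC output generator

def temporal_block(x: np.ndarray) -> np.ndarray:
    """Run one input through every cell in the bank, then project with the FC layer."""
    h = np.concatenate([cell.step(x) for cell in bank])
    return W_fc @ h                                   # compressed (encoded) output

code = temporal_block(rng.normal(size=in_dim))
```

Because each cell keeps its state across calls, repeated calls to `temporal_block` process a sequence while the final matrix plays the output-generator role the claims recite.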
Regarding claim 10, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU teach the transmitting wireless communication device of claim 9. Additionally, LU teaches wherein the one or more RNNs include at least one of: a long-short term memory, a gated recurrent unit, or a basic RNN. LU writes, "The compression and uncompression modules are proposed using the long short-term memory (LSTM) network [8], which has the memory function and thus can capture and extract inherent correlations, e.g. temporal correlations within input sequences" (page 189, column 2, paragraph 3; figure 2). "Note that, in deep learning networks, gated recurrent unit (GRU) is an alternative of LSTM for modeling sequences with memory. In our proposed NN architecture, the LSTM can be replaced by GRU without much additional changes to our current design" (page 190, column 1, paragraph 1). LU discusses the use of the LSTM network in the prior art and notes that the GRU is a suitable alternative.

Regarding claim 11, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU teach the transmitting wireless communication device of claim 8. Additionally, LU teaches wherein the temporal processing block comprises an output generator that includes at least one of: a fully connected layer, a convolutional layer, or a fully connected convolutional layer. LU states, "In our design, see Fig. 1, the input of the compression module is split into two parallel flows: an LSTM network and a linear [fully-connected network (FCN)]" (page 189, column 2, paragraph 3; figure 1, FCN). "Comparing the proposed architectures in Fig. 1 and Fig. 3, RecCsiNet connects LSTM and FCN in parallel while PR-RecCsiNet connects LSTM and FCN in serial" (page 190, column 1, paragraph 2; figure 3, FCN). LU introduces an FCN in parallel with the LSTM in figure 1; LU also presents the LSTM and FCN in serial, pictured in figure 3, corresponding to the PR-RecCsiNet.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU as applied to claim 11 above, and further in view of WU et al. (US 20220237917 A1, hereinafter "WU").

Regarding claim 12, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU teach the transmitting wireless communication device of claim 11. HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU fail to explicitly disclose information regarding, "wherein the output generator takes, as input, an output of a recurrent neural network (RNN) bank and produces the encoded data set." However, WU teaches, in analogous art, wherein the output generator takes, as input, an output of a recurrent neural network (RNN) bank and produces the encoded data set. WU writes, "Referring to FIG. 3B, the first feature extraction layer may be implemented based on the CNN, and the second feature extraction layer may be implemented based on a recurrent neural network, such as a long short-term memory (LSTM) network" (paragraph 0096; figure 3B). WU continues, "After the vector difference is obtained, the vector difference may be classified by using the fully connected layer in the definition difference analysis mechanism" (paragraph 0113). WU indicates and demonstrates in figure 3B that the output of the RNN is the input to the fully connected layer that will encode the data set for transmission. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU first and second fully connected layers that may include an activation layer between them to introduce non-linearity in order for the neural network to learn.

Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG as applied to claim 20 above, and further in view of LU and LU, W. et al. (US 20210373161 A1, hereinafter "LU, W.").
Regarding claim 24, HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG teach the receiving wireless communication device of claim 20. Additionally, HAJIMIRSADEGHI teaches wherein the one or more processors (paragraph 0346; figure 24, 2404: processor) are configured to decode the encoded data set using the temporal processing operation. HAJIMIRSADEGHI, MURUGAN, SHAZEER, and WANG fail to explicitly disclose, “are configured to perform the temporal processing operation using a temporal processing block, wherein the temporal processing block comprises:”, “a recurrent neural network (RNN) bank that includes one or more RNNs,”, “wherein an input of the RNN bank comprises the state vector associated with a first time, and wherein an output of the RNN bank comprises a state vector associated with a second time;”, and “and an output generator that takes, as input, an output of the RNN bank and produces the decoded data set.” However, LU teaches, in analogous art, are configured to perform the temporal processing operation using a temporal processing block, wherein the temporal processing block comprises: LU writes, “The compression and uncopression modules are proposed using the long short -term memory (LSTM) network [8], which has the memory function and thus can capture and extract inherent correlations, e.g. temporal correlations within input sequences” (page 189, column 2, paragraph 3; figure 2). LU explains that the LSTM network is used to perform temporal correlation. a recurrent neural network (RNN) bank that includes one or more RNNs, LU states, “In this letter, we propose a new [Neural Network (NN)] by incorporating recurrent neural network (RNN) to catch the temporal channel correlation” (page 188, column 2, paragraph 1). LU clearly points out that a new NN is proposed that incorporates an RNN. and an output generator that takes, as input, an output of the RNN bank and produces the decoded data set.
LU writes, “The parameter-reduced recurrent CsiNet (PR-RecCsiNet) utilizes new compression and uncompression modules as illustrated in Fig. 3. We use a linear FCN to project M-dimensional input to N-dimensional output and the output size of LSTM is reduced to M in the uncompression module” (page 190, column 1, paragraph 2; figure 3). As can be seen in figure 3 and described by LU, the output of the LSTM, with size M, is the input to the FCN to produce the decoded data set with size N. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI, MURUGAN, and SHAZEER the NN architectures proposed by LU to achieve better performance in terms of both CSI compression and recovery accuracy. HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU fail to explicitly disclose, “wherein an input of the RNN bank comprises a state vector associated with a first time, and wherein an output of the RNN bank comprises a state vector associated with a second time;” However, LU, W. teaches, in analogous art, wherein an input of the RNN bank comprises a state vector associated with a first time, and wherein an output of the RNN bank comprises a state vector associated with a second time; LU, W. writes, “...the probability vectors 831 can be provided as inputs to a number of RNNs 1109, 1111, and 1113 for temporal smoothness. Each RNN includes multiple long short term memory (LSTM) units. Each of the probability vectors 1103, 1105 and 1108 can be provided as an input to one of the RNNs, which can generate a corresponding probability vector 1121, 1123 or 1125. A weighted sum 1127 of the corresponding probability vectors 121, 1123 and 1125 can be computed, and used in conjunction with the original probability vectors 1103, 1105 and 1107 to obtain an estimated offset 1117” (paragraph 0116). LU, W.
indicates that probability vectors can be provided as inputs to a number of RNNs and that the RNNs can generate a corresponding probability vector. LU, W. continues, “The exemplary implementation uses recurrent neural networks (RNNs) to achieve similar temporal smoothness. To be more specific, LSTM units are used. Each of the probability vectors 1209 for the dimensions (x, y, yaw) from a probability offset volume described above can be treated as the input of each parameter independent RNNs unit. Through learning of historical information by RNNs, the trajectory of localization results would be smoother and more accurate” (paragraph 0120). LU, W. indicates that the implementation uses RNNs, specifically LSTM units, that the probability vectors can be treated as the input at a first time, and that the historical information produced by the RNN corresponds to a second time. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI, MURUGAN, and SHAZEER the NN architectures proposed by LU to achieve better performance in terms of both CSI compression and recovery accuracy.

Claims 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU as applied to claim 9 above, and further in view of LU, W.

Regarding claim 18, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU teach the transmitting wireless communication device of claim 9. HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU fail to explicitly disclose, “wherein the RNN bank is configured to select one or more dimensions of a set of dimensions for an input to have based at least in part on a correlation between the one or more dimensions and at least one additional dimension of the set of dimensions.” However, LU, W.
teaches, in analogous art, wherein the RNN bank is configured to select one or more dimensions of a set of dimensions for an input to have based at least in part on a correlation between the one or more dimensions and at least one additional dimension of the set of dimensions. LU, W. writes, “The method also includes compressing the probability offset volume into multiple probability vectors across a x dimension, a y dimension and a yaw dimension; providing each probability vector of the probability offset volume to a number of recurrent neural networks (RNNs)” (paragraph 0043). LU, W. specifies that multiple probability vectors across variable dimensions will be provided to a number of RNNs; therefore, the selection from the RNNs will be determined based on the dimensions of the vectors. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU to include a number of RNNs, with each RNN corresponding to a set of dimensions, to provide adaptability for the encoder and decoder.

Regarding claim 19, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU teach the transmitting wireless communication device of claim 9. HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU fail to explicitly disclose, “wherein the RNN bank comprises a plurality of RNNs, each RNN of the plurality of RNNs corresponding to a different dimension of a plurality of dimensions.” However, LU, W. teaches, in analogous art, wherein the RNN bank comprises a plurality of RNNs, each RNN of the plurality of RNNs corresponding to a different dimension of a plurality of dimensions. LU, W.
writes, “The method also includes compressing the probability offset volume into multiple probability vectors across a x dimension, a y dimension and a yaw dimension; providing each probability vector of the probability offset volume to a number of recurrent neural networks (RNNs)” (paragraph 0043). LU, W. denotes a number of RNNs and that multiple probability vectors across variable dimensions will be provided to a number of RNNs; therefore, the selection from the RNNs will be determined based on the dimensions of the vectors. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, and LU to include a number of RNNs, with each RNN corresponding to a set of dimensions, to provide adaptability for the encoder and decoder.

Claims 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and WU as applied to claim 12 above, and further in view of LU, W.

Regarding claim 13, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and WU teach the transmitting wireless communication device of claim 12. HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and WU fail to explicitly disclose, “wherein the output of the RNN bank comprises a state vector associated with a first time,”, “and wherein the output generator takes, as additional input, an output of a single-shot encoder associated with a second time,”, and “wherein the second time occurs after the first time.” However, LU, W. teaches, in analogous art, wherein the output of the RNN bank comprises a state vector associated with a first time, LU, W. writes, “...the probability vectors 831 can be provided as inputs to a number of RNNs 1109, 1111, and 1113 for temporal smoothness. Each RNN includes multiple long short term memory (LSTM) units.
Each of the probability vectors 1103, 1105 and 1108 can be provided as an input to one of the RNNs, which can generate a corresponding probability vector 1121, 1123 or 1125. A weighted sum 1127 of the corresponding probability vectors 121, 1123 and 1125 can be computed, and used in conjunction with the original probability vectors 1103, 1105 and 1107 to obtain an estimated offset 1117” (paragraph 0116). LU, W. indicates that the outputs of the RNNs are corresponding probability vectors. and wherein the output generator takes, as additional input, an output of a single-shot encoder associated with a second time, LU, W. writes, “...the probability vectors 831 can be provided as inputs to a number of RNNs 1109, 1111, and 1113 for temporal smoothness. Each RNN includes multiple long short term memory (LSTM) units. Each of the probability vectors 1103, 1105 and 1108 can be provided as an input to one of the RNNs, which can generate a corresponding probability vector 1121, 1123 or 1125. A weighted sum 1127 of the corresponding probability vectors 121, 1123 and 1125 can be computed, and used in conjunction with the original probability vectors 1103, 1105 and 1107 to obtain an estimated offset 1117” (paragraph 0116). LU, W. suggests the output of the RNNs can further be used as inputs. wherein the second time occurs after the first time. LU, W. states, “The exemplary implementation uses recurrent neural networks (RNNs) to achieve similar temporal smoothness. To be more specific, LSTM units are used. Each of the probability vectors 1209 for the dimensions (x, y, yaw) from a probability offset volume described above can be treated as the input of each parameter independent RNNs unit. Through learning of historical information by RNNs, the trajectory of localization results would be smoother and more accurate” (paragraph 0120). As LU, W. indicates, LSTM units are used.
As LU stated previously, “The LSTM network is usually used for sequence modeling, due to its ability to capture correlation [8]. This can be verified through Eq. (2): ct is determined by its previous state ct−1 and the present input it” (page 190, column 1, paragraph 1; figure 2; equation 2). LU illustrates in figure 2 the computations involved in the LSTM, and further explains that the output relies on its previous state and current input. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and WU to use historical information from the RNNs to make the neural networks' learning smoother and more accurate.

Regarding claim 14, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, WU, and LU, W. teach the transmitting wireless communication device of claim 13. Additionally, WU teaches wherein the output generator (paragraph 0113; figure 3B: Fully connected layer) comprises: a first fully connected layer that produces a first output having a first number of dimensions; WU writes, “The fully connected layer includes a first fully connected layer and a second fully connected layer, and the first fully connected layer and the first definition feature vector have the same dimension, for example, 512. The dimension of the second fully connected layer is 1” (paragraph 0113). a rectified linear unit (ReLU) activation layer that receives the first output and produces a second output having the first number of dimensions; WU states, “In this embodiment, the first fully connected layer and the second fully connected layer are connected through an activation layer. An activation function of the activation layer may be a non-linear activation function, for example, a rectified linear unit (ReLU) function” (paragraph 0114).
and a second fully connected layer that receives the second output and produces a third output having a second number of dimensions that is less than the first number of dimensions. WU writes, “The fully connected layer includes a first fully connected layer and a second fully connected layer, and the first fully connected layer and the first definition feature vector have the same dimension, for example, 512. The dimension of the second fully connected layer is 1” (paragraph 0113).

Regarding claim 15, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and WU teach the transmitting wireless communication device of claim 12. HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and WU fail to explicitly disclose, “wherein an input of the RNN bank comprises a state vector associated with a first time,”, “wherein the output of the RNN bank comprises a state vector associated with a second time,”, “and wherein the output generator takes, as additional input, an output of a single-shot encoder associated with the second time,”, and “wherein the second time occurs after the first time.” However, LU, W. teaches, in analogous art, wherein an input of the RNN bank comprises a state vector associated with a first time, LU, W. writes, “...the probability vectors 831 can be provided as inputs to a number of RNNs 1109, 1111, and 1113 for temporal smoothness. Each RNN includes multiple long short term memory (LSTM) units. Each of the probability vectors 1103, 1105 and 1108 can be provided as an input to one of the RNNs, which can generate a corresponding probability vector 1121, 1123 or 1125. A weighted sum 1127 of the corresponding probability vectors 121, 1123 and 1125 can be computed, and used in conjunction with the original probability vectors 1103, 1105 and 1107 to obtain an estimated offset 1117” (paragraph 0116). LU, W. indicates that probability vectors can be provided as inputs to a number of RNNs.
wherein the output of the RNN bank comprises a state vector associated with a second time, LU, W. states, “...the probability vectors 831 can be provided as inputs to a number of RNNs 1109, 1111, and 1113 for temporal smoothness. Each RNN includes multiple long short term memory (LSTM) units. Each of the probability vectors 1103, 1105 and 1108 can be provided as an input to one of the RNNs, which can generate a corresponding probability vector 1121, 1123 or 1125. A weighted sum 1127 of the corresponding probability vectors 121, 1123 and 1125 can be computed, and used in conjunction with the original probability vectors 1103, 1105 and 1107 to obtain an estimated offset 1117” (paragraph 0116). LU, W. indicates the RNNs can generate a corresponding probability vector. and wherein the output generator takes, as additional input, an output of a single-shot encoder associated with the second time, LU, W. writes, “...the probability vectors 831 can be provided as inputs to a number of RNNs 1109, 1111, and 1113 for temporal smoothness. Each RNN includes multiple long short term memory (LSTM) units. Each of the probability vectors 1103, 1105 and 1108 can be provided as an input to one of the RNNs, which can generate a corresponding probability vector 1121, 1123 or 1125. A weighted sum 1127 of the corresponding probability vectors 121, 1123 and 1125 can be computed, and used in conjunction with the original probability vectors 1103, 1105 and 1107 to obtain an estimated offset 1117” (paragraph 0116). LU, W. suggests that the corresponding probability vectors can be computed and used in conjunction with the original probability vectors to obtain an estimated offset. The outputs obtained can further be used as inputs to the output generator. wherein the second time occurs after the first time. LU, W. states, “The exemplary implementation uses recurrent neural networks (RNNs) to achieve similar temporal smoothness. To be more specific, LSTM units are used.
Each of the probability vectors 1209 for the dimensions (x, y, yaw) from a probability offset volume described above can be treated as the input of each parameter independent RNNs unit. Through learning of historical information by RNNs, the trajectory of localization results would be smoother and more accurate” (paragraph 0120). LU, W. indicates that the implementation uses RNNs, specifically LSTM units, that the probability vectors can be treated as the input at a first time, and that the historical information produced by the RNN corresponds to a second time. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and WU to use historical information from the RNNs to make the neural networks' learning smoother and more accurate.

Claims 25-26 and 28 are rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and LU, W. as applied to claim 24 above, and further in view of WU.

Regarding claim 25, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and LU, W. teach the receiving wireless communication device of claim 24. Additionally, LU teaches wherein the RNN bank produces a first output having a first number of dimensions, LU writes, “In PR-RecCsiNet, the input and output of LSTM already have the same size, which allows us to link them together directly” (page 190, column 1, paragraph 3). The input to the LSTM has the same size as its output. and a second fully connected layer that receives the third output and produces a fourth output having a second number of dimensions that is greater than the first number of dimensions. LU states, “We use a linear FCN to project M-dimensional input to N-dimensional output…” (page 190, column 1, paragraph 2).
LU adds, “The input size of the compression module is N while the output size is M, typically N > M” (page 189, column 2, paragraph 3; figure 1). LU indicates the output of the FCN is N-dimensional and the input to the FCN is M-dimensional, with N greater than M. HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, and LU, W. fail to explicitly disclose, “and wherein the output generator comprises:”, “a first fully connected layer that receives the first output and produces a second output having the first number of dimensions;”, “a first middle layer that receives the second output and produces a third output having the first number of dimensions,”, and “wherein the first middle layer comprises at least one of a batch normalization (BN) layer or a rectified linear unit (ReLU) layer;” However, WU teaches, in analogous art, and wherein the output generator (paragraph 0113; figure 3B: Fully connected layer) comprises: a first fully connected layer that receives the first output and produces a second output having the first number of dimensions; WU writes, “The fully connected layer includes a first fully connected layer and a second fully connected layer, and the first fully connected layer and the first definition feature vector have the same dimension...” (paragraph 0113). a first middle layer that receives the second output and produces a third output having the first number of dimensions, WU states, “In this embodiment, the first fully connected layer and the second fully connected layer are connected through an activation layer” (paragraph 0114). wherein the first middle layer comprises at least one of a batch normalization (BN) layer or a rectified linear unit (ReLU) layer; WU writes, “An activation function of the activation layer may be a non-linear activation function, for example, a rectified linear unit (ReLU) function” (paragraph 0114).
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate into the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, LU, and LU, W. first and second fully connected layers with an activation layer between them to introduce non-linearity so that the neural network can learn.

Regarding claim 26, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, LU, W., and WU teach the receiving wireless communication device of claim 25. Additionally, LU teaches wherein the temporal processing block comprises: LU writes, “The compression and uncopression modules are proposed using the long short-term memory (LSTM) network [8], which has the memory function and thus can capture and extract inherent correlations, e.g. temporal correlations within input sequences" (page 189, column 2, paragraph 3; figure 2). LU specifies that the LSTM network can catch the temporal channel correlation, and illustrates the LSTM network in figure 2. and a fourth fully connected layer that receives the sixth output and produces a seventh output having the first number of dimensions. LU states, “We use a linear FCN to project M-dimensional input to N-dimensional output…” (page 190, column 1, paragraph 2). LU includes, “The input size of the compression module is N while the output size is M, typically N > M” (page 189, column 2, paragraph 3; figure 1). LU indicates the fully connected layer produces an N-dimensional output corresponding to the dimensions of the data before being encoded. Additionally, WU teaches a third fully connected layer that receives the encoded data set and produces a fifth output having the first number of dimensions; WU writes, “The fully connected layer includes a first fully connected layer and a second fully connected layer...” (paragraph 0113).
WU adds, “The foregoing embodiments only show several implementations of the present disclosure, and descriptions thereof are in detail, but are not to be understood as a limitation to the patent scope of the present disclosure” (paragraph 0193). WU includes, “In this embodiment, the first fully connected layer and the second fully connected layer are connected through an activation layer. An activation function of the activation layer may be a non-linear activation function, for example, a rectified linear unit (ReLU) function” (paragraph 0114). WU specifies that the layers are not limited to a first and a second, and that two fully connected layers may be joined together by a middle layer. LU indicated earlier that the fully connected layer produces N-dimensional output corresponding to the dimensions of the data before being encoded. Therefore, the first fully connected layer can produce N-dimensional output and the second fully connected layer can receive N-dimensional input. a second middle layer that receives the fifth output and produces a sixth output having the first number of dimensions, wherein the second middle layer comprises at least one of a BN layer or a ReLU layer; WU writes, “In this embodiment, the first fully connected layer and the second fully connected layer are connected through an activation layer. An activation function of the activation layer may be a non-linear activation function, for example, a rectified linear unit (ReLU) function” (paragraph 0114). WU adds, “The foregoing embodiments only show several implementations of the present disclosure, and descriptions thereof are in detail, but are not to be understood as a limitation to the patent scope of the present disclosure” (paragraph 0193).

Regarding claim 28, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, LU, W., and WU teach the transmitting wireless communication device of claim 25. Additionally, LU, W.
teaches wherein the RNN bank is configured to select one or more dimensions of a set of dimensions for an input to have based at least in part on a correlation between the one or more dimensions and at least one additional dimension of the set of dimensions. LU, W. writes, “The method also includes compressing the probability offset volume into multiple probability vectors across a x dimension, a y dimension and a yaw dimension; providing each probability vector of the probability offset volume to a number of recurrent neural networks (RNNs)” (paragraph 0043). LU, W. specifies that multiple probability vectors across variable dimensions will be provided to a number of RNNs, therefore the selection from the RNNs will be determined based on the dimensions of the vectors. Claim(s) 16-17 and 27 is/are rejected under 35 U.S.C. 103 as being unpatentable over HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, WU, and LU W. as applied to claim15 and 26 above, and further in view of WEN. Regarding claim 16, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, WU, and LU, W. teach the transmitting wireless communication device of claim 15, Additionally, WU teaches wherein the output generator (paragraph 0113; figure 3B: Fully connected layer) comprises: a first fully connected layer that produces a first output having a first number of dimensions; WU writes, “The fully connected layer includes a first fully connected layer and a second fully connected layer, and the first fully connected layer and the first definition feature vector have the same dimension, for example, 512. The dimension of the second fully connected layer is 1” (paragraph 0113). and rectified linear unit (ReLU) activation layer that receives the first output and produces a second output having the first number of dimensions; WU states, “In this embodiment, the first fully connected layer and the second fully connected layer are connected through an activation layer. 
An activation function of the activation layer may be a non-linear activation function, for example, a rectified linear unit (ReLU) function” (paragraph 0114). WU indicates the use of a ReLU activation layer between the first and second fully connected layers. and a second fully connected layer that receives the second output and produces a third output having a second number of dimensions that is less than the first number of dimensions. WU writes, “The fully connected layer includes a first fully connected layer and a second fully connected layer, and the first fully connected layer and the first definition feature vector have the same dimension, for example, 512. The dimension of the second fully connected layer is 1” (paragraph 0113). HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, WU, and LU, W. fail to explicitly disclose, “a first batch normalization (BN)” However, WEN teaches, in analogous art, a first batch normalization (BN) WEN writes, “The rectified linear unit (ReLU), ReLU(x) = max(x, 0), is used as the activation function, and we introduce batch normalization to each layer” (page 750, column 1, paragraph 1). WEN indicates that BN is introduced to each layer, including a first BN. It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the invention of HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, WU, and LU, W. to incorporate the batch normalization technique into a neural network to improve speed and stability.

Regarding claim 17, HAJIMIRSADEGHI, MURUGAN, SHAZEER, WANG, LU, WU, LU, W., and WEN teach the transmitting wireless communication device of claim 16. Additionally, WEN teaches wherein the output generator further comprises a second BN layer that receives the third output and produces a fourth output having the second number of dimensions.
WEN writes, “The rectified linear unit (ReLU), ReLU(x) = max(x, 0), is used as the activation function, and we introduce batch normalization to each layer” (page 750, column 1, paragraph 1). WEN indicates that BN is introduced to each layer, including a second BN layer. Since BN scales each dimension of the input it receives from the previous layer, and the third output has the second number of dimensions, the second BN layer would have an output of the second number of dimensions.

Claim 27 is an apparatus claim corresponding to apparatus claim 17, which has already been rejected above. The applicant's attention is directed to the rejection of claim 17. Claim 27 is rejected under the same rationale as claim 17. Claims 4-5 and 22-23 have been canceled by the applicant.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHRISTOPHER A REYES whose telephone number is (703)756-4558. The examiner can normally be reached Monday - Friday 8:30 - 5:00 EDT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, KHALED KASSIM, can be reached at (571) 270-3770. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Christopher A. Reyes/
Examiner, Art Unit 2475
1/25/2026

/KHALED M KASSIM/
Supervisory Patent Examiner, Art Unit 2475
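Editorial illustration (not part of the prosecution record): the output-generator structure that the rejection maps across claims 12-17 and 25-26 is a fully connected layer, a ReLU activation, and a second fully connected layer that projects an N-dimensional input down to M dimensions (N > M, per LU's compression module). The sketch below illustrates only that layer arrangement; the weights, dimensions, and function names are placeholders, not taken from any cited reference.

```python
# Illustrative sketch of an FC -> ReLU -> FC output generator with
# dimensionality reduction (N -> M, N > M). Placeholder weights only.
import random

def fully_connected(x, weights, bias):
    """Dense layer: y_j = sum_i(x_i * W[j][i]) + b_j."""
    return [sum(xi * wj for xi, wj in zip(x, row)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    """Rectified linear unit, applied element-wise."""
    return [max(v, 0.0) for v in x]

def make_layer(n_in, n_out, rng):
    """Random placeholder weights standing in for a trained layer."""
    return ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

def output_generator(x, n, m, seed=0):
    """First FC layer keeps dimension n; second FC layer projects n -> m (m < n)."""
    rng = random.Random(seed)
    w1, b1 = make_layer(n, n, rng)  # first fully connected layer
    w2, b2 = make_layer(n, m, rng)  # second fully connected layer (compression)
    return fully_connected(relu(fully_connected(x, w1, b1)), w2, b2)

rnn_bank_output = [0.5] * 8              # stand-in for an RNN-bank state vector, N = 8
encoded = output_generator(rnn_bank_output, n=8, m=2)
print(len(encoded))                      # M = 2, the compressed (encoded) dimension
```

A batch normalization layer, as taught by WEN, would slot in before or after the ReLU without changing the dimensions flowing through the stack.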

Prosecution Timeline

Mar 05, 2021: Application Filed
Dec 03, 2024: Non-Final Rejection (§103)
Jan 21, 2025: Interview Requested
Jan 29, 2025: Examiner Interview Summary
Jan 29, 2025: Applicant Interview (Telephonic)
Mar 04, 2025: Response Filed
Apr 24, 2025: Final Rejection (§103)
May 21, 2025: Interview Requested
Jun 04, 2025: Examiner Interview Summary
Jun 04, 2025: Applicant Interview (Telephonic)
Jun 18, 2025: Response after Non-Final Action
Jul 31, 2025: Request for Continued Examination
Aug 05, 2025: Response after Non-Final Action
Aug 20, 2025: Non-Final Rejection (§103)
Oct 01, 2025: Interview Requested
Oct 23, 2025: Examiner Interview Summary
Oct 23, 2025: Applicant Interview (Telephonic)
Oct 31, 2025: Response Filed
Jan 25, 2026: Final Rejection (§103)
Mar 03, 2026: Interview Requested
Mar 12, 2026: Applicant Interview (Telephonic)
Mar 12, 2026: Examiner Interview Summary

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598621: Device and Method for Handling a Multi-cell Scheduling (2y 5m to grant; granted Apr 07, 2026)
Patent 12593337: RESOURCE DETERMINATION METHOD AND APPARATUS, DEVICES, AND STORAGE MEDIUM (2y 5m to grant; granted Mar 31, 2026)
Patent 12457249: STORAGE MEDIUM TO STORE TRANSMISSION DATA SETTING SUPPORT PROGRAM, GATEWAY DEVICE, AND TRANSMISSION DATA SETTING SUPPORTING METHOD (2y 5m to grant; granted Oct 28, 2025)
Patent 12294868: Method Of Building Ad-Hoc Network Of Wireless Relay Node And Ad-Hoc Network System (2y 5m to grant; granted May 06, 2025)
Based on 4 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 88%
With Interview: 81% (-6.3%)
Median Time to Grant: 2y 11m
PTA Risk: High
Based on 8 resolved cases by this examiner. Grant probability derived from career allow rate.
