Prosecution Insights
Last updated: April 19, 2026
Application No. 18/737,906

LATENT TRANSFORMER CORE FOR A LARGE CODEWORD MODEL

Non-Final OA — §103, §112
Filed: Jun 07, 2024
Examiner: GODO, MORIAM MOSUNMOLA
Art Unit: 2148
Tech Center: 2100 — Computer Architecture & Software
Assignee: AtomBeam Technologies Inc.
OA Round: 5 (Non-Final)
Grant Probability: 44% (Moderate)
Expected OA Rounds: 5-6
Time to Grant: 4y 8m
With Interview: 78%

Examiner Intelligence

Career Allow Rate: 44% (30 granted / 68 resolved; -10.9% vs TC avg)
Interview Lift: +33.4% for resolved cases with interview (strong lift)
Typical Timeline: 4y 8m avg prosecution; 47 currently pending
Career History: 115 total applications across all art units
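
A quick sanity check of how these headline figures relate. This is a minimal sketch; treating the interview lift as additive in percentage points is an assumption, since the tool does not state its exact model:

```python
# Hypothetical reconstruction of the examiner stats shown above.
granted, resolved = 30, 68              # "30 granted / 68 resolved"
allow_rate = granted / resolved         # ~0.441 -> displayed as 44%

interview_lift = 0.334                  # "+33.4% Interview Lift"
# Assumption: the lift is added to the base rate in percentage points.
with_interview = allow_rate + interview_lift

print(f"Career allow rate: {allow_rate:.1%}")                      # 44.1%
print(f"Grant probability with interview: {with_interview:.1%}")   # 77.5% (~78%)
```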

Statute-Specific Performance

§101: 16.1% (-23.9% vs TC avg)
§103: 56.7% (+16.7% vs TC avg)
§102: 12.7% (-27.3% vs TC avg)
§112: 12.9% (-27.1% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 68 resolved cases
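
Reading the deltas as simple percentage-point differences from the Tech Center baseline (an assumed interpretation), every row implies the same ~40% TC average, which matches the chart's black reference line:

```python
# Allowance rate after each rejection type, paired with the displayed "vs TC avg" delta.
rows = {
    "§101": (16.1, -23.9),
    "§103": (56.7, +16.7),
    "§102": (12.7, -27.3),
    "§112": (12.9, -27.1),
}
for statute, (rate_pct, delta_pct) in rows.items():
    implied_tc_avg = rate_pct - delta_pct   # percentage points
    print(f"{statute}: implied TC average = {implied_tc_avg:.1f}%")
# Every statute implies a 40.0% Tech Center average estimate.
```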

Office Action

§103 §112
DETAILED ACTION

1. This office action is in response to Application No. 18737906 filed on 07/24/2025. Claims 1-18 are presented for examination and are currently pending.

Notice of Pre-AIA or AIA Status

2. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

3. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 07/24/2025 has been entered.

Response to Arguments

4. Upon review of the claim amendment filed 07/24/2025, the 112 rejection has been withdrawn. However, a new 112 rejection has been added.

On page 6 of the remarks, the Applicant stated that "The amended limitation referencing the dimensionality of latent space vectors and transformer inputs/outputs is expressly supported throughout the disclosure, including at least [0116], [0119], [0131], and [0139]". It is noted that the cited paragraphs [0116], [0119], [0131], and [0139] that the Applicant has referenced have no support for the limitation "wherein the transformer comprises a plurality of multihead attention blocks each having a second dimensionality equal to the first dimensionality". As a result, a new 112 rejection has been added to the rejection.

On page 7 of the remarks, the Applicant argued that "Claim 1 specifically requires a latent transformer core for large codeword models comprising a variational autoencoder's encoder and decoder that process latent space vectors. Ye teaches a video prediction system using CNN encoder/decoder architectures that process visual feature vectors, while Im teaches BERT-based masked token prediction. Neither reference teaches the claimed Large Codeword Model architecture, and the combination fails to cure these fundamental deficiencies".

On page 7 of the remarks, the Applicant argued that "Ye discloses video prediction where separate future frames are generated from past frames, but fails to teach output vectors that extend corresponding input vectors. Ye's future frames are entirely distinct outputs based on past frames, not extensions of input frames containing the original input data plus new elements". The argument above is not persuasive because Im teaches a deep learning system with a latent transformer core (Among deep learning language models, the BERT model currently shows good performance in the word prediction field. The BERT model is a model designed using an encoder part of a transformer structure [0063]) for large codeword models (When an adjusted facial image 1 is input, encoding is performed on the basis of a codebook (S143) such that the image is changed to consecutive codebook indices (words) [0068]. The Examiner notes a codebook consists of codewords). However, Ye as a secondary reference is still very relevant as prior art because Ye teaches a deep learning system that comprises an encoder which provides output received by a transformer, and the transformer provides an output to the decoder which reconstructs the data.
On page 7 of the remarks, the Applicant argued that "Im, meanwhile, addresses a different problem (masked token prediction in BERT for facial image generation), and does not remedy the deficiencies of Ye. Im fills in masked portions of existing vectors but does not generate extended vectors with new elements beyond the original vector boundaries". It is noted that the argument has been considered but is moot because the limitation has been remapped in light of the new primary reference.

On page 7 of the remarks, the Applicant argued that "Claim 1 requires a variational autoencoder's encoder and decoder, while Ye teaches CNN encoder/decoder architectures. These are fundamentally different—VAEs generate probabilistic latent space representations while CNNs generate deterministic visual feature maps. This architectural distinction is not merely semantic but represents different computational approaches with different capabilities and applications". It is noted that the arguments have been considered but are moot in light of the newly added reference Song in view of Ye in view of Im.

On page 7-8 of the remarks, the Applicant argued that "The Examiner appears to interpret "at least one new element extending the corresponding input vector" in a manner that is unreasonably broad. For Ye to meet this limitation under the Examiner's interpretation, it would mean that any separate output prediction qualifies as "extending" an input vector. This is analogous to claiming that predicting tomorrow's weather "extends" today's weather data, when in fact tomorrow's forecast is a separate prediction based on today's data, not an extension of today's data itself. Such an interpretation reads out the claim's structural specificity requiring output vectors to actually contain and extend their corresponding input vectors with additional elements, and would not be understood by a person of ordinary skill in the art". It is noted that the arguments have been considered but are moot because the newly added primary reference Song, in view of Ye in view of Im, now teaches claim 1.

On page 8 of the remarks, the Applicant argued that "Furthermore, the combination lacks a reasoned motivation supported by evidence. Ye teaches video prediction using CNN encoders/decoders with transformers for temporal modeling, while Im teaches BERT-based masked token prediction for facial image generation using codebooks. There is no clear rationale for why a skilled artisan would have been motivated to combine these disparate approaches—one focused on video temporal prediction and the other on static image token masking. The technologies address fundamentally different problems in different domains with different architectural requirements". It is noted that Song, the newly added primary reference, in view of Ye in view of Im now teaches claim 1. Additionally, the argument above is not persuasive because newly applied Song in view of Ye in view of Im are all analogous art. Song teaches a transformer for identifying the dependency between data (pg. 98116, right col., first para.). Similarly, Ye teaches using a transformer block for spatio-temporal feature learning (pg. 1, right col., third para.) and Im discloses a transformer architecture [0062]; then, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Modified Song to incorporate the teachings of Im for the benefit of generating an image generation model using a bidirectional encoder representations from transformers (BERT) model that covers some tokens with a mask among the codebook indices in the facial image training data 10′ represented with the codebook indices and predicts what are the tokens covered with the mask (Im [0062]).

On page 8 of the remarks, the Applicant argued that "In summary, the cited references fail to teach the claimed invention on multiple grounds: (1) neither reference discloses the specific Large Codeword Model architecture with latent transformer core; (2) the references teach fundamentally different encoder/decoder architectures (CNN vs. VAE); (3) the key limitation requiring output vectors that extend corresponding input vectors is not taught by either reference; and (4) there is no reasonable motivation to combine video prediction and masked token prediction technologies. The rejection therefore fails to establish a prima facie case of obviousness, and the §103 rejection should be withdrawn". The above argument is not persuasive for the same reasons argued above.

On page 8-9 of the remarks, the Applicant argued that "As Claims 1 and 7 have been shown to be patentable over the cited prior art, the dependent claims are likewise patentable at least by virtue of their dependency. Accordingly, Applicant respectfully requests that the §103 rejections of the dependent claims be withdrawn". The Examiner notes dependent claims which depend directly or indirectly from claims 1 and 7 are not patentable because the Applicant's arguments above are not persuasive for similar reasons argued above regarding claim 1.

On page 9-10 of the remarks, the Applicant argued that "Applicant notes that Claim 13 includes all of the structural and functional limitations discussed above in connection with Claim 1, including the use of a variational autoencoder's encoder and decoder to process latent space vectors and a transformer decoder that generates output vectors that extend corresponding input vectors. All substantive arguments presented above regarding the failure of Ye and Im to teach the claimed invention apply equally to Claim 13, and Kano does not cure these deficiencies. The substitution of computer-readable media for method or system implementation does not render the claimed invention obvious. Kanso is cited solely for disclosing generic language concerning non-transitory computer-readable storage media and deployment frameworks such as Kubernetes. However, Kanso is directed to infrastructure-level orchestration of containerized workloads and operator success prediction based on system configurations. It does not disclose or suggest the latent transformer architecture, variational autoencoder structure, codeword-based data representation, or the vector extension logic recited in the present claims. The Examiner's rationale for combining Kanso with Ye and Im lacks a reasoned motivation. No explanation is provided as to why a skilled artisan would apply container orchestration or configuration management systems to the specific latent transformer architecture claimed here, nor is there any indication that doing so would yield the claimed structure or achieve the claimed functionality. The proposed combination would require merging video prediction algorithms, facial image processing, and container deployment systems—technologies from entirely different domains with no apparent synergy. The rejection thus fails to provide the reasoned explanation".

On page 10 of the remarks, the Applicant argued that "In summary, the addition of Kanso to the Ye and Im combination does not cure the deficiencies identified with respect to the core limitations of Claim 13, which remain unmet. Kanso contributes only generic disclosure of computer-readable storage media and deployment infrastructure, without addressing the claimed latent transformer architecture, variational autoencoder pipeline, or the required functionality of generating output vectors that extend corresponding input vectors. Absent a teaching, suggestion, or motivation to combine these disparate systems in a manner that yields the claimed invention, the rejection fails to establish a prima facie case of obviousness. Accordingly, Applicant respectfully requests that the §103 rejection of Claim 13 be withdrawn". It is noted that Song in view of Ye in view of Im in view of Isik now teaches the limitations of claim 13. Kanso is no longer applied to claim 13; as a result, the arguments directed to Kanso are moot.

On page 10 of the remarks, the Applicant argued that "As Claim 13 has been shown to be patentable over the cited prior art, the dependent claims are likewise patentable at least by virtue of their dependency. Accordingly, Applicant respectfully requests that the §103 rejections of the dependent claims be withdrawn". The Examiner notes dependent claims which depend directly or indirectly from claim 13 are not patentable because the Applicant's arguments above are not persuasive for similar reasons argued above regarding claim 13.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

5. Claims 1-18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claims 1, 7 and 13 recite "wherein the variational autoencoder's decoder comprises a plurality of network layers of successively larger sizes, the first of which is of the first dimensionality". This limitation does not appear to have support in the instant specification. Claims 1, 7 and 13 recite "wherein the transformer comprises a plurality of multihead attention blocks each having a second dimensionality equal to the first dimensionality". This limitation does not appear to have support in the instant specification.

The limitations above do not appear to have any support in the Applicant's specification because the specification only recites "The output of the Feed Forward layer has the same dimensionality as the input embeddings" (instant specification, [0131]) and "When the Decoder's final hidden states are passed through a linear transformation, they are projected into a vector space with the same dimensionality as the size of the vocabulary. Each dimension in this space corresponds to a specific codeword in the vocabulary" (instant specification, [0139]). The specification also recites "Normalization helps in improving the convergence and stability of the learning process, as it ensures that all features or dimensions of the data contribute equally to the learning algorithm" (instant specification, [0052]). The above citation from the specification clearly states that the vector space has a dimension, which is different from the claimed variational autoencoder's decoder comprising a plurality of network layers of successively larger sizes, the first of which is of the first dimensionality. The claim recites that the network layers of the decoder have successively larger sizes and the first of those sizes has a first dimension, which is not the same as the vector space having a dimension. Furthermore, the vector space has the same dimensionality as the size of the vocabulary. This is not the same as the plurality of multihead attention blocks each having a second dimensionality equal to the first dimensionality. As a result, the specification does not have support for "wherein the variational autoencoder's decoder comprises a plurality of network layers of successively larger sizes, the first of which is of the first dimensionality" and "wherein the transformer comprises a plurality of multihead attention blocks each having a second dimensionality equal to the first dimensionality". Claims 2-6, 8-12 and 14-18 that are not specifically mentioned are rejected due to dependency.

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

6. Claims 1-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claims 1, 7 and 13 recite "the first", which lacks antecedent basis; it is not clear which first "the first" is referring to. Claims 1, 7 and 13 recite "the first of which is of the first width dimensionality"; it is unclear to what the limitation "the first" refers. Claims 2-6, 8-12 and 14-18 that are not specifically mentioned are rejected due to dependency.
Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

7. Claims 1 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Song et al. ("Anomaly VAE-Transformer: A Deep Learning Approach for Anomaly Detection in Decentralized Finance", VOLUME 11, 2023, date of publication 8 September 2023) in view of Ye et al. ("VPTR: Efficient Transformers for Video Prediction", 2022 26th International Conference on Pattern Recognition (ICPR) August 21-25, 2022) and further in view of Im et al. (US20230169709).

Regarding claim 1, Song teaches a deep learning system with a latent transformer core (FIGURE 3. Overall architecture of anomaly VAE-Transformer model, pg. 98122; In this paper, we propose an anomaly variational autoencoder (VAE)-Transformer, which is a new deep learning model for anomaly detection, pg. 98116) …, comprising one or more computers with executable instructions that, when executed, cause the deep learning system to (The experiments were conducted on a machine equipped with Intel Core i7-12700F CPU and NVIDIA GeForce RTX 3060 GPU and 32GB DDR4 RAM, pg. 98125, left col., first para.): receive a plurality of input vectors (For training the VAE, the VAE encoder receives dt = [equation image], pg. 98122, left col., first para., Fig. 3); generate a plurality of latent space vectors (and it encodes this data into low-dimensional embedding et [equation image], where et ∈ R^k and k represents the latent space dimension, pg. 98122, left col., first para., Fig. 3) each having a first dimensionality (dimension of latent space = 64, pg. 98124, right col., Table 2, Hyperparameters of proposed Anomaly VAE-Transformer; FIGURE 6. Training loss and validation loss of VAE (dimension of latent space = 64)) by processing the plurality of input vectors through a variational autoencoder's encoder (The VAE encoder encodes daily data (sequence of hourly data of 24 hours) among time series data, pg. 98121, right col., section B. ANOMALY VAE-TRANSFORMER), wherein the variational autoencoder's encoder comprises a plurality of network layers of successively smaller sizes (VAE Encoder in Fig. 3, pg. 98122. The Examiner notes the VAE Encoder in Fig. 3 comprises a plurality of network layers of successively smaller sizes, which is similar to VAE Encoder Subsystem 150 of the instant specification's Fig. 1C), the last of which outputs the plurality of latent space vectors (The VAE encoder encodes q nonoverlapping daily data separately, and delivers Et … [equation image], pg. 98122, right col., last para. to pg. 98123, left col., first para.); learn relationships between the plurality of latent space vectors by processing the plurality of latent space vectors through a transformer (The transformer receives Et as an input and generates q contextualized embeddings as a set of output Zt, pg. 98123, left col., first para., Fig. 3), wherein the transformer comprises a plurality of … attention blocks (The transformer … consists of three encoders which use stacked attention and feed-forward layers, pg. 98121, right col., section B. ANOMALY VAE-TRANSFORMER) each having a second dimensionality (dimension of latent space = 64, pg. 98124, right col., Table 2; FIGURE 6) equal to the first dimensionality (dimension of latent space = 64, pg. 98124, right col., Table 2; FIGURE 6) and receives successive ones of the plurality of latent space vectors as successive inputs (The VAE encoder encodes time series daily data into low-dimensional embedding and sends it to the transformer. The transformer receives the embedding sequence, pg. 98116, left col., first para.), wherein outputs of an encoder of the transformer (outputs for the Encoders of the Transformer in Fig. 3) … and generate output vectors by passing the plurality of output latent space vectors through the variational autoencoder's decoder (The VAE decoder receives the transformer output Zt as an input, and decodes q number of zti separately, pg. 98123, left col., third para.), wherein the variational autoencoder's decoder comprises a plurality of network layers of successively larger sizes (VAE Decoder in Fig. 3, pg. 98122. The Examiner notes that the VAE Decoder in Fig. 3 comprises a plurality of network layers of successively larger sizes, which is similar to the VAE Decoder Subsystem 180 of the instant specification's Fig. 1C), the first of which is of the first dimensionality (dimension of latent space = 64, pg. 98124, right col., Table 2; FIGURE 6); wherein each of the plurality of output vectors (the output is sent to a VAE decoder, which inputs the output of the transformer and reconstructs the daily data (pg. 98116, right col., first para.); plurality of output vectors O1, O2, O3 …, pg. 98123, Fig. 5. The Applicant discloses: "The machine learning training subsystem 600 is responsible for training the VAE decoder to accurately reconstruct or generate data from the latent space" (instant specification [0069])) comprises at least one new element extending the corresponding input vector (we proposed a new deep learning model, the anomaly VAE-Transformer model, which consists of a transformer that captures dependency between data in the long term, and VAE, which extracts local information in the short term, pg. 98129, right col., first para.).

Song does not explicitly teach wherein the transformer comprises a plurality of multihead attention blocks; outputs of an encoder of the transformer are provided as inputs to a decoder of the transformer; use the learned relationships between the plurality of latent space vectors to generate, using the decoder of the transformer, a plurality of output latent space vectors based on the plurality of input vectors; a latent transformer core for large codeword models.

Ye teaches wherein the transformer comprises a plurality of multihead attention blocks (plurality of multi-head self-attention (MHSA) blocks, VPTR-NAR Figure 2b, pg. 3495), outputs of an encoder of the transformer are provided as inputs to a decoder (e1 e2 … eL-1 eL output by TE is provided as inputs to TD Transformer decoder, Fig. 2b, pg. 3495) of the transformer ((b) VPTR-NAR. The left part is the Transformer encoder and right part is the non-autoregressive Transformer decoder, pg. 3495, Fig. 2); use the learned relationships between the plurality of latent space vectors to generate, using the decoder of the transformer (The decoder TD of VPTR-NAR, right part of Fig. 2(b), pg. 3495, left col., second para.), a plurality of output latent space vectors based on the plurality of input vectors (zL+1 zL+2 … zL+N-1 zL+N are generated by decoder TD of VPTR-NAR, Fig. 2, pg. 3495).

Since Song as primary reference desires inferring the generation factors of training data (pg. 98124, left col., second para.) using an encoder that comprises a plurality of network layers of successively smaller sizes (see Song, Fig. 3) and a decoder that comprises a plurality of network layers of successively larger sizes (see Song, Fig. 3), while Ye similarly teaches increasing the inference speed (abstract) using an encoder in Fig. 1 which comprises a plurality of network layers of successively smaller sizes and a decoder in Fig. 1 which comprises a plurality of network layers of successively larger sizes, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Song to incorporate the teachings of Ye for the benefit of increasing the inference speed (Ye, abstract).

Modified Song does not explicitly teach a latent transformer core for large codeword models. Im teaches a deep learning system with a latent transformer core (Among deep learning language models, the BERT model currently shows good performance in the word prediction field. The BERT model is a model designed using an encoder part of a transformer structure [0063]) for large codeword models (When an adjusted facial image 1 is input, encoding is performed on the basis of a codebook (S143) such that the image is changed to consecutive codebook indices (words) [0068]. The Examiner notes a codebook consists of codewords). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Modified Song to incorporate the teachings of Im for the benefit of generating an image generation model using a bidirectional encoder representations from transformers (BERT) model that covers some tokens with a mask among the codebook indices in the facial image training data 10′ represented with the codebook indices and predicts what are the tokens covered with the mask (Im [0062]).

Regarding claim 7, claim 7 is similar to claim 1. It is rejected in the same manner, with the same reasoning applying.

8. Claims 2-4 and 8-10 are rejected under 35 U.S.C. 103 as being unpatentable over Song et al. ("Anomaly VAE-Transformer: A Deep Learning Approach for Anomaly Detection in Decentralized Finance", VOLUME 11, 2023, date of publication 8 September 2023) in view of Ye et al. ("VPTR: Efficient Transformers for Video Prediction", 2022 26th International Conference on Pattern Recognition (ICPR) August 21-25, 2022) and further in view of Im et al. (US20230169709) and further in view of Saber et al. (US20230131694).
Regarding claim 2, Modified Song teaches the system of claim 1. Saber teaches wherein the input vectors may contain a plurality of appended zeros (The matrices input to the encoder may be reshaped to have a fixed size by appending zeros in a configuration that may be commonly understood between a UE and a gNB [0178]) and a plurality of truncated data points may be used to train (In any of the embodiments of frameworks disclosed herein, a quantizer function may be differentiable with a derivative value of essentially zero (e.g., with a probability of 1) throughout some or all of a quantizer range (e.g., essentially throughout the entire range) [0102]. The Examiner notes truncation is a type of quantization) and operationalize a transformer that predicts the next sequential vector following an input vector (Thus, a reconstructed input may be the input applied to the generation model, or an approximation, estimate, prediction, etc., of the input applied to the generation model [0028]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Modified Song to incorporate the teachings of Saber for the benefit of a generation model 303 which may be trained to generate a feature vector that may identify or separate one or more features (e.g., latent features) of the training data, which may reduce the overhead associated with storing and/or transmitting the representation 307 (Saber [0064]).

Regarding claim 3, Modified Song teaches the system of claim 1. Saber teaches wherein the input vectors may contain a plurality of appended metadata (Thus, a training set may originally include matrices of different sizes which may be modified by appending zeros as described above to convert the matrices to one fixed matrix size [0178]; CSI matrices from time ti-1 to time ti. The Examiner notes CSI matrices from time ti-1 to time ti relate to temporal information). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Modified Song to incorporate the teachings of Saber for the benefit of a generation model 303 which may be trained to generate a feature vector that may identify or separate one or more features (e.g., latent features) of the training data, which may reduce the overhead associated with storing and/or transmitting the representation 307 (Saber [0064]).

Regarding claim 4, Modified Song teaches the system of claim 3. Saber teaches wherein metadata comprises data type, temporal information, data source, data characteristics, and domain-specific metadata (Thus, a training set may originally include matrices of different sizes which may be modified by appending zeros as described above to convert the matrices to one fixed matrix size [0178]; CSI matrices from time ti-1 to time ti. The Examiner notes CSI matrices from time ti-1 to time ti relate to temporal information). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Modified Song to incorporate the teachings of Saber for the benefit of a generation model 303 which may be trained to generate a feature vector that may identify or separate one or more features (e.g., latent features) of the training data, which may reduce the overhead associated with storing and/or transmitting the representation 307 (Saber [0064]).

Regarding claim 8, claim 8 is similar to claim 2. It is rejected in the same manner, with the same reasoning applying. Regarding claim 9, claim 9 is similar to claim 3. It is rejected in the same manner, with the same reasoning applying. Regarding claim 10, claim 10 is similar to claim 4. It is rejected in the same manner, with the same reasoning applying.

9. Claims 5, 6, 11-13, 17 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Song et al. ("Anomaly VAE-Transformer: A Deep Learning Approach for Anomaly Detection in Decentralized Finance", VOLUME 11, 2023, date of publication 8 September 2023) in view of Ye et al. ("VPTR: Efficient Transformers for Video Prediction", 2022 26th International Conference on Pattern Recognition (ICPR) August 21-25, 2022) in view of Im et al. (US20230169709) and further in view of Isik et al. (US20240195438).

Regarding claim 5, Modified Song teaches the system of claim 1. Im teaches a plurality of codewords (The facial area generation operation may include a codebook training operation of training and generating a codebook to represent the plurality of pieces of facial image training data with block codebook indices [0010]. The Examiner notes a codebook consists of a plurality of codewords). The same motivation to combine as for claim 1 applies here. Modified Song does not explicitly teach wherein the plurality of input vectors comprises a plurality of codewords. Isik teaches wherein the plurality of input vectors comprises a plurality of codewords (input vector 105 to each codeword 115 a-n [0024]. The Examiner notes codeword 115 a-n is an input in Fig. 1). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Modified Song to incorporate the teachings of Isik for the benefit of using attention layers in Transformer networks [0024] to generate compressed data, where the compressed data includes an indication of the predetermined number of codewords of the plurality of codewords (Isik [0012]).

Regarding claim 6, Modified Song teaches the system of claim 5. Ye teaches converting data into the plurality of latent space vectors (where Enc denotes the CNN frame encoder. In short, we decode each future feature to be frame xˆt, and then encode the frame back into a latent feature, pg. 3494, right col., second to the last para.). Isik teaches wherein the plurality of codewords are converted into the plurality of latent space vectors (The distance vector obtained may then be converted to a probability distribution over the codebook for each input vector [0025]). The same motivation to combine as for claim 5 applies here.

Regarding claim 11, claim 11 is similar to claim 5. It is rejected in the same manner, with the same reasoning applying. Regarding claim 12, claim 12 is similar to claim 6. It is rejected in the same manner, with the same reasoning applying. Regarding claim 13, claim 13 is similar to claim 1. It is rejected in the same manner, with the same reasoning applying.
Song teaches computer-readable storage media having computer-executable instructions embodied thereon that, when executed by one or more processors of a computing system (The experiments were conducted on a machine equipped with Intel Core i7-12700F CPU and NVIDIA GeForce RTX 3060 GPU and 32GB DDR4 RAM, pg. 98125, left col., first para.). Modified Song does not explicitly teach non-transitory, computer-readable storage media having computer-executable instructions embodied thereon that, when executed by one or more processors of a computing system employing an asset registry platform. Isik teaches a non-transitory, computer-readable storage media having computer-executable instructions embodied thereon that, when executed by one or more processors of a computing system (processor(s) 802 is/are at least one hardware processor configured to execute various tasks, operations and/or functions for computing device 800 as described herein according to software and/or instructions configured for computing device 800 [0064]) employing an asset registry platform (Data/information being tracked and/or sent to one or more entities as discussed herein could be provided in any database, table, register, …, storage, and/or storage structure [0071]) for a latent transformer core for a Large Codeword Model (Instead of selecting a single codeword from the codebook to approximately represent the input vector, a weighted combination of multiple codewords is used [0022]; attention layers in Transformer networks [0024]; According to other examples, the disclosed techniques may be implemented in systems in which the same encoder and decoder are used [0030]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the method of Modified Song to incorporate the teachings of Isik for the benefit of using attention layers in Transformer networks [0024] to generate compressed data, where the compressed data includes an indication of the predetermined number of codewords of the plurality of codewords (Isik [0012]).

Regarding claim 17, claim 17 is similar to claim 5. It is rejected in the same manner, with the same reasoning applying. Regarding claim 18, claim 18 is similar to claim 6. It is rejected in the same manner, with the same reasoning applying.

10. Claims 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over Song et al. ("Anomaly VAE-Transformer: A Deep Learning Approach for Anomaly Detection in Decentralized Finance", VOLUME 11, 2023, date of publication 8 September 2023) in view of Ye et al. ("VPTR: Efficient Transformers for Video Prediction", 2022 26th International Conference on Pattern Recognition (ICPR) August 21-25, 2022) in view of Im et al. (US20230169709) in view of Isik et al. (US20240195438) and further in view of Saber et al. (US20230131694).

Regarding claim 14, claim 14 is similar to claim 2. It is rejected in the same manner, with the same reasoning applying. Regarding claim 15, claim 15 is similar to claim 3. It is rejected in the same manner, with the same reasoning applying. Regarding claim 16, claim 16 is similar to claim 4. It is rejected in the same manner, with the same reasoning applying.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MORIAM MOSUNMOLA GODO whose telephone number is (571)272-8670. The examiner can normally be reached Monday-Friday 8:00am-5:00pm EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Michelle T. Bechtold, can be reached at (571) 431-0762. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/M.G./ Examiner, Art Unit 2148
/MICHELLE T BECHTOLD/ Supervisory Patent Examiner, Art Unit 2148
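
Both the §112 written-description dispute and the §103 mapping turn on dimensional relationships: latent space vectors of a "first dimensionality" produced by the last (smallest) encoder layer, multihead attention blocks whose model dimensionality equals that first dimensionality, and a decoder whose first (smallest) layer is also of that dimensionality. The sketch below is an editorial illustration of that arrangement in PyTorch, using Song's latent dimension of 64; it is not the applicant's disclosed implementation or any cited reference's code, the layer sizes are invented for illustration, and the probabilistic VAE sampling step is omitted.

```python
import torch
import torch.nn as nn

LATENT_DIM = 64  # the "first dimensionality"; Song's Table 2 uses a latent space of 64

class LatentTransformerCore(nn.Module):
    """Illustrative VAE-encoder -> transformer -> VAE-decoder pipeline (sketch only)."""

    def __init__(self, input_dim: int = 512, latent_dim: int = LATENT_DIM):
        super().__init__()
        # Encoder: network layers of successively smaller sizes; the last layer
        # outputs latent space vectors of the first dimensionality.
        # (A true VAE would emit mu/logvar and sample; omitted here for brevity.)
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Transformer: multihead attention blocks whose model ("second")
        # dimensionality equals the latent ("first") dimensionality, with the
        # transformer encoder's outputs provided as inputs to its decoder.
        self.transformer = nn.Transformer(
            d_model=latent_dim, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        # Decoder: network layers of successively larger sizes; the first layer
        # is of the first dimensionality.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, input_dim),
        )

    def forward(self, input_vectors: torch.Tensor) -> torch.Tensor:
        latents = self.encoder(input_vectors)            # (batch, seq, LATENT_DIM)
        contextual = self.transformer(latents, latents)  # relationships between latent vectors
        return self.decoder(contextual)                  # output vectors in the input space


core = LatentTransformerCore()
outputs = core(torch.randn(2, 8, 512))  # two sequences of eight 512-dim input vectors
print(outputs.shape)                    # torch.Size([2, 8, 512])
```

As the §112 discussion above makes clear, whether the specification actually supports these dimensionality equalities is the point in dispute; the sketch only shows what the claim language describes.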

Prosecution Timeline

Jun 07, 2024
Application Filed
Aug 09, 2024
Non-Final Rejection — §103, §112
Sep 24, 2024
Response Filed
Oct 04, 2024
Final Rejection — §103, §112
Oct 24, 2024
Request for Continued Examination
Oct 28, 2024
Response after Non-Final Action
Dec 03, 2024
Non-Final Rejection — §103, §112
Jan 07, 2025
Response Filed
Apr 15, 2025
Final Rejection — §103, §112
Jul 24, 2025
Request for Continued Examination
Jul 29, 2025
Response after Non-Final Action
Dec 22, 2025
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602586
SUPERVISORY NEURON FOR CONTINUOUSLY ADAPTIVE NEURAL NETWORK
Granted Apr 14, 2026 (2y 5m to grant)
Patent 12530583
VOLUME PRESERVING ARTIFICIAL NEURAL NETWORK AND SYSTEM AND METHOD FOR BUILDING A VOLUME PRESERVING TRAINABLE ARTIFICIAL NEURAL NETWORK
Granted Jan 20, 2026 (2y 5m to grant)
Patent 12511528
NEURAL NETWORK METHOD AND APPARATUS
Granted Dec 30, 2025 (2y 5m to grant)
Patent 12367381
CHAINED NEURAL ENGINE WRITE-BACK ARCHITECTURE
Granted Jul 22, 2025 (2y 5m to grant)
Patent 12314847
TRAINING OF MACHINE READING AND COMPREHENSION SYSTEMS
Granted May 27, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 44%
With Interview: 78% (+33.4%)
Median Time to Grant: 4y 8m
PTA Risk: High
Based on 68 resolved cases by this examiner. Grant probability derived from career allow rate.
