Prosecution Insights
Last updated: April 19, 2026

Application No. 17/588,066
SYSTEMS AND METHODS FOR LEARNING RICH NEAREST NEIGHBOR REPRESENTATIONS FROM SELF-SUPERVISED ENSEMBLES
Status: Non-Final OA (§101, §103, §112)

Filed: Jan 28, 2022
Examiner: SIPPEL, MOLLY CLARKE
Art Unit: 2122
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Salesforce.com, Inc.
OA Round: 3 (Non-Final)

Grant Probability: 50% (Moderate)
OA Rounds: 3-4
To Grant: 3y 7m
With Interview: 99%
Examiner Intelligence

Grants 50% of resolved cases.
Career Allow Rate: 50% (7 granted / 14 resolved; -5.0% vs TC avg)
Interview Lift: +58.3% (strong)
Avg Prosecution: 3y 7m (typical timeline)
Currently Pending: 25
Total Applications: 39 (career history, across all art units)

Statute-Specific Performance

§101: 33.8% (-6.2% vs TC avg)
§103: 32.0% (-8.0% vs TC avg)
§102: 9.8% (-30.2% vs TC avg)
§112: 23.6% (-16.4% vs TC avg)

Comparisons are against the Tech Center average estimate. Based on career data from 14 resolved cases.

Office Action

Rejections under §101, §103, and §112
DETAILED ACTION

This action is responsive to the amendment filed 02/12/2026. Claims 1-3, 5-9, 11-15, and 17-22 are pending in the case. Claims 1, 7-8, 13, and 20 are currently amended. Claims 1, 8, and 13 are independent claims.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

Acknowledgement is made of applicant’s claim for domestic priority based on a provisional application filed on 10/05/2021.

Specification

The disclosure is objected to because it contains an embedded hyperlink and/or other form of browser-executable code. Applicant is required to delete the embedded hyperlink and/or other form of browser-executable code; references to websites should be limited to the top-level domain name without any prefix such as http:// or other browser-executable code. See MPEP § 608.01.

Claim Objections

Claim 13 is objected to because of the following informalities: In claim 13, line 4, “the plurality MLPs” should read “the plurality of MLPs” as it appears to be a typographical error. In claim 13, line 4, “for image feature” should read “for image feature extraction” as it appears to be a typographical error. Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 7 and 20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claim 7 recites the limitation “the plurality of trained MLPs” in line 3. There is insufficient antecedent basis for this limitation in the claim. The parent claim recites “a plurality of multi-layer perceptrons (MLPs)” and “trained MLPs”. It is unclear which limitation applicant is attempting to refer to. For examination purposes, this limitation has been interpreted to mean “the trained MLPs”, referring to the second previously recited claim element.

Claim 20 recites the limitation “the plurality of trained MLPs” on line 3. There is insufficient antecedent basis for this limitation in the claim. The parent claim recites “a plurality of multi-layer perceptrons (MLPs)” and “trained MLP”. It is unclear which limitation applicant is attempting to refer to. For examination purposes, this limitation has been interpreted to mean “the plurality of MLPs”, referring to the previously recited claim element.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-3, 5-9, 11-15, and 19-22 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1:

Step 1 - Statutory Category: Claim 1 is directed to a method, which falls within one of the four statutory categories.
Step 2A Prong 1 - Judicial exception: Claim 1 recites, in part, “determining, in response to an image sample from the dataset, a set of feature vectors via the plurality of pre-trained feature extractors, respectively” and “computing a loss objective between the set of feature vectors and the plurality of mapped representations”. These limitations, under the broadest reasonable interpretation, cover the recitation of mathematical calculations, which falls within the “mathematical concepts” grouping of abstract ideas.

Additionally, the limitation “mapping…the memory bank vector into a plurality of mapped representations, each corresponding to a respective one of the set of feature vectors”, under the broadest reasonable interpretation, covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgement, or opinion), in this case evaluation. See MPEP § 2106.04(a)(2)(III).

Further, the claim recites: “determining … an output feature representation in response to an input image sample”. This limitation, under the broadest reasonable interpretation, covers the recitation of mathematical calculations, which falls within the “mathematical concepts” grouping of abstract ideas. See MPEP § 2106.04(a)(2)(I)(C).

Step 2A Prong 2 - Integration into a practical application: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “a plurality of multi-layer perceptrons (MLPs)”, “a plurality of pre-trained feature extractors”, “via the plurality of MLPs”, and “via trained MLPs”, which generally link the use of the judicial exception to a particular technological environment or field of use; see MPEP § 2106.05(h).

Further, “via a communication interface” is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f).

Further, “receiving, …, a dataset of a plurality of data samples that include one or more image samples”, “retrieving a memory bank vector that is initialized to represent a feature vector of the image sample from the dataset”, and “receiving, …, the memory bank vector as an input to each of the plurality of MLPs” are additional elements that amount to mere data gathering and, as such, are considered insignificant extra-solution activity to the judicial exception; see MPEP § 2106.05(g).

Further, “training the plurality of MLPs and the memory bank vector based on the computed loss objective” is a post-solution step that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process; see MPEP § 2106.05(f). Alternatively, this limitation is a post-solution step that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g).

Step 2B - Significantly more: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “a plurality of multi-layer perceptrons (MLPs)”, “a plurality of pre-trained feature extractors”, “via the plurality of MLPs”, and “via trained MLPs” generally link the use of the judicial exception to a particular technological environment or field of use. Merely generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept.

Further, the additional element “via a communication interface” amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Such elements cannot provide an inventive concept.

Further, the elements “receiving, …, a dataset of a plurality of data samples that include one or more image samples”, “retrieving a memory bank vector that is initialized to represent a feature vector of the image sample from the dataset”, and “receiving, …, the memory bank vector as an input to each of the plurality of MLPs” are insignificant extra-solution activity directed to receiving or transmitting data over a network, which the courts have recognized as well-understood, routine, and conventional when claimed in a generic manner; see MPEP § 2106.05(d)(II).

Finally, “training the plurality of MLPs and the memory bank vector based on the computed loss objective” is a post-solution step that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Mere instructions to apply an exception cannot provide an inventive concept. Alternatively, this additional element is a post-solution step that amounts to adding insignificant extra-solution activity to the judicial exception. Furthermore, the additional element is well-understood, routine, and conventional activity, as supported under Berkheimer Option 2 by Tukiainen et al., U.S. Patent Application Publication No. 20200302322, Paragraph 0107, Lines 2-7: “Training DNN 1201 in this example includes inputting the feature vectors q_i(S_k) for i=1, . . . , N_A into DNN 1201 and applying the well-known supervised learning technique of forward propagation, backpropagation, and gradient descent, to update the connection weights of DNN 1201”. The claim is not patent eligible.

Regarding claim 2, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that include different head architectures”. This is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). Merely generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.
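For orientation, the training loop recited in claim 1 (mapping a shared memory bank vector through one MLP per pre-trained feature extractor, computing a loss between the mapped representations and the extractors' feature vectors, and training both the MLPs and the bank on that loss) can be sketched minimally. The sketch below is illustrative only and is not the applicant's implementation: single linear maps stand in for the MLPs, fixed random vectors stand in for the extractor outputs, and all names and dimensions are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the claimed elements (all names and sizes are illustrative):
# each entry of `feature_vectors` plays the role of one pre-trained feature
# extractor's output for a single image sample, each matrix in `maps` is a
# one-layer stand-in for one MLP, and `memory_bank` is the shared trainable
# memory bank vector.
feature_vectors = [rng.normal(size=4), rng.normal(size=4)]
maps = [rng.normal(scale=0.1, size=(4, 8)) for _ in feature_vectors]
memory_bank = rng.normal(scale=0.1, size=8)

lr = 0.05
for _ in range(3000):
    grad_bank = np.zeros_like(memory_bank)
    loss = 0.0
    for i, (W, f) in enumerate(zip(maps, feature_vectors)):
        mapped = W @ memory_bank                       # mapped representation
        err = mapped - f                               # residual vs. feature vector
        loss += 0.5 * err @ err                        # summed MSE loss objective
        maps[i] = W - lr * np.outer(err, memory_bank)  # train the map ("MLP")
        grad_bank += W.T @ err                         # accumulate bank gradient
    memory_bank -= lr * grad_bank                      # train the memory bank vector
```

At convergence each mapped representation approximates its extractor's feature vector, which is the condition the recited loss objective drives toward; the same bank vector must satisfy every extractor at once, which is what makes it an ensemble representation.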
Regarding claim 3, the rejection of claim 1 is further incorporated, and further, the claim recites: “wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that are trained on different objectives”. This is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). Merely generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 5, the rejection of claim 1 is further incorporated, and further, the claim recites: “wherein the dataset further includes a plurality of text documents or a plurality of audio files”. This is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). Merely generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 6, the rejection of claim 1 is further incorporated, and further, the claim recites: “wherein the dataset further includes a plurality of point clouds or polygon meshes”. This recites further specifics of the abstract idea identified in claim 1. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 7, the rejection of claim 1 is further incorporated, and further, the claim recites: “wherein the method further comprises: freezing parameters of the plurality of trained MLPs”. This limitation recites mental processes in addition to those identified in the rejection of claim 1, and thus recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 8:

Step 1 - Statutory Category: Claim 8 is directed to a method, which falls within one of the four statutory categories.

Step 2A Prong 1 - Judicial exception: Claim 8 recites, in part, “determining, in response to an image sample from the interpretation data sample, a set of feature vectors via the plurality of pre-trained feature extractors, respectively”, “determining an average of the set of feature vectors; initializing a memory bank vector with the average set of feature vectors”, and “computing a loss objective between the set of feature vectors and the plurality of mapped representations”. These limitations, under the broadest reasonable interpretation, cover the recitation of mathematical calculations, which falls within the “mathematical concepts” grouping of abstract ideas.

Additionally, the limitation “mapping…the memory bank vector into a plurality of mapped representations in response to the memory bank vector being an input to each of the plurality of MLPs, each of the plurality of mapped representations corresponding to a respective one or the set of feature vectors”, under the broadest reasonable interpretation, covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgement, or opinion), in this case evaluation. See MPEP § 2106.04(a)(2)(III).

Finally, “while freezing the plurality of MLPs” is a limitation that, under the broadest reasonable interpretation, covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgement, or opinion), in this case observation. See MPEP § 2106.04(a)(2)(III). Further, the claim recites: “generating a trained memory bank vector as the feature vector of the image sample”. This limitation, under the broadest reasonable interpretation, covers the recitation of mathematical calculations, which falls within the “mathematical concepts” grouping of abstract ideas. See MPEP § 2106.04(a)(2)(I)(C).

Step 2A Prong 2 - Integration into a practical application: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “a plurality of trained multi-layer perceptrons (MLPs)”, “a plurality of pre-trained feature extractors”, and “via the plurality of MLPs”, which generally link the use of the judicial exception to a particular technological environment or field of use; see MPEP § 2106.05(h). Further, “via a communication interface” is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f). Further, “receiving, …, an interpretation data sample that includes one or more image samples” is an additional element that amounts to mere data gathering and, as such, is considered insignificant extra-solution activity to the judicial exception; see MPEP § 2106.05(g). Further, “training the initialized memory bank vector based on the computed loss objective” is a post-solution step that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process; see MPEP § 2106.05(f).

Step 2B - Significantly more: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “a plurality of trained multi-layer perceptrons (MLPs)”, “a plurality of pre-trained feature extractors”, and “via the plurality of MLPs” generally link the use of the judicial exception to a particular technological environment or field of use. Merely generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. Further, the additional element “via a communication interface” amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Such elements cannot provide an inventive concept.

Further, the element “receiving, …, an interpretation data sample that includes one or more image samples” is insignificant extra-solution activity directed to receiving or transmitting data over a network, which the courts have recognized as well-understood, routine, and conventional when claimed in a generic manner; see MPEP § 2106.05(d)(II). Finally, “training the initialized memory bank vector based on the computed loss objective” is a post-solution step that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Mere instructions to apply an exception cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 9, the rejection of claim 8 is further incorporated, and further, claim 9 is substantially similar to claim 3 and is rejected in the same manner, with the same reasoning applying.

Regarding claim 11, the rejection of claim 8 is further incorporated, and further, claim 11 recites features similar to claim 5 and is rejected for at least the same reasons therein. While claim 11 recites “interpretation data samples” rather than “dataset”, the rejection remains the same.

Regarding claim 12, the rejection of claim 8 is further incorporated, and further, claim 12 recites features similar to claim 6 and is rejected for at least the same reasons therein. While claim 12 recites “interpretation data samples” rather than “dataset”, the rejection remains the same.

Regarding claim 13:

Step 1 - Statutory Category: Claim 13 is directed to a system, which falls within one of the four statutory categories.
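Before turning to the remaining claims, the interpretation-time procedure recited in claim 8 and discussed above (freeze the trained MLPs, initialize the memory bank vector from the average of the feature vectors, and optimize only the bank against the loss objective) differs from claim 1 only in which parameters update. A minimal illustrative sketch follows, again with one-layer linear stand-ins for the MLPs and arbitrary dimensions of our choosing; the zero-padding of the average to the bank's dimension is our assumption, not claim language.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative stand-ins: frozen one-layer linear maps play the role of the
# trained MLPs, and fixed random vectors play the role of the extractors'
# feature vectors for one interpretation data sample.
feature_vectors = [rng.normal(size=6), rng.normal(size=6)]
frozen_maps = [rng.normal(size=(6, 8)) for _ in feature_vectors]  # frozen "MLPs"

# Initialize the bank from the average of the feature vectors, zero-padded
# to the bank's dimension (the padding is an illustrative choice).
avg = np.mean(feature_vectors, axis=0)
memory_bank = np.concatenate([avg, np.zeros(8 - avg.size)])

# With the maps frozen, the problem is convex least squares in the bank;
# a step size of 1 / (largest Hessian eigenvalue) guarantees convergence.
H = sum(W.T @ W for W in frozen_maps)
lr = 1.0 / np.linalg.eigvalsh(H)[-1]

for _ in range(20000):
    grad = np.zeros_like(memory_bank)
    for W, f in zip(frozen_maps, feature_vectors):
        grad += W.T @ (W @ memory_bank - f)  # gradient w.r.t. the bank only
    memory_bank -= lr * grad                 # the frozen maps never change

# The trained memory bank vector is the sample's output feature representation.
```

Because only the bank is trained here, the loop converges to the ordinary least-squares solution of the stacked system, which is what "generating a trained memory bank vector as the feature vector of the image sample" reduces to under these linear stand-ins.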
Step 2A Prong 1 - Judicial exception: Claim 13 recites, in part, “determine, in response to an image sample from the dataset, a set of feature vectors via the plurality of pre-trained feature extractors, respectively” and “compute a loss objective between the set of feature vectors and the combination of the mapped set of representations and a network of layers in the plurality of MLPs”. These limitations, under the broadest reasonable interpretation, cover the recitation of mathematical calculations, which falls within the “mathematical concepts” grouping of abstract ideas.

Further, “generate, …, a mapped set of representations in response to an input of the memory bank vector, each of the representations corresponding to a respective one of the set of feature vectors”, under the broadest reasonable interpretation, covers the recitation of the abstract idea of a mental process that can practically be performed in the human mind, with or without the use of a physical aid such as pen and paper (including an observation, evaluation, judgement, or opinion), in this case evaluation. See MPEP § 2106.04(a)(2)(III). Further, the claim recites: “determine, …, an output feature representation in response to an input image sample”. This limitation, under the broadest reasonable interpretation, covers the recitation of mathematical calculations, which falls within the “mathematical concepts” grouping of abstract ideas. See MPEP § 2106.04(a)(2)(I)(C).

Step 2A Prong 2 - Integration into a practical application: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements “a plurality of pre-trained feature extractors”, “a plurality of multi-layer perceptrons (MLPs)”, “via the plurality of MLPs”, “by the plurality of MLPs”, and “via trained MLP”, which generally link the use of the judicial exception to a particular technological environment or field of use; see MPEP § 2106.05(h).

Further, “a system”, “a communication interface for receiving a query for information”, “a memory storing a plurality of machine-readable instructions”, and “a processor reading and executing the instructions from the memory to perform operations” are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely use a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f). Further, “via a communication interface” is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. See MPEP § 2106.05(f).

Further, “receive, …, a dataset of a plurality of data samples that include one or more image samples”, “retrieve a memory bank vector that corresponds to the set of feature vectors of the image sample”, and “receiving, …, the memory bank vector as an input to each of the plurality of MLPs” are additional elements that amount to mere data gathering and, as such, are considered insignificant extra-solution activity to the judicial exception; see MPEP § 2106.05(g). Further, “train the plurality of MLPs and the memory bank vector by minimizing the computed loss objective” is a post-solution step that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process; see MPEP § 2106.05(f). Alternatively, this limitation is a post-solution step that amounts to adding insignificant extra-solution activity to the judicial exception. See MPEP § 2106.05(g).

Step 2B - Significantly more: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements “a plurality of pre-trained feature extractors”, “a plurality of multi-layer perceptrons (MLPs)”, “via the plurality of MLPs”, “by the plurality of MLPs”, and “via trained MLP” generally link the use of the judicial exception to a particular technological environment or field of use. Merely generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept.

Further, “a system”, “a communication interface for receiving a query for information”, “a memory storing a plurality of machine-readable instructions”, and “a processor reading and executing the instructions from the memory to perform operations” are additional elements that amount to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely use a computer in its ordinary capacity as a tool to perform an existing process. Elements that merely use a computer in its ordinary capacity as a tool to perform an existing process cannot provide an inventive concept. Further, “via a communication interface” is an additional element that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Such elements cannot provide an inventive concept.

Further, “receive, …, a dataset of a plurality of data samples that include one or more image samples”, “retrieve a memory bank vector that corresponds to the set of feature vectors of the image sample”, and “receiving, …, the memory bank vector as an input to each of the plurality of MLPs” are insignificant extra-solution activity directed to receiving or transmitting data over a network, which the courts have recognized as well-understood, routine, and conventional when claimed in a generic manner; see MPEP § 2106.05(d)(II).

Finally, “train the plurality of MLPs and the memory bank vector by minimizing the computed loss objective” is a post-solution step that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Mere instructions to apply an exception cannot provide an inventive concept. Alternatively, this additional element is a post-solution step that amounts to adding insignificant extra-solution activity to the judicial exception. Furthermore, the additional element is well-understood, routine, and conventional activity, as supported under Berkheimer Option 2 by Tukiainen et al., U.S. Patent Application Publication No. 20200302322, Paragraph 0107, Lines 2-7: “Training DNN 1201 in this example includes inputting the feature vectors q_i(S_k) for i=1, . . . , N_A into DNN 1201 and applying the well-known supervised learning technique of forward propagation, backpropagation, and gradient descent, to update the connection weights of DNN 1201”. The claim is not patent eligible.

Regarding claim 14, the rejection of claim 13 is further incorporated, and further, claim 14 is substantially similar to claim 2 and is rejected in the same manner, with the same reasoning applying.

Regarding claim 15, the rejection of claim 13 is further incorporated, and further, claim 15 recites features similar to claim 3 and claim 9 and is rejected for at least the same reasons therein.

Regarding claim 17, the rejection of claim 14 is further incorporated, and further, claim 17 recites features similar to claim 5 and claim 11 and is rejected for at least the same reasons therein.

Regarding claim 18, the rejection of claim 14 is further incorporated, and further, claim 18 recites features similar to claim 6 and claim 12 and is rejected for at least the same reasons therein.

Regarding claim 19, the rejection of claim 14 is further incorporated, and further, claim 19 recites: “wherein the plurality of pre-trained feature extractors is selected from a plurality of convolutional neural network”. This is an additional element that generally links the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). Merely generally linking the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 20, as understood in accordance with the objection as being dependent upon claim 13, the rejection of claim 13 is further incorporated, and further, claim 20 recites: “freezing, parameters of the plurality of trained MLPs”. This limitation recites mental processes in addition to those identified in the rejection of claim 13.
Further, “determining, in response to the interpretation data sample, the set of feature vectors via the plurality of pre-trained feature extractors, respectively” recites mathematical concepts in addition to those identified in the rejection of claim 13. Further, “updating the memory bank vector using an average of the set of feature vectors” recites mathematical concepts in addition to those identified in the rejection of claim 13. Further, “computing a loss objective between the set of feature vectors and the plurality of mapped representations” recites mathematical concepts in addition to those identified in the rejection of claim 13. Further, “mapping, … , the initialized memory bank vector into a plurality of mapped representations, respectively” and “while freezing the plurality of MLPs” recite additional mental processes in addition to those identified in the rejection of claim 13.

Claim 20 recites the additional limitation “receiving, via the communication interface, an interpretation data sample”. This is an additional element that amounts to mere data gathering and, as such, is considered insignificant extra-solution activity to the judicial exception; see MPEP § 2106.05(g). This limitation is directed to receiving or transmitting data over a network, which the courts have recognized as well-understood, routine, and conventional when claimed in a generic manner; see MPEP § 2106.05. Further, the claim recites: “via the plurality of MLPs”. This limitation is an additional element that amounts to generally linking the use of the judicial exception to a particular technological environment or field of use; see MPEP § 2106.05(h). Elements that merely generally link the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept.

Finally, “updating the initialized memory bank vector based on the computed loss objective” is a post-solution step that amounts to adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer in its ordinary capacity as a tool to perform an existing process. Mere instructions to apply an exception cannot provide an inventive concept. The claim is not patent eligible.

Regarding claim 21, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein the memory bank vector is initialized based on an average of the set of feature vectors”. This limitation recites mathematical concepts in addition to those identified in the rejection of the parent claim, and thus the claim recites a judicial exception. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible.

Regarding claim 22, the rejection of claim 1 is incorporated, and further, the claim recites: “wherein a number of the plurality of pre-trained feature extractors is same as a number of the plurality of MLPs”. This limitation is an additional element that amounts to generally linking the use of the judicial exception to a particular technological environment or field of use. See MPEP § 2106.05(h). Elements that merely generally link the use of the judicial exception to a particular technological environment or field of use cannot provide an inventive concept. The claim is not patent eligible.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 7-8, and 21-22 are rejected under 35 U.S.C. 103 as being unpatentable over Shi R, Ji J, Zhang C, Miao Q, “Boosting sparsity-induced autoencoder: A novel sparse feature ensemble learning for image classification,” International Journal of Advanced Robotic Systems, 2019;16(3), doi:10.1177/1729881419853471, hereinafter referred to as “Shi”, in view of Fidler et al., U.S. Patent Application Publication No. 20220067983, hereinafter referred to as “Fidler”.
Regarding claim 1, Shi teaches A method for a training framework comprising a plurality of pre-trained feature extractors and a plurality of multi-layer perceptrons (MLPs) corresponding to the plurality of pre-trained feature extractors to train the plurality of MLPs for image feature extraction (Shi, Page 4, Col 2, Algorithm 1 shows a method for training a model that outputs a reconstruction representation of the input, which is considered to be the image feature; further, because the output of the pre-trained feature extractors and the output of the MLPs are both used in calculating loss, they are considered to be corresponding; Shi, Page 2, Paragraph 4, Lines 1-4, “we propose a new sparsity encourage mechanism for autoencoder and further build a sparsity-induced autoencoder (SparsityAE) that can exploit more representative and intrinsic features from input”; Shi, Page 6, Experiments, Lines 2-6, “accurately illustrate the performance and stability of the proposed ensemble sparse feature learning method on real-world image processing and computer vision tasks, BoostingAE will be carried out on three widely employed image data sets”) comprising: receiving … a dataset of a plurality of data samples that include one or more image samples (Shi, Page 3, Sparsity-induced autoencoder, Paragraph 2, Lines 7-8, “We input the high-dimensional data set into the encoding stage”; Shi, Pages 6-8, Experiments; Shi, Page 7, BoostingAE for classification on MNIST, Lines 1-3, “MNIST is a widely used handwritten digits data set, and the size of each image is 28 x 28. 
It consists of 60,000 training images and 10,000 testing images”); determining, in response to an image sample from the dataset, a set of feature vectors via the plurality of pre-trained feature extractors, respectively (Shi, Page 4, Feature Ensemble Method, Paragraph 2, Lines 6-9, “N SparsityAEs will be pretrained to set initial values of boosting network and, therefore, produce N representations of input data”. The “SparsityAEs” are considered to be the “pre-trained feature extractors” and the “representations of input data” are considered to be the “feature vectors”. Shi, Page 5, Feature Ensemble Method, Paragraph 3, Lines 2-6, and Equation 8, “Let x be the original input, W⁽ⁱ⁾ be the weight matrix connecting the input layer and the hidden layer and b⁽ⁱ⁾ be the bias vector, so the output of SparsityAE_1 can be mapped by equation (4): ŷ₁ = σ(x·W⁽ⁱ⁾ + b⁽ⁱ⁾)”. ŷ₁ is considered to be the “set of feature vectors”); retrieving a memory bank vector that is initialized to represent a feature vector of the image sample from the dataset; receiving, by the plurality of MLPs, the memory bank vector as an input to each of the plurality of MLPs; mapping, via the plurality of MLPs, the memory bank vector into a plurality of mapped representations, each corresponding to a respective one of the set of feature vectors (Shi, Page 5, Feature Ensemble Method, Paragraph 4, Lines 2-4, and Equation 9, “With reference to the above operation, the output of SparsityAE_2 can be obtained, which can also be used as the input of next SparsityAE: ŷ₂ = σ(ŷ₁·W₂ + b₂)”. ŷ₂ is considered to be the “memory bank vector”). [Shi, Figure 3: reproduced image omitted.] As seen with reference to Fig.
3, shown above, calculating ŷ₂ is considered to be “retrieving a memory bank vector”: SparsityAE_2 is an autoencoder; the encoder “retrieves” the memory bank vector by calculating it, then the decoder receives the memory bank vector as an input and attempts to reconstruct ŷ₁, creating the “mapped representations” by mapping the memory bank vector. computing a loss objective between the set of feature vectors and the plurality of mapped representations (Shi, Page 4, Col 1, Paragraph 3, Lines 1-3, “For our SparsityAE, we optimize the reconstruction loss regularized by a weight decay and a sparsity-inducing term. The cost function can be designed as follows: L(X, Y; θ) = ‖yᵢ − ŷ(xᵢ)‖²₂/N + β·KL(p̂‖p)”. The yᵢ is the “original input”, and in the case of “SparsityAE_2”, the input is the “set of feature vectors”. Shi, Page 5, Col 1, Paragraph 3, Lines 1-2, “Treat ŷ₁ as the input of the next SparsityAE to further train SparsityAE_2”); and training the plurality of MLPs and the memory bank vector based on the computed loss objective (Shi, Page 4, Col 1, Final Paragraph, Lines 3-8, “After that, minimizing the loss function by using the stochastic gradient descent to update network parameters.
Then, updating the data according to the hidden representation derived from the SparsityAE and setting the neurons without significant activation value to zero”; Shi, Page 4, Algorithm 1, Process (1)-(5); Shi, Page 5, Col 1, Paragraphs 2-4; The “plurality of MLPs” are trained using “stochastic gradient descent” and the loss function is calculated based on “the memory bank vector”, the memory bank vector is used during training of the “plurality of MLPs”); and determining, via trained MLPs, an output feature representation in response to an input image sample (Shi, Page 6, Experiments, Lines 2-6, “accurately illustrate the performance and stability of the proposed ensemble sparse feature learning method on real-world image processing and computer vision tasks, BoostingAE will be carried out on three widely employed image data sets”; Shi, Page 1, Abstract, Lines 5-9, “we firstly add a sparsity-induced layer into the autoencoder to exploit and extract more representative and essential features exist in the input and then combining the ensemble learning mechanism, we propose a novel sparse feature ensemble learning method, named Boosting sparsity-induced autoencoder, which could make full use of hierarchical and diverse features, increase the accuracy and the stability of a single model”; Shi, Page 4, Algorithm 1, Output, “reconstruction representation of the input” This is considered to be “an output feature representation” and it is output after the steps of training the SparsityAE, thus it is performed “via trained MLPs”) Shi does not explicitly teach receiving the dataset of data samples via a communication interface. 
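For orientation only, the stacked SparsityAE computation quoted from Shi above, ŷ₁ = σ(x·W⁽ⁱ⁾ + b⁽ⁱ⁾) followed by ŷ₂ = σ(ŷ₁·W₂ + b₂) with a reconstruction loss regularized by a KL sparsity term, can be sketched as follows. This is a minimal illustration with hypothetical shapes and random weights; the decoder of SparsityAE_2 is collapsed so that ŷ₂ is compared to ŷ₁ directly, which is a simplification of Shi's actual pipeline:

```python
import numpy as np

# Hypothetical stand-in for the stacked SparsityAE computation quoted from Shi:
# y1 = sigma(x @ W1 + b1), y2 = sigma(y1 @ W2 + b2), with a reconstruction
# loss regularized by a KL sparsity penalty. Shapes and weights are illustrative.

rng = np.random.default_rng(1)
sigma = lambda z: 1.0 / (1.0 + np.exp(-z))  # logistic activation

x = rng.random(16)                           # one flattened input sample
W1, b1 = rng.normal(size=(16, 8)) * 0.3, np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)) * 0.3, np.zeros(8)

y1 = sigma(x @ W1 + b1)   # output of SparsityAE_1 ("set of feature vectors")
y2 = sigma(y1 @ W2 + b2)  # output of SparsityAE_2 ("memory bank vector")

def kl_sparsity(p_hat, p=0.05):
    # KL(p || p_hat) sparsity term on the mean hidden activation
    return p * np.log(p / p_hat) + (1 - p) * np.log((1 - p) / (1 - p_hat))

beta = 0.1
recon = np.sum((y1 - y2) ** 2) / y1.size     # reconstruction term (decoder collapsed)
loss = recon + beta * kl_sparsity(np.mean(y2))
```

On this reading, ŷ₁ plays the role the rejection assigns to the “set of feature vectors” and ŷ₂ the “memory bank vector”; only the cost structure, not Shi's exact network, is reproduced.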
Fidler teaches receiving data via a communication interface (Fidler, Paragraph 0190, Lines 1-6, “In at least one embodiment, processor(s) 1010 may further include a set of embedded processors that may serve as an audio processing engine which may be an audio subsystem that enables full hardware support for multi-channel audio over multiple interfaces, and a broad and flexible range of audio I/O interfaces”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi to include receiving the input dataset via a communication interface as taught by Fidler. The motivation for doing so would have been that because the input data is necessary for the method of Shi, it would be necessary to have a method of receiving that data (Shi, Page 3, Sparsity-induced autoencoder, Paragraph 2, Lines 7-8; Fidler Paragraph 0190). Regarding claim 2, the rejection of claim 1 is incorporated, and further, Shi teaches wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that include different head architectures (Shi, Page 2, Related Work, Paragraph 4, Lines 6-9, “Various sparse regularization methods are used in deep belief networks or autoencoders, 20 and the results have proved that these methods have beneficial effects for a particular scene” The “SparsityAE” disclosed in the reference is chosen among the “various sparse regularization methods”). Regarding claim 7, the rejection of claim 1 is incorporated. The proposed combination thus far does not explicitly teach freezing parameters of the plurality of trained MLPs. Fidler teaches freezing parameters of the plurality of trained MLPs (Fidler, Paragraph 0091, “At 204, said system freezes said decoder. 
In at least one embodiment, said freezing comprises halting changes to various weights or parameters of said decoder, while training to other portions of one or more neural networks associated with said decoder continues”; Fidler, Paragraph 0619, “freezing the parameters of the decoder while training the encoder using the training data generated based, at least in part, on output of the decoder”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method of updating the memory bank vector as taught by Shi to not include updating the plurality of MLPs as taught by Fidler. The motivation for doing so would have been to continue updating the memory bank vector without affecting the parameters of the MLPs (Fidler, Paragraph 0091, “At 204, said system freezes said decoder. In at least one embodiment, said freezing comprises halting changes to various weights or parameters of said decoder, while training to other portions of one or more neural networks associated with said decoder continues”). 
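The two training regimes at issue in claims 1 and 7 can be sketched together: first the MLPs and the memory bank vector are trained jointly against the extractor outputs (claim 1), then the MLP parameters are frozen and only the memory bank vector is updated (claim 7, per the Fidler-style freezing). This is a hedged illustration in which random linear maps stand in for both the pre-trained feature extractors and the MLPs, and the memory bank vector is initialized to the average of the feature vectors as claims 8 and 21 recite; none of the names or shapes come from the application itself:

```python
import numpy as np

# Hypothetical sketch: K frozen "pre-trained feature extractors" and K trainable
# linear maps (stand-ins for the MLPs) that map a shared memory bank vector m
# toward each extractor's feature vector for one sample.

rng = np.random.default_rng(0)
D_IN, D_FEAT, K = 8, 4, 3
extractors = [rng.normal(size=(D_FEAT, D_IN)) for _ in range(K)]    # frozen
mlps = [rng.normal(size=(D_FEAT, D_FEAT)) * 0.1 for _ in range(K)]  # trainable

x = rng.normal(size=D_IN)            # one flattened image sample
feats = [E @ x for E in extractors]  # the "set of feature vectors"
m = sum(feats) / K                   # memory bank vector, average-initialized

def loss(m, mlps):
    # loss objective between the feature vectors and the mapped representations
    return sum(np.sum((W @ m - f) ** 2) for W, f in zip(mlps, feats)) / K

initial_loss, lr = loss(m, mlps), 0.01
for step in range(300):
    residuals = [W @ m - f for W, f in zip(mlps, feats)]
    grad_m = sum(2 * W.T @ r for W, r in zip(mlps, residuals)) / K
    if step < 150:  # stage 1 (claim 1): MLPs and memory bank trained jointly
        mlps = [W - lr * 2 * np.outer(r, m) / K for W, r in zip(mlps, residuals)]
    m = m - lr * grad_m  # stage 2 (claim 7): MLPs frozen, only m is updated
final_loss = loss(m, mlps)
```

The `step < 150` guard is the whole of the “freezing”: the loss still drives updates to m, but the MLP weights stop changing, mirroring the halting of weight changes described in Fidler's paragraph 0091.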
Regarding claim 8, Shi teaches A method for feature extraction using a trained framework comprising a plurality of trained multi-layer perceptrons (MLPs) and a plurality of pre-trained feature extractors corresponding to the plurality of MLPs, (Shi, Page 4, Col 2, Algorithm 1 shows a method for training and using a model that outputs a reconstruction representation of the input, which is considered to be the image feature; further, because the output of the pre-trained feature extractors and the output of the MLPs are both used in calculating loss, they are considered to be corresponding; Shi, Page 6, Experiments, Lines 2-6, “accurately illustrate the performance and stability of the proposed ensemble sparse feature learning method on real-world image processing and computer vision tasks, BoostingAE will be carried out on three widely employed image data sets”) comprising: receiving … an interpretation data sample that includes one or more image samples (Shi, Page 3, Sparsity-induced autoencoder, Paragraph 2, Lines 7-8, “We input the high-dimensional data set into the encoding stage”; Shi, Pages 6-8, Experiments; Shi, Page 7, BoostingAE for classification on MNIST, Lines 1-3, “MNIST is a widely used handwritten digits data set, and the size of each image is 28 x 28.
It consists of 60,000 training images and 10,000 testing images”); determining, in response to an image sample from the interpretation data sample, a set of feature vectors via the plurality of pre-trained feature extractors, respectively (Shi, Page 4, Feature Ensemble Method, Paragraph 2, Lines 6-9, “N SparsityAEs will be pretrained to set initial values of boosting network and, therefore, produce N representations of input data”. The “SparsityAEs” are considered to be the “pre-trained feature extractors” and the “representations of input data” are considered to be the “feature vectors”. Shi, Page 5, Feature Ensemble Method, Paragraph 3, Lines 2-6, and Equation 8, “Let x be the original input, W⁽ⁱ⁾ be the weight matrix connecting the input layer and the hidden layer and b⁽ⁱ⁾ be the bias vector, so the output of SparsityAE_1 can be mapped by equation (4): ŷ₁ = σ(x·W⁽ⁱ⁾ + b⁽ⁱ⁾)”. ŷ₁ is considered to be the “set of feature vectors”); determining an average of the set of feature vectors (Shi, Page 5, Combination method of voting, Paragraph 1, Lines 1-6, “When all three classifiers are set up, final prediction is given by the fusion result of all classifiers with a specific fusion rule. Here, the Naive Bayes combination rules21 are applied, which assume that individual classifiers are mutually independent.
Naive Bayes combination rules include MAX rule, MIN rule, and AVG rule”. See also Page 5, Col 2, AVG rule; Shi teaches taking an average based on the set of feature vectors); initializing a memory bank vector with the average of the set of feature vectors (Shi, Page 4, Col 1, Final Paragraph, Lines 5-8, “Then, updating the data according to the hidden representation derived from the SparsityAE and setting the neurons without significant activation value to zero”); mapping, via the plurality of MLPs, the memory bank vector into a plurality of mapped representations in response to the memory bank vector being input to each of the plurality of MLPs, each of the plurality of mapped representations corresponding to a respective one of the feature vectors (Shi, Page 5, Feature Ensemble Method, Paragraph 4, Lines 2-4, and Equation 9, “With reference to the above operation, the output of SparsityAE_2 can be obtained, which can also be used as the input of next SparsityAE: ŷ₂ = σ(ŷ₁·W₂ + b₂)”. ŷ₂ is considered to be the “memory bank vector”). [Shi, Figure 3: reproduced image omitted.] As seen with reference to Fig. 3, shown above, calculating ŷ₂ is considered to be “retrieving a memory bank vector”: SparsityAE_2 is an autoencoder; the encoder “retrieves” the memory bank vector by calculating it, then the decoder receives the memory bank vector as input and attempts to reconstruct ŷ₁, creating the “mapped representations” by mapping the memory bank vector. computing a loss objective between the set of feature vectors and the plurality of mapped representations (Shi, Page 4, Col 1, Paragraph 3, Lines 1-3, “For our SparsityAE, we optimize the reconstruction loss regularized by a weight decay and a sparsity-inducing term.
The cost function can be designed as follows: L(X, Y; θ) = ‖yᵢ − ŷ(xᵢ)‖²₂/N + β·KL(p̂‖p)”. The yᵢ is the “original input”, and in the case of “SparsityAE_2”, the input is the “set of feature vectors”. Shi, Page 5, Col 1, Paragraph 3, Lines 1-2, “Treat ŷ₁ as the input of the next SparsityAE to further train SparsityAE_2”); and training the initialized memory bank vector based on the computed loss objective (Shi, Page 4, Col 1, Final Paragraph, Lines 3-8, “After that, minimizing the loss function by using the stochastic gradient descent to update network parameters. Then, updating the data according to the hidden representation derived from the SparsityAE and setting the neurons without significant activation value to zero”; Shi, Page 4, Algorithm 1, Process (1)-(5); Shi, Page 5, Col 1, Paragraphs 2-4; The training of the SparsityAE is performed using the memory bank vector, and the memory bank vector is updated as a result of the “stochastic gradient descent”, which is considered to be the “training”); and generating a trained memory bank vector as the feature vector of the image sample (Shi, Page 6, Experiments, Lines 2-6, “accurately illustrate the performance and stability of the proposed ensemble sparse feature learning method on real-world image processing and computer vision tasks, BoostingAE will be carried out on three widely employed image data sets”; Shi, Page 1, Abstract, Lines 5-9, “we firstly add a sparsity-induced layer into the autoencoder to exploit and extract more representative and essential features exist in the input and then combining the ensemble learning mechanism, we propose a novel sparse feature ensemble learning method, named Boosting sparsity-induced autoencoder, which could make full use of hierarchical and diverse features, increase the accuracy and the stability of a single model”; Shi, Page 4, Algorithm 1, Output, “reconstruction representation of the input”; During the process of training the “plurality
of MLPs” and using them to generate the “reconstruction representation of the input” a “trained memory bank vector” is generated) Shi does not teach receiving the interpretation data sample via a communication device, nor updating the initialized memory bank vector while freezing the plurality of MLPs. Fidler teaches receiving data via a communication interface (Fidler, Paragraph 0190, Lines 1-6, “In at least one embodiment, processor(s) 1010 may further include a set of embedded processors that may serve as an audio processing engine which may be an audio subsystem that enables full hardware support for multi-channel audio over multiple interfaces, and a broad and flexible range of audio I/O interfaces”), and updating the initialized memory bank vector based on the computed loss objective while freezing the plurality of MLPs (Fidler, Paragraph 0091, “At 204, said system freezes said decoder. In at least one embodiment, said freezing comprises halting changes to various weights or parameters of said decoder, while training to other portions of one or more neural networks associated with said decoder continues” Fidler, Paragraph 0619, “freezing the parameters of the decoder while training the encoder using the training data generated based, at least in part, on output of the decoder”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi to include receiving the input dataset via a communication interface as taught by Fidler. The motivation for doing so would have been that because the input data is necessary for the method of Shi, it would be necessary to have a method of receiving that data (Shi, Page 3, Sparsity-induced autoencoder, Paragraph 2, Lines 7-8; Fidler Paragraph 0190). 
Further it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method of updating the memory bank vector as taught by Shi to not include updating the plurality of MLPs as taught by Fidler. The motivation for doing so would have been to continue updating the memory bank vector without affecting the parameters of the MLPs (Fidler, Paragraph 0091, “At 204, said system freezes said decoder. In at least one embodiment, said freezing comprises halting changes to various weights or parameters of said decoder, while training to other portions of one or more neural networks associated with said decoder continues”). Regarding claim 21, the rejection of claim 1 is incorporated, and further, the proposed combination teaches wherein the memory bank vector is initialized based on an average of the set of feature vectors (Shi, Page 4, Col 1, Final Paragraph, Lines 5-8, “Then, updating the data according to the hidden representation derived from the SparsityAE and setting the neurons without significant activation value to zero”). Regarding claim 22, the rejection of claim 1 is incorporated, and further, the proposed combination teaches wherein a number of the plurality of pre-trained feature extractors is same as a number of the plurality of MLPs (Shi, Page 5, Figure 3, The SparsityAEs are considered to be the pre-trained feature extractors, and the decoders of the autoencoders are considered to be the “plurality of MLPs”, thus the number of each must be the same). Claims 5, 11 are rejected under 35 U.S.C. 103 as being unpatentable over Shi in view of Fidler in further view of Hoehne et al., U.S. Patent Application Publication No. 20240070440, hereinafter referred to as “Hoehne”. Regarding claim 5, the rejection of claim 1 is incorporated. Shi in view of Fidler does not explicitly teach the dataset including text or audio. 
Hoehne teaches wherein the dataset further includes a plurality of text documents or a plurality of audio files (Hoehne, Paragraph 0164, “The machine learning model according to the present disclosure is configured to receive input data of different modalities. Usually, there are different inputs for inputting input data of different modalities, e.g., an image input for inputting one or more images, a text input for inputting one or more texts (including numbers), and/or an audio input for inputting one or more audio files”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi in view of Fidler to include audio or text input as taught by Hoehne. The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Hoehne, Paragraph 0195). Regarding claim 11, the rejection of claim 8 is incorporated. Shi in view of Fidler does not explicitly teach the dataset including text or audio. Hoehne teaches wherein the interpretation data sample includes a plurality of text documents or a plurality of audio files (Hoehne, Paragraph 0164, “The machine learning model according to the present disclosure is configured to receive input data of different modalities. Usually, there are different inputs for inputting input data of different modalities, e.g., an image input for inputting one or more images, a text input for inputting one or more texts (including numbers), and/or an audio input for inputting one or more audio files”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi in view of Fidler to include audio or text input as taught by Hoehne. 
The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Hoehne, Paragraph 0195). Claims 3, 6, 9, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Shi in view of Fidler in further view of Yoo et al., U.S. Patent Application Publication No. 20220383993, hereinafter referred to as “Yoo”. Regarding claim 3, the rejection of claim 1 is incorporated. The proposed combination thus far does not explicitly teach wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that are trained on different objectives. Yoo teaches wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that are trained on different objectives (Yoo, Paragraph 0144, Lines 9-13, “Further, in a case of an encoder, models such as deep neural networks, graph convolutional networks and 3D convolutional neural networks may be used according to an input. Further, recurrent neural network models for obtaining a character string representation or other graph generative models may be utilized as a decoder”. The “different objectives” are considered to be selecting features for different data types; each encoder taught above is used “according to an input”, meaning they have different objectives). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi in view of Fidler to include selecting pre-trained feature extractors from pre-trained feature extractors trained on different objectives as taught by Yoo. The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Yoo, Paragraph 0144).
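The “different objectives” limitation addressed in claims 3 and 9, together with the AVG-rule fusion quoted from Shi earlier, can be illustrated as below. The objective tags and random projections are purely hypothetical stand-ins for extractors pre-trained on different tasks; nothing here is drawn from Yoo's actual models:

```python
import numpy as np

# Hypothetical ensemble of extractors "trained on different objectives":
# random projections tagged with the objective each notionally optimized.
rng = np.random.default_rng(2)
extractors = {
    "contrastive": rng.normal(size=(4, 16)),
    "reconstruction": rng.normal(size=(4, 16)),
    "rotation_prediction": rng.normal(size=(4, 16)),
}

x = rng.random(16)  # one flattened input sample
# one feature vector per pre-trained extractor, regardless of its objective
feats = {name: E @ x for name, E in extractors.items()}
# AVG-rule style fusion, matching the average initialization of claims 8 and 21
avg_feature = sum(feats.values()) / len(feats)
```

Under this toy setup, supporting the text, audio, or point-cloud inputs of claims 5, 6, 11, and 12 only changes how x is produced, not the fusion step.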
Regarding claim 6, the rejection of claim 1 is incorporated. Shi in view of Fidler does not explicitly teach the dataset including a plurality of point clouds or polygon meshes. Yoo teaches wherein the dataset further includes a plurality of point clouds (Yoo, Paragraph 0144, Lines 3-8, “Further, in addition to a fingerprint input, the encoder-decoder model of the present disclosure may utilize various methods for expressing a molecular structure, such as input in the form of a graph of a molecular structure or input in the form of a point cloud with coordinate values”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi in view of Fidler to include point clouds in the input dataset as taught by Yoo. The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Yoo, Paragraph 0144). Regarding claim 9, the rejection of claim 8 is incorporated. The proposed combination thus far does not explicitly teach wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that are trained on different objectives. Yoo teaches wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that are trained on different objectives (Yoo, Paragraph 0144, Lines 9-13, “Further, in a case of an encoder, models such as deep neural networks, graph convolutional networks and 3D convolutional neural networks may be used according to an input.
Further, recurrent neural network models for obtaining a character string representation or other graph generative models may be utilized as a decoder”. The “different objectives” are considered to be selecting features for different data types; each encoder taught above is used “according to an input”, meaning they have different objectives). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi in view of Fidler to include selecting pre-trained feature extractors from pre-trained feature extractors trained on different objectives as taught by Yoo. The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Yoo, Paragraph 0144). Regarding claim 12, the rejection of claim 8 is incorporated. Shi in view of Fidler does not explicitly teach the dataset including a plurality of point clouds or polygon meshes. Yoo teaches wherein the interpretation data sample includes a plurality of point clouds or polygon meshes (Yoo, Paragraph 0144, Lines 3-8, “Further, in addition to a fingerprint input, the encoder-decoder model of the present disclosure may utilize various methods for expressing a molecular structure, such as input in the form of a graph of a molecular structure or input in the form of a point cloud with coordinate values”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi in view of Fidler to include point clouds in the input dataset as taught by Yoo.
The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Yoo, Paragraph 0144). Claims 13, 14, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Shi in view of Fidler in further view of Andrew Ng, CS294A Lecture notes Sparse Autoencoder, 2021, https://web.stanford.edu/class/cs294a/sparseAutoencoder_2011new.pdf, hereinafter referred to as “Ng”. Regarding claim 13, Shi teaches a training framework comprising a plurality of pre-trained feature extractors and a plurality of multi-layer perceptrons (MLPs) corresponding to the plurality of pre-trained feature extractors for training the plurality MLPs for image feature (Shi, Page 4, Col 2, Algorithm 1 shows a method for training and using a model that outputs a reconstruction representation of the input, which is considered to be the ensemble vector representation; further, because the output of the pre-trained feature extractors and the output of the MLPs are both used in calculating loss, they are considered to be corresponding; Shi, Page 6, Experiments, Lines 2-6, “accurately illustrate the performance and stability of the proposed ensemble sparse feature learning method on real-world image processing and computer vision tasks, BoostingAE will be carried out on three widely employed image data sets”). Further, Shi teaches receive … a dataset of a plurality of data samples that include one or more image samples (Shi, Page 3, Sparsity-induced autoencoder, Paragraph 2, Lines 7-8, “We input the high-dimensional data set into the encoding stage”; Shi, Pages 6-8, Experiments; Shi, Page 7, BoostingAE for classification on MNIST, Lines 1-3, “MNIST is a widely used handwritten digits data set, and the size of each image is 28 x 28.
It consists of 60,000 training images and 10,000 testing images”); determine, in response to an image sample from the dataset, a set of feature vectors via the plurality of pre-trained feature extractors, respectively (Shi, Page 4, Feature Ensemble Method, Paragraph 2, Lines 6-9, “N SparsityAEs will be pretrained to set initial values of boosting network and, therefore, produce N representations of input data”. The “SparsityAEs” are considered to be the “pre-trained feature extractors” and the “representations of input data” are considered to be the “feature vectors”. Shi, Page 5, Feature Ensemble Method, Paragraph 3, Lines 2-6, and Equation 8, “Let x be the original input, W⁽ⁱ⁾ be the weight matrix connecting the input layer and the hidden layer and b⁽ⁱ⁾ be the bias vector, so the output of SparsityAE_1 can be mapped by equation (4): ŷ₁ = σ(x·W⁽ⁱ⁾ + b⁽ⁱ⁾)”. ŷ₁ is considered to be the “set of feature vectors”); retrieve a memory bank vector that corresponds to the set of feature vectors of the image sample; receiving, by the plurality of MLPs, the memory bank vector as an input to each of the plurality of MLPs; generate, via the plurality of MLPs, a mapped set of representations in response to an input of the memory bank vector, each of the representations corresponding to a respective one of the set of feature vectors (Shi, Page 5, Feature Ensemble Method, Paragraph 4, Lines 2-4, and Equation 9, “With reference to the above operation, the output of SparsityAE_2 can be obtained, which can also be used as the input of next SparsityAE: ŷ₂ = σ(ŷ₁·W₂ + b₂)”. ŷ₂ is considered to be the “memory bank vector”). [Shi, Figure 3: reproduced image omitted.] As seen with reference to Fig.
3, shown above, calculating ŷ2 is considered to be “retrieving a memory bank vector”; SparsityAE_2 is an autoencoder, the encoder “retrieves” the memory bank vector by calculating it, then the decoder receives the memory bank vector as an input and attempts to reconstruct ŷ1, creating the “mapped representations” by mapping the memory bank vector. train the plurality of MLPs and the memory bank vectors by minimizing the computed loss objective (Shi, Page 4, Col 1, Final Paragraph, Lines 3-8, “After that, minimizing the loss function by using the stochastic gradient descent to update network parameters. Then, updating the data according to the hidden representation derived from the SparsityAE and setting the neurons without significant activation value to zero”; Shi, Page 4, Algorithm 1, Process (1)-(5); Shi, Page 5, Col 1, Paragraphs 2-4; the “plurality of MLPs” are trained using “stochastic gradient descent” and the loss function is calculated based on “the memory bank vector”; the memory bank vector is used during training of the “plurality of MLPs”); and determining, via trained MLP, an output feature representation in response to an input image sample (Shi, Page 6, Experiments, Lines 2-6, “accurately illustrate the performance and stability of the proposed ensemble sparse feature learning method on real-world image processing and computer vision tasks, BoostingAE will be carried out on three widely employed image data sets”; Shi, Page 1, Abstract, Lines 5-9, “we firstly add a sparsity-induced layer into the autoencoder to exploit and extract more representative and essential features exist in the input and then combining the ensemble learning mechanism, we propose a novel sparse feature ensemble learning method, named Boosting sparsity-induced autoencoder, which could make full use of hierarchical and diverse features, increase the accuracy and the stability of a single model”; Shi, Page 4, Algorithm 1, Output, “reconstruction representation of the
input” This is considered to be “an output feature representation” and it is output after the steps of training the SparsityAE, thus it is performed “via the trained MLP”) Shi does not explicitly teach a system … a communication interface for receiving a query for information; a memory storing a plurality of machine-readable instructions; and a processor reading and executing the instructions from the memory to perform operations, nor receiving a dataset of a plurality of data samples via a communication interface, nor compute a loss objective between the set of feature vectors and the combination of the mapped set of representations and a network of layers in the plurality of MLPs. Fidler teaches A system (Fidler, Paragraph 0001, Lines 2-4, “at least one embodiment pertains to processors or computing system used to train neural networks to perform a task related to image processing”; Fidler, Paragraph 0088, Lines 4-9, “In at least one embodiment, for example, operations described in relation to FIG. 2 are performed by a system comprising at least one processor and a memory comprising instructions executable by said at least one processor, such that execution of said instructions cause said system to at least perform said operations”) and a communication interface for receiving a query for information (Fidler, Paragraph 0196, Lines 6-10, “In at least one embodiment, one or more of SoC(s) 1004 may further include an input/output controller(s) that may be controlled by software and may be used for receiving I/O signals that are uncommitted to a specific role”); a memory storing a plurality of machine-readable instructions; and a processor reading and executing the instructions from the memory to perform operations (Fidler, Paragraph 0088, Lines 4-9, “In at least one embodiment, for example, operations described in relation to FIG.
2 are performed by a system comprising at least one processor and a memory comprising instructions executable by said at least one processor, such that execution of said instructions cause said system to at least perform said operations”) and Fidler teaches receiving data via the communication interface (Fidler, Paragraph 0190, Lines 1-6, “In at least one embodiment, processor(s) 1010 may further include a set of embedded processors that may serve as an audio processing engine which may be an audio subsystem that enables full hardware support for multi-channel audio over multiple interfaces, and a broad and flexible range of audio I/O interfaces”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi to include the hardware (a system, communication interface, memory and processor) as taught by Fidler. The motivation for doing so would have been that the method of Shi requires hardware to be able to operate. Further, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi to include receiving the input dataset via a communication interface as taught by Fidler. The motivation for doing so would have been that because the input data is necessary for the method of Shi, it would be necessary to have a method of receiving that data (Shi, Page 3, Sparsity-induced autoencoder, Paragraph 2, Lines 7-8; Fidler Paragraph 0190). 
While Shi teaches computing a loss objective between the set of feature vectors and the mapped set of representations (Shi, Page 5, Feature Ensemble Method, Paragraph 4, Lines 2-4, and Equation 9), Shi in view of Fidler does not explicitly teach computing a loss objective between the set of feature vectors and the combination of the mapped set of representations and a network of layers in the plurality of MLPs. Ng teaches a loss objective between the set of feature vectors and the combination of the mapped set of representations and a network of layers in the plurality of MLPs (Ng, Page 6, Section 2.2, Paragraph 1, “Suppose we have a fixed training set {(x(1), y(1)), . . . , (x(m), y(m))} of m training examples. We can train our neural network using batch gradient descent. In detail, for a single training example (x, y), we define the cost function with respect to that single example to be J(W, b; x, y) = (1/2)‖h_W,b(x) − y‖². This is a (one-half) squared-error cost function. Given a training set of m examples, we then define the overall cost function to be [Equation 8]. The first term in the definition of J(W, b) is an average sum-of-squares error term. The second term is a regularization term (also called a weight decay term) that tends to decrease the magnitude of the weights, and helps prevent overfitting. The weight decay parameter λ controls the relative importance of the two terms. Note also the slightly overloaded notation: J(W, b; x, y) is the squared error cost with respect to a single example; J(W, b) is the overall cost function, which includes the weight decay term”).
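For illustration only (not part of the record), the overall cost Ng describes, an average one-half squared-error term plus a λ-weighted weight-decay term, can be sketched as follows; the function and variable names are assumptions chosen for this sketch, not drawn from Ng's notes or the cited claims.

```python
import numpy as np

def overall_cost(weights, X, Y, lam, forward):
    """J(W, b): average one-half squared-error over m examples
    plus a weight-decay (L2 regularization) term, per Ng's Equation 8."""
    m = len(X)
    # First term: average of the per-example (one-half) squared errors
    sq_err = sum(0.5 * np.sum((forward(x) - y) ** 2) for x, y in zip(X, Y)) / m
    # Second term: (lambda / 2) * sum of squared weights across all layers,
    # which shrinks weight magnitudes and helps prevent overfitting
    decay = (lam / 2.0) * sum(np.sum(W ** 2) for W in weights)
    return sq_err + decay
```

With the weight-decay parameter `lam` set to zero, this reduces to the plain average of Ng's per-example squared-error cost.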
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by Shi in view of Fidler to include computing a loss objective between the set of feature vectors and the combination of the mapped set of representations and a network of layers in the MLP as taught by Ng. The motivation for doing so would have been to prevent the model from overfitting (Ng, Page 6, Section 2.2, Paragraph 1, Lines 11-14, “The second term is a regularization term (also called a weight decay term) that tends to decrease the magnitude of the weights, and helps prevent overfitting. The weight decay parameter λ controls the relative importance of the two terms”). Regarding claim 14, the rejection of claim 13 is incorporated, and further Shi teaches wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that include different head architectures (Shi, Page 2, Related Work, Paragraph 4, Lines 6-9, “Various sparse regularization methods are used in deep belief networks or autoencoders, and the results have proved that these methods have beneficial effects for a particular scene”; the “SparsityAE” disclosed in the reference is chosen among the “various sparse regularization methods”). Regarding claim 20, the rejection of claim 13 is incorporated, and further, Shi in view of Fidler teaches freezing parameters of the plurality of trained MLPs (Fidler, Paragraph 0619, “freezing the parameters of the decoder while training the encoder using the training data generated based, at least in part, on output of the decoder”; Fidler, Paragraph 0091, “At 204, said system freezes said decoder.
In at least one embodiment, said freezing comprises halting changes to various weights or parameters of said decoder, while training to other portions of one or more neural networks associated with said decoder continues”) receiving … an interpretation data sample (Shi, Page 3, Sparsity-induced autoencoder, Paragraph 2, Lines 7-8, “We input the high-dimensional data set into the encoding stage”); via the communication device (Fidler, Paragraph 0190, Lines 1-6, “In at least one embodiment, processor(s) 1010 may further include a set of embedded processors that may serve as an audio processing engine which may be an audio subsystem that enables full hardware support for multi-channel audio over multiple interfaces, and a broad and flexible range of audio I/O interfaces”) determining, in response to the interpretation data sample, the set of feature vectors via the plurality of pre-trained feature extractors, respectively (Shi, Page 4, Feature Ensemble Method, Paragraph 2, Lines 6-9, “N SparsityAEs will be pretrained to set initial values of boosting network and, therefore, produce N representations of input data”; the “SparsityAEs” are considered to be the “pre-trained feature extractors” and the “representations of input data” are considered to be the “feature vectors”; Shi, Page 5, Feature Ensemble Method, Paragraph 3, Lines 2-6, and Equation 8, “Let x be the original input, W(i) be the weight matrix connecting the input layer and the hidden layer and b(i) be the bias vector, so the output of SparsityAE_1 can be mapped by equation (4) ŷ1 = σ(x · W(i) + b(i))”; ŷ1 is considered to be the “set of feature vectors”); updating the memory bank vector using an average of the set of feature vectors (Shi, Page 4, Col 1, Final Paragraph, Lines 5-8, “Then, updating the data according to the hidden representation derived from the SparsityAE and setting the neurons without significant activation value to zero”); mapping, via the MLPs, the updated memory bank vector
into a plurality of mapped representations, respectively; (Shi, Page 5, Feature Ensemble Method, Paragraph 4, Lines 2-4, and Equation 9, “With reference to the above operation, the output of SparsityAE_2 can be obtained, which can also be used as the input of next SparsityAE ŷ2 = σ(ŷ1 · W(i) + b(i))”; ŷ2 is considered to be the “memory bank vector”) [media_image1.png: reproduction of Shi, Fig. 3] As seen with reference to Fig. 3, shown above, calculating ŷ2 is considered to be “retrieving a memory bank vector”; SparsityAE_2 is an autoencoder, the encoder “retrieves” the memory bank vector by calculating it, then the decoder attempts to reconstruct ŷ1, creating the “mapped representations” by mapping the memory bank vector. computing a loss objective between the set of feature vectors and the plurality of mapped representations (Shi, Page 4, Col 1, Paragraph 3, Lines 1-3, “For our SparsityAE, we optimize the reconstruction loss regularized by a weight decay and a sparsity-inducing term. The cost function can be designed as follows L(X, Y; θ) = ‖y_i − ŷ(x_i)‖²₂/N + β · KL(p̂ ‖ p)”; y_i is the “original input”, and in the case of “SparsityAE_2”, the input is the “set of feature vectors”; Shi, Page 5, Col 1, Paragraph 3, Lines 1-2, “Treat ŷ1 as the input of the next SparsityAE to further train SparsityAE_2”); and updating the initialized memory bank vector based on the computed loss objective (Shi, Page 4, Col 1, Final Paragraph, Lines 3-8, “After that, minimizing the loss function by using the stochastic gradient descent to update network parameters.
Then, updating the data according to the hidden representation derived from the SparsityAE and setting the neurons without significant activation value to zero”) while freezing the plurality of MLPs (Fidler, Paragraph 0619, “freezing the parameters of the decoder while training the encoder using the training data generated based, at least in part, on output of the decoder” Fidler, Paragraph 0091, “At 204, said system freezes said decoder. In at least one embodiment, said freezing comprises halting changes to various weights or parameters of said decoder, while training to other portions of one or more neural networks associated with said decoder continues”). Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Shi in view of Fidler in further view of Ng in further view of Hoehne. Regarding claim 17, the rejection of claim 14 is incorporated. The proposed combination thus far does not explicitly teach wherein the dataset further includes a plurality of text documents or a plurality of audio files. Hoehne teaches wherein the data sample includes a plurality of text documents or a plurality of audio files (Hoehne, Paragraph 0164, “The machine learning model according to the present disclosure is configured to receive input data of different modalities. Usually, there are different inputs for inputting input data of different modalities, e.g., an image input for inputting one or more images, a text input for inputting one or more texts (including numbers), and/or an audio input for inputting one or more audio files”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by the proposed combination to include audio or text input as taught by Hoehne. 
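For illustration only (not part of the record), the sparsity-regularized objective of Shi quoted in the claim 13 and claim 20 analyses above, a mean squared reconstruction term plus a β-weighted KL sparsity penalty, can be sketched roughly as follows. The function and variable names, and the Bernoulli form of the KL(p̂ ‖ p) term with a target activation rho, are assumptions for this sketch.

```python
import numpy as np

def sparsity_ae_loss(y, y_hat, activations, rho, beta):
    """Mean squared reconstruction error plus a beta-weighted KL sparsity
    penalty, in the spirit of Shi's Equation 5:
    L = ||y - y_hat||^2 / N + beta * KL(rho_hat || rho)."""
    n = y.size
    recon = np.sum((y - y_hat) ** 2) / n
    # rho_hat: average activation of each hidden unit over the batch,
    # clipped away from 0 and 1 so the logarithms below stay finite
    rho_hat = np.clip(np.mean(activations, axis=0), 1e-7, 1 - 1e-7)
    # Bernoulli KL divergence between rho_hat and the sparsity target rho
    kl = np.sum(rho_hat * np.log(rho_hat / rho)
                + (1 - rho_hat) * np.log((1 - rho_hat) / (1 - rho)))
    return recon + beta * kl
```

When the average activation of every hidden unit equals the target rho, the KL term vanishes and only the reconstruction error remains.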
The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Hoehne, Paragraph 0195). Claims 15 and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Shi in view of Fidler in further view of Ng in further view of Yoo. Regarding claim 15, the rejection of claim 13 is incorporated. The proposed combination thus far does not explicitly teach wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that are trained on different objectives. Yoo teaches wherein the plurality of pre-trained feature extractors is selected from one or more of the pre-trained feature extractors that are trained on different objectives (Yoo, Paragraph 0144, Lines 9-13, “Further, in a case of an encoder, models such as deep neural networks, graph convolutional networks and 3D convolutional neural networks may be used according to an input. Further, recurrent neural network models for obtaining a character string representation or other graph generative models may be utilized as a decoder”; the “different objectives” are considered to be selecting features for different data types; each encoder taught above is used “according to an input”, meaning they have different objectives). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by the proposed combination to include selecting pre-trained feature extractors from pre-trained feature extractors trained on different objectives as taught by Yoo. The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Yoo, Paragraph 0144). Regarding claim 18, the rejection of claim 14 is incorporated.
The proposed combination thus far does not explicitly teach the dataset including a plurality of point clouds or polygon meshes. Yoo teaches wherein the data sample further includes a plurality of point clouds or polygon meshes (Yoo, Paragraph 0144, Lines 3-8, “Further, in addition to a fingerprint input, the encoder-decoder model of the present disclosure may utilize various methods for expressing a molecular structure, such as input in the form of a graph of a molecular structure or input in the form of a point cloud with coordinate values”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by the proposed combination to include point clouds in the input dataset as taught by Yoo. The motivation for doing so would have been the ability to create ensemble vector representations of different input types, rather than simply for images (Yoo, Paragraph 0144). Regarding claim 19, the rejection of claim 14 is incorporated. The proposed combination thus far does not teach wherein the plurality of pre-trained feature extractors is selected from a plurality of convolutional neural network. Yoo teaches wherein the plurality of pre-trained feature extractors is selected from a plurality of convolutional neural network (Yoo, Paragraph 0144, Lines 8-11, “Further, in a case of an encoder, models such as deep neural networks, graph convolutional networks and 3D convolutional neural networks may be used according to an input”). It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to have modified the method for determining an ensemble vector representation as taught by the proposed combination to include selecting pre-trained feature extractors from a plurality of convolutional neural networks as taught by Yoo.
The motivation for doing so would have been the ability to select a model type based on the input type (Yoo, Paragraph 0144). Response to Arguments Applicant’s amendments to the specification with respect to objections to the specification have been fully considered, but do not overcome the objections set forth in the final office action dated 11/13/2025. The amendment to the specification filed 02/12/2026 still includes an embedded hyperlink on line 10. Applicant’s amendments to the claims with respect to 35 U.S.C. 112(b) indefiniteness rejections have been fully considered, and overcome the rejections set forth in the final office action dated 11/13/2025. Consequently, the previous 35 U.S.C. 112(b) indefiniteness rejections to the claims have been withdrawn; however, a new ground of rejection has been identified because of applicant’s amendments. Applicant’s arguments regarding the 35 U.S.C. 101 rejections of the claims have been fully considered but are unpersuasive. Applicant first argues, on page 10, paragraphs 1-2 of the response, that the claims are not directed to an abstract idea. Examiner respectfully disagrees. Applicant specifically mentions the “specific architecture of the neural network model”; however, it is noted that the features upon which applicant relies (i.e., the “specific architecture”) are not recited in the rejected claims. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. Applicant points to “training framework comprising a plurality of pre-trained feature extractors and a plurality of multi-layer perceptrons (MLPs) corresponding to the plurality of pre-trained feature extractors”; however, this limitation does not recite a “specific architecture”; it only requires that the feature extractors correspond to the MLPs, but makes no specific limitations on how they correspond.
Applicant next points to the 2025 memo and the published USPTO example 39; however, the fact patterns of the instant application are not identical to those of the published USPTO example 39, and thus the same rationale cannot be applied. Further, applicant argues “computing a loss objective between the set of feature vectors and the plurality of mapped representations” cannot be practically performed in the human mind; however, this limitation has been identified as a mathematical concept; for an in-depth analysis see the updated 35 U.S.C. 101 rejection above. Further, the limitation “training the plurality of MLPs and the memory bank vector based on the computed loss objective” was not identified as an abstract idea but rather an additional element that amounted to insignificant extra-solution activity; see the updated 35 U.S.C. 101 rejection above. Applicant next argues, on page 10, final paragraph – page 11, paragraph 2 of the response, that the abstract idea is integrated into a practical application of feature extraction. Examiner respectfully disagrees. Applicant specifically points to improved “feature extraction”; it is important to note that an improvement in the abstract idea itself (e.g., a recited mathematical concept) is not an improvement in technology, see MPEP 2106.05(a). Applicant next points to Appeal 2024-000567, and asserts that the claims reflect an improvement in feature extraction and are thus eligible. Examiner respectfully disagrees. The fact patterns of the instant application are not identical to those of Appeal 2024-000567, and thus the same rationale cannot be applied. Further, it is important to note that an improvement in the abstract idea itself (e.g., a recited mathematical concept) is not an improvement in technology, see MPEP 2106.05(a). Applicant's arguments regarding the remainder of the claims rely upon the arguments asserted with respect to the independent claims, and are thus unpersuasive.
Applicant’s arguments regarding the 35 U.S.C. 103 rejections of the claims have been fully considered but are unpersuasive. Applicant first argues, on page 13, final paragraph – page 14, paragraph 1, that Shi does not teach any “memory bank vector as an input to each of the plurality of MLPs” of claim 1 and that Shi discusses “a completely different architecture” than amended claim 1. Examiner respectfully disagrees. The specific “architecture” is not reflected in the claims, and the broadest reasonable interpretation of the limitations of claim 1 includes the architecture disclosed by Shi. Further, as discussed in the updated 35 U.S.C. 103 rejection above, the decoders of the SparsityAEs disclosed by Shi receive the “memory bank vector” as input to generate the mapped representations. Applicant next argues, on page 14, paragraph 2 of the response, that Shi fails to disclose “mapping, via the plurality of MLPs, the memory bank vector into a plurality of mapped representations, each corresponding to a respective one of the set of feature vectors”. Examiner respectfully disagrees. The mapped representations are used with their corresponding feature vectors during loss function calculations, and thus each is considered to be “corresponding to a respective one of the set of feature vectors”. Applicant next argues, on page 14, final paragraph – page 15, paragraph 1 of the response, that Shi fails to disclose “computing a loss objective between the set of feature vectors and the plurality of mapped representations”. Examiner respectfully disagrees. With regard to the cost function of Shi, “L(X, Y; θ) = ‖y_i − ŷ(x_i)‖²₂/N + β · KL(p̂ ‖ p)” (Shi, Page 4, Col 1, Equation 5), y_i is the original input and ŷ(x_i) is the output.
The input of the autoencoder is considered to be the “set of feature vectors” and the output is considered to be the “plurality of mapped representations”, as can be seen in the image above in the in-depth analysis of the 35 U.S.C. 103 rejection. Applicant next argues, on page 15, paragraph 2 of the response, that Shi fails to disclose “training the plurality of MLPs and the memory bank vector based on the computed loss objective”. Examiner respectfully disagrees. Applicant argues the claim recites a “same input” passed to each of the autoencoders. However, the claim only requires “determining, in response to an image sample from the dataset, a set of feature vectors via the plurality of pre-trained feature extractors”. This limitation does not rule out an interpretation where an input is passed to an ensemble and each individual feature extractor receives a different input; thus, Shi does teach the aforementioned limitation. Applicant's arguments regarding the remainder of the claims rely upon the arguments asserted with respect to the independent claims, and are thus unpersuasive. Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. J. Song, Y. Yang, Y.-Z. Song, T. Xiang and T. M. Hospedales, "Generalizable Person Re-Identification by Domain-Invariant Mapping Network," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 719-728, doi: 10.1109/CVPR.2019.00081: This reference teaches a domain-invariant mapping network that does not require model updating for target domains and uses a memory bank for maintaining scalability and discrimination ability. Z. Wu, Y. Xiong, S. X. Yu and D. Lin, "Unsupervised Feature Learning via Non-parametric Instance Discrimination," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp.
3733-3742, doi: 10.1109/CVPR.2018.00393: This reference teaches an unsupervised feature learning approach using a backbone CNN to encode each image as a feature vector, and the optimal feature embedding is learned via instance-level discrimination. Liu et al., Unsupervised Learning using Pretrained CNN and Associative Memory Bank, 05/02/2018, https://arxiv.org/abs/1805.01033: This reference teaches a new architecture and approach for unsupervised object recognition; it uses a pretrained CNN model for automated feature extraction pipelined with a Hopfield network based associative memory bank for storing patterns for classification. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOLLY CLARKE SIPPEL whose telephone number is (571)272-3270. The examiner can normally be reached Monday - Friday, 7:30 a.m. - 4:30 p.m. ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki, can be reached at (571)272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /M.C.S./ Examiner, Art Unit 2122 /KAKALI CHAKI/ Supervisory Patent Examiner, Art Unit 2122

Prosecution Timeline

Jan 28, 2022
Application Filed
May 01, 2025
Non-Final Rejection — §101, §103, §112
Aug 25, 2025
Applicant Interview (Telephonic)
Aug 25, 2025
Examiner Interview Summary
Sep 12, 2025
Response Filed
Oct 30, 2025
Final Rejection — §101, §103, §112
Dec 22, 2025
Applicant Interview (Telephonic)
Dec 22, 2025
Examiner Interview Summary
Dec 31, 2025
Response after Non-Final Action
Feb 12, 2026
Request for Continued Examination
Feb 24, 2026
Response after Non-Final Action
Mar 10, 2026
Non-Final Rejection — §101, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602592
NOISE COMMUNICATION FOR FEDERATED LEARNING
2y 5m to grant Granted Apr 14, 2026
Patent 12596916
CONSTRAINED MASKING FOR SPARSIFICATION IN MACHINE LEARNING
2y 5m to grant Granted Apr 07, 2026
Study what changed to get past this examiner. Based on 2 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
50%
Grant Probability
99%
With Interview (+58.3%)
3y 7m
Median Time to Grant
High
PTA Risk
Based on 14 resolved cases by this examiner. Grant probability derived from career allow rate.
