Last updated: May 29, 2026
Application No. 17/764,015
CLUSTERING DATA USING NEURAL NETWORKS BASED ON NORMALIZED CUTS

Final Rejection §101§103§112
Filed
Mar 25, 2022
Priority
Sep 25, 2019 — provisional 62/906,081 +1 more
Examiner
BAKER, EZRA JAMES
Art Unit
2126
Tech Center
2100 — Computer Architecture & Software
Assignee
Google LLC
OA Round
2 (Final)
This examiner grants 50% of cases after interview

— +53.3% interview lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 16 resolved cases, 2023–2026
Examiner Intelligence

BAKER, EZRA JAMES View full profile →
Grants 50% of resolved cases
Career Allowance Rate
8 granted / 16 resolved
-5.0% vs TC avg
Strong +53% interview lift
Without
With
+53.3%
Interview Lift
resolved cases with interview
Typical timeline
4y 0m
Avg Prosecution
23 currently pending
Career history
Total Applications
across all art units
Statute-Specific Performance

§101
5.5%
-34.5% vs TC avg
§103
90.8%
+50.8% vs TC avg
§102
3.7%
-36.3% vs TC avg
Black line = Tech Center average estimate • Based on career data from 16 resolved cases
Office Action

§101 §103 §112
DETAILED ACTION
Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Status of Claims
	The present application is being examined under the claims filed 10/09/2025.
	Claims 1-11 and 13-21 are pending.

Response to Amendment
This Office Action is in response to Applicant’s communication filed 10/09/2025 in response to office action mailed 06/26/2025. The Applicant’s remarks and any amendments to the claims or specification have been considered with the results that follow.
Response to Arguments
Regarding 35 U.S.C. 101
In Remarks page 9, Argument 1
(Examiner summarizes Applicant’s arguments) Applicant argues that the claims are not directed to an abstract idea, citing the August 4 memo on subject matter eligibility which reminds examiners not to expand abstract idea groupings to limitations that could not be performed in the human mind. Applicant argues that training a clustering network involves minimizing a loss function across potentially millions of parameters and cannot practically be performed in the human mind.
Examiner’s response to Argument 1
Examiner disagrees. Though machine learning training generally is not directed to a mental process, Examiner points to Example 47 claim 2 of the July 2024 eligibility guidance
(page 6 paragraph 2) “The training algorithm is a backpropagation algorithm and a gradient descent algorithm. When given their broadest reasonable interpretation in light of the background, the backpropagation algorithm and gradient descent algorithm are mathematical calculations. The plain meaning of these terms are optimization algorithms, which compute neural network parameters using a series of mathematical calculations.”
When the details of the machine learning models can be performed using a mental process or mathematical processes, they are directed to an abstract idea. Turning to the claims at issue, the claims recite the limitation 
minimizing a normalized cuts loss function that includes a first term that measures an expected normalized cuts of clustering nodes in a graph representing the data set into the plurality of clusters according to clustering outputs generated by the clustering neural network, wherein: the trained values are determined without guidance from labeled data; nodes in an input graph represent items in the data set and edges in the input graph represent relationships between items in the data set, and the normalized cuts of clustering a given graph measures, for each cluster, a ratio of (i) a total weight of edges that are removed from the given graph to form a disjoint subgraph of the given graph that includes only the nodes in the cluster to (ii) a total weight of edges in the given graph that connect to at least one node in the cluster.
The claim recites a particular mathematical function and a process of optimizing the mathematical function. Just as in example 47 claim 2, this particular step of machine learning recites math. Thus, the claim limitations of the instant application are not merely based on or involve an abstract idea, but could be performed by an abstract idea. Thus the claim still recites an abstract idea.

In Remarks page 9-10, Argument 2
(Examiner summarizes Applicant’s arguments) Applicant argues that the claims are directed to an improvement to computer functionality, not mere instructions to “apply it”. Applicant argues that the claimed method of clustering with normalized cuts loss function in an unsupervised manner can cluster new data without retraining or human-provided labels, and provides a new tool that is more effective and efficient than conventional systems.
Examiner’s response to Argument 2
	Examiner disagrees. According to MPEP 2106.05(a), a technical improvement cannot be provided by an abstract idea alone. A technical improvement can be provided by the additional elements or the additional elements in combination with the abstract ideas. In the instant application, the additional elements are recited in a highly generic manner and do not contribute substantially to any alleged improvements. For example, the additional elements (steps 2A prong 2 and 2B) are directed to mere data inputting/outputting, using a computer as a tool to perform the abstract idea, and generic unsupervised training. Therefore, the claims are not directed to an improvement to technology but rather an abstract idea not integrated into a practical application nor significantly more. See rejections under 35 U.S.C. 101 below for a complete analysis.

Regarding 35 U.S.C. 103
In Remarks page 11, Argument 3
(Examiner summarizes Applicant’s arguments)  Applicant argues that the claims as amended recite training in a purely unsupervised manner while Tang describes weakly supervised learning which requires labeled data. Applicant argues that the supervised cut loss of Tang is not a standalone module for general unsupervised learning but an integrated component of a hybrid loss function requiring labelled data and that there is no suggestion to use only the normalized cut portion for a completely unsupervised problem.
Examiner’s response to Argument 3
Applicant’s arguments are not convincing. Supervised cut loss is a tool often used for unsupervised learning, which is explicitly mentioned by Tang
(page 5 column 1 paragraph 1) “This suggests that normalized cut is a reasonable loss encouraging balanced non-linear partitioning of unlabeled pixels. Our normalized cut loss is motivated by popularity of normalized cut as an unsupervised segmentation criteria with many attractive properties [43, 49].”
Though the particular version of the normalized cuts loss of Tang appears to be purposed for “weakly-supervised” learning, a person having ordinary skill in the art would recognize the benefits of Tang’s approach and have reason to adapt this new loss function back into the purely unsupervised setting (as taught by Shaham) where normalized cuts loss is so often used (as is recognized even by the authors). Accordingly, the rejection is maintained.

In Remarks page 11-12, Argument 4
(Examiner summarizes Applicant’s arguments) Applicant argues that there is a mismatch in system architecture between the direct end-to-end clustering as claimed vs. the indirect embed-then-cluster system taught by Shaham. Applicant argues that Shaham teaches passing a point through a network to obtain an embedding, and is only assigned a cluster via a separate k-means algorithm. Applicant argues Shaham’s network does not directly performing clustering as claimed.
Examiner’s response to Argument 4
	Examiner disagrees. Firstly, Applicant ignores that Andoni was used to teach the limitation of the clustering output, not Shaham nor Tang.
	Second, even considering the combination of Shaham, Tang, and Andoni the result would still meet the requirements of the claim. According to MPEP 2111, a claim must be interpreted with the broadest reasonable interpretation consistent with the specification. Neural networks are often composed of multiple components which perform different functions within the neural network. Particularly, neural network outputs almost always require some form of post-processing to function properly. For example, a person having ordinary skill in the art would consider a softmax applied to the output of an output neuron as a part of the neural network, and not a separate algorithm. Experts in the field expect the output of a neural network to be intelligible and therefore, reasonably regard any necessary processing as part of the neural network itself.
	Moreover, even Shaham et al. regard their entire processing pipeline as a neural network (i.e. SpectralNet)
The training of SpectralNet consists of three components: (i) unsupervised learning of an affinity given the input distance measure, via a Siamese network (see Section 3.2); (ii) unsupervised learning of the map Fθ by optimizing a spectral clustering objective while enforcing orthogonality (see Section 3.1); (iii) learning the cluster assignments, by k-means clustering in the embedded space.
Therefore, the broadest reasonable interpretation of a neural network includes neural networks with post-processing steps such as the k-means clustering of Shaham et al.

In Remarks page 12, Argument 5
Replacing Shaham's learning objective (learning an embedding for a subsequent k-means step) with Tang's loss function (designed for direct pixel-wise segmentation) is not a simple substitution. It would require a wholesale redesign of Shaham's architecture and training philosophy. The Examiner's proposed combination is only made possible by using the Applicant's disclosure as a blueprint, which constitutes improper hindsight.
Examiner’s response to Argument 5,
	Examiner disagrees. Applicant’s argument is a mere assertion that Shaham and Tang are incompatible without providing any supporting reasoning nor evidence to justify the assertion. Applicant argues that Tang’s loss function is designed for direct pixel-wise segmentation. However, Tang states that “Normalized Cut is a popular graph clustering algorithm originally proposed for image segmentation [43]. It is the sum of ratios between the cuts and the volumes.” Tang suggests that (1) pixel-wise segmentation and clustering are not necessarily mutually exclusive and (2) Normalized cut losses are intended to be used for clustering. Therefore, those familiar with the art would readily recognize that the particular normalized cut loss of Tang could be used with the clustering of Shaham, and they would recognize the benefits of doing so.
	Tang page 7 column 1 paragraph 1 recites
After training with pCE only, we fine-tune the network with extra normalized cut loss introduced in Sec. 4.2, see Fig. 6 and Tab. 1. For Tab. 1, MSC means with multi scaling branches and CRF means with post-processing. In Fig. 6, we tried different weight λ for the extra normalized cut loss. Having this extra loss significantly boosts segmentation accuracy.
A person having ordinary skill in the art would recognize that using this loss in place of or as a term in addition to another loss would likely yield performance improvements. Therefore, Examiner’s reasoning in the prior Office Action was not “impermissible hindsight” reasoning and the rejections for all independent and dependent claims are maintained.

In Remarks page 12, Argument 6
The other claims in the application are each dependent on the independent claims, and are allowable for at least the above reasons. Because each claim is deemed to define additional aspects of the disclosure, however, the individual consideration of each claim on its own merits is respectfully requested.
Examiner’s response to Argument 7
	The rejections to the independent claims are maintained and, for similar reasons, none of the claims are found to be allowable.

Regarding 35 U.S.C. 112
In Remarks page 12, Argument 7
Claims 1-11 and 13-21 stand rejected under 35 U.S.C. § 112, second paragraph, as allegedly indefinite. Without conceding the merits of the rejection, the claims have been amended. Reconsideration and withdrawal of the§ 112 rejections are respectfully requested.
Examiner’s response to Argument 7
	Examiner agrees that the rejections under 35 U.S.C. 112(b) of all claims are obviated by the amendments and the rejections are withdrawn accordingly.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.



Claims 1-11 and 13-21 are rejected under 35 U.S.C. 101 for containing an abstract idea without significantly more.

Regarding Claim 1:
	Step 1 – Is the claim to a process, machine, manufacture, or composition of matter?
	Yes, the claim is to a process.
	Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
	Yes, the claim recites the abstract ideas of:
and to process the input data in accordance with the clustering parameters to generate, for each item in the data set and as a direct output of the clustering neural network, a respective clustering output that defines a probability distribution that includes a respective probability for each of a plurality of clusters — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to evaluating a sequence of inputs with parameters to determine an output.
by minimizing a normalized cuts loss function that includes a first term that measures an expected normalized cuts of clustering nodes in a graph representing the data set into the plurality of clusters according to clustering outputs generated by the clustering neural network, wherein: nodes in the input graph represent items in the data set and edges in the input graph represent relationships between items in the data set, and the normalized cuts of clustering a given graph measures, for each cluster, a ratio of (i) a total weight of edges that are removed from the given graph to form a disjoint subgraph of the given graph that includes only the nodes in the cluster to (ii) a total weight of edges in the given graph that connect to at least one node in the cluster —This limitation is directed to the abstract idea of a mathematical process, and mathematical calculations in particular (MPEP 2106.04(a)(2) I. C.). The claim describes the mathematical operation of calculating a particular loss function in words.
	Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
	No, the claim does not recite additional elements that integrate the judicial exception into a practical application. The additional elements:
A method performed by one or more computers, the method comprising: — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
obtaining unlabeled training data for training a clustering neural network having a plurality of clustering parameters, wherein: the training data comprises input data representing a data set of a plurality of items, the input data comprises a respective feature embedding of each of the plurality of items — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
and the clustering neural network is configured to receive the input data — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
and training the clustering neural network on the unlabeled training data to determine trained values of the clustering parameters […] wherein the trained values are determined without guidance from labeled data — This limitation is directed to mere instructions to apply a judicial exception. Using machine learning training to apply a judicial exception (see MPEP 2106.05(f)) is insufficient to integrate the judicial exception into a practical application. Even if the machine learning training is implemented on a generic computer (see MPEP 2106.05(f)(2), 2106.04(d)), the limitation does not integrate the judicial exception into a practical application.

	Step 2B – Does the claim recite additional elements that amount to significantly more than the abstract idea itself?
	No, the claim does not recite additional elements which amount to significantly more than the abstract idea itself. The additional elements as identified in step 2A prong 2:
A method performed by one or more computers, the method comprising: — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
obtaining unlabeled training data for training a clustering neural network having a plurality of clustering parameters, wherein: the training data comprises input data representing a data set of a plurality of items, the input data comprises a respective feature embedding of each of the plurality of items — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
and the clustering neural network is configured to receive the input data — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
and training the clustering neural network on the unlabeled training data to determine trained values of the clustering parameters — Mere instructions to apply a judicial exception (see MPEP 2106.05(f)) and using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.

Regarding Claim 2
Claim 2 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim recites the additional limitations:
Step 2A Prong 1:
processing the feature embedding of each item in the subset using the clustering neural network and in accordance with current values of the clustering parameters to generate a respective probability distribution for each item — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to evaluating a sequence of inputs with parameters to determine an output.
determining, for each particular item in the subset, affinity weights that measure relationships between the particular item and the items in the subset — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to evaluating relationships between data to determine weights.
determining, for each particular item in the subset, a total affinity weight between the particular item and all other items in the subset — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to evaluating data to determine weights.
and determining an update to the current values of the parameters by minimizing the normalized cuts loss function for the subset based on the total affinity weights, the affinity weights, and the probability distributions for the items in the subset — This limitation is directed to the abstract idea of a mathematical process, and mathematical calculations in particular (MPEP 2106.04(a)(2) I. C.). The claim describes the mathematical operation of calculating a particular loss function in words.
Step 2A Prong 2:
wherein the training comprises repeatedly performing the following: sampling a subset of items from the data set — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the training comprises repeatedly performing the following: sampling a subset of items from the data set — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 3
Claim 3 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 2 which included an abstract idea (see rejection for claim 2). The claim recites the additional limitations:
Step 2A Prong 2:
wherein the input data is data representing an input graph of nodes and edges, and wherein the affinity weight between two items in the subset identifies whether there is an edge in the input graph between two nodes in an input graph that represent the two items in the subset — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the input data.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the input data is data representing an input graph of nodes and edges, and wherein the affinity weight between two items in the subset identifies whether there is an edge in the input graph between two nodes in an input graph that represent the two items in the subset — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 4
Claim 4 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 2 which included an abstract idea (see rejection for claim 2). The claim merely recites the additional abstract idea:
Step 2A Prong 1:
wherein the affinity weights are based on distances between the feature embeddings of the items in the subsets in an embedding space — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to performing an evaluation of how far apart two data points are by methods of distance function, k nearest neighbors, etc.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 5
Claim 5 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 2 which included an abstract idea (see rejection for claim 2). The claim merely recites the additional abstract idea:
Step 2A Prong 1:
wherein determining an update to the current values of the parameters by minimizing the normalized cuts loss function for the subset based on the total affinity weights, the affinity weights, and the probability distributions for the items in the subset comprises: determining a gradient of a loss function that satisfies: 
    PNG
    media_image1.png
    56
    293
    media_image1.png
    Greyscale
 where 
    PNG
    media_image2.png
    34
    196
    media_image2.png
    Greyscale
 is a first matrix, 
    PNG
    media_image3.png
    65
    90
    media_image3.png
    Greyscale
 denotes a sum over the elements of the first matrix, Y is a first matrix that includes the probabilities for each of the particular items in the subset,                     
                        Γ
                    
                 is a matrix that satisfies                     
                        Γ
                        =
                        
                            
                                Y
                            
                            
                                T
                            
                        
                        D
                    
                , D is a column vector that includes the total affinity weights for each of the particular items in the subset, and W is a matrix that includes the affinity weights for each particular items in the cluster,                     
                        ⊘
                    
                 denotes element-wise division, and                     
                        ⊙
                    
                 denotes element-wise multiplication — This limitation is directed to the abstract idea of a mathematical process, and mathematical formulas or equations in particular (MPEP 2106.04(a)(2) I. B.). The claim explicitly recites a mathematical formula.
 Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 6
Claim 6 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim recites the additional limitations:
Step 2A Prong 1:
and processing each of the features using an embedding neural network to generate the feature embeddings for the items in the data set — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to .
Step 2A Prong 2:
wherein obtaining the unlabeled training data comprises: receiving features of each of the items in the data set — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein obtaining the unlabeled training data comprises: receiving features of each of the items in the data set — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 7
Claim 7 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 6 which included an abstract idea (see rejection for claim 6). The claim recites the additional limitations:
Step 2A Prong 2:
further comprising: training the embedding neural network to generate feature embeddings that represent affinities between items in the data set — This limitation is directed to mere instructions to apply a judicial exception. Using machine learning training to apply a judicial exception (see MPEP 2106.05(f)) is insufficient to integrate the judicial exception into a practical application. Even if the machine learning training is implemented on a generic computer (see MPEP 2106.05(f)(2), 2106.04(d)), the limitation does not integrate the judicial exception into a practical application.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
further comprising: training the embedding neural network to generate feature embeddings that represent affinities between items in the data set — Mere instructions to apply a judicial exception (see MPEP 2106.05(f)) and using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.


Regarding Claim 8
Claim 8 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 8 which included an abstract idea (see rejection for claim 7). The claim recites the additional limitations:
Step 2A Prong 2:
wherein the embedding neural network is a Siamese neural network — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the embedding neural network.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the embedding neural network is a Siamese neural network — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 9
Claim 9 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim merely recites the additional abstract idea:
Step 2A Prong 1:
after training the clustering neural network, generating a final clustering of the data set into the plurality of clusters — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to performing an evaluation of data using given parameters.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 10
Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 1 which included an abstract idea (see rejection for claim 1). The claim recites the additional limitations:
Step 2A Prong 1:
and generating a clustering of the new data set without re-training the clustering neural network — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). The limitation is directed to a mental process because it amounts to performing an evaluation of data using given parameters.
Step 2A Prong 2:
after training the clustering neural network, receiving a new data set — This limitation is directed to mere data gathering and outputting which has been recognized by the courts (as per Ultramercial, 772 F.3d at 715, 112 USPQ2d at 1754) as insignificant extra-solution activity (see MPEP 2106.05(g)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
after training the clustering neural network, receiving a new data set — This limitation is recited at a high level of generality and amounts to mere data gathering of transmitting and receiving data over a network, which is well-understood, routine, and conventional activity (see MPEP 2106.05(d) II.), which cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 11
Claim 11 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 10 which included an abstract idea (see rejection for claim 10). The claim recites the additional limitations:
Step 2A Prong 2:
wherein the training data comprises a plurality of items of visual data — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the type of training data.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein the training data comprises a plurality of items of visual data — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 13
Independent claim 13 is a computer-readable medium claim corresponding to method claim 1, which was directed to an abstract idea, therefore the same rejection and rationale applies. The only difference is that claim 13 recites the following additional elements treated under step 2A prong 2 and step 2B:
Step 2A Prong 2:
One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 14
Independent claim 14 is a computer system claim corresponding to method claim 1, which was directed to an abstract idea, therefore the same rejection and rationale applies. The only difference is that claim 14 recites the following additional elements treated under step 2A prong 2 and step 2B:
Step 2A Prong 2:
A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2. 
Step 2B:
A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.

Regarding Claim 15
Dependent claim 15 is a computer system claim corresponding to method claim 2, which was directed to an abstract idea, therefore the same rejection and rationale applies.

Regarding Claim 16
Dependent claim 16 is a computer system claim corresponding to method claim 3, which was directed to an abstract idea, therefore the same rejection and rationale applies.

Regarding Claim 17
Dependent claim 17 is a computer system claim corresponding to method claim 4, which was directed to an abstract idea, therefore the same rejection and rationale applies.

Regarding Claim 18
Dependent claim 18 is computer system claim corresponding to method claim 5, which was directed to an abstract idea, therefore the same rejection and rationale applies.

Regarding Claim 19
Dependent claim 19 is a computer system claim corresponding to method claim 6, which was directed to an abstract idea, therefore the same rejection and rationale applies.

Regarding Claim 20
Dependent claim 20 is a computer system claim corresponding to method claim 7, which was directed to an abstract idea, therefore the same rejection and rationale applies.

Regarding Claim 21
Dependent claim 21 is a computer system claim corresponding to method claim 9, which was directed to an abstract idea, therefore the same rejection and rationale applies.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1-2, 4-11, 13-15, and 17-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over NPL reference Shaham et al. “SpectralNet: Spectral Clustering using Deep Neural Networks” in view of NPL reference Tang et al. “Normalized Cut Loss for Weakly-supervised CNN Segmentation” herein referred to as Tang and Andoni et al. (PGPUB no. US 20190228312 A1) herein referred to as Andoni.

Regarding Claim 1
Shaham teaches:
A method performed by one or more computers, the method comprising: obtaining unlabeled training data for training a clustering neural network having a plurality of clustering parameters
(page 1 abstract) “Moreover, the map learned by SpectralNet naturally generalizes the spectral embedding to unseen data points. To further improve the quality of the clustering, we replace the standard pairwise Gaussian affinities with affinities leaned from unlabeled data using a Siamese network.”; (page 4 paragraph 2) “The training of SpectralNet consists of three components: (i) unsupervised learning of an affinity given the input distance measure, via a Siamese network (see Section 3.2); (ii) unsupervised learning of the map F by optimizing a spectral clustering objective while enforcing orthogonality (see Section 3.1); (iii) learning the cluster assignments, by k-means clustering in the embedded space.”

wherein: the training data comprises input data representing a data set of a plurality of items
(page 9 section 5.2.1 paragraph 1) “MNIST is a collection of 70,000 28x28 gray-scale images of handwritten digits, divided to training (60,000)[*Examiner notes: data set of a plurality of items] and test (10,000) sets.”

the input data comprises a respective feature embedding of each of the plurality of items
(page 6 paragraph 4) “A Siamese net maps every data point xi into an embedding zi = Gθsiamese (xi) in some space.”

and training the clustering neural network on the unlabeled training data to determine trained values of the clustering parameters 
(page 4 paragraph 2) “The training of SpectralNet consists of three components: (i) unsupervised learning of an affinity given the input distance measure, via a Siamese network (see Section 3.2); (ii) unsupervised learning of the map Fθ by optimizing a spectral clustering objective while enforcing orthogonality (see Section 3.1); (iii) learning the cluster assignments, by k-means clustering in the embedded space.”

the trained values are determined without guidance from labeled data
(page 1 abstract) “Our end-to-end learning procedure is fully unsupervised.”

wherein: nodes in an input graph represent items in the data set and edges in the input graph represent relationships between items in the data set
(page 4 paragraph 3) “To this end, let w : Rd x Rd -> [0, ∞) be a symmetric affinity function, such that w(x; x’) expresses the similarity between x and x0. Given w, we would like points x; x’ which are similar to each other (i.e., with large w(x; x’)) to be embedded close to each other.”

the trained values are determined without guidance from labeled data
(page 1 abstract) “Our end-to-end learning procedure is fully unsupervised.”

Shaham does not explicitly teach:
and the clustering neural network is configured to receive the input data and to process the input data in accordance with the clustering parameters to generate, for each item in the data set and as a direct output of the clustering neural network, a respective clustering output that defines a probability distribution that includes a respective probability for each of a plurality of clusters
by minimizing a normalized cuts loss function that includes a first term that measures an expected normalized cuts of clustering nodes in a graph representing the data set into the plurality of clusters according to clustering outputs generated by the clustering neural network, 
and the normalized cuts of clustering a given graph measures, for each cluster, a ratio of (i) a total weight of edges that are removed from the given graph to form a disjoint subgraph of the given graph that includes only the nodes in the cluster to (ii) a total weight of edges in the given graph that connect to at least one node in the cluster

However, Tang teachesby minimizing a normalized cuts loss function that includes a first term that measures an expected normalized cuts of clustering nodes in a graph representing the data set into the plurality of clusters according to clustering outputs generated by the clustering neural network, 
(page 5 column 2 section 5) “To see how capable are neural networks to minimize normalized cut, we train networks for normalized cut loss only in Sec. 5.1.”; (page 2 column 1 first bullet point) “We propose and evaluate a novel loss for weakly supervised semantic segmentation. It combines partial cross entropy on labeled pixels and normalized cut for unlabeled pixels.”

and the normalized cuts of clustering a given graph measures, for each cluster, a ratio of (i) a total weight of edges that are removed from the given graph to form a disjoint subgraph of the given graph that includes only the nodes in the cluster to (ii) a total weight of edges in the given graph that connect to at least one node in the cluster
(page 4 column 1 last paragraph) “Normalized Cut is a popular graph clustering algorithm originally proposed for image segmentation [43]. It is the sum of ratios between the cuts and the volumes”

    PNG
    media_image4.png
    50
    344
    media_image4.png
    Greyscale


	Shaham, Tang, and the instant application are analogous because they are all directed to machine learning.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the clustering of Shaham with the normalized cuts loss function of Tang because (Tang page 2 column 1 bullet point 4) “Experiments show that normalized cut loss achieves the state-of-the-art for training semantic segmentation with scribbles.”

Andoni teaches:
and the clustering neural network is configured to receive the input data and to process the input data in accordance with the clustering parameters to generate, for each item in the data set and as a direct output of the clustering neural network, a respective clustering output that defines a probability distribution that includes a respective probability for each of a plurality of clusters 
(paragraph [0015]) “In a particular aspect, in response to the first input data 101 being input to the first neural network 110, the neural network 110 generates first output data 103 having k numerical values (one for each of the k output nodes), where each of the numerical values indicates a probability that the first input data 101 is part of (e.g., classified in) a corresponding one of the k clusters, and where the sum of the numerical values is one. In the example of FIG. 1B, the k cluster probabilities in the first output data 103 are denoted p1 . . . pk, and the first output data 103 indicates that the first input data 101 is classified into cluster 2 with a probability of (p2=0.91=91%).”; Figure 1B box 103

    PNG
    media_image5.png
    156
    115
    media_image5.png
    Greyscale



	Shaham, Tang, Andoni and the instant application are analogous because they are all directed to machine learning.
	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the clustering of Shaham in view of Tang with the probability outputs as taught by Andoni because (Andoni paragraph [0004]) “There are many different types of machine learning tasks. Clustering is a machine learning task in which a model is trained to accept input data and output an indicator of which of multiple possible clusters the input data belongs to.”

Regarding Claim 2
Shaham in view of Tang and Andoni teaches:
The method of claim 1
(see rejection of claim 1)

Shaham further teaches:
wherein the training comprises repeatedly performing the following: sampling a subset of items from the data set
determining, for each particular item in the subset, affinity weights that measure relationships between the particular item and the items in the subset
(page 6 Algorithm 1)

    PNG
    media_image6.png
    388
    1012
    media_image6.png
    Greyscale


Tang further teaches:
determining, for each particular item in the subset, a total affinity weight between the particular item and all other items in the subset
(page 4 column 1 second to last paragraph) “The cut or assoc for two sets A and B is
defined as 
    PNG
    media_image7.png
    18
    94
    media_image7.png
    Greyscale
”

and determining an update to the current values of the parameters by minimizing the normalized cuts loss function for the subset based on the total affinity weights, the affinity weights, and the probability distributions for the items in the subset
(page 3 column 1 below equation 3) “where Sk ∈ [0, 1]|Ω| is a (soft) support vectors for class k[*Examiner notes: corresponds to probability distributions] combining k-th components Sk p of vectors Sp ∈ [0, 1]K for all points p ∈ Ω.” (page 4 column 2 above equation 6) “Given any affinity matrix W = [Wij ] and degree vector d = W1, we define our joint loss for one image as [Equation 6]”
    PNG
    media_image8.png
    80
    289
    media_image8.png
    Greyscale


It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Shaham and Andoni with Tang for the same reasons given in claim 1 above.


Andoni further teaches:
processing the feature embedding of each item in the subset using the clustering neural network and in accordance with current values of the clustering parameters to generate a respective probability distribution for each item
(paragraph [0015]) “In a particular aspect, in response to the first input data 101 being input to the first neural network 110, the neural network 110 generates first output data 103 having k numerical values (one for each of the k output nodes), where each of the numerical values indicates a probability that the first input data 101 is part of (e.g., classified in) a corresponding one of the k clusters, and where the sum of the numerical values is one. In the example of FIG. 1B, the k cluster probabilities in the first output data 103 are denoted p1 . . . pk, and the first output data 103 indicates that the first input data 101 is classified into cluster 2 with a probability of (p2=0.91=91%).”; Figure 1B box 103; (paragraph [0035]) “During operation in the training mode (FIG. 1A), training data is provided to the neural networks 110, 120, 170 to calculate loss and adjust the parameters of the neural networks 110, 120, 170.”; Figure 1A box 103

    PNG
    media_image9.png
    199
    231
    media_image9.png
    Greyscale


It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Shaham and Tang with Andoni for the same reasons given in claim 1 above.


Regarding Claim 4
Shaham in view of Tang and Andoni teaches:
The method of claim 2
(see rejection of claim 2)

Shaham further teaches:
wherein the affinity weights are based on distances between the feature embeddings of the items in the subsets in an embedding space
(page 9 paragraph 3) “We considered two variants of Gaussian affinity functions: using Euclidean distances (6), and Siamese distances; the latter case follows Algorithm 1.”; (page 1 last sentence) “Finally, spectral clustering has a solid probabilistic interpretation, since the Euclidean distance in the embedding space is equal to a diffusion distance, which, informally, measures the time it takes probability mass to transfer between points”

Regarding Claim 5
Shaham in view of Tang and Andoni teaches:
The method of claim 2
(see rejection of claim 2)

Tang further teaches:
wherein determining an update to the current values of the parameters by minimizing the normalized cuts loss function for the subset based on the total affinity weights, the affinity weights, and the probability distributions for the items in the subset comprises: determining a gradient of a loss function that satisfies: 
    PNG
    media_image1.png
    56
    293
    media_image1.png
    Greyscale
 where 
    PNG
    media_image2.png
    34
    196
    media_image2.png
    Greyscale
 is a first matrix, 
    PNG
    media_image3.png
    65
    90
    media_image3.png
    Greyscale
 denotes a sum over the elements of the first matrix, Y is a first matrix that includes the probabilities for each of the particular items in the subset,                         
                            Γ
                        
                     is a matrix that satisfies                         
                            Γ
                            =
                            
                                
                                    Y
                                
                                
                                    T
                                
                            
                            D
                        
                    , D is a column vector that includes the total affinity weights for each of the particular items in the subset, and W is a matrix that includes the affinity weights for each particular items in the cluster,                         
                            ⊘
                        
                     denotes element-wise division, and                         
                            ⊙
                        
                     denotes element-wise multiplication
Equation 6; (page 5 section 4.3) “This section shows how to compute the gradient of the normalized cut loss[*Examiner notes: determining a gradient] regularizer with a dense Gaussian kernel in linear time.”; [*Examiner notes: see equation 6 annotated below];

    PNG
    media_image10.png
    233
    597
    media_image10.png
    Greyscale



It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Shaham and Andoni with Tang for the same reasons given in claim 1 above.

Regarding Claim 6
Shaham in view of Tang and Andoni teaches:
The method of claim 1
(see rejection of claim 1)

Shaham further teaches:
wherein obtaining the unlabeled training data comprises: receiving features of each of the items in the data set; and processing each of the features using an embedding neural network to generate the feature embeddings for the items in the data set
(page 6 paragraph 4) “A Siamese net maps every data point xi into an embedding zi = Gθsiamese (xi) in some space. The net is typically trained to minimize contrastive loss,”

Regarding Claim 7
Shaham in view of Tang and Andoni teaches:
The method of claim 6
(see rejection of claim 6)

Shaham further teaches:
training the embedding neural network to generate feature embeddings that represent affinities between items in the data set
(page 6 paragraph 4) “A Siamese net maps every data point xi into an embedding zi = Gθsiamese (xi) in some space. The net is typically trained to minimize contrastive loss”

Regarding Claim 8
Shaham in view of Tang and Andoni teaches:
The method of claim 7
(see rejection of claim 7)

Shaham further teaches:
wherein the embedding neural network is a Siamese neural network
(page 6 paragraph 4) “A Siamese net maps every data point xi into an embedding zi = Gθsiamese (xi) in some space. The net is typically trained to minimize contrastive loss”

Regarding Claim 9
Shaham in view of Tang and Andoni teaches:
The method of claim 1
(see rejection of claim 1)

Shaham further teaches:
further comprising: after training the clustering neural network, generating a final clustering of the data set into the plurality of clusters.
(page 7 paragraph 1) “Once SpectralNet is trained, computing the embeddings of new test points (i.e., out-of-sample extension) and their cluster assignments is straightforward: we simply propagate each test point xi through the network Fθ to obtain their embeddings yi, and assign the point to its nearest centroid, where the centroids were computed using k-means on the training data, at the last line of Algorithm 1.”

Regarding Claim 10
Shaham in view of Tang and Andoni teaches:
The method of claim 1
(see rejection of claim 1)

Shaham further teaches:
further comprising: after training the clustering neural network, receiving a new data set and generating a clustering of the new data set without re-training the clustering neural network.
(page 2 paragraph 3) “Moreover, once trained, it provides a function, implemented as a feed-forward network, that maps each input data point to its spectral embedding coordinates. This map can easily be applied to new test data.”

Regarding Claim 11
Shaham in view of Tang and Andoni teaches:
The method of claim 1
(see rejection of claim 1)

Shaham further teaches:
wherein the training data comprises a plurality of items of visual data.
(page 9 section 5.2.1) “MNIST is a collection of 70,000 28x28 gray-scale images of handwritten digits, divided to training (60,000) and test (10,000) sets. To construct positive pairs for the Siamese net, we paired each instance with its two nearest neighbors. An equal number of negative pairs were chosen randomly from non-neighboring points.”

Regarding Claim 13
Claim 13 is a computer-readable medium claim corresponding to method claim 1. The only difference is that claim 13 recites a non-transitory computer-readable medium:
Andoni teaches:
One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations comprising
(paragraph [0089]) “Furthermore, the system may take the form of a computer program product on a computer-readable storage medium or device having computer-readable program code […] As used herein, a “computer-readable storage medium” or “computer-readable storage device” is not a signal.”

	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the clustering of Shaham in view of Tang and Andoni with the computer-readable medium of Andoni because (Andoni paragraph [0091]) “These computer program instructions may also be stored in a computer-readable memory or device that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart block or blocks.”

	The remaining limitations of the claim are taught by the rejection of claim 1.

Regarding Claim 14
Claim 14 is a computer system claim corresponding to method claim 1. The only difference is that claim 14 recites a computer with a processor and storage:
Andoni teaches:
A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising
(paragraph [0089]) “Any suitable computer-readable storage medium or device may be utilized, including hard disks, CD-ROM, optical storage devices, magnetic storage devices, and/or other storage media. Thus, also not shown in FIG. 1, the system 100 may be implemented using one or more computer hardware devices (which may be communicably coupled via local and/or wide-area networks) that include one or more processors, where the processor(s) execute software instructions corresponding to the various components of FIG. 1.”

It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the clustering of Shaham in view of Tang and Andoni with the computer-readable medium of Andoni because (Andoni paragraph [0089]) “the processor(s) execute software instructions corresponding to the various components of FIG. 1”

	The remaining limitations of the claim are taught by the rejection of claim 1.

Regarding Claim 15
Claim 15 is a computer system claim corresponding to method claim 2. The only difference is that claim 2 recites a computer with a processor and storage, taught in claim 14 above.

Regarding Claim 17
Claim 17 is a computer system claim corresponding to method claim 4. The only difference is that claim 17 recites a computer with a processor and storage, taught in claim 14 above.

Regarding Claim 18
Claim 18 is a computer system claim corresponding to method claim 5 The only difference is that claim 18 recites a computer with a processor and storage, taught in claim 14 above.

Regarding Claim 19
Claim 19 is a computer system claim corresponding to method claim 6. The only difference is that claim 19 recites a computer with a processor and storage, taught in claim 14 above.

Regarding Claim 20
Claim 20 is a computer system claim corresponding to method claim 7. The only difference is that claim 20 recites a computer with a processor and storage, taught in claim 14 above.

Regarding Claim 21
Claim 21 is a computer system claim corresponding to method claim 9. The only difference is that claim 21 recites a computer with a processor and storage, taught in claim 9 above.

Claims 3 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Shaham in view of Tang, Andoni, and further in view of NPL reference Shi et al. “Normalized Cuts and Image Segmentation” herein referred to as Shi.

Regarding Claim 3
Shaham in view of Tang and Andoni teaches:
The method of claim 2
(see rejection of claim 2)

Shaham further teaches:
wherein the input data is data representing an input graph of nodes and edges
[*Examiner notes: “graph Laplacian matrix” is a term of art which refers to a matrix representation of a graph containing nodes and edges]; (page 1 abstract) “Our network, which we call SpectralNet, learns a map that embeds input data points into the eigenspace of their associated graph Laplacian matrix and subsequently clusters them”

Shaham in view of Tang and Andoni does not explicitly teach:
and wherein the affinity weight between two items in the subset identifies whether there is an edge in the input graph between two nodes in the input graph that represent the two items in the subset

However, Shi teaches:
and wherein the affinity weight between two items in the subset identifies whether there is an edge in the input graph between two nodes in the input graph that represent the two items in the subset
(page 888 column 2 paragraph 3) “Our approach is most related to the graph theoretic formulation of grouping. The set of points in an arbitrary feature space are represented as a weighted undirected graph G = (V, E), where the nodes of the graph are the points in the feature space, and an edge is formed between every pair of nodes. The weight on each edge, w(i,j), is a function of the similarity between nodes i and j.”; [*Examiner notes: The function w(i,j) are the affinity weights, which identify the weight of the edge between node i and j and would be 0 if there is no edge between nodes i and j.]

	It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the clustering of Shaham in view of Tang, Andoni with the graph of Shi because (Shi page 897 column 1 paragraph 3) “The computational approach that we have developed for image segmentation is based on concepts from spectral graph theory. The core idea is to use matrix theory and linear algebra to study properties of the incidence matrix, W, and the Laplacian matrix, D - W, of the graph and relate them back to various properties of the original graph.”

Regarding Claim 16
Claim 16 is a computer system claim corresponding to method claim 3. The only difference is that claim 16 recites a computer with a processor and storage, taught in claim 14 above.

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ezra J Baker whose telephone number is (703)756-1087. The examiner can normally be reached Monday - Friday 10:00 am - 8:00 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi can be reached at (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/E.J.B./Examiner, Art Unit 2126                                                                                                                                                                                                        

/DAVID YI/Supervisory Patent Examiner, Art Unit 2126
Read full office action
Prosecution Timeline

Mar 25, 2022
Application Filed
Jun 26, 2025
Non-Final Rejection mailed — §101, §103, §112
Oct 09, 2025
Response Filed
Dec 19, 2025
Final Rejection mailed — §101, §103, §112 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/863,840
Patent 12619886
Frozen Model Adaptation Through Soft Prompt Transfer
3y 9m to grant Granted May 05, 2026
17/559,159
Patent 12608619
SUPERSEDED FEDERATED LEARNING
4y 4m to grant Granted Apr 21, 2026
17/455,252
Patent 12585964
EXHAUSTIVE LEARNING TECHNIQUES FOR MACHINE LEARNING ALGORITHMS
4y 4m to grant Granted Mar 24, 2026
17/475,901
Patent 12579477
FEATURE SELECTION USING FEEDBACK-ASSISTED OPTIMIZATION MODELS
4y 6m to grant Granted Mar 17, 2026
17/460,373
Patent 12505379
COMPUTER-READABLE RECORDING MEDIUM STORING MACHINE LEARNING PROGRAM, MACHINE LEARNING METHOD, AND INFORMATION PROCESSING DEVICE OF IMPROVING PERFORMANCE OF LEARNING SKIP IN TRAINING MACHINE LEARNING MODEL
4y 3m to grant Granted Dec 23, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.
Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.
Typically takes 5-10 seconds — AI-generated, attorney review required before filing
Prosecution Projections

3-4
Expected OA Rounds
50%
Grant Probability
99%
With Interview (+53.3%)
4y 0m (~0m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 16 resolved cases by this examiner. Grant probability derived from career allowance rate.