Prosecution Insights
Last updated: April 20, 2026
Application No. 17/582,390

CLASSIFICATION LABEL SELECTION

Final Rejection: §101, §102, §103

Filed: Jan 24, 2022
Examiner: ANSARI, TAHMINA N
Art Unit: 2674
Tech Center: 2600 — Communications
Assignee: Intel Corporation
OA Round: 2 (Final)

Grant Probability: 86% (Favorable)
Expected OA Rounds: 3-4
Estimated Time to Grant: 2y 8m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 86% (743 granted / 868 resolved; +23.6% vs TC avg; above average)
Interview Lift: +17.9% in resolved cases with an interview (strong)
Typical Timeline: 2y 8m average prosecution; 33 applications currently pending
Career History: 901 total applications across all art units
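As a quick arithmetic cross-check (not part of the tool's output), the headline numbers above are mutually consistent:

```python
# Career counts taken from the examiner statistics above.
granted, resolved = 743, 868
total, pending = 901, 33

# Resolved plus currently pending accounts for every application.
assert resolved + pending == total  # 868 + 33 == 901

# 743 / 868 = 85.6%, displayed rounded as 86%.
allow_rate = granted / resolved
print(f"{allow_rate:.1%}")
```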

Statute-Specific Performance

§101: 12.2% (-27.8% vs TC avg)
§103: 40.4% (+0.4% vs TC avg)
§102: 22.6% (-17.4% vs TC avg)
§112: 10.5% (-29.5% vs TC avg)
Tech Center average is an estimate. Based on career data from 868 resolved cases.
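One consistency note on the table above: whatever the underlying metric, subtracting each delta from the examiner's rate recovers the same 40.0% figure for every statute, so the Tech Center average appears to be a single common baseline estimate. A minimal check:

```python
# Per-statute rate and delta vs the Tech Center average, copied from the
# table above. The implied baseline is recovered as rate - delta.
stats = {"§101": (12.2, -27.8), "§103": (40.4, +0.4),
         "§102": (22.6, -17.4), "§112": (10.5, -29.5)}

for statute, (rate, delta) in stats.items():
    implied_tc_avg = round(rate - delta, 1)
    print(statute, implied_tc_avg)  # each statute yields 40.0
```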

Office Action

DETAILED ACTION

Notice of Pre-AIA or AIA Status

Claims 1-25 are pending in this application. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Examiner's Responses to Applicant's Remarks

Applicant's amendments filed on August 25, 2024 have been fully considered. The amendments overcome the following rejections set forth in the Office action mailed on February 25, 2025. Applicant's amendments overcome the interpretation of claims 1-6 and 13-19 under 35 U.S.C. 112, sixth paragraph, and that claim interpretation is hereby withdrawn. Applicant's amendment incorporating the term "non-transitory" is duly noted; the rejection of claims 1-25 under 35 U.S.C. 101 for being directed to non-statutory subject matter on that ground has been overcome, and that rejection is hereby withdrawn. Applicant's arguments with respect to claims 1-25 have been considered but are moot in view of the new grounds of rejection, presented below and necessitated by applicant's amendments.

Specification

Applicant's amendment to the title, "CLASSIFICATION LABLE SELECTION," does not overcome the objections cited: the amended title contains a typographical error, and the title of the invention is still not descriptive. The examiner proposes "SEGMENT FUSION BASED ROBUST SEMANTIC SEGMENTATION OF SCENES USING A GRAPH CLUSTERER" as a possible amendment. A new title is required that is clearly indicative of the invention to which the claims are directed.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-25 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception without significantly more. The eligibility analysis asks: (1) Are the claims directed to a process, machine, manufacture or composition of matter; (2A) Prong One: Are the claims directed to a judicially recognized exception, i.e., a law of nature, a natural phenomenon, or an abstract idea; Prong Two: If the claims are directed to a judicial exception under Prong One, is the judicial exception integrated into a practical application; (2B) If the claims are directed to a judicial exception and do not integrate the judicial exception, do the claims provide an inventive concept.

(Step 1) In the context of the flowchart in MPEP § 2106, subsection III, Step 1 asks: Are the claims directed to a process, machine, manufacture or composition of matter? YES: the instant claims are directed towards a computing system (claim 1), a computer-readable storage medium (claim 7), a semiconductor apparatus (claim 13), and a method (claim 20) comprising a memory and at least one processor, and therefore fall within statutory categories of invention.

(Step 2A) In the context of the flowchart in MPEP § 2106, subsection III, Step 2A Prong One asks: Is the claim directed to a law of nature, a natural phenomenon, or an abstract idea? YES. Independent claim 20 is exemplary of independent claims 1, 7, 13 and 20, and is presented below. Specifically, the claim recites:

20. A method comprising: a.
identifying a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network and point cloud data associated with a scene; b. fusing the plurality of segments into a plurality of instances; and c. selecting classification labels for the plurality of instances.

Under the broadest reasonable interpretation, the instant claims are directed to a judicial exception: an abstract idea in the group of mathematical concepts. As a whole, the claimed features can be interpreted as an overall mathematical process, as they are a natural-language representation of an overall mathematical algorithm, which qualifies as a judicial exception. The first step of "identifying a plurality of segments..." is considered a recited mathematical concept/algorithm. The second step of "fusing the plurality of segments..." is also considered a mathematical concept. The last step of "selecting classification labels..." is also considered a mathematical concept. Nothing in the claim requires more than an operation that a human, armed with an appropriate apparatus executing a mathematical algorithm, can perform. The amendments adding a graph clusterer, a first neural network, and a second neural network are further directed towards a mathematical algorithm implemented using a machine learning algorithm and the architecture surrounding it. These amendments are broad in nature, can be encompassed in a generalized computing system, and do not direct the claim to statutory subject matter. These features, in combination, are applied to overall "input data" comprising "point cloud data" of a "scene" (Specification [0023]), and the term "point cloud data" can be applied to any input that can be computationally processed.
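For orientation only, the three recited steps can be sketched as a toy pipeline. The function names, the greedy distance-threshold clustering, and the majority-vote labeling below are illustrative stand-ins, not the applicant's claimed implementation or the examiner's characterization of it:

```python
import math
from collections import Counter

def identify_segments(feats, eps):
    """Greedy single-link clustering over per-point feature vectors
    (a toy stand-in for the claimed graph clusterer)."""
    segments = []  # each segment is a list of point indices
    for idx, f in enumerate(feats):
        for seg in segments:
            if any(math.dist(f, feats[j]) < eps for j in seg):
                seg.append(idx)
                break
        else:
            segments.append([idx])
    return segments

def fuse_into_instances(segments, sem_labels):
    """Fuse segments whose dominant semantic label matches (toy fusion)."""
    instances = {}
    for seg in segments:
        dominant = Counter(sem_labels[i] for i in seg).most_common(1)[0][0]
        instances.setdefault(dominant, []).extend(seg)
    return list(instances.values())

def select_labels(instances, sem_labels):
    """Select one classification label per instance by majority vote."""
    return [Counter(sem_labels[i] for i in inst).most_common(1)[0][0]
            for inst in instances]

# Toy scene: 2-D "features" per point plus per-point semantic labels.
feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.2, 0.1)]
labels = ["car", "car", "tree", "tree", "car"]
segs = identify_segments(feats, eps=0.5)
insts = fuse_into_instances(segs, labels)
print(select_labels(insts, labels))  # -> ['car', 'tree']
```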
As the overall features of the claims can be interpreted as a mathematical algorithm, the claims meet the Step 2A criteria for a judicial exception. Since the claim as a whole does not integrate the exception into a practical application, the claim is directed to the judicial exception (Step 2A: YES) and requires further analysis under Step 2B (where it may still be eligible if it amounts to an inventive concept).

(Step 2B) In the context of the flowchart in MPEP § 2106, subsection III, Step 2B asks: Does the claim recite additional elements that amount to significantly more than the judicial exception? NO. The instant claims do not apply, rely on, or use the judicial exception in a manner that imposes a meaningful limit on the judicial exception of "identify/fuse a plurality of segments" or "select classification labels," and therefore do not integrate the judicial exception into a practical application. In accordance with MPEP § 2106.04(d), Integration of a Judicial Exception Into A Practical Application [R-07.2022], the following features are directed to a mathematical process that does not integrate the judicial exception into a practical application, as there is no "meaningful limit." In particular, the claim includes additional elements that use a computing system or apparatus to perform the following: a. identify/identifying a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network, and point cloud data associated with a scene; b. fuse/fusing the plurality of segments into a plurality of instances; and c. select/selecting classification labels for the plurality of instances.

Step a.
uses an apparatus to "identify" "segments" at a high level of generality, such that said "segments" can be used in the operation of the recited judicial exception (the mental step of "identify"). Identifying "segments" does not provide for integration of the abstract idea into a practical application, as said "segments" do not change the way in which the apparatus operates. In fact, there are no limits on the apparatus, which is recited at a high level of generality; the apparatus does nothing more than perform the generic computing function of "identifying" in the claim. Step b. can be done mentally or mathematically; there are no specifics about what the "fusing" entails, so one can merely look at the segments and determine the "fusion." This is considered a mathematical calculation step: one can manually correct the "segments" and/or use a mathematical formula for the correction. Step c., "selecting" the "classification labels," can be considered extra-solution activity, as the labels are merely assigned, and could be considered a routine and conventional step in the field of image processing. The step of "selecting" does not make the claim as a whole patent eligible, because the claim as a whole does not integrate the judicial exception into a practical application.

With regard to Step 2B, the pending claims do not show more than what is routine in the art, i.e., the additional elements are nothing more than routine and well-known steps. There is no improvement to technology here. There is only "identifying," "fusing," and "selecting" (an extra-solution step), and it has not been shown that the mental process allows the technology (whether computer technology or any other technology) to do something that it previously was not able to do.
The amendments adding a graph clusterer, a first neural network, and a second neural network are further directed towards a mathematical algorithm implemented using a machine learning algorithm and the architecture surrounding it. These amendments are broad in nature, can be encompassed in a generalized computing system, and do not direct the claim to statutory subject matter.

Independent claims 1, 7 and 13 are presented below and are rejected for the same reasons, as they also include steps (a), (b) and (c). All the independent claims are directed to a judicial exception and do not integrate the judicial exception: they are co-extensive in scope with method claim 20 and only recite additional limitations directed towards the physical components of the computing apparatus in which these systems are embodied (claim 1: a processor coupled to the network controller, and a memory; claim 7: a computing system; claim 13: logic ... implemented in configurable or fixed-functionality hardware).

1. A computing system comprising: a network controller to obtain data corresponding to a scene; a processor coupled to the network controller; and a memory including a set of instructions, which when executed by the processor, cause the processor to: a. identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network, and point cloud data associated with the scene, b. fuse the plurality of segments into a plurality of instances, and c. select classification labels for the plurality of instances.

7. At least one computer readable storage medium comprising a set of instructions, which when executed by a computing system, cause the computing system to: a.
identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network, and point cloud data associated with a scene; b. fuse the plurality of segments into a plurality of instances; and c. select classification labels for the plurality of instances.

13. A semiconductor apparatus comprising: one or more substrates; and logic coupled to the one or more substrates, wherein the logic is implemented at least partly in one or more of configurable or fixed-functionality hardware, the logic to: a. identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network, and point cloud data associated with a scene; b. fuse the plurality of segments into a plurality of instances; and c. select classification labels for the plurality of instances.

The claimed features would, at best, invoke analysis under MPEP § 2106.05(a) as to whether they qualify as improvements to the functioning of a computer or to any other technology or technical field. Even considering the relevant considerations for evaluating whether the additional elements amount to an inventive concept, the limitations do not recite any elements that qualify as "significantly more" when recited in a claim with a judicial exception. The amendments adding a graph clusterer, a first neural network, and a second neural network are further directed towards a mathematical algorithm implemented using a machine learning algorithm and the architecture surrounding it. These amendments are broad in nature, can be encompassed in a generalized computing system, and do not direct the claim to statutory subject matter.
At best, the claimed limitations only recite features that "apply" the judicial exception and add insignificant extra-solution activity to the judicial exception. Dependent claims 2-6, 8-12, 14-19, and 21-25 are rejected for the same reasons; these claims are directed to a judicial exception and do not integrate the judicial exception.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-12 and 20-25 are rejected under 35 U.S.C. 103 as being unpatentable over Nie et al. (US PGPub 20220101489), hereby referred to as "Nie," in view of Cheng et al. (US PGPub 20220301173, filed on March 17, 2021), hereby referred to as "Cheng."

Consider Claims 1, 7, and 20. Nie teaches: 1. A computing system comprising: a network controller to obtain data corresponding to a scene; / 7. At least one computer readable storage medium comprising a set of instructions, which when executed by a computing system, cause the computing system to: / 20. A method comprising: (Nie: abstract, A learning model may provide a hierarchy of convolutional layers configured to perform convolutions upon image features, each layer other than a topmost layer convoluting the image features at a lower resolution to a higher layer, and each layer other than a bottommost layer returning the image features to a lower layer.
Each layer fuses the lower resolution image features received from a higher layer with same resolution image features convoluted at the layer, so as to combine large-scale and small-scale features of images. Layers of the hierarchy may be substantially equal to a number of lateral convolutions at a bottommost convolutional layer. The bottommost convolutional layer ultimately passes the fused features to an attention mapping module, which utilizes two attention mapping pathways in combination to detect non-local dependencies and interactions between large-scale and small-scale features of images without de-emphasizing local interactions. [0022]-[0028], Figures 1-2) 1. a processor coupled to the network controller; and a memory including a set of instructions, which when executed by the processor, cause the processor to: (Nie: [0022]-[0028], Figures 1-2, [0023] The learning system 100 may be implemented over a network 102 of physical or virtual server nodes 104(1), 104(2), . . . , 104(N) (where any unspecified server node may be referred to as a server node 104) connected by physical or virtual network connections. [0024] A learning model 110 implemented on a computing host accessed through an interface of the learning system 100 as described in example embodiments of the present disclosure may be stored on physical or virtual storage of a computing host 112 (“computing host storage 114”), and may be loaded into physical or virtual memory of the computing host 112 (“computing host memory 116”) in order for one or more physical or virtual processor(s) of the computing host 112 (“computing host processor(s) 118”) to perform computations using the learning model 110 to compute semantic segmentation as described herein. Computing host processor(s) 118 may be special-purpose computing devices facilitating computation of matrix arithmetic computing tasks. 
For example, computing host processor(s) 118 may be one or more special-purpose processor(s) 104 as described above, including accelerator(s) such as Neural Network Processing Units (“NPUs”), Graphics Processing Units (“GPUs”), Tensor Processing Units (“TPU”), and the like.) 1. identify a plurality of segments based on semantic features, instance features and point cloud data associated with the scene, / 7. identify a plurality of segments based on semantic features, instance features and point cloud data associated with a scene; / 20. identifying a plurality of segments based on semantic features, instance features and point cloud data associated with a scene; (Nie: [0032] According to example embodiments of the present disclosure, at a preliminary convolutional layer 212 of the learning model 200, the preliminary convolutional layer 212 may perform down-sampling convolution upon the original sample image data. The original sample image data may be 64× resolution sample image data. Resolution of the 64× resolution sample image data may be down-sampled to 16× resolution by a stride of the preliminary convolutional layer 212; a pooling layer of the convolutional layer 212; or a combination thereof. According to example embodiments of the present disclosure, both a strided convolution and a pooling convolution are applied to the sample image data.[0033]-[0035], [0036] The bottommost convolutional layer 208 performs convolution upon the down-sampled sample image data. According to example embodiments of the present disclosure, a strided convolution, a pooling convolution, or a combination thereof may be applied to the 16× resolution sample image data.) 1. fuse the plurality of segments into a plurality of instances, / 7. fuse the plurality of segments into a plurality of instances; / 20. fusing the plurality of segments into a plurality of instances; and (Nie: [0036] The bottommost convolutional layer 208 performs convolution upon the down-sampled sample image data. 
According to example embodiments of the present disclosure, a strided convolution, a pooling convolution, or a combination thereof may be applied to the 16× resolution sample image data. [0037], [0038] However, the bottommost convolutional layer 208 also performs convolution upon the 16× resolution features within the same convolutional layer, represented in FIG. 2 as a lateral arrow within the same convolutional layer; upon receiving 8× resolution features from the next convolutional layer 214 as a second input to the bottommost convolutional layer 208 (as shall be described subsequently), the bottommost convolutional layer 208 may perform feature fusion between the 16× resolution features and the 8× resolution features, as shall be described subsequently with reference to FIG. 3. Such feature fusion may be performed each time upon receiving 8× resolution features from the next convolutional layer 214. The fused features, which are 16× in resolution, as a second output of the bottommost convolutional layer 208, are successively output to increasingly higher convolutional layers: first a next convolutional layer 214, then a yet next convolutional layer 216, and then a topmost convolutional layer 210A or a yet further next convolutional layer 218, represented in FIG. 2 as successive upward and leftward arrows.) 1. and select classification labels for the plurality of instances./ 7. and select classification labels for the plurality of instances. / 20. selecting classification labels for the plurality of instances. (Nie: [0072] Furthermore, as FIG. 4C illustrates, the parallel unary-pairwise methodology performs closest to the ground truth (“GT”) labeled segmentations by detecting more fine detail between the multiple lampposts compared to the other methodologies illustrated. 
[0075] Sample data may generally be any labeled dataset indicating particular features of images and/or particular segmentations within images, at least some segmentations having semantic meaning distinct from each other. The dataset may be labeled to indicate that features, segmentations, and other aspects of images are positive or negative for a particular result, such as presence or absence of an object which may be detected. Moreover, the dataset may be labeled to indicate attention among different segmentations within images.) Nie does not teach: identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network and point cloud data associated with the scene Cheng teaches: 1. A computing system comprising: a network controller to obtain data corresponding to a scene; / 7. At least one computer readable storage medium comprising a set of instructions, which when executed by a computing system, cause the computing system to: / 20. A method comprising: (Cheng: abstract, Methods and systems for graph-based panoptic segmentation of point clouds are described herein. The methods receive points of a point cloud with a semantic label from a first category. Further, a plurality of unified cluster feature vectors from a second category are received. Each unified cluster feature vector is extracted from a cluster of points in the point cloud. A graph comprising nodes and edges is constructed from the plurality of unified cluster feature vectors. Each node of the graph is the unified feature vector, and each edge of the graph indicates the relationship between every two nodes of the graph. The edges of the graph are represented as an adjacency matrix, wherein the adjacency matrix indicates the existence, or the lack of existence, of an edge between every two nodes. 
The graph is fed to a graph convolutional neural network configured for predicting an instance label for each node or an attribute for each edge, wherein the attribute of each edge is used for assigning the instance label to each node. The method combines points with semantic labels for the first category and points with instance labels for the second category to generate points with both a sematic label and an instance label.) 1. a processor coupled to the network controller; and a memory including a set of instructions, which when executed by the processor, cause the processor to: (Cheng: [0034] The processing system 100 may include one or more processing devices 104, such as a processor, a microprocessor, a graphics processing unit (GPU), a tensor processing unit (TPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a dedicated logic circuitry, or combinations thereof. The processing system 100 may optionally include one or more input/output (I/O) interfaces 106 to enable interfacing with one or more optional input devices 108 and/or output devices 110. The processing system 100 may include one or more network interfaces 112 for wired or wireless communication with other processing systems. The network interface(s) 112 may include wired links (e.g., Ethernet cable) and/or wireless links (e.g., one or more antennas) for intra-network and/or inter-network communications. [0035] The processing system 100 may also include one or more storage unit(s) 114, which may include a mass storage unit such as a solid-state drive, a hard disk drive, a magnetic disk drive and/or an optical disk drive. In some example embodiments, the storage unit(s) 114 may include a database 116 for storing training datasets which may be used to train parts of the graph-based panoptic segmentation system 102 as described in further detail below. Although FIG. 
1 illustrates the storage unit(s) 114 to include the database 116, in alternative embodiments, the database 116 may be included in one or more remote storage unit(s) that can be accessed remotely via the network interface 112. The database 116 may need to be loaded in memory 118 before being used by the processing device 104. [0036] The processing system 100 may include one or more non-transitory memories 118, which may include a volatile or non-volatile memory (e.g., a flash memory, a random access memory (RAM), and/or a read-only memory (ROM)). The non-transitory memory(ies) 118 may store instructions for execution by the processing device(s) 104, such as to carry out example methods described in the present disclosure.) 1. identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network and point cloud data associated with the scene, (Cheng: [0012]-[0016], [0021] In some example aspects of the system, each unified cluster feature vector is extracted from a plurality of points of a point cloud using at least one of a learnable sparse convolution operation and a PointNet model, which maps the plurality of points the cluster to a 1×k vector, where k is a hyperparameter. In some example aspects of the system, the unified cluster feature vector includes a centroid value of each cluster, generating a unified cluster feature vector of size 1×(k+3). [0022] In some example aspects of the system, each point of the point cloud comprises at least spatial coordinates and a semantic label of the point. [0023] In some example aspects of the system, the plurality of clusters are determined using at least one of k-means clustering, partition around medoids clustering, and density-based clustering (DBSCAN). ) 1. 
identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network and point cloud data associated with the scene, / 7. identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network and point cloud data associated with a scene; / 20. identifying a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network and point cloud data associated with a scene; (Cheng: [0042] FIG. 2 is a block diagram of an example instance segmentation in accordance with example embodiments. The instance segmentation subsystem 122 performs several modules, including a filtration module 202, a clustering module 204, an embedding module 206, and a graph representation module 208. The instance segmentation subsystem 122 also includes a graph convolutional neural network (GCNN) 210. The instance segmentation subsystem 122 receives an output from the semantic segmentation subsystem 120, which is points of a point cloud with semantic labels. It may also receive labelled datasets which are used to train the GCNN 210. Each labelled dataset includes a plurality of labeled point clouds. Each labeled point cloud includes a plurality of points where each point of the plurality of points is labeled with a ground truth semantic label and instance label to train the GCNN 210. [0043] The filtration module 202 is configured to select only points with semantic labels of categories of things for processing by clustering module 204, which partitions points of the same semantic label into clusters of points. The clusters of points are fed to the embedding module 206 configured to extract a unified cluster feature vector from every cluster of points.
The unified cluster feature vectors are fed into graph representation 208 to create a graph of nodes and edges. The graph, comprising nodes and edges, is fed to the graph convolutional network (GCNN) 210 to predict nodes' instance labels or edges' attributes used to determine instance labels for the nodes connected by the edges.) 1. fuse the plurality of segments into a plurality of instances, / 7. fuse the plurality of segments into a plurality of instances; / 20. fusing the plurality of segments into a plurality of instances; and (Cheng: [0041] The instance segmentation subsystem 122 is configured to label points belonging to categories of things with their instance labels; instance labels are unique to each instance of the objects of the categories of things. The fusion 124 merges the output of the semantic segmentation 120 and the output of the instance segmentation 122, generating panoptic segmentation, where points of categories of stuff and points of categories of things are labelled, points of categories of stuff with semantic labels and points of categories of things with instance labels and semantic labels.) 1. and select classification labels for the plurality of instances./ 7. and select classification labels for the plurality of instances. / 20. selecting classification labels for the plurality of instances. (Cheng: [0045] Referring to FIG. 2, clustering module 204 is configured to partition points received from the filtration module 202 into clusters based on a similarity measure. Clustering module 204 applies a clustering operation to the plurality of points of every semantic label. Clustering module 204 groups points with internal similarities. Example embodiments apply different types of clustering methods. For example, k-means clustering, which uses Mahalanobis distance, partition around medoids (PAM) clustering, or density-based clustering (DBSCAN), a nonparametric method. The clustering module 204 groups every plurality of points into a cluster. 
Each cluster of points may have a different number of points than other clusters. In the illustration in FIG. 3, the output of the clustering 204 is clustered points 306; each pattern corresponds to a cluster of points 308 (only two of which are labelled). The cluster patterns are cluster labels representing each cluster of points. Points at this stage have semantic labels and cluster labels. It could be observed that clusters of points 308 are of different sizes; in other words, there is a different number of points in each cluster of points 308. The method implemented by clustering 204 is applied to every semantic label in the point cloud. After feeding the clustered points 306 into the graph representation 208 and the GCNN 210, the output includes points with semantic labels and instance labels, shown as masks 310 (described in detail below).)

It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to improve Nie's learning model architecture for semantic segmentation with Cheng's method and system for graph-based panoptic segmentation using unified cluster feature vectors, as both are directed towards methods and systems for image segmentation. The determination of obviousness is predicated upon the following findings: both are directed towards the use of machine learning algorithms that rely on semantic segmentation, and they lend themselves to combination. One skilled in the art would have been motivated to modify Nie in order to improve the semantic segmentation to leverage the graph-based panoptic segmentation of point clouds using unified cluster feature vectors as proposed by Cheng.
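Cheng's quoted data flow (clusters of points → 1×(k+3) unified feature vectors → graph nodes with an adjacency matrix → GCNN) can be sketched as follows. The dummy embedding and the centroid-distance rule for creating edges are hypothetical stand-ins; Cheng predicts edge attributes with a trained GCNN rather than a fixed threshold:

```python
import math

# Toy clusters of 3-D points (x, y, z), already grouped per semantic label.
clusters = [[(0.0, 0.0, 0.0), (0.2, 0.0, 0.0)],
            [(0.1, 0.1, 0.0)],
            [(9.0, 9.0, 0.0), (9.2, 9.1, 0.0)]]

K = 2  # size of the learned per-cluster embedding (dummy values here)

def unified_vector(cluster):
    """1x(k+3) vector: k embedding values plus the cluster centroid,
    mirroring Cheng's 'unified cluster feature vector' description."""
    n = len(cluster)
    centroid = tuple(sum(p[d] for p in cluster) / n for d in range(3))
    embedding = (0.0,) * K  # stand-in for PointNet / sparse-conv features
    return embedding + centroid

nodes = [unified_vector(c) for c in clusters]

# Adjacency matrix: 1 where two nodes' centroids are close (hypothetical
# rule), 0 otherwise; Cheng instead learns edge existence/attributes.
EPS = 1.0
adj = [[1 if i != j and math.dist(a[K:], b[K:]) < EPS else 0
        for j, b in enumerate(nodes)] for i, a in enumerate(nodes)]
print(adj)  # clusters 0 and 1 are linked; cluster 2 is isolated
```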
Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and programming techniques, without changing a “fundamental” operating principle of Nie, while the teaching of Cheng continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of improving the semantic segmentation process to combine “points with semantic labels for the first category and points with instance labels for the second category to generate points with both a semantic label and an instance label” for increased accuracy and efficiency. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question. Consider Claims 2, 8 and 21. The combination of Nie and Cheng teaches: 2. The computing system of claim 1, wherein the plurality of segments is to be fused into the plurality of instances via a learnable self-attention based network, and wherein the computing system is end-to-end learnable. / 8. The at least one computer readable storage medium of claim 7, wherein the plurality of segments is to be fused into the plurality of instances via a learnable self-attention based network. / 21. The method of claim 20, wherein the plurality of segments is fused into the plurality of instances via a learnable self-attention based network. (Nie: [0057] Based on the above, it may be seen that the bottommost convolutional layer 208 may, upon each time performing feature fusion between 16× resolution and 8× resolution features, pass the fused features upward to a higher convolutional layer. 
For each upward pass, the fused features may be passed to a successively higher convolutional layer: first the next convolutional layer 214, then the further next convolutional layer 216, then the yet further next convolutional layer 218. These upward passes are illustrated in FIG. 2 as leftward, rather than rightward, upward arrows. According to the “equilateral triangle” architecture as described above, the number of upward passes should result in each higher convolutional layer being passed to once except a topmost convolutional layer. [0058] Following the final lateral convolution by the bottommost convolutional layer 208, the bottommost convolutional layer 208 may pass the fused features to an attention mapping module 220. According to example embodiments of the present disclosure, an attention mapping module 220 may operate by multiple attention-mapping strategies to determine semantic dependencies across non-local pixels. Architecture of the attention mapping module 220 is subsequently described with reference to FIGS. 4A and 4B. [0059] FIG. 3 illustrates a feature fusion operation 300 according to example embodiments of the present disclosure. [0060]-[0070], Figures 4A-B, [0066] FIGS. 4A and 4B illustrate an attention mapping module 400 according to example embodiments of the present disclosure.) Consider Claims 3, 9 and 22. The combination of Nie and Cheng teaches: 3. The computing system of claim 1, wherein the plurality of segments is to be fused into the plurality of instances based on an instance loss function, a segment loss function, and a distance margin parameter. / 9. The at least one computer readable storage medium of claim 7, wherein the plurality of segments is to be fused into the plurality of instances based on an instance loss function, a segment loss function, and a distance margin parameter. / 22. 
The method of claim 20, wherein the plurality of segments is fused into the plurality of instances based on an instance loss function, a segment loss function, and a distance margin parameter. (Nie: A loss function, or more generally an objective function, is generally any mathematical function having an output which may be optimized during the training of a learning model. [0077] Training of the learning model may, in part, be performed to train the learning model on at least one loss function to learn a weight set operative to compute a task for a particular function. The at least one loss function may be any conventional objective function operative for the learning model to be trained on for this purpose. For example, the at least one loss function may be an object detection loss function utilized in training a learning model to perform object detection tasks from images. [0078] As object detection problems generally feature relatively few positively labeled samples compared to many negatively labeled samples with regard to any particular result, object detection loss functions may focus on finding the hardest negatively labeled samples among all negatively labeled samples. For example, an object detection loss function may be an online hard example mining (“OHEM”) loss function, operative to select negatively labeled samples having highest loss. [0079] The at least one loss function may be a boundary detection loss function utilized in training a learning model to perform boundary detection tasks from images. For example, a boundary detection loss function may be a Dice loss function, operative to detect similarity between two samples at both small-scale and large-scale. 
[0080] For example, according to example embodiments of the present disclosure, a primary loss function may be a joint OHEM and Dice loss function, and the learning model may additionally be trained on multiple auxiliary loss functions as known to persons skilled in the art directed to the above-mentioned tasks or other related tasks. According to example embodiments of the present disclosure, the learning model may be trained on the primary loss function at the ultimate output of the learning model 200, and may be trained on one or more auxiliary loss functions at lateral convolutions of the bottommost convolutional layer 208.) Consider Claims 4, 10 and 23. The combination of Nie and Cheng teaches: 4. The computing system of claim 3, wherein the segment loss function is to penalize fusion mispredictions and separation mispredictions. / 10. The at least one computer readable storage medium of claim 9, wherein the segment loss function is to penalize fusion mispredictions and separation mispredictions. / 23. The method of claim 22, wherein the segment loss function penalizes fusion mispredictions and separation mispredictions. (Nie: [0075] The dataset may be labeled to indicate that features, segmentations, and other aspects of images are positive or negative for a particular result, such as presence or absence of an object which may be detected. Moreover, the dataset may be labeled to indicate attention among different segmentations within images. [0078] As object detection problems generally feature relatively few positively labeled samples compared to many negatively labeled samples with regard to any particular result, object detection loss functions may focus on finding the hardest negatively labeled samples among all negatively labeled samples. For example, an object detection loss function may be an online hard example mining (“OHEM”) loss function, operative to select negatively labeled samples having highest loss. 
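The joint OHEM and Dice primary loss that Nie's paragraph [0080] describes can be illustrated with a short NumPy sketch. This is a hedged approximation, not Nie's implementation: the binary cross-entropy base loss, the `keep_ratio` parameter for hard-example selection, and the equal weighting of the two terms are assumptions introduced here.

```python
import numpy as np

def ohem_loss(pred, target, keep_ratio=0.5):
    """Online hard example mining: average only the highest-loss samples.

    `keep_ratio` (assumed parameter) controls how many of the hardest
    samples are retained, per the selection behavior in Nie [0078].
    """
    eps = 1e-7
    # per-sample binary cross-entropy as the assumed base loss
    bce = -(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))
    k = max(1, int(len(bce) * keep_ratio))
    hardest = np.sort(bce)[-k:]  # keep the k highest-loss samples
    return hardest.mean()

def dice_loss(pred, target):
    """Dice loss: one minus the Dice similarity of prediction and target."""
    eps = 1e-7
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def joint_loss(pred, target, w_ohem=1.0, w_dice=1.0):
    # equal weighting of the two terms is an assumption, not from Nie
    return w_ohem * ohem_loss(pred, target) + w_dice * dice_loss(pred, target)
```

A perfect prediction drives the Dice term to zero, while the OHEM term averages only the hardest (highest-loss) samples, matching the selection behavior described in [0078].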
[0079] The at least one loss function may be a boundary detection loss function utilized in training a learning model to perform boundary detection tasks from images. For example, a boundary detection loss function may be a Dice loss function, operative to detect similarity between two samples at both small-scale and large-scale.) Consider Claims 5, 11 and 24. The combination of Nie and Cheng teaches: 5. The computing system of claim 1, wherein to select the classification labels, the instructions, when executed, further cause the processor to: generate, on a per instance basis, a semantic label for each voxel in the instance, and select the classification label based on semantic labels of voxels in the instance. / 11. The at least one computer readable storage medium of claim 7, wherein to select the classification labels, the instructions, when executed, further cause the computing system to: generate, on a per instance basis, a semantic label for each voxel in the instance; and select the classification label based on semantic labels of voxels in the instance./ 24. The method of claim 20, wherein selecting the classification labels includes: generating, on a per instance basis, a semantic label for each voxel in the instance; and selecting the classification label based on semantic labels of voxels in the instance. (Nie: [0066] FIGS. 4A and 4B illustrate an attention mapping module 400 according to example embodiments of the present disclosure. [0067] According to example embodiments of the present disclosure, the attention mapping module 400 receives a set of fused features and performs attention mapping upon the fused strategies by a pairwise attention operation 402 and a parallel unary attention operation 404. 
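The label selection recited in claims 5, 11 and 24 (generate a semantic label for each voxel of an instance, then select the instance's classification label from those voxel labels) reads naturally as a majority vote over voxel labels. A minimal sketch under that assumption, with illustrative function names and an assumed dict-of-lists layout:

```python
from collections import Counter

def select_classification_label(voxel_semantic_labels):
    """Majority vote over the per-voxel semantic labels of one instance
    (an assumed reading of 'select the classification label based on
    semantic labels of voxels in the instance')."""
    label, _count = Counter(voxel_semantic_labels).most_common(1)[0]
    return label

def label_instances(instances):
    # instances: {instance_id: [per-voxel semantic labels]} (assumed layout)
    return {iid: select_classification_label(v) for iid, v in instances.items()}
```

For example, an instance whose voxels carry labels ["car", "car", "road"] would be classified "car" under this reading.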
[0068] According to the pairwise attention operation 402, attention may be mapped over non-local pairwise pixels of the fused features to generate a long-range attention map, identifying semantic context dependencies over large-scale ranges across the features. Mapping attention over non-local pairwise pixels may be performed by, for example, Asymmetric Pyramid Non-local Block incorporating pyramid subsampling, as proposed by Zhu et al. However, pairwise attention mapping alone in this manner is expected to emphasize large-scale dependencies and interactions between pixels and/or features, potentially at the expense of overlooking or de-emphasizing small-scale dependencies between pixels and/or features.) Consider Claims 6, 12 and 25. The combination of Nie and Cheng teaches: 6. The computing system of claim 1, wherein the plurality of segments is to be variable in size. / 12. The at least one computer readable storage medium of claim 7, wherein the plurality of segments is to be variable in size. / 25. The method of claim 20, wherein the plurality of segments is variable in size. (Examiner Note: a long range attention map that identifies semantic context dependencies over large scale ranges is analogous in scope to segments that are variable in size. Nie: [0059] FIG. 3 illustrates a feature fusion operation 300 according to example embodiments of the present disclosure. [0066]-[0070] [0068] According to the pairwise attention operation 402, attention may be mapped over non-local pairwise pixels of the fused features to generate a long-range attention map, identifying semantic context dependencies over large-scale ranges across the features. Mapping attention over non-local pairwise pixels may be performed by, for example, Asymmetric Pyramid Non-local Block incorporating pyramid subsampling, as proposed by Zhu et al. 
However, pairwise attention mapping alone in this manner is expected to emphasize large-scale dependencies and interactions between pixels and/or features, potentially at the expense of overlooking or de-emphasizing small-scale dependencies between pixels and/or features. [0070] After both the pairwise attention operation 402 and the parallel unary attention operation 404 are completed, the long-range attention map and the position-sensitive attention map are added to generate a combined attention map. This combined attention mapping operation may demonstrate (as shall be discussed subsequently with reference to experimental results) improved performance over attention mapping by addition alone, attention mapping by multiplication alone, attention mapping by addition and multiplication performed in sequence in either order, or attention mapping by concatenation.) Claims 13-19 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang et al. (US PGPub 2018/0165554), hereby referred to as “Zhang”, in view of Nie et al. (US PGPub 2022/0101489), hereby referred to as “Nie”, further in view of Cheng et al. (US PGPub 2022/0301173, filed on March 17, 2021), hereby referred to as “Cheng”. Consider Claim 13. Zhang teaches: 13. A semiconductor apparatus comprising: one or more substrates; (Zhang: [0129] Moreover, a computer can be embedded in another device, e.g., a mobile telephone (e.g., a smartphone), a personal digital assistant (PDA), a mobile audio or video player, a game console, or a portable storage device (e.g., a universal serial bus (USB) flash drive). Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. 
The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry. [0097], Figure 1); and logic coupled to the one or more substrates, wherein the logic is implemented at least partly in one or more of configurable or fixed-functionality hardware, the logic to: (Zhang: [0120] In general, the routines executed to implement the embodiments of the present disclosure may be implemented as part of an operating system or a specific application, component, program, object, module or sequence of instructions referred to as a “computer program.” A computer program typically comprises one or more instruction sets at various times in various memory and storage devices in a computer, and that, when read and executed by one or more processors in a computer, cause the computer to perform operations necessary to execute elements involving the various aspects of the invention.) a. obtain point cloud data associated with a scene; (Zhang: [0017] Clustering acts to effectively reduce the dimensionality of a data set by treating each cluster as a degree of freedom, with a distance from a centroid or other characteristic exemplar of the set. In a non-hybrid system, the distance is a scalar, while in systems that retain some flexibility at the cost of complexity, the distance itself may be a vector. Thus, a data set with 10,000 data points potentially has 10,000 degrees of freedom, that is, each data point represents the centroid of its own cluster. However, if it is clustered into 100 groups of 100 data points, the degrees of freedom is reduced to 100, with the remaining differences expressed as a distance from the cluster definition. Cluster analysis groups data objects based on information in or about the data that describes the objects and their relationships. The goal is that the objects within a group be similar (or related) to one another and different from (or unrelated to) the objects in other groups. 
The greater the similarity (or homogeneity) within a group and the greater the difference between groups, the “better” or more distinct is the clustering.) b. identify relationship instances between the data points based on semantic features (Zhang: [0019] However, in complex data sets, there are relationships between data points such that a cost or penalty (or reward) is incurred if data points are clustered in a certain way. Thus, while the clustering algorithm may split data points which have an affinity (or group together data points which have a negative affinity), the optimization becomes more difficult. [0020] Thus, for example, a semantic database may be represented as a set of documents with words or phrases. Words may be ambiguous, such as “apple”, representing a fruit, a computer company, a record company, and a musical artist. In order to effectively use the database, the multiple meanings or contexts need to be resolved. In order to resolve the context, an automated process might be used to exploit available information for separating the meanings, i.e., clustering documents according to their context.) c. select classification labels for the plurality of instances. (Zhang: [0050] It is therefore an object to provide a method of modelling data, comprising: training an objective function of a linear classifier, based on a set of labeled data, to derive a set of classifier weights; defining a posterior probability distribution on the set of classifier weights of the linear classifier; approximating a marginalized loss function for an autoencoder as a Bregman divergence, based on the posterior probability distribution on the set of classifier weights learned from the linear classifier; and classifying unlabeled data using a compact classifier according to the marginalized loss function. [0097], Figure 1) Even if Zhang does not teach: a. 
identify a plurality of segments using a graph clusterer based on semantic features generated using a first neural network, instance features generated from a second neural network and point cloud data associated with a scene, b. fuse the plurality of segments into a plurality of instances; and Nie teaches: 1. A computing system comprising: a network controller to obtain data corresponding to a scene; / 7. At least one computer readable storage medium comprising a set of instructions, which when executed by a computing system, cause the computing system to: / 20. A method comprising: (Nie: abstract, A learning model may provide a hierarchy of convolutional layers configured to perform convolutions upon image features, each layer other than a topmost layer convoluting the image features at a lower resolution to a higher layer, and each layer other than a bottommost layer returning the image features to a lower layer. Each layer fuses the lower resolution image features received from a higher layer with same resolution image features convoluted at the layer, so as to combine large

Prosecution Timeline

Jan 24, 2022: Application Filed
Mar 08, 2022: Response after Non-Final Action
Feb 20, 2025: Non-Final Rejection — §101, §102, §103
Aug 25, 2025: Response Filed
Nov 05, 2025: Final Rejection — §101, §102, §103
Apr 07, 2026: Request for Continued Examination
Apr 11, 2026: Response after Non-Final Action

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586249: PROCESSING APPARATUS, PROCESSING METHOD, AND STORAGE MEDIUM FOR CALIBRATING AN IMAGE CAPTURE APPARATUS (granted Mar 24, 2026; 2y 5m to grant)
Patent 12586354: TRAINING METHOD, APPARATUS AND NON-TRANSITORY COMPUTER READABLE MEDIUM FOR A MACHINE LEARNING MODEL (granted Mar 24, 2026; 2y 5m to grant)
Patent 12573083: COMPUTER-READABLE RECORDING MEDIUM STORING OBJECT DETECTION PROGRAM, DEVICE, AND MACHINE LEARNING MODEL GENERATION METHOD OF TRAINING OBJECT DETECTION MODEL TO DETECT CATEGORY AND POSITION OF OBJECT (granted Mar 10, 2026; 2y 5m to grant)
Patent 12548297: IMAGE PROCESSING METHOD AND APPARATUS, COMPUTER DEVICE, STORAGE MEDIUM, AND PROGRAM PRODUCT BASED ON FEATURE AND DISTRIBUTION CORRELATION (granted Feb 10, 2026; 2y 5m to grant)
Patent 12524504: METHOD AND DATA PROCESSING SYSTEM FOR PROVIDING EXPLANATORY RADIOMICS-RELATED INFORMATION (granted Jan 13, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 86%
With Interview: 99% (+17.9%)
Median Time to Grant: 2y 8m
PTA Risk: Moderate
Based on 868 resolved cases by this examiner. Grant probability derived from career allow rate.
