Prosecution Insights
Last updated: April 19, 2026
Application No. 18/070,979

MULTICLASS CLASSIFICATION APPARATUS AND METHOD ROBUST TO IMBALANCED DATA

Final Rejection — §101, §103
Filed
Nov 29, 2022
Examiner
COLEMAN, PAUL
Art Unit
2126
Tech Center
2100 — Computer Architecture & Software
Assignee
ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
OA Round
2 (Final)
Grant Probability: 70% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 3y 6m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 70% — above average (7 granted of 10 resolved; +15.0% vs TC avg)
Interview Lift: strong, +42.9% across resolved cases with interview
Typical Timeline: 3y 6m average prosecution
Career History: 33 total applications across all art units; 23 currently pending

Statute-Specific Performance

§101: 36.3% (-3.7% vs TC avg)
§103: 42.0% (+2.0% vs TC avg)
§102: 6.2% (-33.8% vs TC avg)
§112: 12.4% (-27.6% vs TC avg)
Tech Center averages are estimates. Based on career data from 10 resolved cases.

Office Action

§101, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

The present application is being examined under the claims filed 11/12/2025. The status of the claims is as follows: Claims 1, 4-8, and 11-15 are pending; Claims 1, 4, 6, 8, and 11-15 are amended; Claims 2-3 and 9-10 are canceled.

Response to Amendment

This Office Action is in response to Applicant's communication filed November 12, 2025 in response to the Office Action mailed September 30, 2025. The Applicant's remarks and any amendments to the claims or specification have been considered with the results that follow.

Response to Arguments

Regarding 35 U.S.C. § 112(b) / § 112(f): Applicant asserts that the amendments to independent claims 1, 8, and 15 (reciting a "processor" and "memory including one or more instructions …") provide "clear structural context and implementation detail" and thereby resolve any alleged indefiniteness, and further contends that the specification provides corresponding structure/acts for the recited "units", citing, e.g., Spec. ¶¶[0049]-[0050], [0042], [0052]-[0055], [0056]-[0058], and FIGS. 1-9.

Examiner's Response: The § 112(b) rejection is withdrawn as to amended claims 1, 8, and 15 because Applicant's amendments provide sufficient structural context (processor + memory + instructions causing instantiation/operations) such that the previously noted indefiniteness concern is no longer maintained for those claims. As a reminder, the specification indeed describes the claimed functional blocks at a high level – e.g., feature dictionary generation by random sampling / over-sampling and hyperparameter selection; a generator receiving noise and a fake class with softmax convex weighting and convex combination with a feature dictionary; and tuning feature extraction/multiclass classification – and these disclosures are acknowledged. However, the withdrawal is based on the claim amendments curing the prior indefiniteness issue; it is not an allowance on the merits.

Regarding 35 U.S.C. § 101: Applicant argues that the amended claims are directed to a practical application and/or recite "significantly more", emphasizing (i) the "processor instantiating" specific modules, (ii) that the claimed pipeline cannot be practically performed in the human mind, and (iii) comparisons to cases such as McRO and Enfish, and to BASCOM for a purported non-conventional ordered combination.

Examiner's Response: Examiner disagrees; these arguments are not persuasive because:

Adding "processor" and "memory/instructions" does not integrate the judicial exception into a practical application. The non-final Office Action explained that the claims recite, in substance, data pre-processing/organization for later analysis (balancing data; sampling/organizing feature maps into a dictionary; convex combination/softmax weighting; adversarial training), followed by model training/classification, i.e., mathematical concepts and mental processes implemented on generic computing. Merely reciting that these steps are performed by a processor executing instructions is an implementation on generic computer hardware, which does not by itself provide a technological improvement to computer functionality or otherwise integrate the abstract idea into a practical application. The asserted "defined pipeline" is still a sequence of conventional ML data-prep and training operations, as characterized by Applicant's own disclosure.
The non-final Office Action relied on the application's admissions/characterizations that key elements are commonly used or conventional (e.g., backbone feature extraction commonly used in deep learning-based computer vision with pre-learned ImageNet parameters; feature dictionary generation through over-sampling with a tunable sampling count/hyperparameter; and adversarial training described as "commonly used" for generative models), supporting the conclusion that the additional elements amount to routine and conventional implementation of the abstract idea rather than "significantly more".

"Not practically performable in the human mind" is not dispositive here. Even if certain computations are not practically performed mentally at scale, the claims as written still recite mathematical operations and data manipulation steps (e.g., convex combination, softmax-based weights, sampling of feature maps) and organizing/training activities that fall within the abstract idea groupings discussed in the non-final Office Action. Accordingly, the § 101 rejection is maintained.

Regarding 35 U.S.C. § 102: Applicant traverses the § 102 rejection of claims 1, 5, 8, and 12 over Mariani, asserts that "independent claims 1 and 8 are amended to incorporate the subject matter of former dependent claims 2-3 and 9-10, respectively, thereby rendering the rejection based on Mariani moot", and requests withdrawal of the rejection under 35 U.S.C. § 102 (remarks, p. 13).

Examiner's Response: Examiner agrees that, as amended, independent claims 1 and 8 now include additional limitations (e.g., feature dictionary sampling; convex combination using softmax-derived convex weights; a generator receiving noise and a fake class) that were not applied in the non-final § 102 anticipation position over Mariani. Accordingly, the § 102(a)(2) rejections of claims 1, 5, 8, and 12 over Mariani set forth in the non-final Office Action are withdrawn in this Office Action.

Regarding 35 U.S.C. § 103: Applicant argues that Mariani fails to disclose or suggest the claimed "feature dictionary unit" and "convex combination of softmax-weighted features", asserting that Mariani operates on latent vectors rather than feature-map/dictionary blending, and that Wang is directed to feature refinement (non-local blocks) rather than data augmentation and does not teach adversarial training, dictionary sampling, or class-conditional synthesis. Applicant further asserts that the combination is deficient and cannot maintain a prima facie case of obviousness, including as to independent claim 15.

Examiner's Response: Examiner disagrees; these arguments are not persuasive because:

Mariani teaches the claimed "generator configured to receive noise and a fake class". Mariani's BAGAN uses a class-conditional latent vector generator that outputs latent vectors Z_c drawn randomly from a class distribution, and supplies class labels c (uniformly distributed) for generation; the generator produces "fake" images conditioned on the class label. This corresponds to receiving (i) random latent input (noise) and (ii) a class input for generation (i.e., the class/fake labeling framework used in adversarial training).

Wang teaches softmax-based convex weighting and a convex combination (weighted sum) over a set of features (i.e., a "feature dictionary").
The proposed combination remains obvious because Wang provides a known mechanism (softmax-normalized weighting over a feature set) that would have predictably been used within Mariani's generative pipeline to form/weight feature representations. Applicant's "no feature dictionary sampling in either reference" argument is not persuasive under the broadest reasonable interpretation. Accordingly, the § 103 rejection is maintained (including as applied to claim 15), for the reasons above and as set forth in the non-final Office Action.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1, 4-8, and 11-15 are rejected under 35 U.S.C. 101 for containing an abstract idea without significantly more.

Regarding claim 1

Claim 1 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter? Yes, the claim is to a machine.

Claim 1 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea.

"a model learning unit configured to receive the balanced learning data from the balanced learning data configuration unit to provide a class result predicted through model learning" – this is evaluating data and outputting a classification result, i.e., collecting/analyzing information and providing a result, which falls within mental processes / abstract data analysis. See MPEP § 2106.04(a)(2)(III).

"a feature extraction unit configured to extract a feature of the imbalanced learning data" – extracting features from data is mathematical/data-analytic processing (e.g., computing representations/descriptors). This is a mathematical concept under MPEP § 2106.04(a)(2)(I). The specification describes feature extraction in a deep learning computer vision context, i.e., extracting feature maps.

"a feature dictionary unit configured to randomly sample some of feature maps obtained from the feature extraction unit to generate a feature dictionary" – randomly sampling/organizing feature maps into a dictionary constitutes selecting/organizing information, which can be performed conceptually and is treated as an abstract mental process under MPEP § 2106.04(a)(2)(III).

"a feature generating unit configured to generate artificial data, based on a convex combination of a convex weight and the feature dictionary" – a convex combination with convex weights is an expressly mathematical relationship (a linear combination with nonnegative coefficients summing to 1), i.e., a mathematical concept under MPEP § 2106.04(a)(2)(I).

"a convex weighting unit configured to output the convex weight by using softmax" – softmax is a mathematical function; using softmax to compute convex weights is a mathematical concept under MPEP § 2106.04(a)(2)(I).
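For context only, the mathematical character of the dictionary-sampling, softmax-weighting, and convex-combination limitations can be illustrated in a few lines. The following is a minimal numerical sketch with assumed names, shapes, and values; it is not Applicant's implementation:

import numpy as np

rng = np.random.default_rng(0)

def make_feature_dictionary(feature_maps, dict_size):
    # Randomly sample some of the feature maps to form a "dictionary".
    idx = rng.choice(len(feature_maps), size=dict_size, replace=False)
    return feature_maps[idx]

def softmax(logits):
    # Softmax yields nonnegative weights summing to 1, i.e., convex weights.
    z = np.exp(logits - logits.max())
    return z / z.sum()

feature_maps = rng.normal(size=(100, 64))         # 100 feature maps of dimension 64 (assumed)
dictionary = make_feature_dictionary(feature_maps, dict_size=8)
weights = softmax(rng.normal(size=8))             # convex weights output by softmax
artificial = weights @ dictionary                 # convex combination: one synthetic feature, dim 64
assert np.isclose(weights.sum(), 1.0) and (weights >= 0).all()   # valid convex weights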
Claim 1 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application? No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:

"a processor; and memory including one or more instructions that, in response to being executed by the processor, cause the processor at least to instantiate:" – this limitation merely confines performance of the recited operations to a generic computer. It does not integrate any abstract idea into a practical application; it is a generic "apply it on a computer" implementation clause. See MPEP § 2106.05(f). The recited "units" (balanced learning data configuration unit; feature extraction unit; feature dictionary unit; feature generating unit; generator; convex weighting unit; model learning unit) are software functional blocks that perform data processing operations (feature extraction, sampling, weighting, generating, classifying). Limiting the abstract idea to a machine learning context (imbalanced data / multiclass classification) is a field-of-use limitation and does not, by itself, integrate the abstract idea into a practical application. See MPEP § 2106.05(h).

"a balanced learning data configuration unit configured to receive imbalanced learning data to obtain balanced learning data;" – the "configured to receive ... data" portion of this limitation is mere data gathering/extra-solution activity under MPEP § 2106.05(g); the "obtain balanced learning data" portion constitutes pre-processing/organizing information (e.g., resampling, weighting, synthesizing) for later analysis, which is just applying the abstract idea on a computer and does not improve computer functionality. See MPEP § 2106.05(f), § 2106.05(a).

"a generator configured to receive noise and a fake class;" – this limitation does not integrate the judicial exception into a practical application. The limitation merely recites receiving inputs ("noise" and a "fake class") by a generic "generator", without specifying any particular technical improvement in how the generator operates or how the inputs are processed. This limitation is directed to the selection and use of input data for a mathematical or abstract generative process and represents insignificant extra-solution activity that confines the abstract idea to a particular application context (generative modeling). See MPEP § 2106.05(g); § 2106.04(d).

Claim 1 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception? No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:

"a processor; and memory including one or more instructions that, in response to being executed by the processor, cause the processor at least to instantiate:" – a generic processor/memory executing instructions is well-understood, routine, and conventional (WURC) computer functionality (generic execution of software instructions). The application describes conventional computer architecture for executing program instructions (Spec. ¶[0086]). See MPEP § 2106.05(d).

"a balanced learning data configuration unit configured to receive imbalanced learning data to obtain balanced learning data;" – techniques for creating "balanced learning data" (e.g., over/under-sampling, class weighting, augmentation/adversarial synthesis) are well-understood, routine, and conventional (WURC) ML pre-processing; implementing them on generic hardware is likewise conventional. See MPEP § 2106.05(d), (f), (g). The specification itself describes the balanced learning data configuration unit (and its sub-steps) as conventional: the "feature extractor" may use a backbone network which is commonly used ... and may use a ... pre-learned ... ImageNet data set" (Spec. ¶[0049]); and, for feature dictionary generation, the dictionary is created "through over-sampling" with the number of sampled feature maps a tunable hyper-parameter, i.e., routine preprocessing (Spec. ¶[0042]).

"a generator configured to receive noise and a fake class;" – receiving noise and class information as inputs to a generator is well-understood, routine, and conventional (WURC) in generative modeling systems, including class-conditional generative networks. The claim does not recite any unconventional generator architecture, input transformation, or training mechanism that would amount to significantly more than the routine implementation of a generative abstract idea on a generic computing component. Thus, this limitation does not supply an inventive concept. See MPEP § 2106.05(d); Spec. ¶¶[0076]-[0082].

Regarding claim 4

Claim 4 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter? Yes, the claim is to a machine.

Claim 4 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea. "The multiclass classification apparatus of claim 1, wherein the feature generating unit complements a minority class with the artificial data." – this limitation is directed to balancing an imbalanced dataset by adding generated samples to an underrepresented (minority) class. Complementing a minority class with artificial data is a data manipulation / data organization objective that amounts to analyzing and organizing information for downstream classification. See MPEP § 2106.04(a)(2)(I); MPEP § 2106.04(a)(2)(III).

Claim 4 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application? No. There are no additional elements that integrate the judicial exception into a practical application.

Claim 4 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception? No. There are no additional elements that amount to significantly more than the judicial exception.

Regarding claim 5

Claim 5 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter? Yes, the claim is to a machine.

Claim 5 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea. Claim 5 depends from claim 1, which recites an abstract idea (see rejection of claim 1).

Claim 5 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application? No, there are no additional elements that integrate the judicial exception into a practical application. The additional elements: "wherein the feature generating unit performs adversarial training which allows artificial data to be similar to a distribution of real data" – this limitation recites adversarial training only at a high, functional level and in result-oriented terms ("allows ... to be similar"), without any specific training algorithm, loss function, network architecture, parameter settings, or hardware improvements. This is merely instructing a computer/ML model to apply the abstract idea, which is insufficient. See MPEP § 2106.05(f) (mere instructions to apply the exception).

Claim 5 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception. The additional elements: "wherein the feature generating unit performs adversarial training which allows artificial data to be similar to a distribution of real data" – "adversarial training" (e.g., using a generator and discriminator to push the synthetic distribution toward the real distribution) is a well-understood, routine, and conventional ML technique. Implementing it with generic computing components is likewise conventional. See MPEP § 2106.05(d) (WURC), § 2106.05(f).
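To make the "distribution matching" characterization concrete, the following toy sketch shows adversarial-style alternating updates in one dimension; the scalar "generator" parameter and the mean-gap update rule are assumptions for exposition only, not BAGAN's actual training procedure:

import numpy as np

rng = np.random.default_rng(0)
g_shift = 0.0                    # toy generator parameter: G(z) = z + g_shift (assumed)
real_mean = 3.0                  # "real" data distributed as N(3, 1) (assumed)

for _ in range(200):
    real = rng.normal(real_mean, 1.0, size=64)
    fake = rng.normal(0.0, 1.0, size=64) + g_shift   # G(z), z ~ N(0, 1)
    # The gap between real and fake batch statistics is the signal a
    # discriminator would exploit; the generator updates to shrink it.
    g_shift += 0.05 * (real.mean() - fake.mean())

print(round(g_shift, 2))         # approaches 3.0: the fake distribution tracks the real one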
Regarding claim 6

Claim 6 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter? Yes, the claim is to a machine.

Claim 6 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea. "wherein the feature extraction unit comprises: a feature extractor configured to extract the feature; and" – this limitation recites extracting information from image data, which constitutes a mathematical concept (e.g., computing feature representations) and a mental process (collecting/analyzing visual information and combining results). See MPEP § 2106.04(a)(2)(I) and MPEP § 2106.04(a)(2)(III).

Claim 6 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application? No. There are no additional elements that integrate the judicial exception into a practical application. "a feature adaptation unit configured to allow the feature to obtain one characteristic of a shape, an edge, and a color of an image, and the obtained features are integrated as one." – this limitation recites obtaining shape/edge/color characteristics, a pre-processing of data (deriving or selecting image attributes), and constitutes insignificant extra-solution activity / data gathering for the later abstract processing. See MPEP § 2106.05(g). Additionally, "the obtained features are integrated as one" is merely combining/concatenating/fusing feature vectors, i.e., formatting or organizing information, which is also insignificant extra-/post-solution activity. See MPEP § 2106.05(g).

Claim 6 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception? No. There are no additional elements that amount to significantly more than the judicial exception. "a feature adaptation unit configured to allow the feature to obtain one characteristic of a shape, an edge, and a color of an image, and the obtained features are integrated as one." – extracting common image characteristics (e.g., shape, edges, color) and feature fusion/integration are well-understood, routine, and conventional computer-vision/ML techniques; implementing them on generic computer components is likewise conventional. See MPEP § 2106.05(d) (WURC), § 2106.05(f).

Regarding claim 7

Claim 7 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter? Yes, the claim is to a machine.

Claim 7 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea.

• "a tuning feature extraction unit configured to finely tune a feature extraction method of the feature extraction unit, based on the balanced learning data;" – "fine tuning" the feature extractor using balanced learning data is algorithmic model training/adjustment (parameter optimization of a feature extractor), which is a mathematical concept under MPEP § 2106.04(a)(2).
The specification likewise frames this as a model fine-tuning step (S821) to enhance classification (see Spec. ¶[0082]).

• "and a multiclass classification unit configured to classify a class into a plurality of classes by using the feature extracted from the tuning feature extraction unit." – performing multiclass classification on extracted features is statistical/mathematical data analysis, which is a mathematical concept under MPEP § 2106.04(a)(2). The specification describes this as step S822 producing the predicted class result (see Spec. ¶[0083]).

Claim 7 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application? No, there are no additional elements that integrate the judicial exception into a practical application.

Claim 7 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception? No, there are no additional elements that amount to significantly more than the judicial exception.

Regarding claims 8 and 11-14 (a multiclass classification method)

Claim 8 is a method claim directly analogous in scope to apparatus/machine claim 1, but framed in "method" form (i.e., a multiclass classification method "robust to imbalanced data"). Claim 11 is method-analogous to claim 4; claim 12 is method-analogous to claim 5; claim 13 is method-analogous to claim 6; claim 14 is method-analogous to claim 7. Accordingly, the § 101 rejections set forth for claims 1 and 4-7 apply equally to claims 8 and 11-14, respectively.

Regarding claim 15

Claim 15 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter? Yes, the claim is to a machine.

Claim 15 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes, the claim recites an abstract idea.

"a generator configured to receive noise and a fake class;" – receiving "noise" and a "fake class" is receiving/labeling input data used for generating synthetic samples (i.e., manipulating information inputs for a generation task). It is part of the abstract data-generation framework rather than a specific hardware improvement. See MPEP § 2106.04(a)(2)(III) (collecting/organizing information inputs for a noise-driven generation task).

"a convex weighting unit configured to output a convex weight by using softmax;" – softmax is a mathematical function; using softmax to compute convex weights is a mathematical concept under MPEP § 2106.04(a)(2)(I).

"an artificial data generating unit configured to generate artificial data, based on a convex combination of the convex weight output from the convex weighting unit and a previously generated feature dictionary;" – the recited "convex combination" is an expressly mathematical relationship (a weighted combination under convex constraints), applied here to feature representations to form artificial data. This is mathematical manipulation of information under MPEP § 2106.04(a)(2)(I).

"an adversarial training unit configured to perform adversarial training which allows the artificial data to be similar to a distribution of real data." – this limitation recites a mathematical concept, namely adversarial training to optimize a model such that artificial data approximates the distribution of real data, which is a form of statistical optimization and probability distribution matching. Such concepts fall within the judicial exception for abstract ideas under MPEP § 2106.04(a)(2)(I).
Claim 15 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application? No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:

"A feature generator implemented by a processor executing instructions stored in memory," – this limitation recites generic computer implementation language (processor/memory executing instructions) and merely confines performance of the recited operations to a generic computer. It does not integrate any abstract idea into a practical application; it is a generic "apply it on a computer" implementation clause. See MPEP § 2106.05(f).

"the feature generator used in a multi class classification apparatus robust to imbalanced data;" – this limitation merely limits the abstract operations to the field of multiclass classification with imbalanced data; this is a field-of-use limitation and does not integrate the exception into a practical application. See MPEP § 2106.05(h); § 2106.04(d).

Claim 15 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception? No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:

"A feature generator implemented by a processor executing instructions stored in memory," – a generic processor/memory executing instructions is well-understood, routine, and conventional (WURC) computer functionality (generic execution of software instructions). The application describes conventional computer architecture for executing program instructions (Spec. ¶[0086]). See MPEP § 2106.05(d).

"the feature generator used in a multi class classification apparatus robust to imbalanced data;" – using a feature generator "in" a multiclass classification apparatus is well-understood, routine, and conventional (WURC) in ML systems and is merely an intended-use statement; it does not add a nonconventional technical feature and therefore does not amount to significantly more. See MPEP § 2106.05(d).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-8, and 11-15 are rejected under 35 U.S.C. 103 as being unpatentable over Giovanni Mariani et al. (arXiv:1803.09655v2), henceforth "Mariani", in view of Xiaolong Wang et al. (arXiv:1711.07971v3), henceforth "Wang".

(Amended) Regarding claim 1, Mariani in view of Wang teaches a multiclass classification apparatus robust to imbalanced data, the multiclass classification apparatus comprising:

"a balanced learning data configuration unit configured to receive imbalanced learning data to obtain balanced learning data;" – Mariani teaches this limitation.
Mariani teaches receiving imbalanced data and obtaining balanced data by augmenting/generating minority-class samples to restore balance: "we propose a balancing generative adversarial network (BAGAN) as an augmentation tool to restore the dataset balance by generating new minority-class images." (Mariani, p. 1, § 1 Introduction)

"and a model learning unit configured to receive the balanced learning data from the balanced learning data configuration unit to provide a class result predicted through model learning," – Mariani teaches this limitation. Mariani teaches training a classifier on the augmented (re-balanced) dataset and evaluating classification performance (i.e., producing a predicted class result through model learning): "train a ResNet-18 classifier for the augmented dataset, and 6) measure the classifier accuracy" (Mariani, p. 7, § 5.2 Quality of the Final Classification)

"and wherein the feature generating unit comprises: a generator configured to receive noise and a fake class," – Mariani teaches this limitation. Mariani teaches a generator receiving (i) randomly drawn latent vectors and (ii) class labels, and generating "fake data" as output of G (i.e., noise + class-conditioned generation): "The class-conditional latent vector generator, that is a random process that takes as input a class label c and returns as output a latent vector Z_c drawn at random from N_c." (Mariani, p. 4, § GAN initialization) "The fake data is generated as output of G that takes as inputs latent vectors Z_c …" (Mariani, p. 4, § GAN initialization)

Mariani does not teach these limitations: "a processor; and memory including one or more instructions that, in response to being executed by the processor, cause the processor at least to instantiate:"; "wherein the balanced learning data configuration unit comprises: a feature extraction unit configured to extract a feature of the imbalanced learning data;"; "a feature dictionary unit configured to randomly sample some of feature maps obtained from the feature extraction unit to generate a feature dictionary;"; "and a feature generating unit configured to generate artificial data, based on a convex combination of a convex weight and the feature dictionary;" "and a convex weighting unit configured to output the convex weight by using softmax."

Wang, however, teaches these limitations:

"a processor; and memory including one or more instructions that, in response to being executed by the processor, cause the processor at least to instantiate:" – Wang explicitly teaches this limitation. Wang teaches training on an 8-GPU machine, which inherently uses a processor and memory (e.g., VRAM): "We train on an 8-GPU machine and each GPU has 8 clips in a mini-batch (so in total with a mini-batch size of 64 clips)." (Wang, p. 5, § 4.1. Implementation Details) And Wang explicitly recites fitting the model into memory: "To fit this model into memory, we reduce the mini-batch size to 2 clips per GPU." (Wang, p. 7, § Longer sequences)

"wherein the balanced learning data configuration unit comprises: a feature extraction unit configured to extract a feature of the imbalanced learning data;" – Wang teaches operation on "features" derived from input, i.e., feature maps/signals that are used as the input x to the non-local operation: "x is the input signal (image, sequence, video; often their features) …" (Wang, p. 2, § 3.1 Formulation)
"a feature dictionary unit configured to randomly sample some of feature maps obtained from the feature extraction unit to generate a feature dictionary;" – Wang teaches working over a set of positions j in feature maps, and also teaches using a "subsampling trick" to use a reduced set of features (a subset of feature-map positions), which is an art-recognized way to select a subset from feature maps for downstream weighted aggregation (analogous to forming a "dictionary" of selectable feature elements): "… and j is the index that enumerates all possible positions." (Wang, p. 2, § 3.1 Formulation) "A subsampling trick can be used … x̂ is a subsampled version." (Wang, p. 4, § 3.3 Non-local Block)

"and a feature generating unit configured to generate artificial data, based on a convex combination of a convex weight and the feature dictionary," – Wang teaches generating an output response as a weighted sum over feature representations (which takes the claimed convex-combination form when the weights are normalized), taken over the set of feature elements (the "dictionary" set): "computes the response at a position as a weighted sum of the features at all positions" (Wang, p. 1, § 1. Introduction)

"and a convex weighting unit configured to output the convex weight by using softmax." – Wang expressly teaches softmax producing the weighting used in the non-local/self-attention formulation (i.e., producing normalized weights used for weighted sums): "The softmax operation is performed on each row." (Wang, p. 3, § 3.2 Instantiations) "softmax computation along the dimension j." (Wang, p. 3, § 3.2 Instantiations)

It would have been obvious to a person of ordinary skill in the art to arrive at the subject matter of claim 1 by incorporating Wang's softmax-based weighted feature aggregation mechanism into Mariani's class-conditional generative augmentation pipeline as a predictable way to compute normalized (convex) weights and to generate artificial data as a weighted combination of feature representations selected from feature maps, with a reasonable expectation of success. Such a combination merely uses known techniques according to their established functions to yield expected results, namely normalized feature-weighted data generation within a GAN-based imbalanced-data classification framework. Therefore, claim 1 is unpatentable over Mariani in view of Wang.

(Cancelled) Claims 2-3

(Amended) Regarding claim 4, Mariani in view of Wang teaches the multiclass classification apparatus of claim 1, "wherein the feature generating unit complements a minority class with the artificial data." – Mariani expressly teaches generating new minority-class images to restore balance, i.e., complementing the minority class with artificial data: "restore the dataset balance by generating new minority-class images." (Mariani, p. 1, § 1 Introduction) "restore balance in imbalanced datasets." (Mariani, p. 1, Abstract) Wang teaches that its blocks are modular and combinable with other architectures ("can be combined with any existing architectures", Wang, p. 8, § 7. Conclusion), supporting the motivation to combine Wang's softmax-weighted feature mixing with Mariani's GAN-based augmentation.
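The Wang mechanism relied on in the claim 1 and claim 4 analyses (softmax-normalized weights over a subsampled feature set) can be sketched as follows; a raw dot-product affinity stands in for Wang's embedded similarity functions, and all names and shapes are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(49, 32))        # flattened 7x7 feature map: 49 positions, dim 32 (assumed)
x_hat = x[rng.choice(49, size=16, replace=False)]   # subsampled version of x ("subsampling trick")

scores = x @ x_hat.T                 # pairwise affinity between positions i and subsampled j
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)       # softmax along dimension j: rows are convex weights
y = weights @ x_hat                  # response at each position: weighted sum of features, (49, 32)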
(Original) Regarding claim 5, Mariani in view of Wang teaches the multiclass classification apparatus of claim 1, wherein "the feature generating unit performs adversarial training which allows artificial data to be similar to a distribution of real data" – Mariani teaches this limitation. Mariani's BAGAN expressly performs adversarial training of the generator and discriminator: "Then, all the weights in the generator and discriminator are fine tuned by carrying out a traditional adversarial training ... Overall, the BAGAN training approach is organized ... c) adversarial training." (Mariani, p. 3, § BAGAN) The discriminator is trained to assign real class labels to real images and fake to generated ones; the generator is trained against that signal to match desired class labels, i.e., the classic GAN adversarial setup that drives generated samples toward the real data distribution: "The discriminator D is trained for associating to the images generated by G the label fake, and to real images X_c their class label c. The generator is trained to avoid the fake label and match the desired class labels." (Mariani, p. 3, § 3 Motivating Example) Mariani describes the objective of producing artificial images that are indistinguishable from the real training data to the discriminator, i.e., similar to the distribution of real data: "These images would be indistinguishable by the discriminator from the ones in the training dataset." (Mariani, p. 3, § 3 Motivating Example) Mariani thus teaches adversarial training of a generator/discriminator with the goal that the generated (artificial) images are indistinguishable from real ones (i.e., similar to the real data distribution).

(Amended) Regarding claim 6, Mariani in view of Wang teaches the multiclass classification apparatus of claim 1, wherein "the feature extraction unit comprises: a feature extractor configured to extract the feature;" – Mariani does not teach this limitation. Wang, however, teaches this limitation. Wang teaches use of standard CNN feature-extractor backbones (ResNet/ResNeXt with FPN) that produce feature maps for subsequent processing (i.e., a feature extractor configured to extract features): "The backbone is ResNet-50/101 or ResNeXt-152 [53], both with FPN [32]." (Wang, p. 8, § 6. Extension: Experiments on COCO)

"and a feature adaptation unit configured to allow the feature to obtain one characteristic of a shape, an edge, and a color of an image," – Mariani does not teach this limitation. Wang, however, teaches this limitation. Wang teaches a feature-adaptation (non-local) module that captures long-range dependencies and appearance similarity across distant pixels, which adapts features to encode structural (shape/edge) and appearance (color/texture) characteristics: "non-local operations capture long-range dependencies directly by computing interactions between any two positions, regardless of their positional distance;" (Wang, p. 1, § 1. Introduction) This disclosure teaches long-range dependencies / interactions regardless of distance. And Wang's disclosed appearance similarity supports color/appearance-type characteristics: "It allows distant pixels to contribute to the filtered response at a location based on patch appearance similarity" (Wang, p. 2, § 2. Related Work)

"and the obtained features are integrated as one." – Mariani does not teach this limitation. Wang, however, teaches this limitation. Wang teaches integrating (fusing) the adapted features with the original features via a residual connection, yielding a single combined representation: "z_i = W_z y_i + x_i", where "+ x_i" denotes a residual connection [21]. (Wang, p. 4, § 3.3. Non-local Block)
A POSITA would have been motivated to incorporate Wang's non-local feature-adaptation block into Mariani's imbalanced-data augmentation pipeline because Mariani relies on CNN-extracted features for generation and classification, and Wang expressly teaches a modular feature-adaptation mechanism that improves feature representations by capturing long-range dependencies and appearance similarity and is designed to be combined with existing architectures to improve downstream vision tasks, with a reasonable expectation of success.

(Original) Regarding claim 7, Mariani in view of Wang teaches the multiclass classification apparatus of claim 1, wherein the model learning unit comprises: "and a multiclass classification unit configured to classify a class into a plurality of classes by using the feature extracted from the tuning feature extraction unit." – Mariani teaches this limitation. Mariani's setting is explicitly multiclass: "translates the latent features into the probability that the image is fake or that it belongs to one of the problem classes c_1 - c_n" (Mariani, § 4. BAGAN) And the evaluation trains a ResNet-18 classifier on multi-class datasets (e.g., CIFAR-10, Flowers) and reports accuracy, i.e., a multiclass classification unit using extracted features.

Mariani does not teach: "a tuning feature extraction unit configured to finely tune a feature extraction method of the feature extraction unit, based on the balanced learning data;" Wang, however, teaches this limitation. Wang expressly teaches fine-tuning feature extractors for classification tasks (standard practice): "We fine-tune on 128-frame inputs ... We initialize this model from the corresponding models trained with 32-frame inputs." (Wang, § 5.1. Experiments on Kinetics) And: "We initialize our models pre-trained on Kinetics ... on which we fine-tune our network." (Wang, § 5.2. Experiments on Charades) These passages evidence routine fine-tuning of the feature extraction method in the classifier. A POSITA, having Mariani's balanced training set, would routinely fine-tune a pre-trained CNN feature extractor on that balanced data to improve generalization, as shown by Wang's explicit fine-tuning regimen for classification models. Using a multiclass classifier on the tuned features over the balanced data is a predictable use of prior art elements to yield no more than the expected results (improved classification on a balanced set).
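The routine fine-tuning step credited to Wang above reduces, in its simplest form, to refitting a classification head on features from a pre-trained (here frozen) backbone using the balanced data. A plain-NumPy softmax-regression stand-in, with assumed shapes and synthetic data, purely for illustration:

import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 64))          # features from a frozen pre-trained backbone (assumed)
labels = rng.integers(0, 3, size=300)       # roughly balanced 3-class labels (assumed)
W = np.zeros((64, 3))                       # classification head to be fine-tuned

for _ in range(100):                        # gradient steps on softmax cross-entropy
    logits = feats @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(300), labels] -= 1.0        # dL/dlogits for cross-entropy loss
    W -= 0.01 * feats.T @ p / 300           # update the head on the balanced set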
Regarding claim 8, Mariani in view of Wang teaches a multiclass classification method robust to imbalanced data, the multiclass classification method comprising:

"a balanced learning data configuration operation of receiving imbalanced learning data to obtain balanced learning data by using a balanced learning data configuration unit;" – Mariani teaches this limitation. Mariani teaches obtaining balanced learning data from imbalanced data by augmenting/generating minority-class samples to restore balance: "we propose a balancing generative adversarial network (BAGAN) as an augmentation tool to restore the dataset balance by generating new minority-class images." (Mariani, p. 1, § 1 Introduction)

"and a model learning operation of receiving the balanced learning data from the balanced learning data configuration unit to provide a class result predicted through model learning by using a model learning unit," – Mariani teaches this limitation. Mariani teaches model learning/classification on the augmented (balanced) dataset and measuring classifier accuracy (i.e., predicted class results): "for each class we … train a ResNet-18 classifier for the augmented dataset, and 6) measure the classifier accuracy for the minority class over the test set." (Mariani, p. 7, § 5.2 Quality of the Final Classification)

"wherein the feature generating operation comprises: a generating operation of receiving noise and a fake class by using a generator;" – Mariani teaches this limitation. Mariani teaches a class-conditioned generator receiving random latent vectors (noise) and class labels (class conditioning) and producing fake data: "the class-conditional latent vector generator, that is a random process that takes as input a class label c and returns as output a latent vector Z_c drawn at random …" (Mariani, p. 4, § GAN initialization)

Mariani does not teach these limitations: "executing, by a processor based on instructions stored in memory, operations comprising:"; "wherein the balanced learning data configuration operation comprises: a feature extraction operation of extracting a feature of the imbalanced learning data by a feature extraction unit;"; "a feature dictionary generating operation of randomly sampling some of feature maps obtained from the feature extraction unit to generate a feature dictionary by using a feature dictionary unit;"; "a feature generating operation of generating artificial data by using a feature generating unit, based on a convex combination of a convex weight and the feature dictionary;" "a convex weighting operation configured to output the convex weight using softmax."

Wang, however, teaches these limitations:

"executing, by a processor based on instructions stored in memory, operations comprising:" – Wang teaches this limitation. Wang expressly discloses implementation and training using GPUs: "We train on an 8-GPU machine and each GPU has 8 clips in a mini-batch (so in total with a mini-batch size of 64 clips)."

"wherein the balanced learning data configuration operation comprises: a feature extraction operation of extracting a feature of the imbalanced learning data by a feature extraction unit;" – Wang teaches this limitation. Wang teaches that the input signal is often "features", i.e., extracted feature representations used in vision pipelines: "x is the input signal (image, sequence, video; often their features)" (Wang, p. 2, § 3.1 Formulation)

"a feature dictionary generating operation of randomly sampling some of feature maps obtained from the feature extraction unit to generate a feature dictionary by using a feature dictionary unit;" – Wang teaches this limitation. Wang teaches selecting a subset of feature-map elements via a subsampling mechanism (e.g., pooling), corresponding to sampling feature maps for subsequent weighted aggregation (i.e., a set usable as a feature "dictionary"): "A subsampling trick can be used to further reduce computation … where x̂ is a subsampled version of x (e.g., by pooling)." (Wang, p. 4, § 3.3 Non-local Block)
"a feature generating operation of generating artificial data by using a feature generating unit, based on a convex combination of a convex weight and the feature dictionary," – Wang teaches this limitation. Wang teaches generating a response as a weighted sum of features over a set of positions (a weighted combination over the feature set), which corresponds to the claimed convex-combination-style operation when the weights are softmax-normalized: "a non-local operation computes the response at a position as a weighted sum of the features at all positions in the input feature maps" (Wang, p. 1, § 1. Introduction)

"a convex weighting operation configured to output the convex weight using softmax." – Wang teaches this limitation. Wang expressly teaches computing weights via softmax for use in the weighted-sum operation: "The softmax operation is performed on each row." (Wang, p. 3, § 3.2. Instantiations) "softmax computation along the dimension j" (Wang, p. 3, § 3.2. Instantiations)

It would have been obvious to a person having ordinary skill in the art to combine Mariani and Wang because Mariani provides a method for improving multiclass classification performance on imbalanced datasets by augmentation, explicitly describing BAGAN "as an augmentation tool to restore the dataset balance by generating new minority-class images", and further teaches class-conditional generation using random latent vectors. Wang teaches a known, modular mechanism for feature mixing/aggregation using softmax weights. A POSITA would have been motivated to incorporate Wang's softmax-weighted feature aggregation (and subsampled feature-set selection) into Mariani's imbalanced-data augmentation/classification method as a predictable way to implement normalized (softmax) weighting and weighted combination over a feature set during feature-based generation, with a reasonable expectation of success.

(Cancelled) Claims 9-10

(Amended) Regarding claim 11, Mariani in view of Wang teaches the multiclass classification method of claim 8, wherein "the feature generating operation comprises a step of complementing a minority class with the artificial data." – Mariani teaches this limitation. Mariani teaches complementing (i.e., augmenting) a minority class with generated/artificial data to restore balance: "we propose a balancing generative adversarial network (BAGAN) as an augmentation tool to restore the dataset balance by generating new minority-class images." (Mariani, p. 1, § 1 Introduction) Claim 11 depends from claim 8 and therefore incorporates the limitations and the motivation to combine set forth for claim 8. Mariani expressly teaches complementing a minority class with generated artificial data to restore balance. Accordingly, claim 11 is unpatentable over Mariani in view of Wang for the same reasons as claim 8.

(Amended) Regarding claim 12, Mariani in view of Wang teaches the multiclass classification method of claim 8, wherein "the feature generating operation comprises a step of performing adversarial training which allows artificial data to be similar to a distribution of real data." – Mariani teaches this limitation. Mariani's BAGAN expressly performs adversarial training of the generator and discriminator: "Then, all the weights in the generator and discriminator are fine tuned by carrying out a traditional adversarial training ... Overall, the BAGAN training approach is organized ... c) adversarial training." (Mariani, p. 3, § BAGAN)
The discriminator is trained to assign real class labels to real images and fake to generated ones; the generator is trained against that signal to match desired class labels, i.e., the classic GAN adversarial setup that drives generated samples toward the real data distribution: "The discriminator D is trained for associating to the images generated by G the label fake, and to real images X_c their class label c. The generator is trained to avoid the fake label and match the desired class labels." (Mariani, p. 3, § 3 Motivating Example) Mariani describes the objective of producing artificial images that are indistinguishable from the real training data to the discriminator, i.e., similar to the distribution of real data: "These images would be indistinguishable by the discriminator from the ones in the training dataset." (Mariani, p. 3, § 3 Motivating Example) Mariani thus teaches performing adversarial training in which a generator is trained against a discriminator so that generated ("fake") data becomes similar to real data (i.e., indistinguishable to the discriminator). Claim 12 depends from claim 8 and therefore incorporates the limitations and the motivation to combine set forth for claim 8. Mariani expressly teaches the additional limitation of claim 12. Accordingly, claim 12 is unpatentable over Mariani in view of Wang for the same reasons as claim 8.

Regarding claim 13: Claim 13 is the method analog of apparatus claim 6 (feature extractor + feature adaptation emphasizing shape/edge/color; integration/fusion of features). The § 103 rejection over Mariani + Wang applies equally to claim 13.

Regarding claim 14: Claim 14 is the method analog of apparatus claim 7 (fine-tuning the feature extractor on the balanced data; multiclass classification using the tuned features). The § 103 rejection over Mariani + Wang applies equally to claim 14.

(Amended) Regarding claim 15, Mariani in view of Wang teaches a feature generator implemented by a processor executing instructions stored in memory, the feature generator used in a multiclass classification apparatus robust to imbalanced data, and comprising:

"a generator configured to receive noise and a fake class;" – Mariani teaches this limitation. Mariani teaches a generator that takes a class label and produces a latent vector drawn at random, and teaches that fake data is generated as output of the generator from latent vectors. Mariani's BAGAN is explicitly class-conditional and feeds the generator with latent (random) vectors paired with class labels: "The fake data is generated as output of G that takes as inputs latent vectors Z_c extracted from the class-conditional latent vector generator. In turn, the class-conditional latent vector generator takes as input uniformly distributed class labels c, i.e. the fake images are uniformly distributed between the problem-specific classes." (Mariani, p. 4, § 4. BAGAN) "a batch of conditional latent vectors Z_c is drawn at random by applying a uniform distribution on the labels c. These vectors are processed by the generator ..." (Mariani, p. 4, § 4. BAGAN) "... returns as output a latent vector Z_c drawn at random from N_c ..." (Mariani, p. 4, § 4. BAGAN) These excerpts teach a generator receiving random latent input (noise) and a target class label c to synthesize class-conditioned ("fake") examples.
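A minimal sketch of the class-conditional latent sampling Mariani describes (per-class latent distributions N_c; noise plus a class label c in, latent vector Z_c out); the unit-variance Gaussians and dimensions are assumptions for exposition, not BAGAN's fitted parameters:

import numpy as np

rng = np.random.default_rng(0)
latent_dim, n_classes = 16, 3
# Assume per-class latent means were estimated beforehand (e.g., by an encoder).
class_means = rng.normal(size=(n_classes, latent_dim))

def sample_latent(c):
    # Draw Z_c at random from N_c = N(mean_c, I): random noise plus class conditioning.
    return class_means[c] + rng.normal(size=latent_dim)

# Uniformly distributed class labels c, as in Mariani's "fake" image generation.
labels = rng.integers(0, n_classes, size=8)
z_batch = np.stack([sample_latent(int(c)) for c in labels])
print(z_batch.shape)   # (8, 16): latent vectors ready to feed the generator G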
"an adversarial training unit configured to perform adversarial training which allows the artificial data to be similar to a distribution of real data." – Mariani teaches this limitation. Mariani teaches adversarial training of a generator against a discriminator, with the objective that generated images become indistinguishable to the discriminator: "Then, all the weights in the generator and discriminator are fine tuned by carrying out a traditional adversarial training" (Mariani, § 4. BAGAN), aiming for images that are indistinguishable: "These images would be indistinguishable by the discriminator from the ones in the training dataset." (Mariani, p. 3, § 3 Motivating Example)

Mariani does not teach these limitations: "a convex weighting unit configured to output a convex weight by using softmax;" "an artificial data generating unit configured to generate artificial data, based on a convex combination of the convex weight output from the convex weighting unit and a previously generated feature dictionary;"

Wang, however, teaches these limitations:

"a convex weighting unit configured to output a convex weight by using softmax;" – Wang teaches use of softmax to compute the weights used in the operation: "The softmax operation is performed on each row." (Wang, p. 3, § 3.2 Instantiations) "softmax computation along the dimension j." (Wang, p. 3, § 3.2 Instantiations)

"an artificial data generating unit configured to generate artificial data, based on a convex combination of the convex weight output from the convex weighting unit and a previously generated feature dictionary;" – Wang teaches this limitation. Wang teaches forming an output response as a weighted sum over feature representations (i.e., a weighted combination over a set of features). Wang also teaches selecting a reduced set of feature-map elements using a subsampling trick (corresponding to using a subset/set of features for the weighted operation, i.e., a "dictionary" set): "a non-local operation computes the response at a position as a weighted sum of the features at all positions in the input feature maps" (Wang, p. 1, § 1. Introduction) "A subsampling trick can be used to further reduce computation … where x̂ is a subsampled version of x (e.g., by pooling)." (Wang, p. 4, § 3.3 Non-local Block)

A POSITA would have been motivated to combine Mariani and Wang because Mariani teaches generating artificial (minority-class) data to restore balance in imbalanced datasets, and teaches class-conditional generation using randomly drawn latent vectors and producing "fake data" as generator output, while Wang teaches a known softmax-weighting mechanism for weighted feature mixing and teaches computing responses as "a weighted sum of the features", with an express teaching that its block "can be combined with any existing architectures". Thus, a POSITA would have had reason to incorporate Wang's softmax-normalized weighted feature aggregation (and its selectable/subsampled feature-set operation) into Mariani's generation/augmentation framework as a predictable way to implement normalized weighting and weighted combinations of feature representations during artificial data generation, with a reasonable expectation of success.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Paul Coleman, whose telephone number is (571) 272-4687. The examiner can normally be reached Mon-Fri. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, David Yi, can be reached at (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PAUL COLEMAN/ Examiner, Art Unit 2126
/LUIS A SITIRICHE/ Primary Examiner, Art Unit 2126

Prosecution Timeline

Nov 29, 2022
Application Filed
Sep 24, 2025
Non-Final Rejection — §101, §103
Nov 12, 2025
Response Filed
Feb 06, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597489
METHOD, DEVICE, AND COMPUTER PROGRAM FOR PREDICTING INTERACTION BETWEEN COMPOUND AND PROTEIN
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12574861
METHOD AND SYSTEM FOR ACCELERATING DISTRIBUTED PRINCIPAL COMPONENTS WITH NOISY CHANNELS
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12443678
STEPWISE UNCERTAINTY-AWARE OFFLINE REINFORCEMENT LEARNING UNDER CONSTRAINTS
Granted Oct 14, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 3 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 70%
With Interview: 99% (+42.9%)
Median Time to Grant: 3y 6m
PTA Risk: Moderate
Based on 10 resolved cases by this examiner. Grant probability derived from career allow rate.
