DETAILED ACTION
Claims 1 and 13-17 are amended. Claims 1-10 and 13-17 are pending.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 09/15/2025 has been entered.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
The Applicant’s arguments regarding the rejection of the above claims have been fully considered.
In reference to Applicant’s arguments about:
112(f) and 112(b).
Examiner’s response:
The claim interpretation under 35 U.S.C. 112(f) and the claim rejection under 35 U.S.C. 112(b) with respect to the term “communicator” are withdrawn in view of the amendments and arguments.
In reference to Applicant’s arguments about:
35 USC 103 Rejections.
Examiner’s response:
Applicant’s main arguments, as set forth at pages 10-11 of the response dated 08/20/2025, are directed to [Feature A] and [Feature B], which are mapped to the amended limitations:
“the feature value is a pixel difference value, a pixel mean value, or a pixel variance value of the decoded image, and
the adjustment unit adjusts the number of pieces of training data included in the training set by obtaining a decoded image, extracting the pixel difference value, the pixel mean value, or the pixel variance value of the decoded image from the decoded image, and adding the decoded image to the training set based on the extracted pixel difference value, pixel mean value, or pixel variance value of the decoded image”.
These arguments have been fully considered, but are moot in view of new grounds of rejection.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 3, 7, 10, and 13-17 are rejected under 35 U.S.C. 103 as being unpatentable over Song (US Pub. No. 2019/0370662, hereinafter Song) in view of Yokoi et al. (US Pub. No. 2017/0132515, hereinafter Yokoi), in view of Liu et al. (NPL: “Data-Driven Soft Decoding of Compressed Images in Dual Transform-Pixel Domain”, hereinafter Liu), and further in view of Huang (CN 110706713A, hereinafter Huang).
Referring to Claim 1, Song teaches a learning apparatus comprising:
(i) at least one memory configured to store computer-executable instructions and at least one processor configured to execute the computer-executable instructions stored in the at least one memory (see Song at [0016]: “one or more processors; and a memory, which stores commands executable by the one or more processors”), (ii) at least one circuit, or both (i) and (ii) that implement:
an adjustment unit configured to, for a training set including a plurality of pieces of training data, adjust the number of pieces of training data included in the training set such that feature values of the plurality of pieces of training data have a predetermined distribution (see Song at [0078]: “However, according to the exemplary embodiment of the present disclosure, when the neural network is adjusted in advance so that the distribution of the feature values output in the neural network uses only similar features (significant features) to those of the predetermined probability distribution, the neural network is formed of the significant features (that is, the significant nodes) prior to the training, thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency”. Therefore, since the neural network is adjusted to use only features similar to those of the predetermined distribution, and this adjustment happens before the actual training, it corresponds to the claimed adjustment of the number of pieces of training data. See also Song at [0065]: “The computing device 100 may calculate an error between the first feature value distribution 310 and the predetermined Weibull distribution 200 (the solid line), and when the error is equal to or less than a predetermined value, the computing device 100 may activate the node outputting the first feature value (in this case, node n1 in the example of FIG. 4)”; therefore, the computing device is analogous to the claimed adjustment unit); and
a training unit configured to perform machine learning using the training set to generate a learned model (see Song at [0078]: “However, according to the exemplary embodiment of the present disclosure, when the neural network is adjusted in advance so that the distribution of the feature values output in the neural network uses only similar features (significant features) to those of the predetermined probability distribution, the neural network is formed of the significant features (that is, the significant nodes) prior to the training, thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency”. Therefore, since the neural network is adjusted before the training, it is interpreted that, immediately after this adjustment, the training set is used to generate a model incorporating these adjustments, since the adjustment decreases the number of repetitions of the training. Furthermore, it can be seen at [0037]: “The computing device 100 according to the exemplary embodiment of the present disclosure may include a processor 110, a graphic processing unit (GPU) 120, and a memory 130”. Therefore, this computing device, along with its processor and GPU, corresponds to the claimed training unit).
Even though Song implicitly teaches the adjustment of the pieces of training data included in the training set, Yokoi explicitly teaches it, as can be seen at [0363]-[0364]: “Image recognition systems learned using original images and deformed images as pieces of training data are capable of recognizing objects contained in various input images. Various deformations of each image obtain a variety of pieces of training data. This approach for artificially increasing pieces of training data will be referred to as data enhancement. A first approach in the data enhancement increases the number of pieces of training data by n times before learning, and stores the increased number of pieces of training data in a storage”. Therefore, increasing the pieces of training data in order to provide data enhancement is interpreted as the claimed “adjust the number of pieces of training data included in the training set”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Song with the above teachings of Yokoi by adjusting the number of pieces of training data included in the training set, as taught by Yokoi, such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song. The modification would have been obvious because one of ordinary skill in the art would be motivated to enhance the data for training a model before the learning of the model, thereby improving training efficiency (as can be seen from Song at [0078]: “thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency” and Yokoi at [0363]: “This approach for artificially increasing pieces of training data will be referred to as data enhancement”).
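Examiner’s note (illustrative only): the following minimal Python sketch is the examiner’s own and is not drawn from Song, Yokoi, or any other cited reference; all names (adjust_training_set, feature_fn, bin_edges, target_probs) are hypothetical. It merely shows one way the number of pieces of training data could be adjusted so that their feature values approximate a predetermined distribution, as recited in claim 1:

    import random
    from collections import defaultdict

    def adjust_training_set(pieces, feature_fn, bin_edges, target_probs, total):
        # Bucket each piece of training data by its scalar feature value;
        # bin i holds values in (bin_edges[i-1], bin_edges[i]].
        buckets = defaultdict(list)
        for piece in pieces:
            value = feature_fn(piece)
            index = sum(value > edge for edge in bin_edges)
            buckets[index].append(piece)
        adjusted = []
        for index, prob in enumerate(target_probs):
            wanted = round(prob * total)  # pieces this bin should contribute
            members = buckets.get(index)
            if members and wanted > 0:
                # Duplicate (oversample) or drop (undersample) pieces so the
                # histogram of feature values approximates the target.
                adjusted.extend(random.choices(members, k=wanted))
        return adjusted

Under these assumptions, only the composition of the training set changes (pieces are duplicated or dropped), not the pieces themselves.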
However, the combination of Song and Yokoi fails to teach:
wherein the training data is a decoded image for training, which is obtained by encoding and decoding an uncompressed image,
the learned model outputs, when the decoded image is input to the learned model, a restored image obtained by restoring the decoded image,
the feature value is a pixel difference value, a pixel mean value, or a pixel variance value of the decoded image, and
the adjustment unit adjusts the number of pieces of training data included in the training set by obtaining a decoded image, extracting the pixel difference value, the pixel mean value, or the pixel variance value of the decoded image from the decoded image, and adding the decoded image to the training set based on the extracted pixel difference value, pixel mean value, or pixel variance value of the decoded image.
Liu teaches, in an analogous system, wherein the training data is a decoded image for training, which is obtained by encoding and decoding an uncompressed image (see Liu at p. 1650 section B. Our Contribution: “Fig. 1 depicts the architecture of the proposed image restoration framework, in which the degraded input is the decompressed (hard-decoded) image and the restored output is called soft-decoded image”. Further at p. 1653 section III: “In this work, we address this challenging issue by using a machine learning-based technique that incorporates high-frequency priors of uncompressed images into the restoration framework”),
the learned model outputs, when the decoded image is input to the learned model, a restored image obtained by restoring the decoded image (see Liu at p. 1650 section B. Our Contribution: “Fig. 1 depicts the architecture of the proposed image restoration framework, in which the degraded input is the decompressed (hard-decoded) image and the restored output is called soft-decoded image”. Further at p. 1653 section III: “In this work, we address this challenging issue by using a machine learning-based technique that incorporates high-frequency priors of uncompressed images into the restoration framework”);
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song and Yokoi with the above teachings of Liu by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song and Yokoi, wherein the training data is a decoded image for training that is ultimately restored, as taught by Liu. The modification would have been obvious because one of ordinary skill in the art would be motivated to improve the quality of restored images (as can be seen from Liu at the Abstract: “Experimental results are encouraging and show the promise of the new approach in significantly improving the quality of DCT-coded image”).
Huang teaches, in an analogous system, the feature value is a pixel difference value, a pixel mean value, or a pixel variance value of the decoded image (see Huang at [0076]: “Then, the portrait is generated through the above-trained decoding network. The loss function mainly calculates the difference in pixel color between the final generated image and the real image, and then adjusts the direction and magnitude of parameter changes based on the difference. Here, the basic network feedback algorithm (backpropagation) is used.” Therefore, the difference in pixel color is interpreted as the feature value being a pixel difference value); and
the adjustment unit adjusts the number of pieces of training data included in the training set by obtaining a decoded image, extracting the pixel difference value, the pixel mean value, or the pixel variance value of the decoded image from the decoded image, and adding the decoded image to the training set based on the extracted pixel difference value, pixel mean value, or pixel variance value of the decoded image (see Huang at [0076]: “Then, the portrait is generated through the above-trained decoding network. The loss function mainly calculates the difference in pixel color between the final generated image and the real image, and then adjusts the direction and magnitude of parameter changes based on the difference. Here, the basic network feedback algorithm (backpropagation) is used.” Therefore, the adjustment of using backpropagation based on the calculated difference in pixel color is interpreted as adjusting the number of pieces of training data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi and Liu with the above teachings of Huang by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi and Liu, wherein the adjustment is based on the pixel difference value of the decoded image, as taught by Huang. The modification would have been obvious because one of ordinary skill in the art would be motivated to improve the accuracy of the network (as can be seen from Huang at [0067]: “In each round of training, the parameters of the discriminator network are continuously adjusted to improve the discrimination accuracy” and at [0076]: “As the parameters are adjusted, the encoding network will encode the voiceprint feature data into the corresponding portrait feature vector more and more accurately”).
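Examiner’s note (illustrative only): as a hypothetical sketch, assuming a decoded image is available as a NumPy array, the claimed extraction of a pixel difference, pixel mean, or pixel variance value and the value-gated addition to the training set could take the following form (the names and the “>= threshold” gate are the examiner’s assumptions, not the method of Liu or Huang):

    import numpy as np

    def extract_and_maybe_add(decoded_image, training_set, threshold, stat="variance"):
        # Extract the pixel mean, pixel variance, or a pixel difference value
        # (here, mean absolute difference of horizontally adjacent pixels)
        # from the decoded image.
        img = np.asarray(decoded_image, dtype=np.float64)
        if stat == "mean":
            value = img.mean()
        elif stat == "variance":
            value = img.var()
        else:
            value = np.abs(np.diff(img, axis=1)).mean()
        # Add the decoded image to the training set based on the extracted value.
        if value >= threshold:
            training_set.append(decoded_image)
        return value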
Referring to Claim 3, the combination of Song, Yokoi, Liu and Huang teaches the learning apparatus according to claim 2, wherein the adjustment unit adds, after the machine learning is performed using the plurality of pieces of training data included in the training set, training data with feature values equal to or greater than a predetermined value (see Yokoi at [0117]: “The proper mini-batch size, which depends on a problem to be solved by the CNN, is set to be within the range from 1 to approximately 1000. Experience shows that the mini-batch size has a proper value, i.e. a preferred value. If the mini-batch size were set to a value largely exceeding the proper value, the convergence rate and the generalization capability could be lowered”. Therefore, Yokoi teaches a proper value/preferred value not to be exceeded, which is equivalent to adding training data up to that preferred value (interpreted as the predetermined value)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Song with the above teachings of Yokoi by adjusting the number of pieces of training data included in the training set, as taught by Yokoi, such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song. The modification would have been obvious because one of ordinary skill in the art would be motivated to enhance the data for training a model before the learning of the model, thereby improving training efficiency (as can be seen from Song at [0078]: “thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency” and Yokoi at [0363]: “This approach for artificially increasing pieces of training data will be referred to as data enhancement”).
Referring to Claim 7, the combination of Song, Yokoi, Liu and Huang teaches the learning apparatus according to claim 1, wherein the adjustment unit adjusts, for each of the plurality of training sets, the number of pieces of training data included in the training set such that feature values of the plurality of pieces of training data have a predetermined distribution (see Song at [0078]: “However, according to the exemplary embodiment of the present disclosure, when the neural network is adjusted in advance so that the distribution of the feature values output in the neural network uses only similar features (significant features) to those of the predetermined probability distribution, the neural network is formed of the significant features (that is, the significant nodes) prior to the training, thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency”); and
each of the plurality of training units performs machine learning using a corresponding training set (see Yokoi at [0117]: “The mini-batch size represents the number of pieces of training data used for one updating of the weights W, i.e. calculation of the differential value dW. The proper mini-batch size, which depends on a problem to be solved by the CNN, is set to be within the range from 1 to approximately 1000. Experience shows that the mini-batch size has a proper value, i.e. a preferred value. If the mini-batch size were set to a value largely exceeding the proper value, the convergence rate and the generalization capability could be lowered”. Therefore, this mini-batch size corresponds to the claimed ‘training set’).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Song with the above teachings of Yokoi by adjusting the number of pieces of training data included in the training set, as taught by Yokoi, such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song. The modification would have been obvious because one of ordinary skill in the art would be motivated to enhance the data for training a model before the learning of the model, thereby improving training efficiency (as can be seen from Song at [0078]: “thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency” and Yokoi at [0363]: “This approach for artificially increasing pieces of training data will be referred to as data enhancement”).
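Examiner’s note (illustrative only): a minimal sketch of claim 7’s plurality of training sets and training units, assuming caller-supplied make_model and fit callables (hypothetical names, not drawn from any cited reference):

    def train_per_set(training_sets, make_model, fit):
        # One training unit per training set: each learned model is generated
        # by machine learning over its corresponding training set.
        return [fit(make_model(), training_set) for training_set in training_sets]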
Referring to Claim 10, the combination of Song, Yokoi, Liu and Huang teaches the learning apparatus according to claim 7, wherein the training set includes a plurality of pieces of training data having the feature values smaller than a predetermined threshold value or a plurality of pieces of training data having the feature values equal to or greater than the predetermined threshold value (see Song at [0057]: “In the example of FIG. 3B, the corresponding feature shows a tendency in that the number of elements of the input data having a value x.sub.1 is largest, and the number of elements of the input data having the corresponding value is decreased from a value x.sub.2 to a value x.sub.1”. Therefore, decreasing the number of elements to the value x.sub.1 is interpreted as having the feature value equal to a threshold).
Referring to Claim 13, Song teaches an electronic apparatus comprising:
(i) at least one memory configured to store computer-executable instructions and at least one processor configured to execute the computer-executable instructions stored in the at least one memory (see Song at [0016]: “one or more processors; and a memory, which stores commands executable by the one or more processors”), (ii) at least one circuit, or both (i) and (ii) that implement an inference unit configured to carry out an inference process using a learned model (see Song at [0038]: “The processor 110 may read a computer program stored in the memory 130 and perform a method of training an artificial neural network (ANN) and a method of classifying data by using the trained neural network according to the exemplary embodiment of the present disclosure”. Therefore, the classification corresponds to the claimed ‘inference’), wherein
the learned model is generated by machine learning using a training set including a plurality of pieces of training data, in which the number of pieces of training data has been adjusted such that feature values of the plurality of pieces of training data have a predetermined distribution (see Song at [0078]: “However, according to the exemplary embodiment of the present disclosure, when the neural network is adjusted in advance so that the distribution of the feature values output in the neural network uses only similar features (significant features) to those of the predetermined probability distribution, the neural network is formed of the significant features (that is, the significant nodes) prior to the training, thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency”. Therefore, since the neural network is adjusted to only use similar features to those of the predetermined distribution, and this happens before the actual training, this corresponds to the claimed adjustment of the number of pieces of training data).
Even though Song implicitly teaches that the number of pieces of training data has been adjusted, Yokoi explicitly teaches it, as can be seen at [0363]-[0364]: “Image recognition systems learned using original images and deformed images as pieces of training data are capable of recognizing objects contained in various input images. Various deformations of each image obtain a variety of pieces of training data. This approach for artificially increasing pieces of training data will be referred to as data enhancement. A first approach in the data enhancement increases the number of pieces of training data by n times before learning, and stores the increased number of pieces of training data in a storage”. Therefore, increasing the pieces of training data in order to provide data enhancement is interpreted as the claimed “adjust the number of pieces of training data included in the training set”.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Song with the above teachings of Yokoi by adjusting the number of pieces of training data included in the training set, as taught by Yokoi, such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song. The modification would have been obvious because one of ordinary skill in the art would be motivated to enhance the data for training a model before the learning of the model, thereby improving training efficiency (as can be seen from Song at [0078]: “thereby decreasing the number of times of the repetition of the training and the amount of calculation and improving training efficiency” and Yokoi at [0363]: “This approach for artificially increasing pieces of training data will be referred to as data enhancement”).
However, the combination of Song and Yokoi fails to teach:
the training data is a decoded image for training, which is obtained by encoding and decoding an uncompressed image,
the learned model outputs, when the decoded image is input to the learned model, a restored image obtained by restoring the decoded image,
the feature value is a pixel difference value, a pixel mean value, or a pixel variance value of the decoded image, and
the number of pieces of training data included in the training set are adjusted by obtaining a decoded image, extracting the pixel difference value, the pixel mean value, or the pixel variance value of the decoded image from the decoded image, and adding the decoded image to the training set based on the extracted pixel difference value, pixel mean value, or pixel variance value of the decoded image.
Liu teaches, in an analogous system, the training data is a decoded image for training, which is obtained by encoding and decoding an uncompressed image (see Liu at p. 1650 section B. Our Contribution: “Fig. 1 depicts the architecture of the proposed image restoration framework, in which the degraded input is the decompressed (hard-decoded) image and the restored output is called soft-decoded image”. Further at p. 1653 section III: “In this work, we address this challenging issue by using a machine learning-based technique that incorporates high-frequency priors of uncompressed images into the restoration framework”),
the learned model outputs, when the decoded image is input to the learned model, a restored image obtained by restoring the decoded image (see Liu at p. 1650 section B. Our Contribution: “Fig. 1 depicts the architecture of the proposed image restoration framework, in which the degraded input is the decompressed (hard-decoded) image and the restored output is called soft-decoded image”. Further at p. 1653 section III: “In this work, we address this challenging issue by using a machine learning-based technique that incorporates high-frequency priors of uncompressed images into the restoration framework”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song and Yokoi with the above teachings of Liu by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song and Yokoi, wherein the training data is a decoded image for training that is ultimately restored, as taught by Liu. The modification would have been obvious because one of ordinary skill in the art would be motivated to improve the quality of restored images (as can be seen from Liu at the Abstract: “Experimental results are encouraging and show the promise of the new approach in significantly improving the quality of DCT-coded image”).
Huang teaches, in an analogous system,
the feature value is a pixel difference value, a pixel mean value, or a pixel variance value of the decoded image (see Huang at [0076]: “Then, the portrait is generated through the above-trained decoding network. The loss function mainly calculates the difference in pixel color between the final generated image and the real image, and then adjusts the direction and magnitude of parameter changes based on the difference. Here, the basic network feedback algorithm (backpropagation) is used.” Therefore, the difference in pixel color is interpreted as the feature value being a pixel difference value); and
the number of pieces of training data included in the training set are adjusted by obtaining a decoded image, extracting the pixel difference value, the pixel mean value, or the pixel variance value of the decoded image from the decoded image, and adding the decoded image to the training set based on the extracted pixel difference value, pixel mean value, or pixel variance value of the decoded image (see Huang at [0076]: “Then, the portrait is generated through the above-trained decoding network. The loss function mainly calculates the difference in pixel color between the final generated image and the real image, and then adjusts the direction and magnitude of parameter changes based on the difference. Here, the basic network feedback algorithm (backpropagation) is used.” Therefore, the adjustment of using backpropagation based on the calculated difference in pixel color is interpreted as adjusting the number of pieces of training data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi and Liu with the above teachings of Huang by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi and Liu, wherein the adjustment is based on the pixel difference value of the decoded image, as taught by Huang. The modification would have been obvious because one of ordinary skill in the art would be motivated to improve the accuracy of the network (as can be seen from Huang at [0067]: “In each round of training, the parameters of the discriminator network are continuously adjusted to improve the discrimination accuracy” and at [0076]: “As the parameters are adjusted, the encoding network will encode the voiceprint feature data into the corresponding portrait feature vector more and more accurately”).
Referring to independent Claims 14 and 16, they are rejected on the same basis as independent claim 1, since they are analogous claims.
Referring to independent Claims 15 and 17, they are rejected on the same basis as independent claim 13, since they are analogous claims.
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Song in view of Yokoi, in view of Liu, in view of Huang, and further in view of Duan et al. (US Pub. No. 2020/0401931, hereinafter Duan).
Referring to Claim 2, the combination of Song, Yokoi, Liu and Huang teaches the learning apparatus according to claim 1, however, fails to teach wherein the adjustment unit adjusts the number of pieces of training data such that the feature values have a first distribution in which a ratio of the training data increases as the feature value of the training data increases.
Duan teaches, in an analogous system, wherein the adjustment unit adjusts the number of pieces of training data such that the feature values have a first distribution in which a ratio of the training data increases as the feature value of the training data increases (see Duan at [0003]: “In order to adequately train the machine learning model using high-dimensional feature space data, the sample size of the training data needs to be sufficiently large in order to avoid data from becoming sparse”. Therefore, high-dimensional feature space data corresponds to increasing feature values, and a large training data size corresponds to increasing training data).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi, Liu and Huang with the above teachings of Duan by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi, Liu and Huang, and increasing the training data size with increasing feature values, as taught by Duan. The modification would have been obvious because one of ordinary skill in the art would be motivated to avoid the data becoming sparse (as can be seen from Duan at [0003]: “In order to adequately train the machine learning model using high-dimensional feature space data, the sample size of the training data needs to be sufficiently large in order to avoid data from becoming sparse”).
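Examiner’s note (illustrative only): a hypothetical sketch of the claimed first distribution, in which the ratio of training data increases as the feature value increases, expressed here as feature-value-proportional sampling (the examiner’s own construction, not Duan’s method; assumes at least one positive feature value):

    import random

    def sample_first_distribution(pieces, feature_fn, k):
        # Select k pieces with probability proportional to their feature
        # value, so the ratio of training data in the resulting set
        # increases as the feature value increases.
        weights = [max(feature_fn(piece), 0.0) for piece in pieces]
        return random.choices(pieces, weights=weights, k=k)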
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Song in view of Yokoi, in view of Liu, in view of Huang, in view of Duan, and further in view of Lee et al. (US Pub. No. 2017/0199900, hereinafter Lee).
Referring to Claim 4, the combination of Song, Yokoi, Liu, Huang and Duan teaches the learning apparatus according to claim 2, however, fails to teach wherein the adjustment unit makes an adjustment that makes the distribution of the feature values of the plurality of pieces of training data average, after the machine learning is performed using all of the training data included in the training set.
Lee teaches, in an analogous system, wherein the adjustment unit makes an adjustment that makes the distribution of the feature values of the plurality of pieces of training data average, after the machine learning is performed using all of the training data included in the training set (see Lee at [0089]: “when K-means clustering, which is a representative clustering method, is used, 128 core representative values may be obtained, and a feature index may be formed by utilizing a distribution (an average and a variance) of the representative values and the feature values”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi, Liu, Huang and Duan with the above teachings of Lee by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi, Liu, Huang and Duan, and making the distribution of the feature values average, as taught by Lee. The modification would have been obvious because one of ordinary skill in the art would be motivated to enhance image quality while reducing image capacity by including only important information (as can be seen from Lee at [0046]: “Here, the capacity and quality of the street information DB 140 are very important. Accordingly, even though there are the same number of images, the capacity may be reduced by obtaining an image including only a small number of obstacle elements such as trees, roads, etc., and the quality may be enhanced by including only important information”).
Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Song in view of Yokoi, in view of Liu, in view of Huang, and further in view of Kimmel et al. (US Pub. No. 2021/0345945, hereinafter Kimmel).
Referring to Claim 5, the combination of Song, Yokoi, Liu and Huang teaches the learning apparatus according to claim 1, however, fails to teach wherein the adjustment unit adjusts the number of pieces of training data such that the feature values have a second distribution in which a ratio of the training data increases as the feature value of the training data decreases.
Kimmel teaches, in an analogous system, wherein the adjustment unit adjusts the number of pieces of training data such that the feature values have a second distribution in which a ratio of the training data increases as the feature value of the training data decreases (see Kimmel at [0038]: “by collecting a large representative training set and using dimensionality reduction, this method has been used to develop robust statistical models of scoliotic deformity with only five or ten parameters”. Therefore, reducing dimensionality corresponds to decreasing feature value, from a large training set (increasing training data)).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi, Liu and Huang with the above teachings of Kimmel by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi, Liu and Huang, and decreasing the feature values over a large training data set, as taught by Kimmel. The modification would have been obvious because one of ordinary skill in the art would be motivated to develop a robust model (as can be seen from Kimmel at [0038]: “by collecting a large representative training set and using dimensionality reduction, this method has been used to develop robust statistical models of scoliotic deformity with only five or ten parameters”).
Referring to Claim 6, the combination of Song, Yokoi, Liu, Huang and Kimmel teaches the learning apparatus according to claim 5, wherein the adjustment unit adds, after the machine learning is performed using all of the training data included in the training set, training data having feature values smaller than a predetermined value (see Kimmel at [0038]: “by collecting a large representative training set and using dimensionality reduction, this method has been used to develop robust statistical models of scoliotic deformity with only five or ten parameters”. Therefore, reducing dimensionality with only five or ten parameters corresponds to the claimed “training data having feature values smaller than a predetermined value”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi, Liu and Huang with the above teachings of Kimmel by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi, Liu and Huang, and decreasing the feature values over a large training data set, as taught by Kimmel. The modification would have been obvious because one of ordinary skill in the art would be motivated to develop a robust model (as can be seen from Kimmel at [0038]: “by collecting a large representative training set and using dimensionality reduction, this method has been used to develop robust statistical models of scoliotic deformity with only five or ten parameters”).
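Examiner’s note (illustrative only): a hypothetical sketch of claim 6’s step of adding, after the machine learning is performed, training data having feature values smaller than a predetermined value (the names and the candidate pool are the examiner’s assumptions, not Kimmel’s method):

    import random

    def add_low_feature_pieces(pool, training_set, feature_fn, cutoff, k):
        # After a training pass, add up to k candidate pieces whose feature
        # values are smaller than the predetermined value, biasing the set
        # toward small feature values.
        low = [piece for piece in pool if feature_fn(piece) < cutoff]
        training_set.extend(random.sample(low, min(k, len(low))))
        return training_set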
Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Song in view of Yokoi, in view of Liu, in view of Huang, and further in view of Tsubouchi (US Pub. No. 2016/0086094, hereinafter Tsubouchi).
Referring to Claim 8, the combination of Song, Yokoi, Liu and Huang teaches the learning apparatus according to claim 7, however, fails to teach wherein an inference process is carried out by switching the plurality of learned models that are obtained through machine learning performed by the respective training units.
Tsubouchi teaches, in an analogous system, wherein an inference process is carried out by switching the plurality of learned models that are obtained through machine learning performed by the respective training units (see Tsubouchi at [0083]: “The model selection unit 24 selects a prediction model suited to a user based on related information where features value is associated with a prediction model”. Therefore, selecting the model based on the feature value information is interpreted as ‘switching the plurality of models’).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi, Liu and Huang with the above teachings of Tsubouchi by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi, Liu and Huang, and switching the model used for inference, as taught by Tsubouchi. The modification would have been obvious because one of ordinary skill in the art would be motivated to provide a model suited to the feature values (as can be seen from Tsubouchi at [0083]: “selects a prediction model suited to a user based on related information where features value is associated with a prediction model”).
Referring to Claim 9, the combination of Song, Yokoi, Liu, Huang and Tsubouchi teaches the learning apparatus according to claim 8, wherein the inference process is carried out by switching to any of the plurality of learned models according to a feature value of data that is subject to inference (see Tsubouchi at [0083]: “The model selection unit 24 selects a prediction model suited to a user based on related information where features value is associated with a prediction model”. Therefore, selecting the model based on the feature value information is interpreted as ‘switching the plurality of models’).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Song, Yokoi, Liu and Huang with the above teachings of Tsubouchi by adjusting the number of pieces of training data included in the training set such that the feature values of the plurality of pieces of training data have a predetermined distribution, as taught by Song, Yokoi, Liu and Huang, and switching the model used for inference, as taught by Tsubouchi. The modification would have been obvious because one of ordinary skill in the art would be motivated to provide a model suited to the feature values (as can be seen from Tsubouchi at [0083]: “selects a prediction model suited to a user based on related information where features value is associated with a prediction model”).
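Examiner’s note (illustrative only): a hypothetical sketch of claims 8-9, switching among a plurality of learned models according to the feature value of the data subject to inference; the .predict interface and all names are the examiner’s assumptions, not Tsubouchi’s implementation (assumes one more model than thresholds, both sorted in ascending order):

    def infer_with_switching(models, thresholds, data, feature_fn):
        # Switch to one of the learned models according to the feature value
        # of the data subject to inference: models[i] handles feature values
        # up to thresholds[i]; the last model handles everything above.
        value = feature_fn(data)
        for model, upper in zip(models, thresholds):
            if value <= upper:
                return model.predict(data)
        return models[-1].predict(data)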
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LUIS A SITIRICHE whose telephone number is (571)270-1316. The examiner can normally be reached M-F 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi can be reached on (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LUIS A SITIRICHE/Primary Examiner, Art Unit 2126