Prosecution Insights
Last updated: April 19, 2026
Application No. 18/169,281

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND IMAGE PROCESSING COMPUTER PROGRAM PRODUCT

Status: Final Rejection (§102, §103, §112)
Filed: Feb 15, 2023
Examiner: YAO, JULIA ZHI-YI
Art Unit: 2666
Tech Center: 2600 — Communications
Assignee: Kabushiki Kaisha Toshiba
OA Round: 2 (Final)

Grant Probability: 68% (Favorable)
OA Rounds: 3-4
To Grant: 3y 4m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 68%, above average (47 granted / 69 resolved; +6.1% vs TC avg)
Interview Lift: +35.7% (strong), comparing resolved cases with vs. without an interview
Typical Timeline: 3y 4m avg prosecution; 29 currently pending
Career History: 98 total applications across all art units

Statute-Specific Performance

§101: 8.9% (-31.1% vs TC avg)
§103: 52.6% (+12.6% vs TC avg)
§102: 11.2% (-28.8% vs TC avg)
§112: 26.1% (-13.9% vs TC avg)

Comparisons are against Tech Center average estimates • Based on career data from 69 resolved cases
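The headline allowance figures above can be reproduced from the raw counts on this page; a minimal sanity check, assuming the "+6.1% vs TC avg" delta is a simple percentage-point difference:

```python
# Reproduce the dashboard's headline figures from the raw counts shown above
# (47 granted out of 69 resolved cases).
granted, resolved = 47, 69

career_allow_rate = granted / resolved
print(f"Career allow rate: {career_allow_rate:.1%}")  # ~68.1%, displayed as 68%

# Assuming the delta is a percentage-point difference, the implied
# Tech Center average allowance rate is:
implied_tc_avg = career_allow_rate * 100 - 6.1
print(f"Implied TC average: {implied_tc_avg:.1f}%")   # ~62.0%
```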

Office Action

Rejections: §102, §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status

Claims 1-13 were pending for examination in Application No. 18/169,281, filed February 15, 2023. In the remarks and amendments received on July 31, 2025, claims 1, 3-8, 10, and 12-13 are amended and claim 2 is canceled. Accordingly, claims 1 and 3-13 are currently pending for examination in the application.

Priority (Previously Presented)

Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). The certified copy has been filed as Japan (JP) Patent Application No. 2022-143745, filed on September 9, 2022.

Response to Amendment

Applicant’s amendments filed July 31, 2025, to the Specification, Claims, and Drawings have overcome each and every objection and the 35 U.S.C. § 101 rejection regarding non-statutory subject matter previously set forth in the Non-Final Office Action mailed May 15, 2025. Accordingly, the objections and the § 101 rejection are withdrawn. The previously set forth 35 U.S.C. § 112(f) interpretation(s) are likewise withdrawn in response to Applicant’s amendments. The examiner thanks Applicant for considering the objections and the suggested amendments to the disclosure.

Response to Arguments

Applicant’s arguments filed July 31, 2025, regarding the rejections of independent claims 1, 12, and 13 have been fully considered but are not persuasive.
The examiner respectfully disagrees with Applicant's assertion that Wen fails to disclose the amended claim 1 limitation “estimate, when determining that estimating the attribute is difficult using a first identification target region, …the pseudo label based on a second identification target region that is different from the first identification target region” because the cited portion of Wen (para. [0035]) does not disclose “performing determination for the pseudo label when the reliability is low” (pg. 12 of Applicant’s Remarks). Wen’s recitation in para. [0035] of “ignoring the object key part with an unreliable position prediction result” is not, as Applicant asserts, a failure to perform determination for the pseudo label when the reliability is low. This is because the “ignoring” of the “object key part” when the reliability is low (i.e., “unreliable”) is part of the determination for “estimat[ing]… the pseudo label based on a second identification target region that is different from the first identification target region” as recited in claim 1. As detailed in the rejection of claim 1 below, the “unreliable” “object key part” that is “ignor[ed]” is the claimed “first identification target region” when “estimating the attribute—” (i.e., the “object key part”) “—is difficult” (i.e., “unreliable”). When this determination is made, the pseudo label is then estimated for a “reliable” “object key part”, which is “estimat[ing]… the pseudo label based on a second identification target region—” (i.e., the “reliable” “object key part”) “—that is different from the first identification target region” (i.e., the “unreliable” “object key part”). Therefore, independent claim 1 (and similarly recited independent claims 12 and 13) remains rejected under 35 U.S.C.
102(a)(2) as being anticipated by Wen, for similarly recited features therein, and dependent claims 3-11 remain rejected as anticipated by Wen or, where not rejected on the same ground, as rendered obvious over Wen in view of Lv in the combination(s) set forth in the current rejection below (see “Claim Rejections - 35 USC § 103” below).

Information Disclosure Statement

The information disclosure statement (IDS) submitted on June 27, 2025, is in compliance with the provisions of 37 CFR 1.97. Accordingly, the IDS has been considered by the examiner.

Claim Objections

Claims 12 and 13 are objected to because of the following informalities, which fail to comply with 37 CFR 1.71(a) (“full, clear, concise, and exact terms”; see MPEP § 608.01(m)): the examiner respectfully suggests amending the phrase “when determining that it is difficult to estimate the attribute” in the 4th-to-last line of claims 12 and 13 to recite “when determining that [[it is difficult to estimate]]estimating the attribute is difficult” to remove potential confusion regarding the antecedent of the term “it” in the claims. Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(d):

(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:

Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.
Claim 7 is rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends.

Regarding claim 7, the claim is rejected as incomplete because it depends on canceled base claim 2 (MPEP § 608.01(n)(V)). The examiner respectfully suggests amending the phrase “The image processing apparatus according to claim 2…” in the claim to recite “The image processing apparatus according to claim [[2]]1…”.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3-7, 9, and 12-13 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Wen et al. (Wen; US 2023/0290003 A1).
Regarding claim 1, Wen discloses an image processing apparatus comprising: one or more hardware processors (para. [0192], recite(s) [0192] “Refer to FIG. 12 , which is a schematic structural diagram of a server 1200 provided by an embodiment of this application. The server 1200 may greatly differ as configuration or performance differs, may include one or more central processing units (CPUs) 1222 (for example, one or more processors), a memory 1232 , and one or more storage mediums 1230 (for example, one or more mass storage devices) storing an application 1242 or data 1244…” ) configured to: acquire unlabeled training data including an image to which a correct label of an attribute is not assigned (paras. [0044] and [0051], recite(s) [0044] “After the above processing is performed on each object key part in the target training image, the labeling position corresponding to each target object key part in the target training image will be obtained, and then a key point labeling result corresponding to the target training image, that is, a pseudo label corresponding to the target training image, is formed by using the labeling position corresponding to each target object key part in the target training image. Then, the target training image and a key point labeling result corresponding thereto are taken as a training sample.” [0051] “The pseudo label is training data that is commonly used in semi-supervised learning. Usually, unlabeled data may be processed through a complex model with higher performance to obtain a pseudo label corresponding to the unlabeled data. The pseudo label may be inaccurate. In the embodiment of this application, the pseudo label corresponding to the target training image may be determined according to the key point detection results of the m reference object key point detection models for the target training image. 
This application aims to process the key point detection results of the m reference object key point detection models for the target training image through a series of processing processes to obtain the pseudo label which can accurately reflect the position of the object key part in the target training image, so as to improve the performance of the target object key point detection model trained based on a training sample including the pseudo label.” , where the “unlabeled data” including a “target training image” is unlabeled training data; and the “position” of the “object key part” or “object key point” is an attribute); estimate a pseudo label, which is an estimation result of the attribute of the image of the unlabeled training data, based on an identification target region according to a type of the attribute to be identified by a first learning model to be learned in the image of the unlabeled training data (para. [0051]—see citation above—, where para. [0053] further recites: [0053] “The target training image includes an image including the object to be detected. The object to be detected includes a plurality of object key parts. The object key part here includes a part that is on the object to be detected and that can reflect a posture of the object to be detected. Exemplarily, the target training image may be an image including a clear and complete human body to be detected. The human body to be detected includes a plurality of important joints, such as nose, left and right eyes, left and right ears, left and right shoulders, left and right elbows, left and right wrists, left and right buttocks, left and right knees, and left and right ankles.” , where each of the “plurality of object key parts” is an identification target region and the type of “object key part” (e.g., “important joints, such as nose, left and right eyes, left and right ears, left and right shoulders,” etc.) 
is a type of the attribute); and learn the first learning model that identifies the attribute of the image using first labeled training data for which the pseudo label is assigned to the image of the unlabeled training data (para. [0045], recite(s) [0045] “Then, the target object key point detection model is trained based on the training sample constructed in the above manner…” , where the “target object key point detection model” is the first learning model and the “training sample” is the first labeled training data disclosed previously in para. [0044] above), wherein the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult using a first identification target region, which is the identification target region used for learning of the first learning model, in the image of the unlabeled training data, the pseudo label based on a second identification target region that is different from the first identification target region (para. [0035], recite(s) [0035] “When the above model training method generates the training sample used for training the object key point detection model that needs to be actually put into use, object key point detection processing will be performed on the target training image through a plurality of reference object key point detection models with complex structures to obtain a plurality of key point detection results. Then, based on the principle that the predicted positions of the same object key part in various key point detection results are basically consistent, for each object key part, whether a position prediction result of each reference object key point detection model for the object key part is reliable is measured according to the predicted position of the key point corresponding to the object key part in each key point detection result, that is, whether the object key part is the target object key part is determined. 
When it is determined that the position prediction result of each reference object key point detection model for the object key part is reliable, the labeling position corresponding to the target object key part is further determined as a pseudo label. Then, a training sample is formed by using the target training image and the labeling position corresponding to each target object key part…”, where generating “pseudo label[s]” only when “object key part[s]” have a “reliable position” is estimating the pseudo label based on at least a second identification target region (e.g., other “object key part[s]” with “reliable position[s]”) different from at least a first identification target region (e.g., the “object key part” that has “an unreliable position”) when determining that it is difficult (e.g., “unreliable”) to estimate the attribute (e.g., “object key point”) using at least the first identification target region (e.g., the “object key part” currently being examined for reliability)).

Regarding claim 3, Wen discloses the image processing apparatus according to claim 1, wherein the one or more hardware processors are configured to estimate, when determining that the attribute is estimatable using the first identification target region in the image of the unlabeled training data, the pseudo label based on the first identification target region (para.
[0035]—see citation in claim 1 limitation “the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult…” above—, where generating “pseudo label[s]” only when “object key part[s]” have a “reliable position” is estimating the pseudo label based on at least a first identification target region (e.g., an “object key part” with a “reliable position”) when determining that the attribute (e.g., “object key point”) is estimatable (e.g., “reliable”) using the first identification target region (e.g., the “object key part” currently being examined for reliability)).

Regarding claim 4, Wen discloses the image processing apparatus according to claim 1, wherein the one or more hardware processors are configured to determine, when a state of a subject represented by the identification target region in the image of the unlabeled training data does not satisfy a predetermined estimatable condition for estimating the attribute from the first identification target region, that estimating the attribute using the first identification target region is difficult (para. [0127], recite(s) [0127] “(2) From a spatial perspective, for a key point corresponding to a certain human body part, a server may calculate a mean value of predicted coordinates of the key point corresponding to the human body part in the m prediction results as reference coordinates corresponding to the human body part. Then, whether the prediction result belongs to a qualified prediction result corresponding to the human body part may be determined based on a distance between the predicted coordinates of the key point corresponding to the human body part in each prediction result and the reference coordinates.
Specifically, it may be considered that the prediction result belongs to the qualified prediction result corresponding to the human body part when the distance between the predicted coordinates of the key point corresponding to the human body part in the prediction result and the reference coordinates is less than 0.1 (after normalization processing). It may be considered that the prediction result does not belong to the qualified prediction result corresponding to the human body part when the distance between the predicted coordinates of the key point corresponding to the human body part in the prediction result and the reference coordinates is not less than 0.1. When a quantity of qualified prediction results corresponding to the human body part is greater than or equal to m/2, the server may determine that the human body part belongs to a valid human body part, and average the predicted coordinates of the key point corresponding to the valid human body part in the qualified prediction results corresponding to the valid human body part to obtain labeling coordinates corresponding to the valid human body part. When a quantity of qualified prediction results corresponding to the human body part is less than m/2, the server may determine that the human body part belongs to an invalid human body part, ignore the invalid human body part, and not determine corresponding labeling coordinates thereof. 
Then, the server may form a pseudo label {tilde over (y)}t corresponding to the picture xt by using the labeling coordinates corresponding to each valid human body part.”, where determining that the object “key point” “does not belong to the qualified prediction result corresponding to the human body part when the distance between the predicted coordinates of the key point corresponding to the human body part in the prediction result and the reference coordinates is not less than 0.1 [or a mean value]” is determining that estimating the attribute (e.g., object “key point”) using the first identification target region (e.g., “corresponding… human body part”) is difficult (e.g., “does not belong to qualified prediction…” or “unreliable” as recited in para. [0035]—see claim 1 limitation “the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult…” above—) when a state of a subject represented by the identification target region (e.g., “coordinates of the key point corresponding to the human body part in each prediction result”) does not satisfy a predetermined estimatable condition (e.g., a “distance… less than 0.1 [or a mean value]”)).

Regarding claim 5, Wen discloses the image processing apparatus according to claim 1, wherein the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult using the first identification target region in the image of the unlabeled training data, the pseudo label set in advance according to a state of a subject represented by the second identification target region (para.
[0035]—see citation in claim 1 limitation “the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult…” above—, where generating “pseudo label[s]” only when “object key part[s]” have a “reliable position” is estimating the pseudo label when determining that it is difficult (e.g., “unreliable”) to estimate the attribute (e.g., “object key point”) using at least the first identification target region (e.g., the “object key part” determined as “unreliable”); wherein para. [0127]—see citation in claim 4 above—further recites that the estimated pseudo label is set in advance, as Wen discloses that the generated “pseudo label[s]” for each object “key point” is associated with a “corresponding… human body part” known in advance (e.g., “important joints” of the human body—see para. [0053] in claim 1 limitation “a pseudo label estimation unit…” above) and is set in advance according to a state of a subject (e.g., “coordinates of the key point corresponding to the human body part in each prediction result”) represented by at least a second identification target region (e.g., an “object key part” with a “reliable position”)).

Regarding claim 6, Wen discloses the image processing apparatus according to claim 3, wherein the one or more hardware processors are configured to estimate, when determining that the attribute is estimatable using the first identification target region, the pseudo label from the first identification target region of the image of the unlabeled training data (para. [0035]—see similar limitation in claim 3 above) using a second learning model learned in advance (para. [0035]—see citation in claim 1 limitation “the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult…” above—, where a “reference object key point detection model[s]” is at least a second learning model; and para.
[0052] further recite(s) the second learning model is learned in advance (i.e., “pre-trained”): [0052] “The reference object key point detection model is a pre-trained model for detecting a position where an object key part is located on an object to be detected in an image, and can accurately detect the position where the object key part is located…” ).

Regarding claim 7, Wen discloses the image processing apparatus according to claim [[2]]1, wherein the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult using the first identification target region in the image of the unlabeled training data, the pseudo label from the second identification target region of the image of the unlabeled training data (para. [0035]—see similar limitation in claim 1 limitation “the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult…” above) using a second learning model learned in advance (para. [0035]—see citation in claim 1 limitation “the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult…” above—, where a “reference object key point detection model[s]” is at least a second learning model; and para. [0052]—see citation in claim 6 above—further recite(s) the second learning model is learned in advance (i.e., “pre-trained”)).

Regarding claim 9, Wen discloses the image processing apparatus according to claim 6, wherein the first learning model is a learning model having a processing speed higher than a processing speed of the second learning model (paras. [0086] and [0102], recite(s) [0086] “It is to be understood that the working principles of the target object key point detection model and the reference object key point detection model mentioned above are basically the same, but there are differences in the model structures of the two.
Generally, the structure of the target object key point detection model is simpler than that of the reference object key point detection model…” [0102] “In some embodiments, the embodiment of this application may also introduce an idea of knowledge distillation into a training process for the target object key point detection model, so as to further improve the model performance of the trained target object key point detection model. Knowledge distillation is a model training manner of guiding a simple model (also referred to as a student model) by using the knowledge leaned by a complex model (as referred to as a teacher model), which aims to make the simple model have comparable performance to the complex model. Moreover, the quantity of parameters of the simple model is greatly reduced compared with that of the complex model, thereby realizing compression and acceleration of a model.”, where the first learning model (i.e., the “target object key point detection model”) being a “simpler” model than the second learning model (i.e., the “reference object key point detection model”) is the first learning model (e.g., a “student model”) having a higher processing speed (e.g., “acceleration”) than a processing speed of the second learning model (e.g., a “teacher model”)).

Regarding claim 12, the claim is the method performed by the apparatus of claim 1. Therefore, claim 12 recites similar limitations to claim 1 and is rejected for similar rationale and reasoning (see the analysis for claim 1 above).

Regarding claim 13, the claim is a computer program product having a non-transitory computer readable medium including programmed instructions stored thereon, wherein the instructions, when executed by a computer, cause the computer to perform the functions of claim 1. Wen discloses said non-transitory computer readable medium (para. [0192]—see citation in claim 1 limitation “one or more hardware processors…” above).
Therefore, claim 13 recites similar limitations to claim 1 and is rejected for similar rationale and reasoning (see the analysis for claim 1 above).

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 8 and 10-11 are rejected under 35 U.S.C. 103 as being unpatentable over Wen as applied to claims 3 and 1 above, and further in view of Lv et al. (Lv; US 2024/0203103 A1).
Regarding claim 8, Wen discloses the image processing apparatus according to claim 3, wherein the one or more hardware processors are configured to estimate, when determining that the attribute is estimatable using the first identification target region in the image of the unlabeled training data, the pseudo label from the first identification target region of the image of the unlabeled training data (para. [0035]—see similar limitation in claim 3 above) using (a second) learning model (para. [0035]—see citation in claim 1 limitation “the one or more hardware processors are configured to estimate, when determining that estimating the attribute is difficult…” above—, where a “reference object key point detection model[s]” is at least a second learning model).

Wen does not specifically disclose estimates… the pseudo label… using the first learning model. However, Lv teaches, in the same field of endeavor of semi-supervised learning including pseudo-label generation, estimates… the pseudo label… using the first learning model (paras. [0013], [0025], and [0046], recite(s) [0013] “The existing method for training an image classification model uses a pre-trained teacher network to generate a pseudo-label for a sample image, and then performs semi-supervised learning through a student network by using the sample image with the pseudo-label and finally forms the image classification model.
However, since the pseudo-labels are completely generated by the conversion of the information learned from the sample images by the teacher network, it is not conducive for the student network to fully mine and utilize the information of the sample images, especially in an early stage of training process, a confidence of the pseudo-label generated by the teacher network is not high, which leads to the poor training effect of the image classification model, and then affects the accuracy and stability for classifying the image classification finally.” [0025] “In the embodiment of the present application, the target sub-model is the first sub-model or the second sub-model…” [0046] “As shown in FIG. 2, based on the classification reference information output by the second sub-model for the first unlabeled image, the electronic device determines the preset category corresponding to the maximum probability from the multiple preset categories, in response that the maximum probability corresponding to the preset category is greater than the preset probability threshold, then the electronic device generates the pseudo label of the first unlabeled image corresponding to the first sub-model; and based on the classification reference information output by the first sub-model for the first unlabeled image, the electronic device determines the preset category corresponding to the maximum probability from the multiple preset categories, in response that the maximum probability corresponding to the preset category is greater than the preset probability threshold, then the electronic device generates the pseudo label of the first unlabeled image corresponding to the second sub-model.” , where each of a first learning model (e.g., the “first sub-model” or a “student” network) and a second learning model (e.g., the “second sub-model” or a “teacher” network) generates (i.e., estimates) pseudo labels). 
Since Wen also discloses using a teacher-student model framework for training a machine learning model including training data with pseudo labels (paras. [0051], [0086], and [0102], recite(s) [0051] “The pseudo label is training data that is commonly used in semi-supervised learning…” [0086] “It is to be understood that the working principles of the target object key point detection model and the reference object key point detection model mentioned above are basically the same, but there are differences in the model structures of the two. Generally, the structure of the target object key point detection model is simpler than that of the reference object key point detection model…” [0102] “In some embodiments, the embodiment of this application may also introduce an idea of knowledge distillation into a training process for the target object key point detection model, so as to further improve the model performance of the trained target object key point detection model. Knowledge distillation is a model training manner of guiding a simple model (also referred to as a student model) by using the knowledge leaned by a complex model (as referred to as a teacher model), which aims to make the simple model have comparable performance to the complex model. 
Moreover, the quantity of parameters of the simple model is greatly reduced compared with that of the complex model, thereby realizing compression and acceleration of a model.” , where the “target object key point detection model” is a student model (e.g., a first learning model) and the “reference object key point detection model[s]” is/are teacher model(s) (e.g., second learning model(s))), it would have been obvious to one of ordinary skill in the art before the effective filing date of the presently filed invention to modify the system of Wen to incorporate estimating the pseudo label using at least the first learning model to improve training the first learning model by fully utilizing training images as taught by Lv (paras. [0014] and [0036], recite(s) [0014] “In view of this, an embodiment of the present application provides a method for training the image classification model. Under a semi-supervised learning framework, a one-way teacher-student relationship between each sub-model of the model is improved to a mutual teacher-student relationship. 
Using the information learned by one sub-model from the sample images, to provide pseudo-labels for semi-supervised learning for another sub-model, so that each sub-model can learn from each other and teach each other, thereby making the information of the sample images can be fully mined and utilized, thereby improving the training effect of the model, and obtaining a more accurate and reliable model.” [0036] “In this way, the first sub-model and the second sub-model may use their own learned information to provide guidance to each other, so that the one-way teacher-student relationship between the first sub-model and the second sub-model is changed to a mutual teacher-student relationship, which is conducive to complementary learning and teaching between various sub-models, so that the information of the images of the image set can be fully explored and utilized, which is conducive to improving the training effect of the model.”).

Regarding claim 10, Wen discloses the image processing apparatus according to claim 1, wherein the one or more hardware processors are configured to (paras. [0044] and [0051]—see claim 1 limitation “an acquisition unit…” above), and learn the first learning model by using the first labeled training data (paras. [0044-0055]—see claim 1 limitation “a learning unit…” above—, where the “training sample” comprising training images with pseudo labels is a first labeled training data).
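The mutual teacher-student relationship described in Lv's paragraphs [0014] and [0036] above, in which each sub-model's confident predictions supply pseudo labels for the other sub-model, might be sketched as follows. This is an illustrative reading only, not Lv's or the applicant's actual implementation; all names and the threshold are invented:

```python
# Hypothetical sketch (all names invented; not from the record) of the mutual
# teacher-student exchange described in Lv [0014]/[0036]: each sub-model's
# confident predictions become pseudo labels that supervise the *other*
# sub-model, replacing a one-way teacher -> student arrangement.

def exchange_pseudo_labels(preds_a, preds_b, threshold=0.9):
    """preds_a / preds_b: per-image class-probability lists from sub-models
    A and B. Returns (labels_for_a, labels_for_b): sub-model A is supervised
    by B's confident predictions and vice versa; images whose maximum
    probability does not clear the threshold get None (no pseudo label)."""
    def confident_class(probs):
        best = max(range(len(probs)), key=lambda i: probs[i])
        return best if probs[best] > threshold else None

    labels_for_a = [confident_class(p) for p in preds_b]  # B teaches A
    labels_for_b = [confident_class(p) for p in preds_a]  # A teaches B
    return labels_for_a, labels_for_b
```

The design point the quoted passages stress is symmetry: neither sub-model is permanently the teacher, so information one sub-model extracts from the sample images can flow to the other.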
While Wen does not specifically disclose acquiring second labeled training data including an image to which the correct label is assigned, and learning the first learning model by using the first labeled training data and the second labeled training data, Lv teaches, in the same field of endeavor of semi-supervised learning including pseudo-label generation, acquiring second labeled training data including an image to which the correct label is assigned, and learning the first learning model by using the first labeled training data and the second labeled training data (paras. [0019-0021], [0024], and [0052], recite(s) [0019] “S102, the electronic device obtains an image set used for training a model.” [0020] “Among them, the image set includes a labeled image, an unlabeled image, and a category label of the labeled image.” [0021] “The labeled image refers to an image with a category label, and the unlabeled image refers to an image without a category label. In practical applications, in order to further improve the accuracy for classifying the images of the model, the image set may include multiple labeled images and multiple unlabeled images, and the multiple labeled images may belong to different categories.” [0024] “In order to train the model with high accuracy and reliability under a condition that the number of the labeled images is limited, as shown in FIG. 2, the model in the embodiment of the present application may include a first sub-model and a second sub-model, the first sub-model discriminates each image of the image set and determines the first classification reference information, the second sub-model discriminates each image of the image set and determines the first classification reference information. Based on this, the model is finally obtained by using the semi-supervised learning algorithm to train the model.
In practical applications, the first sub-model and the second sub-model may have the same network structure, or, in order to simplify the model structure to achieve compression and acceleration of the model, the first sub-model and the second sub-model may also have different network structures. For example, the second sub-model adopts a more streamlined structure than the first sub-model.” [0052] “In the above mentioned S162, the first unsupervised loss of the target sub-model may be determined according to the first classification reference information of each second unlabeled image of the image set, the first pseudo label of each unlabeled image, and a preset loss function…”, where a first learning model (i.e., a “target sub-model”, such as a “first sub-model”) is trained using first labeled training data (e.g., unlabeled data with pseudo labels—see para. [0052] above) and second labeled training data (e.g., labeled training data)). Since Wen also discloses using a teacher-student model framework for training a machine learning model including training data with pseudo labels (paras. [0051], [0086], and [0102]—see citations in the rationale and reasoning to combine Wen and Lv in claim 8 above), it would have been obvious to one of ordinary skill in the art before the effective filing date of the presently filed invention to modify the system of Wen to incorporate acquiring second labeled training data including an image to which the correct label is assigned and training the first learning model using both the first labeled training data and the second labeled training data to improve the accuracy of the first learning model as taught by Lv (para. [0021]—see citation above).
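Lv's paragraph [0052] determines an unsupervised loss from pseudo labels and a "preset loss function," which is then used alongside supervised training on the ground-truth labels. A minimal sketch of such a combined semi-supervised objective follows; cross-entropy as the loss and a simple weighted sum of the two terms are both assumptions for illustration, not details from the record:

```python
import math

# Illustrative sketch of a combined semi-supervised objective of the kind
# suggested by Lv [0052]. Assumptions (not from the record): cross-entropy as
# the "preset loss function" and a simple weighted sum of the two terms.

def cross_entropy(class_probs, label):
    """Negative log-probability assigned to the (pseudo or true) label."""
    return -math.log(class_probs[label])

def semi_supervised_loss(labeled, pseudo_labeled, weight=1.0):
    """labeled / pseudo_labeled: lists of (class_probs, label) pairs, where
    `labeled` carries ground-truth labels (second labeled training data) and
    `pseudo_labeled` carries pseudo labels (first labeled training data).
    Returns mean supervised loss + weight * mean unsupervised loss."""
    sup = sum(cross_entropy(p, y) for p, y in labeled) / max(len(labeled), 1)
    unsup = sum(cross_entropy(p, y) for p, y in pseudo_labeled) / max(len(pseudo_labeled), 1)
    return sup + weight * unsup
```

This is the structural point of the claim-10 mapping: one model is trained on both the pseudo-labeled set and the ground-truth-labeled set in a single objective.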
Regarding claim 11, Wen in view of Lv discloses the image processing apparatus according to claim 10, wherein Lv further teaches the image included in at least one of the unlabeled training data, the first labeled training data, and the second labeled training data is an image of a same type as an input image to be processed of the first learning model (para. [0057], recite(s) [0057] “For example, the classification reference information obtained after predicting the same image by different sub-models should theoretically be the same, and the pseudo label of the same image corresponding to different sub-models should also be the same…”).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Dong et al. (“Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection,” 2019) discloses in the abstract and Fig. 1: [abstract] “Facial landmark detection aims to localize the anatomically defined points of human faces. In this paper, we study facial landmark detection from partially labeled facial images. A typical approach is to (1) train a detector on the labeled images; (2) generate new training samples using this detector's prediction as pseudo labels of unlabeled images; (3) retrain the detector on the labeled samples and partial pseudo labeled samples. In this way, the detector can learn from both labeled and unlabeled data and become robust. In this paper, we propose an interaction mechanism between a teacher and two students to generate more reliable pseudo labels for unlabeled data, which are beneficial to semi-supervised facial landmark detection. Specifically, the two students are instantiated as dual detectors. The teacher learns to judge the quality of the pseudo labels generated by the students and filter out unqualified samples before the retraining stage.
In this way, the student detectors get feedback from their teacher and are retrained by premium data generated by itself. Since the two students are trained by different samples, a combination of their predictions will be more robust as the final prediction compared to either prediction. Extensive experiments on 300-W and AFLW benchmarks show that the interactions between teacher and students contribute to better utilization of the unlabeled data and achieves state-of-the-art performance.”

[media_image1.png: Fig. 1 of Dong et al.]

Hwang et al. (“Lightweight 3D Human Pose Estimation Network Training Using Teacher-Student Learning,” 2020) discloses in the abstract and Fig. 1: [abstract] “We present MoVNect, a lightweight deep neural network to capture 3D human pose using a single RGB camera. To improve the overall performance of the model, we apply the teacher-student learning method based knowledge distillation to 3D human pose estimation. Real-time post-processing makes the CNN output yield temporally stable 3D skeletal information, which can be used in applications directly. We implement a 3D avatar application running on mobile in real-time to demonstrate that our network achieves both high accuracy and fast inference time. Extensive evaluations show the advantages of our lightweight model with the proposed training method over previous 3D pose estimation methods on the Human3.6M dataset and mobile devices.”

[media_image2.png: Fig. 1 of Hwang et al.]

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.
In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JULIA Z YAO whose telephone number is (571)272-2870. The examiner can normally be reached Monday - Friday (8:30AM - 5PM).

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell, can be reached at (571)270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /J.Z.Y./Examiner, Art Unit 2666 /EMILY C TERRELL/Supervisory Patent Examiner, Art Unit 2666

Prosecution Timeline

Feb 15, 2023
Application Filed
May 08, 2025
Non-Final Rejection — §102, §103, §112
Jul 31, 2025
Response Filed
Oct 03, 2025
Final Rejection — §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597169
ACTIVITY PREDICTION USING PORTABLE MULTISPECTRAL LASER SPECKLE IMAGER
2y 5m to grant Granted Apr 07, 2026
Patent 12586219
Fast Kinematic Construct Method for Characterizing Anthropogenic Space Objects
2y 5m to grant Granted Mar 24, 2026
Patent 12579638
IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND IMAGE PROCESSING PROGRAM FOR PERFORMING DETERMINATION REGARDING DIAGNOSIS OF LESION ON BASIS OF SYNTHESIZED TWO-DIMENSIONAL IMAGE AND PRIORITY TARGET REGION
2y 5m to grant Granted Mar 17, 2026
Patent 12562063
METHOD FOR DETECTING ROAD USERS
2y 5m to grant Granted Feb 24, 2026
Patent 12561805
METHODS AND SYSTEMS FOR GENERATING DUAL-ENERGY IMAGES FROM A SINGLE-ENERGY IMAGING SYSTEM BASED ON ANATOMICAL SEGMENTATION
2y 5m to grant Granted Feb 24, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
68%
Grant Probability
99%
With Interview (+35.7%)
3y 4m
Median Time to Grant
Moderate
PTA Risk
Based on 69 resolved cases by this examiner. Grant probability derived from career allow rate.
