Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Status
Claims 1-17 are pending.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 02/28/2024 and 03/05/2026 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 2-11 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 2 recites the limitation "select two or more images of interest from the images, based on the first features; and select, from the images, the auxiliary images similar to the selected two or more of the images of interest" in lines 3-6. There is insufficient antecedent basis for this limitation in the claim. Claim 1 recites "select, from the images, an image of interest . . . and an auxiliary image". There is no mention of multiple images of interest or multiple auxiliary images as claimed in claim 2 and its dependent claims. As such, there is no antecedent basis for the limitations of claim 2 and its dependent claims.
Claims 3-11 are rejected for the same reasons as claim 2 above, by virtue of their dependence on claim 2.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 12, and 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Makhzani et al. (Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., & Frey, B. (2015). Adversarial autoencoders. arXiv preprint arXiv:1511.05644, hereinafter "Makhzani") in view of Noguchi (U.S. Patent Publication No. 2021/0090312, listed in the IDS received 03/05/2026, hereinafter "Noguchi").
Regarding claim 1, Makhzani discloses a similar image set creating apparatus configured to:
acquire a plurality of images (Makhzani Fig. 8: x);
extract a plurality of first features from the images by using a first model that executes an image classification task (Makhzani Page 10: “The architecture that we use is similar to Figure 8, with the difference that we remove the semi supervised classification stage and thus no longer train the network on any labeled mini-batch. Another difference is that the inference network q(y|x) predicts a one-hot vector whose dimension is the number of categories that we wish the data to be clustered into. Figure 9 illustrates the unsupervised clustering performance of the AAE on MNIST when the number of clusters is 16. Each row corresponds to one cluster. The first image in each row shows the cluster heads, which are digits generated by fixing the style variable to zero and setting the label variable to one of the 16 one-hot vectors”; Makhzani Fig. 8: top adversarial network);
extract a plurality of second features from the images by using a second model that executes an image classification task (Makhzani Page 10: “The architecture that we use is similar to Figure 8, with the difference that we remove the semi supervised classification stage and thus no longer train the network on any labeled mini-batch. Another difference is that the inference network q(y|x) predicts a one-hot vector whose dimension is the number of categories that we wish the data to be clustered into. Figure 9 illustrates the unsupervised clustering performance of the AAE on MNIST when the number of clusters is 16. Each row corresponds to one cluster. The first image in each row shows the cluster heads, which are digits generated by fixing the style variable to zero and setting the label variable to one of the 16 one-hot vectors”; Makhzani Fig. 8: bottom adversarial network), the second model being trained in such a manner that mutually similar images in a latent space are continuously distributed compared to the first model (Makhzani Fig. 8 description: “the top adversarial network imposes a Categorical distribution on the label representation and the bottom adversarial network imposes a Gaussian distribution on the style representation”); and
create an image set based on the images (Makhzani Page 11: “We used the following evaluation protocol: Once the training is done, for each cluster i, we found the validation example xn that maximizes q(yi|xn), and assigned the label of xn to all the points in the cluster i. We then computed the test error based on the assigned class labels to each cluster. As shown in Table 3, the AAE achieves the classification error rate of 9.55% and 4.10% with 16 and 30 total labels respectively. We observed that as the number of clusters grows, the classification rate improves”).
Makhzani does not explicitly disclose the apparatus comprising processing circuitry, the processing circuitry being configured to:
select, from the images, an image of interest serving as a reference of a similar image set, and an auxiliary image similar to the image of interest, based on the first features and the second features.
However, Noguchi teaches the apparatus comprising processing circuitry (Noguchi [0008]: “The image processing apparatus may comprise a processor”), the processing circuitry configured to:
select, from the images, an image of interest serving as a reference of a similar image set (Noguchi [0089]: “In a case where the centroid is determined, the image closest to the determined centroid (image distributed at the closest position) is determined by the CPU 2 as a representative image of the group having the centroid (step S37 in FIG. 5). In a case where the representative image is determined, the image classification result is displayed in the classification result window by using the determined representative image (step S38 in FIG. 5). This classification result window is displayed on the display screen of the display apparatus 3”; Noguchi [0091]: “In the classification result display window 80, representative images IR1 to IR10 of each group classified into 10 groups are displayed under the control of the CPU 2 (a representative image display control device). For example, each of the representative images IR1 to IR10 is an image representing each of the groups G1 to G10 shown in FIG. 12. Each of the representative images IR1 to IR10 is the image closest to the centroid C1 to the centroid C10 of each of the groups G1 to G10 shown in FIG. 12”; Noguchi [0093]: “In a case where any one of the representative image IR1 to the representative image IR10 displayed in the classification result display window 80 is double-clicked (any device other than the mouse 9 may be used as long as it can select an image, such as touching the touch panel display) by the mouse 9 (a representative image selection device) (YES in step S39 of FIG. 5), the double-clicked representative image is selected. Images included in the group represented by the selected representative image (images similar to each other) are displayed in a list in the second classification window 70A (step S40 in FIG. 5)”), and an auxiliary image similar to the image of interest, based on the first features and the second features (Noguchi [0089], [0091], and [0093], quoted above).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the image selection as taught by Noguchi with the apparatus of Makhzani because it would allow a user to view and make decisions based on the objects in the image sets (Noguchi [0089], [0091], [0093]), for example, a doctor examining medical images or an engineer inspecting manufactured parts. This motivation for the combination of Makhzani and Noguchi is supported by KSR exemplary rationale (G) (some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention) and rationale (D) (applying a known technique to a known device (method, or product) ready for improvement to yield predictable results).
Regarding claim 16, it is rejected under the same analysis as claim 1 above.
Regarding claim 17, it is rejected under the same analysis as claim 1 above along with Noguchi’s teaching of a non-transitory computer readable medium (Noguchi Claim 17: “A non-transitory recording medium storing a computer-readable program”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the non-transitory computer readable medium as taught by Noguchi with the method of Makhzani because it would allow the method to be run on any general-purpose computer. This motivation for the combination of Makhzani and Noguchi is supported by KSR exemplary rationale (D) (applying a known technique to a known device (method, or product) ready for improvement to yield predictable results).
Regarding claim 12, Makhzani discloses the apparatus, wherein the first model is trained in such a manner that a distance between mutually similar images in a latent space becomes greater than in the second model (Makhzani Fig. 8 description: “the top adversarial network imposes a Categorical distribution on the label representation and the bottom adversarial network imposes a Gaussian distribution on the style representation”).
Regarding claim 14, Makhzani does not explicitly disclose the apparatus, wherein the processing circuitry is configured to cause a display device to display the image of interest and the auxiliary image as the similar image set.
However, Noguchi teaches the apparatus, wherein the processing circuitry is configured to cause a display device to display the image of interest and the auxiliary image as the similar image set (Noguchi [0089], [0091], and [0093], quoted in the rejection of claim 1 above).
It would have been obvious to combine Makhzani and Noguchi for the same reasons as stated for claim 1 above.
Regarding claim 15, Makhzani does not explicitly disclose the apparatus, wherein the processing circuitry is configured to create a limit sample, based on the image of interest and the auxiliary image.
However, Noguchi teaches the apparatus, wherein the processing circuitry is configured to create a limit sample, based on the image of interest and the auxiliary image (Noguchi [0097]: “In a case where the second designation classification button B11 is pressed (YES in step S41 of FIG. 5), a classification command is generated and input to the CPU 2. The image displayed in the second classification window 70A becomes an image to be newly classified, and the CPU 2 (an image classification device) performs the first stage classification that classifies the images into a plurality of groups for each similar image (step S26 in FIG. 3). The similar images displayed in the second classification window 70A are further classified into groups having the number of groups input in the group number input window 73 for each more similar image”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limit sample as taught by Noguchi with the apparatus of Makhzani because it would create an even more precise sample for a user to review and act on. This motivation for the combination of Makhzani and Noguchi is supported by KSR exemplary rationale (D) (applying a known technique to a known device (method, or product) ready for improvement to yield predictable results).
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over the Makhzani and Noguchi combination in view of Tao et al. (U.S. Patent Publication No. 2021/0248246, listed in the IDS received 02/28/2024, hereinafter "Tao").
Regarding claim 13, the Makhzani and Noguchi combination does not explicitly disclose the apparatus, wherein the first model and the second model are generated by unsupervised representation learning and are trained based on mutually different loss functions.
However, Tao teaches the apparatus, wherein the first model and the second model are generated by unsupervised representation learning and are trained based on mutually different loss functions (Tao [0076]: “The update unit 20G receives the first loss for the plurality of first target data XA from the first loss calculation unit 20E. The update unit 20G receives the second loss for the element classes G from the second loss calculation unit 20F. The update unit 20G updates the parameters such that both of the received first loss and the received second loss become lower. Specifically, the update unit 20G updates the parameters such that the received first loss and the received second loss are lower than the first loss and the second loss calculated by the above-mentioned processing by using the learning model 30 the parameters of which are currently stored in the storage unit 20A”; Tao [0077]: “Specifically, the update unit 20G calculates the parameters such that a loss function expressed by Equation (5) below becomes lower”; Tao [0078]: “In Equation (5), L represents the loss function. L_1 represents the first loss. L_2 represents the second loss. a is a weighting of the second loss, and is a positive real number. a only needs to be determined in advance depending on the type of target data X. In the present embodiment, the case where a=1 is described as an example”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the loss functions as taught by Tao with the apparatus of the Makhzani and Noguchi combination because it would improve the accuracy of the method by minimizing loss and would allow each model to be refined for its specific purpose. This motivation for the combination of Makhzani, Noguchi, and Tao is supported by KSR exemplary rationale (D) (applying a known technique to a known device (method, or product) ready for improvement to yield predictable results).
Allowable Subject Matter
Claims 2-11 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AIDAN KEUP whose telephone number is (703)756-4578. The examiner can normally be reached Monday - Friday 8:00-4:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Emily Terrell, can be reached at (571) 270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AIDAN KEUP/
Examiner, Art Unit 2666

/Molly Wilburn/
Primary Examiner, Art Unit 2666