DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 16 January 2026 has been entered.
Response to Amendment
Applicant’s response, filed 16 January 2026, to the last office action has been entered and made of record.
The cancellation of claim 2 is acknowledged and made of record.
The amendments to the claims are acknowledged; they are supported by the original disclosure, and no new matter is added.
Amendments to the independent claims 1, 7, and 8 have necessitated a new ground of rejection over the applied prior art. Please see below for the updated interpretations and rejections.
The addition of new claims 9-19 is acknowledged and made of record.
Examiner Notes - Claim Interpretation
Claims 6, 13, and 18 recite the amended claim term “magnitude of a sum of differences”, which is not explicitly recited in the originally filed disclosure. The originally filed specification does recite, “[0063] In such a manner, the reference image selecting section 66 may select a reference image from among a plurality of candidate images on the basis of the smallness of the sum of the feature amount differences from the respective predetermined number of other candidate images.” The specification’s recitation of “smallness of the sum of the feature amount differences” implies a size indicating a measure of the feature amount differences, and is understood to provide written description support for a broadest reasonable interpretation, in light of the specification, of the amended term “magnitude of a sum of differences”.
Response to Arguments
Applicant’s arguments with respect to amended independent claims 1, 7, and 8 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 3-4, 10-11, and 15-16 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claims 1, 7, and 8 are amended to recite “select a predetermined number of second features among the plurality of second features based on a difference between the first feature and each one of the plurality of second features; calculate a sum of differences between the first feature and each one of the predetermined number of second features”.
Claims 3-4, 10-11, and 15-16 recite, “discard[ing] the feature data when the difference is greater than a given difference” and “discard[ing] the feature data when the difference is smaller than a given difference”.
The recitation of “the difference” in claims 3-4, 10-11, and 15-16, as currently presented, refers to the recitation in amended claims 1, 7, and 8 of “a difference between the first feature and each one of the plurality of second features”, which is used to select the predetermined number of second features and to calculate the sum of differences for selecting the reference image. See specification [0058]-[0063], [0067]-[0070], and Fig. 6A.
However, the originally filed disclosure does not provide written description support for the combination of such a difference being compared to a given difference threshold, so as to discard the feature data when the difference is greater than, or smaller than, the given difference threshold.
The originally filed disclosure does describe, separately, a “value D_min” indicating a distance between the feature amount corresponding to a latest captured sample image and the feature amount of selected, closest positive-example training data. The disclosure determines whether the “value D_min” distance is larger than a first threshold and smaller than a second threshold, as a predetermined condition for storing or discarding the feature amount corresponding to the latest captured sample image as training data. See specification [0073]-[0079] and Fig. 6B.
Examiner notes that the separate “value D_min” distance is not described in the originally filed disclosure to be used in selecting a predetermined number of second features or for calculating a sum of differences for selecting a reference image.
For the purposes of further considering the Application on the merits, the Examiner assumes that the recited “the difference” of claims 3-4, 10-11, and 15-16 is intended to refer to “a third sample feature distance” corresponding to a distance between the feature amount corresponding to a latest captured sample image and the feature amount of a selected, closest positive-example training data.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1, 3-4, 7-11, 14-16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Turkelson et al. (US 2020/0193552, effectively filed 18 December 2018), herein Turkelson, in view of Hatanaka et al. (US 2018/0130225), herein Hatanaka, and Shibata et al. (US 2020/0293813, effectively filed 6 December 2017), herein Shibata.
Regarding claim 1, Turkelson discloses a training data selection device comprising:
circuitry (see Turkelson Fig. 1, [0040], [0044], and [0090]-[0091], where a training data database is disclosed, a camera of a mobile computing device or kiosk is used to capture images, and computing systems include one or more processors) configured to:
store training data indicating a first feature corresponding to a first sample image obtained by photographing a sample (see Turkelson [0059], where the training data set may be stored in the training data database; see Turkelson [0062]-[0063], where similarity is determined between visual features extracted from newly captured images and images within the training dataset, suggesting features corresponding to images stored within the training dataset);
acquire a plurality of second sample images obtained by photographing the sample (see Turkelson [0041], where a plurality of images depicting objects are obtained for generating training data; see Turkelson [0062], where a new image of an object is obtained);
generate a plurality of feature data for a plurality of second features, each feature data indicating a second feature corresponding to each second sample image of the plurality of second sample images (see Turkelson [0063], where visual features are extracted from the new image); and
determine whether to store each feature data as part of the training data (see Turkelson [0063]-[0068], where a similarity between two images can be determined by computing a distance in feature space between the feature vector representing the new image and a feature vector of a corresponding image from the training data set, and a determination is made that the new image depicts an object from objects depicted by the images of the training data set based on the distance between the two feature vectors when compared to a threshold distance, and that the new image is to be added to the training data set based on the similarity).
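The thresholded feature-space comparison for which Turkelson [0063]-[0068] is cited can be sketched as follows. This is an illustrative assumption only: Euclidean distance is one of the metrics Turkelson names, and the function names and values are the Examiner's hypothetical shorthand, not Turkelson's implementation.

```python
import math

def euclidean_distance(a, b):
    """Distance in feature space between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def depicts_known_object(new_vec, training_vecs, threshold):
    """Treat the new image as depicting a known object when its feature
    vector lies within the threshold distance of any training vector."""
    return any(euclidean_distance(new_vec, t) <= threshold
               for t in training_vecs)
```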
While Turkelson does not explicitly disclose the teachings in a single embodiment, one of ordinary skill in the art would have found it obvious to combine the related teachings of Turkelson’s disclosed embodiments to identify and determine new images to be added and stored in the training data set (see Turkelson [0102]).
This modification is rationalized as some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention.
In this instance, Turkelson teaches, in separate embodiments, related features of storing a training data set, capturing new images, and determining the similarity of extracted feature vectors of new images and images stored in the training data set to determine whether to add the new image to the training data set, and provides a suggestion and motivation that the disclosed teachings can be modified and combined.
One of ordinary skill in the art would have reasonably expected that Turkelson’s combined teachings would successfully store the training data set, capture new images, and determine the similarity of extracted feature vectors of new images and images stored in the training data set to determine whether to add the new image to the training data set, as each element merely performs the same function as it would separately.
While Turkelson teaches that the distance between the feature vector representing the new image and a feature vector of a corresponding image from the training data set may include cosine distance, Euclidean distance, and other metrics by which similarity may be computed (see Turkelson [0064]); Turkelson does not explicitly disclose “select a predetermined number of second features among the plurality of second features based on a difference between the first feature and each one of the plurality of second features”.
Hatanaka teaches a related and pertinent object recognition system and method which can register objects into a stored dictionary based on captured images (see Hatanaka Abstract), where a plurality of consecutive captured images of a detected commodity is output by an image capturing device (see Hatanaka [0045]-[0046]). To register a captured image as a new category, a feature value is extracted from each of the plurality of captured images (see Hatanaka [0048] and [0067]). A similarity degree is calculated between the registered feature values of each existing category and the feature values of each of the plurality of captured images, and it is determined whether there are images having a similarity degree equal to or greater than a first threshold with the feature values of images corresponding to existing categories; the number of images similar to any one of the images registered in each existing category is counted as first similar images, and it is determined whether the counted number of first similar images is equal to or greater than a predetermined number (see Hatanaka [0065]-[0070]). The similarity degree among the plurality of captured images is further calculated to determine whether or not there are similar images of a predetermined number of images or more among the plurality of captured images to be registered in the new category, and the feature values of the captured images and the captured images are additionally registered and stored as the new category (see Hatanaka [0071]-[0075]). Many similar images among the plurality of images obtained to be registered as a new category may lead to lower recognition performance (see Hatanaka [0082]).
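Hatanaka’s counting of “first similar images” against a predetermined number, as cited above, can be sketched as follows. The similarity function and thresholds are hypothetical placeholders supplied for illustration, not Hatanaka’s disclosed computation.

```python
def count_first_similar(captured_feats, registered_feats, similarity, first_threshold):
    """Count captured images whose feature value has a similarity degree at or
    above the first threshold with any registered feature value."""
    return sum(
        1
        for c in captured_feats
        if any(similarity(c, r) >= first_threshold for r in registered_feats)
    )

def reaches_predetermined_number(captured_feats, registered_feats, similarity,
                                 first_threshold, predetermined_number):
    """Gate on whether the count of first similar images is equal to or
    greater than the predetermined number."""
    return count_first_similar(captured_feats, registered_feats, similarity,
                               first_threshold) >= predetermined_number
```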
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Hatanaka to the teachings of Turkelson, such that a plurality of captured images and corresponding feature values meeting a similarity distance threshold, up to a predetermined number, are selected and determined whether to add the new images to the training data set.
This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results.
In this instance, Turkelson discloses a base device that stores the training data set, captures new images, and determines the similarity based on a feature distance metric of extracted feature vectors of new images and images stored in the training data set to determine whether to add the new image to the training data set.
Hatanaka teaches a known technique of registering newly captured images of a commodity object as an additionally registered new category in a stored dictionary, where a similarity degree is calculated between the registered feature values of each existing category and the feature values of each of the plurality of captured images, it is determined whether the number of images having a similarity degree equal to or greater than a first threshold is equal to or greater than a predetermined number, and the similarity degree among the plurality of captured images is further calculated to determine whether or not the number of similar images is equal to or greater than a predetermined number of images, where exceeding the predetermined number of similar images may lead to lower recognition performance.
One of ordinary skill in the art would have recognized that applying Hatanaka’s technique would allow the device of Turkelson to select a predetermined number of the plurality of captured images and corresponding feature values meeting a similarity distance threshold and to determine whether to add the new images to the training data set, predictably leading to an improved device by identifying a number of training data set images for optimized performance.
While Turkelson teaches that the distance between the feature vector representing the new image and a feature vector of a corresponding image from the training data set may include cosine distance, Euclidean distance, and other metrics by which similarity may be computed (see Turkelson [0064]); Turkelson and Hatanaka do not explicitly disclose “calculate a sum of differences between the first feature and each one of the predetermined number of second features”, or that the determining whether to store each feature data as part of the training data is based on the sum of differences.
Shibata teaches in a related and pertinent system and method for image recognition model generation (see Shibata Abstract), which accepts images captured by a camera and performs a similar patch searching of a similar patch database (see Shibata [0061]-[0066]), where, in performing search for similar patch for an input image patch, the sum of squared differences between input image patch and similar patch candidates are calculated and a certain number of similar patch candidates are searched for in an ascending order of the calculated value of the sum of squared differences (see Shibata [0078]-[0084]).
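Shibata’s similar-patch search, as cited above, reduces to computing a sum of squared differences (SSD) and selecting a certain number of candidates in ascending order of SSD. A minimal sketch follows; the function names and data are illustrative assumptions, not Shibata’s implementation.

```python
def sum_squared_differences(a, b):
    """SSD between two equal-length patches or feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search_similar_candidates(query, candidates, certain_number):
    """Return the given number of candidates in ascending order of SSD
    from the query, i.e., the most similar candidates first."""
    return sorted(candidates,
                  key=lambda c: sum_squared_differences(query, c))[:certain_number]
```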
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Shibata to the teachings of Turkelson and Hatanaka, such that a sum of squared differences (SSD) is calculated between the selected feature vectors representing the new images and the feature vectors of corresponding images from the training data set, and is used to determine similarity distances both among the selected feature vectors representing the new images and between those feature vectors and the feature vectors of corresponding images from the training data set, where a certain number of similar feature vectors of corresponding training data set images are selected in ascending order of the computed SSD to determine the objects of the training data set to which the new image is the most similar.
This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results.
In this instance, Turkelson and Hatanaka disclose a base device that stores the training data set, captures new images, determines the similarity of extracted feature vectors of new images, and selects a predetermined number of the plurality of captured images and corresponding feature values meeting a similarity distance threshold to determine whether to add the new image to the training data set.
Shibata teaches a known technique of performing a search for a similar image patch for an input image patch, where the sums of squared differences between the input image patch and similar patch candidates are calculated and a certain number of similar patch candidates are determined in ascending order of the calculated value of the sum of squared differences.
One of ordinary skill in the art would have recognized that applying Shibata’s technique would allow the device of Turkelson and Hatanaka to calculate the sum of squared differences (SSD) between the feature vectors representing new images and the feature vectors of corresponding training data set images, in order to determine similarity distances both among the selected feature vectors representing the new images and between those feature vectors and the feature vectors of corresponding images from the training data set; a certain number of similar feature vectors of corresponding training data set images are then selected in ascending order of the computed SSD to determine the objects of the training data set to which the new image is the most similar, predictably leading to an improved device by calculating a similarity distance metric to identify the training data set images most similar to the input new image.
Regarding claim 3, please see the above rejection of claim 1. Turkelson, Hatanaka, and Shibata disclose the training data selection device according to claim 1, wherein the circuitry is further configured to discard the feature data when the difference is greater than a given difference (see Turkelson [0063]-[0068], where a determination is made that the new image depicts an object from the objects depicted by the images of the training data set based on the distance between the two feature vectors being less than a first threshold distance, and the new image is to be added to the training data; suggesting that if the difference is greater than the first threshold, the new image is not added to the training set).
Regarding claim 4, please see the above rejection of claim 1. Turkelson, Hatanaka, and Shibata disclose the training data selection device according to claim 1, wherein the circuitry is further configured to discard the feature data when the difference is smaller than a given difference (see Turkelson [0068], where, if an image is too similar, there may be little value in adding that image to the training data set, and a determination is made whether a distance between the feature vectors of the new and training images is greater than a second threshold distance before adding the image to the training set; suggesting that if the difference is less than the second threshold, the new image is not added to the training set).
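Taken together, the first- and second-threshold conditions cited for claims 3 and 4 amount to storing a new feature only when its distance falls inside a window. The following sketch is the Examiner's illustrative restatement of that combined logic, with hypothetical threshold names; it is not Turkelson's implementation.

```python
def should_store_feature(distance, lower_threshold, upper_threshold):
    """Discard when too similar (distance at or below the lower threshold
    adds little value to the training set) or too dissimilar (distance at or
    above the upper threshold may not depict the same object); otherwise
    store the feature data as training data."""
    return lower_threshold < distance < upper_threshold
```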
Regarding claim 7, it recites a method performing the device functions of claim 1. Turkelson, Hatanaka, and Shibata teach the method by performing the device functions of claim 1. Please see above for detailed claim analysis.
Please see the above rejection for claim 1, as the rationale to combine the teachings of Turkelson, Hatanaka, and Shibata are similar, mutatis mutandis.
Regarding claim 8, it recites a non-transitory, computer readable storage medium containing a computer program, which when executed by a computer, causes the computer to perform the device functions of claim 1. Turkelson, Hatanaka, and Shibata teach a non-transitory, computer readable storage medium containing a computer program, which when executed by a computer, causes the computer to perform the device functions of claim 1 (see Turkelson [0091] and [0094]-[0095], where system memory can store program instructions executable by a processor to implement the disclosed teachings). Please see above for detailed claim analysis.
Please see the above rejection for claim 1, as the rationale to combine the teachings of Turkelson, Hatanaka, and Shibata are similar, mutatis mutandis.
Regarding claim 9, please see the above rejection of claim 1. Turkelson, Hatanaka, and Shibata disclose the training data selection device of claim 1, wherein the circuitry is further configured to select a predetermined number of second features among the plurality of second features corresponding to the plurality of second sample images in an ascending order of difference between the first feature and each one of the plurality of second features (see Shibata [0078]-[0084], where, in performing a search for a similar patch for an input image patch, the sums of squared differences between the input image patch and similar patch candidates are calculated and a certain number of similar patch candidates may be searched for in ascending order of the calculated value of the sum of squared differences).
Regarding claim 10, see above rejection for claim 7. It is a method claim reciting similar subject matter as claim 3. Please see above claim 3 for detailed claim analysis as the limitations of claim 10 are similarly rejected.
Regarding claim 11, see above rejection for claim 7. It is a method claim reciting similar subject matter as claim 4. Please see above claim 4 for detailed claim analysis as the limitations of claim 11 are similarly rejected.
Regarding claim 14, see above rejection for claim 7. It is a method claim reciting similar subject matter as claim 9. Please see above claim 9 for detailed claim analysis as the limitations of claim 14 are similarly rejected.
Regarding claim 15, see above rejection for claim 8. It is a non-transitory, computer readable storage medium claim reciting similar subject matter as claim 3. Please see above claim 3 for detailed claim analysis as the limitations of claim 15 are similarly rejected.
Regarding claim 16, see above rejection for claim 8. It is a non-transitory, computer readable storage medium claim reciting similar subject matter as claim 4. Please see above claim 4 for detailed claim analysis as the limitations of claim 16 are similarly rejected.
Regarding claim 19, see above rejection for claim 8. It is a non-transitory, computer readable storage medium claim reciting similar subject matter as claim 9. Please see above claim 9 for detailed claim analysis as the limitations of claim 19 are similarly rejected.
Claims 5-6, 12-13, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Turkelson, Hatanaka, and Shibata as applied to claims 1, 7, and 8 above, and further in view of Gao et al. (US 2018/0336700), herein Gao.
Regarding claim 5, please see the above rejection of claim 1. Turkelson, Hatanaka, and Shibata disclose the training data selection device according to claim 1, wherein the circuitry is further configured to:
select a reference image from among the plurality of second sample images (see Turkelson [0041], where plurality of images depicting objects are obtained for generating training data),
wherein the circuitry is further configured to store the feature data indicating a feature corresponding to the reference image as initial training data (see Turkelson [0046], where the input images, features extracted from the input images, identifier labeling of each input image may be stored in the training data database as a training data set).
While Turkelson teaches that some embodiments may implement unsupervised learning of novel objects absent from a training data set and may cluster feature vectors and determine whether clusters have less than a threshold amount of labeled feature vectors (see Turkelson [0031]); Turkelson, Hatanaka, and Shibata do not explicitly disclose that the reference image is selected on a basis of the second feature corresponding to each of the plurality of second sample images.
Gao teaches in a related and pertinent image capture and recognition method and system (see Gao Abstract), where a plurality of reference images may be determined based on performing cluster analysis on deep features of historical sample images and assigning the historical sample images to a plurality of clusters corresponding to cluster centers, and from each cluster, determining an image as a reference image having a smallest sum of degrees of difference from deep features of other images of the same cluster (see Gao [0052]-[0064] and [0101]-[0111]).
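Gao’s within-cluster selection of the image having the smallest sum of degrees of difference from the other images can be sketched as a medoid-style pick. The difference function and data below are illustrative assumptions, not Gao’s disclosed deep-feature computation.

```python
def select_reference_image(cluster, difference):
    """Within one cluster, return the index of the member whose sum of
    differences from all other members is smallest."""
    def total_difference(i):
        return sum(difference(cluster[i], other)
                   for j, other in enumerate(cluster) if j != i)
    return min(range(len(cluster)), key=total_difference)
```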
At the time of filing, one of ordinary skill in the art would have found it obvious to apply the teachings of Gao to the teachings of Turkelson, Hatanaka, and Shibata, such that feature vector clustering as taught by Gao is used and reference images are selected on the basis of the image having the smallest sum of degrees of difference from the deep features of other images of the same cluster, to generate an initial training data set from the obtained plurality of images depicting objects.
This modification is rationalized as an application of a known technique to a known device ready for improvement to yield predictable results.
In this instance, Turkelson, Hatanaka, and Shibata disclose a base device that stores the training data set, captures new images, determines the similarity of extracted feature vectors of new images, and selects a predetermined number of the plurality of captured images and corresponding feature values meeting a similarity distance threshold, where a sum of squared differences (SSD) is calculated between the selected feature vectors representing the new images and the feature vectors of corresponding images from the training data set and is used to determine similarity distances both among the selected feature vectors representing the new images and between those feature vectors and the feature vectors of corresponding images from the training data set, to determine whether to add the new image to the training data set.
Gao teaches a known technique for determining a plurality of reference images based on performing cluster analysis on deep features of historical sample images and assigning the historical sample images to a plurality of clusters corresponding to cluster centers, and from each cluster, determining an image as a reference image having a smallest sum of degrees of difference from deep features of other images of the same cluster.
One of ordinary skill in the art would have recognized that applying Gao’s technique would allow the device of Turkelson, Hatanaka, and Shibata to generate an initial training data set from the obtained plurality of images depicting objects, where feature vector clustering is used to select reference images on the basis of the image having the smallest sum of degrees of difference from the deep features of other images of the same cluster, predictably leading to an improved device by generating an initial training data set from an unstructured plurality of images depicting objects.
Regarding claim 6, please see the above rejection of claim 5. Turkelson, Hatanaka, Shibata, and Gao disclose the training data selection device according to claim 5, wherein the circuitry is further configured to select the reference image from among the plurality of second sample images, based on a magnitude of a sum of differences between the feature of the reference image and a respective second feature of the predetermined number of second sample images in the plurality of second sample images (see Gao [0052]-[0064] and [0101]-[0111], where a plurality of reference images may be determined based on performing cluster analysis on deep features of historical sample images and assigning the historical sample images to a plurality of clusters corresponding to cluster centers, and from each cluster, determining as a reference image the image having the smallest sum of degrees of difference from the deep features of other images of the same cluster).
Regarding claim 12, see above rejection for claim 7. It is a method claim reciting similar subject matter as claim 5. Please see above claim 5 for detailed claim analysis as the limitations of claim 12 are similarly rejected.
Regarding claim 13, see above rejection for claim 12. It is a method claim reciting similar subject matter as claim 6. Please see above claim 6 for detailed claim analysis as the limitations of claim 13 are similarly rejected.
Regarding claim 17, see above rejection for claim 8. It is a non-transitory, computer readable storage medium claim reciting similar subject matter as claim 5. Please see above claim 5 for detailed claim analysis as the limitations of claim 17 are similarly rejected.
Regarding claim 18, see above rejection for claim 17. It is a non-transitory, computer readable storage medium claim reciting similar subject matter as claim 6. Please see above claim 6 for detailed claim analysis as the limitations of claim 18 are similarly rejected.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIMOTHY WING HO CHOI whose telephone number is (571)270-3814. The examiner can normally be reached 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VINCENT RUDOLPH can be reached at (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TIMOTHY CHOI/Examiner, Art Unit 2671
/VINCENT RUDOLPH/Supervisory Patent Examiner, Art Unit 2671