Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement filed July 11, 2023 fails to comply with 37 CFR 1.98(a)(2), which requires a legible copy of each cited foreign patent document; each non-patent literature publication or that portion which caused it to be listed; and all other information or that portion which caused it to be listed.
Specifically, no copy of the document “Deep Learning for Retail Product Recognition” has been included.
Further, the information disclosure statement fails to comply with 37 CFR 1.98(b)(5), which states “Each publication listed in an information disclosure statement must be identified by … author.”
The information disclosure statement has been placed in the application file, but the information referred to therein has not been considered.
Claim Objections
Claims 1, 2, and 6-8 are objected to because of the following informalities:
Claim 1 recites wherein the machine learning based system comprising:, which appears to be a typographical error and will be interpreted as if it had read wherein the machine learning based system comprises:.
Claim 1 recites corresponding to the first one or more products,, using a transfer learning method, where the two consecutive commas appear to be a typographical error.
Claim 2 recites object comprised in the second one or images, which appears to be a typographical error and will be interpreted as objects comprised in the second one or more images.
Claims 6-8 recite The Machine Learning based system, capitalized, where “machine learning” is not capitalized in any other usage. This appears to be a typographical error.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claims 1, 9, and 17 contain the trademark/trade name imagenet. Where a trademark or trade name is used in a claim as a limitation to identify or describe a particular material or product, the claim does not comply with the requirements of 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph. See Ex parte Simpson, 218 USPQ 1020 (Bd. App. 1982). The claim scope is uncertain since the trademark or trade name cannot be used properly to identify any particular material or product. A trademark or trade name is used to identify a source of goods, and not the goods themselves. Thus, a trademark or trade name does not identify or describe the goods associated with the trademark or trade name. In the present case, the trademark/trade name is used to identify/describe some particular database and, accordingly, the identification/description is indefinite.
For the purpose of examination, any publicly available database comprising images intended to be used to train vision machine learning models will fall into the scope of the recited imagenet database.
Further, Claim 1 appears to recite two rounds of fine-tuning: one performed by the fine-tuning subsystem and another performed by the image analyzing subsystem. Given that parallel Claim 9 recites only one round of fine-tuning, and the specification never states that two rounds of fine-tuning are required, it is unclear whether the claim language requires two rounds of fine-tuning or only one. For the purpose of examination, Claim 1 will be interpreted as requiring two separate fine-tunings, whereas Claims 9 and 17 are interpreted as requiring only the single round of fine-tuning recited, which includes analyzing the third one or more images.
Claim 3 recites the limitation the fine-tuning, but two instances of fine-tuning have already been recited, for example separately by the image analyzing subsystem and by the fine-tuning subsystem of Claim 1. It is thus unclear to which instance of fine-tuning the claim limitation applies, rendering the claim indefinite.
Further, Claims 3, 11, and 19 recite the limitation the machine learning model trained to recognize the first one or more images in two places. There is insufficient antecedent basis for this limitation in the claims because the only machine learning model previously recited was trained to recognize the second one or more images, not the first one or more images, and then fine-tuned using the third one or more images.
Next, the entire clause of Claims 3, 11, and 19 in which these limitations occur reads wherein the fine-tuning is a transfer learning technique in which the machine learning model trained to recognize the first one or more images, but no limitation follows the in which: the recitation lacks a limitation on the machine learning model. “In which the machine learning model … does something,” but the “something” is omitted. For the purpose of examination, the claim will be interpreted in light of [0017] as requiring that the machine learning model is retrained, as the following line of the claim recites. However, that next line recites retrained to recognize the second one or more images corresponding to the first one or more products, which is nonsensical because Claim 3, as noted above and consistent with the independent claim, is concerned with transfer learning using the third one or more images to recognize the second one or more products. The claim set appears to denote the first, second, and third sets of images inconsistently in different places, which must be corrected. The same inconsistency occurs in Claim 4, where it is unclear, given the context of the specification at [0085], what the limitation retrained on the second one or more images requires.
Finally, in Claims 3, 11, and 19, the term infrequently is a relative term of degree which renders the claim indefinite. The term infrequently is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claim 17 recites the limitation the second one or more images in lines 6-7 of the claim. There is insufficient antecedent basis for this limitation in the claim.
Dependent claims are rejected for inheriting the indefiniteness of their parent claims.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-24 are rejected under 35 U.S.C. 103 as being unpatentable over Rajpura et al., “Transfer Learning by Finetuning Pretrained CNNs Entirely with Synthetic Images,” in view of Shermin et al., “Enhanced Transfer Learning with ImageNet Trained Classification Layer.” Lin et al., “Microsoft COCO: Common Objects in Context,” is relied upon to demonstrate an inherency in Rajpura.
Regarding Claim 1, Rajpura teaches a machine learning based system for optimizing learning time of a machine learning model (Rajpura, title, “Finetuning Pretrained CNNs” denotes optimizing learning time by finetuning a pretrained model rather than starting from scratch; Rajpura, pg. 5, 2nd paragraph, “All the experiments were performed on a workstation with Intel Core i7-5960X processor” denotes that the method is performed on a computer, in which program instructions to perform the functions are inherent), wherein the plurality of subsystems comprises:
a data obtaining subsystem configured to obtain a plurality of data associated with a first one or more images, wherein the first one or more images is obtained from an imagenet database (Abstract, “A CNN pretrained on the COCO dataset” where images in COCO are data in an imagenet database);
a data training subsystem configured to train the machine learning model on a second plurality of data associated with second one or more images corresponding to first one or more products (Abstract, “A CNN pretrained on the COCO dataset” where the pretraining is training on all the images in the COCO dataset, including second one or more images corresponding to first one or more products, that is, any products in the set of COCO categories; see Lin, pg. 3, Fig. 2(a), the toothbrush and Wii controller, to demonstrate that COCO includes second one or more images corresponding to first one or more products), wherein the second one or more images in a database comprises one or more images corresponding to the first one or more products, irrespective of whether the second one or more images comprises of one or more products on which the trained machine learning model is to be performed (Rajpura is intended to recognize, pg. 3, Section 2, “an RGB image captured inside a refrigerator” and not necessarily a Wii controller or a toothbrush);
a data extracting subsystem configured to extract a third plurality of data associated with third one or more images corresponding to second one or more products from the database, wherein the third one or more images corresponding to the second one or more products are pre-stored in the database (Rajpura, pg. 3, Section 2, “synthetic rendered images from available 3D models”; also see pg. 2, Fig. 1, “2d Rendered Annotated Images for training” are images of products in the refrigerator) … using a transfer learning method (Rajpura, Abstract, “finetuning pretrained CNNs entirely on synthetic images is an effective strategy to achieve transfer learning”), wherein a number of the first one or more products is higher than a number of the second one or more products (COCO has, Lin, Abstract, “2.5 million labeled instances in 328k images” and, pg. 7, Fig. 5, 91 categories with 10,000+ instances per category, while Rajpura only uses a maximum of 6000 images total, thus fewer second products than first products);
a fine-tuning subsystem configured to fine-tune at least one subset of the trained machine learning model to recognize the third one or more analyzed images corresponding to the second one or more products (Rajpura, Abstract, “A CNN pretrained on the COCO dataset and fine-tuned with our 4000 synthetic images … fine-tuning with selected layers”); and
the image analyzing subsystem configured to analyze fourth one or more images corresponding to the second one or more products using the fine-tuned at least one subset of the trained machine learning model trained on the third one or more recognized images using the transfer learning method (Rajpura, pg. 2, Fig. 1, “Real world images” are analyzed; also see pg. 6, Fig. 2, “Detection Results”), wherein the fine-tuned at least one subset of the trained machine learning model trained on the third one or more images is required during learning to recognize the second one or more products (Rajpura, Abstract, “fine-tuning with selected layers”), wherein the fourth one or more images comprises one or more real world test images corresponding to the second one or more products, and wherein the fine-tuned at least one subset of the trained machine learning model is performed for analyzing the one or more real world test images corresponding to the second one or more products (Rajpura, pg. 2, Fig. 1, “Real world images” are analyzed; also see pg. 6, Fig. 2, “Detection Results”).
Rajpura is vague on whether it teaches the claimed two fine-tuning events: a) by the image analyzing subsystem configured to learn to recognize the third one or more images … by fine-tuning the machine learning model … using a transfer learning and b) by the fine-tuning subsystem configured to fine-tune at least one subset of the trained machine learning model, because the document only briefly references “Network weights were fine-tuned by freezing layers sequentially” (Rajpura, pg. 6, caption under Fig. 2), and thus Rajpura does not clearly teach the entirety of Claim 1. However, Shermin explicitly teaches two such rounds of fine-tuning via freezing: to learn to recognize the third one or more images corresponding to the second one or more products by fine-tuning the machine learning model [which in combination with Rajpura was pretrained on COCO, and thus trained on the second one or more images corresponding to the first one or more products] using a transfer learning method (Shermin, pg. 5, Section 2.3, “We start fine-tuning from the last transferred layer and freeze other layers”), and to fine-tune at least one subset of the trained machine learning model to recognize the third one or more analyzed images corresponding to the second one or more products (Shermin, pg. 5, Section 2.3, “These two steps are repeated K times … each time we unfreeze one more penultimate layer”). It would have been obvious to one of ordinary skill in the art before the effective filing date to use Shermin’s explicit layer-wise freezing strategy for fine-tuning in the invention of Rajpura. The motivation to do so is that Rajpura already freezes layers for fine-tuning but does not give the details of what that encompasses, and Shermin 1) provides these details and 2) allows the network to achieve optimal performance (Shermin, pg. 7, Fig. 3 and 1st paragraph, “fine-tuning from the third convolution layer onwards … yields the highest accuracy”).
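The layer-wise freezing strategy quoted from Shermin (start fine-tuning at the last transferred layer, then unfreeze one more penultimate layer each round) can be illustrated schematically. The sketch below is not code from either reference; the function name `unfreeze_schedule` and the representation of a network as a list of layer indices are hypothetical, chosen only to make the quoted procedure concrete.

```python
# Schematic sketch (not from Rajpura or Shermin) of iterative
# layer-wise unfreezing during fine-tuning: round 0 trains only the
# last layer, and each subsequent round unfreezes one additional
# penultimate layer while earlier layers stay frozen.

def unfreeze_schedule(num_layers, rounds):
    """Return, per fine-tuning round, the list of trainable layer indices."""
    schedule = []
    for k in range(rounds):
        # Layers [first_trainable, num_layers) are trainable this round;
        # everything before first_trainable remains frozen.
        first_trainable = max(0, num_layers - 1 - k)
        schedule.append(list(range(first_trainable, num_layers)))
    return schedule

# e.g. unfreeze_schedule(5, 3) yields [[4], [3, 4], [2, 3, 4]]:
# only the last layer first, then progressively deeper subsets.
```

In a real framework this schedule would drive which parameters receive gradient updates each round (e.g. toggling a trainable flag per layer), matching Shermin’s description of repeating the freeze/fine-tune steps K times.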
Regarding Claim 2, the Rajpura/Shermin combination of Claim 1 teaches the machine learning based system of Claim 1 (and thus the rejection of Claim 1 is incorporated). Rajpura further teaches (partially inherently through Lin) to receive the second plurality of data associated with the second one or more images corresponding to the first one or more products (Abstract, “the COCO dataset” must have been obtained to pretrain the model on it); provide a first plurality of labels related to the second one or more images corresponding to the first one or more products to the machine learning model, wherein the first plurality of labels comprises at least one of: objects comprised in the second one or more images (Lin, Abstract, “a total of 2.5 million labeled instances in 328k images”; pg. 3, Fig. 2); and train the machine learning model by correlating the second one or more images corresponding to the first one or more products, with the first plurality of labels related to the second one or more images, wherein the machine learning model is a supervised machine learning model (Rajpura, pg. 4, “For neural network training we use … Faster-RCNN (with ResNet-101 as feature mapping network) and SSD using Tensorflow and weights pretrained on COCO dataset” are supervised machine learning model training methods).
Regarding Claim 3, the Rajpura/Shermin combination of Claim 1 teaches the machine learning based system of Claim 1 (and thus the rejection of Claim 1 is incorporated). Rajpura further teaches to: obtain the third one or more images corresponding to the second one or more products from the database (Rajpura, pg. 2, Fig. 1, “2D Rendered Annotated Images for training”); provide a second plurality of labels related to the third one or more images corresponding to the first one or more products, to the machine learning model, wherein the first plurality of labels comprises at least one of: objects comprised in the second one or more images (Rajpura, pg. 2, Fig. 1, “2D Rendered Annotated Images for training”); wherein the fine-tuning is a transfer learning technique in which the machine learning model trained to recognize the first one or more images (i.e., the pretrained COCO model) is retrained to recognize the second one or more images corresponding to the first one or more products, wherein the machine learning model is retrained on the second one or more recognized images corresponding to the first one or more products (depending on the interpretation taken vis-à-vis the 35 USC 112(b) rejection above, either the rounds of the original training of the pretrained COCO model, or the fine-tuning), and wherein the fine-tuning subsystem is configured to fine-tune the machine learning model trained on the first one or more images infrequently (once is infrequently).
Regarding Claim 4, the Rajpura/Shermin combination of Claim 1 teaches the machine learning based system of Claim 1 (and thus the rejection of Claim 1 is incorporated). Rajpura further teaches to: train weights of the at least one subset of the machine learning model, which is fine-tuned on at least one of … the third one or more images corresponding to the second one or more products (Rajpura, pg. 2, Fig. 1, “2D Rendered Annotated Images for training” & “Fine-tune pretrained convolutional neural network”), wherein the at least one subset of the machine learning model is retrained on the second one or more images (rounds of the original training of the pretrained COCO model).
Regarding Claim 5, the Rajpura/Shermin combination of Claim 1 teaches the machine learning based system of Claim 1 (and thus the rejection of Claim 1 is incorporated). Rajpura further teaches to: obtain the fourth one or more images corresponding to the second one or more products, wherein the fourth one or more images corresponding to the second one or more products comprises the one or more real world test images for analysis; provide the fourth one or more images corresponding to the second one or more products to the fine-tuned at least one subset of the machine learning model; apply the trained weights of the fine-tuned at least one subset of the trained machine learning model, on the fourth one or more images corresponding to the second one or more products; and analyze the fourth one or more images corresponding to the second one or more products, based on the trained weights of the fine-tuned at least one subset of the trained machine learning model applied on the fourth one or more images corresponding to the second one or more products (Rajpura, pg. 2, Fig. 1, “Real world images, Trained Network Weights, Predicted Objects” & pg. 4, last paragraph, “We evaluate our object detector using manually annotated crowd-sourced refrigerator images”).
Regarding Claim 6, the Rajpura/Shermin combination of Claim 1 teaches the machine learning based system of Claim 1 (and thus the rejection of Claim 1 is incorporated). Rajpura does not teach, but Shermin teaches providing probabilistic values to the fourth one or more analyzed images corresponding to the second one or more products, between 0 and 1 (Shermin, pg. 4, 1st paragraph, “the new classification module, which has a FC classification layer C with a Softmax layer” where “softmax” denotes an output with the probability of each class). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the Rajpura/Shermin fine-tuning method for classification, with a softmax output as Shermin does, rather than localization, as Rajpura does. The motivation to do so is that all sorts of classification tasks on images are desirable (Lin, pg. 1, Fig. 1, Image classification, Object localization, and segmentation are all tasks that image recognition systems perform).
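The softmax layer cited from Shermin is what yields the claimed probabilistic values between 0 and 1: it maps raw classifier scores to a distribution over classes. A minimal stdlib-only sketch (illustrative, not code from any cited reference; the input logits are made up for the example):

```python
import math

def softmax(logits):
    """Map raw classifier scores (logits) to probabilities in (0, 1)
    that sum to 1. Subtracting the max logit first keeps exp()
    numerically stable for large scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three product classes: every output lies
# strictly between 0 and 1, the outputs sum to 1, and the largest
# logit receives the largest probability.
probs = softmax([2.0, 1.0, 0.1])
```

This is why a softmax output layer, as in Shermin’s classification module, provides per-class probabilistic values in the claimed range.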
Regarding Claim 7, the Rajpura/Shermin combination of Claim 1 teaches the machine learning based system of Claim 1 (and thus the rejection of Claim 1 is incorporated). Rajpura has already been shown to teach wherein the trained machine learning model is a convolutional neural network (CNN) model (Rajpura, title, “Finetuning Pretrained CNNs”).
Regarding Claim 8, the Rajpura/Shermin combination of Claim 1 teaches the machine learning based system of Claim 1 (and thus the rejection of Claim 1 is incorporated). The rejection has already been shown to teach wherein the first one or more products and the second one or more products are different products (the COCO dataset comprising e.g. toothbrush and the fine-tuning dataset comprising products in a refrigerator).
Claims 9-16 recite a machine learning based method which is broader than the method performed by the system of Claims 1-8, respectively (that is, the analyzing step of Claim 9 is broader than the functions performed by the image analyzing subsystem of Claim 1), and are thus rejected for the reasons set forth in the rejections of Claims 1-8, respectively. Similarly, Claims 17-24 recite a non-transitory computer-readable storing medium having instructions to perform that method and are also rejected for the reasons set forth in the rejections of Claims 1-8, respectively.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure:
The IMAGENET 1000 Class list demonstrates the different product categories present in that database.
Huang, US PG Pub 2022/0147738 also teaches fine-tuning pretrained machine learning models.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIAN M SMITH whose telephone number is (469)295-9104. The examiner can normally be reached Monday - Friday, 8:00am - 4pm Pacific.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BRIAN M SMITH/ Primary Examiner, Art Unit 2122