Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
DETAILED ACTION
Response to Arguments
Applicant argues that Spencer in view of Redmon does not disclose “select a first machine learning model from a plurality of pre-trained machine learning models according to the object class, generating a recognition result of the object through the selected model, wherein the pre-trained machine learning models are trained for a plurality of object classes respectively” because the “models in Redmon are each trained to recognize all object classes simultaneously.” Further, applicant argues that one would not combine Spencer with Redmon because wearable devices have strict resource constraints and that, if the model selection were offloaded to the cloud, the resulting latency would contradict Spencer’s purpose.
The examiner disagrees. First, there is no restriction in the claims that the models cannot be trained to recognize all objects simultaneously. The claim merely recites that a selection is made based on object class, not that the model is designed for that object class only. Further, the rejection states:
It would be obvious to one of ordinary skill to select a system that is best suited for the class while taking into consideration speed, accuracy and the trade-offs with the selection of the model to be used for the purposes of routine optimization of the desired application.
With respect to the argument that Spencer cannot be combined with Redmon because the wearable device has strict resource constraints, applicant has provided no evidence that the device of Spencer would be unable to perform the tasks on device. Further, Spencer Fig 1 and ¶59-60 explicitly disclose the use of external resources 1100 and offloading processing to the cloud as an embodiment. Therefore, applicant’s arguments are not persuasive.
Claim Objections
Claims 1-13 are objected to for the use of “the pre-trained machine learning models are trained for a plurality of object classes respectively.” The word “respectively” links two parallel lists of items to show a one-to-one correspondence, clarifying which item in the first list matches which item in the second list. The use in the independent claims does not provide a set of parallel lists. The examiner notes that if applicant were to recite which model is trained for which object class, that would likely overcome Redmon.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-13 are rejected under 35 U.S.C. 103 as being unpatentable over US 2023/0110964, Spencer et al. (hereafter Spencer) in view of You Only Look Once, Redmon et al. (hereafter Redmon).
1. An object recognition device, comprising:
a first transceiver, receiving a first image; (Spencer Fig 3, 380 communications module; note that in the context of this invention (claim 5), any input/output system such as a system bus that transmits and receives information appears to meet applicant’s definition of a transceiver and thus, the system buses that receive information from the image sensor would also constitute a transceiver in the context of these claims)
an input device, receiving a user input; (Spencer Fig 3, 320; sensing system; 360 cameras, etc.)
an output device; (Spencer Fig 3, 340 output system)
and a processor, coupled to the input device, the output device, and the first transceiver and configured to: (Spencer Fig 3, 370 control system)
detect an object of the first image to obtain a detection region and an object class; (Spencer Fig 3, 322 image sensor; ¶62)
in response to the user input matching the detection region, generate a recognition result of the object through the selected model; and (Spencer ¶62 eye gaze to determine which object to identify and identify the object; see also ¶33-58)
output information corresponding to the recognition result through the output device. (Spencer ¶48 provides supplemental information related to the identified object; see also ¶54)
Spencer does not disclose “select a first machine learning model from a plurality of pre-trained machine learning models according to the object class, wherein the pre-trained machine learning models are trained for a plurality of object classes respectively.”
Redmon discusses the relative benefits of, and trade-offs between, YOLO and other systems. The abstract first indicates that YOLO makes more localization errors but is less likely to predict false positives on the background. Redmon then discusses limitations in section 2.4, where it states that the model struggles with small objects that appear in groups, such as flocks of birds, and with objects in new or unusual aspect ratios or configurations. In section 3, Redmon then compares YOLO to other systems. Section 4 discusses real-world results for YOLO versus other systems. Table 3 illustrates the accuracy of YOLO and other systems with respect to classes of objects. As Redmon notes, the models used for image recognition are pre-trained prior to use. Thus, it would be obvious to one of ordinary skill to select a system that is best suited for the class while taking into consideration speed, accuracy, and the trade-offs associated with the selection of the model to be used for the purposes of routine optimization of the desired application.
Claim 13 is rejected similarly.
2. The object recognition device according to claim 1, wherein the input device comprises an image capturing device and the processor is further configured to: obtain a second image through the image capturing device; and perform eye tracking on the second image to obtain the user input. (Spencer ¶55 zoomed image, which is a second image and eye tracking on the zoom)
3. The object recognition device according to claim 1, further comprising: a sound capturing device, coupled to the processor and obtains an audio signal, wherein the processor generates the recognition result through the first machine learning model in response to the audio signal. (Spencer ¶54 voice input could be a user input for selection)
4. The object recognition device according to claim 1, wherein the output device comprises a display device, and the processor is further configured to: display a graphical user interface through the output device, wherein the graphical user interface comprises the information, wherein the information comprises at least one of description information of the object, a person’s profile, summary information, or a representative drawing. (Spencer Fig 3. Output display device; Spencer ¶48 provides supplemental information related to the identified object; see also ¶54)
5. The object recognition device according to claim 1, wherein the output device comprises a second transceiver, and the processor is further configured to: transmit an access command corresponding to the information through the output device. (Spencer Fig 3. 380 transceiver; ¶48 access the information)
6. The object recognition device according to claim 1, wherein the output device comprises a display device displaying the first image, wherein the processor is further configured to: in response to the user input matching the detection region, highlight the object of the first image through the display device. (Spencer Fig 2E, highlight the printer on the display)
7. The object recognition device according to claim 1, wherein the processor is further configured to: access an external database through the first transceiver; and obtain the information from the external database according to the recognition result. (Spencer Fig 3, 302, external resources; ¶43 obtain information about the object)
8. The object recognition device according to claim 1, further comprising: a storage medium, coupled to the processor and stores a database, wherein the processor is further configured to: obtain the information from the database according to the recognition result. (Spencer ¶43 obtain information from external databases about the object)
9. The object recognition device according to claim 1, further comprising: a storage medium, coupled to the processor and stores a plurality of machine learning models. (Spencer ¶59 storage)
Spencer does not disclose:
10. The object recognition device according to claim 1, wherein the first machine learning model comprises a You Only Look One (YOLO) model.
However, Redmon page 1 teaches a YOLO model. It would have been obvious to modify the system of Spencer to utilize a YOLO model for the purposes of obtaining the various advantages discussed on pages 1-2 regarding simplicity and speed, among other advantages.
11. The object recognition device according to claim 1, wherein the processor is further configured to: detect the object of the first image through an object detection model to obtain the detection region and the object class. (Spencer ¶43 detects the object using some object detection model (by definition))
12. The object recognition device according to claim 1, wherein the object class comprises one of the followings: an identification of the object, a text, a two-dimensional barcode, or a product. (Spencer ¶48 QR code; note that these labels are given little patentable weight as they merely reflect an arbitrary name for a category)
Conclusion
All claims are identical to or patentably indistinct from, or have unity of invention with claims in the application prior to the entry of the submission under 37 CFR 1.114 (that is, restriction (including a lack of unity of invention) would not be proper) and all claims could have been finally rejected on the grounds and art of record in the next Office action if they had been entered in the application prior to entry under 37 CFR 1.114. Accordingly, THIS ACTION IS MADE FINAL even though it is a first action after the filing of a request for continued examination and the submission under 37 CFR 1.114. See MPEP § 706.07(b). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ming Shui, whose telephone number is (303)297-4247. The examiner can normally be reached 7-5 Pacific Time, M-Th.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Greg Morse, can be reached at 571-272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Ming Shui/
Primary Examiner, Art Unit 2663