DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
The actual filing date of the instant application is 29 Jan 24. However, the instant application claims domestic benefit of provisional application No. 63/441,897, filed 30 Jan 23. As such, the effective filing date of each claim under examination may be as recent as the actual filing date of 29 Jan 24, or as early as 30 Jan 23 (the filing date of provisional application No. 63/441,897), depending on whether the earlier-filed specification provides appropriate support for that particular claim. If a prior art rejection of one or more claims made in an Office action during prosecution of the instant application relies on one or more references whose effective dates fall between 30 Jan 23 and 29 Jan 24 (an "intervening" reference), and Applicant specifically identifies appropriate specification support for each of those claims in the earlier-filed provisional application, the Examiner may determine that one or more of those prior art rejections must be withdrawn.
Status of Claims
Claims 1-20 are pending and examined below.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being incomplete for omitting essential steps, such omission amounting to a gap between the steps. See MPEP § 2172.01.
Claim 1 recites “assigning, by the object recognition subsystem, a first label to the first object; sending, by the interface, a query to the LLM, the query comprising the first label; receiving, by the interface, a response from the LLM, the response in reply to the query, the response comprising a second label; and assigning, by the object recognition subsystem, the second label to the second object.” It is unclear how the LLM is able to respond with a second label for the second object, because the claim does not appear to recite sending any information regarding the second object to the LLM beforehand. Claims 2-20 depend from claim 1 and do not appear to rectify the issue. Appropriate clarification/correction is requested.
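For illustration only, the recited sequence is sketched below in Python. All class, method, and variable names are hypothetical and are not drawn from Applicant's disclosure; the sketch is intended only to make the identified gap visible: the query sent to the LLM carries only the first label, and no recited step supplies the LLM with any information about the second object.

# Hypothetical sketch of the sequence recited in claim 1.
# All names are illustrative only and do not appear in Applicant's disclosure.

class ObjectRecognitionSubsystem:
    """Stand-in for the claimed object recognition subsystem."""

    def __init__(self):
        self.labels = {}  # object -> assigned label

    def assign(self, obj, label):
        self.labels[obj] = label


class LLMInterface:
    """Stand-in for the claimed interface to the LLM."""

    def query(self, payload):
        # Placeholder for the LLM round trip; returns some second label.
        return {"second_label": "<label returned by LLM>"}


def claimed_sequence(first_object, second_object):
    subsystem = ObjectRecognitionSubsystem()
    llm = LLMInterface()

    # "assigning, by the object recognition subsystem, a first label to the first object"
    first_label = "first label"
    subsystem.assign(first_object, first_label)

    # "sending, by the interface, a query to the LLM, the query comprising the first label"
    # Note: the query carries only the first label; the claim recites nothing
    # about the second object being sent to the LLM.
    response = llm.query({"first_label": first_label})

    # "receiving, by the interface, a response from the LLM ... the response comprising a second label"
    second_label = response["second_label"]

    # "assigning, by the object recognition subsystem, the second label to the second object"
    subsystem.assign(second_object, second_label)
    return subsystem.labels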
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter. Claim 1 is directed to “a computer program product comprising data and processor-executable instructions stored in the one or more non-volatile processor-readable storage media.” Examiner interprets this to mean that the computer program product comprises data and processor-executable instructions and that this computer program product is stored on one or more non-volatile processor-readable storage media. As such, said computer program product appears to be directed to software per se. Additionally, it is unclear whether the claim is directed to said “computer program product” or the “method” recited later in the claim.
Further, even if Applicant is able to successfully argue or amend the preamble to overcome this issue, Examiner contends that the claims remain rejected because the claimed invention is directed to a judicial exception without significantly more. The subject matter eligibility analysis follows:
Step 1: This part of the eligibility analysis evaluates whether the claim falls within any statutory category. As discussed above, the claim appears to be directed to software per se (Step 1: NO). However, assuming Applicant overcomes this issue, further analysis continues as follows.
Step 2A, Prong One: This part of the eligibility analysis evaluates whether the claim recites a judicial exception.
The claim recites the steps of
assigning a first label to the first object;
sending a query, the query comprising the first label;
receiving a response, the response in reply to the query, the response comprising a second label; and
assigning the second label to the second object.
Examiner contends that these steps fall within the mental processes grouping of abstract ideas because they cover concepts performed in the human mind, including observation, evaluation, judgment, and opinion. Looking at an object and assigning it a label, receiving a response to a query containing a second label, and assigning that second label to a second object are all steps that can be performed in the human mind as a mental process. (Step 2A, Prong One: YES).
Step 2A, Prong Two: This part of the eligibility analysis evaluates whether the claim as a whole integrates the recited judicial exception into a practical application of the exception or whether the claim is “directed to” the judicial exception. This evaluation is performed by (1) identifying whether there are any additional elements recited in the claim beyond the judicial exception, and (2) evaluating those additional elements individually and in combination to determine whether the claim as a whole integrates the exception into a practical application. See MPEP 2106.04(d).
Beyond the abstract idea, the claim recites the additional elements of an object recognition subsystem, an interface, and an LLM, e.g., “sending, by the interface, a query to the LLM, the query comprising the first label.” These additional elements are recited at a high level of generality and perform the generic computer functions of processing and transmitting information; performing generic computer functions alone does not amount to significantly more than the abstract idea. Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception. (Step 2A: YES).
Step 2B: This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05.
As discussed above, the steps of the claim are recited at a high level of generality. These elements amount to transmitting data over a network and applying labels to objects, which are well-understood, routine, and conventional activities. See MPEP 2106.05(d), subsection II. Even when considered in combination with the object recognition subsystem, interface, and LLM, which are recited at a high level of generality, these additional elements represent mere instructions to implement an abstract idea or other exception on a computer, which do not provide an inventive concept. (Step 2B: NO).
Therefore, claim 1 is ineligible. Claims 2-20 depend from claim 1, do not rectify the underlying issues, and are similarly rejected.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claim 1 of the examined application is provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of copending Application No. 18/425,557 (reference application). Although the claims at issue are not identical, they are not patentably distinct from each other because the bodies of the claims are identical while the preambles vary slightly: claim 1 of the examined application is directed to a computer program product, whereas claim 1 of the reference application is directed to a method. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the computer program product of the examined application to be directed to a method, in order to define the method carried out by the computer program product.
This is a provisional nonstatutory double patenting rejection because the patentably indistinct claims have not in fact been patented.
Claims 2-20 of the examined application are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 2-20 of copending Application No. 18/425,557 (reference application) for reasons similar to those set forth for claim 1: the bodies of the claims are identical, while the preambles recite different statutory categories.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by US 2023/0394855 A1 (“Xie”).
As per Claim 1, Xie discloses a computer program product for performing a method of operation of a robotic system, the robotic system comprising one or more non-volatile processor-readable storage media, one or more processors, a robot, an object recognition subsystem, and an interface to a large language model (LLM), the robot operating in an environment, the environment comprising a plurality of objects, the plurality of objects including a first object and a second object, the computer program product comprising data and processor-executable instructions stored in the one or more non-volatile processor-readable storage media that, when executed by the one or more processors communicatively coupled to the storage media, cause the one or more processors to perform the method of operation of the robotic system, the method comprising:
assigning, by the object recognition subsystem, a first label to the first object (¶ 19—“Object detector 116 detects a plurality of objects 118 in image 102”; ¶ 52—“Operation 410 determines, for each object in plurality of objects 118 in image 102, object information 126. In some examples, object information 126 comprises at least one item selected from the list consisting of: an object tag, an object caption, an object attribute, and an object location.”);
sending, by the interface, a query to the LLM, the query comprising the first label (¶ 20—“Visual information 120 comprises text that describes what is contained within image 102, such as image tags 122, an initial image caption 124, and object information 126”; ¶ 40—“The collected visual information 120 (tags 122, initial image caption 124, and object information 126) is then formatted into structured visual clues 130, which is used to directly prompt generative language model 140”);
receiving, by the interface, a response from the LLM, the response in reply to the query, the response comprising a second label (¶ 21—“A generative language model 140 generates a plurality of image story caption candidates 144, which includes story captions 141, 142, and 143, from visual information 120, which includes, from visual clues 130”; ¶ 53—“In operation 414, generative language model 140 generates, from visual information 120, plurality of image story caption candidates 144”); and
assigning, by the object recognition subsystem, the second label to the second object (¶ 27—“Selected story caption 154 is then paired with image 102 in paired set 160”).
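As an aid to understanding the cited disclosure, a simplified, non-limiting sketch of the data flow Xie describes in the paragraphs quoted above is provided below. The function and variable names are illustrative only, and any detail beyond the quoted text is an assumption rather than a representation of Xie's actual implementation.

# Simplified, illustrative sketch of the pipeline Xie describes at the cited
# paragraphs (¶¶ 19-21, 27, 40, 52-53). Names are hypothetical and do not
# reproduce Xie's code; details beyond the quoted text are assumptions.

def xie_pipeline(image, object_detector, captioner, tagger,
                 generative_language_model, vision_language_model):
    # ¶ 19: the object detector detects a plurality of objects in the image.
    objects = object_detector.detect(image)

    # ¶¶ 20, 52: visual information includes image tags, an initial image
    # caption, and per-object information (tag, caption, attribute, location).
    visual_information = {
        "tags": tagger.tag(image),
        "initial_caption": captioner.caption(image),
        "object_information": [object_detector.describe(obj) for obj in objects],
    }

    # ¶ 40: the visual information is formatted into structured visual clues
    # used to directly prompt the generative language model.
    visual_clues = "\n".join(f"{key}: {value}" for key, value in visual_information.items())

    # ¶¶ 21, 53: the generative language model generates story caption candidates.
    candidates = generative_language_model.generate(visual_clues)

    # ¶ 27: a vision language model scores the candidates, a down-selection
    # component selects one caption, and the selected caption is paired with the image.
    scores = [vision_language_model.score(image, candidate) for candidate in candidates]
    selected_caption = candidates[scores.index(max(scores))]
    return {"image": image, "selected_caption": selected_caption}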
As per Claim 2, Xie further discloses the object recognition subsystem comprising a plurality of sensors and a sensor data processor, the method further comprising:
scanning the environment, by the plurality of sensors, to generate sensor data (¶ 19—“Architecture 100 intakes an image 102”);
detecting, by the sensor data processor, the presence of the first object and the second object, wherein the detecting, by the sensor data processor, the presence of the first object and the second object is based at least in part on the sensor data (¶ 19—“Object detector 116 detects a plurality of objects 118 in image 102”).
As per Claim 3, Xie further discloses wherein the assigning, by the object recognition subsystem, a first label to the first object includes:
identifying the first object based at least in part on the sensor data (¶ 19—“Object detector 116 detects a plurality of objects 118 in image 102”); and
assigning a natural language label to the first object (¶ 19—“Object detector 116 detects a plurality of objects 118 in image 102”; ¶ 52—“Operation 410 determines, for each object in plurality of objects 118 in image 102, object information 126. In some examples, object information 126 comprises at least one item selected from the list consisting of: an object tag, an object caption, an object attribute, and an object location.”).
As per Claim 4, Xie further discloses wherein the sending, by the interface, a query to the LLM includes formulating a natural language statement, the natural language statement comprising the natural language label assigned to the first object (¶ 40—“formatted into structured visual clues 130, which is used to directly prompt generative language model 140”).
As per Claim 5, Xie further discloses the method further comprising determining a degree of confidence in the identifying of the first object exceeds a determined confidence threshold, wherein the determining a degree of confidence in the identifying of the first object exceeds a determined confidence threshold includes determining a probability (¶ 27—“vision language model 150 scores plurality of image story caption candidates 144 and a down selection component 152 selects selected story caption 154 based on at least the scores from vision language model 150”).
As per Claim 6, Xie further discloses wherein the scanning the environment, by the plurality of sensors, to generate sensor data includes generating at least one of image data, video data, audio data, or haptic data (¶ 19—“Architecture 100 intakes an image 102 and has a vision language model 112, a captioner 114, and an object detector 116. Object detector 116 detects a plurality of objects 118 in image 102. Vision language model 112 and captioner 114 produce visual information 120 from image 102 and plurality of objects 118”).
As per Claim 7, Xie further discloses wherein the detecting, by the sensor data processor, the presence of the first object and the second object includes detecting, by the sensor data processor, the presence of the first object and the second object in real time (¶ 19—“Architecture 100 intakes an image 102 and has a vision language model 112, a captioner 114, and an object detector 116. Object detector 116 detects a plurality of objects 118 in image 102.”).
As per Claim 8, Xie further discloses the method further comprising assigning, by the object recognition subsystem, a third label to the second object, wherein the assigning, by the object recognition subsystem, a third label to the second object includes:
identifying the second object based at least in part on the sensor data (¶ 19—“Object detector 116 detects a plurality of objects 118 in image 102”); and
determining a degree of confidence in the identifying of the second object fails to exceed a determined confidence threshold (¶ 27—“vision language model 150 scores plurality of image story caption candidates 144 and a down selection component 152 selects selected story caption 154 based on at least the scores from vision language model 150”).
As per Claim 9, Xie further discloses wherein the assigning, by the object recognition subsystem, the second label to the second object includes updating the degree of confidence in the identifying of the second object (¶ 55—“vision language model 150 scores plurality of image story caption candidates 144 and a down selection component 152 selects selected story caption 154 based on at least the scores from vision language model 150”).
As per Claim 10, Xie further discloses wherein the sending, by the interface, a query to the LLM includes formulating a natural language statement, the natural language statement comprising the first label (¶ 40—“formatted into structured visual clues 130, which is used to directly prompt generative language model 140”).
As per Claim 11, Xie further discloses wherein the formulating a natural language statement includes structuring the natural language statement to cause the response from the LLM to follow a defined structure (¶ 40—“structured visual clues 130”; ¶ 21—“Caption focus 132 acts as an input that instructs generative language model 140 to produce story captions that are particularly suited for some desired application of using architecture 100”).
As per Claim 12, Xie further discloses wherein the receiving, by the interface, a response from the LLM includes:
receiving a natural language statement, the natural language statement comprising a natural language label (¶ 21—“A generative language model 140 generates a plurality of image story caption candidates 144, which includes story captions 141, 142, and 143, from visual information 120, which includes, from visual clues 130”); and
parsing the natural language statement to extract the natural language label (¶ 21—“A generative language model 140 generates a plurality of image story caption candidates 144, which includes story captions 141, 142, and 143, from visual information 120, which includes, from visual clues 130”).
As per Claim 13, Xie further discloses wherein the assigning, by the object recognition subsystem, a second label to the second object includes assigning the natural language label to the second object (¶ 27—“Selected story caption 154 is then paired with image 102 in paired set 160”).
As per Claim 14, Xie further discloses wherein the assigning, by the object recognition subsystem, a first label to the first object includes:
identifying the first object (¶ 19—“Object detector 116 detects a plurality of objects 118 in image 102”);
and assigning a natural language label to the first object (¶ 52—“Operation 410 determines, for each object in plurality of objects 118 in image 102, object information 126. In some examples, object information 126 comprises at least one item selected from the list consisting of: an object tag, an object caption, an object attribute, and an object location.”).
As per Claim 15, Xie further discloses the method further comprising:
determining a degree of confidence in the identifying of the first object exceeds a determined confidence threshold, wherein the determining a degree of confidence in the identifying of the first object exceeds a determined confidence threshold includes determining a probability (¶ 27—“vision language model 150 scores plurality of image story caption candidates 144 and a down selection component 152 selects selected story caption 154 based on at least the scores from vision language model 150”).
As per Claim 16, Xie further discloses wherein the sending, by the interface, a query to the LLM includes formulating a natural language statement, the natural language statement comprising the natural language label (¶ 40—“formatted into structured visual clues 130, which is used to directly prompt generative language model 140”).
As per Claim 17, Xie further discloses wherein the formulating a natural language statement includes structuring the natural language statement to cause the response from the LLM to follow a defined structure (¶ 40—“structured visual clues 130”; ¶ 21—“Caption focus 132 acts as an input that instructs generative language model 140 to produce story captions that are particularly suited for some desired application of using architecture 100”).
As per Claim 18, Xie further discloses wherein the receiving, by the interface, a response from the LLM includes:
receiving a natural language statement, the natural language statement comprising a natural language label (¶ 21—“A generative language model 140 generates a plurality of image story caption candidates 144, which includes story captions 141, 142, and 143, from visual information 120, which includes, from visual clues 130”); and
parsing the natural language statement to extract the natural language label (¶ 21—“A generative language model 140 generates a plurality of image story caption candidates 144, which includes story captions 141, 142, and 143, from visual information 120, which includes, from visual clues 130”).
As per Claim 19, Xie further discloses the method further comprising assigning, by the object recognition subsystem, a third label to the second object, wherein the assigning, by the object recognition subsystem, a second label to the second object includes comparing the second label with the third label (¶ 27—“vision language model 150 scores plurality of image story caption candidates 144 and a down selection component 152 selects selected story caption 154 based on at least the scores from vision language model 150”).
As per Claim 20, Xie further discloses wherein the assigning, by the object recognition subsystem, a second label to the second object further includes updating a degree of confidence (¶ 27—“vision language model 150 scores plurality of image story caption candidates 144 and a down selection component 152 selects selected story caption 154 based on at least the scores from vision language model 150”).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BASIL T JOS, whose telephone number is (571) 270-5915. The examiner can normally be reached 11:00 AM - 8:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, THOMAS WORDEN, can be reached at (571) 272-4876. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Basil T. Jos/Primary Examiner, Art Unit 3658