DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
35 U.S.C. 101 Positive Statement
As to claim 7, the broadest reasonable interpretation of “computer readable storage media” would typically be considered to include signals or carrier waves, which would render the claim non-statutory under 35 U.S.C. 101. However, paragraph 42 of the originally filed specification defines “storage media” as “also called ‘mediums’”, and later in the same paragraph states that a “computer readable storage medium” is “not to be construed as storage in the form of transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide, light pulses passing through a fiber optic cable, electrical signals communicated through a wire, and/or other transmission media”. Therefore, the claim is considered statutory.
Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 8 and 9 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claim 8, the limitation “wherein the stored program instructions are stored in a computer readable storage device in a data processing system” is vague and indefinite. Specifically, it is unclear in what way this limitation is connected to the computer program product recited in parent claim 7. Is the “computer readable storage media” of claim 7 meant to be replaced with the “computer readable storage device” of claim 8, and is the “computer program product” of claim 7 somehow meant to be replaced with the “data processing system”? If so, it is unclear how a product can be replaced with a system. Or are the program instructions stored on the computer readable storage media of the computer program product of claim 7 somehow loaded into the “remote data processing system” and then transferred over a network to the data processing system, where they are stored in the “computer readable storage device”?
Regarding claim 9, the limitations “wherein the stored program instructions are stored in a computer readable storage device in a server data processing system” and “a computer readable storage device associated with the remote data processing system” are vague and indefinite. Specifically, it is unclear in what way these “devices” and “systems” are connected to the computer program product and media recited in parent claim 7. Is the “computer readable storage media” of claim 7 meant to be replaced with either one of the “computer readable storage devices” of claim 9, and is the “computer program product” of claim 7 to be replaced with the “server data processing system” or the “remote data processing system” of claim 9? If so, it is unclear how a product can be replaced with a system.
Additionally, regarding claim 9, the limitation “wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system” is vague and indefinite. Specifically, “downloaded in response to a request over a network to a remote data processing system” reads as though the server data processing system is requesting and downloading the instructions from the remote data processing system. However, the phrase “for use in a computer readable storage device associated with the remote data processing system” reads as though the remote system is downloading the instructions from the server. Therefore, the limitation is not clear and clarification is necessary.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 7 and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by US 2018/0143966 to Lu et al. (“Lu”).
Regarding claim 1, Lu discloses a computer-implemented method comprising:
partitioning, using a trained image segmentation model, an input image into a plurality of patches (Fig. 1; Fig. 7; paragraphs 75-78 and 179-185, wherein the image encoder corresponds to a trained segmentation model that partitions the input image into a plurality of regions/patches);
generating, using a vision transformer model, a plurality of patch embeddings, each patch embedding comprising a multidimensional numerical representation of a patch in the plurality of patches (Fig. 1; Fig. 7; paragraphs 75-78 and 179-180, wherein the image encoder corresponds to the broadest reasonable interpretation of a “vision transformer model”; the Examiner interprets this term not as a true vision transformer (ViT) as known in the art, but as any broader “model” (imitation/emulation) thereof that performs the equivalent function of generating a plurality of multidimensional numerical feature vectors (i.e., patch embeddings) for each patch/region, as is done by the image encoder of Lu);
generating, using a trained patch-label similarity model, a plurality of word embeddings corresponding to the plurality of patch embeddings (Fig. 4; Fig. 7; paragraphs 110-115, 166-174 and 181-183, wherein the combination of LSTM and attender (i.e. attention mechanism) corresponds to a “trained patch-label similarity model”, that ultimately generates a plurality of image feature vectors combined with attention probability masses corresponding to a plurality of “patch embeddings”); and
generating, using a trained label prediction model and the plurality of word embeddings, a text label corresponding to the input image (Fig. 5; Fig. 7; paragraphs 166-177 and 183-185, wherein the plurality of image feature vectors and attention probability masses are summed to produce an image context vector that is used by the emitter to produce a caption word (i.e., text label) corresponding to the input image).
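For illustration only, the attention-weighted summation referred to in the mapping above (image feature vectors combined with attention probability masses to form an image context vector) can be sketched as follows. This is a minimal sketch and not a characterization of Lu's actual implementation: the function names, the dot-product scoring rule, and the toy values are all assumptions introduced here.

```python
import math


def softmax(scores):
    """Turn raw scores into attention probability masses that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(region_features, query):
    """Return (attention weights, context vector) for a set of region features.

    Assumption: each region is scored by its dot product with a query
    vector (e.g., a decoder hidden state). The scoring rule is hypothetical.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    # Weighted sum of the region feature vectors, one dimension at a time.
    context = [
        sum(w * feat[i] for w, feat in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy region feature vectors
    weights, context = context_vector(regions, query=[2.0, 0.0])
    print(weights, context)
```

In this sketch, regions whose features align with the query receive larger probability mass and therefore contribute more to the context vector from which a caption word would be emitted.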
Regarding claim 7, please refer to the rejection of claim 1 above. Lu further discloses a computer program product comprising one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by a processor to cause the processor to perform operations disclosed in claim 1 (Fig. 25; paragraphs 185-187 and 359-361).
Regarding claim 15, please refer to the rejection of claim 1 above. Lu further discloses a computer system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions executable by the processor to cause the processor to perform the operations of claim 1 (Fig. 25; paragraphs 185-187 and 359-361).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2-4, 10-12 and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over US 2018/0143966 to Lu et al. (“Lu”) in view of US 2020/0097604 to Lee et al. (“Lee”).
Regarding claim 2, Lu discloses the computer-implemented method of claim 1.
Lu does not disclose expressly wherein the trained patch-label similarity model comprises a similarity matrix, a cell of the similarity matrix storing a pair-wise similarity score between a patch embedding and a word embedding.
Lee discloses a process for image captioning comprising partitioning an image into patches/regions, generating region/patch vectors, and comparing the region/patch vectors to word vectors in a first-stage attention mechanism that includes determining a pair-wise similarity score between the vectors and storing each score in a matrix (Fig. 2, element 214; paragraphs 53-56).
Lu & Lee are combinable because they are from the same art of image captioning using attention models.
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the technique of using an attention mechanism to generate a similarity matrix where each cell stores a pair-wise similarity score between a patch vector/embedding and a word vector/embedding, as taught by Lee, into the image captioning process of Lu.
The suggestion/motivation for doing so would have been to provide improved techniques in determining which regions correspond to which words by determining the biggest region-word pair response for the matching process (Lee, paragraphs 01 and 58).
Therefore, it would have been obvious to combine Lee with Lu to obtain the invention as specified in claim 2.
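For illustration only, a similarity matrix of the kind discussed with respect to claim 2, in which each cell stores a pair-wise score between one patch embedding and one word embedding, can be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity is used purely as an assumed scoring function, and the function names and toy vectors are hypothetical rather than taken from Lee or from the application.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(patch_embeddings, word_embeddings):
    """Build a matrix whose cell [i][j] scores patch i against word j."""
    return [[cosine(p, w) for w in word_embeddings] for p in patch_embeddings]


if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.0, 2.0]]  # toy patch embeddings
    words = [[3.0, 0.0], [1.0, 1.0]]    # toy word embeddings
    for row in similarity_matrix(patches, words):
        print(row)
```

Rows index patches and columns index words, so the largest entry in a row identifies the word that best matches that patch under the assumed scoring function.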
Regarding claim 3, the combination of Lu and Lee discloses the computer-implemented method of claim 2, wherein the pair-wise similarity score between a patch embedding and a word embedding is computed by analyzing a plurality of training images and corresponding training image captions (Lee, paragraphs 18-19).
Regarding claim 4, the combination of Lu and Lee discloses the computer-implemented method of claim 3, further comprising:
partitioning a training image in the plurality of training images into a plurality of training patches; and generating, using the vision transformer model, a plurality of training patch embeddings, each training patch embedding comprising a multidimensional numerical representation of a training patch in the plurality of training patches (Lee, paragraphs 38 and 39; Lu, paragraphs 75-76, wherein a CNN is used to partition the image and generate the region/patch vectors/embeddings, and CNNs are necessarily trained on training data of the same type (i.e., regions of an image and associated/known vectors)).
Regarding claims 10-12, please refer to the rejections of claims 2-4, respectively, above.
Regarding claims 16-18, please refer to the rejections of claims 2-4, respectively, above.
Claims 8 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over US 2018/0143966 to Lu et al. (“Lu”) in view of US 2021/0256515 to Gale.
Regarding claim 8, Lu discloses the computer program product of claim 7.
Lu does not disclose expressly wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system.
Gale discloses a process in which program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system (paragraph 21).
Lu & Gale are combinable because they are from the same art of using program instructions to implement the operations of the invention.
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the technique of storing program instructions in a computer readable storage device in a data processing system, wherein the stored program instructions are transferred over a network from a remote data processing system, as disclosed by Gale, into the process of implementing operations via a computer program product disclosed by Lu.
The suggestion/motivation for doing so would have been to provide a program in response to a request for usage (Gale, paragraph 21).
Therefore, it would have been obvious to combine Lu with Gale to obtain the invention as specified in claim 8.
Regarding claim 9, Lu discloses the computer program product of claim 7.
Lu does not disclose expressly wherein the stored program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising: program instructions to meter use of the program instructions associated with the request; and program instructions to generate an invoice based on the metered use.
Gale discloses a process in which program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising: program instructions to meter use of the program instructions associated with the request; and program instructions to generate an invoice based on the metered use (paragraph 21).
Lu & Gale are combinable because they are from the same art of using program instructions to implement the operations of the invention.
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to incorporate the technique of storing program instructions in a computer readable storage device in a server data processing system, wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising program instructions to meter use of the program instructions associated with the request and program instructions to generate an invoice based on the metered use, as disclosed by Gale, into the process of implementing operations via a computer program product disclosed by Lu.
The suggestion/motivation for doing so would have been to provide a program in response to a request for usage (Gale, paragraph 21).
Therefore, it would have been obvious to combine Lu with Gale to obtain the invention as specified in claim 9.
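For illustration only, the metering and invoicing limitations discussed with respect to claim 9 can be sketched as follows. This is a minimal sketch and does not represent Gale's disclosure or the applicant's implementation: the class name, the function name, the per-unit rate, and the client identifier are all hypothetical.

```python
from collections import Counter


class UsageMeter:
    """Counts units of use per requesting client (hypothetical helper)."""

    def __init__(self):
        self._counts = Counter()

    def record(self, client_id, units=1):
        """Meter one request; units must be non-negative."""
        if units < 0:
            raise ValueError("units must be non-negative")
        self._counts[client_id] += units

    def usage(self, client_id):
        """Return the total metered units for a client (0 if never seen)."""
        return self._counts[client_id]


def generate_invoice(meter, client_id, rate_cents_per_unit):
    """Build an invoice record from metered use.

    Amounts are kept in integer cents to avoid floating-point rounding.
    """
    units = meter.usage(client_id)
    return {
        "client_id": client_id,
        "units": units,
        "amount_cents": units * rate_cents_per_unit,
    }


if __name__ == "__main__":
    meter = UsageMeter()
    meter.record("remote-1")           # one download request
    meter.record("remote-1", units=3)  # three further uses
    print(generate_invoice(meter, "remote-1", rate_cents_per_unit=25))
```

The sketch separates metering (recording use per request) from invoicing (pricing the recorded use), mirroring the two sets of program instructions recited in the claim.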
Allowable Subject Matter
Claims 5, 6, 13, 14, 19 and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See attached PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AARON W CARTER whose telephone number is (571)272-7445. The examiner can normally be reached 8am - 5pm (Mon - Fri).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John Villecco, can be reached at (571) 272-7319. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AARON W CARTER/
Primary Examiner, Art Unit 2661