DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3-5, 10-17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Tomioka et al., US PGPUB No. 20190130216 A1, hereinafter Tomioka, in view of Denney et al., US PGPUB No. 20220335631 A1, hereinafter Denney, and further in view of Motiian et al., US PGPUB No. 20240153259 A1, hereinafter Motiian.
Regarding claim 17, Tomioka discloses a system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations for conditioning a generative machine-learning model configured to generate an image based on an input (Tomioka; a system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers [¶ 0030 and ¶ 0037] cause the one or more computers to perform operations for conditioning a generative ML model [¶ 0032-0033 and ¶ 0043] configured to generate an image based on an input [¶ 0046], as illustrated within Fig. 1 and Fig. 3; wherein, learning model selection processing [¶ 0049-0052], as illustrated within Fig. 5; additionally, information processing apparatus [¶ 0193], as illustrated within Fig. 12), the operations comprising:
obtaining a plurality of training images (Tomioka; the operations [as addressed above] comprising obtaining a plurality of training images [¶ 0030-0032]; moreover, image capturing of a scene in relation with calculating evaluation values of the learning models using the training images [¶ 0041-0043]; and moreover, a plurality of input images [¶ 0066]);
grouping the training images into a plurality of image clusters (Tomioka; grouping the training images into a plurality of image clusters [¶ 0036 and ¶ 0043]); and
a set of intensity levels (Tomioka; a set of intensity levels (i.e. evaluation values) [¶ 0051-0052]).
Tomioka fails to disclose grouping the training images into a plurality of image clusters, wherein each respective image cluster includes a respective subset of the training images;
for each respective image cluster:
determining a respective descriptor for the respective image cluster; and
generating a respective set of instances of the generative machine-learning model on the training images in the image cluster, the generating comprising:
for each of a set of intensity levels, generating a respective instance of the machine-learning model by conditioning the machine-learning model based on (i) the image cluster and (ii) a respective embedding size determined from the respective intensity level.
However, Denney teaches obtaining a plurality of training images (Denney; obtaining a plurality of training images [¶ 0029]);
grouping the training images into a plurality of image clusters (Denney; grouping the training images into a plurality of image clusters (i.e. K sets) [¶ 0029-0030]), wherein each respective image cluster includes a respective subset of the training images (Denney; each respective image cluster (i.e. k sets) includes a respective subset (i.e. reference images) of the training images [¶ 0029-0030]);
for each respective image cluster:
determining a respective descriptor for the respective image cluster (Denney; determining a respective descriptor for the respective image cluster for each respective image cluster [¶ 0040-0042]; moreover, one or more K flows [¶ 0034-0036]; additionally, one or more median-absolute deviations [¶ 0044-0045]); and
generating a respective set of instances of the generative machine-learning model on the training images in the image cluster (Denney; generating a respective set of instances of the generative ML model on the training images in the image cluster for each respective image cluster [¶ 0031 and ¶ 0033]), the generating comprising:
for each of a set of intensity levels, generating a respective instance of the machine-learning model by conditioning the machine-learning model based on (i) the image cluster and (ii) a respective embedding size determined from the respective intensity level (Denney; generating a respective instance of the ML model by conditioning the ML model based on (i) the image cluster and (ii) a respective implicit embedding size (given error space, m levels) determined from the respective intensity level (i.e. confidence levels) for each of a set of intensity levels (i.e. confidence levels) [¶ 0034-0037]).
Tomioka and Denney are considered to be analogous art because both pertain to generating and/or managing data in relation with providing media related data to a user, wherein one or more computerized units are utilized in order to produce a computational modeling effect.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tomioka to incorporate obtaining a plurality of training images; grouping the training images into a plurality of image clusters, wherein each respective image cluster includes a respective subset of the training images; for each respective image cluster: determining a respective descriptor for the respective image cluster; and generating a respective set of instances of the generative machine-learning model on the training images in the image cluster, the generating comprising: for each of a set of intensity levels, generating a respective instance of the machine-learning model by conditioning the machine-learning model based on (i) the image cluster and (ii) a respective embedding size determined from the respective intensity level (as taught by Denney), in order to provide improved object detection (Denney; [¶ 0002-0003]).
Tomioka as modified by Denney fails to explicitly disclose a respective embedding size.
However, Motiian teaches a respective embedding size (Motiian; a respective embedding size [¶ 0038]).
Tomioka in view of Denney and Motiian are considered to be analogous art because they pertain to generating and/or managing data in relation with providing media related data to a user, wherein one or more computerized units are utilized in order to produce a computational modeling effect.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tomioka as modified by Denney to incorporate a respective embedding size (as taught by Motiian), in order to provide improved image processing and image based training (Motiian; [¶ 0002-0003]).
Regarding claim 19, Tomioka in view of Denney and Motiian further discloses the system of claim 17, the generative machine-learning model (Tomioka; the generative ML model [as addressed within the parent claim(s)]).
However, Motiian teaches the generative machine-learning model is a diffusion model comprising a textual inversion component (Motiian; generative ML model is a diffusion model comprising a textual inversion component [¶ 0054-0055]), wherein the respective embedding size is a dimension number of a respective textual inversion embedding vector for the respective image cluster and the respective intensity level (Motiian; the respective embedding size is a dimension number [¶ 0038] of a respective textual inversion embedding vector for the respective image cluster and the respective intensity level [¶ 0054-0055]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tomioka as modified by Denney and Motiian to incorporate the generative machine-learning model being a diffusion model comprising a textual inversion component, wherein the respective embedding size is a dimension number of a respective textual inversion embedding vector for the respective image cluster and the respective intensity level (as taught by Motiian), in order to provide improved image processing and image based training (Motiian; [¶ 0002-0003]).
Regarding claim 20, the rejection of claim 20 is addressed within the rejection of claim 17 due to the similarities claim 20 and claim 17 share; therefore, refer to the rejection of claim 17 regarding the rejection of claim 20. Although claim 20 and claim 17 may not be identical, they are considerably comparable or substantially equivalent given their overlapping subject matter. However, the limitations of claim 20 not addressed by the rejection of claim 17 are addressed below.
Tomioka discloses one or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations (Tomioka; one or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations [¶ 0037 and ¶ 0040]; moreover, executable instructions [¶ 0248]).
(further refer to the rejection of claim 17)
Regarding claim 1, the rejection of claim 1 is addressed within the rejection of claim 17 due to the similarities claim 1 and claim 17 share; therefore, refer to the rejection of claim 17 regarding the rejection of claim 1. Although claim 17 and claim 1 may not be identical, they are considerably comparable or substantially equivalent given their overlapping subject matter. Thus, it is reasonable to reject claim 1 based on the teachings and rationale in relation with the prior art within the rejection of claim 17.
Regarding claim 3, Tomioka in view of Denney and Motiian further discloses the method of claim 1, wherein the intensity level is a style intensity level that characterizes a specificity to a particular image style for a generated image (Tomioka; the intensity level (i.e. evaluation value) is a style (i.e. match type) intensity level that characterizes a specificity to a particular image style/type for a generated image [¶ 0121]).
Regarding claim 4, Tomioka in view of Denney and Motiian further discloses the method of claim 1, wherein the intensity level is a subject intensity level that characterizes a specificity to a particular subject for a generated image (Tomioka; the intensity level is a subject (i.e. types of object) intensity level that characterizes a specificity to a particular subject/type-of-object for a generated image [¶ 0123-0124 and ¶ 0233]).
Regarding claim 5, the rejection of claim 5 is addressed within the rejection of claim 19, due to the similarities claim 5 and claim 19 share, therefore refer to the rejection of claim 19 regarding the rejection of claim 5.
Regarding claim 10, Tomioka in view of Denney and Motiian further discloses the method of claim 1, wherein grouping the training images into the plurality of image clusters (Denney; grouping the training images into the plurality of image clusters [as addressed within the parent claim(s)]) comprises:
using K-means clustering to group the training images into the image clusters (Denney; using K-means (i.e. k flow) clustering to group the training images into the image clusters [¶ 0033]; additionally, clustering algorithm [¶ 0029-0030]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tomioka in view of Denney and Motiian to incorporate grouping the training images into the plurality of image clusters comprising: using K-means clustering to group the training images into the image clusters (as taught by Denney), in order to provide improved object detection (Denney; [¶ 0002-0003]).
Regarding claim 11, Tomioka in view of Denney and Motiian further discloses the method of claim 1, wherein grouping the training images into the plurality of image clusters (Denney; grouping the training images into the plurality of image clusters [as addressed within the parent claim(s)]) comprises:
using human labeling to group the training images into the image clusters (Denney; using human labeling to group the training images into the image clusters [¶ 0029]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tomioka in view of Denney and Motiian to incorporate grouping the training images into the plurality of image clusters comprising: using human labeling to group the training images into the image clusters (as taught by Denney), in order to provide improved object detection (Denney; [¶ 0002-0003]).
Regarding claim 12, Tomioka in view of Denney and Motiian further discloses the method of claim 1, further comprising:
after grouping the training images into the plurality of image clusters and before generating the instances of the generative machine-learning model (Denney; after grouping the training images into the plurality of image clusters and before generating the instances of the generative machine-learning model [¶ 0027], as illustrated within Fig. 2; wherein Fig. 2 illustrates an order of steps), pre-processing each image cluster, the pre-processing comprising one or more of:
creating flipped copies, splitting oversized images, performing auto-focal point cropping, performing auto-sized cropping, or automatically generating captions (Denney; pre-processing each image cluster, the pre-processing comprising an alignment [¶ 0030-0031 and ¶ 0034] and even further removing invalid, unwanted, and irrelevant regions corresponding to performing auto-sized cropping [¶ 0041]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tomioka in view of Denney and Motiian to incorporate, after grouping the training images into the plurality of image clusters and before generating the instances of the generative machine-learning model, pre-processing each image cluster, the pre-processing comprising one or more of: creating flipped copies, splitting oversized images, performing auto-focal point cropping, performing auto-sized cropping, or automatically generating captions (as taught by Denney), in order to provide improved object detection (Denney; [¶ 0002-0003]).
Regarding claim 13, Tomioka in view of Denney and Motiian further discloses the method of claim 1, wherein:
before generating the instances of the generative machine-learning model, the generative machine-learning model has been pre-trained on one or more general training data sets (Tomioka; the generative ML model has been pre-trained on one or more general training data sets before generating the instances of the generative ML model [¶ 0027 and ¶ 0059-0060]; moreover, trained in advance [¶ 0088]).
Regarding claim 14, Tomioka in view of Denney and Motiian further discloses the method of claim 1, further comprising:
after generating the instances of the generative machine-learning model, finetuning the instances of the generative machine-learning model using feedback data (Tomioka; finetuning the instances of the generative ML model using feedback data after generating the instances of the generative ML model [¶ 0152 and ¶ 0164]).
Regarding claim 15, Tomioka in view of Denney and Motiian further discloses the method of claim 14, wherein the feedback data is human-provided feedback (Tomioka; the feedback data is human-provided feedback [¶ 0152]).
Regarding claim 16, Tomioka in view of Denney and Motiian further discloses the method of claim 14, wherein finetuning the instances of the generative machine-learning model comprises:
performing reinforcement learning using the feedback data as a reward signal (Tomioka; finetuning the instances of the generative machine-learning model comprises: performing reinforcement learning using the feedback data as a reward signal [¶ 0051-0052]).
Allowable Subject Matter
Claims 2, 6-9, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited, for a listing of analogous art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Charles Lloyd Beard, whose telephone number is (571) 272-5735. The examiner can normally be reached Monday - Friday, 8:00 AM - 5:00 PM, alternate Fridays EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy Goddard, can be reached at (571) 272-7773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
CHARLES LLOYD BEARD
Primary Examiner
Art Unit 2611
/CHARLES L BEARD/Primary Examiner, Art Unit 2611