DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicants
Limitations appearing inside braces {} indicate limitations that are not taught by the cited prior art reference(s) or combination(s).
Claims 1-20 are pending in the application.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites “digital content” in line 4. Should this be “the digital content” if it is referring to the first instance in line 3?
Claim 1 recites “the plurality of subject matter examples of the digital content” in lines 6-7. The wording “of the digital content” is confusing. The claim previously recites “a plurality of subject matter examples of different subject matter exhibited by digital content” (lines 3-4), and reiterates “different” (line 7). How are “the plurality of subject matter examples” both “of the digital content” and “different” from the digital content? For the purpose of examination, the limitation is interpreted as:
synthesizing, by the processing device, stylized training data by transferring the style from the style example to the plurality of subject matter examples
Claim 5 recites “a second said style” in line 3. There is insufficient antecedent basis for this limitation in the claim. The antecedent basis of “said style” is “style exhibited by digital content” (claim 1, lines 2-3), where only one style is identified. For the purpose of examination, the limitation is interpreted as “a second style”, e.g., the original style of the subject matter example.
Claims 2-10 are rejected as they depend from a rejected claim.
Claims 12-15 are rejected as they depend from a rejected claim.
Three different terms are used to describe training data throughout the claims, as follows:
stylized training data, in claims 1, 6, 11, 13, 14, and 15;
stylized training content, in claims 5, 11, and 19;
synthetic training data, in claims 16 and 18.
It is unclear if and how these terms differ. For the purpose of examination, these three terms are interpreted throughout the claims as “data that is usable to train the style machine-learning model 124” (¶[0028]). Applicant is advised to use one term throughout the claims for proper antecedent basis and clarity.
Claims 17-20 are rejected as they depend from a rejected claim.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea (Step 2A Prong 1) without additional limitations that integrate the abstract idea into a practical application (Step 2A Prong 2) and without amounting to significantly more than the abstract idea (Step 2B).
Claim 1 recites a method, claim 11 recites a system, and claim 16 recites a storage medium. Thus, the claims are directed to statutory categories of invention (MPEP §2106.03). (Step 1: YES)
Step 2A Prong 1 evaluates whether the claim recites any judicial exception (MPEP §2106.04(a)). Claim 1 recites the following limitations:
[A] obtaining a style example exhibited by digital content and a plurality of subject matter examples;
[B] synthesizing training data;
[C] training a machine-learning model; and
[D] a processing device.
The USPTO has enumerated groupings of abstract ideas (see MPEP §2106.04(a)(2)), defined as:
I) Mathematical concepts – mathematical relationships, mathematical formulas or equations, mathematical calculations;
II) Certain methods of organizing human activity – fundamental economic principles or practices (including hedging, insurance, mitigating risk); commercial or legal interactions (including agreements in the form of contracts; legal obligations; advertising, marketing or sales activities or behaviors; business relations); managing personal behavior or relationships or interactions between people (including social activities, teaching, and following rules or instructions); and
III) Mental processes – concepts performed in the human mind, with or without a physical aid (e.g., pen and paper, or a computer). These include an observation, evaluation, judgment, or opinion.
The limitations [A] and [B] recite a mental process, which is recognized by the courts as an abstract idea (see MPEP §2106.04(a)(2)). The limitations of obtaining a style example and synthesizing training data could be performed in the human mind using a physical aid (i.e., pencil and paper). The courts do not distinguish between mental processes that are performed entirely in the human mind and mental processes that require a human to use a physical aid (e.g., pen and paper or a slide rule) to perform the claim (MPEP §2106.04(a)(2)). (Step 2A Prong 1: YES)
Step 2A Prong 2 analysis evaluates whether the claim recites additional elements that integrate the exception into a practical application of that exception according to MPEP §2106.04(d) by: 1) identifying additional elements recited in the claim beyond the judicial exception; and 2) if additional elements are identified, evaluating the additional elements both individually and in combination to determine whether the claim as a whole integrates the exception into a practical application.
Additional elements [C] training a machine-learning model and [D] a processing device amount to using a computer as a tool. According to the specification, an improvement is not made to the computer/processor: “An implementation of the described modules and techniques is stored on or transmitted across some form of computer-readable media. The computer-readable media includes a variety of media that is accessed by the computing device 702” (¶[0056]). An improvement cannot be to the judicial exception itself; merely “applying” the exception by automating an otherwise mentally/manually performed process does not confer an improvement, even if doing so requires the use of a computer as a tool (MPEP §2106.05(a)). Considered as a whole, the claim is not directed to an improvement to the functioning of a computer or to any other technology or technical field. (Step 2A Prong 2: NO)
Step 2B analysis evaluates whether the claim as a whole amounts to significantly more than the judicial exception. The additional elements, as stated in Prong 2, do not amount to significantly more than the judicial exception. The invention's elements improve the field only by enabling automation of an abstract idea; these elements in combination are not enough to qualify as “significantly more” (MPEP §2106.05(d)). (Step 2B: NO)
Claims 11 and 16 are similarly rejected as being analogous to claim 1.
Claims 2-10 and 17-20 add further details regarding digital images or training the machine-learning models without adding significantly more.
Claims 12-15 recite a system comprising modules and a system implemented by a processing device (i.e., a computer as a tool).
Claims 7 and 16 recite a display in a user interface (extra-solution activity) used for outputting a result (a signal).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-4, 7, 10-17, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Diesendruck et al., US 20240193911 A1, hereinafter “Diesendruck”.
Regarding claim 1, Diesendruck teaches a method comprising:
obtaining, by a processing device, a style example of style exhibited by digital content and a plurality of subject matter examples of different subject matter exhibited by digital content (Diesendruck, ¶[0051]; style matching system 106 receives the small set of images with a request to generate a large dataset of style-matching images having styles and content that match the initial or original small sample set of input images);
synthesizing, by the processing device, stylized training data by transferring the style from the style example to the plurality of subject matter examples of the digital content having the different subject matter using a machine-learning model (Diesendruck, ¶[0055]; act 206 of synthesizing new images from the expanded input image set; the style matching system 106 utilizes a generative machine-learning model and/or a deep learning model to generate new synthesized images that appear similar in style and content to input images); and
training, by the processing device, a style machine-learning model using the stylized training data to identify the style in a subsequent item of digital content (Diesendruck, ¶[0098]; FIG. 5 also includes the act 210 of training an image-based machine learning model using the style-matching dataset; ¶[0099]; FIG. 6 illustrates a series of acts for utilizing the style matching system 106; ¶[0100]; the act 610 may involve comparing an initial set of input images to multiple sets (e.g., large sets) of stored image datasets to determine the style distribution).
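For context only, the limitations mapped above describe a two-stage pipeline: a style-transfer model synthesizes labeled training data, and that data then trains a style classifier. A minimal Python/PyTorch sketch of such a pipeline follows; the names synthesize_training_set, train_style_classifier, and the transfer callable are hypothetical placeholders, not Diesendruck's system or Applicant's implementation.

    import torch
    from torch import nn

    # Hypothetical stand-ins: `transfer` may be any pretrained style-transfer
    # network; `model` may be any image classifier over style labels.
    def synthesize_training_set(style_examples, subject_examples, transfer):
        data = []
        for label, style_img in enumerate(style_examples):
            for content_img in subject_examples:
                # Transfer each style onto each subject matter example and
                # label the result with the style it now exhibits.
                data.append((transfer(content_img, style_img), label))
        return data

    def train_style_classifier(model, data, epochs=10, lr=1e-3):
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for img, label in data:
                opt.zero_grad()
                logits = model(img.unsqueeze(0))               # add a batch dimension
                loss = loss_fn(logits, torch.tensor([label]))
                loss.backward()
                opt.step()
        return model  # may then identify the style in a subsequent item of digital content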
Regarding claim 2, Diesendruck teaches the method as described in claim 1. Diesendruck further teaches wherein the digital content is a digital image and the style is a visual style depicted by the digital image (Diesendruck, ¶[0021]; input image dataset; ¶[0026]; “image style” (or “style” for short) refers to the look and feel of an image. Examples of styles include geometry, visual theme, topic, color palette, arrangement, feature sets, creation medium, characteristics, scale, resolution, perspective, capture type, spacing, object types, and/or other image styles that distinguish one set of images from another image set).
Regarding claim 3, Diesendruck teaches the method as described in claim 2. Diesendruck further teaches wherein the different subject matter corresponds to different objects that are depicted in the plurality of subject matter examples, one to another (Diesendruck, ¶[0027]; the term “image content” (or “content” for short) refers to the semantic meaning of what is depicted in an image. In many cases, the content of an image refers to the subject matter and/or objects in an image).
Regarding claim 4, Diesendruck teaches the method as described in claim 1. Diesendruck further teaches wherein the different subject matter corresponds, respectively, to different semantic content (Diesendruck, ¶[0027]; the term “image content” (or “content” for short) refers to the semantic meaning of what is depicted in an image. In many cases, the content of an image refers to the subject matter and/or objects in an image. Content can include foreground as well as background portions of an image).
Regarding claim 7, Diesendruck teaches the method as described in claim 1. Diesendruck further teaches the method further comprising: identifying the style in the subsequent item of digital content using the trained style machine-learning model (Diesendruck, ¶[0098]; FIG. 5 also includes the act 210 of training an image-based machine learning model using the style-matching dataset; ¶[0099]; FIG. 6 illustrates a series of acts for utilizing the style matching system 106; ¶[0100]; the act 610 may involve comparing an initial set of input images to multiple sets (e.g., large sets) of stored image datasets to determine the style distribution); and
outputting a result of the identifying for display in a user interface (Diesendruck, ¶[0121]; converting data 707 stored in the memory 703 into text, graphics, and/or moving images (as appropriate) shown on the display device 715).
Regarding claim 10, Diesendruck teaches the method as described in claim 1. Diesendruck further teaches wherein the synthesizing and the training are performed for a plurality of said styles (Diesendruck, ¶[0023]; the style matching system utilizes the original input image set to conditionally sample synthesized images from within the style-mixed (i.e., plurality of said styles) embedding space (i.e., synthesizing), which results in a larger dataset that is often indistinguishable from the original input image set (i.e., for training)).
Regarding claim 11, Diesendruck teaches a training data generation system comprising: a style selection module implemented by a processing device (Diesendruck, FIG. 1; computing device 108) to obtain a style example of style exhibited by digital content (Diesendruck, ¶[0051]; style matching system 106 receives the small set of images with a request to generate a large dataset of style-matching images having styles and content that match the initial or original small sample set of input images);
a content selection module implemented by the processing device to obtain a plurality of subject matter examples of different subject matter exhibited by digital content (Diesendruck, ¶[0053]; the style matching system 106 determines how similar each image style set is to the small set of input images. Then, based on the style similarity values, the style matching system 106 generates a style distribution and selects a proportional number of sample images from each image style set (i.e., different images) to generate an extended input image set); and
a style transfer system implemented by the processing device to synthesize stylized training data configured to train a machine-learning model to identify the style, the stylized training content synthesized by transferring the style from the style example to the plurality of subject matter examples of the digital content having the different subject matter using a machine-learning model (Diesendruck, ¶[0055]; act 206 of synthesizing new images from the expanded input image set; the style matching system 106 utilizes a generative machine-learning model and/or a deep learning model to generate new synthesized images that appear similar in style and content to input images).
Claim 12 is similarly analyzed as analogous to claim 2.
Claim 13 is similarly analyzed as analogous to claim 6.
Regarding claim 14, Diesendruck teaches the training data generation system as described in claim 11. Diesendruck further teaches wherein the style transfer system is configured to use a plurality of said styles (Diesendruck, ¶[0053]; the style matching system 106 generates a style distribution and selects a proportional number of sample images from each image style set (i.e., different images) to generate an extended input image set; ¶[0082]; style-mixed images from a catalog of image styles)
and the machine-learning model is configured to identify the plurality of said styles based on the stylized training data (Diesendruck, ¶[0015]; the style matching system identifies a catalog of stored image sets having different styles (e.g., a style catalog of image sets); ¶[0100]; the style matching system compares the initial input images to the stored image sets to determine a style distribution of distances between the initial input images and the sets of stored images).
Regarding claim 15, Diesendruck teaches the training data generation system as described in claim 11. Diesendruck further teaches the system further comprising a machine-learning training module configured to train the machine-learning model using the stylized training data to identify the style in a subsequent item of digital content (Diesendruck, ¶[0098]; FIG. 5 also includes the act 210 of training an image-based machine learning model using the style-matching dataset; ¶[0099]; FIG. 6 illustrates a series of acts for utilizing the style matching system 106; ¶[0100]; the act 610 may involve comparing an initial set of input images to multiple sets (e.g., large sets) of stored image datasets to determine the style distribution).
Regarding claim 16, Diesendruck teaches one or more computer-readable storage media storing instructions that, responsive to execution by a processing device, cause the processing device to perform operations (Diesendruck, ¶[0113]; computer-executable instructions or data structures can be transferred automatically from transmission media to non-transitory computer-readable storage media) comprising:
receiving an item of digital content (Diesendruck, ¶[0119]; Executing the instructions 705 may involve the use of the data 707 that is stored in the memory 703);
identifying, by a machine-learning model, a style exhibited by the item of digital content, the machine-learning model trained using synthetic training data generated by transferring the style from a style example of digital content to a plurality of subject matter examples of digital content having different subject matter, one to another (Diesendruck, ¶[0098]; FIG. 5 also includes the act 210 of training an image-based machine learning model using the style-matching dataset; ¶[0099]; FIG. 6 illustrates a series of acts for utilizing the style matching system 106; ¶[0100]; the act 610 may involve comparing an initial set of input images to multiple sets (e.g., large sets) of stored image datasets to determine the style distribution; ¶[0053]; the style matching system 106 generates a style distribution and selects a proportional number of sample images from each image style set (i.e., different images) to generate an extended input image set); and
outputting a result of the identifying for display in a user interface (Diesendruck, ¶[0121]; converting data 707 stored in the memory 703 into text, graphics, and/or moving images (as appropriate) shown on the display device 715).
Claim 17 is similarly analyzed as analogous to claims 2 and 3.
Regarding claim 20, Diesendruck teaches the one or more computer-readable storage media as described in claim 16. Diesendruck further teaches wherein the identifying of the style includes identifying the style from a plurality of said styles using the machine-learning model (Diesendruck, ¶[0098], FIG. 5 includes the act 210 of training an image-based machine learning model using the style-matching dataset; the style matching system 106 utilizes the style-matching image set and/or their corresponding customized pre-trained embeddings to improve the functions and operations of other image-based machine-learning models; ¶[0100]; the series of acts 600 includes an act 610 of comparing input images to sets of stored images to determine a style distribution; the act 610 may involve comparing an initial set of input images to multiple sets (e.g., large sets) of stored image datasets to determine the style distribution between the initial set of input images and the multiple large sets of stored images; where a plurality of said styles is referred to in ¶[0082]; style-mixed images from a catalog of image styles).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 5, 6, 9, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Diesendruck in view of Ham (Ham, Cusuh, et al., “CoGS: Controllable Generation and Search from Sketch and Style”, Cornell University arXiv, arXiv.org [retrieved 2023-06-29], retrieved from the Internet <https://arxiv.org/pdf/2203.09554.pdf>, 07/20/2022), as cited in the IDS (02/27/2024).
Regarding claim 5, Diesendruck teaches the method as described in claim 1. Diesendruck does not explicitly disclose wherein the training includes use of a first said item of stylized training content having the style and exhibiting first said subject matter as positive training data and a second said item having a second said style and exhibiting the first said subject matter as negative training data.
However, Ham, in a similar field of endeavor of synthesizing stylized training data sets, teaches wherein the training includes use of a first said item of stylized training content having the style and exhibiting first said subject matter as positive training data and a second said item having a second said style and exhibiting the first said subject matter as negative training data (Ham, [p 7, §3.4, ¶3]; contrastive learning; positive j(i) is an image generated by the CoGS transformer using the same sketch and different style images as i, while the remaining 2(N-1) elements are negatives generated using different sketches).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include positive and negative training data, as taught by Ham, in the invention of Diesendruck. The motivation to do so would be to map embeddings of structurally similar images close together in latent space and embeddings of dissimilar images far apart, which encourages a smooth latent space W that can both reconstruct an input image and synthesize novel outputs.
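For illustration of the positive/negative arrangement mapped above, the following is a minimal sketch assuming an arbitrary embedding network; a triplet margin loss is used here as a simple stand-in for Ham's batched contrastive objective, and all names are hypothetical.

    import torch.nn.functional as F

    def style_triplet_loss(encoder, anchor, positive, negative, margin=0.2):
        # `anchor` and `positive` exhibit the target style (positive training
        # data); `negative` shows the same subject matter rendered in a second
        # style (negative training data), per the claim 5 interpretation above.
        a = F.normalize(encoder(anchor), dim=-1)
        p = F.normalize(encoder(positive), dim=-1)
        n = F.normalize(encoder(negative), dim=-1)
        return F.triplet_margin_loss(a, p, n, margin=margin)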
Regarding claim 6, Diesendruck teaches the method as described in claim 1. Diesendruck does not explicitly teach wherein the synthesizing the stylized training data is performed using a neural style transfer machine-learning model configured using an encoder-decoder architecture.
However, Ham teaches wherein the synthesizing the stylized training data is performed using a neural style transfer machine-learning model configured using an encoder-decoder architecture (Ham, [p 2, §1, ¶2]; synthesize images via a VQGAN decoder, driven by discrete codebook representations generated via a transformer-based sketch and style encoder).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include an encoder-decoder architecture, as taught by Ham, in the invention of Diesendruck. The motivation to do so would be to obtain style embeddings and generate an image that is semantically and stylistically similar to y but does not necessarily share the same fine-grained details of the object (e.g., pose, orientation, scale).
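As a generic illustration of an encoder-decoder style-transfer architecture, the sketch below applies adaptive instance normalization (AdaIN) between a shared encoder and a decoder. This is an assumed, simplified stand-in, not Ham's transformer/VQGAN model.

    import torch
    from torch import nn

    class EncoderDecoderStyleTransfer(nn.Module):
        """Generic sketch: encode content and style, align feature statistics, decode."""
        def __init__(self, encoder: nn.Module, decoder: nn.Module):
            super().__init__()
            self.encoder, self.decoder = encoder, decoder

        @staticmethod
        def adain(content_feat, style_feat, eps=1e-5):
            # Adaptive instance normalization: give the content features the
            # per-channel mean/std of the style features.
            c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
            c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
            s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
            s_std = style_feat.std(dim=(2, 3), keepdim=True)
            return s_std * (content_feat - c_mean) / c_std + s_mean

        def forward(self, content_img, style_img):
            f_c, f_s = self.encoder(content_img), self.encoder(style_img)
            return self.decoder(self.adain(f_c, f_s))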
Regarding claim 9, Diesendruck teaches the method as described in claim 1. Diesendruck does not explicitly disclose wherein the training is performed using contrastive losses to drive a learning signal.
However, Ham further teaches wherein the training is performed using contrastive losses to drive a learning signal (Ham [p 7, §3.4, ¶3]; contrastive learning).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include the use of contrastive loss, as taught by Ham, in the invention of Diesendruck. The motivation to do so would be to map embeddings of structurally similar images close together in latent space and embeddings of dissimilar images far apart, which encourages a smooth latent space W that can both reconstruct an input image and synthesize novel outputs.
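For reference, a minimal NT-Xent-style contrastive loss of the kind Ham describes is sketched below; the batch layout (z1[i] and z2[i] as a positive pair) and the temperature value are illustrative assumptions, not taken from the paper.

    import torch
    import torch.nn.functional as F

    def nt_xent_loss(z1, z2, temperature=0.1):
        # z1[i] and z2[i] embed a positive pair (e.g., images generated from the
        # same sketch with different styles); every other row is a negative.
        n = z1.size(0)
        z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d)
        sim = z @ z.t() / temperature                        # scaled cosine similarities
        mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
        sim = sim.masked_fill(mask, float('-inf'))           # exclude self-similarity
        # The positive for row i is row i+N (and vice versa).
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
        return F.cross_entropy(sim, targets)                 # contrastive loss drives the learning signal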
Claim 18 is similarly analyzed as analogous to claim 6.
Claim 19 is similarly analyzed as analogous to claim 5.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Diesendruck in view of Johnson (Johnson, Justin, et al., “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, Cornell University arXiv, arXiv.org [retrieved 2023-06-28], retrieved from the Internet <https://arxiv.org/pdf/1603.08155.pdf>, 03/27/2016), as cited in the IDS (02/27/2024).
Regarding claim 8, Diesendruck teaches the method of claim 1. Diesendruck further teaches wherein the synthesizing and the training are performed {in real time} (Diesendruck, ¶[0055]; act 206 of synthesizing new images from the expanded input image set; and ¶[0098]; FIG. 5 also includes the act 210 of training an image-based machine learning model using the style-matching dataset). Diesendruck does not explicitly teach performing these steps in real time.
However, Johnson, in a similar field of endeavor of style transfer, teaches operating in real time (Johnson, [p 2, §1, ¶3]; the transformation networks run in real-time).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to include real-time image processing, as taught by Johnson, in the invention of Diesendruck. The motivation to do so would be to generate high-quality images by designing and optimizing perceptual loss functions.
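For illustration, a perceptual loss of the type Johnson describes compares images in a fixed pretrained network's feature space rather than in pixel space. The sketch below assumes a recent torchvision weight API and an arbitrary choice of VGG-16 layer (relu2_2); inputs are assumed to be ImageNet-normalized tensors.

    import torch
    import torch.nn.functional as F
    from torchvision.models import vgg16

    class PerceptualLoss(torch.nn.Module):
        def __init__(self, layers=9):  # features[:9] ends at relu2_2 (assumed choice)
            super().__init__()
            # Assumes torchvision >= 0.13, where string weight names are accepted.
            self.features = vgg16(weights='IMAGENET1K_V1').features[:layers].eval()
            for p in self.features.parameters():
                p.requires_grad_(False)  # the loss network stays fixed

        def forward(self, generated, target):
            # Distance between feature activations, not raw pixels.
            return F.mse_loss(self.features(generated), self.features(target))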
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Weidi et al., US 20250217935 A1, teaches generating a training data set by obtaining watermark style information, constructing a neural network model, and calling the training data set to train the neural network model for watermark restoration.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHANDHANA PEDAPATI whose telephone number is 571-272-5325. The examiner can normally be reached M-F 8:30am-6pm (ET).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached at 571-272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CHANDHANA PEDAPATI/
Examiner, Art Unit 2669

/CHAN S PARK/
Supervisory Patent Examiner, Art Unit 2669