DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1-21, all the claims pending in the application, are rejected.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 5-7 and 11-21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 5 recites “an identifier that includes a name of the original video” and later recites “and identifier of the original video”. It is unclear whether the “identifier” recited in these two limitations refers to the same element. If so, the Examiner recommends amending the second limitation to recite “the identifier”; otherwise, the Examiner recommends amending the claim to include distinguishing modifiers (e.g., “first identifier” and “second identifier”).
Claims 6-7 are rejected by virtue of their dependency on claim 5.
Also, claim 6 recites “downloading the original image from memory”. There is insufficient antecedent basis for “the original image” in the claims. Because the claim recites “the original video” several times, “the original image” will be interpreted as “the original video”.
Claim 7 is further rejected by virtue of its dependency on claim 6.
Claim 11 recites a parenthetical expression “the scene type (string)”. It is unclear whether “string” is intended to further limit the “scene type” or whether it is merely exemplary language. When considering the claim on its merits, the parenthetical expression will be ignored.
Also, claim 11 recites “the video identifier, the video number”. Although claim 8, from which claim 11 depends, recites “an image identifier, an image number”, there is insufficient antecedent basis for these limitations in the claims. The above limitations of claim 11 will be interpreted as “the image identifier, the image number”.
Claims 12-20 are rejected by virtue of their dependency on claim 11.
Also, claim 17 includes a parenthetical expression: “memory (cloud, database)”. It is unclear whether “cloud” and/or “database” are intended to further limit the “memory” or whether these terms are merely exemplary language. When considering the claim on its merits, the parenthetical expression will be ignored.
Claims 18-20 are further rejected by virtue of their dependency on claim 17.
Independent claim 21 is directed to a “system…, the system comprising memory storing program code” and a plurality of functions. It is unclear how a system can comprise functions. Though the claim requires that the functions be performed “via an input device” or “via a processor”, it is also unclear whether the “input device” and the “processor” are part of the system. When considering the claim on its merits, the processor and the input device will be considered part of the claimed system.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-3 and 21 are rejected under 35 U.S.C. 102(a)(1) and 102(a)(2) as being anticipated by U.S. Patent Application Publication No. 2020/0090001 to Zargahi et al. (hereinafter “Zargahi”).
As to independent claim 1, Zargahi discloses a method for generating customized imagery, the method comprising: storing, in memory of a computing device, program code for generating an application programming interface (API) that communicates with plural disparate image processing tools; executing, by a computing device, the program code for generating the API; the API causing the computing device to perform operations (Abstract, [0036, 0053, 0094-0098, 0143] disclose that Zargahi is directed to “implementing a distributed computing system that provides synthetic data as a service (‘SDaaS’)”, wherein SDaaS is a “cloud” computing system “that allows customers to configure, generate, access, manage, and process synthetic data training datasets for machine learning” and “can include an API library” which “may support the interaction between hardware architecture of the device and the software framework” of the SDaaS; Figs. 6-7 show an example environment 600/700 in which requests from a client 705 are received and managed by a scheduler 630/730 of the SDaaS which “can select tasks from the queue for routing or assignment for processing on CPUs 640 and GPUs 650”, wherein the disparate CPUs 640/740 and GPUs 650/750 “render images” and perform “image processing”) that include: establishing communication with each image processing tool ([0092] discloses that a management layer of the SDaaS “orchestrates a resource allocation between CPUs 640 and GPUs 650 to process synthetic data tasks” which presupposes communication has been established); receiving input parameters that define operations to be performed by one of the plural disparate image processing tools in generating a customized image, and define attributes of the customized imagery to be generated ([0006, 0058, 0083] discloses that “a client may submit a synthetic data request to create 50 k synthetic data assets (e.g., images) from a selected source asset (e.g., a 3D model)”, wherein “training datasets are generated by altering intrinsic parameters and extrinsic parameters of synthetic data assets (e.g., 3D models) or scenes (e.g., 3D scenes and videos)...based on intrinsic-parameter variation (e.g., asset shape, position and material properties) and extrinsic-parameter variation (e.g., variation of environmental and/or scene parameters such as lighting, perspective)”); generating parameterized calls based on the input parameters, the parameterized calls providing instructions for the one image processing tool configured to generate the customized image ([0061] discloses “real-time calls to APIs or services to generate relevant training datasets having [the] identified parameters”); sending parameterized calls to the one image processing tool; generating the customized imagery based on the input parameters; receiving the customized imagery from the one image processing tool ([0091-0105] discloses that client requests to generate synthetic training datasets are managed by a scheduler 630/730 of the SDaaS which “can select tasks from the queue for routing or assignment for processing on CPUs 640 and GPUs 650” which generate and return synthetic data assets according to “the asset-variation parameters”); and storing the received customized imagery in a database as training data for an artificial intelligence model ([0071-0072, 0104-0109, 0120-0121] discloses that the generated synthetic assets and framesets are stored in SDaaS store 126 for “machine learning training”).
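For illustration of the mapping above (the sketch below and every name in it are hypothetical and appear in neither the claims nor Zargahi), the claimed flow of receiving input parameters and generating a parameterized call for one of plural disparate image processing tools might take a form such as:

```python
# Illustrative sketch only; all names are hypothetical and appear in
# neither the claims nor Zargahi.
from dataclasses import dataclass
from typing import Dict


@dataclass
class InputParameters:
    """Parameters defining the operation to be performed and the
    attributes of the customized imagery to be generated (cf. claim 1)."""
    tool: str               # which of the plural disparate tools to use
    operation: str          # e.g., "style_transfer", "synthetic_generation"
    attributes: Dict[str, str]  # e.g., {"lighting": "dusk"}


def generate_parameterized_call(params: InputParameters) -> dict:
    """Translate the received input parameters into instructions for
    the one image processing tool (the 'parameterized call')."""
    return {
        "target_tool": params.tool,
        "instruction": params.operation,
        "arguments": params.attributes,
    }


call = generate_parameterized_call(
    InputParameters(tool="renderer_a",
                    operation="synthetic_generation",
                    attributes={"lighting": "dusk", "perspective": "aerial"}))
# The call would then be sent to the selected tool, and the returned
# customized imagery stored in a database as training data.
```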
As to claim 2, Zargahi further discloses that the operations involve an image transformation and the input parameters further identify an original video and a type of transformation to be performed ([0006, 0085] discloses that the machine learning scenario can include “video surveillance”, that a relevant set of variation parameters (e.g., asset-variation parameters, scene-variation parameters, etc.) is specified for each machine learning scenario in order to produce the frameset package, and further that “training datasets are generated by altering intrinsic parameters and extrinsic parameters of synthetic data assets (e.g., 3D models) or scenes (e.g., 3D scenes and videos)...based on intrinsic-parameter variation (e.g., asset shape, position and material properties) and extrinsic-parameter variation (e.g., variation of environmental and/or scene parameters such as lighting, perspective)”).
As to claim 3, Zargahi further discloses that the input parameters further identify one or more attributes of the original video that are to be modified to generate the customized imagery ([0006, 0085] discloses that the machine learning scenario can include “video surveillance”, that a relevant set of variation parameters (e.g., asset-variation parameters, scene-variation parameters, etc.) is specified for each machine learning scenario in order to produce the frameset package, and further that “training datasets are generated by altering intrinsic parameters and extrinsic parameters of synthetic data assets (e.g., 3D models) or scenes (e.g., 3D scenes and videos)...based on intrinsic-parameter variation (e.g., asset shape, position and material properties) and extrinsic-parameter variation (e.g., variation of environmental and/or scene parameters such as lighting, perspective)”).
Independent claim 21, as best understood, recites a system for generating customized imagery, the system comprising: memory storing program code for generating an application programming interface (API) that communicates with plural disparate image processing tools (Abstract, [0036, 0053, 0094-0098, 0143] disclose that Zargahi is directed to “implementing a distributed computing system that provides synthetic data as a service (‘SDaaS’)”, wherein SDaaS is a “cloud” computing system “that allows customers to configure, generate, access, manage, and process synthetic data training datasets for machine learning” and “can include an API library” which “may support the interaction between hardware architecture of the device and the software framework” of the SDaaS; Figs. 6-7 show an example environment 600/700 in which requests from a client 705 are received and managed by a scheduler 630/730 of the SDaaS which “can select tasks from the queue for routing or assignment for processing on CPUs 640 and GPUs 650”, wherein the disparate CPUs 640/740 and GPUs 650/750 “render images” and perform “image processing”) and further recites functions corresponding to the steps recited in independent claim 1. Accordingly, claim 21 is rejected for reasons analogous to those discussed above in conjunction with claim 1.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Zargahi in view of U.S. Patent Application Publication No. 2021/0004608 to Jaipuria et al. (hereinafter “Jaipuria”).
As to claim 4, Zargahi does not expressly disclose that the attributes include one or more style transfers to be performed on the original video.
Jaipuria, like Zargahi, is directed to generating synthetic videos used to train a deep neural network (Abstract and [0028-0031]). Jaipuria discloses a generator that may input a real video image from a first domain, such as sunny, and output video images from a user-selected other domain, such as winter, rain, night, etc. ([0029-0037]). Notably, the specification of the subject application provides examples of “a style transfer” as a “change in scenery based on weather, season, time of day, etc.” ([0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zargahi to generate the synthetic video as a style transfer specified by a user and performed on an original video, as taught by Jaipuria, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have “improve[d] training the DNN” by virtue of training the model to be robust to “different environmental conditions”, as taught by Jaipuria ([0010]).
Claims 5-10 are rejected under 35 U.S.C. 103 as being unpatentable over Zargahi in view of U.S. Patent Application Publication No. 2021/0227276 to Mayol Cuevas et al. (hereinafter “Mayol Cuevas”).
As to claim 5, Zargahi does not expressly disclose that the step of sending parameterized calls further comprises: generating an instruction message having: an identifier that includes a name of the original video and a timestamp; and a body that includes the name of original video, and identifier of the original video, and the one or more attributes of the original video to be transformed.
Mayol Cuevas, like Zargahi, is directed to a “video enhancement service” which manages user requests for the video enhancement using an “application programming interface (API) call” (Abstract and [0013-0014, 0018]). Mayol Cuevas discloses example API calls 601, 611, 701, and 711 in Figs. 6-7, which show that a call may include the video stream name, start and end timestamps, fragment IDs of the video, and the type, speed, and quality of the enhancement ([0062-0071]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zargahi to include in the API call a request message including the name of the video and timestamps, the name and identifier of the video, and attributes to be transformed, as taught by Mayol Cuevas, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have clearly communicated the parameters of the requested video enhancement, thus improving the likelihood of satisfying the user’s desired enhancement.
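As a purely illustrative sketch (the field names below are hypothetical and are drawn from neither the claim nor Mayol Cuevas), an instruction message of the kind recited in claim 5 might take a form such as:

```python
# Illustrative sketch only; field names are hypothetical.
from datetime import datetime, timezone


def build_instruction_message(video_name: str, video_id: str,
                              attributes: list) -> dict:
    """Instruction message of the kind recited in claim 5: an identifier
    built from the video name and a timestamp, and a body carrying the
    name, the identifier of the original video, and the one or more
    attributes to be transformed."""
    timestamp = datetime.now(timezone.utc).isoformat()
    return {
        "identifier": f"{video_name}-{timestamp}",
        "body": {
            "name": video_name,
            "video_id": video_id,
            "attributes": attributes,
        },
    }


msg = build_instruction_message("traffic_cam_03", "vid-0001",
                                ["weather:rain", "time_of_day:night"])
```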
As to claim 6, Zargahi further discloses that the computing device performs further operations including: loading the instruction message into a job queue of the one image processing tool; and deleting the instruction message from the job queue ([0095] and Figs. 6-7 show an example environment 600/700 in which requests from a client 705 are received and managed by a scheduler 630/730 of the SDaaS which “can select tasks from the queue for routing or assignment for processing on CPUs 640 and GPUs 650”, wherein the queues include a CPU queue, a GPU queue, and a hybrid queue, and wherein a request in the queue must be loaded in order to be performed and must be deleted in order to move on to the next request in the queue).
Zargahi does not expressly disclose extracting at least the name of the original video and the one or more attributes to be transformed; downloading the original image from memory; executing the one image processing tool to customize the original video using the one or more attributes; and uploading the customized image to memory.
Mayol Cuevas, like Zargahi, is directed to a “video enhancement service” which manages user requests for the video enhancement using an “application programming interface (API) call” (Abstract and [0013-0014, 0018]). Mayol Cuevas discloses a process flow in which a video requesting device 131 issues an API call like the one discussed above in conjunction with claim 5 and shown in Figs. 6-7, which includes the name of the video and the attributes to be transformed; a provider network 100 receives the request, retrieves the video from memory, and performs the enhancement identified in the call using the appropriate ML enhancement model; and the enhanced video is then uploaded to the video requesting device 131 (Figs. 1 and 4 and [0023-0027, 0041-0052]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zargahi to retrieve the video to be enhanced based on the name and identifying information in the API call, select an appropriate model/tool for performing the enhancement, and upload the enhanced video back to the requesting user device, as taught by Mayol Cuevas, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have “allow[ed] for an API for a video to also request that the lower quality video be enhanced” ([0014] of Mayol Cuevas).
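As a purely illustrative sketch (the queue, storage, and tool interfaces below are hypothetical stand-ins for those described in the references), the claim 6 operations mapped above might be expressed as:

```python
# Illustrative sketch only; the storage and tool interfaces are
# hypothetical stand-ins for those described in the references.
from queue import Queue


def process_job_queue(job_queue: Queue, storage: dict, tool) -> None:
    """Claim 6 as mapped above: load the instruction message, extract
    the video name and attributes, download the original video, run the
    one image processing tool, upload the result, and delete the
    message from the queue."""
    while not job_queue.empty():
        message = job_queue.get()                    # load the message from the job queue
        name = message["body"]["name"]               # extract the video name
        attributes = message["body"]["attributes"]   # extract attributes to transform
        original = storage[name]                     # download the original from memory
        customized = tool(original, attributes)      # execute the image processing tool
        storage[name + "_customized"] = customized   # upload the customized result
        job_queue.task_done()                        # mark the dequeued message complete;
                                                     # it is no longer in the queue
```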
As to claim 7, Zargahi further discloses that customizing the original video includes: dividing the original video into plural frames; transforming each frame according to the one or more attributes; and compiling the plural transformed frames into the video format ([0077] of Zargahi discloses that the set of images of a scene are altered on a frame-by-frame basis, which requires first dividing the video scene into frames and recompiling them into “the synthetic data scene frameset” after altering them according to “the scene-variation parameters”).
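As a purely illustrative sketch (the video representation and transform function below are hypothetical), the frame-wise customization recited in claim 7 might be expressed as:

```python
# Illustrative sketch only; transform() is a hypothetical stand-in for
# the frame-level alteration described in Zargahi at [0077].
def customize_video(video: dict, attributes: dict, transform) -> dict:
    """Claim 7 as mapped above: divide the original video into plural
    frames, transform each frame according to the one or more
    attributes, and compile the transformed frames into video format."""
    frames = video["frames"]                               # divide into plural frames
    altered = [transform(f, attributes) for f in frames]   # transform each frame
    return {**video, "frames": altered}                    # compile back into video form
```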
As to claim 8, Zargahi further discloses that the operation involves synthetic image generation ([0058]: “a client may submit a synthetic data request to create 50 k synthetic data assets (e.g., images) from a selected source asset (e.g., a 3D model)”) and the input parameters include a scene identifier, and an object identifier ([0006, 0085] discloses that a relevant set of variation parameters (e.g., asset-variation parameters, scene-variation parameters, etc.) is specified for each machine learning scenario in order to produce the frameset package, and further that “training datasets are generated by altering intrinsic parameters and extrinsic parameters of synthetic data assets (e.g., 3D models) or scenes (e.g., 3D scenes and videos)...based on intrinsic-parameter variation (e.g., asset shape, position and material properties) and extrinsic-parameter variation (e.g., variation of environmental and/or scene parameters such as lighting, perspective)”, wherein the scene and the asset (object) must be identified in order to be altered). Zargahi does not expressly disclose that the input parameters include an image identifier, an image number, and a scene type.
Mayol Cuevas, like Zargahi, is directed to a “video enhancement service” which manages user requests for the video enhancement using an “application programming interface (API) call” (Abstract and [0013-0014, 0018]). Mayol Cuevas discloses that the enhancement is made based on start and end timestamps that identify video frames/images, and “scene type” ([0057-0071]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zargahi to further perform the video alteration based on identified frames and scene type, as taught by Mayol Cuevas, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have allowed for enhancements particular to various scene types, thus providing more variety to a user making the request for video/image enhancement.
As to claim 9, Zargahi further discloses that the attributes include at least one of: camera angle, altitude, weather parameters, lighting parameters, an object type, and a number of images to capture ([0006, 0036] discloses that the extrinsic-parameter variation may include lighting and perspective, which correspond to the claimed lighting parameters and camera angle).
As to claim 10, Zargahi as modified above further teaches that the step of sending parameterized calls further comprises: generating a request message that includes at least the input parameters ([0062-0071] of Mayol Cuevas discloses example API calls 601, 611, 701, and 711 in Figs. 6-7 which show that a call may include the input parameters; the reasons for combining the references are the same as those discussed above in conjunction with claim 8).
Claims 11-15 are rejected under 35 U.S.C. 103 as being unpatentable over Zargahi in view of Mayol Cuevas and further in view of U.S. Patent Application Publication No. 2023/0107397 to Umakandan et al. (hereinafter “Umakandan”).
As to claim 11, Zargahi as modified above does not expressly disclose verifying that the one or more input parameters in the request message include at least the video identifier, the video number, the scene identifier, and the scene type (string). However, Umakandan discloses “verifying of input parameters included in the API call” ([0042, 0048]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the proposed combination of Zargahi and Mayol Cuevas to verify that the API call includes particular input parameters (such as the claimed input parameters taught by Zargahi and Mayol Cuevas, as discussed above in conjunction with claim 8), as taught by Umakandan, to arrive at the claimed invention discussed above. Such a modification is the result of combining prior art elements according to known methods to yield predictable results. It is predictable that the proposed modification would have ensured compliance of the API call, as an API call with missing parameters may “return an error” ([0043] of Umakandan).
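As a purely illustrative sketch (the parameter names below are hypothetical; the error-on-missing-parameter behavior loosely follows Umakandan at [0042-0043]), the claimed verification might be expressed as:

```python
# Illustrative sketch only; parameter names are hypothetical and
# modeled loosely on the verification described in Umakandan.
REQUIRED_PARAMETERS = ("image_identifier", "image_number",
                       "scene_identifier", "scene_type")


def verify_input_parameters(request: dict) -> list:
    """Return an error for each required parameter missing from the
    request message; an empty list means the call may proceed."""
    return [f"missing required parameter: {p}"
            for p in REQUIRED_PARAMETERS if p not in request]


errors = verify_input_parameters({"scene_identifier": "scene-7"})
# errors -> ["missing required parameter: image_identifier", ...]
```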
As to claim 12, Zargahi as modified above further teaches that when the one or more input parameters are verified, the method comprises: generating an instruction message in the job queue of the one image processing tool that generates synthetic images ([0042-0043] of Umakandan discloses that the API call returns an error if the input parameters are missing, but proceeds with the request otherwise; [0095] of Zargahi discloses a scheduler 630/730 of the SDaaS which “can select tasks from the queue for routing or assignment for processing on CPUs 640 and GPUs 650”, wherein the queues include a CPU queue, a GPU queue, and a hybrid queue; the reasons for combining the references are the same as those discussed above in conjunction with claim 11).
As to claim 13, Zargahi as modified above further teaches that the instruction message includes: an identifier that contains the scene identifier, the image identifier, and a timestamp; and a body that contains the scene identifier and the scene type ([0057-0071] and Figs. 6-7 of Mayol Cuevas disclose that the API call includes a fragment (scene) ID, start and end timestamps (each of which identifies a video frame), and scene type; the reasons for combining the references are the same as those discussed above in conjunction with claim 8).
As to claim 14, Zargahi as modified above further teaches extracting the scene identifier and the scene type from the instruction message ([0026-0027] of Mayol Cuevas discloses that the video enhancement service enhances the video per the request according to the parameters in the API call which presupposes extracting the parameters; the reasons for combining the references are the same as those discussed above in conjunction with claim 8).
As to claim 15, Zargahi as modified above further teaches executing the one image processing tool; and generating the synthetic video in the one image processing tool based on at least the scene identifier and the scene type ([0006, 0085-0097] of Zargahi discloses that a relevant set of variation parameters (e.g., asset-variation parameters, scene-variation parameters, etc.) is specified for each machine learning scenario in order to produce the frameset package and further that “training datasets are generated by altering intrinsic parameters and extrinsic parameters of synthetic data assets (e.g., 3D models) or scenes (e.g., 3D scenes and videos)...based on intrinsic-parameter variation (e.g., asset shape, position and material properties) and extrinsic-parameter variation (e.g., variation of environmental and/or scene parameters such as lighting, perspective)”, wherein the scene must be identified in order to be altered by the CPU or GPU (image processing tool); [0057-0071] and Figs. 6-7 of Mayol Cuevas discloses enhancing the video scene according to the scene type in the request; the reasons for combining the references are the same as those discussed above in conjunction with claim 8).
Allowable Subject Matter
Claims 16-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the above rejections under 35 U.S.C. 112 are overcome.
Pertinent Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Dirac (U.S. Patent No. 10,474,926) is directed to a service employing image processing tools for modifying training data for particular model training. This is pertinent to the independent claims.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEAN M CONNER whose telephone number is (571)272-1486. The examiner can normally be reached 10 AM - 6 PM Monday through Friday, and some Saturday afternoons.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Greg Morse can be reached at (571) 272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SEAN M CONNER/Primary Examiner, Art Unit 2663