DETAILED ACTION
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1, 5, 8, 12, 15 and 18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Independent claim 1, and similarly claims 8 and 15, recite (i) generating automatically, by applying the learned correspondences of the at least one trained neural network, a first plurality of pixels that is in accordance with one or more contexts that are associated with the imaginative scenario, and (ii) generating automatically, by applying the learned correspondences of the at least one trained neural network, a second plurality of pixels that is in accordance with the representation of the first plurality of pixels. The specification discloses neural networks in the context of identifying objects, inferring semantic chains, and learning correspondences between pixel patterns and syntactical elements (¶33, 38, 114, 156, and 203). However, the specification does not describe a neural network configured to generate pixel data. Rather, the disclosure consistently describes selecting, matching, and compositing existing images based on semantic associations (¶194-195), including substituting or superimposing portions of images to form an “imaginative image”.
The specification therefore does not reasonably convey possession of a neural network that generates pixel values or video frames based on learned correspondences, as required by the claim. Additionally, while the specification discusses contextual weights (e.g. W2-type weights), such weights are described at a conceptual level and are not disclosed as being applied to a neural network to generate pixel data. There is no description of how such contextual information is used to condition or control pixel generation.
Further, the claims require generating a second plurality of pixels “in accordance with” the first plurality of pixels, which implies a dependency or sequential relationship between generated pixel sets (e.g. temporal progression of frames). The specification does not describe generating pixel data conditioned on previously generated pixel data, nor does it disclose any mechanism for sequential or temporally dependent generation of video frames.
Accordingly, the specification fails to demonstrate possession of the claimed invention as a whole, including the integration of neural network-based generation, context-based conditioning, and sequential pixel generation.
With respect to dependent claims 5, 12, and 18, which recite generating pixels by applying a plurality of probabilities, the specification discloses W1, W2, and W3 weights as conceptual probabilities associated with semantic chains. However, the specification does not describe how such probabilities are mathematically computed, how they are applied to pixel-level generation, or how they are integrated into a neural network to produce pixel outputs. As such, the specification does not reasonably convey possession of generating pixel data based on application of probabilities as claimed.
Claims 1, 5, 8, 12, 15 and 18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the enablement requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.
The claims require (i) generating automatically, by applying the learned correspondences of the at least one trained neural network, a first plurality of pixels that is in accordance with one or more contexts that are associated with the imaginative scenario, and (ii) generating automatically, by applying the learned correspondences of the at least one trained neural network, a second plurality of pixels that is in accordance with the representation of the first plurality of pixels.
The specification does not teach how to implement these requirements in practice. In particular, the specification does not disclose (i) any neural network architecture capable of generating pixel values or video frames, (ii) any training methodology for such a neural network, including loss functions, training structures, or optimization techniques, (iii) how contextual information (e.g. W2-type probabilities) is applied to influence pixel-level generation, or (iv) how to generate temporally consistent video frames or otherwise generate pixel data conditioned on previously generated pixel data.
Instead, the disclosure is directed primarily to assigning probabilities to semantic chains, selecting and matching existing images, and compositing images based on semantic relationships. The specification therefore lacks sufficient guidance to enable the full scope of the claimed invention, which encompasses neural network-based video generation with contextual conditioning and sequential dependency.
Given the breadth of the claims, the limited guidance provided in the specification, and the unpredictability of neural network-based generation at the time of filing, undue experimentation would be required to implement the claimed function. Accordingly, the claims are not enabled.
Response to Arguments
Applicant's arguments filed 3/11/2026 have been fully considered but they are not persuasive.
With respect to the written description requirement, Applicant cites portions of the specification describing neural networks learning correspondences between syntactical elements and pixel patterns (e.g. ¶33, 38, 114, 156, and 203). While these portions demonstrate that neural networks may be used for identifying objects and inferring semantic relationships, they do not disclose or suggest a neural network configured to generate pixel data or video frames as required by the claims. Rather, the specification consistently describes selecting, matching, and compositing existing images (e.g. ¶194-195), which is materially different from generating pixel data using a neural network.
Applicant further cites portions of the specification relating to contextual weights (e.g. W2-type probabilities) and argues that such context can guide image generation. However, this cited disclosure describes contextual weights at a conceptual level associated with semantic chains and does not provide any description of how such context is operationally applied to neural networks to generate pixel data. There is no disclosure of conditioning a neural network on contextual inputs to produce pixel outputs.
Applicant also relies on disclosure relating to “focus of attention” and generation of imaginative scenarios. While the specification describes generating imaginative images through selection and compositing of image content, it does not disclose generating pixel data via a neural network, nor does it disclose generating successive pixel sets conditioned on previously generated pixel data, as required by the claims.
With respect to enablement, Applicant asserts that a person of ordinary skill in the art would implement the claimed invention using known neural network techniques. However, the specification does not provide sufficient guidance for implementing a neural network that generates pixel data for video, applies contextual conditioning, and generates sequential pixel data with dependency between frames. The disclosure lacks details regarding model architecture, training methodology, and mechanisms for temporal consistency. As such, undue experimentation would be required to practice the full scope of the claimed invention. Accordingly, the rejections under 35 U.S.C. 112(a) are maintained. See above rejection under 35 U.S.C. 112(a) for full details.
Applicant’s additional arguments filed 3/11/2026, with respect to the rejection under 35 U.S.C. 103, have been fully considered and are persuasive. The rejection under 35 U.S.C. 103 of claims 1-20 has been withdrawn. It is noted that an updated prior art search has been conducted. No prior art is applied at this time because the claims are rejected under 35 U.S.C. 112(a). However, the search will be continued, and prior art rejections may be made in subsequent Office Actions.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN KY whose telephone number is (571) 272-7648. The examiner can normally be reached Monday-Friday, 9:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at 571-272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KEVIN KY/Primary Examiner, Art Unit 2671