DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to because Figure 9 appears before Figure 5, leaving the figures out of order. The figures must either be renumbered or rearranged so that they appear in sequence. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The use of the terms “tumblr” and “HGTV” in paragraph 61 and “WiFi” in paragraph 45, which are trade names or marks used in commerce, has been noted in this application. Each term should be accompanied by the generic terminology; furthermore, each term should be capitalized wherever it appears or, where appropriate, include a proper symbol indicating use in commerce, such as ™, SM, or ®, following the term.
Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.
The disclosure is objected to because of the following informalities:
In paragraph 45, “WiFi” should be spelled “Wi-Fi” or “WIFI,” and it is a trademarked term as noted above.
In paragraph 61, “tumblr” should be “Tumblr,” and it is a trademarked term as noted above.
Appropriate correction is required.
Claim Interpretation
For claims 3, 10, and 17, the term “neural atlas” is being defined as a representation of at least one singular element of a scene or plurality of scenes. Applicant may address this interpretation in response to this action if Applicant believes it to be in error.
For claims 4, 11, and 18, the term “neural enhancement model” is being interpreted as meaning “a neural networking process that enhances the image.” Applicant may address this interpretation in response to this action if Applicant believes it to be in error.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 3, 10, and 17 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the enablement requirement. The claims contain subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention. The specification defines the term “neural atlas” in paragraph 56 as “a unified video representation of a scene or a group of similar scenes.” This definition, however, has nothing to do with an atlas or with any neural network. Further, it is unclear what would constitute a “unified video representation” of a scene: would a written description of any scene qualify, and must the representation itself be a video? The specification further states that the representation can be just a singular foreground or a singular background of a scene, yet it is unclear what aspect of such a representation is “unified.” For something to be unified, multiple elements must be put together, but a foreground alone or a background alone is a singular object. What aspects are being unified? The specification also states that the video itself can be the “unified video representation”; if the video itself, which is composed of scenes or groups of similar scenes, qualifies, then the whole video would be the “neural atlas.” Moreover, what counts as “similar” in this context is relative to each observer. This leaves the “neural atlas” extremely nebulous: it is unclear whether it is the video, some representation of multiple elements within the video, a representation of a singular element in a single scene, or simply some other form of representation deemed to embody the video or aspects therein.
Turning to the Wands factors: as to the first factor, the term is extremely broad and would support multiple contradictory interpretations, ranging from singular elements of a single frame to the entire video. The nature of the invention involves the cutting edge of neural network development and AI usage, so new terminology is expected. However, new terminology must be properly defined in the specification, and as demonstrated in the analysis above, multiple contradictory definitions are possible. The prior art does not supply a definition of “neural atlas” that would be applicable here or that is in line with the one provided in the specification; the examiner found that the term in other art mainly refers to an atlas of neural functions, which is not applicable to the current invention. This lowers the level of predictability in the art, as the art does not offer a plausible alternative definition more definite than applicant's own. The inventor provides a few examples of what the term could cover, such as foregrounds and backgrounds of a scene or scenes. However, these examples seemingly contradict the provided definition of “a unified video representation of a scene or a group of similar scenes,” as they allow the term to cover a singular, non-unified part of a scene, which cannot be a unified video representation of a scene; it would merely be a part of a scene rather than a representation of the whole scene that unifies one or more elements. The disclosure as a whole appears to use the term “neural atlas” in a manner more consistent with a prophetic example than with a working example.
Paragraph 69 of the specification appears to use the claimed “neural atlas” as a possible alternative to the neural enhancement model's neural network, but the definition of “neural atlas” in paragraph 56 does not disclose that it is a neural network or any form of processing beyond a representation of elements within a scene or scenes. Because the term has no clear meaning and instead carries multiple contradictory meanings, a significant amount of undue experimentation would be required to determine which interpretation is intended or workable, and thus to make and use the invention as claimed.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 3, 7, 10, 14, and 17 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claims 3, 10, and 17 recite the term “neural atlas,” which renders the claims indefinite. As discussed in the rejection under 35 U.S.C. 112(a) above, the specification's definition in paragraph 56, “a unified video representation of a scene or a group of similar scenes,” admits multiple contradictory interpretations: the definition has nothing to do with an atlas or a neural network; it is unclear what constitutes a “unified video representation” or whether the representation must itself be a video; the examples of a singular foreground or background contradict the requirement that the representation be “unified”; the video itself could qualify, rendering the whole video the “neural atlas”; and what counts as “similar” is relative to each observer. It is therefore unclear whether the “neural atlas” is the video, a representation of multiple elements within the video, a representation of a singular element in a single scene, or some other form of representation deemed to embody the video or aspects therein, and the metes and bounds of the claims cannot be ascertained.
The term “uniqueness metric” in claims 7 and 14 is a relative term which renders the claims indefinite. The term is not defined by the claims, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Neither the claims nor the specification explains how a “uniqueness metric” can be determined. Persons of ordinary skill in the art may determine uniqueness in a number of ways: differences in scenes, lighting, compositions, foregrounds, backgrounds, costumes, and even the angles of a shot can all bear on uniqueness, and different practitioners may weight these aspects differently and arrive at completely different results. Because the specification provides no definite method for computing uniqueness, the term is indefinite.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 3, 7-8, 10, 14-15, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Dehan et al. (“Complete and temporally consistent video outpainting”), hereinafter referred to as Dehan.
In regards to claim 1, Dehan discloses a method comprising: obtaining, using at least one processing device of an electronic device, a video including multiple scenes at a first aspect ratio (Abstract: discloses acquiring a video with a specific aspect ratio and aiming to broaden that aspect ratio); performing, using the at least one processing device, backward optical flow estimation and forward optical flow estimation for each of the multiple scenes to select an image frame having a largest missing area (page 690, third and sixth new paragraphs: these passages denote the use of forward and backward optical flow, with the sixth new paragraph detailing that the largest remaining masked region is selected); performing, using the at least one processing device, outpainting on the image frame having the largest missing area to generate a first outpainted image frame at a second aspect ratio different from the first aspect ratio (page 690, fifth new paragraph: discloses that the outpainted image is generated at a new aspect ratio); and performing, using the at least one processing device, backward optical flow estimation and forward optical flow estimation using the first outpainted image frame to generate additional outpainted image frames in the multiple scenes at the second aspect ratio (page 690, third and sixth new paragraphs: these passages denote the use of forward and backward optical flow, with the sixth new paragraph detailing that the largest remaining masked region is selected and that the result can be propagated to adjacent frames).
In regards to claim 3, Dehan discloses further comprising: determining a unified video representation of at least two scenes of the multiple scenes using a neural network, the unified video representation including at least one neural atlas (page 690, fourth new paragraph: although the specification leaves “neural atlas” indefinite, among the competing definitions provided, paragraph 56 allows a neural atlas to be just a foreground or background; the method disclosed in this paragraph involves both a foreground and a background, which would cover this claim).
In regards to claim 7, Dehan discloses further comprising: determining a uniqueness metric for each of the multiple scenes; and identifying at least two scenes of the multiple scenes having uniqueness metrics within a similarity threshold (the section entitled SSIM and Figure 8 on page 693: this section discusses a method of evaluating similarity between various frames and defines a value of 1 as showing that two frames are identical, which falls within the broadest reasonable interpretation of a similarity threshold, with the numerical value generated by this method serving as a uniqueness metric).
In regards to claims 8 and 15, they are similar to claim 1, and they are similarly rejected.
In regards to claims 10 and 17, they are similar to claim 3, and they are similarly rejected.
In regards to claim 14, it is similar to claim 7, and it is similarly rejected.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 2, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Dehan et al. (“Complete and temporally consistent video outpainting”), hereinafter referred to as Dehan, in view of Gao et al. (“Flow-edge Guided Video Completion”), hereinafter referred to as Gao.
In regards to claim 2, Dehan does not explicitly disclose the elements of this claim.
However, Gao does disclose further comprising: determining whether there are empty pixels in the outpainted image frames (Figure 2 and the associated description on page 3: discloses that the method can find missing pixels within an image); and in response to determining that there are one or more empty pixels in at least one of the outpainted image frames, performing backward optical flow estimation and forward optical flow estimation using one of the outpainted image frames again to correct the one or more empty pixels (Figure 2 and the associated description on page 3: the description discloses the use of both forward and backward flow to determine where missing pixels are).
It would have been prima facie obvious to combine the teachings of Dehan and Gao, as doing so would allow for a predictable increase in the accuracy of the outpainted image. Combining the inpainting and outpainting methods would allow not only the image to be expanded, but also any empty spot within the image or the outpainted sections to be filled in. This would yield a more accurate image overall, as empty space would be completely removed. As such, it would have been prima facie obvious to combine.
In regards to claims 9 and 16, they are similar to claim 2, and they are rejected similarly.
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Dehan et al. (“Complete and temporally consistent video outpainting”), hereinafter referred to as Dehan, in view of KS et al. (US 20220398700 A1), hereinafter referred to as KS, and Chen et al. (US 20230019733 A1), hereinafter referred to as Chen.
In regards to claim 4, Dehan does not explicitly disclose further comprising: performing deflickering and artifact correction on at least one of the outpainted image frames using a neural enhancement model.
However, KS does disclose further comprising: performing deflickering on at least one of the outpainted image frames using a neural enhancement model (paragraph 14: KS discloses that neural networks can perform deflickering on the plurality of frames of a video).
It would have been prima facie obvious to combine the teachings of Dehan with the teachings of KS. The removal of flickering via deflickering would have led to a predictable increase in image clarity, as removing flicker yields a clearer and more accurate image overall. As such, it would have been prima facie obvious to combine these references.
KS does not explicitly disclose further comprising: performing artifact correction on at least one of the outpainted image frames using a neural enhancement model.
However, Chen does disclose artifact correction on at least one of the outpainted image frames using a neural enhancement model (paragraph 21: Chen discloses that a neural network can be used to remove artifacts from videos).
It would have been prima facie obvious to combine the teachings of Dehan and KS with the teachings of Chen. The removal of artifacts via artifact correction would have led to a predictable increase in image clarity, as removing image artifacts yields a clearer and more accurate image overall. As such, it would have been prima facie obvious to combine these references.
In regards to claims 11 and 18, they are similar to claim 4, and they are rejected similarly.
Claims 5-6, 12-13, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Dehan et al. (“Complete and temporally consistent video outpainting”), hereinafter referred to as Dehan, in view of Cragg et al. (US 20240273670 A1), hereinafter referred to as Cragg, and Jin et al. (CN 116741179 A), hereinafter referred to as Jin.
In regards to claim 5, Dehan does not explicitly disclose any elements of this claim.
Cragg does disclose wherein performing outpainting on the image frame having the largest missing area comprises: extracting a text prompt from the image frame having the largest missing area using an image-to-text model (Abstract: discloses a system to extract a prompt from the image); and outpainting the image frame having the largest missing area using a text-to-image diffusion model and the optimized text prompt (Abstract: discloses that a diffusion model generates the image based on a text prompt).
It would have been prima facie obvious to combine the teachings of Dehan with the teachings of Cragg, as doing so would allow for a predictable increase in accuracy: a textual extraction of the events occurring within the image frame provides additional clarification of what is happening in the frame and allows multiple pieces of data to be used as inputs. The likelihood of a variable or part of the image being missed would thereby be greatly reduced, yielding a more accurate outpainted image. As such, it would have been prima facie obvious to combine.
Cragg does not explicitly disclose optimizing the text prompt using a large language model.
Jin does disclose optimizing the text prompt using a large language model (page 5, sixth new paragraph: discloses that a large language model may be used to optimize text to generate a prompt).
It would have been prima facie obvious to combine the teachings of these references, as further optimizing the prompt via a large language model would allow for a predictable increase in accuracy: the prompt can be refined to reduce redundant words or phrases that cover the same objects, so as to avoid too much weight being placed on textual redundancies produced by the analysis.
In regards to claim 6, Cragg discloses wherein the image-to-text model comprises a prompt extraction model and a language-image pre-training framework (Abstract and paragraph 149: the abstract discloses a system to extract a prompt from the image, which is within the broadest reasonable interpretation of the prompt extraction model, and paragraph 149 discloses the use of a CLIP model, or contrastive language-image pre-training model, which would read upon the disclosed language-image pre-training framework).
In regards to claims 12 and 19, they are similar to claim 5, and they are similarly rejected.
In regards to claims 13 and 20, they are similar to claim 6, and they are similarly rejected.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CONOR AIDAN O'MALLEY whose telephone number is (571)272-0226. The examiner can normally be reached Monday - Friday, 9:00 a.m. - 5:00 p.m. EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Moyer, can be reached at 572-272-9523. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
CONOR AIDAN O'MALLEY
Examiner
Art Unit 2675
/CONOR A O'MALLEY/Examiner, Art Unit 2675
/ANDREW M MOYER/Supervisory Patent Examiner, Art Unit 2675