Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 5/23/25 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-23 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 6-7, 10, 12-15 and 17-21 are rejected under 35 U.S.C. 103 as being unpatentable over Nicastri et al. (US 2021/0312188) in view of Lev-Tov et al. (US 2018/0232451).
Regarding claims 1 and 14, Nicastri teaches an apparatus comprising:
an image processing system configured to receive images from an image capture device (Fig. 1, cameras 14 supply images received by system 10); and
an event summarization system coupled to the image processing system and configured (Fig. 1, system 10 summarizes video) to:
receive a request to create an event summarization, the request including details associated with the event summarization (Figs. 4-20, search query received through the UI to perform a search and output a compilation);
identify at least one image relevant to the event summarization based on the structured data and the details associated with the event summarization (paragraphs 43, 49 and 53-54 teach identifying moments within the video captured by the cameras based on the search. The search algorithm is utilized to generate comparisons between the search and the video data to determine which ones match the search);
select at least one of the identified images relevant to the event summarization; arrange the selected images (Fig. 21 and paragraphs 70-72 teach selecting various video clips 40a, 40b, etc.); and
create a video summary representing the event summarization, the video summary including the arrangement of the selected images (Fig. 21 and paragraphs 70-72 teach that the selected video clips 40a, 40b, etc. are compiled into a summarized composite file 134 for the user based on the search).
However, while Nicastri teaches the claimed limitations as discussed above, Nicastri fails to teach, but Lev-Tov teaches, the claimed:
pre-process, by the image processing system, a plurality of images into image feature vectors (paragraph 52 teaches the claimed wherein images are processed to generate feature vectors);
store, by the image processing system, the image feature vectors in a common embedding space (paragraph 52 teaches storing the image vectors in an image database to create an image feature vector (IFV) space);
identify, by a large language model (LLM), structured data from the received request (paragraphs 4-6 and 20-21 teach an LLM using a query to generate structured data based on the request); and
identify at least one image relevant to the event summarization based on the structured data and the details associated with the event summarization (paragraphs 4-6, 20-23 and 38-40 teach an LLM using a query to generate structured data based on the request).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Lev-Tov into the system of Nicastri because said incorporation allows for the benefit of providing an alternative or improved method of searching for video/image content using text-based searches, which additionally allows for natural language searches (paragraphs 3-8).
Claim 1’s methodology is implemented by the apparatus of claim 14.
Regarding claims 2 and 17, Nicastri teaches the claimed wherein the request to create an event summarization includes at least one of an object to include in the event summarization, an activity to include in the event summarization, a time period associated with the event summarization, or a time limit for the event summarization (Examiner notes the alternative language stated above; while all are not required to meet the claimed limitations, Figs. 5-20, and more specifically paragraphs 31 and 44-45, teach that the search input includes objects, a time range and activities).
Regarding claims 3 and 18, Nicastri teaches the claimed wherein arranging the selected images to be included in the event summarization is based on at least one of a chronological order, a particular topic, or a specific theme (paragraph 34).
Regarding claim 4, Nicastri teaches the claimed wherein a first portion of the selected images are associated with a first camera and a second portion of the selected images are associated with a second camera, and wherein at least one image from the first portion and at least one image from the second portion are included in the video summary (See Fig. 21 wherein multiple camera outputs are used and selected portions are taken from various camera angles and compiled).
Regarding claims 6 and 19, Nicastri teaches the claimed further comprising:
determining an event summarization time limit associated with the video summary (paragraphs 30 and 46);
determining an image time limit for each image in the video summary (paragraphs 30 and 46 teach searching for images/videos around the search time range); and
identifying specific selected images to include in the video summary based on the event summarization time limit and the image time limit (paragraphs 30 and 46, based on the search time range, “from those video streams, a collection of video clips contain the relevant video frame may be identified”).
Regarding claims 7 and 15, Nicastri teaches the claimed wherein the request to create an event summarization is a natural language request (paragraph 44).
Regarding claim 10, Nicastri teaches the claimed further comprising adjusting an amount of time a particular image is displayed in the video summary based on an importance associated with the particular image (Fig. 21; the length of the image/video displayed in the summary is based on how long the search query results in the finding of such events/persons/objects/activities, etc., and is therefore adjusted accordingly).
Regarding claim 12, Nicastri teaches the claimed wherein creating the video summary representing the event summarization is performed by an image processing system (Fig. 1 and 21, system performs the video summarization/compilation).
Regarding claims 13 and 20, Nicastri teaches the claimed further comprising communicating the video summary to at least one system associated with the request to create the event summarization (Fig. 1, the video summary is exported to the user’s device requesting the video search).
Regarding claim 21, Lev-Tov teaches the claimed determining, by the LLM, an intent of the received request; and generating, by the LLM, queries for searching relevant images (paragraph 52 teaches a language model used for training the system, and paragraphs 57-61 and 65-67 teach wherein the trained model is used for processing text-based queries). The prior motivation as discussed above is incorporated herein.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Nicastri et al. (US 2021/0312188) in view of Lev-Tov et al. (US 2018/0232451) and further in view of Gong et al. (US 2014/0365506).
Regarding claim 5, Nicastri and Lev-Tov teach the claimed as discussed in claim 1 above, however fail to teach, but Gong teaches, the claimed further comprising:
identifying at least one missing event summarization detail in the request to create an event summarization (paragraphs 47 and 56 teaches a video searching tool that identifies missing detections in the search); and
requesting clarification of the at least one missing event summarization detail (paragraphs 47 and 56 teach a video searching tool that identifies missing detections in the search and allows an iterative process of asking the user to refine the search and thereby refine the results).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Gong into the system of Nicastri and Lev-Tov because such an incorporation allows for the benefit of improving the system by improving the accuracy of a video query search and filtering down the results to the “correct” or intended video results (paragraph 47).
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Nicastri et al. (US 2021/0312188) in view of Lev-Tov et al. (US 2018/0232451) and further in view of Chen et al. (US 2014/0365506).
Regarding claim 11, Nicastri and Lev-Tov teach the claimed as discussed in claim 1 above, however fail to teach, but Chen teaches, the claimed wherein arranging the selected images to be included in the event summarization includes:
adding a first group of images associated with a first topic to the beginning of the event summarization (paragraphs 188-192); and
adding a second group of images associated with a second topic to the end of the event summarization (paragraphs 188-192 teach wherein search results for a video query are organized in a specific order based on scores, which are based on query inputs/text).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Chen into the system of Nicastri and Lev-Tov because said incorporation allows for the benefit of improving the quality of the video summary sequence (paragraph 188).
Claims 8-9 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Nicastri et al. (US 2021/0312188) in view of Lev-Tov et al. (US 2018/0232451) and further in view of Harary et al. (US 2025/0005727).
Regarding claim 8, Nicastri and Lev-Tov teach the claimed as discussed in claim 1 above, however fail to teach, but Harary teaches, the claimed further comprising:
encoding the natural language request using a text embedding model (paragraphs 39-40);
creating features associated with the encoded natural language request (paragraphs 40-44);
identifying features associated with the identified images (paragraphs 40-44);
comparing the identified features of the identified images to the features associated with the encoded natural language request (paragraphs 40-44, checking for distances between the query and normal text embeddings); and
identifying relevant identified images based on the comparison (paragraphs 40-44).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Harary into the alphanumeric search ability of Nicastri because such an incorporation would allow for the benefit of improving the accuracy of text/natural language searches (paragraphs 3-6, 17 and 56).
Regarding claim 9, Harary teaches the claimed wherein the features associated with the encoded natural language request and the features of the identified images are stored in a common embedding space (paragraphs 40-44).
Regarding claim 16, Nicastri teaches the claimed as discussed in claim 14 above, however fails to teach, but Harary teaches, the claimed further comprising a multimodal embedding system including:
a text embedding model configured to encode the natural language request and create features associated with the encoded natural language request (paragraphs 39-40); and
an image embedding model configured to: encode the identified images (paragraphs 39 and 42 teach an image encoder);
create features associated with the encoded identified images (paragraphs 39 and 42);
compare the identified image features with the natural language request features (paragraphs 40-44); and
identify relevant identified images based on the comparison (paragraphs 40-44).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Harary into the alphanumeric search ability of Nicastri because such an incorporation would allow for the benefit of improving the accuracy of text/natural language searches (paragraphs 3-6, 17 and 56).
Claims 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Nicastri et al. (US 2021/0312188) in view of Lev-Tov et al. (US 2018/0232451) and further in view of Xie et al. (US 2009/0024598).
Regarding claim 22, Nicastri and Lev-Tov teach the claimed as discussed in claim 1 above, however fail to teach, but Xie teaches, the claimed further comprising updating the LLM based on the structured data (paragraphs 12 and 17 teach a language model being updated/learning new structured data based on initial input).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Xie into the proposed combination of Nicastri and Lev-Tov because said incorporation allows for the benefit of improving the language model’s accuracy and giving better results to the user (paragraph 6 of Xie).
Regarding claim 23, Xie teaches the claimed wherein updating the LLM includes continually updating LLM training and evaluation data based on user feedback to improve subsequent event summarizations (paragraphs 12 and 17 teach a language model being updated/learning new structured data based on initial input).
The prior motivation as discussed above with respect to claim 22 is incorporated herein.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GELEK W TOPGYAL whose telephone number is (571)272-8891. The examiner can normally be reached M-F (9:30-6 PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Vaughn can be reached at 571-272-3922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GELEK W TOPGYAL/ Primary Examiner, Art Unit 2481