DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5-9, 11-13, 15-19, and 21-22 are rejected under 35 U.S.C. 103 as being unpatentable over Sekar (US 2019/0289359 A1), hereinafter “Sekar”, in view of Babazaki (US 2024/0135751 A1), hereinafter “Babazaki”.
As per claim 1, Sekar teaches a content search system comprising:
“a repository for storing content” at [0049];
(Sekar teaches video library 112 is a database of multiple video data files 114, with each video data file storing video data in a video file format)
“an index database comprising a search index for the repository” at [0049]-[0050] and Fig. 3;
(Sekar teaches that, additionally, the video library 112 also stores a respective content data file 116 (i.e., “search index”) for each video data file 114. The content data file 116 is generated by a machine learning based content data generation system 118, which is configured to automatically generate detailed, indexed content data for each video data file 114 in the video library 112)
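For illustration only, and not as a characterization of Sekar's actual data structures, the per-file content data described above might be modeled as follows (all names are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class SegmentLabels:
    """Labels detected for one video segment (illustrative only)."""
    start_sec: float
    end_sec: float
    object_labels: list[str] = field(default_factory=list)  # e.g., "golden retriever"
    action_labels: list[str] = field(default_factory=list)  # e.g., "jogging"

@dataclass
class ContentDataFile:
    """Hypothetical analogue of a per-video content data file (cf. 116)."""
    video_file_id: str  # reference to the stored video data file (cf. 114)
    segments: list[SegmentLabels] = field(default_factory=list)

# One index record for one video in the repository.
record = ContentDataFile(
    video_file_id="video-0001",
    segments=[SegmentLabels(0.0, 12.5, ["golden retriever", "water"], ["jogging"])],
)
```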
“a processor coupled to the repository and the index database, a computer memory storing code executable by the processor for: accessing an image stored or to be stored in the repository, applying a machine learning model to detect objects and an action being performed in the image,” at [0049]-[0051], [0086]-[0095] and Figs. 3, 14;
(Sekar teaches the machine learning based content data generation system 118 receives video data included in video data file 114, detects objects and actions in the video segments and assigns labels to the detected objects and actions)
“wherein applying the machine learning model comprises: applying an object detection model trained to infer object classes to the image to detect the objects in the image and assign object class labels to the image indicating the objects detected in the image” at [0049]-[0051], [0086]-[0095] and Figs. 3, 14;
(Sekar teaches applying an object detection model to the video scenes to identify an object category that includes one or more attributes identifying objects appearing in the scene, including, for example, “people”, “golden retriever”, “water”, “swords”, etc., and assigning text labels identifying the objects appearing in the video)
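As a schematic aid only (the detector shown is a generic stand-in, not Sekar's disclosed model), the object-class labeling step could be sketched as:

```python
def detect_objects(image) -> list[dict]:
    """Stand-in for a trained object detection model; a real detector would
    infer these detections from the image content."""
    return [
        {"class": "person", "score": 0.97, "box": (40, 60, 220, 480)},
        {"class": "golden retriever", "score": 0.91, "box": (250, 300, 420, 470)},
    ]

def object_class_labels(image, threshold: float = 0.5) -> list[str]:
    """Collect the object class labels to be assigned to the image."""
    return sorted({d["class"] for d in detect_objects(image) if d["score"] >= threshold})

print(object_class_labels(None))  # ['golden retriever', 'person']
```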
“applying an action detection model to the image to detect the action being performed in the image and assign an action class label to the image indicating the action detected” at [0049]-[0051], [0086]-[0095] and Figs. 3, 14;
(Sekar teaches the machine learning based content data generation system 118 detects humans and objects in the video scenes, processes the detected humans and objects to determine the action being performed by the detected human or objects, and assigns action labels to the video scenes)
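Again purely as a sketch (the rule shown is a placeholder, not a disclosed model), the flow of detected humans/objects into action classification might look like:

```python
def detect_action(detections: list[dict]) -> str:
    """Stand-in for an action detection model that consumes the detected
    humans/objects and infers an action class label for the scene."""
    classes = {d["class"] for d in detections}
    # A real model would run learned inference; this placeholder only shows
    # detected-object information feeding the action classifier.
    if {"person", "golden retriever"} <= classes:
        return "walking dog"
    return "unknown"

detections = [{"class": "person", "box": (40, 60, 220, 480)},
              {"class": "golden retriever", "box": (250, 300, 420, 470)}]
print(detect_action(detections))  # walking dog
```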
“updating the search index to index the image using the objects and the action being performed detected in the image to enable searching of the image by the objects, and the action being performed detected in the image, wherein updating the search index comprises adding content indexing labels for the image to the search index, the content indexing labels comprising the object class labels indicating the objects detected in the image and the action class label indicating the action detected in the image” at [0049]-[0050], [0072] and Fig. 3;
(Sekar teaches at Fig. 3 the content data files 116, which include labels indicating objects (e.g., “Golden retriever”, “Water”, “Sand”) and actions (e.g., “Jogging”, “Conversation, Argument”, “Eating”, “Fighting”) detected in the video segments. The content data files 116 enable searching for scenes in which a particular object appears and for scenes in which a particular action occurs)
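The index-update step can be illustrated with a minimal inverted index (hypothetical code, not the reference's implementation):

```python
from collections import defaultdict

# Inverted index: label text -> identifiers of matching content (illustrative).
search_index: dict[str, set[str]] = defaultdict(set)

def update_search_index(content_id: str, content_indexing_labels: list[str]) -> None:
    """Add the object/action labels for one image or segment to the index,
    making the content searchable by those labels."""
    for label in content_indexing_labels:
        search_index[label.lower()].add(content_id)

update_search_index("video-0001#seg2", ["Golden retriever", "Water", "Jogging"])
print(search_index["jogging"])  # {'video-0001#seg2'}
```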
Sekar does not explicitly teach “applying an action detection model trained using the object class to infer classes of actions being performed” as claimed. However, Babazaki teaches a method for training an action recognition model using object classes (e.g., person H and/or object OBJ) and applying the action detection model to the object class labels (e.g., “person H”, “object OBJ”) assigned by the object detection model to infer classes of actions being performed by the person H with respect to the object OBJ, at [0053]-[0054], [0100]-[0116]. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Babazaki with Sekar in order to improve the accuracy of the action detection model by training it on a training data set that includes the object classes of the objects detected in the image.
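The general idea attributed to Babazaki above (folding detected object classes into the action model's training data) can be sketched generically; the names and structure below are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class ActionTrainingSample:
    """One training example pairing detected object classes with an action label."""
    object_classes: list[str]   # e.g., ["person H", "object OBJ"] from the object detector
    action_label: str           # ground-truth action, e.g., "carrying"

def build_samples(annotations: list[dict]) -> list[ActionTrainingSample]:
    """Assemble a training set whose inputs include object class labels,
    so the action model is trained using the object classes (sketch only)."""
    return [ActionTrainingSample(a["object_classes"], a["action"]) for a in annotations]

samples = build_samples([{"object_classes": ["person H", "object OBJ"], "action": "carrying"}])
print(samples[0].action_label)  # carrying
```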
As per claim 2, Sekar and Babazaki teach the system of claim 1 discussed above. Sekar also teaches: wherein the code is further executable for: “receiving a search query from a client, servicing the search query, servicing the search query comprising: using the search index to search the repository; determining that the image matches the search query based on the labels included for the image in the search index matching the search query; based on the determination that the image matches the search query, generating a result that includes the image; and returning the result to the client” at [0053]-[0054], [0062]-[0072] and Figs. 5-11.
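The claimed query-servicing flow reduces to a label lookup, sketched here with invented names:

```python
def service_query(query: str, search_index: dict[str, set[str]]) -> list[str]:
    """Match the query against content indexing labels and return the
    identifiers of matching content as the result (illustrative only)."""
    return sorted(search_index.get(query.lower(), set()))

index = {"jogging": {"video-0001#seg2"}, "eating": {"video-0002#seg1"}}
print(service_query("Jogging", index))  # ['video-0001#seg2']
```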
As per claim 3, Sekar and Babazaki teach the system of claim 2 discussed above. Sekar also teaches: wherein “the search query comprises a search term, and wherein determining that the image matches the search query comprises determining that at least one of the content indexing labels matches the search term” at [0062]-[0072] and Figs. 5-11.
As per claim 5, Sekar and Babazaki teach the system of claim 1 discussed above. Sekar also teaches: wherein the code is executable for “providing an image processing worker for: processing a content indexing request that includes a reference to the image in the repository; servicing the content indexing request, wherein servicing the content indexing request comprises: retrieving the image from the repository using the reference from the content indexing request; the applying of the machine learning model to the image to detect the objects and the action being performed in the image; and generating a content indexing response that comprises the content indexing labels” at [0086]-[0096] and Fig. 3.
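A minimal sketch of such a worker follows, with stand-in helpers in place of the repository access and the models (all names hypothetical):

```python
import queue

def fetch_from_repository(image_ref: str) -> bytes:
    """Stand-in for retrieving the referenced image from the repository."""
    return f"<image bytes for {image_ref}>".encode()

def run_models(image: bytes) -> list[str]:
    """Stand-in for the object and action detection models."""
    return ["person", "golden retriever", "jogging"]

def image_processing_worker(requests: queue.Queue, responses: queue.Queue) -> None:
    """Service content indexing requests: fetch the referenced image, apply
    the models, and respond with the content indexing labels."""
    while True:
        req = requests.get()
        if req is None:                       # sentinel: stop the worker
            break
        image = fetch_from_repository(req["image_ref"])
        responses.put({"image_ref": req["image_ref"],
                       "labels": run_models(image)})
```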
As per claim 6, Sekar and Babazaki teach the system of claim 5 discussed above. Sekar also teaches: wherein the code is executable for providing a search agent for: “generating the content indexing request; receiving the content indexing response to the content indexing request; reading the content indexing labels from the content indexing response; and sending the content indexing labels read from the content indexing response to the index database to update the search index” at [0086]-[0096] and Figs. 3-11.
As per claim 7, Sekar and Babazaki teach the system of claim 6 discussed above. Sekar also teaches: wherein “the search agent writes the content indexing request to a topic; and the image processing worker writes the content indexing response to the topic” at [0049]-[0050], [0072] and Fig. 3.
As per claim 8, Sekar and Babazaki teach the system of claim 6 discussed above. Sekar also teaches: wherein “the search agent and the image processing worker are asynchronous processes” at [0086]-[0096].
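The agent/worker exchange over a topic, running asynchronously (claims 6-8), can be sketched with in-process queues standing in for a message topic (hypothetical code):

```python
import queue
import threading

request_topic: queue.Queue = queue.Queue()    # stand-ins for a shared message topic
response_topic: queue.Queue = queue.Queue()

def worker() -> None:
    """Read one indexing request from the topic; write a labels response back."""
    req = request_topic.get()
    response_topic.put({"image_ref": req["image_ref"],
                        "labels": ["person", "jogging"]})   # stubbed model output

def search_agent(image_ref: str, index: dict) -> None:
    """Generate the request, read the response, and update the index."""
    request_topic.put({"image_ref": image_ref})
    resp = response_topic.get()
    for label in resp["labels"]:
        index.setdefault(label, set()).add(resp["image_ref"])

index: dict = {}
threading.Thread(target=worker).start()       # worker runs asynchronously
search_agent("img-42", index)
print(index)  # {'person': {'img-42'}, 'jogging': {'img-42'}}
```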
As per claim 9, Sekar and Babazaki teach the system of claim 8 discussed above. Sekar also teaches: wherein “the search agent sends the content indexing labels to the index database as atomic metadata” at [0086]-[0096] and Fig. 3.
As per claim 21, Sekar and Babazaki teach the system of claim 1 discussed above. Babazaki also teaches: wherein “the object detection model is trained to perform single-object or multiple-object localization for instances of object classes detected in the image; and wherein the action detection model is trained to infer a proximity of the instances of the object classes to each other in the image” at [0064]-[0080].
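The proximity notion in this limitation can be illustrated with bounding-box centers (a simple distance measure chosen for the example, not the reference's method):

```python
def box_center(box: tuple) -> tuple:
    """Center of an (x1, y1, x2, y2) bounding box from the object detector."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def proximity(box_a: tuple, box_b: tuple) -> float:
    """Euclidean distance between box centers; a small value indicates the
    two localized object instances are near each other in the image."""
    (ax, ay), (bx, by) = box_center(box_a), box_center(box_b)
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

# e.g., a localized person and a localized dog
print(proximity((40, 60, 220, 480), (250, 300, 420, 470)))
```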
Claims 11-13, 15-19, and 22 recite limitations similar to those of claims 1-3, 5-9, and 21 above and are therefore rejected for the same reasons.
Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Sekar and Babazaki, as applied to claims 1-3, 5-9, 11-13, 15-19 above, and further in view of Chang et al. (US 2022/0414142 A1), hereinafter “Chang”.
As per claims 4 and 14, Sekar and Babazaki teach the system of claim 2 discussed above. Sekar teaches that the search query comprises a search term at [0062]. Sekar does not teach “servicing the search query comprises expanding the search query to include a synonym of the search term; and determining that the image matches the search query based on the at least one of the content indexing labels matching the synonym of the search term” as claimed. However, Chang teaches a similar method for searching images using a search query, including the steps of “expanding the search query to include a synonym of the search term; and determining that the image matches the search query based on the at least one of the labels matching the synonym of the search term” at [0035]-[0040]. Thus, it would have been obvious to one of ordinary skill in the art to combine Chang with Sekar’s teaching to use synonyms as alternative object terms for the query object because “when the object selection system does not recognize an object in the query string, … if the object selection system recognizes a synonym of the object as belonging to a known class, the object selection system can utilize the known-class object detection neural network to detect the object accurately and efficiently,” as suggested by Chang at [0035].
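Synonym expansion of a search term can be sketched as follows (the synonym table and helper names are invented for the example, not Chang's implementation):

```python
SYNONYMS = {"dog": ["canine", "puppy", "golden retriever"]}   # illustrative table

def expand_query(term: str) -> list[str]:
    """Expand a search term with its synonyms."""
    return [term] + SYNONYMS.get(term.lower(), [])

def matches(labels: list[str], term: str) -> bool:
    """An image matches if any label equals the term or one of its synonyms."""
    expanded = {t.lower() for t in expand_query(term)}
    return any(label.lower() in expanded for label in labels)

print(matches(["Golden retriever", "Water"], "dog"))  # True
```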
Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Sekar and Babazaki as applied to claims 1-3, 5-9, 11-13, 15-19 above, and further in view of Yang et al. (US 2021/0064704 A1), hereinafter “Yang”.
As per claims 10 and 20, Sekar and Babazaki teach the system of claim 9 discussed above. Sekar does not teach “translating the labels from a first language to a second language prior to updating the search index with the content indexing labels” as claimed. However, Yang teaches a similar method for searching images using tags associated with the images, including the step of “translating the labels from a first language to a second language prior to updating the search index with the content indexing labels” at [0002]-[0003]. Thus, it would have been obvious to one of ordinary skill in the art to combine Yang with Sekar’s teaching in order to allow searching of the images in a different language.
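Label translation ahead of the index update can be sketched with a lookup table standing in for a translation model or service (hypothetical):

```python
# Illustrative bilingual label table; a real system would call a translation
# model or service rather than a fixed dictionary.
EN_TO_FR = {"dog": "chien", "jogging": "jogging", "water": "eau"}

def translate_labels(labels: list[str], table: dict) -> list[str]:
    """Translate content indexing labels from a first language to a second
    language before the search index is updated with them."""
    return [table.get(label.lower(), label) for label in labels]

print(translate_labels(["Dog", "Water"], EN_TO_FR))  # ['chien', 'eau']
```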
Response to Arguments
Applicant’s arguments with respect to claims 1-22 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KHANH B PHAM whose telephone number is (571)272-4116. The examiner can normally be reached Monday - Friday, 8am to 4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sanjiv Shah can be reached at (571)272-4098. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KHANH B PHAM/Primary Examiner, Art Unit 2166
January 5, 2026