DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments filed on 1/14/2026 with respect to claims 1-18 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument. Further, Applicant’s arguments with respect to the objection to claim 20 have been fully considered and are persuasive in view of the amendment; hence, said objection is withdrawn.
Furthermore, regarding independent claims 1, 10 and 19 (and their respective dependent claims), Applicant argues that “Tang fails to disclose each and every element of amended claim 1. In particular, the Office asserts that Tang allegedly discloses 'receiving a video frame' and 'responsive to the video frame comprising at least one of a salient object or a region of interest, partitioning the video frame into one or more tiles based on a location of the at least one of the salient object or the region of interest'.”
Examiner respectfully disagrees. Tang, in Fig. 5:102:306 and in paragraph 138, discloses “The computer vision AI model(s) implemented by the ROI detection module 306 may be configured to detect and/or classify one or more types of objects. In one example, the computer vision AI model implemented by the ROI detection module 306 may be configured to perform person detection. In another example, the computer vision AI model implemented by the ROI detection module 306 may be configured to perform vehicle detection.” Hence, as can be seen from the above passage, in response to received video frames the ROI detection module performs ROI detection on the objects; further, Figs. 6 and 8 show bounding boxes of the ROIs, i.e., partitioning based on the detected ROIs. Therefore, Tang discloses the argued limitations as presented by the Applicant.
Applicant's arguments with respect to claims 19-20 have been fully considered but they are not persuasive. Regarding claim 19, Applicant argues that “Tang is silent on any such boundary-defining partitioning. For these reasons, Tang does not disclose each and every feature of claim 1. As such, claim 1 is novel in view of Tang. Independent claims 10 and 19 presently recite features substantially similar to those features of claim 1 described above. Thus, claims 10 and 19 are also novel in view of Tang.” (please see Remarks, page 8).
Examiner respectfully disagrees. In response to Applicant's argument that the references fail to show certain features of the invention with respect to claim 19, it is noted that the features upon which Applicant relies (i.e., “boundary-defining partitioning”; please see Remarks, page 8) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
With respect to claim 19, Applicant further argues that “As described above with respect to claim 1, Tang employs a fixed tile grid for auto-exposure (AE) statistics and merely adjusts weighting values for tiles overlapping an ROI. Tang does not perform adaptive salient object or ROI-dependent partitioning. Moreover, regarding the metadata-related features of claim 19, nowhere does Tang disclose determining whether metadata is available for a video frame or conditionally processing that metadata to identify a salient object or a region of interest. Tang's disclosure at paragraph [0216] merely discusses "object identification and tracking" using computer-vision algorithms to label a subject (e.g., a pedestrian) for AE control. See Tang at paragraph [0216] ("The ROI tracking module 354 associates ROI IDs to moving subjects and maintains the ROI IDs over time"). These "ROI IDs" are internal identifiers generated by Tang's own tracking module. They are not external metadata received from a video source or otherwise indicating that metadata is "available for the video frame", as required by claim 19. Tang's AE pipeline continuously processes live image data captured by a camera, not encoded video frames accompanied by separate metadata. Thus, Tang's paragraph [0216] cannot reasonably be interpreted as disclosing processing of metadata responsive to its availability for identifying a salient object or a region of interest. Furthermore, Tang paragraphs [0215] and [0240] do not disclose a condition that metadata is not available for the video frame or "processing the video frame...using at least one trained machine learning model" to identify a salient object or a region of interest. Paragraph [0215] merely explains that when a new subject appears in the scene ("if the ROI ID is new...there may not be any historical data"), the AE control module initializes a new ROI ID. This statement refers to the absence of historical tracking data for a newly detected subject, not to the absence of metadata describing the frame. Similarly, paragraph [0240] of Tang describes the generation of exposure statistics when a tracked ROI is newly introduced and lacks prior luminance history. Neither passage teaches or implies the use of a trained machine-learning model to identify a salient object or region of interest in response to metadata unavailability. Tang's ROI detection and tracking modules always operate via computer-vision algorithms applied to incoming frames. These modules do not conditionally invoke machine-learning inference based on metadata availability, nor do they describe a fallback inference path. For these reasons, Tang does not disclose each and every feature of claim 19. As such, claim 19 is novel in view of Tang.” (please see Remarks, pages 9-10).
Examiner respectfully disagrees, as Tang discloses the argued limitations. For instance, Tang, in paragraph 216, discloses “When there is one ROI ID that has historical information (e.g., the ROI 370a) for the ROI position and one new current ROI position for the ROI 370a, the prediction may be configured to calculate a linear fitting function regarding x,y (x and y coordinates of the ROI start points) and a time t”; here, the historical ROI ID corresponds to the claimed “responsive to metadata being available for the video frame.” Further, paragraph 217 discloses “the AI metering control module 320 generates the fitting functions, the current time TS may be applied to the fitting functions in order to predict the start point of the ROIs,” which corresponds to the claimed “process the metadata to identify” the ROI in the video frame. Similarly, Tang, paragraphs 215 and 240, discloses “If the ROI ID is new (e.g., the ROI 454 for the new pedestrian 362c), then there may not be any historical data,” which corresponds to the claimed “responsive to metadata not being available”; the AI metering module then partitions the frame 450, as shown in Fig. 8:454. Hence, the Tang reference reads on the argued limitations as presented by the Applicant. Examiner suggests that Applicant further elaborate on the conditions and/or the partitioning step in the claim in order to overcome the cited references.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1 and 10 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Chen (US PGPUB 2024/0073455 A1).
As per claim 1, Chen discloses a method (Chen, Fig. 5B) comprising:
receiving a video frame (Chen, paragraphs 31 and 46); and
responsive to the video frame comprising at least one of a salient object or a region of interest (Chen, Fig. 5B:T2=ROI, and paragraphs 50-51 and 56), partitioning the video frame into one or more tiles by defining boundaries of the one or more tiles based on a location of the at least one of the salient object or the region of interest (Chen, paragraphs 50 and 56, disclose that control circuitry, e.g., control circuitry 210, encodes each frame of the media content item with a certain partitioning structure. Referring to FIG. 5B, frame 500 has been encoded with a partitioning structure comprising three tiles T1, T2 and T3, wherein T2 comprises the ROI, i.e., the mountain peak having the assigned partition boundary. In the example shown in FIG. 5B, each of the tiles extends the full height of the frame. However, in some examples, one or more of the tiles may extend partially along the height of the frame, e.g., see the dashed line in FIG. 5B representing an alternate ROI. Irrespective of the exact configuration of the tiles, the boundary of each tile is defined based on the partition boundary, which defines the ROI).
As per claim 10, Chen discloses a processing device (Chen, paragraph 50, control circuitry 210) comprising:
a video encoder (Chen, paragraphs 47 and 50) configured to:
For the rest of the claim limitations, please see the analysis of claim 1.
Claim(s) 19-20 is/are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Tang (US PGPUB 2023/0419505 A1).
As per claim 19, Tang discloses a processing device (Tang, Fig. 3:100) comprising:
a video encoder (Tang, paragraph 41, discloses The processor 102 may be configured to implement a video encoder) configured to:
receive a video frame (Tang, Fig. 3:102:104, and Fig. 5:304:306);
responsive to metadata being available for the video frame, process the metadata to identify at least one of a salient object or a region of interest in the video frame (Tang, Fig. 5:306, and paragraph 216);
responsive to metadata not being available for the video frame (Tang, paragraphs 215 and 240, disclose “If the ROI ID is new (e.g., the ROI 454 for the new pedestrian 362c), then there may not be any historical data”), process the video frame or a representation thereof using at least one trained machine learning model to generate an output identifying the at least one of the salient object or the region of interest in the video frame (Tang, paragraphs 215 and 240); and
partition the video frame into one or more tiles based on a location of the at least one of the salient object or the region of interest (Tang, Fig. 6:350 and Fig. 8:450, show partitioning of the video frame based on the location of the ROI).
As per claim 20, Tang further discloses the processing device of claim 19, wherein the video encoder is configured to partition the video frame into the one or more tiles by:
partitioning the video frame such that the at least one of the salient object or the region of interest is entirely within a single tile of the one or more tiles (Tang, Fig. 6:370a).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-6, 8-15, and 17-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tang (US PGPUB 2023/0419505 A1) in view of Alessandrini (US Patent 11,314,959 B1).
As per claim 1, Tang discloses a method (Tang, Figs. 1-8) comprising:
receiving a video frame (Tang, Fig. 3:102:104, and Fig. 5:102:304:306); and
responsive to the video frame comprising at least one of a salient object or a region of interest (Tang, Fig. 5:306, and paragraphs 138-139, 170 and 176, discloses ROI detection),
Although Tang discloses partitioning the frame into one or more tiles based on the location of the ROI, as shown in Figs. 6 and 8, Tang does not explicitly disclose partitioning the frame into one or more tiles by defining boundaries of the one or more tiles based on a location of the at least one of the salient object or the region of interest.
Alessandrini discloses partitioning the frame into one or more tiles by defining boundaries of the one or more tiles based on a location of the at least one of the salient object or the region of interest (Alessandrini, Figs. 3A-3D:1000; Column 14, lines 57-67, discloses identifying ROIs 880 in which the 2D array of pixels 801 is divided into tiles 808; and Column 15, lines 1-9, discloses that such an indication of location may include a description of the edges and/or corners (e.g., lengths and/or coordinates) of the boundary defining the ROI 880).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Tang's teachings by implementing the ROI identification technique in the system, as taught by Alessandrini.
The motivation would be to provide an improved system for making more efficient use of processing resources in identifying and quantifying differing types of ROIs (Column 2, lines 24-26), as taught by Alessandrini.
As per claim 2, Tang in view of Alessandrini further discloses the method of claim 1, further comprising:
receiving metadata associated with the video frame, the metadata at least indicating the location of the at least one of the salient object or the region of interest in the video frame (Tang, paragraphs 139-140, discloses The ROI information for each of the ROIs calculated in the ROI list may comprise a ROI position (e.g., an x,y coordinate in the YUV image), a ROI ID and/or an ROI weight. For example, the signal LROI may comprise the list of ROIs detected comprising the ROI position).
As per claim 3, Tang in view of Alessandrini further discloses the method of claim 2, wherein the location indicated by the metadata includes pixel coordinates in the video frame (Tang, paragraphs 101, 105, 171 and 174, discloses The CNN module 190b may be configured to analyze the pixel data of the video frame 350 to detect a size and/or location of various types of objects/subjects captured in the video frame 350).
As per claim 4, Tang in view of Alessandrini further discloses the method of claim 1, further comprising:
generating, by an inference engine, an output identifying the at least one of the salient object or the region of interest and the location of the at least one of the salient object or the region of interest in the video frame (Tang, paragraphs 22, 166, and 213-214, discloses the AI metering control module 320 may perform the ROI prediction by estimating the real-time ROI position).
As per claim 5, Tang in view of Alessandrini further discloses the method of claim 4, wherein generating the output further comprises:
identifying at least one of an additional object or a region in the video frame (Tang, paragraphs 142 and 173, discloses The people 362a-362b may be objects of interest that the computer vision operations of the ROI detection module 306 may detect as an ROI in the ROI list); and
indicating at least one of that the additional object is a non-salient object or that the region is not of interest (Tang, paragraph 173, discloses the background objects 356a-356n may not be objects of interest that the computer vision operations of the ROI detection module 306 may detect as an ROI in the ROI list).
As per claim 6, Tang in view of Alessandrini further discloses the method of claim 1, wherein partitioning the video frame into the one or more tiles comprises:
partitioning the video frame such that the at least one of the salient object or the region of interest is entirely within a single tile of the one or more tiles (Tang, Fig. 6:370a).
As per claim 8, Tang in view of Alessandrini further discloses the method of claim 1, wherein partitioning the video frame into the one or more tiles comprises:
grouping the at least one of the salient object or the region of interest with at least one of an additional salient object or an additional region of interest to form at least one of a group of salient objects or a group of regions of interest (Tang, Fig. 6 and Fig. 8, and paragraphs 173, 183 and 185); and
partitioning the video frame such that the at least one of the group of salient objects or the group of regions of interest is entirely within a single tile of the one or more tiles (Tang, paragraphs 185, 210 and 212).
As per claim 9, Tang in view of Alessandrini further discloses the method of claim 1, further comprising:
encoding the one or more tiles using at least one encoding technique (Tang, paragraph 86, discloses encoding techniques);
inserting the encoded one or more tiles into an encoded bitstream (Tang, paragraphs 67 and 83); and
transmitting the encoded bitstream to a destination device (Tang, paragraphs 83 and 246).
As per claim 10, Tang discloses a processing device (Tang, Fig. 3:100) comprising:
a video encoder (Tang, paragraph 41, discloses The processor 102 may be configured to implement a video encoder) configured to:
For the rest of the claim limitations, please see the analysis of claim 1.
As per claim 11, please see the analysis of claim 2.
As per claim 12, please see the analysis of claim 3.
As per claim 13, Tang in view of Alessandrini further discloses the processing device of claim 10, wherein the video encoder comprises an inference engine configured to:
responsive to receiving the video frame or a representation thereof, generate an output identifying the at least one of the salient object or the region of interest and the location of the at least one of the salient object or the region of interest in the video frame (Tang, paragraphs 22, 166, and 213-214, discloses the AI metering control module 320 may perform the ROI prediction by estimating the real-time ROI position).
As per claim 14, Tang in view of Alessandrini further discloses the processing device of claim 13, wherein the inference engine is further configured to:
identify at least one of an additional object or a region in the video frame (Tang, paragraphs 142 and 173, discloses The people 362a-362b may be objects of interest that the computer vision operations of the ROI detection module 306 may detect as an ROI in the ROI list); and
generate the output to further indicate at least one of that the additional object is a non-salient object or that the region is not of interest (Tang, paragraphs 167 and 173, discloses the background objects 356a-356n may not be objects of interest that the computer vision operations of the ROI detection module 306 may detect as an ROI in the ROI list).
As per claim 15, please see the analysis of claim 6.
As per claim 17, please see the analysis of claim 8.
As per claim 18, Tang in view of Alessandrini further discloses the processing device of claim 10, further comprising:
a processor (Tang, Fig. 3:102); and
a video source configured to generate the video frame (Tang, Fig. 3:104 and Fig. 5:VIDEO).
Claim(s) 7 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Tang (US PGPUB 2023/0419505 A1) in view of Alessandrini (US Patent 11,314,959 B1), and further in view of Cohn (US PGPUB 2021/0334547 A1).
As per claim 7, Tang in view of Alessandrini further discloses the method of claim 1, wherein partitioning the video frame into the one or more tiles comprises:
Although Tang discloses partitioning of the video frame as explained above, Tang in view of Alessandrini does not explicitly disclose partitioning the video frame such that at least a portion of the at least one of the salient object or the region of interest that is above a specified threshold is entirely within a single tile of the one or more tiles.
Cohn discloses partitioning the video frame such that at least a portion of the at least one of the salient object or the region of interest that is above a specified threshold is entirely within a single tile of the one or more tiles (Cohn, paragraphs 44, 61, 78 and 87).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Tang in view of Alessandrini by generating content-aware metadata, as taught by Cohn.
The motivation would be to achieve improved saliency detection by discarding or ignoring noise (paragraph 76), as taught by Cohn.
As per claim 16, please see the analysis of claim 7.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SYED Z HAIDER whose telephone number is (571)270-5169. The examiner can normally be reached MONDAY-FRIDAY 9-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SAM K Ahn can be reached at 571-272-3044. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SYED HAIDER/Primary Examiner, Art Unit 2633