Prosecution Insights
Last updated: April 18, 2026
Application No. 18/651,735

SYSTEMS AND METHODS FOR OBJECT AND EVENT DETECTION AND FEATURE-BASED RATE-DISTORTION OPTIMIZATION FOR VIDEO CODING

Non-Final Office Action (§103)

Filed: May 01, 2024
Examiner: MIKESKA, NEIL R
Art Unit: 2485
Tech Center: 2400 (Computer Networks)
Assignee: Op Solutions LLC
OA Round: 1 (Non-Final)

Grant Probability: 74% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 7m
Grant Probability with Interview: 81%

Examiner Intelligence

Career Allow Rate: 74% (363 granted / 491 resolved; +15.9% vs TC avg). This examiner grants above average.
Interview Lift: +7.0% (moderate), measured across resolved cases with an interview.
Typical Timeline: 2y 7m average prosecution; 7 applications currently pending.
Career History: 498 total applications across all art units.
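
As a quick arithmetic check, the headline percentages follow directly from the raw counts above; a minimal sketch, assuming the interview lift is simply additive (which is how this page presents it):

```python
# Figures taken from the panel above; the additive lift is an assumption.
granted, resolved = 363, 491
allow_rate = granted / resolved                    # 0.7393... -> shown as 74%
with_interview = allow_rate + 0.07                 # +7.0% interview lift
print(f"career allow rate: {allow_rate:.1%}")      # career allow rate: 73.9%
print(f"with interview:    {with_interview:.1%}")  # with interview: 80.9% (shown as 81%)
```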

Statute-Specific Performance

§101: 4.6% (-35.4% vs TC avg)
§103: 61.1% (+21.1% vs TC avg)
§102: 28.1% (-11.9% vs TC avg)
§112: 4.6% (-35.4% vs TC avg)
Tech Center averages are estimates; based on career data from 491 resolved cases.
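
If these shares are read as this examiner's rejection mix by statute (an assumption; the page does not say), they should sum to roughly 100%. A one-line check:

```python
# Statute shares from the panel above; the "rejection mix" reading is an assumption.
shares = {"101": 4.6, "103": 61.1, "102": 28.1, "112": 4.6}
print(sum(shares.values()))  # 98.4 -- close to 100; remainder likely rounding or other statutes
```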

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-8, 10-16, 18, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Naikal (US 2014/0333775) in view of Shi (US 2014/0269901).

Regarding Claim 1, Naikal discloses a method of encoding video (method of video sparse encoding; para [0016]-[0021], [0042]) comprising: extract a plurality of features in a picture in a video frame (event processor extracts features from the identified key-frames to identify a sequence of feature vectors; para [0020]-[0021], [0026], [0028], [0039]); group at least a portion of the plurality of features into at least one object (feature data include data vectors that describe one or more features [group plurality of features] for objects in video data that are generated in each frame; para [0026], [0028], [0029], [0045]); determine a region for the at least one object (region corresponding to object in a video is processed using a bounding box [geometric representation]; Fig. 6 and para [0040]-[0041], [0050]-[0052], [0059]); and assign object identifiers to the at least one object (feature descriptors [identifiers] are used as metadata to identify objects in video data; para [0018], [0021]). Naikal does not expressly disclose encoding the object identifiers into the bitstream. Shi discloses encoding the object identifiers into the bitstream (the rates are adjusted for each macroblock [coding unit] encoded; para [0045]-[0046]). It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the video encoding system of Naikal by using a relevancy map to adjust encoding bit rates, as suggested by Shi, for the purpose of optimizing video coding efficiency by enabling variable bit rates and greater video compression (para [0001], Shi).

Regarding Claim 2, Naikal discloses the method of claim 1, wherein a feature model is used to extract the plurality of features (trained feature model is used to implement feature extraction; para [0020]).

Regarding Claim 3, Naikal discloses the method of claim 1, wherein the region is represented by a geometric representation (region corresponding to object in a video is processed using a bounding box [geometric representation]; Fig. 6 and para [0040]-[0041], [0050]-[0052], [0059]).

Regarding Claim 4, Naikal discloses the method of claim 3, wherein the geometric representation is one of a bounding box or a contour (region corresponding to object in a video is processed using a bounding box; Fig. 6 and para [0040]-[0041], [0050]-[0052], [0059]).
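
For orientation, the Claim 1 method mapped above reduces to a short pipeline: extract features, group them into objects, bound each object with a region, assign identifiers, and encode those identifiers into the bitstream. Below is a minimal, hypothetical sketch of that reading in Python; the grouping heuristic, names, and data layout are invented for illustration and are taken from neither the application nor the cited references:

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]

@dataclass
class DetectedObject:
    object_id: int                   # object identifier assigned per Claim 1
    label: str                       # label accompanying the region identifier (cf. Claim 5)
    bbox: Tuple[int, int, int, int]  # (x, y, width, height): corner plus size (cf. Claim 6)

def group_features(features: List[Point], gap: int = 50) -> List[List[Point]]:
    """Toy grouping: features within `gap` pixels of each other (by x) form one object."""
    groups, current = [], [features[0]]
    for prev, pt in zip(features, features[1:]):
        if pt[0] - prev[0] <= gap:
            current.append(pt)
        else:
            groups.append(current)
            current = [pt]
    groups.append(current)
    return groups

def detect_objects(features: List[Point]) -> List[DetectedObject]:
    """Group extracted features into objects and bound each with a region."""
    detected = []
    for i, grp in enumerate(group_features(sorted(features))):
        xs, ys = [p[0] for p in grp], [p[1] for p in grp]
        bbox = (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
        detected.append(DetectedObject(object_id=i, label="object", bbox=bbox))
    return detected

# An encoder would then serialize each DetectedObject into the bitstream,
# e.g. as SEI-style metadata (cf. Claims 8 and 19), alongside the coded picture.
print(detect_objects([(10, 12), (30, 40), (400, 410), (420, 430)]))
```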
Regarding Claim 5, Naikal discloses the method of claim 4, wherein the object identifiers comprise a region identifier and a label (object metadata includes identifying x,y pixel locations of object [region identifier] within video frame and an event label indicating an event related to the object; Fig. 2 and para [0052], [0054], [0059], [0061]).

Regarding Claim 6, Naikal discloses the method of claim 5, wherein when the geometric representation is a bounding box (bounding box; Fig. 6 and para [0040]-[0041], [0050]-[0052], [0059]), the bounding box is identified by the coordinates of a specific corner and the width and height of the bounding box (bounding box is defined by four corners and a height and width to correspond to object size within a video frame; Fig. 6 and para [0040]-[0041], [0052], [0058]-[0059]).

Regarding Claims 7 and 18, Naikal discloses the method of claim 1, wherein an object is further evaluated over a sequence of frames to determine an event (events are identified over multiple frames of video data; para [0046]), an event identifier is associated with an object, and the event identifier is encoded into the bitstream (metadata encoded in the video includes identifying an event associated with an object of interest; para [0017], [0020], [0046]).

Regarding Claims 8 and 19, Naikal discloses the method of claim 1, wherein the object identifiers are inserted into the bitstream as supplemental enhancement information (feature descriptors are encoded in video frames [bitstream] and are configured as metadata [supplemental enhancement message] to identify objects in a video; para [0018], [0021], [0028], [0040]-[0043]).

Regarding Claim 10, while Naikal does not, Shi discloses a method of encoding video (methods and apparatuses for video encoding; para [0014]-[0016]) comprising: extracting a set of features from a picture in the video (a plurality of identified video content elements of the video signal are extracted, wherein the video content includes features from the video signal; para [0035]-[0036]); generate a relevance map for the extracted features (a saliency map [relevancy map] is generated based on the identified video content/features; para [0032], [0036], [0041]-[0043]); determine a relevance score for portions of the picture using the relevance map (a saliency map may be used to determine a saliency score; para [0017]-[0018], [0026], [0042]-[0043], [0046]-[0047]); and encode the portion of the picture with a bit rate determined at least in part by the relevance score (video coding bit rate is based on saliency score [relevance score], wherein a high saliency score results in a higher bit rate and a low saliency score results in a lower bit rate; para [0018], [0026], [0047]).

Regarding Claim 11, Naikal discloses the method of claim 1. Naikal fails to explicitly disclose generate a relevance map for the extracted features; determine a relevance score for portions of the picture using the relevance map; and encode the portion of the picture with a bit rate determined at least in part by the relevance score.
Shi teaches video encoding systems (para [0001]) and further teaches generate a relevance map for the extracted features (a saliency map [relevancy map] is generated based on the identified video content/features; para [0032], [0036], [0041]-[0043]); determine a relevance score for portions of the picture using the relevance map (a saliency map may be used to determine a saliency score; para [0017]-[0018], [0026], [0042]-[0043], [0046]-[0047]); and encode the portion of the picture with a bit rate determined at least in part by the relevance score (video coding bit rate is based on saliency score [relevance score], wherein a high saliency score results in a higher bit rate and a low saliency score results in a lower bit rate; para [0018], [0026], [0047]). It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the video encoding system of Naikal by using a relevancy map to adjust encoding bit rates, as suggested by Shi, for the purpose of optimizing video coding efficiency by enabling variable bit rates and greater video compression (para [0001], Shi).

Regarding Claims 12 and 14, modified Naikal discloses the method of claim 12, wherein the picture is represented by a plurality of coding units (encoding macroblocks [plurality of coding units]; para [0015], [0045]) and the relevance map is determined at the coding unit level with each coding unit having a coding unit relevance score (a saliency map and associated score is determined for a macroblock; para [0015], [0018], [0026], [0042]-[0047]).

Regarding Claim 13, while Naikal does not, Shi discloses the method of claim 13, wherein the encoding includes allocating bit rate for each coding unit (the rates are adjusted for each macroblock [coding unit] encoded; para [0045]-[0046]).

Regarding Claim 15, Shi discloses the method of claim 15, wherein the encoding includes at least one of intra prediction, motion estimation, and transform quantization (each macroblock may be encoded in intra-coded mode, inter-coded mode; para [0015]), and wherein the relative relevance score is used in an explicit rate distortion optimization mode to alter the encoding during at least one of the intra prediction, motion estimation, and transform quantization processes (saliency scores may be used to adjust and/or bias rate-distortion processes for encoding macroblocks, wherein each macroblock may be encoded in intra-coded mode, inter-coded mode; para [0015], [0045]).

Regarding Claim 16, Shi discloses the method of claim 15, wherein the relative relevance score is used in a rate distortion function to determine an adjusted bitrate for each coding unit (saliency scores may be used to adjust and/or bias rate-distortion processes for encoding macroblocks, wherein a high saliency score results in a higher bit rate and a low saliency score results in a lower bit rate; para [0015], [0045]).
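
The Shi mappings for Claims 10-16 all turn on one mechanism: a per-coding-unit relevance (saliency) score that scales the bit budget and, for Claims 15-16, biases the rate-distortion trade-off. A minimal sketch of that mechanism follows; the proportional split, the lambda adjustment, and all numbers are invented for illustration and are not Shi's actual formulation:

```python
def allocate_bits(relevance_scores, total_bits):
    """Split a picture's bit budget across coding units in proportion to their
    relevance scores (cf. Claims 10-13): higher score -> higher bit rate."""
    total = sum(relevance_scores)
    return [total_bits * s / total for s in relevance_scores]

def rd_cost(distortion, rate, lam, relevance, beta=0.5):
    """Relevance-biased rate-distortion cost (cf. Claims 15-16): J = D + L*R,
    with L shrunk for relevant units so the optimizer spends more bits there.
    The beta weighting is invented for illustration."""
    return distortion + (lam / (1.0 + beta * relevance)) * rate

# Example: four coding units; the third is most salient and gets most bits.
scores = [0.1, 0.2, 0.6, 0.1]
print(allocate_bits(scores, total_bits=10_000))  # [1000.0, 2000.0, 6000.0, 1000.0]
```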
Claims 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US 2014/0333775 to Naikal in view of US 6,512,793 to Maeda.

Regarding Claims 9 and 20, Naikal discloses the method of claim 1. Naikal fails to explicitly disclose wherein the bitstream includes a slice header and the slice header is used to signal the presence of an object in a given slice. Maeda teaches video coding systems (Col. 2, lines 8-16) and further teaches wherein the bitstream includes a slice header and the slice header is used to signal the presence of an object in a given slice (header includes data indicative of object in frame, wherein data includes location of the object as well as size, shape and texture of the object, wherein the header is a slice header when encoding using MPEG-4; Col. 8, lines 25-35, Col. 28, lines 62-67 and Col. 29, lines 1-13). It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the video coding system of Naikal by indicating object presence using header data, as suggested by Maeda, for the purpose of optimizing coding efficiency by indicating only portions of video frames that include important object data (Col. 1, lines 16-34, Maeda).
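
Claims 9 and 20 concern where the signal lives rather than how it is computed: a slice header announces that the slice carries object data. A minimal, hypothetical serialization sketch follows; the byte layout is invented and is not Maeda's (or any standard's) actual syntax:

```python
import struct

def pack_slice_header(slice_id: int, object_count: int) -> bytes:
    """Toy slice header: slice id, an object-present flag, and, when the flag
    is set, the object count (cf. Claims 9 and 20: the slice header signals
    object presence). The layout is invented for illustration."""
    present = 1 if object_count else 0
    header = struct.pack(">HB", slice_id, present)
    if present:
        header += struct.pack(">B", object_count)
    return header

print(pack_slice_header(3, object_count=1).hex())  # "00030101"
print(pack_slice_header(4, object_count=0).hex())  # "000400"
```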
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over US 2014/0269901 to Shi in view of US 2014/0333775 to Naikal.

Regarding Claim 17, Shi discloses the method of claim 13. Shi fails to explicitly disclose grouping at least a portion of the extracted features into at least one object; determining a region for the at least one object; assigning object identifiers to the at least one object; and encoding the object identifiers into the bitstream. Naikal teaches video encoding (para [0016]) and further teaches group at least a portion of the plurality of features into at least one object (feature data include data vectors that describe one or more features [group plurality of features] for objects in video data that are generated in each frame; para [0026], [0028], [0029], [0045]); determine a region for the at least one object (region corresponding to object in a video is processed using a bounding box [geometric representation]; Fig. 6 and para [0040]-[0041], [0050]-[0052], [0059]); assign object identifiers to the at least one object (feature descriptors [identifiers] are used as metadata to identify objects in video data; para [0018], [0021]); and encode the object identifiers into the bitstream (feature descriptors are encoded in transmitted video frames [bitstream] and are configured as metadata to identify objects in a video; para [0018], [0021], [0040]-[0043]). It would have been obvious to one of ordinary skill in the art at the time of the invention to modify the video coding system of Shi (Magnum Semiconductor) by encoding video using feature based object descriptors, as suggested by Naikal, for the purpose of enhancing system reliability in accurately detecting objects of interest in video frames (para [0004], Naikal).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Yamazaki, Satoshi, US 20240412385 A1, OBJECT TRACKING PROCESSING DEVICE, OBJECT TRACKING PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM
Yoshida, Jun et al., US 20240190591 A1, DETECTION DEVICE
INAMI, Ken, US 20220009499 A1, VEHICLE CONTROL SYSTEM
Cronje, Jaco, US 20210248400 A1, Operator Behavior Recognition System
ZHOU, Yun et al., US 20150379354 A1, METHOD AND SYSTEM FOR DETECTING MOVING OBJECTS
Tanaka, Yuki et al., US 20130163831 A1, OBJECT RECOGNITION APPARATUS AND DICTIONARY DATA REGISTRATION METHOD
Ito, Masamichi et al., US 20060282874 A1, RECEIVING APPARATUS AND METHOD

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NEIL MIKESKA, whose telephone number is (571) 272-3917. The examiner can normally be reached M-F, 6a-2p.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Jay Patel, can be reached at (571) 272-2988. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/NEIL R MIKESKA/
Primary Examiner, Art Unit 2485

Prosecution Timeline

May 01, 2024: Application Filed
Oct 24, 2025: Non-Final Rejection (§103)
Apr 02, 2026: Response Filed

Precedent Cases

Applications granted by this examiner in similar technology.

Patent 12604017: ENCODING METHOD, ENCAPSULATION METHOD, DISPLAY METHOD, APPARATUS, AND ELECTRONIC DEVICE (granted Apr 14, 2026; 2y 5m to grant)
Patent 12604003: METHODS AND APPARATUS OF ENCODING/DECODING VIDEO PICTURE PARTITIONED IN CTU GRIDS (granted Apr 14, 2026; 2y 5m to grant)
Patent 12587687: HIGH-LEVEL SYNTAX DESIGN FOR GEOMETRY-BASED POINT CLOUD COMPRESSION (granted Mar 24, 2026; 2y 5m to grant)
Patent 12581071: INTERACTION OF MULTIPLE PARTITIONS (granted Mar 17, 2026; 2y 5m to grant)
Patent 12563192: CONSTRAINTS ON PARTITIONING OF VIDEO BLOCKS (granted Feb 24, 2026; 2y 5m to grant)

Study what changed to get past this examiner; the list reflects the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 74%
Grant Probability with Interview: 81% (+7.0%)
Median Time to Grant: 2y 7m
PTA Risk: Low
Based on 491 resolved cases by this examiner; grant probability is derived from the career allow rate.
