Last updated: May 29, 2026

Application No. 18/745,591

INTELLIGENT MULTI-STREAM VIDEO CODING FOR VIDEO SURVEILLANCE

Final Rejection §103

Filed

Jun 17, 2024

Priority

Dec 23, 2021 — provisional 63/293,172 +1 more

Examiner

TOPGYAL, GELEK W

Art Unit

2481

Tech Center

2400 — Computer Networks

Assignee

Op Solutions LLC

OA Round

2 (Final)

This examiner grants 59% of resolved cases, rising to about 78% with an interview (+18.7% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.

Based on 611 resolved cases, 2023–2026

Examiner Intelligence

TOPGYAL, GELEK W

Grants 59% of resolved cases

Career Allowance Rate

360 granted / 611 resolved

+0.9% vs TC avg

Strong interview lift: +18.7% (resolved cases with an interview vs. without)

Typical timeline

3y 7m

Avg Prosecution

19 currently pending

Career history

642

Total Applications

across all art units

Statute-Specific Performance

§101: 0.6% (-39.4% vs TC avg)
§103: 84.3% (+44.3% vs TC avg)
§102: 12.2% (-27.8% vs TC avg)
§112: 0.6% (-39.4% vs TC avg)

Tech Center averages are estimates. Based on career data from 611 resolved cases.

Office Action

§103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-10 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2022/0006960) in view of Liu et al. (US 2022/0116627).
Regarding claim 1, Kim teaches a system for video surveillance comprising a plurality of cameras (Fig. 1), each of said cameras (Fig. 1, 121, 122, 123, 13-1-13-n, etc.) capturing video content, and comprising:
an action recognition engine (each camera detects events/actions as discussed in Figs. 3-4, paragraph 53 teaches image analysis for a first camera and 54 teaches for additional 2nd or 3rd cameras that also performs image analysis), the action recognition engine classifying the video content to at least one of a plurality of predetermined actions (each camera detects events/actions as discussed in paragraphs 53 and 89), the action recognition engine having an interface enabling communication (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras) with at least one other action recognition engine, whereby detected actions and tasks related to the detected action can be exchanged with another of the plurality of cameras (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras. Fig. 5 and corresponding disclosure teaches wherein communication between a first and second camera takes place. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7);
a video encoder receiving the video content, detected action content and providing an encoded video substream for human viewing, an encoded feature substream for the detected action content (Figs. 2-4, video encoding is performed by the main processor 250 to generate the video “VD/AD” substream. Figs. 2-4, the system’s processors 250 and 270 combines the generated substream VD/AD and the metadata generated from the features of the action/event detected in the form of META1 and META2, the combined stream is transmitted to another camera);
a multiplexor receiving the encoded feature substream and encoded video substream and outputting an encoded camera bitstream including encoded video content and detected action content (Figs. 2-4, the system’s processors 250 and 270 combines the generated substream VD/AD and the metadata generated from the features of the action/event detected in the form of META1 and META2, the combined stream is transmitted to another camera).
However, while Kim teaches a system of cameras able to communicate with each other and create an encoded stream sharing video/audio and features of the detected action/event in the form of META1 and META2, Kim fails to explicitly teach the following limitations, which Liu teaches:
a feature extractor comprising a partial neural network, the feature extractor receiving the captured video content and generating a plurality of features representing the video content (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143. Paragraph 30 teaches that the feature extraction is an artificial neural network, which is typically operated on a front-end layer of a neural network on a front-end portion of a task-specific neural network, much like in Liu.);
a feature encoder operatively coupled to the feature extractor and generating an encoded feature substream for a machine vision task (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143);
a video encoder receiving the video content and feature substream and providing an encoded video substream for human viewing, an encoded feature substream for a machine vision application (Fig. 1 and paragraphs 28-32 teach the creation of a full encoded stream that includes portions of the video encoded stream (which is for human viewing) and a feature stream (for machine vision task) multiplexed together into a bitstream).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Liu into the system of Kim such that the combined bitstream includes both teachings of Kim (detected action content) and Liu (feature stream for machine vision tasks), because said incorporation allows for the benefit of improving surveillance systems by improving the efficiency of the combined system by sharing metrics with other cameras/systems (paragraphs 6-7, 56 and 99).
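For readers unfamiliar with the arrangement at issue in claim 1, the idea of an encoded video substream and an encoded feature substream combined by a multiplexor into one camera bitstream can be pictured with a toy sketch. Nothing here is drawn from Kim, Liu, or the application: the tag values, the length-prefixed framing, and the function names `mux` and `demux` are hypothetical and chosen only for illustration.

```python
import struct

# Hypothetical one-byte tags identifying each substream (illustrative only).
VIDEO, FEATURE = 0x01, 0x02
HEADER = ">BI"  # 1-byte tag + 4-byte big-endian payload length
HEADER_SIZE = struct.calcsize(HEADER)

def mux(substreams):
    """Concatenate (tag, payload) pairs into one length-prefixed bitstream."""
    out = bytearray()
    for tag, payload in substreams:
        out += struct.pack(HEADER, tag, len(payload)) + payload
    return bytes(out)

def demux(bitstream):
    """Split a bitstream produced by mux() back into (tag, payload) pairs."""
    pos, parts = 0, []
    while pos < len(bitstream):
        tag, length = struct.unpack_from(HEADER, bitstream, pos)
        pos += HEADER_SIZE
        parts.append((tag, bitstream[pos:pos + length]))
        pos += length
    return parts

# Placeholder payloads standing in for real encoder output.
camera_bitstream = mux([(VIDEO, b"encoded-video"), (FEATURE, b"encoded-features")])
recovered = demux(camera_bitstream)
```

A receiver could route `VIDEO` payloads to a display decoder and `FEATURE` payloads to a machine vision task; real systems would use a standardized container rather than this ad hoc framing.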

Regarding claims 2 and 7, Kim teaches the claimed wherein upon detection of a predetermined action by a first camera, the first camera communicates at least one task to a second camera (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras. Fig. 5 and corresponding disclosure teaches wherein communication between a first and second camera takes place. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7).
Regarding claims 3 and 8, Kim teaches the claimed wherein upon detection of a predetermined action by a first camera, the first camera communicates a first task to a second camera and a second task to a third camera (Fig. 5 and corresponding disclosure teaches wherein communication between a first and second camera takes place. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7. Fig. 15, steps S520-S530 results in a third camera being contacted and sent a task).
Regarding claims 4 and 9, Kim teaches the claimed wherein the tasks include at least one of object detection, object count, object tracking, and object identification (paragraphs 53-54).
Regarding claims 5 and 10, Kim teaches the claimed wherein the objects are human and the object identification includes facial recognition (paragraph 49, 53 and 89).
Regarding claim 6, Kim teaches a system for video surveillance comprising a plurality of cameras (Fig. 1), each of said cameras capturing video content (Fig. 1, 121, 122, 123, 13-1-13-n, etc.) and comprising:
an action recognition engine (each camera detects events/actions as discussed in Figs. 3-4, paragraph 53 teaches image analysis for a first camera and 54 teaches for additional 2nd or 3rd cameras that also performs image analysis), the action recognition engine classifying the video content to at least one of a plurality of predetermined actions (each camera detects events/actions as discussed in paragraphs 53 and 89), the action recognition engine having an interface enabling communication (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras) with at least one other action recognition engine, whereby detected actions and tasks related to the detected action can be exchanged with another of the plurality of cameras (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras. Fig. 5 and corresponding disclosure teaches wherein communication between a first and second camera takes place. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7);
a video encoder receiving the video content and providing an encoded video bitstream for a human viewer (Figs. 2-4, video encoding is performed by the main processor 250 to generate the video “VD/AD” substream);
a feature encoder operatively coupled to the action recognition engine and generating an encoded feature set therefrom (Figs. 2-4, processor 270 performs the feature encoding to generate metadata DOUT, which includes a substream for metadata (META1 and META2));
at least one of said plurality of cameras having a feature multiplexor receiving the encoded feature sets from at least one other camera (META2 or META1 based on the situation) and outputting an encoded feature bitstream and detected action content for a plurality of cameras (Figs. 2-4, the system’s processors 250 and 270 combines the generated substream VD/AD and the metadata generated from the features of the action/event detected in the form of META1 and META2, the combined stream is transmitted/output for the plurality of cameras).
However, while Kim teaches a system of cameras able to communicate with each other and create an encoded stream sharing video/audio and features of the detected action/event in the form of META1 and META2, Kim fails to explicitly teach the following limitations, which Liu teaches:
a feature extractor comprising a partial neural network, the feature extractor receiving the captured video content and generating a plurality of features representing the video content (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143. Paragraph 30 teaches that the feature extraction is an artificial neural network, which is typically operated on a front-end layer of a neural network on a front-end portion of a task-specific neural network, much like in Liu.);
a feature encoder operatively coupled to the feature extractor and generating an encoded feature set for a machine vision task (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143);
Liu also teaches a video encoder (Fig. 1 and paragraphs 28-32 teach the creation of a full encoded stream that includes portions of the video encoded stream (which is for human viewing) and a feature stream (for machine vision task) multiplexed together into a bitstream).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Liu into the system of Kim such that the combined bitstream includes both teachings of Kim (detected action content) and Liu (feature stream for machine vision tasks), because said incorporation allows for the benefit of improving surveillance systems by improving the efficiency of the combined system by sharing metrics with other cameras/systems (paragraphs 6-7, 56 and 99).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
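The reply periods described above can be turned into calendar dates with a short calculation. The sketch below is illustrative only: it assumes the mailing date is the Apr 01, 2026 "Final Rejection mailed" entry shown on this page's prosecution timeline, and it ignores weekend and federal holiday roll-over, extension fees, and the advisory action exception. The helper `add_months` is a hypothetical name, not part of any USPTO tool.

```python
import calendar
from datetime import date

def add_months(d: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's end."""
    idx = d.month - 1 + months
    year = d.year + idx // 12
    month = idx % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

# Assumption: mailing date taken from this page's prosecution timeline.
mailing_date = date(2026, 4, 1)

two_month_mark = add_months(mailing_date, 2)        # first-reply window for the advisory action rule
shortened_period_end = add_months(mailing_date, 3)  # THREE-MONTH shortened statutory period
statutory_limit = add_months(mailing_date, 6)       # SIX-MONTH outer limit

print("Two-month mark:      ", two_month_mark.isoformat())
print("Shortened period end:", shortened_period_end.isoformat())
print("Six-month limit:     ", statutory_limit.isoformat())
```

Any date computed this way should be checked against the mailing date on the official action and the applicable rules before it is docketed.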

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GELEK W TOPGYAL whose telephone number is (571)272-8891. The examiner can normally be reached M-F (9:30-6 PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Vaughn can be reached at 571-272-3922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GELEK W TOPGYAL/           Primary Examiner, Art Unit 2481


Prosecution Timeline

Jun 17, 2024

Application Filed

Jun 17, 2025

Non-Final Rejection mailed — §103

Dec 13, 2025

Response Filed

Apr 01, 2026

Final Rejection mailed — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/923,815

Patent 12633158

FAKE FINGER DETERMINATION APPARATUS AND FAKE FINGER DETERMINATION METHOD

1y 6m to grant Granted May 19, 2026

18/493,140

Patent 12626726

EDITING DEVICE, IMAGE PROCESSING DEVICE, TERMINAL DEVICE, EDITING METHOD, IMAGE PROCESSING METHOD, AND PROGRAM

2y 6m to grant Granted May 12, 2026

18/618,819

Patent 12614342

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM

2y 1m to grant Granted Apr 28, 2026

18/725,149

Patent 12601836

RADIO-WAVE SENSOR INSTALLATION ASSISTANCE DEVICE, COMPUTER PROGRAM, AND RADIO-WAVE SENSOR INSTALLATION POSITION DETERMINATION METHOD

1y 9m to grant Granted Apr 14, 2026

18/726,136

Patent 12597341

INSTALLATION SUPPORT DEVICE FOR RADIO WAVE SENSOR, COMPUTER PROGRAM, METHOD OF DETERMINING INSTALLATION POSITION OF RADIO WAVE SENSOR, AND METHOD OF SUPPORTING INSTALLATION OF RADIO WAVE SENSOR

1y 9m to grant Granted Apr 07, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4

Expected OA Rounds

59%

Grant Probability

78%

With Interview (+18.7%)

3y 7m (~1y 8m remaining)

Median Time to Grant

Moderate

PTA Risk

Based on 611 resolved cases by this examiner. Grant probability derived from career allowance rate.
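The headline percentages can be reproduced from the counts reported on this page. The sketch below assumes the interview lift is additive, in percentage points, on top of the career allowance rate; the page does not state this explicitly, but that reading is consistent with the 59% and 78% figures shown.

```python
# Figures copied from this page.
granted = 360
resolved = 611
interview_lift_pts = 18.7  # assumption: percentage points added to the base rate

# Career allowance rate as a percentage of resolved cases.
allowance_rate = 100 * granted / resolved

# Projected grant probability with an interview, under the additive assumption.
with_interview = allowance_rate + interview_lift_pts

print(f"Career allowance rate: {allowance_rate:.1f}% (displayed as {round(allowance_rate)}%)")
print(f"With interview:        {with_interview:.1f}% (displayed as {round(with_interview)}%)")
```

These are descriptive statistics over one examiner's past cases, not a prediction for this application.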