Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Response to Arguments
Applicant’s arguments with respect to claim(s) 1-10 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 2022/0006960) in view of Liu et al. (US 2022/0116627).
Regarding claim 1, Kim teaches a system for video surveillance comprising a plurality of cameras (Fig. 1), each of said cameras (Fig. 1, 121, 122, 123, 13-1-13-n, etc.) capturing video content, and comprising:
an action recognition engine (each camera detects events/actions as discussed in Figs. 3-4; paragraph 53 teaches image analysis for a first camera, and paragraph 54 teaches additional second or third cameras that also perform image analysis), the action recognition engine classifying the video content to at least one of a plurality of predetermined actions (each camera detects events/actions as discussed in paragraphs 53 and 89), the action recognition engine having an interface enabling communication (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras) with at least one other action recognition engine, whereby detected actions and tasks related to the detected action can be exchanged with another of the plurality of cameras (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras. Fig. 5 and the corresponding disclosure teach communication between a first and a second camera. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7);
a video encoder receiving the video content, detected action content and providing an encoded video substream for human viewing, an encoded feature substream for the detected action content (Figs. 2-4, video encoding is performed by the main processor 250 to generate the video “VD/AD” substream. Figs. 2-4, the system’s processors 250 and 270 combine the generated substream VD/AD and the metadata generated from the features of the detected action/event in the form of META1 and META2; the combined stream is transmitted to another camera);
a multiplexor receiving the encoded feature substream and encoded video substream and outputting an encoded camera bitstream including encoded video content and detected action content (Figs. 2-4, the system’s processors 250 and 270 combine the generated substream VD/AD and the metadata generated from the features of the detected action/event in the form of META1 and META2; the combined stream is transmitted to another camera).
However, while Kim teaches a system of cameras able to communicate with each other and create an encoded stream sharing video/audio and features of the detected action/event in the form of META1 and META2, Kim fails to explicitly teach the following limitations, which Liu teaches:
a feature extractor comprising a partial neural network, the feature extractor receiving the captured video content and generating a plurality of features representing the video content (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143. Paragraph 30 teaches that the feature extraction is an artificial neural network, which is typically operated as a front-end layer of a neural network, i.e., a front-end portion of a task-specific neural network, much like in Liu.);
a feature encoder operatively coupled to the feature extractor and generating an encoded feature substream for a machine vision task (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143);
a video encoder receiving the video content and feature substream and providing an encoded video substream for human viewing, an encoded feature substream for a machine vision application (Fig. 1 and paragraphs 28-32 teach the creation of a full encoded stream that includes portions of the video encoded stream (which is for human viewing) and a feature stream (for a machine vision task) multiplexed together into a bitstream).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Liu into the system of Kim such that the combined bitstream includes the teachings of both Kim (detected action content) and Liu (feature stream for machine vision tasks), because said incorporation improves the efficiency of the combined surveillance system by sharing metrics with other cameras/systems (paragraphs 6-7, 56 and 99).
Regarding claims 2 and 7, Kim teaches the claimed limitation wherein, upon detection of a predetermined action by a first camera, the first camera communicates at least one task to a second camera (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras. Fig. 5 and the corresponding disclosure teach communication between a first and a second camera. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7).
Regarding claims 3 and 8, Kim teaches the claimed limitation wherein, upon detection of a predetermined action by a first camera, the first camera communicates a first task to a second camera and a second task to a third camera (Fig. 5 and the corresponding disclosure teach communication between a first and a second camera. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7. Fig. 15, steps S520-S530 result in a third camera being contacted and sent a task).
Regarding claims 4 and 9, Kim teaches the claimed limitation wherein the tasks include at least one of object detection, object count, object tracking, and object identification (paragraphs 53-54).
Regarding claims 5 and 10, Kim teaches the claimed limitation wherein the objects are human and the object identification includes facial recognition (paragraphs 49, 53 and 89).
Regarding claim 6, Kim teaches a system for video surveillance comprising a plurality of cameras (Fig. 1), each of said cameras capturing video content (Fig. 1, 121, 122, 123, 13-1-13-n, etc.), and comprising:
an action recognition engine (each camera detects events/actions as discussed in Figs. 3-4; paragraph 53 teaches image analysis for a first camera, and paragraph 54 teaches additional second or third cameras that also perform image analysis), the action recognition engine classifying the video content to at least one of a plurality of predetermined actions (each camera detects events/actions as discussed in paragraphs 53 and 89), the action recognition engine having an interface enabling communication (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras) with at least one other action recognition engine, whereby detected actions and tasks related to the detected action can be exchanged with another of the plurality of cameras (Fig. 2, Network Interface located on a plurality of cameras allows communication with other cameras. Fig. 5 and the corresponding disclosure teach communication between a first and a second camera. The detection of actions/events in the transmitted video is sent between cameras as noted in paragraph 7);
a video encoder receiving the video content and providing an encoded video bitstream for a human viewer (Figs. 2-4, video encoding is performed by the main processor 250 to generate the video “VD/AD” substream);
a feature encoder operatively coupled to the action recognition engine and generating an encoded feature set therefrom (Figs. 2-4, processor 270 performs the feature encoding to generate metadata DOUT, which includes a substream for metadata (META1 and META2));
at least one of said plurality of cameras having a feature multiplexor receiving the encoded feature sets from at least one other camera (META2 or META1 based on the situation) and outputting an encoded feature bitstream and detected action content for a plurality of cameras (Figs. 2-4, the system’s processors 250 and 270 combine the generated substream VD/AD and the metadata generated from the features of the detected action/event in the form of META1 and META2; the combined stream is transmitted/output for the plurality of cameras).
However, while Kim teaches a system of cameras able to communicate with each other and create an encoded stream sharing video/audio and features of the detected action/event in the form of META1 and META2, Kim fails to explicitly teach the following limitations, which Liu teaches:
a feature extractor comprising a partial neural network, the feature extractor receiving the captured video content and generating a plurality of features representing the video content (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143. Paragraph 30 teaches that the feature extraction is an artificial neural network, which is typically operated as a front-end layer of a neural network, i.e., a front-end portion of a task-specific neural network, much like in Liu.);
a feature encoder operatively coupled to the feature extractor and generating an encoded feature set for a machine vision task (paragraph 29 teaches the claimed separate feature extractor 141 within an encoder that is able to generate an output feature stream output from the feature encoding 143).
Liu also teaches a video encoder (Fig. 1 and paragraphs 28-32 teach the creation of a full encoded stream that includes portions of the video encoded stream (which is for human viewing) and a feature stream (for a machine vision task) multiplexed together into a bitstream).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the current application to incorporate the teachings of Liu into the system of Kim such that the combined bitstream includes the teachings of both Kim (detected action content) and Liu (feature stream for machine vision tasks), because said incorporation improves the efficiency of the combined surveillance system by sharing metrics with other cameras/systems (paragraphs 6-7, 56 and 99).
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GELEK W TOPGYAL whose telephone number is (571)272-8891. The examiner can normally be reached M-F (9:30-6 PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Vaughn, can be reached at 571-272-3922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GELEK W TOPGYAL/ Primary Examiner, Art Unit 2481