Prosecution Insights
Last updated: April 19, 2026
Application No. 17/840,565

Intelligent Multi-Camera Switching with Machine Learning

Status: Non-Final OA (§103)
Filed: Jun 14, 2022
Examiner: ABOUZAHRA, MAHMOUD KAMAL
Art Unit: 2486
Tech Center: 2400 — Computer Networks
Assignee: Hewlett-Packard Development Company, L.P.
OA Round: 6 (Non-Final)
Grant Probability: 57% (Moderate)
Expected OA Rounds: 6-7
Time to Grant: 2y 7m
With Interview: 62%

Examiner Intelligence

Career Allow Rate: 57% (16 granted / 28 resolved; -0.9% vs TC avg)
Interview Lift: +4.4% (minimal; based on resolved cases with interview)
Avg Prosecution: 2y 7m (41 applications currently pending)
Career History: 69 total applications across all art units

Statute-Specific Performance

§101: 0.5% (-39.5% vs TC avg)
§103: 74.2% (+34.2% vs TC avg)
§102: 12.2% (-27.8% vs TC avg)
§112: 5.4% (-34.6% vs TC avg)

TC averages are estimates; based on career data from 28 resolved cases.
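
These headline numbers are simple ratios over the examiner's resolved cases, so they can be reproduced directly. The short Python sketch below is a sanity check only; the exact formulas and field names are assumptions, since the report does not publish its methodology.

# Sanity-check the dashboard arithmetic. The formulas here are assumptions;
# the report does not publish its methodology.

granted, resolved = 16, 28
allow_rate = granted / resolved               # 0.571 -> "57% Career Allow Rate"

interview_lift = 0.044                        # "+4.4% Interview Lift"
with_interview = allow_rate + interview_lift  # 0.615 -> "62% With Interview"

# Each statute line pairs the examiner's rate with a delta vs. the Tech
# Center average, so the TC average itself is implied by subtraction.
examiner_103, delta_103 = 0.742, 0.342        # "§103: 74.2% (+34.2% vs TC avg)"
tc_avg_103 = examiner_103 - delta_103         # implied TC average: ~40.0%

print(f"{allow_rate:.0%} / {with_interview:.0%} / {tc_avg_103:.1%}")
# 57% / 62% / 40.0%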

Office Action (§103)
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/22/2026 has been entered.

Response to Amendment

The Amendment filed 01/22/2026 has been entered. Claims 21-38 and 40-41 are pending in this application. Claims 21, 30, and 38 have been amended. Claims 1-20 and 39 are cancelled. Claims 40-41 are new.

Response to Arguments

Applicant's arguments with respect to claims 21, 30, and 38 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 21-24, 26-38, and 40 are rejected under 35 U.S.C. 103 as being unpatentable over Jinwei Feng (US 20110285807 A1) (hereinafter Feng) in view of Mehdi Seyfi (US 20190130594 A1) (hereinafter Seyfi), further in view of Peter Chu (US 20190158733 A1) (hereinafter Chu):

Regarding Claim 21, Feng teaches a method (method for video conferencing [0006]) comprising: capturing, by a primary camera in an environment, a video stream of the environment (a first room-view camera captures the environment [0007], [0039], [0062]); capturing, in the environment by a plurality of microphones, sound from a speaker (microphones in the environment capture audio from the speaker [0037], [0054], [0103]); determining, by a processing unit from the sound captured by the plurality of microphones, direction information that indicates a position of the speaker in the environment (an audio processing unit determines, from the microphone-captured audio, direction information indicating the location of the speaker [0045]-[0046], [0100], [0103]-[0104]); capturing the view of the speaker based on the direction information (capture the best view of the speaker based on the direction information [0057], [0068]); and capturing, by the secondary camera, video of the environment for transmission to a far end (capturing, by a secondary camera, a video of the environment and transmitting the captured video to the far end [0038], [0043], [0048], [0102]).

Feng does not explicitly teach the following limitations; however, in an analogous art, Seyfi teaches determining, by the processing unit from an image in the video stream, a facial pose of the speaker (determines the facial pose from an image containing a detected face [0008], [0049]-[0050], [0035]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing disclosed by Feng to add the facial pose determination of Seyfi to improve the accuracy of the estimated location of the person of interest (Seyfi [0069]).

Feng in view of Seyfi does not explicitly teach the following limitations; however, in an analogous art, Chu teaches determining, by the processing unit from the facial pose, pose information that indicates a direction in which the speaker is looking (the head orientation information indicates the direction in which the person is looking [0067]); and selecting, by the processing unit based on the pose information, the position information, and the direction information, a secondary camera in the environment (selecting, based on the location of the speaker and the head orientation information, a secondary camera in the environment that has a better view of the speaker [0008], [0028], [0067]-[0068], [0081]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi to further add the speaker pose information and camera selection disclosed by Chu to increase the tracking accuracy (Chu [0059]).

Regarding Claim 22, Feng in view of Seyfi and Chu teach the method of claim 21. Feng further teaches wherein the environment is a room (the environment is a room [0042]).

Regarding Claim 23, Feng in view of Seyfi and Chu teach the method of claim 21. Feng further teaches selecting in the environment, by the processing unit, a camera that provides a view of the face of the speaker (selecting a camera that has a view of the face of the speaker [0007]-[0008], [0081], [0100], [0129]-[0130]).
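
The claim 21 mapping combines three signals: a speaker direction derived from the microphones, a gaze direction inferred from the facial pose, and the placement of the available cameras. As a rough illustration of that selection logic only (the application and the cited references disclose no source code, and every name below is hypothetical), a bearing-based selector might look like:

from dataclasses import dataclass

@dataclass
class Camera:
    name: str
    bearing_deg: float  # direction from the speaker toward this camera

def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def select_secondary_camera(gaze_bearing_deg: float, cameras: list) -> Camera:
    """Pick the camera closest to the direction the speaker is looking,
    i.e., the one most likely to capture a frontal view of the face."""
    return min(cameras, key=lambda c: angular_distance(c.bearing_deg, gaze_bearing_deg))

# Sound source localization gives the speaker's position; the facial pose
# estimated from the primary camera's video gives the gaze bearing.
cameras = [Camera("primary", 180.0), Camera("left", 90.0), Camera("right", 270.0)]
print(select_secondary_camera(gaze_bearing_deg=100.0, cameras=cameras).name)  # "left"
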
Regarding Claim 24, Feng in view of Seyfi and Chu teach the method of claim 21. Feng further teaches selecting in the environment, by the processing unit in an absence of the sound, a camera that provides facial views of attendees in the environment (when no speaker is detected (step 208), the camera is switched to the room-view camera that includes all participants (step 204) [0082]-[0083], [0095]-[0096], Fig. 5).

Regarding Claim 26, Feng in view of Seyfi and Chu teach the method of claim 21. Seyfi further teaches implementing, by the primary camera for individual face and pose detection, machine learning based on neural networks (using the image data from the camera to detect faces and estimate pose with a machine-learning neural network [0049]-[0050], [0052]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing disclosed by Feng to add the facial pose determination of Seyfi to improve the accuracy of the estimated location of the person of interest (Seyfi [0069]).

Regarding Claim 27, Feng in view of Seyfi and Chu teach the method of claim 21. Seyfi further teaches implementing, by the secondary camera for individual face and pose detection, machine learning based on neural networks (using the image data from the camera to detect faces and estimate pose with a machine-learning neural network [0049]-[0050], [0052]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing disclosed by Feng to add the facial pose determination of Seyfi to improve the accuracy of the estimated location of the person of interest (Seyfi [0069]).

Regarding Claim 28, Feng in view of Seyfi and Chu teach the method of claim 21. Chu further teaches performing, by the processing unit on audio from each microphone in the plurality of microphones, sound source localization to determine a particular individual who is speaking (sound source localization uses the audio from the microphones to determine the location of the speaker [0077]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi to further add the speaker pose information and camera selection disclosed by Chu to increase the tracking accuracy (Chu [0059]).

Regarding Claim 29, Feng in view of Seyfi and Chu teach the method of claim 21. Feng further teaches coupling the primary camera to the plurality of microphones (the camera and the microphones are coupled [0065]).
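
Claims 28 and 35 turn on sound source localization across the microphone array. One standard way to do this for a single microphone pair, shown below, is to estimate the time difference of arrival with GCC-PHAT and convert it to a bearing; this is a generic textbook sketch, not the method actually used by Chu at [0077], and the sample rate and microphone spacing are illustrative assumptions.

import numpy as np

def gcc_phat_tdoa(x: np.ndarray, y: np.ndarray, fs: float) -> float:
    """Estimate the delay of y relative to x, in seconds, via GCC-PHAT."""
    n = 2 * max(len(x), len(y))                # zero-pad to avoid wraparound
    X, Y = np.fft.rfft(x, n), np.fft.rfft(y, n)
    R = np.conj(X) * Y
    R /= np.abs(R) + 1e-12                     # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n)
    cc = np.concatenate((cc[-n // 2:], cc[:n // 2]))   # center zero lag
    return (int(np.argmax(np.abs(cc))) - n // 2) / fs

def bearing_deg(tdoa: float, mic_spacing_m: float, c: float = 343.0) -> float:
    """Convert a TDOA into an arrival angle relative to the microphone axis."""
    s = np.clip(c * tdoa / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arccos(s)))

# Synthetic check: delay one channel by 5 samples and recover the bearing.
fs, delay = 16000, 5
sig = np.random.default_rng(0).standard_normal(4096)
x, y = sig, np.concatenate((np.zeros(delay), sig[:-delay]))
tdoa = gcc_phat_tdoa(x, y, fs)                 # ~= 5 / 16000 s
print(round(bearing_deg(tdoa, mic_spacing_m=0.15), 1))  # ~44.4 degrees
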
Regarding Claim 30, Feng teaches a system (apparatus for video conferencing [0006]) comprising: a plurality of cameras in an environment, one of the cameras being a primary camera to capture a video stream of the environment (multiple cameras in the environment, including a first room-view camera that captures the environment [0007], [0039], [0062]); a plurality of microphones in the environment, the microphones to capture sound in the environment (microphones in the environment capture audio from the speaker [0037], [0054], [0103]); a processing unit to: determine, in response to the sound from a speaker in the environment, direction information that indicates a position of the speaker in the environment (an audio processing unit determines, from the microphone-captured audio, direction information indicating the location of the speaker [0045]-[0046], [0100], [0103]-[0104]); and a secondary camera to capture video of the environment for transmission to a far end (capturing, by a secondary camera, a video of the environment and transmitting the captured video to the far end [0038], [0043], [0048], [0102]).

Feng does not explicitly teach the following limitations; however, in an analogous art, Seyfi teaches determine, from an image in the video stream, a facial pose of the speaker (determines the facial pose from an image containing a detected face [0008], [0049]-[0050], [0035]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing disclosed by Feng to add the facial pose determination of Seyfi to improve the accuracy of the estimated location of the person of interest (Seyfi [0069]).

Feng in view of Seyfi does not explicitly teach the following limitations; however, in an analogous art, Chu teaches determine, from the facial pose, pose information that indicates a direction in which the speaker is looking (the head orientation information indicates the direction in which the person is looking [0067]); and select, from the plurality of cameras based on the pose information and the direction information, a secondary camera (selecting, based on the location of the speaker and the head orientation information, a secondary camera in the environment that has a better view of the speaker [0008], [0028], [0067]-[0068], [0081]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi to further add the speaker pose information and camera selection disclosed by Chu to increase the tracking accuracy (Chu [0059]).

Regarding Claim 31, Feng in view of Seyfi and Chu teach the system of claim 30. Feng further teaches wherein the environment is a room (the environment is a room [0042]).

Regarding Claim 32, Feng in view of Seyfi and Chu teach the system of claim 30. Feng further teaches wherein the primary camera comprises the processing unit (the cameras comprise processing units (parts 52A and 52B) [0062]).

Regarding Claim 33, Feng in view of Seyfi and Chu teach the system of claim 30. Feng further teaches wherein the processing unit is to select a camera in the environment that provides a view of a face of the speaker (selecting a camera that has a view of the face of the speaker [0007]-[0008], [0081], [0100], [0129]-[0130]).

Regarding Claim 34, Feng in view of Seyfi and Chu teach the system of claim 30.
Feng further teaches the processing unit is to select, in an absence of attendees in the environment, the secondary camera (a secondary camera is selected based on the location of the speaker, and the camera with the best view is selected [0079]-[0081]).

Regarding Claim 35, Feng in view of Seyfi and Chu teach the system of claim 30. Chu further teaches wherein the processing unit is to perform, on audio from each microphone of the plurality of microphones, sound source localization to determine a particular individual who is speaking (sound source localization uses the audio from the microphones to determine the location of the speaker [0077]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi to further add the speaker pose information and camera selection disclosed by Chu to increase the tracking accuracy (Chu [0059]).

Regarding Claim 36, Feng in view of Seyfi and Chu teach the system of claim 30. Seyfi further teaches the primary camera is to implement, for individual face and pose detection, machine learning based on neural networks (using the image data from the camera to detect faces and estimate pose with a machine-learning neural network [0049]-[0050], [0052]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing disclosed by Feng to add the facial pose determination of Seyfi to improve the accuracy of the estimated location of the person of interest (Seyfi [0069]).

Regarding Claim 37, Feng in view of Seyfi and Chu teach the system of claim 30. Seyfi further teaches wherein the secondary camera is to implement, for individual face and pose detection, machine learning based on neural networks (using the image data from the camera to detect faces and estimate pose with a machine-learning neural network [0049]-[0050], [0052]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing disclosed by Feng to add the facial pose determination of Seyfi to improve the accuracy of the estimated location of the person of interest (Seyfi [0069]).

Regarding Claim 38, Feng teaches a non-transitory computer-readable storage medium storing instructions that, when executed by a processing unit (non-transitory computer-readable storage and a processor [0186]), cause the processing unit to: determine, from sound captured in an environment by a plurality of microphones, direction information that indicates a position of a speaker in the environment (an audio processing unit determines, from the microphone-captured audio, direction information indicating the location of the speaker [0045]-[0046], [0100], [0103]-[0104]); and select from a plurality of cameras, based on facial information and coordinates of each camera of the plurality of cameras, a secondary camera for transmitting video of the environment to a far end (the position of each camera in the environment is known, and the facial information is used to determine which secondary camera to choose; the captured video is transmitted to the far end [0038], [0043], [0048], [0102], [0133]).
Feng does not explicitly teach the following limitations; however, in an analogous art, Seyfi teaches determine, from an image in a video stream of the environment captured by a primary camera, a facial pose of a speaker in the environment (determines the facial pose from an image containing a detected face [0008], [0049]-[0050], [0035]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing disclosed by Feng to add the facial pose determination of Seyfi to improve the accuracy of the estimated location of the person of interest (Seyfi [0069]).

Feng in view of Seyfi does not explicitly teach the following limitations; however, in an analogous art, Chu teaches determine, from the facial pose, pose information that indicates a direction in which the speaker is looking (the head orientation information indicates the direction in which the person is looking [0067]); and select from a plurality of cameras, based on the pose information, a secondary camera (selecting a secondary camera that has the best facial view of the speaker [0028], [0067], [0071], [0081]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi to further add the speaker pose information and camera selection disclosed by Chu to increase the tracking accuracy (Chu [0059]).

Regarding Claim 40, Feng in view of Seyfi and Chu teach the method of claim 21. Feng further teaches detecting, by the processing unit, a plurality of faces in the image in the video stream (detecting faces in the video stream using facial recognition techniques [0129]); determining, by the processing unit, the speaker using the direction information and the detected faces (the speaker is determined using the direction information and facial recognition [0008], [0136], [0163], [0179]-[0181]); detecting, by the processing unit, a visibility of the speaker in the video of the environment (detects whether the speaker is visible and framed in the captured video of the environment [0125], [0181]); and in response to the visibility of the speaker being adequate, transmitting the video of the environment to the far end (once the speaker is framed correctly, the video is transmitted to the far end [0181]).

Feng does not explicitly teach the following limitations; however, in an analogous art, Chu further teaches wherein a field of view (FOV) of the primary camera includes the secondary camera (Fig. 5A shows that the primary camera's FOV includes the secondary camera, and the secondary camera's FOV includes the primary camera). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi to further add the speaker pose information and camera selection disclosed by Chu to increase the tracking accuracy (Chu [0059]).

Claim 25 is rejected under 35 U.S.C. 103 as being unpatentable over Jinwei Feng (US 20110285807 A1) (hereinafter Feng) in view of Mehdi Seyfi (US 20190130594 A1) (hereinafter Seyfi), in view of Peter Chu (US 20190158733 A1) (hereinafter Chu), further in view of Yibo Liu (US 20140049595 A1) (hereinafter Liu):

Regarding Claim 25, Feng in view of Seyfi and Chu teach the method of claim 21 but fail to explicitly teach selecting, by the processing unit, the secondary camera in an absence of attendees in the environment. However, Liu, in an analogous art, teaches selecting, by the processing unit, the secondary camera in an absence of attendees in the environment (the video data is analyzed (block 256), and when no faces are detected the secondary camera (block 254) is used to capture the images (Fig. 5)). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi and Chu to further add the detection of occupied and unoccupied environments of Liu to improve the reliability of participant detection in an environment (Liu [0111]).

Claim 41 is rejected under 35 U.S.C. 103 as being unpatentable over Jinwei Feng (US 20110285807 A1) (hereinafter Feng) in view of Mehdi Seyfi (US 20190130594 A1) (hereinafter Seyfi), in view of Peter Chu (US 20190158733 A1) (hereinafter Chu), further in view of Peter L. Chu (US 20120320143 A1) (hereinafter Peter):

Regarding Claim 41, Feng in view of Seyfi and Chu teach the method of claim 21 but fail to explicitly teach wherein an additional secondary camera is physically closer to the speaker as compared to the secondary camera. However, Peter, in an analogous art, teaches wherein an additional secondary camera is physically closer to the speaker as compared to the secondary camera (secondary cameras that are closer to the speaker [0074]). It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify the multi-camera videoconferencing of Feng in view of Seyfi and Chu to further add the multiple secondary cameras disclosed by Peter to obtain the view of the talking participant (Peter [0003]).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MAHMOUD KAMAL ABOUZAHRA, whose telephone number is (703) 756-1694. The examiner can normally be reached M-F, 7:00 AM to 5:00 PM.

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Jamie Atala, can be reached at (571) 272-7384. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (in USA or Canada) or 571-272-1000.

/MAHMOUD KAMAL ABOUZAHRA/
Examiner, Art Unit 2486

/JAMIE J ATALA/
Supervisory Patent Examiner, Art Unit 2486

Prosecution Timeline

Jun 14, 2022: Application Filed
Jan 23, 2024: Non-Final Rejection — §103
Mar 15, 2024: Response Filed
Jun 06, 2024: Final Rejection — §103
Jul 09, 2024: Interview Requested
Aug 01, 2024: Applicant Interview (Telephonic)
Aug 08, 2024: Examiner Interview Summary
Aug 16, 2024: Response after Non-Final Action
Oct 14, 2024: Notice of Allowance
Oct 14, 2024: Response after Non-Final Action
Nov 06, 2024: Response after Non-Final Action
Jan 16, 2025: Non-Final Rejection — §103
Apr 15, 2025: Response Filed
Jun 19, 2025: Final Rejection — §103
Aug 07, 2025: Response after Non-Final Action
Sep 16, 2025: Final Rejection — §103
Nov 24, 2025: Response after Non-Final Action
Jan 22, 2026: Request for Continued Examination
Jan 28, 2026: Response after Non-Final Action
Mar 21, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12558845: System and Method for a Three-Dimensional Optical Switch Display Device (granted Feb 24, 2026; 2y 5m to grant)
Patent 12464148: Computer-Implemented Multi-Scale Machine Learning Model for the Enhancement of Compressed Video (granted Nov 04, 2025; 2y 5m to grant)
Patent 12422691: Vehicular Camera Assembly with Lens Barrel Welded at Imager Housing (granted Sep 23, 2025; 2y 5m to grant)
Patent 12387309: Inspection Apparatus and Inspection Method (granted Aug 12, 2025; 2y 5m to grant)
Patent 12389089: Thermal Sensor, Thermal Sensor Array, Electronic Apparatus Including the Thermal Sensor, and Operating Method of the Thermal Sensor (granted Aug 12, 2025; 2y 5m to grant)

Based on this examiner's 5 most recent grants; study what changed to get each past the examiner.


Prosecution Projections

Expected OA Rounds: 6-7
Grant Probability: 57% (62% with interview, +4.4%)
Median Time to Grant: 2y 7m
PTA Risk: High

Based on 28 resolved cases by this examiner. Grant probability derived from career allow rate.
