Prosecution Insights
Last updated: April 19, 2026
Application No. 18/494,670

USING INCLUSION ZONES IN VIDEOCONFERENCING

Non-Final OA §103
Filed: Oct 25, 2023
Examiner: JONES, CARISSA ANNE
Art Unit: 2691
Tech Center: 2600 — Communications
Assignee: Hewlett-Packard Development Company, L.P.
OA Round: 3 (Non-Final)
Grant Probability: 83% (Favorable)
OA Rounds: 3-4
To Grant: 2y 10m
With Interview: 99%

Examiner Intelligence

Grants 83% — above average
Career Allow Rate: 83% (20 granted / 24 resolved; +21.3% vs TC avg)
Strong +25% interview lift
Interview Lift: +25.0% among resolved cases with interview
Typical timeline
Avg Prosecution: 2y 10m (30 currently pending)
Career history
Total Applications: 54 across all art units

Statute-Specific Performance

§101: 3.1% (-36.9% vs TC avg)
§103: 76.0% (+36.0% vs TC avg)
§102: 11.6% (-28.4% vs TC avg)
§112: 4.9% (-35.1% vs TC avg)
Tech Center average figures are estimates. Based on career data from 24 resolved cases.

Office Action

§103
DETAILED ACTION

This action is in response to the application filed 02/06/2026. Claims 1 – 20 are pending and have been examined.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 3, 5, 9 – 11, 13, and 16 – 20 are rejected under 35 U.S.C. 103 as being unpatentable over Dao et al. (U.S. Pub. No. 2024/0214520, hereinafter “Dao”) in view of Kvamstad et al. (EP Pub. No. 4407980, hereinafter “Kvamstad”).

Regarding Claim 1, Dao teaches A method of using an inclusion zone for a videoconference (see Dao Abstract, A computer-implemented method of operating a video conference endpoint. The video conference endpoint includes a video camera which captures images showing a field of view), the method comprising: capturing an image of a location (see Dao Figure 2, item 204, capture an image of the field of view, and Figure 3, a video conference suite in which a camera captures an image of the field of view); applying a subject detector model to the image to identify room location for each subject detected in the image (see Dao Paragraph [0051], an image is captured via the cameras of the field of view containing the spatial boundary. The processor then identifies all people within the field of view in step 206. This identification of people can be performed, for example, via a machine learning model trained to identify people within an image. In some examples, a trained convolutional neural network, such as a ‘you only look once’ (or YOLO) object detection algorithm, or a computer vision Haar Feature-based cascade classifier, or a histogram of orientated gradients, can be used to identify people within the image. The processor increments a counter j to indicate the number of people identified in the field of view of the camera. After that, the processor enters a loop defined by steps 208-216.
In step 208, the position of person i is estimated within the field of view of the camera and Paragraph [0052] The estimation of a person's position or location in the field of view is, in some examples, performed in four steps: (i) estimate a distance from the person's face to the camera; (ii) compute a direction to the face of the person relative to the camera horizontal; (iii) compute the camera's orientation via the use of one or more accelerometers in the endpoint; and (iv) compute the direction of the person's face relative to a plane of the floor of a room in the field of view); defining the inclusion zone for the location, the inclusion zone based on the top-down view of the location (see Dao Paragraph [0063] and Figure 3, A spatial boundary 108 (indicated by the dotted line) is defined as a maximum distance from the camera, as a top-down view of the conference suite); determining if the room location for each subject is within the inclusion zone (see Dao Paragraph [0060] and Figure 2, processor determines whether person i is within the spatial boundary previously defined (item 210)); filtering data associated with subjects that are determined to be not within the inclusion zone (see Dao Paragraph [0065], a graphical indication that a person is inside or outside of the spatial boundary is provided and associated with the respective person. In this example, a tick symbol is provided adjacent to a person who is within the spatial boundary whilst a cross symbol is provided adjacent to a person who is outside of the spatial boundary. Other graphical indications can be provided, for example a bounding box around only people who are found to be within the spatial boundary or a bounding box around all detected people but with different colors for people inside and outside the boundary. This can allow the user to tailor the data defining the spatial boundary to appropriately exclude or include those people to be framed); and processing data associated with subjects that are determined to be within the inclusion zone (see Dao Paragraph [0065], a graphical indication that a person is inside or outside of the spatial boundary is provided and associated with the respective person. In this example, a tick symbol is provided adjacent to a person who is within the spatial boundary whilst a cross symbol is provided adjacent to a person who is outside of the spatial boundary. Other graphical indications can be provided, for example a bounding box around only people who are found to be within the spatial boundary or a bounding box around all detected people but with different colors for people inside and outside the boundary. This can allow the user to tailor the data defining the spatial boundary to appropriately exclude or include those people to be framed). Dao does not expressively teach Using coordinates to identify a subject location in a video stream the room coordinates being based on a top-down view of the location; However, Kvamstad teaches Using coordinates to identify a subject location in a video stream (see Kvamstad Paragraph [0011], by estimating the location of people in an image or video output and/or by determining how far away each person is from the video conferencing camera, disclosed systems and methods may promote automatic framing of meeting participants and exclusion of non-participants in a video conference feed or output. 
A deep learning model can be trained by supervision using a dataset of images and corresponding labels, and the model may be used to describe the location of each person that is visible in a video output by providing location coordinates relative to the video conferencing camera. Furthermore, in some embodiments, the deep learning model may describe where each person is in a video output, how far each person is from the video conferencing camera, and/or whether the person is located within a meeting region, and Figure 2A and 2B showing the top-down view of the conference room and Figure 2C shows the calculation of coordinates of a person) the room coordinates being based on a top-down view of the location (see Kvamstad Paragraph [0011], by estimating the location of people in an image or video output and/or by determining how far away each person is from the video conferencing camera, disclosed systems and methods may promote automatic framing of meeting participants and exclusion of non-participants in a video conference feed or output. A deep learning model can be trained by supervision using a dataset of images and corresponding labels, and the model may be used to describe the location of each person that is visible in a video output by providing location coordinates relative to the video conferencing camera. Furthermore, in some embodiments, the deep learning model may describe where each person is in a video output, how far each person is from the video conferencing camera, and/or whether the person is located within a meeting region, Paragraph [0012], lateral and longitudinal coordinates relative to a top-down view of each person captured in a video output can be calculated using (i) a depth estimation, and (ii) an angle α between an optical axis of the camera and a vector originating from the camera and extending in the direction of the person, and Figure 2A and 2B showing the top-down view of the conference room and Figure 2C shows the calculation of coordinates of a person); It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of a method in which an inclusion zone is established in a video conference video stream, and using a model, the location of subjects in the video are detected and subsequently filtered from the video stream if they are not located in the inclusion zone and processed if they are located within the inclusion zone (as taught in Dao), with using coordinates to identify a subject location in a video stream in which the coordinates are based on a top-down view of the room (as taught in Kvamstad), the motivation being to provide a more precise form of location identification when locating a subject in a video stream, which can thus be applied to minimize distractions in a user’s video conference video stream (see Kvamstad Paragraph [0002] and [0027]). Regarding Claim 2, Dao in view of Kvamstad teaches The method of claim 1, wherein capturing images of the location includes capturing images of a portion of an enclosed room or a portion of an open concept workspace (see Dao Figure 2, item 204, capture an image of the field of view, and Figure 3, a video conference suite in which a camera captures an image of the field of view). 
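For orientation, the Kvamstad passages cited above describe deriving lateral and longitudinal top-down coordinates from a depth estimate and the angle α between the camera's optical axis and the direction toward the person, while Dao's spatial boundary is a maximum distance from the camera. The sketch below is a minimal illustration of that kind of projection and boundary test; it is not code from the application or from either cited reference, and all function names and values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TopDownPoint:
    """Room coordinates in a top-down view, in metres, with the camera at the origin."""
    x: float  # lateral offset (positive to the camera's right)
    z: float  # longitudinal distance along the optical axis


def to_top_down(depth_m: float, angle_rad: float) -> TopDownPoint:
    """Project a detected person onto the floor plane.

    depth_m   -- estimated distance from the camera to the person
    angle_rad -- angle between the camera's optical axis and the direction
                 toward the person (the alpha described for Kvamstad)
    """
    return TopDownPoint(x=depth_m * math.sin(angle_rad),
                        z=depth_m * math.cos(angle_rad))


def within_inclusion_zone(p: TopDownPoint, max_distance_m: float) -> bool:
    """Dao-style boundary: a maximum straight-line distance from the camera."""
    return math.hypot(p.x, p.z) <= max_distance_m


# Illustrative values only: a person ~2.5 m away, 20 degrees off-axis,
# tested against a 3 m inclusion boundary.
person = to_top_down(depth_m=2.5, angle_rad=math.radians(20.0))
print(person, within_inclusion_zone(person, max_distance_m=3.0))
```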
Regarding Claim 3, Dao in view of Kvamstad teaches The method of claim 1, wherein applying the subject detector model includes defining bounding boxes for each human head of each subject that is detected in the image (see Dao Paragraph [0051], This identification of people can be performed, for example, via a machine learning model trained to identify people within an image, Paragraph [0052], (c) using a trained machine learning algorithm on the image; (d) detecting faces within the image, and using a face bounding box size, Paragraph [0065], a bounding box around only people who are found to be within the spatial boundary or a bounding box around all detected people but with different colors for people inside and outside the boundary). Regarding Claim 5, Dao in view of Kvamstad teaches The method of claim 1, wherein defining the inclusion zone further includes recording world coordinates of a subject during a calibration phase to create boundary lines of the inclusion zone (see Dao Paragraph [0023], the method may include a user input step, in which a user provides the data defining the spatial boundary. The user may provide the data via a user interface, for example by defining distances to the side or forward from the camera via a user interface. The user may provide the data by entering the video conference endpoint into a data entry mode, in which the video conference endpoint tracks the location of the user, and the user prompts the video conference endpoint to use one or more locations of the user to define the spatial boundary). Regarding Claim 9, Dao in view of Kvamstad teaches The method of claim 1, wherein processing the data includes transmitting the data to a far end of the videoconference (see Dao Paragraph [0061] and Figure 2, Once all people have had their positions estimated and the determination made as to whether to frame them or not, ‘Y’, the method moves to step 218 and the or each cropped region is extracted containing one or more of the people in the framing list. These cropped regions are then used to generate one or more single video streams, each video stream containing a respective crop region, or a composite video stream which contains a plurality of crop regions. These are transmitted in step 220 (transmit the video stream), and Paragraph [0014], the method may further comprise transmitting the or each video signal to a receiver. The receiver may be a second video conference endpoint connected to the first video conference endpoint via a computer network). Regarding Claims 10, 11, 13, and 16, they are rejected similarly as Claims 1, 3, 5, and 9, respectively. The system can be found in Dao (Paragraph [0003], system). Regarding Claim 17, Dao in view of Kvamstad teaches The system of claim 10, wherein the processor is further caused to define a virtual boundary line separates the inclusion zone from an exclusion zone (see Dao Paragraph [0063], FIG. 3 shows a video conference suite including the video conference endpoint 100 of FIG. 1. The camera 102 captures a field of view 106 (indicated by the dashed line) which includes a first room 104 and a second room 110. The first and second rooms are separated by a glass wall 112, and in this example room 104 is a video conference suite and room 110 is an office. A spatial boundary 108 (indicated by the dotted line) is defined as a maximum distance from the camera. 
In this example it means that people 114a-114d are within the spatial boundary, whilst person 116 (who is within the field of view 106 of the camera 102 but not within the first room 104) is not within the spatial boundary. People 114a-114d can therefore be framed by the video conference endpoint 100, and person 116 can be excluded). Regarding Claim 18, Dao in view of Kvamstad teaches The system of claim 17, wherein data in the inclusion zone is processed differently from data in the exclusion zone (see Dao Paragraph [0063], FIG. 3 shows a video conference suite including the video conference endpoint 100 of FIG. 1. The camera 102 captures a field of view 106 (indicated by the dashed line) which includes a first room 104 and a second room 110. The first and second rooms are separated by a glass wall 112, and in this example room 104 is a video conference suite and room 110 is an office. A spatial boundary 108 (indicated by the dotted line) is defined as a maximum distance from the camera. In this example it means that people 114a-114d are within the spatial boundary, whilst person 116 (who is within the field of view 106 of the camera 102 but not within the first room 104) is not within the spatial boundary. People 114a-114d can therefore be framed by the video conference endpoint 100, and person 116 can be excluded). Regarding Claim 19, it is rejected as a combination of Claims 1 and 3. The non-transitory computer-readable medium can be found in Dao (Paragraph [0041], computer readable medium). Regarding Claim 20, it is rejected similarly as Claim 1. The non-transitory computer-readable medium can be found in Dao (Paragraph [0041], computer readable medium). Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Dao et al. (U.S. Pub. No. 2024/0214520, hereinafter “Dao”) in view of Kvamstad et al. (EP Pub. No. 4407980, hereinafter “Kvamstad”) and Kim (W.O. Pub. No. 2024/101472). Regarding Claim 4, Dao in view of Kvamstad teach all the limitations of Claim 1, but do not expressively teach The method of claim 1, wherein defining the inclusion zone further includes manually inputting room coordinates of the inclusion zone during a manual calibration phase using a graphical user interface. However, Kim teaches The method of claim 1, wherein defining the inclusion zone further includes manually inputting room coordinates of the inclusion zone during a manual calibration phase using a graphical user interface (see Kim Page 2, The region of interest setting unit 12 can set a region of interest for an image input through the input unit 11. Additionally, the area of interest setting unit 12 may set at least one entry/exit area connected to the area of interest for the image input through the input unit 11. In one embodiment, the area of interest setting unit 12 may receive coordinate values of the area of interest or the entry/exit area, and may display the area of interest or the entry/exit area on the image based on the input coordinates). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of a method in which an inclusion zone is established in a video conference video stream, and using a model, the coordinates of subjects in the video are detected and subjects are subsequently filtered from the video stream if they are not located in the inclusion zone and processed if they are located within the inclusion zone (as taught in Dao in view of Kvamstad), with defining the inclusion zone by manually inputting room coordinates of the inclusion zone (as taught in Kim), the motivation being to provide a more precise form of location identification using coordinates, to thus easily and precisely track objects (such as a subject’s location) (see Kim Page 2). Regarding Claim 12, it is rejected similarly as Claim 4. The system can be found in Dao (Paragraph [0003], system). Claims 6, 7, and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Dao et al. (U.S. Pub. No. 2024/0214520, hereinafter “Dao”) in view of Kvamstad et al. (EP Pub. No. 4407980, hereinafter “Kvamstad”) and Xu et al. (U.S. Pub. No. 2022/0335727, hereinafter “Xu”). Regarding Claim 6, Dao in view of Kvamstad teaches The method of claim 5, wherein defining the inclusion zone further includes determining, in an automatic calibration phase, maximum parameters of the location (see Dao Paragraph [0050], In a first step 202, the processor receives data defining a spatial boundary within a field of view of the camera or cameras 102. This data can be received, for example, via the human machine interface 14, or via the network interface 8. This data can, for example, identify a maximum distance from the camera (e.g. in metres) that the spatial boundary bounds. The data could also, for example, identify a maximum angle from the camera that the spatial boundary extends and Paragraph [0063] spatial boundary 108 is defined as maximum distance from the camera (depth)). Dao in view of Kvamstad does not expressively teach Determining minimum parameters of the location However, Xu teaches Determining minimum parameters of the location (see Xu Paragraph [0098], determines image coordinate system, including maximum and minimum width, of a target area within a location) It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of a method in which an inclusion zone is established in a video conference video stream, and using a model, the coordinates of subjects in the video are detected and subjects are subsequently filtered from the video stream if they are not located in the inclusion zone and processed if they are located within the inclusion zone, and additionally defining the inclusion zone by determining maximum parameters of the location (as taught in Dao in view of Kvamstad), with determining minimum parameters of the location (as taught in Xu), the motivation being to obtain a range of parameters in order to develop a target area or “inclusion zone” that can be defined as large or as small as the area available (see Xu Paragraph [0098] and Figure 6). 
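Claims 5–7 as discussed above involve building the inclusion zone from calibration data: recording a subject's world coordinates during a calibration phase and deriving maximum (per Dao) and minimum (per Xu) room parameters. The following is a minimal sketch of one way such a calibration could yield boundary lines and a membership test; it is illustrative only, with hypothetical function and field names not drawn from the application or the cited art.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class ZoneBounds:
    """Rectangular inclusion zone in top-down room coordinates (metres)."""
    min_width: float
    max_width: float
    max_depth: float


def calibrate_zone(samples: Iterable[Tuple[float, float]]) -> ZoneBounds:
    """Derive boundary lines from (x, z) positions recorded while a subject
    walks the area to be included (the calibration phase)."""
    xs, zs = zip(*samples)
    return ZoneBounds(min_width=min(xs), max_width=max(xs), max_depth=max(zs))


def in_zone(x: float, z: float, zone: ZoneBounds) -> bool:
    """Membership test against the calibrated boundary lines; depth is
    measured outward from the camera, so z >= 0 is assumed."""
    return zone.min_width <= x <= zone.max_width and 0.0 <= z <= zone.max_depth


# Illustrative calibration walk around a roughly 4 m x 3 m conference area.
zone = calibrate_zone([(-2.0, 0.5), (2.0, 0.5), (2.0, 3.0), (-2.0, 3.0)])
print(in_zone(0.3, 1.8, zone), in_zone(0.3, 4.5, zone))  # True, False
```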
Regarding Claim 7, Dao in view of Kvamstad and Xu teaches The method of claim 6, wherein that maximum and minimum room parameters include a maximum room width parameter, a minimum room width parameter (see Xu Paragraph [0098], determines image coordinate system, including maximum and minimum width, of a target area within a location), and a maximum room depth parameter (see Dao Paragraph [0050], In a first step 202, the processor receives data defining a spatial boundary within a field of view of the camera or cameras 102. This data can be received, for example, via the human machine interface 14, or via the network interface 8. This data can, for example, identify a maximum distance from the camera (e.g. in metres) that the spatial boundary bounds. The data could also, for example, identify a maximum angle from the camera that the spatial boundary extends and Paragraph [0063] spatial boundary 108 is defined as maximum distance from the camera (depth)). Regarding Claim 14, it is rejected similarly as Claim 6. The system can be found in Dao (Paragraph [0003], system). Claims 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Dao et al. (U.S. Pub. No. 2024/0214520, hereinafter “Dao”) in view of Kvamstad et al. (EP Pub. No. 4407980, hereinafter “Kvamstad”) and Igo et al. (U.S. Pub. No. 2025/0054112, hereinafter “Igo”). Regarding Claim 8, Dao in view of Kvamstad teach all the limitations of Claim 1, but do not expressively teach The method of claim 1, wherein filtering the data associated with subjects that are determined to be not within the inclusion zone includes at least one of: muting audio included in the data; and blurring video included in the data. However, Igo teaches The method of claim 1, wherein filtering the data associated with subjects that are determined to be not within the inclusion zone includes at least one of: muting audio included in the data; and blurring video included in the data (see Igo Paragraph [0005], the operations also include determining a zone of inclusion for the object based on the video data and the location data and applying continuous and undisrupted background blur to pixels of the video data located outside the zone of inclusion, thus blurring subjects and items that are outside of the zone of inclusion). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of a method in which an inclusion zone is established in a video conference video stream, and using a model, the coordinates of subjects in the video are detected and subjects are subsequently filtered from the video stream if they are not located in the inclusion zone and processed if they are located within the inclusion zone (as taught in Dao in view of Kvamstad), with filtering the data associated with subjects that are determined to be not within the inclusion zone by blurring video included in the data (as taught in Igo), the motivation being to minimize distractions in a user’s video conference video stream and enhance privacy by blurring areas and subjects that are not within an inclusion area (see Igo Abstract and Paragraph [0002]). Regarding Claim 15, it is rejected similarly as Claim 8. The system can be found in Dao (Paragraph [0003], system). Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of analogous art. 
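Claim 8 and the Igo citation above concern filtering subjects outside the inclusion zone, for example by blurring the video (or muting audio). The sketch below illustrates the video-blur case only, assuming OpenCV and bounding boxes for subjects already determined to be inside the zone; it is an illustration under those assumptions, not an implementation from any cited reference.

```python
import cv2
import numpy as np


def blur_outside_zone(frame: np.ndarray,
                      in_zone_boxes: list[tuple[int, int, int, int]],
                      ksize: int = 51) -> np.ndarray:
    """Blur every pixel that is not inside a bounding box of an in-zone subject.

    frame         -- BGR image as an H x W x 3 uint8 array
    in_zone_boxes -- (x, y, w, h) boxes for subjects already determined to be
                     within the inclusion zone
    ksize         -- odd Gaussian kernel size controlling blur strength
    """
    blurred = cv2.GaussianBlur(frame, (ksize, ksize), 0)
    mask = np.zeros(frame.shape[:2], dtype=bool)
    for x, y, w, h in in_zone_boxes:
        mask[y:y + h, x:x + w] = True
    out = blurred.copy()
    out[mask] = frame[mask]  # keep in-zone regions sharp, blur everything else
    return out


# Illustrative use with a synthetic frame and one in-zone bounding box.
frame = np.full((720, 1280, 3), 128, dtype=np.uint8)
result = blur_outside_zone(frame, [(500, 200, 200, 300)])
```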
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CARISSA A JONES whose telephone number is (703)756-1677. The examiner can normally be reached Telework M-F 6:30 AM - 4:00 PM CT.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CARISSA A JONES/
Examiner, Art Unit 2691

/DUC NGUYEN/
Supervisory Patent Examiner, Art Unit 2691

Prosecution Timeline

Oct 25, 2023: Application Filed
Jun 26, 2025: Non-Final Rejection — §103
Sep 29, 2025: Response Filed
Dec 03, 2025: Final Rejection — §103
Jan 12, 2026: Interview Requested
Jan 20, 2026: Applicant Interview (Telephonic)
Jan 20, 2026: Examiner Interview Summary
Feb 06, 2026: Request for Continued Examination
Feb 18, 2026: Response after Non-Final Action
Feb 25, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598267
IMAGE CAPTURE APPARATUS AND CONTROL METHOD
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12598354
INFORMATION PROCESSING SERVER, RECORD CREATION SYSTEM, DISPLAY CONTROL METHOD, AND NON-TRANSITORY RECORDING MEDIUM
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12593004
DISPLAY METHOD, DISPLAY SYSTEM, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM STORING PROGRAM
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12556468
QUALITY TESTING OF COMMUNICATIONS FOR CONFERENCE CALL ENDPOINTS
Granted Feb 17, 2026 (2y 5m to grant)
Patent 12556655
Efficient Detection of Co-Located Participant Devices in Teleconferencing Sessions
Granted Feb 17, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 83%
With Interview (+25.0%): 99%
Median Time to Grant: 2y 10m
PTA Risk: High

Based on 24 resolved cases by this examiner. Grant probability derived from career allow rate.
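For reference, the headline grant probability can be reproduced directly from the resolved-case counts shown in the examiner data above; the quick check below does only that arithmetic, and the interview-adjusted 99% is left as reported because the interviewed-case breakdown is not itemized in this report.

```python
# Reproduce the headline grant probability from the examiner's career data
# shown above (20 granted out of 24 resolved cases).
granted, resolved = 20, 24
allow_rate = granted / resolved
print(f"career allow rate: {allow_rate:.1%}")  # ~83.3%, i.e. the reported 83%

# The "+25.0% interview lift" and "99% with interview" figures come from
# interviewed vs. non-interviewed outcomes that are not broken out here,
# so this sketch does not recompute them.
```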
