Prosecution Insights
Last updated: April 19, 2026
Application No. 18/755,269

MACHINE LEARNING MODELS FOR ADAPTIVE POST-PROCESSING USING RESULTS OF SEGMENTATION IN CONFERENCING TOOLS

Non-Final OA: §101, §103
Filed: Jun 26, 2024
Examiner: TIEU, BINH KIEN
Art Unit: 2694
Tech Center: 2600 — Communications
Assignee: Microsoft Technology Licensing, LLC
OA Round: 1 (Non-Final)
Grant Probability: 87% (Favorable)
OA Rounds: 1-2
To Grant: 2y 5m
With Interview: 97%

Examiner Intelligence

Grants 87% — above average

Career Allow Rate: 87% (809 granted / 931 resolved; +24.9% vs TC avg)
Interview Lift: +9.8% (moderate, roughly +10%; measured across resolved cases with an interview)
Avg Prosecution: 2y 5m (typical timeline; 25 applications currently pending)
Total Applications: 956 (career history, across all art units)
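The headline figures above are plain ratios over the examiner's resolved cases. A minimal sketch of the arithmetic, assuming hypothetical per-case records with `granted` and `had_interview` flags (the field names are illustrative, not from any real data feed):

```python
from dataclasses import dataclass

@dataclass
class ResolvedCase:
    granted: bool        # application allowed (True) or abandoned/finally rejected (False)
    had_interview: bool  # at least one examiner interview on the record

def allow_rate(cases: list[ResolvedCase]) -> float:
    """Share of resolved cases that were granted (here, 809 / 931 = 0.869, shown as 87%)."""
    return sum(c.granted for c in cases) / len(cases)

def interview_lift(cases: list[ResolvedCase]) -> float:
    """Allow-rate gap between cases with and without an interview (+9.8 points here)."""
    with_iv = [c for c in cases if c.had_interview]
    without = [c for c in cases if not c.had_interview]
    return allow_rate(with_iv) - allow_rate(without)

# Sanity check against the card: 809 / 931 = 0.869, displayed as 87%; adding the
# +9.8-point interview lift gives 96.8%, displayed as the 97% "with interview" figure.
```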

Statute-Specific Performance

§101: 6.1% (-33.9% vs TC avg)
§103: 43.9% (+3.9% vs TC avg)
§102: 26.5% (-13.5% vs TC avg)
§112: 4.1% (-35.9% vs TC avg)

Tech Center average is an estimate. Based on career data from 931 resolved cases.
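Each figure reduces to a per-statute rejection frequency over the same 931 resolved cases, minus a Tech Center baseline. A minimal sketch of how the displayed numbers reconcile, using only the figures shown above (the flat 40.0% baseline is what every displayed delta implies, not a published USPTO figure):

```python
# Rejection rate per statute for this examiner, from the card above (percent).
examiner_rate = {"§101": 6.1, "§103": 43.9, "§102": 26.5, "§112": 4.1}

# Every displayed delta (rate minus delta) works out to the same 40.0%,
# so the tool appears to apply one flat TC 2600 baseline estimate.
TC_BASELINE = 40.0

for statute, rate in examiner_rate.items():
    delta = rate - TC_BASELINE
    print(f"{statute}: {rate:.1f}% ({delta:+.1f}% vs TC avg)")
# Output matches the card, e.g. "§101: 6.1% (-33.9% vs TC avg)"
```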

Office Action

Rejections under §101 and §103
DETAILED ACTION

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 20 is rejected under 35 U.S.C. 101 as not falling within one of the four statutory categories of invention. Regarding claim 20, the claimed invention is directed to non-statutory subject matter. Claim 20 does not fall within at least one of the four categories of patent-eligible subject matter because independent claim 19 recites in its preamble "One or more computer-readable medium having stored therein computer-executable instructions…" According to the specification, paragraph [0186], "…The term "non-transitory computer-readable media" specifically excludes transitory propagating signals, carrier waves, and wave forms or other intangible or transitory media that may nevertheless be readable by a computer…" Thus, the term "one or more computer-readable medium" can be interpreted as either "non-transitory computer-readable media" or "transitory computer-readable media". Claim 20 therefore reads on a transitory computer-readable medium (i.e., communication media) storing computer-readable instructions or the like; a transitory computer-readable medium is computer software rather than a physical article or object, and as such is not a machine or manufacture within the four statutory categories. See Diamond v. Diehr, 450 U.S. 175, 184 (1981); Parker v. Flook, 437 U.S. 584, 588 n.9 (1978); Gottschalk v. Benson, 409 U.S. 63, 70 (1972); Cochrane v. Deener, 94 U.S. 780, 787-88 (1876).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-2, 4-5, 8-10, 13, 16 and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi et al. (US 2015/0172726) in view of Yang et al. (US 2023/0044969).
Regarding claim 1, Faramarzi et al. (hereinafter "Faramarzi") teaches a client computing device (i.e., UE 116, as shown in figure 3) comprising a processor system (i.e., main processor 340) and memory (i.e., memory 360; para. [0034]-[0035]), wherein the client computing device implements a conferencing tool (i.e., system 600a or 600b, as shown in figures 6A and 6B) configured to perform operations. Faramarzi further teaches the system of figure 6A comprising a video-encoding system 605a with an encoder 625a and a video-decoding system 650a with a decoder 655a. The encoder 625a encodes video to generate an output (encoded data) comprising a bitstream that includes the encoded video along with metadata (para. [0057]). The decoder 655a receives the output from the encoder 625a of the video-encoding system 605a, decodes it to generate a decoded video, and a super-resolved video is then generated by the super-resolution processor 660a. The post-processing block 665a then performs post-processing on the super-resolved video to generate a further enhanced video as the output of the video-decoding system 650a (para. [0058]).

It should be noted that Faramarzi does not clearly teach obtaining segmentation information for the decoded video for the current unit of the video sequence, the segmentation information indicating one or more foreground segments of the decoded video for the current unit of the video sequence; and applying a trained post-processing model to the one or more foreground segments of the decoded video for the current unit of the video sequence but not to one or more other segments of the decoded video for the current unit of the video sequence.

However, Yang et al. (hereinafter "Yang") teaches a system 100, as shown in figure 1, comprising client devices 104 and a cloud network 102 that comprises a plurality of computing nodes 118 hosting a video service 112 (para. [0035]). Yang further teaches the video service 112 comprising a video matting model (e.g., the Improved Video Matting (IVM) model; para. [0043]). The IVM model 110 utilizes a framework, such as the framework shown in figure 2, comprising an encoder 204 and a decoder 206 (para. [0044]-[0045]). The encoder 204 extracts a feature representation of each input frame (reading on encoded data) and provides the extracted feature representations to the decoder 206 (para. [0048]). Yang further teaches obtaining segmentation information for the decoded video for the current unit of the video sequence, the segmentation information indicating one or more foreground segments (i.e., a representation of a foreground object is generated and obtained; para. [0069] and [0075]); and applying a trained post-processing model (the IVM model 110) to the one or more foreground segments but not to one or more other segments (i.e., the IVM model performs a process to extract a foreground object, such as a human being, from any video frame; para. [0045], [0051], [0070]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the features of obtaining segmentation information indicating one or more foreground segments of the decoded video for the current unit of the video sequence, and applying a trained post-processing model to the one or more foreground segments but not to one or more other segments, as taught by Yang, into the system of Faramarzi in order to provide and display the enhanced video.

Regarding claim 2, Yang further teaches the limitations of the claim, such as generating the representation of a foreground object (as the segmentation information) included in the particular frame, and determining and extracting a foreground object based on that generated representation (para. [0069]-[0070]). Regarding claim 4, Yang further teaches the limitations of the claim, such as the extracted feature representation (as segmentation information) of each input frame (para. [0048]); temporal information (as metadata) in each frame or bitstream is used to determine the extracted feature representation (para. [0049]). Regarding claim 5, Yang further teaches the limitations of the claim, such as the IVM model 110 being utilized and/or performed at the video service 112 in a cloud network 102, as shown in figure 1 (para. [0043]). Yang further teaches the framework 200 of figure 2, which may be utilized by a video matting model such as the IVM model 110; thus, the segmentation information is obtained from a server computing device, such as one implementing the framework 200 (para. [0046] and [0049]). Regarding claim 8, Yang further teaches the limitations of the claim, such as upsampler 306, as shown in figure 6, for upsampling input frames (para. [0062]). Regarding claim 9, Yang further teaches the limitations of the claim in paragraphs [0030], [0053] and [0056]-[0058]. Regarding claim 10, Yang further teaches the limitations of the claim in paragraph [0047]. Regarding claim 13, Yang further teaches the limitations of the claim, such as the IVM model performing or repeating the process to extract a foreground object from any video frame (para. [0070]). Regarding claim 16, Yang further teaches the limitations of the claim in paragraph [0036].

Regarding claim 19, Faramarzi teaches a client computing device (i.e., UE 116, as shown in figure 3) that implements a conferencing tool (i.e., system 600a or 600b, as shown in figures 6A and 6B).
Claim 19 otherwise recites substantially the same encoding, decoding, segmentation, and selective post-processing operations as claim 1 and is rejected over Faramarzi in view of Yang for the reasons, and with the rationale for combining, set forth above for claim 1.

Regarding claim 20, Faramarzi teaches one or more computer-readable media (i.e., memory 360; para. [0035]) having stored therein computer-executable instructions for causing a processor system (i.e., main processor 340), when programmed thereby, to perform operations. Claim 20 otherwise recites substantially the same operations as claim 1 and is likewise rejected over Faramarzi in view of Yang for the reasons, and with the rationale for combining, set forth above for claim 1.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi et al. (US 2015/0172726) in view of Yang et al. (US 2023/0044969) as applied to claims 1 and 2 above, and further in view of Wang et al. (US 2025/0378619). Regarding claim 3, Faramarzi and Yang, in combination, teach all subject matter as claimed above, except for the feature of determining the segmentation information using a machine learning model having a convolutional U-net architecture. However, Wang et al. (hereinafter "Wang") teaches a DM 208, as shown in figure 1, representing a U-net machine learning model used for image segmentation tasks (para. [0076]). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the feature of determining the segmentation information using a machine learning model having a convolutional U-net architecture, as taught by Wang, into the combination of Faramarzi and Yang in order to use the model for video segmentation tasks.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi in view of Yang as applied to claim 1 above, and further in view of Li et al. (US 2025/0336039). Regarding claim 11, Faramarzi and Yang, in combination, teach all subject matter as claimed above, except for the feature of the trained post-processing model being a super-resolution/video restoration model configured to increase spatial resolution, mitigate compression artifacts, and mitigate upsampling artifacts. However, Li et al. (hereinafter "Li") teaches a super-resolution/video restoration model in paragraphs [0014] and [0024]. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate that feature, as taught by Li, into the combination of Faramarzi and Yang in order to perform super-resolution upscaling of an input video.

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Faramarzi in view of Yang as applied to claim 1 above, and further in view of Lee (US 2012/0169828). Regarding claim 17, Faramarzi and Yang, in combination, teach all subject matter as claimed above, except for the features of performing video quality analysis and, based at least in part on results of the video quality analysis, adjusting one or more target characteristics of video. However, Lee teaches these features in paragraph [0079] for the purpose of video quality enhancement. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate these features, as taught by Lee, into the combination of Faramarzi and Yang in order to improve video quality.

Allowable Subject Matter

Claims 6-7, 12, 14-15 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BINH TIEU, whose telephone number is (571) 272-7510. The examiner can normally be reached 9-5. The examiner's fax number is (571) 273-7510 and e-mail address is BINH.TIEU@USPTO.GOV.

Examiner interviews are available via telephone or video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, FAN S. TSANG, can be reached at (571) 272-7547.

Any response to this action should be mailed or hand-delivered to: Commissioner of Patents and Trademarks, 401 Dulany Street, Alexandria, VA 22314, or faxed to (571) 273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR; status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. If you have any questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Binh Kien Tieu/
Primary Examiner, Art Unit 2694
Date: February 2026
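The limitation driving the §103 combination is applying a trained post-processing model to the foreground segments of a decoded frame while leaving the other segments untouched. A minimal sketch of that behavior, where `selective_post_process`, `enhance`, and the toy mask are hypothetical stand-ins rather than anything from Faramarzi, Yang, or the application itself:

```python
import numpy as np

def selective_post_process(decoded_frame: np.ndarray,
                           foreground_mask: np.ndarray,
                           enhance) -> np.ndarray:
    """Apply a trained post-processing model only where the segmentation
    mask marks foreground; all other segments pass through unchanged.

    decoded_frame:   H x W x 3 decoded frame for the current unit of the sequence
    foreground_mask: H x W boolean mask (the "segmentation information")
    enhance:         trained model mapping a frame to an enhanced frame
    """
    enhanced = enhance(decoded_frame)      # e.g., super-resolution / restoration
    mask3 = foreground_mask[..., None]     # broadcast the mask over color channels
    return np.where(mask3, enhanced, decoded_frame)

# Usage sketch. The mask could come from a matting model like Yang's IVM or from a
# convolutional U-net (cf. the claim 3 discussion); a toy rectangle stands in here.
frame = np.random.rand(720, 1280, 3).astype(np.float32)
mask = np.zeros((720, 1280), dtype=bool)
mask[200:520, 400:880] = True              # pretend this region is the speaker
out = selective_post_process(frame, mask, enhance=lambda f: np.clip(f * 1.1, 0.0, 1.0))

# A real conferencing decoder would more likely run the model on foreground crops
# only, to save compute; the full-frame-then-mask blend above just makes the claimed
# input/output behavior concrete.
```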

Prosecution Timeline

Jun 26, 2024
Application Filed
Mar 13, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603111
AUDIO GUESTBOOK SYSTEMS AND METHODS
2y 5m to grant • Granted Apr 14, 2026
Patent 12598223
Dynamic Teleconference Content Item Distribution to Multiple Devices Associated with a User
2y 5m to grant • Granted Apr 07, 2026
Patent 12592994
REAL-TIME USER SCREENING OF MESSAGES WITHIN A COMMUNICATION PLATFORM
2y 5m to grant • Granted Mar 31, 2026
Patent 12592740
WIRELESS COMMUNICATION DEVICE AND WIRELESS COMMUNICATION METHOD
2y 5m to grant • Granted Mar 31, 2026
Patent 12573198
COMMUNICATION SYSTEM, OUTPUT DEVICE, COMMUNICATION METHOD, OUTPUT METHOD, AND OUTPUT PROGRAM
2y 5m to grant • Granted Mar 10, 2026
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 87%
With Interview: 97% (+9.8%)
Median Time to Grant: 2y 5m
PTA Risk: Low

Based on 931 resolved cases by this examiner. Grant probability derived from career allow rate.
