Last updated: May 29, 2026
Application No. 17/904,692
METHOD AND DEVICE FOR GENERATING PHOTOPLETHYSMOGRAPHY SIGNALS

Non-Final OA §103
Filed: Aug 19, 2022
Priority: Feb 21, 2020, CN 202010110278.X (+1 more)
Examiner: MCCORMACK, ERIN KATHLEEN
Art Unit: 3791
Tech Center: 3700 (Mechanical Engineering & Manufacturing)
Assignee: LEPU MEDICAL TECHNOLOGY (BEIJING) CO., LTD.
OA Round: 2 (Non-Final)
This examiner grants 12% of cases after interview (+60.0% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 26 resolved cases, 2023–2026
Examiner Intelligence

MCCORMACK, ERIN KATHLEEN
Career Allowance Rate: 12% (3 granted / 26 resolved), -58.5% vs TC avg
Interview Lift: +60.0% (resolved cases with interview)
Avg Prosecution (typical timeline): 3y 4m
Currently pending: 57
Total Applications (career history, across all art units): 125
Statute-Specific Performance

§101: 1.7% (-38.3% vs TC avg)
§103: 95.8% (+55.8% vs TC avg)
§102: 2.1% (-37.9% vs TC avg)
§112: 0.4% (-39.6% vs TC avg)
Based on career data from 26 resolved cases
Office Action

§103
DETAILED ACTION
Applicant’s arguments, filed on 07/21/2025, have been fully considered. The following rejections and/or objections are either reiterated or newly applied. They constitute the complete set presently being applied to the instant application. 
Applicants have amended their claims, filed on 07/21/2025, and therefore rejections newly made in the instant office action have been necessitated by amendment. 
Claims 1-4 and 6-11 are the current claims hereby under examination.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections
Claims 1, 4, and 7 are objected to because of the following informalities:
In claim 1, “PPG” should be spelled out the first time it is introduced in the claims.
In claim 4, “a invalidity condition” should read “an invalidity condition”
In claim 7, “a invalidity condition” should read “an invalidity condition”
Appropriate correction is required.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-4 and 6-11 are rejected under 35 U.S.C. 103 as being unpatentable over Addison (EP 3207862) in further view of Puig (US PG Pub 20160249820).

Regarding independent claim 1, Addison teaches a method for generating PPG signals ([0042]: “The sensor is placed in contact with the patient, such as by clipping or adhering the sensor around a finger, toe, or ear of a patient. The sensor's emitters emit light of two particular wavelengths into the patient's tissue, and the photodetector detects the light after it is reflected or transmitted through the tissue. The detected light signal, called a photoplethysmogram (PPG), modulates with the patient's heartbeat, as each arterial pulse passes through the monitored tissue and affects the amount of light absorbed or scattered.”; [0021]: “the method also includes identifying, in the component signal, individual pulses representative of individual heart beats; for each identified pulse, locating a corresponding portion of two of the red, green, and blue signals; and measuring blood oxygen saturation of the patient from the located corresponding portions of the two signals”), comprising: 
performing a continuous acquisition and photographing operation on a local skin surface of a living body by a mobile terminal ([0017]: “a method for measuring blood oxygen saturation of a patient includes receiving, from a video camera, a video signal encompassing exposed skin of a patient”; [0046]: “The camera operates at a frame rate, which is the number of image frames taken per second (or other time period). Example frame rates include 20, 30, 40, 50, or 60 frames per second, greater than 60 frames per second, or other values between those”), and during the continuous acquisition and photographing process, performing continuous video fragment caching on video data obtained by photographing with a preset cached fragment time threshold as a fragment length to generate multiple cached fragment video data ([0019]: “extracting the red, green, and blue signals comprises selecting pixels within the image frame exhibiting a modulation that is at the primary frequency and that has an amplitude above a threshold”); 
performing frame image extraction on the cached fragment video data to generate an image sequence of cached fragment frame, and performing image quality detection on all cached fragment frame images in the image sequence of cached fragment frame according to a preset red light pixel threshold range to generate an image quality detection result, wherein the image sequence of cached fragment frame comprises multiple of the cached fragment frame images ([0082]: “Figure 10B also includes a cross-correlation process that cross-correlates the frequency spectrums of the three color signals to amplify the results. All four resulting spectrums are analyzed to select and accumulate peaks. A cross correlated spectrum can be calculated by multiplying or summing existing spectrum together. An individual spectrum can be scaled before being combined based on signal quality. For example, because most RGB cameras have twice the number of green pixels compare to red and blue ones, the Green signal is usually better and can be weighted above Red and Blue. This method can follow the strongest peaks around the spectrum over time, as the patient's physiology (such as respiration rate and heart rate) changes.”); 
when the image quality detection result satisfies a validity condition, performing one-dimensional red light signal extraction on all the cached fragment frame images in the image sequence of cached fragment frame according to the red light pixel threshold range to generate a first red light digital signal, and performing one-dimensional green light signal extraction on all the cached fragment frame images in the image sequence of cached fragment frame according to a preset green light pixel threshold range to generate a first green light digital signal ([0017]: “extracting from the video signal time-varying red, green, and blue signals”; [0019]: “extracting the red, green, and blue signals comprises selecting pixels within the image frame exhibiting a modulation that is at the primary frequency and that has an amplitude above a threshold”; [0050]: “three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal”); and 
performing the signal band-pass filtering preprocessing on the first red light digital signal according to a preset band-pass filtering frequency threshold range to generate a second red light digital signal, and performing signal band-pass filtering preprocessing on the first green light digital signal according to the preset band-pass filtering frequency threshold range to generate a second green light digital signal ([0080]: “De-noising includes filtering the signal to remove noise sources and frequencies outside of a known physiologic range”. Denoising filtering can include signal band-pass filtering.).
However, Addison does not teach performing maximum frequency difference determination on the second red light digital signal and the second green light digital signal to generate a first determination result.
Puig discloses methods and systems for estimating heart rate by tracking optical signal frequency components. Specifically, Puig teaches performing maximum frequency difference determination on the second red light digital signal and the second green light digital signal to generate a first determination result ([0054]: “to develop trace plot 450 from plot 400, qualified peaks are processed in order of descending Relative FFT value. A position corresponding to a qualified peak is added to any trace whose most recent frequency differs from the qualified peak's frequency by less than a maximum frequency difference D.sub.MAX. If a peak is not closer to any active trace than D.sub.MAX, the qualified peak can become a position in a new active trace”). Addison and Puig are analogous arts as they are both related to using optical signals to determine health parameters of a user. 
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the invention to include the maximum frequency difference determination from Puig in the method from Addison as it provides a more detailed, in depth analysis of the received signals, which can create a more accurate result and generation of the PPG signals.  
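The trace-assignment rule quoted from Puig [0054] can be illustrated with a short Python sketch. It is only an illustration of the quoted passage, not Puig's implementation and not the claimed determination step. The `Trace` class, the `assign_peaks` function, and the `(frequency, relative FFT value)` tuple layout are hypothetical names introduced here. The sketch also assumes that each peak is compared against the trace frequencies as they stood before the current time step, which the quoted passage does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """One active trace: a list of (time_index, frequency_hz) positions."""
    positions: list = field(default_factory=list)

    @property
    def last_frequency(self):
        return self.positions[-1][1]


def assign_peaks(traces, peaks, time_index, d_max):
    """Assign one time step's qualified peaks to traces (hypothetical helper).

    peaks is a list of (frequency_hz, relative_fft_value) tuples.
    """
    # Assumption: compare against each trace's frequency from before this step.
    previous = [(trace, trace.last_frequency) for trace in traces]
    # Process qualified peaks in order of descending relative FFT value.
    for freq, _value in sorted(peaks, key=lambda p: p[1], reverse=True):
        matched = False
        for trace, last_freq in previous:
            # Add the peak to any trace closer than the maximum difference.
            if abs(freq - last_freq) < d_max:
                trace.positions.append((time_index, freq))
                matched = True
        if not matched:
            # No active trace is close enough: start a new active trace.
            traces.append(Trace(positions=[(time_index, freq)]))
    return traces
```

For example, with one existing trace at 1.20 Hz and a `d_max` of 0.2 Hz, a peak at 1.25 Hz extends that trace, while a peak at 2.40 Hz starts a new one.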
The Addison/Puig combination teaches performing signal-to-noise ratio determination on the second red light digital signal and the second green light digital signal to generate a second determination result when the first determination result satisfies the validity condition (Addison, [0063]: “The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal.”); and performing PPG signal generation on the second red light digital signal and the second green light digital signal to generate a first PPG signal of the cached fragment video data when the second determination result satisfies the validity condition (Addison, [0090]: “The bottom plot in Figure 6B shows three different SpO2 values from a video signal, one for each pair of signals. The top trace is from a ratio of ratios calculation of the Red and Green signals, the middle is the Red and Blue signals, and the bottom is the Green and Blue signals”);
wherein the step of when the image quality detection result is a validity condition, performing one-dimensional red light signal extraction on all the cached fragment frame images in the image sequence of cached fragment frame according to the red light pixel threshold range to generate a first red light digital signal, and performing one-dimensional green light signal extraction on all the cached fragment frame images in the image sequence of cached fragment frame according to a preset green light pixel threshold range to generate a first green light digital signal specifically comprises:
a step 51, when the image quality detection result satisfies the validity condition, initializing the first red light digital signal to be null, initializing the first green light digital signal to be null, initializing a first index to 1, and initializing a first total number to a total number of the cached fragment frame images in the image sequence of cached fragment frame (Addison, [0046]: “the field of view 216 encompasses exposed skin of the patient, in order to detect physiologic signals visible from the skin such as arterial oxygen saturation (SpO2 or SvidO2). The camera generates a sequence of images over time. A measure of the amount, color, or brightness of light within all or a portion of the image over time is referred to as a light intensity signal. In an embodiment, each image includes a two-dimensional array or grid of pixels, and each pixel includes three color components - for example, red, green, and blue. A measure of one or more color components of one or more pixels over time is referred to as a "pixel signal," which is a type of light intensity signal. The camera operates at a frame rate, which is the number of image frames taken per second (or other time period). Example frame rates include 20, 30, 40, 50, or 60 frames per second, greater than 60 frames per second, or other values between those”);
a step 52, setting a first-index frame image as the cached fragment frame image, corresponding to the first index, in the image sequence of cached fragment frame;
a step 53, collecting all pixels, meeting the red light pixel threshold range, in the first-index frame image to generate a red pixel set, calculating a total number of pixels included in the red pixel set to generate a total number of red pixel, calculating the sum of pixel values of all the pixels in the red pixel set to generate a value sum of red pixel, and generating first-index frame red light channel data according to a quotient obtained by dividing the value sum of red pixel by the total number of red pixel; and adding the first-index frame red light channel data into the first red light digital signal as signal point data (Addison, [0082]: “Figure 10B also includes a cross-correlation process that cross-correlates the frequency spectrums of the three color signals to amplify the results. All four resulting spectrums are analyzed to select and accumulate peaks. A cross correlated spectrum can be calculated by multiplying or summing existing spectrum together. An individual spectrum can be scaled before being combined based on signal quality. For example, because most RGB cameras have twice the number of green pixels compare to red and blue ones, the Green signal is usually better and can be weighted above Red and Blue. This method can follow the strongest peaks around the spectrum over time, as the patient's physiology (such as respiration rate and heart rate) changes.”; [0070]: “Figure 5E shows a method for video-based monitoring of a patient's vital signs, according to an embodiment. The method includes receiving a video signal from a video camera at 511. The video signal includes a plurality of sequential image frames, each image frame having a field of view that includes exposed skin of a patient, such as the face or forehead. The method includes segmenting a first image frame into a plurality of regions at 512, and then, for each region, extracting from the video signal a time-varying color signal at 513. 
In an example, three time-varying color signals are extracted from each region, corresponding to red, green, and blue pixel values. The method includes identifying a frequency content of each color signal at 514, and selecting regions that have a shared frequency content at 515. The shared frequency content is a modulation at a shared frequency. For example, two regions that both exhibit color signals that modulate at the patient's heart rate, such as a frequency of 60 beats per minute, are selected. In an embodiment, the shared modulation must pass criteria, such as those described above, to select the desired regions. For example, an amplitude threshold for the modulation frequency can be applied as a criterion for selecting regions. In an embodiment, the regions that satisfy this criterion are non-adjacent to each other; they do not need to be in contact with each other or next to each other on the patient. Rather, regions that exhibit a shared modulation at a physiologic frequency, above a noise threshold, are selected even if they are located at disparate, non - contiguous locations across the patient.”; [0050]: “the video camera records multiple sequential image frames (such as image frames 300A and 300B) that each include the head region 314 and chest region 316. The pixels or detected regions in these sequential images exhibit subtle modulations caused by the patient's physiology, such as heartbeats and breaths. In particular, the color components of the pixels vary between the frames based on the patient's physiology. In one embodiment, the camera employs the Red/Green/Blue color space and records three values for each pixel in the image frame, one value each for the Red component of the pixel, the Blue component, and the Green component. 
Each pixel is recorded in memory as these three values, which may be integer numbers (typically ranging from 0 to 255 for 8-bit color depth, or from 0 to 4095 for 12-bit color depth) or fractions (such as between 0 and 1). Thus, three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal.”);
a step 54, collecting all pixels, meeting the green light pixel threshold range, in the first-index frame image to generate a green pixel set, calculating a total number of pixels included in the green pixel set to generate a total number of green pixels, calculating the sum of pixel values of all the pixels in the green pixel set to generate a value sum of green pixel, and generating first-index frame green light channel data according to a quotient obtained by dividing the green pixel value sum by the total number of green pixel; and adding the first-index frame green light channel data into the first green light digital signal as signal point data (Addison, [0082]: “Figure 10B also includes a cross-correlation process that cross-correlates the frequency spectrums of the three color signals to amplify the results. All four resulting spectrums are analyzed to select and accumulate peaks. A cross correlated spectrum can be calculated by multiplying or summing existing spectrum together. An individual spectrum can be scaled before being combined based on signal quality. For example, because most RGB cameras have twice the number of green pixels compare to red and blue ones, the Green signal is usually better and can be weighted above Red and Blue. This method can follow the strongest peaks around the spectrum over time, as the patient's physiology (such as respiration rate and heart rate) changes.”; [0070]: “Figure 5E shows a method for video-based monitoring of a patient's vital signs, according to an embodiment. The method includes receiving a video signal from a video camera at 511. The video signal includes a plurality of sequential image frames, each image frame having a field of view that includes exposed skin of a patient, such as the face or forehead. The method includes segmenting a first image frame into a plurality of regions at 512, and then, for each region, extracting from the video signal a time-varying color signal at 513. 
In an example, three time-varying color signals are extracted from each region, corresponding to red, green, and blue pixel values. The method includes identifying a frequency content of each color signal at 514, and selecting regions that have a shared frequency content at 515. The shared frequency content is a modulation at a shared frequency. For example, two regions that both exhibit color signals that modulate at the patient's heart rate, such as a frequency of 60 beats per minute, are selected. In an embodiment, the shared modulation must pass criteria, such as those described above, to select the desired regions. For example, an amplitude threshold for the modulation frequency can be applied as a criterion for selecting regions. In an embodiment, the regions that satisfy this criterion are non-adjacent to each other; they do not need to be in contact with each other or next to each other on the patient. Rather, regions that exhibit a shared modulation at a physiologic frequency, above a noise threshold, are selected even if they are located at disparate, non - contiguous locations across the patient.”; [0050]: “the video camera records multiple sequential image frames (such as image frames 300A and 300B) that each include the head region 314 and chest region 316. The pixels or detected regions in these sequential images exhibit subtle modulations caused by the patient's physiology, such as heartbeats and breaths. In particular, the color components of the pixels vary between the frames based on the patient's physiology. In one embodiment, the camera employs the Red/Green/Blue color space and records three values for each pixel in the image frame, one value each for the Red component of the pixel, the Blue component, and the Green component. 
Each pixel is recorded in memory as these three values, which may be integer numbers (typically ranging from 0 to 255 for 8-bit color depth, or from 0 to 4095 for 12-bit color depth) or fractions (such as between 0 and 1). Thus, three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal.”);
a step 55, increasing the first index by 1;
a step 56, determining whether the first index is greater than the first total number; if the first index is less than or equal to the first total number, performing the step 52; or, if the first index is greater than the first total number, performing a step 57 (Addison, [0050]: “the video camera records multiple sequential image frames (such as image frames 300A and 300B) that each include the head region 314 and chest region 316. The pixels or detected regions in these sequential images exhibit subtle modulations caused by the patient's physiology, such as heartbeats and breaths. In particular, the color components of the pixels vary between the frames based on the patient's physiology. In one embodiment, the camera employs the Red/Green/Blue color space and records three values for each pixel in the image frame, one value each for the Red component of the pixel, the Blue component, and the Green component. Each pixel is recorded in memory as these three values, which may be integer numbers (typically ranging from 0 to 255 for 8-bit color depth, or from 0 to 4095 for 12-bit color depth) or fractions (such as between 0 and 1). Thus, three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal.”); and 
the step 57, transferring the first red light digital signal to an upper processing process as a one-dimensional red light signal extraction result, and transferring the first green light digital signal to the upper processing process as a one-dimensional green light signal extraction result (Addison, [0017]: “extracting from the video signal time-varying red, green, and blue signals”; [0019]: “extracting the red, green, and blue signals comprises selecting pixels within the image frame exhibiting a modulation that is at the primary frequency and that has an amplitude above a threshold”; [0050]: “three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal”).
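For orientation, claimed steps 51 through 57 can be read as a per-frame loop that averages the in-range pixel values of each channel. The Python sketch below follows that reading under stated assumptions and is not taken from the application or from Addison. Frames are assumed to be lists of `(r, g, b)` tuples, each threshold range is assumed to be an inclusive `(low, high)` pair applied to that channel's own value, and `channel_mean` and `extract_signals` are hypothetical names. The claim text does not say what happens when no pixel falls inside a range, so the sketch raises an error in that case.

```python
def channel_mean(frame, channel, value_range):
    """Mean of one color channel over the pixels inside value_range."""
    low, high = value_range
    values = [px[channel] for px in frame if low <= px[channel] <= high]
    if not values:
        # Assumption: the claim does not cover an empty pixel set.
        raise ValueError("no pixels fall inside the threshold range")
    # Quotient of the pixel value sum and the pixel count (steps 53 and 54).
    return sum(values) / len(values)


def extract_signals(frames, red_range, green_range):
    """Build the first red and green digital signals from a frame sequence."""
    red_signal = []    # step 51: initialize both signals as empty
    green_signal = []
    index = 1          # step 51: first index starts at 1
    total = len(frames)
    while index <= total:                   # step 56: loop until index > total
        frame = frames[index - 1]           # step 52: select the indexed frame
        red_signal.append(channel_mean(frame, 0, red_range))      # step 53
        green_signal.append(channel_mean(frame, 1, green_range))  # step 54
        index += 1                          # step 55
    return red_signal, green_signal         # step 57: hand results upward
```

Each frame contributes exactly one signal point per channel, so a fragment of N frames yields two one-dimensional signals of length N.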

Regarding claim 2, the Addison/Puig combination teaches the method for generating PPG signals according to Claim 1, wherein the method further comprises: performing a real time display on the first PPG signal and locally displaying a PPG waveform image in real time through a display interface (Addison, [0115]: “Component 1 exhibits a repeating pattern of modulations at a relatively steady frequency. Component 1 is constructed from the portions of the source signals that modulate at that frequency. In this case, the frequency of the modulations in Component 1 represents the heart rate of the patient. The contributions of the patient's heart rate to each source signal have been pulled together and combined into the waveform of Component 1, creating a waveform that identifies the heart rate more clearly than any single source signal did”; [0117]: “As discussed above, different groups of pixels or regions can be selected to measure different vital signs, such as heart rate and respiration rate. Figure 13 represents the source signals from a first group of pixels or regions that modulate with the patient's heart rate. These signals are decomposed via ICA to arrive at a relatively clean heart rate signal in Component 1. A different group of pixels or regions that modulate with respiration rate can also be decomposed via ICA to arrive at a relatively clean respiration rate signal. Another region may be decomposed via ICA to arrive a pulsatile signal that demonstrate perfusion status of the patient (such as by Delta POP or DPOP, by measuring the variations in amplitude of the pulses at the top and bottom of the baseline modulations). These vital signs may be measured from the same region or different regions”; [0121]: “The vital signs are output for further processing or display”; [0055]: “The plots in Figure 4A show a clear pattern of repeating modulations or pulses over time. 
The pulses in each region 1A, 2A, 3A and in the Combined Forehead plot are caused by the patient's heart beats, which move blood through those regions in the patient's forehead, causing the pixels to change color with each beat. The heart rate of the patient can be measured from these signals by measuring the frequency of the modulations. This measurement can be taken via a frequency transform of the signal (discussed below with reference to Figure 10A and Figure 4B ) or via a pulse recognition algorithm that identifies each pulse in the signal (for example, by pulse size and shape, by zero crossings, maximums, or minimums in the derivative of the signal, and/or by checking the skew of the derivative of the signal to identify a pulse as a cardiac pulse, which has a characteristically negative skew). The modulations in the plot of the Chest region, in Figure 4A , are caused by the patient's breaths, which cause the chest to move in correspondence with the breathing rate. The patient's breathing/respiration rate can be measured from this signal in the same way as just described for the heart rate (except for the skew approach).”).

Regarding claim 3, the Addison/Puig combination teaches the method for generating PPG signals according to Claim 1, wherein the method further comprises: 
after the continuous acquisition and photographing operation, performing video stitching on all the cached fragment video data in chronological order by the mobile terminal to generate complete skin surface video data (Addison, [0046]: “The camera generates a sequence of images over time. A measure of the amount, color, or brightness of light within all or a portion of the image over time is referred to as a light intensity signal. In an embodiment, each image includes a two-dimensional array or grid of pixels, and each pixel includes three color components - for example, red, green, and blue. A measure of one or more color components of one or more pixels over time is referred to as a "pixel signal," which is a type of light intensity signal. The camera operates at a frame rate, which is the number of image frames taken per second (or other time period). Example frame rates include 20, 30, 40, 50, or 60 frames per second, greater than 60 frames per second, or other values between those.”), and sending the video data of complete skin surface to a remote server ([0047]: “The detected images are sent to a monitor 224, which may be integrated with the camera 214 or separate from it and coupled via wired or wireless communication with the camera (such as wireless communication 220 shown in Figure 2A ). The monitor 224 includes a processor 218, a display 222, and hardware memory 226 for storing software and computer instructions. Sequential image frames of the patient are recorded by the video camera 214 and sent to the processor 218 for analysis”); 
performing frame image extraction on the video data of complete skin surface by the server to generate an image sequence of complete skin surface video frame, wherein the image sequence of complete skin surface video frame comprises multiple complete skin surface video frame images (Addison, [0082]: “Figure 10B also includes a cross-correlation process that cross-correlates the frequency spectrums of the three color signals to amplify the results. All four resulting spectrums are analyzed to select and accumulate peaks. A cross correlated spectrum can be calculated by multiplying or summing existing spectrum together. An individual spectrum can be scaled before being combined based on signal quality. For example, because most RGB cameras have twice the number of green pixels compare to red and blue ones, the Green signal is usually better and can be weighted above Red and Blue. This method can follow the strongest peaks around the spectrum over time, as the patient's physiology (such as respiration rate and heart rate) changes.”; [0047]: “The detected images are sent to a monitor 224, which may be integrated with the camera 214 or separate from it and coupled via wired or wireless communication with the camera (such as wireless communication 220 shown in Figure 2A ). The monitor 224 includes a processor 218, a display 222, and hardware memory 226 for storing software and computer instructions. Sequential image frames of the patient are recorded by the video camera 214 and sent to the processor 218 for analysis”); 
performing one-dimensional red light signal extraction on all the complete skin surface video frame images in the image sequence of complete skin surface video frame according to a preset server red light pixel threshold range to generate a first server red light digital signal, and performing one-dimensional green light signal extraction on all the complete skin surface video frame images in the image sequence of complete skin surface video frame according to a preset server green light pixel threshold range to generate a first server green light digital signal (Addison, [0017]: “extracting from the video signal time-varying red, green, and blue signals”; [0019]: “extracting the red, green, and blue signals comprises selecting pixels within the image frame exhibiting a modulation that is at the primary frequency and that has an amplitude above a threshold”; [0050]: “three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal”); 
according to a preset server band-pass filtering frequency threshold range, performing signal band-pass filtering preprocessing on the first server red light digital signal to generate a second server red light digital signal, and performing signal band-pass filtering preprocessing on the first server green light digital signal to generate a second server green light digital signal (Addison, [0080]: “De-noising includes filtering the signal to remove noise sources and frequencies outside of a known physiologic range”. Denoising filtering can include signal band-pass filtering.); performing maximum frequency difference determination on the second server red light digital signal and the second server green light digital signal to generate a first server determination result (Puig, [0054]: “to develop trace plot 450 from plot 400, qualified peaks are processed in order of descending Relative FFT value. A position corresponding to a qualified peak is added to any trace whose most recent frequency differs from the qualified peak's frequency by less than a maximum frequency difference D.sub.MAX. 
If a peak is not closer to any active trace than D.sub.MAX, the qualified peak can become a position in a new active trace”); when the first server determination result satisfies the validity condition, performing signal-to-noise ratio determination on the second server red light digital signal and the second server green light digital signal to generate a second server determination result (Addison, [0063]: “The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal.”); and when the second server determination result satisfies the validity condition, performing PPG signal generation on the second server red light digital signal and the second server green light digital signal to generate a second PPG signal of the complete skin surface video data (Addison, [0090]: “The bottom plot in Figure 6B shows three different SpO2 values from a video signal, one for each pair of signals. The top trace is from a ratio of ratios calculation of the Red and Green signals, the middle is the Red and Blue signals, and the bottom is the Green and Blue signals”); and
performing PPG data file conversion on the second PPG signal by the server to generate a PPG data file of complete skin surface, and saving the video data of complete skin surface and the PPG data file of complete skin surface in a medical database (Addison, [0135]: “The computer storage media may include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic disk storage, or any other hardware medium which may be used to store desired information and that may be accessed by components of the system. Components of the system may communicate with each other via wired or wireless communication”).

Regarding claim 4, the Addison/Puig combination teaches the method for generating PPG signals according to Claim 1, wherein the step of performing frame image extraction on the cached fragment video data to generate a cached fragment frame image sequence, and performing image quality detection on all cached fragment frame images in the image sequence of cached fragment frame according to a preset red light pixel threshold range to generate an image quality detection result comprises: 
sequentially performing data fragment extraction on the cached fragment video data according to a preset cached fragment frame data length to obtain multiple data fragments, generating the multiple cached fragment frame images according to the extracted multiple data fragments, and sorting all the cached fragment frame images in order to generate the image sequence of cached fragment frame (Addison, [0046]: “the field of view 216 encompasses exposed skin of the patient, in order to detect physiologic signals visible from the skin such as arterial oxygen saturation (SpO2 or SvidO2). The camera generates a sequence of images over time. A measure of the amount, color, or brightness of light within all or a portion of the image over time is referred to as a light intensity signal. In an embodiment, each image includes a two-dimensional array or grid of pixels, and each pixel includes three color components - for example, red, green, and blue. A measure of one or more color components of one or more pixels over time is referred to as a "pixel signal," which is a type of light intensity signal. The camera operates at a frame rate, which is the number of image frames taken per second (or other time period). Example frame rates include 20, 30, 40, 50, or 60 frames per second, greater than 60 frames per second, or other values between those”); and 
sequentially performing cached fragment frame image extraction on the image sequence of cached fragment frame in chronological order to generate a current cached fragment frame image (Addison, [0017]: “extracting from the video signal time-varying red, green, and blue signals”; [0019]: “extracting the red, green, and blue signals comprises selecting pixels within the image frame exhibiting a modulation that is at the primary frequency and that has an amplitude above a threshold”; [0050]: “three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal”); calculating a total number of pixels in the current cached fragment frame image to generate a total pixel number, and calculating a total number of pixels with a pixel value within the red light pixel threshold range to generate a total number of red light pixel; generating a red proportion parameter of a current frame according to a ratio of the total number of red light pixel to the total number of pixels (Addison, [0050]: “the video camera records multiple sequential image frames (such as image frames 300A and 300B) that each include the head region 314 and chest region 316. The pixels or detected regions in these sequential images exhibit subtle modulations caused by the patient's physiology, such as heartbeats and breaths. In particular, the color components of the pixels vary between the frames based on the patient's physiology. In one embodiment, the camera employs the Red/Green/Blue color space and records three values for each pixel in the image frame, one value each for the Red component of the pixel, the Blue component, and the Green component. Each pixel is recorded in memory as these three values, which may be integer numbers (typically ranging from 0 to 255 for 8-bit color depth, or from 0 to 4095 for 12-bit color depth) or fractions (such as between 0 and 1). 
Thus, three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal.”); performing image quality detection on the red proportion parameter of the current frame according to a preset red proportion lower threshold; when the red proportion parameter of the current frame is less than the red proportion lower threshold, setting the image quality detection result as a invalidity condition, and stopping the cached fragment frame image extraction performed on the image sequence of cached fragment frame; or, when the red proportion parameter of the current frame is greater than or equal to the red proportion lower threshold, setting the image quality detection result as the validity condition, and continuing to perform the cached fragment frame image extraction on the image sequence of cached fragment frame (Addison, [0082]: “Figure 10B also includes a cross-correlation process that cross-correlates the frequency spectrums of the three color signals to amplify the results. All four resulting spectrums are analyzed to select and accumulate peaks. A cross correlated spectrum can be calculated by multiplying or summing existing spectrum together. An individual spectrum can be scaled before being combined based on signal quality. For example, because most RGB cameras have twice the number of green pixels compare to red and blue ones, the Green signal is usually better and can be weighted above Red and Blue. This method can follow the strongest peaks around the spectrum over time, as the patient's physiology (such as respiration rate and heart rate) changes.”; [0070]: “Figure 5E shows a method for video-based monitoring of a patient's vital signs, according to an embodiment. The method includes receiving a video signal from a video camera at 511. The video signal includes a plurality of sequential image frames, each image frame having a field of view that includes exposed skin of a patient, such as the face or forehead. 
The method includes segmenting a first image frame into a plurality of regions at 512, and then, for each region, extracting from the video signal a time-varying color signal at 513. In an example, three time-varying color signals are extracted from each region, corresponding to red, green, and blue pixel values. The method includes identifying a frequency content of each color signal at 514, and selecting regions that have a shared frequency content at 515. The shared frequency content is a modulation at a shared frequency. For example, two regions that both exhibit color signals that modulate at the patient's heart rate, such as a frequency of 60 beats per minute, are selected. In an embodiment, the shared modulation must pass criteria, such as those described above, to select the desired regions. For example, an amplitude threshold for the modulation frequency can be applied as a criterion for selecting regions. In an embodiment, the regions that satisfy this criterion are non-adjacent to each other; they do not need to be in contact with each other or next to each other on the patient. Rather, regions that exhibit a shared modulation at a physiologic frequency, above a noise threshold, are selected even if they are located at disparate, non-contiguous locations across the patient.”).
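For illustration only, the red proportion check recited in claim 4 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation and not a method disclosed by Addison. The function names `red_proportion` and `check_image_quality`, the representation of a frame as a flat list of 8-bit (R, G, B) tuples, and the choice to test the red channel against the red light pixel threshold range are all hypothetical.

```python
from typing import Iterable, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B); 8-bit values are an assumption


def red_proportion(frame: Sequence[Pixel], red_range: Tuple[int, int]) -> float:
    """Ratio of pixels whose red value lies inside red_range to all pixels."""
    total = len(frame)
    if total == 0:
        return 0.0
    low, high = red_range
    red_count = sum(1 for (r, _g, _b) in frame if low <= r <= high)
    return red_count / total


def check_image_quality(frames: Iterable[Sequence[Pixel]],
                        red_range: Tuple[int, int],
                        lower_threshold: float) -> bool:
    """Return True (validity) only if every frame meets the lower threshold.

    Frames are visited in order and extraction stops at the first frame
    whose red proportion falls below the threshold (invalidity).
    """
    for frame in frames:
        if red_proportion(frame, red_range) < lower_threshold:
            return False
    return True


if __name__ == "__main__":
    covered = [(200, 10, 10)] * 9 + [(20, 10, 10)]      # mostly red pixels
    uncovered = [(200, 10, 10)] * 5 + [(20, 10, 10)] * 5
    print(check_image_quality([covered, covered], (150, 255), 0.8))
    print(check_image_quality([covered, uncovered], (150, 255), 0.8))
```

The early return mirrors the claim language in which frame extraction stops as soon as one frame fails the lower threshold.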

Regarding claim 6, the Addison/Puig combination teaches the method for generating PPG signals according to Claim 1, wherein the step of performing signal band-pass filtering preprocessing on the first red light digital signal according to a preset band-pass filtering frequency threshold range to generate a second red light digital signal, and performing signal band-pass filtering preprocessing on the first green light digital signal according to the preset band-pass filtering frequency threshold range to generate a second green light digital signal comprises: 
according to the band-pass filtering frequency threshold range, performing digital signal filtering processing on low-frequency noise signal points with a signal frequency lower than the band-pass filtering frequency threshold range and high-frequency noise signal points with a signal frequency higher than the band-pass filtering frequency threshold range in the first red light digital signal to generate the second red light digital signal (Addison, [0080]: “The method includes capturing video, acquiring and averaging color signals (shown as pR, pG, and pB for "photoplethysmogram" red, green, and blue) within a well-perfused ROI, de-noising the signal, performing an FFT (fast Fourier transform) operation over a sliding time window (such as 20 seconds) to identify frequency components of the signals, finding peak frequencies, and accumulating peaks over a period of time (such as one second). De-noising includes filtering the signal to remove noise sources and frequencies outside of a known physiologic range. Examples of filtering operations to remove noise are described below with reference to Figures 17A and 17B. In accumulating peaks, the method may add frequencies multiple times based on their relative height, and may add harmonics of already-added frequencies only once. Frequency peaks are added to the accumulator at the frame rate, such as 25-30 times per second”. Denoising filtering can include signal band-pass filtering; [0040]: “The present invention relates to the field of medical monitoring, and in particular non-contact, video-based monitoring of pulse rate, respiration rate, motion, activity, and oxygen saturation. 
Systems and methods are described for receiving a video signal in view of a patient, identifying a physiologically relevant area within the video image (such as a patient's forehead or chest), extracting a light intensity signal from the relevant area, filtering those signals to focus on a physiologic component, and measuring a vital sign from the filtered signals. The video signal is detected by a camera that views but does not contact the patient. With appropriate selection and filtering of the video signal detected by the camera, the physiologic contribution to the detected signal can be isolated and measured, producing a useful vital sign measurement without placing a detector in physical contact with the patient”); and 
according to the band-pass filtering frequency threshold range, performing digital signal filtering processing on low-frequency noise signal points with a signal frequency lower than the band-pass filtering frequency threshold range and high-frequency noise signal points with a signal frequency higher than the band-pass filtering frequency threshold range in the first green light digital signal to generate the second green light digital signal (Addison, [0080]: “The method includes capturing video, acquiring and averaging color signals (shown as pR, pG, and pB for "photoplethysmogram" red, green, and blue) within a well-perfused ROI, de-noising the signal, performing an FFT (fast Fourier transform) operation over a sliding time window (such as 20 seconds) to identify frequency components of the signals, finding peak frequencies, and accumulating peaks over a period of time (such as one second). De-noising includes filtering the signal to remove noise sources and frequencies outside of a known physiologic range. Examples of filtering operations to remove noise are described below with reference to Figures 17A and 17B. In accumulating peaks, the method may add frequencies multiple times based on their relative height, and may add harmonics of already-added frequencies only once. Frequency peaks are added to the accumulator at the frame rate, such as 25-30 times per second”. Denoising filtering can include signal band-pass filtering; [0040]: “The present invention relates to the field of medical monitoring, and in particular non-contact, video-based monitoring of pulse rate, respiration rate, motion, activity, and oxygen saturation. 
Systems and methods are described for receiving a video signal in view of a patient, identifying a physiologically relevant area within the video image (such as a patient's forehead or chest), extracting a light intensity signal from the relevant area, filtering those signals to focus on a physiologic component, and measuring a vital sign from the filtered signals. The video signal is detected by a camera that views but does not contact the patient. With appropriate selection and filtering of the video signal detected by the camera, the physiologic contribution to the detected signal can be isolated and measured, producing a useful vital sign measurement without placing a detector in physical contact with the patient”).
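For illustration only, the band-pass preprocessing recited in claim 6 can be sketched as below. Neither the claim nor the cited Addison paragraphs specify a filter design, so this sketch swaps in a simple technique: it zeroes discrete Fourier transform bins outside the pass band and transforms back. The names `dft`, `idft_real`, and `band_pass`, the 30 frames-per-second sampling rate, and the 0.5 to 4.0 Hz band are assumptions chosen for the example.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Direct O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spectrum: Sequence[complex]) -> List[float]:
    """Inverse DFT, keeping only the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def band_pass(signal: Sequence[float], fs: float,
              low_hz: float, high_hz: float) -> List[float]:
    """Remove components below low_hz and above high_hz by bin masking."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # Bin k and bin n-k carry the same physical frequency.
        freq = min(k, n - k) * fs / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0j
    return idft_real(spectrum)


if __name__ == "__main__":
    fs, n = 30.0, 150  # assumed frame rate and a 5-second window
    t = [i / fs for i in range(n)]
    # Offset + slow drift (0.2 Hz) + pulse-like tone (1.2 Hz) + fast noise (8 Hz)
    raw = [2.0 + 0.5 * math.sin(2 * math.pi * 0.2 * ti)
           + math.sin(2 * math.pi * 1.2 * ti)
           + 0.5 * math.sin(2 * math.pi * 8.0 * ti) for ti in t]
    filtered = band_pass(raw, fs, 0.5, 4.0)
    print(round(max(filtered), 3), round(min(filtered), 3))
```

Bin masking is used here only because it is short and dependency-free; a practical system would more likely use a designed filter with controlled roll-off.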

Regarding claim 7, the Addison/Puig combination teaches the method for generating PPG signals according to Claim 1, wherein the step of performing maximum frequency difference determination on the second red light digital signal and the second green light digital signal to generate a first determination result, performing signal-to-noise ratio determination on the second red light digital signal and the second green light digital signal to generate a second determination result when the first determination result satisfies the validity condition, and performing PPG signal generation on the second red light digital signal and the second green light digital signal to generate a first PPG signal of the cached fragment video data when the second determination result satisfies the validity condition comprises: 
a step 71, performing digital signal time domain-frequency domain conversion on the second red light digital signal through discrete Fourier transform to generate a red light frequency domain signal, performing digital signal time domain-frequency domain conversion on the second green light digital signal through discrete Fourier transform to generate a green light frequency domain signal; extracting a maximum-energy frequency from the red light frequency domain signal to generate a maximum red light frequency, and extracting a maximum-energy frequency from the green light frequency domain signal to generate a maximum green light frequency (Addison, [0080]: “As discussed above, a region of interest can be formed based on pixels that modulate with the patient's heart rate. Heart rate can then be calculated from the frequency content of that pixel signal. An example method for calculating heart rate is shown in Figure 10B. The method includes capturing video, acquiring and averaging color signals (shown as pR, pG, and pB for "photoplethysmogram" red, green, and blue) within a well-perfused ROI, de-noising the signal, performing an FFT (fast Fourier transform) operation over a sliding time window (such as 20 seconds) to identify frequency components of the signals, finding peak frequencies, and accumulating peaks over a period of time (such as one second).”); calculating a frequency difference between the maximum red light frequency and the maximum green light frequency to generate a maximum red-green frequency difference; when the maximum red-green frequency difference does not exceed a preset maximum frequency difference threshold range, setting the first determination result as the validity condition; or, when the maximum red-green frequency difference exceeds the maximum frequency difference threshold range, setting the first determination result as a invalidity condition (Puig, [0054]: “to develop trace plot 450 from plot 400, qualified peaks are processed in order of descending Relative FFT value. A position corresponding to a qualified peak is added to any trace whose most recent frequency differs from the qualified peak's frequency by less than a maximum frequency difference D.sub.MAX. If a peak is not closer to any active trace than D.sub.MAX, the qualified peak can become a position in a new active trace”);
a step 72, when the first determination result satisfies the validity condition, according to a preset band-stop filtering frequency threshold range, removing valid signal points with a signal frequency meeting the band-stop filtering frequency threshold range from the second red light digital signal through multi-order Butterworth band-stop filtering to generate a red light noise signal, and removing valid signal points with a signal frequency meeting the band-stop filtering frequency threshold range from the second green light digital signal through multi-order Butterworth band-stop filtering to generate a green light noise signal (Addison, [0080]: “The method includes capturing video, acquiring and averaging color signals (shown as pR, pG, and pB for "photoplethysmogram" red, green, and blue) within a well-perfused ROI, de-noising the signal, performing an FFT (fast Fourier transform) operation over a sliding time window (such as 20 seconds) to identify frequency components of the signals, finding peak frequencies, and accumulating peaks over a period of time (such as one second). De-noising includes filtering the signal to remove noise sources and frequencies outside of a known physiologic range. Examples of filtering operations to remove noise are described below with reference to Figures 17A and 17B. In accumulating peaks, the method may add frequencies multiple times based on their relative height, and may add harmonics of already-added frequencies only once. Frequency peaks are added to the accumulator at the frame rate, such as 25-30 times per second”. Denoising filtering can include signal band-pass filtering; [0040]: “The present invention relates to the field of medical monitoring, and in particular non-contact, video-based monitoring of pulse rate, respiration rate, motion, activity, and oxygen saturation. 
Systems and methods are described for receiving a video signal in view of a patient, identifying a physiologically relevant area within the video image (such as a patient's forehead or chest), extracting a light intensity signal from the relevant area, filtering those signals to focus on a physiologic component, and measuring a vital sign from the filtered signals. The video signal is detected by a camera that views but does not contact the patient. With appropriate selection and filtering of the video signal detected by the camera, the physiologic contribution to the detected signal can be isolated and measured, producing a useful vital sign measurement without placing a detector in physical contact with the patient”);
a step 73, calculating signal energy of the second red light digital signal to generate red light signal energy, calculating signal energy of the red light noise signal to generate red light noise energy, generating valid red light signal energy according to a difference between the red light signal energy and the red light noise energy, and generating a red light signal-to-noise ratio according to a ratio of the valid red light signal energy to the red light noise energy (Addison, [0063]: “Selecting non-adjacent regions enables the system to focus on the pixels or regions that carry the physiologic signal with the highest signal to noise ratio, ignoring other areas in the image frame that are contributing a relatively higher degree of noise, such as pixels that do not vary much with heart rate, but that might vary due to a passing shadow or patient movement. The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal. With signals from several regions available, the signals with the strongest SNR can be chosen, and signals with weak SNR can be discarded. The chosen signals can be combined together to produce a signal with a strong physiologic component”; [0072]: “The combined light signal can be used to calculate statistics, such as an amplitude of the physiologic frequency (in the time or frequency domain), a variability of the frequency over time, a variability of the intensity or color of the selected pixels over time, a skew of the modulations, or a signal to noise ratio. Skew is a useful metric because cardiac pulses tend to have a negative skew. Thus, modulations of pixels that exhibit a negative skew may be more likely to be physiologic. In an embodiment, one or more statistics are calculated, and then used to apply a weight to each color signal (from an individual pixel or from a region) that is being combined. 
This method results in a weighted average that applies more weight to the pixels that exhibit modulations that are stronger or more likely to be physiologic. For example, pixels that modulate with a strongly negative skew, or a high signal to noise ratio, can be weighted more heavily. The criteria used to select regions can also be used to assign weights; for example, regions or pixels that meet a first, stricter set of criteria may be combined with a first, higher weight, and regions or pixels that meet a second, looser set of criteria may be combined with a second, lower weight”); 
a step 74, calculating signal energy of the second green light digital signal to generate green light signal energy, calculating signal energy of the green light noise signal to generate green light noise energy, generating valid green light signal energy according to a difference between the green light signal energy and the green light noise energy, and generating a green light signal-to-noise ratio according to a ratio of the valid green light signal energy to the green light noise energy (Addison, [0063]: “Selecting non-adjacent regions enables the system to focus on the pixels or regions that carry the physiologic signal with the highest signal to noise ratio, ignoring other areas in the image frame that are contributing a relatively higher degree of noise, such as pixels that do not vary much with heart rate, but that might vary due to a passing shadow or patient movement. The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal. With signals from several regions available, the signals with the strongest SNR can be chosen, and signals with weak SNR can be discarded. The chosen signals can be combined together to produce a signal with a strong physiologic component”; [0072]: “The combined light signal can be used to calculate statistics, such as an amplitude of the physiologic frequency (in the time or frequency domain), a variability of the frequency over time, a variability of the intensity or color of the selected pixels over time, a skew of the modulations, or a signal to noise ratio. Skew is a useful metric because cardiac pulses tend to have a negative skew. Thus, modulations of pixels that exhibit a negative skew may be more likely to be physiologic. In an embodiment, one or more statistics are calculated, and then used to apply a weight to each color signal (from an individual pixel or from a region) that is being combined. 
This method results in a weighted average that applies more weight to the pixels that exhibit modulations that are stronger or more likely to be physiologic. For example, pixels that modulate with a strongly negative skew, or a high signal to noise ratio, can be weighted more heavily. The criteria used to select regions can also be used to assign weights; for example, regions or pixels that meet a first, stricter set of criteria may be combined with a first, higher weight, and regions or pixels that meet a second, looser set of criteria may be combined with a second, lower weight”); 
a step 75, when the red light signal-to-noise ratio and the green light signal-to-noise ratio are both less than a preset signal-to-noise threshold, setting the second determination result as the invalidity condition; or, when any one of the red light signal-to-noise ratio and the green light signal-to-noise ratio is greater than or equal to the signal-to-noise threshold, setting the second determination result as the validity condition (Addison, [0063]: “Selecting non-adjacent regions enables the system to focus on the pixels or regions that carry the physiologic signal with the highest signal to noise ratio, ignoring other areas in the image frame that are contributing a relatively higher degree of noise, such as pixels that do not vary much with heart rate, but that might vary due to a passing shadow or patient movement. The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal. With signals from several regions available, the signals with the strongest SNR can be chosen, and signals with weak SNR can be discarded. The chosen signals can be combined together to produce a signal with a strong physiologic component”; [0072]: “The combined light signal can be used to calculate statistics, such as an amplitude of the physiologic frequency (in the time or frequency domain), a variability of the frequency over time, a variability of the intensity or color of the selected pixels over time, a skew of the modulations, or a signal to noise ratio. Skew is a useful metric because cardiac pulses tend to have a negative skew. Thus, modulations of pixels that exhibit a negative skew may be more likely to be physiologic. In an embodiment, one or more statistics are calculated, and then used to apply a weight to each color signal (from an individual pixel or from a region) that is being combined. 
This method results in a weighted average that applies more weight to the pixels that exhibit modulations that are stronger or more likely to be physiologic. For example, pixels that modulate with a strongly negative skew, or a high signal to noise ratio, can be weighted more heavily. The criteria used to select regions can also be used to assign weights; for example, regions or pixels that meet a first, stricter set of criteria may be combined with a first, higher weight, and regions or pixels that meet a second, looser set of criteria may be combined with a second, lower weight”); and 
a step 76, when the second determination result satisfies the validity condition, setting a red light digital signal of the first PPG signal as the second red light digital signal, and setting a green light digital signal of the first PPG signal as the second green light digital signal, wherein the first PPG signal comprises the red light digital signal and the green light digital signal ([0042]: “The sensor is placed in contact with the patient, such as by clipping or adhering the sensor around a finger, toe, or ear of a patient. The sensor's emitters emit light of two particular wavelengths into the patient's tissue, and the photodetector detects the light after it is reflected or transmitted through the tissue. The detected light signal, called a photoplethysmogram (PPG), modulates with the patient's heartbeat, as each arterial pulse passes through the monitored tissue and affects the amount of light absorbed or scattered.”; [0021]: “the method also includes identifying, in the component signal, individual pulses representative of individual heart beats; for each identified pulse, locating a corresponding portion of two of the red, green, and blue signals; and measuring blood oxygen saturation of the patient from the located corresponding portions of the two signals”).
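For illustration only, the two determinations recited in steps 71 through 75 of claim 7 can be sketched as below. This is a minimal sketch under stated assumptions, not the applicant's implementation and not a method disclosed by Addison or Puig. The claim recites multi-order Butterworth band-stop filtering to obtain the noise signal; this sketch swaps in discrete Fourier transform bin masking and uses Parseval's relation to obtain the noise energy directly from the bins outside the stop band. The names `dft_power`, `peak_frequency`, `snr`, and `evaluate`, along with every numeric threshold, are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def dft_power(x: Sequence[float]) -> List[float]:
    """Squared magnitude of each DFT bin (direct O(n^2) transform)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n)]


def peak_frequency(power: Sequence[float], fs: float) -> float:
    """Frequency in Hz of the strongest positive-frequency bin, skipping DC."""
    n = len(power)
    k = max(range(1, n // 2 + 1), key=lambda i: power[i])
    return k * fs / n


def snr(power: Sequence[float], fs: float, stop_band: Tuple[float, float]) -> float:
    """(signal energy - noise energy) / noise energy, per steps 73 and 74.

    Noise energy is the energy left after removing the stop band, which by
    Parseval's relation is the sum of out-of-band bin powers divided by n.
    """
    n = len(power)
    low, high = stop_band
    total = sum(power) / n
    noise = sum(p for k, p in enumerate(power)
                if not (low <= min(k, n - k) * fs / n <= high)) / n
    if noise == 0.0:
        return math.inf
    return (total - noise) / noise


def evaluate(red: Sequence[float], green: Sequence[float], fs: float,
             max_diff_hz: float, stop_band: Tuple[float, float],
             snr_threshold: float) -> Tuple[bool, bool]:
    """Return (first determination, second determination) as validity flags."""
    p_red, p_green = dft_power(red), dft_power(green)
    diff = abs(peak_frequency(p_red, fs) - peak_frequency(p_green, fs))
    if diff > max_diff_hz:
        return False, False  # step 71: peak frequencies disagree
    # Step 75: valid if either channel reaches the threshold.
    second = (snr(p_red, fs, stop_band) >= snr_threshold
              or snr(p_green, fs, stop_band) >= snr_threshold)
    return True, second


if __name__ == "__main__":
    fs, n = 30.0, 150
    t = [i / fs for i in range(n)]
    red = [math.sin(2 * math.pi * 1.2 * ti) + 0.1 * math.sin(2 * math.pi * 8 * ti) for ti in t]
    green = [0.8 * math.sin(2 * math.pi * 1.2 * ti) + 0.1 * math.sin(2 * math.pi * 8 * ti) for ti in t]
    print(evaluate(red, green, fs, 0.3, (0.5, 4.0), 10.0))
```

The sketch assumes both inputs have already been band-pass filtered, so any residual offset is simply counted as noise.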

Regarding claim 8, the Addison/Puig combination teaches the method for generating PPG signals according to Claim 1, wherein the method further comprises: 
when the image quality detection result is an invalidity condition, stopping the continuous acquisition and photographing operation and generating out-of-skin surface error information, then transferring the out-of-skin surface error information to the mobile terminal, and displaying, by the mobile terminal, the out-of-skin surface error information as warning information through the display interface (Addison, [0063]: “Selecting non-adjacent regions enables the system to focus on the pixels or regions that carry the physiologic signal with the highest signal to noise ratio, ignoring other areas in the image frame that are contributing a relatively higher degree of noise, such as pixels that do not vary much with heart rate, but that might vary due to a passing shadow or patient movement. The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal. With signals from several regions available, the signals with the strongest SNR can be chosen, and signals with weak SNR can be discarded. The chosen signals can be combined together to produce a signal with a strong physiologic component”; [0072]: “The combined light signal can be used to calculate statistics, such as an amplitude of the physiologic frequency (in the time or frequency domain), a variability of the frequency over time, a variability of the intensity or color of the selected pixels over time, a skew of the modulations, or a signal to noise ratio. Skew is a useful metric because cardiac pulses tend to have a negative skew. Thus, modulations of pixels that exhibit a negative skew may be more likely to be physiologic. In an embodiment, one or more statistics are calculated, and then used to apply a weight to each color signal (from an individual pixel or from a region) that is being combined. 
This method results in a weighted average that applies more weight to the pixels that exhibit modulations that are stronger or more likely to be physiologic. For example, pixels that modulate with a strongly negative skew, or a high signal to noise ratio, can be weighted more heavily. The criteria used to select regions can also be used to assign weights; for example, regions or pixels that meet a first, stricter set of criteria may be combined with a first, higher weight, and regions or pixels that meet a second, looser set of criteria may be combined with a second, lower weight”; [0093]: “Figure 7 shows a method of calibrating a video-based SpO2 measurement, according to an embodiment of the invention. The method includes performing a spot check with a contact oximeter at 701, comparing the oximeter SpO2 to the video SpO2 (also called Svid O2) at 702, and determining the calibration between the two values (such as an offset, scaling factor, and/or coefficient) at 703. The method then includes measuring SpO2 from the video signal with the calibration at 704. At 705, a timer is used to prompt re-calibration. For example, the timer may be set to expire in 15 minutes, or one hour, or two hours, or other time durations desired by the caregiver. If the time has expired, the method returns to 701; if not, the method continues to 706, where the video SpO2 value is compared to a threshold to identify changes. If the video SpO2 value crosses the threshold, the method includes sounding an alarm (such as an audible sound and/or a visible alert) at 707, and prompting re-calibration at 701. If not, the method returns to continue measuring at 704”); 
when the first determination result is the invalidity condition, stopping the continuous acquisition and photographing operation and the PPG signal generation process and generating signal quality error information, then transferring the signal quality error information to the mobile terminal, and displaying, by the mobile terminal, the signal quality error information as warning information through the display interface (Addison, [0063]: “Selecting non-adjacent regions enables the system to focus on the pixels or regions that carry the physiologic signal with the highest signal to noise ratio, ignoring other areas in the image frame that are contributing a relatively higher degree of noise, such as pixels that do not vary much with heart rate, but that might vary due to a passing shadow or patient movement. The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal. With signals from several regions available, the signals with the strongest SNR can be chosen, and signals with weak SNR can be discarded. The chosen signals can be combined together to produce a signal with a strong physiologic component”; [0072]: “The combined light signal can be used to calculate statistics, such as an amplitude of the physiologic frequency (in the time or frequency domain), a variability of the frequency over time, a variability of the intensity or color of the selected pixels over time, a skew of the modulations, or a signal to noise ratio. Skew is a useful metric because cardiac pulses tend to have a negative skew. Thus, modulations of pixels that exhibit a negative skew may be more likely to be physiologic. In an embodiment, one or more statistics are calculated, and then used to apply a weight to each color signal (from an individual pixel or from a region) that is being combined. 
This method results in a weighted average that applies more weight to the pixels that exhibit modulations that are stronger or more likely to be physiologic. For example, pixels that modulate with a strongly negative skew, or a high signal to noise ratio, can be weighted more heavily. The criteria used to select regions can also be used to assign weights; for example, regions or pixels that meet a first, stricter set of criteria may be combined with a first, higher weight, and regions or pixels that meet a second, looser set of criteria may be combined with a second, lower weight”; [0093]: “Figure 7 shows a method of calibrating a video-based SpO2 measurement, according to an embodiment of the invention. The method includes performing a spot check with a contact oximeter at 701, comparing the oximeter SpO2 to the video SpO2 (also called Svid O2) at 702, and determining the calibration between the two values (such as an offset, scaling factor, and/or coefficient) at 703. The method then includes measuring SpO2 from the video signal with the calibration at 704. At 705, a timer is used to prompt re-calibration. For example, the timer may be set to expire in 15 minutes, or one hour, or two hours, or other time durations desired by the caregiver. If the time has expired, the method returns to 701; if not, the method continues to 706, where the video SpO2 value is compared to a threshold to identify changes. If the video SpO2 value crosses the threshold, the method includes sounding an alarm (such as an audible sound and/or a visible alert) at 707, and prompting re-calibration at 701. If not, the method returns to continue measuring at 704”); or 
when the second determination result is the invalidity condition, stopping the continuous acquisition and photographing operation and the PPG signal generation process and generating signal quality error information, then transferring the signal quality error information to the mobile terminal, and displaying, by the mobile terminal, the signal quality error information as warning information through the display interface (Addison, [0063]: “Selecting non-adjacent regions enables the system to focus on the pixels or regions that carry the physiologic signal with the highest signal to noise ratio, ignoring other areas in the image frame that are contributing a relatively higher degree of noise, such as pixels that do not vary much with heart rate, but that might vary due to a passing shadow or patient movement. The system can focus on pixels that represent the desired vital sign, thereby increasing the signal-to-noise ratio (SNR) of the analyzed signal. With signals from several regions available, the signals with the strongest SNR can be chosen, and signals with weak SNR can be discarded. The chosen signals can be combined together to produce a signal with a strong physiologic component”; [0072]: “The combined light signal can be used to calculate statistics, such as an amplitude of the physiologic frequency (in the time or frequency domain), a variability of the frequency over time, a variability of the intensity or color of the selected pixels over time, a skew of the modulations, or a signal to noise ratio. Skew is a useful metric because cardiac pulses tend to have a negative skew. Thus, modulations of pixels that exhibit a negative skew may be more likely to be physiologic. In an embodiment, one or more statistics are calculated, and then used to apply a weight to each color signal (from an individual pixel or from a region) that is being combined. 
This method results in a weighted average that applies more weight to the pixels that exhibit modulations that are stronger or more likely to be physiologic. For example, pixels that modulate with a strongly negative skew, or a high signal to noise ratio, can be weighted more heavily. The criteria used to select regions can also be used to assign weights; for example, regions or pixels that meet a first, stricter set of criteria may be combined with a first, higher weight, and regions or pixels that meet a second, looser set of criteria may be combined with a second, lower weight”; [0093]: “Figure 7 shows a method of calibrating a video-based SpO2 measurement, according to an embodiment of the invention. The method includes performing a spot check with a contact oximeter at 701, comparing the oximeter SpO2 to the video SpO2 (also called Svid O2) at 702, and determining the calibration between the two values (such as an offset, scaling factor, and/or coefficient) at 703. The method then includes measuring SpO2 from the video signal with the calibration at 704. At 705, a timer is used to prompt re-calibration. For example, the timer may be set to expire in 15 minutes, or one hour, or two hours, or other time durations desired by the caregiver. If the time has expired, the method returns to 701; if not, the method continues to 706, where the video SpO2 value is compared to a threshold to identify changes. If the video SpO2 value crosses the threshold, the method includes sounding an alarm (such as an audible sound and/or a visible alert) at 707, and prompting re-calibration at 701. If not, the method returns to continue measuring at 704”).
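
For illustration only, the skew-based weighting described in the quoted Addison passage ([0072]) can be sketched as below. This is a minimal sketch under stated assumptions, not Addison's implementation: the function names (`skewness`, `skew_weight`, `combine_regions`) and the specific rule that a region's weight equals the magnitude of its negative skew are hypothetical choices made for the example.

```python
from statistics import fmean, pstdev


def skewness(samples):
    """Population skewness of a signal; 0.0 for a constant signal."""
    mean = fmean(samples)
    std = pstdev(samples)
    if std == 0:
        return 0.0
    return fmean([((x - mean) / std) ** 3 for x in samples])


def skew_weight(samples):
    # Hypothetical rule: more negative skew gives a larger weight;
    # signals with zero or positive skew get weight 0.
    return max(0.0, -skewness(samples))


def combine_regions(region_signals):
    """Weighted average of equal-length region signals, sample by sample."""
    weights = [skew_weight(s) for s in region_signals]
    total = sum(weights)
    if total == 0:
        # No region shows negative skew: fall back to a plain average.
        weights = [1.0] * len(region_signals)
        total = float(len(region_signals))
    n = len(region_signals[0])
    return [
        sum(w * s[i] for w, s in zip(weights, region_signals)) / total
        for i in range(n)
    ]


# A negatively skewed region dominates a positively skewed one.
negative = [1.0, 1.0, 1.0, 1.0, -3.0]
positive = [-1.0, -1.0, -1.0, -1.0, 3.0]
combined = combine_regions([negative, positive])
```

Under this assumed rule the positively skewed region receives zero weight, so `combined` tracks the negatively skewed region. A signal-to-noise statistic could be substituted for skew in `skew_weight` without changing the structure.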

Regarding claim 9, the Addison/Puig combination teaches an equipment, comprising a memory and a processor, wherein the memory is used to store a program, and the processor is used to implement the method according to claim 1 (Addison, [0016]: “, a system for video-based measurement of a patient's pulse rate includes a video camera positioned remote from a patient, the video camera having a field of view encompassing exposed skin of the patient; a calibration strip positioned within the field of view, the calibration strip comprising a scale viewable by the camera; and a hardware memory coupled to the video camera by wired or wireless communication, the memory storing instructions for instructing a processor to: detect a first light intensity signal from the scale and a second light intensity signal from the exposed skin of the patient; adjust a calibration of the video camera based on a measurement of the first light intensity signal; apply the calibration to the second light intensity signal; measure a pulse rate of the patient from the calibrated second light intensity signal; and output the measured pulse rate for further processing or display”).

Regarding claim 10, the Addison/Puig combination teaches a computer program product comprising instructions, enabling a computer to implement the method according to claim 1 when running on the computer (Addison, [0135]: “The systems and methods described here may be provided in the form of tangible and non-transitory machine-readable medium or media (such as a hard disk drive, hardware memory, etc.) having instructions recorded thereon for execution by a processor or computer. The set of instructions may include various commands that instruct the computer or processor to perform specific operations such as the methods and processes of the various embodiments described here. The set of instructions may be in the form of a software program or application. The computer storage media may include volatile and non-volatile media, and removable and non-removable media, for storage of information such as computer-readable instructions, data structures, program modules or other data. The computer storage media may include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic disk storage, or any other hardware medium which may be used to store desired information and that may be accessed by components of the system. Components of the system may communicate with each other via wired or wireless communication. The components may be separate from each other, or various combinations of components may be integrated together into a medical monitor or processor, or contained within a workstation with standard computer hardware (for example, processors, circuitry, logic circuits, memory, and the like). The system may include processing devices such as microprocessors, microcontrollers, integrated circuits, control units, storage media, and other hardware”).

Regarding claim 11, the Addison/Puig combination teaches a computer-readable storage medium, comprising an instruction, wherein when the instruction runs on a computer, the computer implements the method according to claim 1 (Addison, [0135]: “The systems and methods described here may be provided in the form of tangible and non-transitory machine-readable medium or media (such as a hard disk drive, hardware memory, etc.) having instructions recorded thereon for execution by a processor or computer. The set of instructions may include various commands that instruct the computer or processor to perform specific operations such as the methods and processes of the various embodiments described here. The set of instructions may be in the form of a software program or application. The computer storage media may include volatile and non-volatile media, and removable and non-removable media, for storage of information such as computer-readable instructions, data structures, program modules or other data. The computer storage media may include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic disk storage, or any other hardware medium which may be used to store desired information and that may be accessed by components of the system. Components of the system may communicate with each other via wired or wireless communication. The components may be separate from each other, or various combinations of components may be integrated together into a medical monitor or processor, or contained within a workstation with standard computer hardware (for example, processors, circuitry, logic circuits, memory, and the like). The system may include processing devices such as microprocessors, microcontrollers, integrated circuits, control units, storage media, and other hardware”).


Response to Arguments
All of applicant’s arguments regarding the rejections and objections previously set forth have been fully considered and are persuasive unless directly addressed subsequently.
Applicant's arguments filed 07/21/2025 have been fully considered but they are not persuasive. Applicant argues that Addison does not disclose performing one-dimensional light signal extraction according to a threshold range on frame images. However, as explained in the rejection above, Addison does disclose these features ([0017]: “extracting from the video signal time-varying red, green, and blue signals”; [0019]: “extracting the red, green, and blue signals comprises selecting pixels within the image frame exhibiting a modulation that is at the primary frequency and that has an amplitude above a threshold”; [0050]: “three one-dimensional vectors for each pixel in the field of view can be extracted from the video signal”).
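
For illustration only, the amplitude-threshold pixel selection quoted from Addison [0019] and the per-pixel one-dimensional vectors of [0050] can be sketched as below. This is a simplified sketch under stated assumptions, not the method of Addison or of the claims: it handles a single color channel, omits the primary-frequency check, uses peak-to-peak amplitude as the modulation measure, and the names `peak_to_peak`, `select_pixels`, and `extract_signal` are hypothetical.

```python
def peak_to_peak(series):
    """Modulation amplitude of one pixel's intensity over time."""
    return max(series) - min(series)


def select_pixels(pixel_series, amplitude_threshold):
    """Return the pixel coordinates whose modulation exceeds the threshold."""
    return [
        pixel
        for pixel, series in pixel_series.items()
        if peak_to_peak(series) > amplitude_threshold
    ]


def extract_signal(pixel_series, amplitude_threshold):
    """Average the selected pixels frame by frame into one 1-D signal."""
    selected = select_pixels(pixel_series, amplitude_threshold)
    if not selected:
        return []
    n_frames = len(pixel_series[selected[0]])
    return [
        sum(pixel_series[p][t] for p in selected) / len(selected)
        for t in range(n_frames)
    ]


# Each key is a (row, column) pixel; each value is that pixel's
# one-dimensional intensity vector across four frames.
pixel_series = {
    (0, 0): [10, 14, 10, 14],
    (0, 1): [20, 20.5, 20, 20.5],  # weak modulation, excluded below
    (1, 0): [30, 36, 30, 36],
}
signal = extract_signal(pixel_series, amplitude_threshold=2)
# signal == [20.0, 25.0, 20.0, 25.0]
```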
In response to applicant’s argument that there is no teaching, suggestion, or motivation to combine the references, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988), In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992), and KSR International Co. v. Teleflex, Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, as explained above and in the previous Office Action, the motivation for combining Addison and Puig is that the combination provides a more detailed, in-depth analysis of the received signals, which can yield a more accurate result and more accurate generation of the PPG signals.
In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Applicant states that Puig does not teach pixel extraction or the preset threshold range; however, Puig is not being relied on to teach these limitations, and this argument is therefore moot.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIN K MCCORMACK whose telephone number is (703)756-1886. The examiner can normally be reached Mon-Fri 7:30-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jason Sims, can be reached at 571-272-7540. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/E.K.M./Examiner, Art Unit 3791


/DEVIN B HENSON/Primary Examiner, Art Unit 3791
Prosecution Timeline

Aug 19, 2022: Application Filed
May 14, 2025: Non-Final Rejection mailed, §103
Jul 21, 2025: Response Filed
Sep 24, 2025: Final Rejection mailed, §103
Nov 12, 2025: Response after Non-Final Action
Apr 28, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

17/454,751 (Patent 12558004): SENSOR DEVICE MONITORS FOR CALIBRATION. 4y 3m to grant; granted Feb 24, 2026.
17/839,688 (Patent 12484793): APPARATUS AND METHOD FOR ESTIMATING BLOOD PRESSURE. 3y 5m to grant; granted Dec 02, 2025.
17/569,310 (Patent 12419557): PRESSURE SENSOR ARRAY FOR URODYNAMIC TESTING AND A TEST APPARATUS INCLUDING THE SAME. 3y 8m to grant; granted Sep 23, 2025.
Study what changed to get past this examiner. Based on 3 most recent grants.
Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 12%
With Interview (+60.0%): 72%
Median Time to Grant: 3y 4m (~0m remaining)
PTA Risk: Moderate
Based on 26 resolved cases by this examiner. Grant probability derived from career allowance rate.