Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 2/4/2026 with respect to the applied prior art of Jun and Wang have been fully considered, but they are not persuasive for the following reasons. Applicant argues that Jun does not appear to disclose or suggest applying the same convolution (the “second convolution”) in both the higher- and lower-resolution branches. Furthermore, applicant argues that applying, to the “first output” and “first suboutput”, the same “second convolution” to extract a “second set of features” and “second set of subfeatures” cannot be found in the prior art of record. Additionally, applicant argues that Jun and Wang do not disclose applying a first convolution and a second convolution to generate the joint and bone heatmaps. Lastly, applicant argues that neither Jun nor Wang discloses generating “(i) one or more first joint heatmaps of the first object, (ii) one or more second joint heatmaps of the second object, (iii) one or more first bone heatmaps of the first object, and (iv) one or more second bone heatmaps of the second object” from merging the second output and the second suboutput.
While the examiner concedes that Jun et al. fails to fully disclose a neural network engine that includes a higher-resolution branch and a lower-resolution branch, the backbone HRNet used by Jun is derived from Wang. Thus, Jun and Wang in combination still disclose generating the claimed heatmaps using at least two convolutions. Two convolutions are used in the neural network with a function analogous to that of the claimed invention, namely to generate heatmaps representative of the skeleton, joints, and bones.
Applicant’s arguments with respect to amended claims 1, 8, 9 and 17 have been considered but are moot in view of the additional prior art to Choutas et al. (European Patent EP 3547211 A1). Choutas et al. addresses the limitations discussed above that applicant argues Jun and Wang fail to disclose, beginning with the proposed architecture of the CNN. Broken down, the CNN used advantageously in Choutas et al. comprises 6 convolutional layers CONV, 3 blocks with 2 convolutional layers CONV each, and 1 fully-connected layer FC, as represented by Figure 7 of Choutas. The heatmaps are generated by applying an “activation function” in the non-linear layer to the inputted information, a first normalized image and a second normalized image, which are used as a representation of the evolution of the position estimate of a keypoint over the course of the video. In this process, the input image has a higher resolution than the heatmap produced; the spatial resolution of a heatmap can be lower than that of the input frame due to the stride of the network. Additionally, the first server (learning server) may be merged with the second server (classification server). Furthermore, the steps of training the CNN involve equivalent sub-steps within each server to generate the joint and bone heatmaps distinctly. Finally, for each keypoint (bone, joint, or body part of interest), the heatmaps are aggregated into at least one image. Thus, Choutas effectively addresses all the limitations, in terms of CNN architecture and function, whose details are not included in the combination of Jun and Wang.
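For purposes of illustration only, per-keypoint heatmap generation and aggregation of the kind described above can be sketched as follows. This is a generic Gaussian-heatmap sketch, not the actual implementation of Choutas et al.; all names, grid sizes, and keypoint positions are hypothetical.

```python
import numpy as np

def keypoint_heatmap(h, w, cx, cy, sigma=1.5):
    """Gaussian heatmap peaking at keypoint (cx, cy) on an h x w grid."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

# one heatmap per keypoint of interest (e.g., a joint), then aggregate
keypoints = [(3, 4), (10, 12)]                  # hypothetical (x, y) positions
maps = [keypoint_heatmap(16, 16, x, y) for x, y in keypoints]
aggregated = np.max(np.stack(maps), axis=0)     # aggregate into one image
```

Each per-keypoint map peaks (value 1.0) at its keypoint; taking the per-pixel maximum aggregates all keypoints into a single image.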
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-6, 8-14, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Jun et al. (Jun, J., Lee, J., & Kim, C. (2020). Human Pose Estimation Using Skeletal Heatmaps. 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1287-1292.), in view of Wang et al. (Wang, J., Sun, K., Cheng, T., Jiang, B., Deng, C., Zhao, Y., Liu, D., Mu, Y., Tan, M., Wang, X., Liu, W., & Xiao, B. (2019). Deep High-Resolution Representation Learning for Visual Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 3349-3364.), and further in view of Choutas et al. (European Patent EP 3547211 A1).
Regarding independent claims 1 and 9, Jun et al. discloses an apparatus and method (Jun et al.: implicitly uses a computer for training and inference of the HRNet, figure 2) comprising:
a communications interface at which to receive, from an image source, raw image data (Jun et al.: first paragraph in section II; the input image in figures 1a and 2) that includes a representation of a first object and a second object (Jun et al.: section III.A; figure 5);
a memory storage unit in which to store the raw image data (Jun et al.: implicit in section III.A, storing the dataset for training and inference).
However, Jun et al. fails to fully disclose a neural network engine that includes a higher-resolution branch and a lower-resolution branch and that is configured to:
apply, by the higher-resolution branch to the raw image data, a first convolution to extract a first set of features that is representative of a first output, downsample the first output to extract a first set of subfeatures that is representative of a first suboutput, apply, by the higher-resolution branch to the first output, a second convolution to extract a second set of features that is representative of a second output, apply, by the lower-resolution branch to the first suboutput, the second convolution to extract a second set of subfeatures that is representative of a second suboutput, and merge the second output and the second suboutput to generate
one or more first joint heatmaps of the first object,
one or more second joint heatmaps of the second object,
one or more first bone heatmaps of the first object, and
one or more second bone heatmaps of the second object.
Jun et al. incorporates the HRNet implementation details of Wang et al.
Jun et al. and Wang et al. disclose a neural network engine to apply a first convolution to the raw data to extract first features from a first output (Jun et al.: first blue blocks before downsampling in the first line in figure 2; Wang et al.: the stem in the first paragraph in section 3 and the yellow blocks before downsampling in the first line in figure 2), to downsample the first output to extract a first set of subfeatures from a first suboutput (Jun et al.: first blue block in the second line in figure 2; Wang et al.: the first orange block in the second line in figure 2), to apply a second convolution to the first output to extract a second set of features from a second output (Jun et al.: second blue blocks before the first fusion step in the first line in figure 2; Wang et al.: second yellow blocks before the first fusion step in the first line in figure 2), and to apply the second convolution to the first suboutput to extract a second set of subfeatures from a second suboutput (Jun et al.: second blue block in the second line in figure 2; Wang et al.: the last orange block before the first fusion step in the second line in figure 2), wherein the second output and the second suboutput are merged (Jun et al.: first fusion step in figure 2; Wang et al.: first fusion step in figure 2, sections 3.1-3.2, figure 3) to generate joint heatmaps of the first object and the second object (Jun et al.: figure 1f, output heatmaps in figure 2, first paragraph in section II, multiple objects in figure 5) and bone heatmaps of the first object and the second object (Jun et al.: paragraph 6 in section I: "Given an image in Figure. 1(a), skeletal attention module produces skeletal heatmaps as shown in Fig. 1(b)"; HS in figure 3, section II.C).
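For purposes of illustration only, the multi-resolution flow mapped above (first convolution, downsampling into a parallel branch, the same second convolution applied in each branch, and fusion of the two results) can be sketched as follows. This is a minimal single-channel sketch, not the actual network of Jun et al. or Wang et al.; the kernels and sizes are placeholders.

```python
import numpy as np

def conv2d(x, k):
    """Minimal single-channel 'valid' 2-D convolution (illustrative only)."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

img = np.random.rand(32, 32)              # stand-in for the raw image data
k1 = np.full((3, 3), 1.0 / 9.0)           # "first convolution" (placeholder kernel)
k2 = np.full((3, 3), 1.0 / 9.0)           # "second convolution" (shared by both branches)

first_output = conv2d(img, k1)            # higher-resolution branch
first_suboutput = first_output[::2, ::2]  # downsample into the lower-resolution branch

second_output = conv2d(first_output, k2)        # second convolution, higher branch
second_suboutput = conv2d(first_suboutput, k2)  # same second convolution, lower branch

# fusion: upsample the lower-resolution result and add it to the higher one
up = np.kron(second_suboutput, np.ones((2, 2)))  # nearest-neighbour upsampling
h, w = up.shape
merged = second_output[:h, :w] + up              # crop-and-add merge
```

The point of the sketch is structural: one kernel (`k2`) is applied in both branches at different resolutions before the fusion step, mirroring the claimed "second convolution".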
Neither Jun nor Wang discloses applying a first convolution and a second convolution to generate the joint and bone heatmaps, nor generating “(i) one or more first joint heatmaps of the first object, (ii) one or more second joint heatmaps of the second object, (iii) one or more first bone heatmaps of the first object, and (iv) one or more second bone heatmaps of the second object” from merging the second output and the second suboutput. Wang discloses an HRNet that utilizes four stages.
Choutas et al. teaches a communications interface to receive raw image data (Choutas et al.: para. 0059) that includes a representation of a first object and a second object (Choutas et al.: figure 2); a memory storage unit in which to store the raw data; and a neural network engine (Choutas et al.: para. 0002) that includes a higher-resolution branch and a lower-resolution branch configured to: apply a first convolution to extract a first set of features that is representative of a first output; downsample the first output to extract a first set of subfeatures that is representative of a first suboutput; apply, by the higher-resolution branch to the first output, a second convolution to extract a second set of features that is representative of a second output; apply, by the lower-resolution branch to the first suboutput, the second convolution to extract a second set of subfeatures that is representative of a second suboutput (Choutas et al.: para. 0014-0019); and merge the second output and the second suboutput to generate joint heatmaps (Choutas et al.: para. 0044) of the first object and the second object, and bone heatmaps (Choutas et al.: para. 0043) of the first object and the second object (Choutas et al.: para. 0014-0019, figure 5). Choutas et al. also teaches a neural network engine that includes a higher-resolution branch and a lower-resolution branch (Choutas et al.: para. 0044; the spatial resolution of a heatmap can be lower than that of the input frame due to the stride of the network).
Dependent claims 2-6 and 8 rely on the apparatus of claim 1, and dependent claims 10-14 rely on the method of claim 9.
Claim 8 recites the apparatus of claim 1, wherein the second output and the second suboutput are merged to generate (i) a first plurality of joint heatmaps of the first object, wherein each of the first plurality of joint heatmaps corresponds to a different joint of the first object, (ii) a second plurality of joint heatmaps of the second object, wherein each of the second plurality of joint heatmaps corresponds to a different joint of the second object, (iii) the one or more first bone heatmaps of the first object, wherein each of the one or more first bone heatmaps corresponds to a different pair of the first plurality of joint heatmaps, and (iv) the one or more second bone heatmaps of the second object, wherein each of the one or more second bone heatmaps corresponds to a different pair of the second plurality of joint heatmaps.
Jun et al. and Wang et al. further disclose that the second output and the second suboutput are merged (Jun et al.: first fusion step in figure 2; Wang et al.: first fusion step in figure 2, sections 3.1-3.2, figure 3) to generate joint heatmaps of the first object and the second object (Jun et al.: figure 1f, output heatmaps in figure 2, first paragraph in section II, multiple objects in figure 5) and bone heatmaps of the first object and the second object (Jun et al.: paragraph 6 in section I: "Given an image in Figure. 1(a), skeletal attention module produces skeletal heatmaps as shown in Fig. 1(b)"; HS in figure 3, section II.C).
Claims 2 and 10 pertain to generating a first merged output by upsampling the second suboutput and merging it with the second output. Jun et al. discloses the claimed first merged output, which corresponds to the fifth blue block in the first line in figure 2.
Claims 3 and 11 recite generating a first merged suboutput by downsampling the second output and merging it with the second suboutput. Jun et al. shows the third blue block in the second line in figure 2, which corresponds to the claimed first merged suboutput.
Claims 4 and 12 recite applying a third convolution to the first merged output to generate a third output, and applying the third convolution to the first merged suboutput to generate a third suboutput. This is disclosed by the sixth blue block in the first line and the fourth blue block in the second line of figure 2 in Jun et al., respectively.
Claims 5-6 and 13-14 specify that the first set of features are low-level features and, more specifically, edges. This subject matter is implicitly disclosed in Jun et al. because a CNN's initial layers, operating at high resolution, are generally responsible for extracting low-level features such as edges and textures from the input. Additionally, the specification states in paragraph [0027] that the "initial convolution may be carried out on the initial STEM outputs to extract low level features such as edges in the image".
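As a brief illustration of why early convolutional layers respond to edges, consider the classic Sobel kernel, a standard edge-sensitive filter. This toy example is illustrative only and is not drawn from any of the applied references; the image and patch positions are hypothetical.

```python
import numpy as np

# classic Sobel kernel: a low-level, edge-sensitive convolution filter
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

# toy image: dark left half, bright right half (a vertical edge at column 4)
img = np.zeros((8, 8))
img[:, 4:] = 1.0

# filter response on a patch straddling the edge vs. a uniform patch
resp_edge = np.sum(img[2:5, 3:6] * sobel_x)  # strong response at the edge
resp_flat = np.sum(img[2:5, 0:3] * sobel_x)  # zero response in a flat region
```

The kernel produces a large response only where intensity changes sharply, which is the sense in which high-resolution early layers extract "edges" as low-level features.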
Regarding claim 17, Choutas et al. teaches a non-transitory computer readable medium encoded with codes, wherein the codes are to direct a processor (Choutas et al.: para. 0019) to: receive raw image data (Choutas et al.: para. 0059) from an image source via a communications interface, wherein the raw image data includes a representation of a first object and a second object (Choutas et al.: figure 2); store the raw image data in a memory storage unit; apply, by the higher-resolution branch to the raw image data, a first convolution to extract a first set of features that is representative of a first output; downsample the first output to extract a first set of subfeatures that is representative of a first suboutput; apply, by the higher-resolution branch to the first output, a second convolution to extract a second set of features that is representative of a second output; apply, by the lower-resolution branch to the first suboutput, the second convolution to extract a second set of subfeatures that is representative of a second suboutput; and merge the second output (Choutas et al.: para. 0014-0019) and the second suboutput to generate (i) joint heatmaps (Choutas et al.: para. 0044) of the first object and the second object and (ii) bone heatmaps (Choutas et al.: para. 0043) of the first object and the second object (Choutas et al.: para. 0014-0019; figure 5).
Jun et al. and Wang et al. fail to disclose a non-transitory computer readable medium encoded with codes, wherein the codes are to direct a processor to carry out these tasks at the scale of the claimed invention. It is critical to the claimed invention to have a machine that allows for the storage of the instructions for the tasks carried out by the CNN model. Thus, it would have been obvious to a person having ordinary skill in the art (PHOSITA) to combine the teachings of Jun et al. and Wang et al. with the teachings of Choutas et al. to allow for the storage of the computer instructions of Jun et al. and Wang et al.
Claims 18-20 recite the non-transitory computer readable medium of claim 17, wherein the codes are to direct the processor to carry out the tasks performed by the apparatus as outlined in claims 2-4.
Claim 18 recites wherein the codes are to direct the processor to upsample the second suboutput and to merge the second suboutput with the second output to generate a first merged output. Choutas et al. discloses this limitation in paragraphs 0014-0019.
Claim 19 recites wherein the codes are to direct the processor to downsample the second output and to merge the second output with the second suboutput to generate a first merged suboutput. Choutas et al. discloses this limitation in paragraphs 0014-0019.
Claim 20 recites wherein the codes are to direct the processor to apply a third convolution to the first merged output to generate a third output, and to apply the third convolution to the first merged suboutput to generate a third suboutput. Choutas et al. discloses this limitation in paragraphs 0061-0062.
Claims 7 and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Jun et al., Wang et al., and Choutas et al. as applied to claims 1 and 9 above, and further in view of Wang Fei (US 2019/311223 A1).
Claims 7 and 15-16 recite that downsampling comprises maximum pooling and upsampling comprises a deconvolution operation. Wang et al. discloses that HRNet performs downsampling by a strided convolution and upsampling by bilinear interpolation (Wang et al.: section 3.2, figure 3). However, using maximum pooling and deconvolution for down- and upsampling, respectively, is regarded as an obvious alternative in the context of CNNs. Wang Fei performs downsampling (Wang Fei: paragraph [0067]) and upsampling (Wang Fei: paragraph [0052]) as claimed.
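For purposes of illustration only, the two claimed operations can be sketched generically as follows. This is not the implementation of Wang et al. or Wang Fei; the array values and the 2x2 kernel are hypothetical.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 maximum pooling with stride 2 (downsampling)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def deconv_2x2(x, k):
    """Transposed ('de-') convolution, 2x2 kernel, stride 2 (upsampling)."""
    h, w = x.shape
    out = np.zeros((h * 2, w * 2))
    for i in range(h):
        for j in range(w):
            out[2 * i:2 * i + 2, 2 * j:2 * j + 2] += x[i, j] * k
    return out

x = np.arange(16.0).reshape(4, 4)
pooled = max_pool_2x2(x)                         # 4x4 -> 2x2, keeps each window's maximum
upsampled = deconv_2x2(pooled, np.ones((2, 2)))  # 2x2 -> 4x4
```

Both operations change spatial resolution by a factor of two, which is why they are interchangeable, as a design choice, with the strided convolution and bilinear interpolation used in HRNet.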
Response to Amendment
Applicant’s amendments to claim 1 effectively resolve the lack of antecedent basis issue under 35 U.S.C. 112(b) previously noted in the non-final rejection.
Applicant's arguments filed 2/4/2026 have been fully considered, but they are not persuasive. The prior art of record in combination addresses all the limitations and features of the claims as amended, which cover the solution of the invention as a whole. The amendments do not overcome the prior art rejections of the previous non-final rejection.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JESSICA YIFANG LIN whose telephone number is (571)272-6435. The examiner can normally be reached M-F 7:00am-6:15pm, with an optional day off.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le can be reached at 571-272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JESSICA YIFANG LIN/
Examiner, Art Unit 2668
March 10, 2026
/VU LE/Supervisory Patent Examiner, Art Unit 2668