DETAILED ACTION
*Note in the following document:
1. Texts in italic bold format are limitations quoted either directly or conceptually from claims/descriptions disclosed in the instant application.
2. Texts in regular italic format are quoted directly from the cited reference or from Applicant’s arguments.
3. Texts with underlining are added by the Examiner for emphasis.
4. Acronym “PHOSITA” stands for “Person Having Ordinary Skill In The Art”.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are either low resolution or out of focus and therefore not of sufficient quality to permit examination. Accordingly, replacement drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to this Office action. The replacement sheet(s) should be labeled “Replacement Sheet” in the page header (as per 37 CFR 1.84(c)) so as not to obstruct any portion of the drawing figures. If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action.
Applicant is given a shortened statutory period of TWO (2) MONTHS to submit new drawings in compliance with 37 CFR 1.81. Extensions of time may be obtained under the provisions of 37 CFR 1.136(a) but in no case can any extension carry the date for reply to this letter beyond the maximum period of SIX MONTHS set by statute (35 U.S.C. 133). Failure to timely submit replacement drawing sheets will result in ABANDONMENT of the application.
Specification
The disclosure is objected to because of the following informalities: paragraph [470] of the disclosure recites “FIG.26”, but there is no FIG. 26 in the drawings. It is suggested that Applicant confirm whether the cited FIG. 26 is intended to be FIG. 18. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 9 and 11 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding Claim 9, Claim 9 recites executing the object tracking based on the obtained 3D definition model, obtaining the plurality of frame images based on the object tracking, extracting the descriptors within the plurality of obtained frame images, and determining the key frame image based on the extracted descriptors in parallel. It is unclear whether all of the recited executing, obtaining, extracting, and determining steps are performed in parallel, or whether only the last two or three steps are performed in parallel.
Regarding Claim 11, Claim 11 recites the limitation "occlusion area" in line 2. There is insufficient antecedent basis for this limitation in the claim.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Lin (US 2018/0335840 A1).
Regarding Claim 1, Lin discloses an object tracking method ([0004]: One aspect of the present disclosure is related to an eye tracking method) for augmented reality ([0003]: Nowadays, eye tracking methods are used in various applications. For example, in virtual reality (VR) or augmented reality (AR) application, eye tracking methods are used in the VR/AR system to trace a user's gazing direction in order to provide corresponding reaction and/or control in the VR/AR environment), by which a tracking application executed by at least one processor of a terminal (Fig.1: See processing circuit) performs object tracking for augmented reality, the method comprising:
obtaining a 3D definition model (Fig.2: step S1 obtaining, by a processing circuit, an eye model) trained based on images capturing a target object from a first viewpoint ([0066]: Accordingly, various machine learning methods may be applied to minimize the cost of errors between the estimated pupil region Re and the pupil region of interest Ri to optimize the eye model. The optimized eye model, similarly to the original eye model, includes a matrix indicating relationship between the viewpoint of the eye and the gaze vector of the pupil region of interest. A machine learning model is a trained model);
performing object tracking of the target object based on the obtained 3D definition model (Fig.2: S6 tracing, by the processing circuit, a motion of the eye based on the viewpoint calculate using the eye model);
[media_image1.png: greyscale image, 689 × 459]
obtaining a plurality of frame images from a plurality of viewpoints for the target object based on the object tracking; learning the target object from the plurality of viewpoints based on the plurality of frame images obtained; updating the 3D definition model based on the learning ([0066]: various machine learning methods may be applied to minimize the cost of errors between the estimated pupil region Re and the pupil region of interest Ri to optimize the eye model. The optimized eye model, similarly to the original eye model, includes a matrix indicating relationship between the viewpoint of the eye and the gaze vector of the pupil region of interest. Lin does not explicitly recite obtaining a plurality of frame images from a plurality of viewpoints for the target object based on the object tracking; learning the target object from the plurality of viewpoints based on the plurality of frame images obtained; or updating the 3D definition model based on the learning. However, a PHOSITA before the effective filing date of the claimed invention would have known that the principle of machine learning is to obtain a plurality of training data, process the training data through the machine learning model, obtain a result, obtain a comparison result by comparing the result with an expected result, and adjust the model's parameters so as to minimize the comparison error; an illustrative sketch of this general training principle is provided at the end of this rejection); and
performing AR object tracking for the target object based on the updated 3D definition model (Fig.8: S6 tracing, by the processing circuit, a motion of the eye based on the viewpoint calculated using the optimized eye model).
[media_image2.png: greyscale image, 672 × 447]
Regarding Claim 13, Claim 13 is similar to Claim 1 except that it is drafted in system format. Lin discloses at least one memory and at least one processor as shown in Fig.1. Therefore, the same reason(s) for rejection applied to Claim 1 is/are also applied to Claim 13.
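For illustration only, the following Python sketch shows the general machine-learning training principle referenced in the rejection of Claim 1 above: obtain a plurality of training data, process the data through the model, compare the model's output with the expected result, and adjust the model's parameters to minimize the error. This is a minimal sketch of the general principle, not an implementation of Lin's eye-model optimization; all names, values, and data below are hypothetical.

```python
# Minimal sketch (hypothetical data) of the general training loop:
# obtain training data, run it through the model, compare the output
# with the expected result, and adjust parameters to reduce the error.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: inputs x and expected outputs y.
x = rng.normal(size=(100, 3))
true_w = np.array([0.5, -1.0, 2.0])
y = x @ true_w + 0.01 * rng.normal(size=100)

w = np.zeros(3)            # model parameters to be learned
learning_rate = 0.1

for epoch in range(200):
    y_pred = x @ w                       # process training data through the model
    error = y_pred - y                   # compare result with expected result
    loss = np.mean(error ** 2)           # cost of errors to be minimized
    grad = 2.0 * x.T @ error / len(y)    # gradient of the loss w.r.t. parameters
    w -= learning_rate * grad            # adjust parameters to reduce the error

print("learned parameters:", w)          # approaches [0.5, -1.0, 2.0]
```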
Claims 2 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Lin (US 2018/0335840 A1) as applied to Claim 1 above, and further in view of Liu et al. (US 2022/0292715 A1) and Ye et al. (US 2020/0273190 A1).
Regarding Claim 2, Lin fails to disclose wherein the learning of the target object includes: extracting descriptors within the plurality of frame images obtained, determining a key frame image based on the extracted descriptors, and obtaining 3D depth data based on the determined key frame image.
However, Liu discloses
extracting descriptors within the plurality of frame images obtained ([0079]: a descriptor corresponding to a feature point (e.g., Fk) such as Oriented FAST and Rotated BRIEF (ORB) is extracted from an input image, and a feature point matching relationship between two frames before and after is obtained by comparing the extracted descriptor (e.g., a descriptor for feature point Fk) with a descriptor of an input image of the previous frame (e.g., a descriptor for feature point Fj)),
determining a key frame image based on the extracted descriptors ([0085]: When determining a tracking state between the two frames before and after, the number of feature points tracked in the current frame and the number of newly extracted feature points (i.e., feature points newly extracted from the current frame when the number of matched feature points obtained on the current frame is less than the set threshold value) may be compared to each other, and when the number of feature points tracked in the current frame is less than a given value or the number of newly extracted feature points exceeds a given value, the current frame may be used as a key frame in the key frame set), and
obtaining 3D depth data based on the determined key frame image ([0122]: In this case, a process of calculating the feature matching relationship includes the following operations. That is, feature points and descriptors on a similar key frame obtained by looking up a current frame may be extracted, and a three-dimensional (3D) point cloud corresponding to the current frame and the similar key frame may be obtained through a depth value corresponding to each key frame. Through the 3D point cloud, a corresponding device pose and a descriptor, a feature point matching relationship between two frames may be calculated. Note Liu teaches depth data may be obtained through various input sensors, see [0052]: Specifically, the SLAM may capture inputs from various sensors (e.g., a LiDAR, a camera, an Inertial Measurement Unit (IMU), a Global Positioning System (GPS), and a depth sensor (Kinect)) in real time, thereby building a three-dimensional scene or map while estimating a pose in real time).
Lin and Liu are both related to the field of augmented reality applications. Therefore, it would have been obvious to a PHOSITA before the effective filing date to incorporate the teaching of Liu into that of Lin and to include the limitation of extracting descriptors within the plurality of frame images obtained and determining a key frame image based on the extracted descriptors, in order to improve the efficiency of global and local feature joint learning on a data set with only image level annotations as suggested by Liu ([0066]). An illustrative sketch of descriptor extraction and key-frame selection of this general kind is provided at the end of this rejection.
Lin modified by Liu fails to disclose obtaining 3D depth data based on the determined key frame image.
However, Ye discloses that, before the effective filing date of the claimed invention, a PHOSITA had already known to obtain 3D depth data based on the determined key frame image ([0008]: Depth estimation from CNN based on monocular color image: the key frames generated by previous step are input into the well-trained CNN model to obtain corresponding depth maps). Therefore, it would have been obvious to a PHOSITA before the effective filing date to incorporate the teaching of Ye into that of Lin modified by Liu and to include the limitation of obtaining 3D depth data based on the determined key frame image in order to obtain high-quality dense depth maps as suggested by Ye ([0004]).
Regarding Claim 10, Ye further teaches or suggests obtaining the 3D depth data for each key frame image ([0008]: the key frames generated by previous step are input into the well-trained CNN model to obtain corresponding depth maps) and updating the 3D definition model based on the 3D depth data obtained for each key frame image ([0010]: Depth fusion and reconstruction: Depth sources are fused and the corresponding confidence map is computed for every key frame). The same reason to combine as that of Claim 2 is applied.
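For illustration only, the following Python sketch shows descriptor extraction and key-frame selection of the general kind discussed in this rejection, using OpenCV's ORB features: binary descriptors are extracted from each frame, matched against the previous frame, and the current frame is kept as a key frame when too few feature points remain tracked. This is a minimal sketch under assumed thresholds and an assumed video source; it is not Liu's or Ye's implementation.

```python
# Sketch (hypothetical thresholds and input file) of ORB descriptor
# extraction per frame and a simple tracked-feature key-frame decision.
import cv2

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

MIN_TRACKED = 50   # hypothetical: below this many matches, keep a key frame

def extract(frame):
    """Detect ORB feature points and compute their binary descriptors."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return orb.detectAndCompute(gray, None)

cap = cv2.VideoCapture("object_scan.mp4")   # hypothetical input sequence
key_frames = []
prev_desc = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    kps, desc = extract(frame)
    if desc is None:
        continue
    if prev_desc is None:
        key_frames.append(frame)          # first usable frame is a key frame
    else:
        # Match descriptors against the previous frame to count tracked points.
        matches = matcher.match(prev_desc, desc)
        if len(matches) < MIN_TRACKED:
            key_frames.append(frame)      # too few tracked points: keep as key frame
    prev_desc = desc

cap.release()
print(f"selected {len(key_frames)} key frames")
```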
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Lin (US 2018/0335840 A1) in view of Liu et al. (US 2022/0292715 A1) and Ye et al. (US 2020/0273190 A1) as applied to Claim 2 above, and further in view of Nagy et al. (US 20180114372 A1).
Regarding Claim 3, Lin as modified teaches that The key frame set may be built by employing the SLAM technique. Specifically, the key frame set may be built through the following method. First, feature points of an input image of an image acquisition device (a process is expressed by a current frame and a previous frame) may be collected or extracted. Next, the number of feature points tracked between a current frame image and a previous frame image may be updated, and key frames included in the key frame set may be updated (Liu [0076]). Liu further teaches extracting a descriptor corresponding to a feature point such as Oriented FAST and Rotated BRIEF (ORB) ([0079]).
Lin as modified fails to explicitly recite wherein the extracting of the descriptors within the plurality of frame images includes obtaining frame descriptor information for each of the plurality of frame images based on 6 degrees of freedom (DoF) parameters between 3D depth data of the 3D definition model and the plurality of frame images.
However, Nagy, in the same field of endeavor, discloses that a PHOSITA before the effective filing date of the claimed invention had already known that The tracking can comprise, for example, six degrees of freedom (6DoF) device tracking. This can be implemented, for example, using Simultaneous Localization and Mapping (SLAM). SLAM generally includes constructing and/or updating a map of an unknown environment while simultaneously keeping track of an agent's (e.g., the capture device's) location within the environment. One suitable approach uses ORB-SLAM with a monocular, stereo, and/or RGB-D camera of the capture device. It should be appreciated that this granularity of device tracking need not be employed in all embodiments, such as in embodiments of viewing and/or modifying environmental snapshots ([0065]).
Therefore it would have been obvious to a PHOSITA before the effective filing date to incorporate the teaching of Nagy into that of Lin as modified and to include the limitation of wherein the extracting of the descriptors within the plurality of frame images includes obtaining frame descriptor information for each of the plurality of frame images based on 6 degrees of freedom (DoF) parameters between 3D depth data of the 3D definition model and the plurality of frame images in order to use a known technology to extract a descriptor within a frame image as required by Lin modified by Liu.
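For illustration only, the following Python sketch shows one well-known way to recover 6 degrees of freedom (three rotation and three translation parameters) between 3D points of a model and their 2D locations in a frame image, using OpenCV's solvePnP. It is a minimal sketch of the general technique; the model points, camera intrinsics, and assumed pose are hypothetical, and the sketch is not Nagy's ORB-SLAM pipeline.

```python
# Sketch (hypothetical points, intrinsics, and pose) of recovering 6-DoF
# parameters between a model's 3D points and their 2D image projections.
import numpy as np
import cv2

# Hypothetical 3D points of a model (e.g., depth data of a tracked object).
object_points = np.array([
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0],
    [0.1, 0.1, 0.0],
    [0.0, 0.0, 0.1],
    [0.1, 0.1, 0.1],
], dtype=np.float64)

# Hypothetical pinhole camera intrinsics (fx, fy, cx, cy), no lens distortion.
camera_matrix = np.array([[800.0, 0.0, 320.0],
                          [0.0, 800.0, 240.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)

# Assume a known 6-DoF pose and project the model points into the image,
# standing in for feature point locations detected in a frame image.
true_rvec = np.array([0.1, -0.2, 0.05])
true_tvec = np.array([0.0, 0.0, 0.5])
image_points, _ = cv2.projectPoints(object_points, true_rvec, true_tvec,
                                    camera_matrix, dist_coeffs)

# Recover the 6-DoF parameters from the 3D-2D correspondences.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                              camera_matrix, dist_coeffs)
print("recovered rotation vector:", rvec.ravel())      # close to true_rvec
print("recovered translation vector:", tvec.ravel())   # close to true_tvec
```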
Claims 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Lin (US 2018/0335840 A1) as applied to Claim 1 above, and further in view of Shreve et al. (US 2023/0403459 A1).
Regarding Claim 11, Lin fails to disclose providing an object additional shooting guide describing a procedure for capturing the occlusion area representing a target object area other than a sight area which is the target object area detected from the first viewpoint.
However Shreve discloses providing an object additional shooting guide describing a procedure for capturing the occlusion area representing a target object area other than a sight area which is the target object area detected from the first viewpoint ([0067]: Module 216 can use the statistical measures estimated by module 214 and guide the user to capture additional images to ensure obtaining all possible views of each object under all lighting conditions with and without occlusions and without excessive blur, in addition to other scene characteristic changes. Module 216 can provide this guidance as instructions (similar to instructions 176/198 in FIG. 1) to the user using AR features, such as arrows, waypoints, animations (e.g., to tilt the screen up/down, to move closer/further, etc.)).
Shreve and Lin are both related to augmented reality applications. Therefore, it would have been obvious to a PHOSITA before the effective filing date to incorporate the teaching of Shreve into that of Lin and to include the limitation of providing an object additional shooting guide describing a procedure for capturing the occlusion area representing a target object area other than a sight area which is the target object area detected from the first viewpoint in order to obtain or capture images with high quality that cover all the conditions expected in the final application as suggested by Shreve ([0006]).
Regarding Claim 12, Shreve further teaches or suggests wherein the providing of the object additional shooting guide includes providing the object additional shooting guide based on a predetermined virtual object ([0067]: Module 216 can provide this guidance as instructions (similar to instructions 176/198 in FIG. 1) to the user using AR features, such as arrows, waypoints, animations (e.g., to tilt the screen up/down, to move closer/further, etc.)). The same reason to combine as that of Claim 11 is applied.
Allowable Subject Matter
Claims 4-8 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:
Prior art, either individually or in combination, fails to disclose or render obvious the limitation of calculating the number of detected times that each same descriptor is detected within the plurality of frame descriptor information and setting a same descriptor for which the calculated number of detected times is smaller than or equal to a predetermined criterion as an invalid descriptor, as claimed in dependent Claim 4. The closest prior art, Liu et al. (US 2022/0292715 A1), discloses extracting a descriptor from a current frame, but fails to disclose the above cited limitation. Claims 5-8 are likewise objected to due to their dependency on Claim 4.
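For illustration only, the following Python sketch shows the general idea of the limitation discussed above as the Examiner understands it: counting the number of frames in which each descriptor is detected across the plurality of frame descriptor information, and setting descriptors whose detection count is smaller than or equal to a predetermined criterion as invalid. The data structures, identifiers, and criterion are hypothetical and are not drawn from the claims or the cited art.

```python
# Sketch (hypothetical data and criterion) of counting descriptor detections
# across frames and marking rarely detected descriptors as invalid.
from collections import Counter

# Hypothetical per-frame descriptor information: each frame maps to the set
# of descriptor identifiers detected in that frame.
frame_descriptors = {
    "frame_1": {"d1", "d2", "d3"},
    "frame_2": {"d1", "d3", "d4"},
    "frame_3": {"d1", "d3"},
}

CRITERION = 1   # hypothetical predetermined criterion

# Calculate the number of times each descriptor is detected across frames.
counts = Counter()
for descriptors in frame_descriptors.values():
    counts.update(descriptors)

# Descriptors detected no more than the criterion are set as invalid.
invalid = {d for d, n in counts.items() if n <= CRITERION}
valid = set(counts) - invalid

print("detection counts:", dict(counts))   # d1: 3, d3: 3, d2: 1, d4: 1
print("invalid descriptors:", invalid)     # d2 and d4 (detected only once)
print("valid descriptors:", valid)         # d1 and d3
```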
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YINGCHUN HE whose telephone number is (571)270-7218. The examiner can normally be reached M-F 8:00-5:00 MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao M Wu can be reached at 571-272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/YINGCHUN HE/Primary Examiner, Art Unit 2613