Prosecution Insights
Last updated: April 19, 2026
Application No. 17/644,269

METHOD AND SYSTEM FOR EXTRINSIC CAMERA CALIBRATION

Status: Non-Final OA (§103)
Filed: Dec 14, 2021
Examiner: GOEBEL, EMMA ROSE
Art Unit: 2662
Tech Center: 2600 — Communications
Assignee: Hinge Health Inc.
OA Round: 5 (Non-Final)

Grant Probability: 53% (Moderate)
Expected OA Rounds: 5-6
Time to Grant: 3y 0m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 53% (24 granted / 45 resolved; -8.7% vs TC avg)
Interview Lift: +47.0% on resolved cases with interview (strong)
Avg Prosecution: 3y 0m (typical timeline)
Currently Pending: 40 applications
Total Applications: 85 (across all art units)

Statute-Specific Performance

Statute   Allowance Rate   vs TC Avg
§101      18.2%            -21.8%
§103      60.1%            +20.1%
§102      11.8%            -28.2%
§112      8.4%             -31.6%

Tech Center averages are estimates • Based on career data from 45 resolved cases

Office Action — §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on January 9, 2026 has been entered.

Priority

Acknowledgement is made of Applicant's claim of priority from Foreign Application No. CA3046609, filed June 14, 2019, and as a Continuation of PCT Application No. PCT/IB2020/052938, filed March 27, 2020.

Status of Claims

Claims 1-3, 5-17, 19 and 21 are pending. Claims 4, 18 and 20 have been cancelled. Claim 21 is newly added.

Response to Arguments

Applicant's arguments filed January 9, 2026 have been fully considered but they are not persuasive.

Applicant argues that the prior art references do not teach rendering multiple synthetic views from a 2D calibration design. Examiner respectfully disagrees. As presented in the 35 USC 103 rejections below, Sun teaches generating synthetic orthographic views that are 2.5D (i.e., a 2D representation of the area with depth measurements from the camera and sensor) orthographic projections that represent the object from different view directions and that are used to determine the pose of the object relative to the camera (see Sun, Paras. [0068], [0075]-[0076] and [0093]). One having ordinary skill in the art would be motivated to combine Sun's synthetic view generation with Mallet's teaching of a 2D calibration design (see Mallet, Para. [0045]) to teach rendering multiple synthetic orthographic views of a 2D calibration design.

Applicant further argues that the prior art references do not teach cross-checking feature correspondences between real images and synthetic views to improve correspondence reliability. Examiner respectfully disagrees. As described in the 35 USC 103 rejections below, Chang teaches comparing images in a reference and candidate set of image features and selecting a best match based on a determined metric (i.e., highest similarity) (see Chang, Col. 4, line 63 - Col. 5, line 14). While Chang may not explicitly state that the reference and candidate image features are digital camera image features and synthetic view features, one having ordinary skill in the art would have been motivated to combine Chang's teachings with the digital camera image features of Mallet and the synthetic view features of Sun. Additionally, the Ramnath reference teaches a verification process involving a cross-check that verifies the potential match is a matching target (see Ramnath, Para. [0033]). Thus, the references teach feature correspondence between the real image features and the synthetic view features and perform a cross-check to verify the match.

Applicant further argues that a person having ordinary skill in the art would not be motivated to combine the references, and that the reconstruction of the references would only be possible with knowledge of Applicant's disclosure and therefore constitutes impermissible hindsight. Examiner respectfully disagrees.
As shown in the 35 USC 103 rejections, the judgment of obviousness does not take into account knowledge gleaned only from Applicant's application. Additionally, Applicant is reminded that any judgment of obviousness is in a sense necessarily a reconstruction based on hindsight reasoning (In re McLaughlin, 443 F.2d 1392, 1395, 170 USPQ 209, 212 (CCPA 1971)). Examiner maintains that the motivations presented in the rejections below provide sufficient motivation to combine each reference, and that there is no requirement that an express, written motivation to combine must appear in prior art references before a finding of obviousness (see MPEP 2145(X)(A)).

Applicant further argues that the references do not teach the newly added limitations of claims 1 and 9; specifically, that the Wang reference's teaching of the extrinsic parameters (R, t) in the same matrix cannot be applied to the extrinsic parameters of the disclosed invention. Examiner respectfully disagrees. Applicant's recitation of "establishing (i) a translation component that is represented as a 3-vector with separate x, y, and z values in 3D space and (ii) a rotation component that is represented as a vector of Euler angles" does not overcome the Wang reference, because Wang teaches converting rotation matrices into Euler angles (see Wang, Para. [0079]) and a translation component that is 3 values with separate x, y, and z values in 3D space. While Wang's equation shows the rotation and translation components in a matrix, nothing in Applicant's claim specifies that the rotation and translation vectors must be separately identified. Thus, the Wang reference is still applied to teach the extrinsic parameters of the camera, and the 35 USC 103 rejection of the claims is upheld.

Applicant's arguments filed January 9, 2026, with respect to the 35 USC 103 rejection of claim 17 have been fully considered but are moot because of the new grounds of rejection presented in the sections below. Applicant argues that the Ren reference does not teach the newly added limitation of claim 17 and instead teaches away from it. However, the Ren reference is no longer relied upon to teach this limitation. Instead, the Shen reference teaches an encoder that encodes feature vectors of the input image into a low-dimensional space (see Shen, Para. [0073]). Thus, the 35 USC 103 rejection of the claims is upheld.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 5-6, 9-13 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Mallet et al. (US 2015/0288951 A1) in view of Chang et al. (US 9412164 B2), further in view of Sun et al. (WO 2018/190805 A1), Ramnath et al. (US 2020/01432328 A1, filed November 7, 2018), and Wang et al. (US 2019/0012804 A1).

Regarding claim 1, Mallet teaches a method of determining extrinsic parameters of a camera, the method comprising: obtaining a two-dimensional (2D) digital representation of a calibration design (Mallet, Para. [0045], line 2, Mallet teaches a computer graphics model of calibration target 810; Para. [0046], the images may be captured on film and then scanned into digital form, and in digital form the pixel coordinates of an image point can be determined; Para. [0047], each image captured for calibrating a camera may include a 2D array of pixels and may be enumerated using pixel coordinates); obtaining a digital camera image of a physical representation of the calibration design (Mallet, Para. [0044], lines 11-12, Mallet teaches a camera captures images of the scene including images of a calibration target); and identifying a set of features in the digital camera image (Mallet, Para. [0047], an identification procedure is performed on an image in order to identify parts of the image that correspond to particular fiducial markings), wherein each feature in the set of features is either (i) a portion of the digital camera image or (ii) a descriptor that is representative of information derived from the portion of the digital camera image (Mallet, Para. [0047], an identification procedure is performed on an image in order to identify parts of the image that correspond to particular fiducial markings; that is, the parts or segments of the image are identified for which a fiducial marking on the calibration target was the source in the physical scene).

Although Mallet teaches a computer graphics model of a calibration target and a processor that can compare information from the captured images to the computer graphics model (Mallet, Para. [0045]), Mallet does not explicitly teach "identifying the set of features in each synthetic view of the multiple synthetic views, so as to identify multiple sets of features", "finding correspondences by - comparing each feature in the set of features of the digital camera image with each feature in each set of the multiple sets of features of the multiple synthetic views" and "for each feature in the set of features of the digital camera image, identifying a best match by selecting, from among the multiple sets of features of the multiple synthetic views, a first feature that is determined to have a highest similarity and that is in a given one of the multiple synthetic views". However, in an analogous field of endeavor, Chang teaches an image processing system that matches features in the reference and candidate sets of image features by determining at least one metric that characterizes the matching image features in the reference and candidate sets (e.g., weighted average or estimated probabilities) and then selects ones of the image features in the reference set from which the calibration-enabling data is derived based on the determined metric (i.e., highest similarity) (Chang, Col. 4, line 63 - Col. 5, line 14).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet with the teachings of Chang by generating a candidate set of image features for the synthetic image and determining a best match between the synthetic image and the digital camera image by selecting the image feature based on the highest similarity. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for calibrating an imaging system based on matched image features, as recognized by Chang.
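For readers mapping the Chang citation onto practice, a minimal sketch of highest-similarity matching between two descriptor sets follows. It is illustrative only; NumPy, cosine similarity, and the function name best_matches are assumptions, not anything Chang specifies.

```python
import numpy as np

def best_matches(query_desc, ref_desc):
    """For each query descriptor, select the reference descriptor with the
    highest cosine similarity (one possible 'determined metric')."""
    # Normalize rows so the dot product equals cosine similarity.
    q = query_desc / (np.linalg.norm(query_desc, axis=1, keepdims=True) + 1e-12)
    r = ref_desc / (np.linalg.norm(ref_desc, axis=1, keepdims=True) + 1e-12)
    sim = q @ r.T                              # (n_query, n_ref) similarities
    best = sim.argmax(axis=1)                  # index of most similar feature
    return best, sim[np.arange(len(q)), best]  # indices and their scores
```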
Although Mallet in view of Chang teaches a computer graphics model of a calibration target and a processor that can compare information from the captured images to the computer graphics model (Mallet, Para. [0045]), they do not explicitly teach "providing the 2D digital representation of the calibration design as input, to a three-dimensional (3D) rendering framework that produces, as output, multiple synthetic views that depict the calibration design under different camera projective transforms". However, in an analogous field of endeavor, Sun teaches that a 3D CAD model or 3D data is used to render or generate synthetic orthographic views from any potential viewpoint from which a user or operator may look at the object in a real scene; the strategy for creating synthetic views may be random or may be based on planned sampling of the 3D space (Sun, Para. [0075]). The image processor is configured to determine a pose of the object relative to the camera and depth sensor based on poses of the object in the 3D orthographic projections. Where a best match is found, the pose of the orthographic projection or representation derived from the 3D data is determined as the pose of the object relative to the camera and depth sensor. By using image representations derived from the orthographic projections, the comparison of the orthographic projections may be more rapidly performed (Sun, Para. [0093]).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang with the teachings of Sun by using a 3D CAD model or 3D data to render multiple synthetic orthographic views of the digital representation of the calibration design, from any potential viewpoint from which a user or operator may look at the object in a real scene, where the views are used to determine the pose of the object relative to the camera (i.e., the calibration design under different camera projective transforms). One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for determining the extrinsic camera parameters based on the plurality of generated synthetic views, as recognized by Sun.
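The claimed rendering step can be pictured with a short sketch: for a planar pattern at Z=0, each virtual-camera pose induces a homography H = K[r1 r2 t]. This is a generic computer-vision identity, not Sun's or Mallet's implementation; OpenCV, the sample_poses helper, and the assumption that pattern pixel coordinates coincide with plane units are all hypothetical.

```python
import cv2
import numpy as np

def render_synthetic_view(pattern, K, rvec, tvec, size):
    """Warp a planar calibration pattern under one virtual-camera pose.
    For a plane at Z=0 the projective transform reduces to the homography
    H = K [r1 r2 t], with r1, r2 the first two columns of the rotation."""
    R, _ = cv2.Rodrigues(np.asarray(rvec, dtype=float))  # angle-axis -> 3x3
    H = K @ np.column_stack((R[:, 0], R[:, 1], np.asarray(tvec, dtype=float)))
    return cv2.warpPerspective(pattern, H, size)         # size = (width, height)

# Hypothetical usage: sample_poses() stands in for whatever random or planned
# sampling of virtual-camera placements is used.
# views = [render_synthetic_view(pattern, K, rv, tv, (640, 480))
#          for rv, tv in sample_poses()]
```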
Although Mallet in view of Chang, further in view of Sun, teaches determining matching features from a reference and candidate set based on the highest similarity (Chang, Col. 4, line 63 - Col. 5, line 14), they do not explicitly teach "comparing the first feature against each feature in the set of features in the digital camera image, so as to identify a second feature in the digital camera image that has a highest similarity as a cross-check feature" and "accepting the first feature as the best match in response to a determination that the second feature is the same as that feature for which the first feature was identified as the best match". However, in an analogous field of endeavor, Ramnath teaches passing potential matches through a verifier process that compares the local-feature descriptors of the image against the local-feature descriptors of the one or more potential matches; if there were only one potential match, the verifier process may verify that the potential match is a matching target (Ramnath, Para. [0033]).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang, further in view of Sun, with the teachings of Ramnath by including a verifier process that compares the potential best-match feature to the features in the image and verifies that the potential match is a matching target based on the comparison. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for verifying that a potential match is the best match, as recognized by Ramnath.

Although Mallet in view of Chang, further in view of Sun and Ramnath, teaches determining, among other parameters, the position, rotation, distortion and focal length of the camera with respect to the calibration target (Mallet, Para. [0045]), they do not explicitly teach, for the synthetic views, "wherein each of the synthetic views is associated with a set of translation and rotation coordinates that is representative of a corresponding camera projective transform that indicates placement of a corresponding one of multiple virtual cameras relative to the 2D digital representation of the calibration design" and "calculating, based on the virtual camera parameters of the features associated with the best matches, the extrinsic parameters of the camera by establishing (i) a translation component that is represented as a 3-vector with separate x, y, and z values in 3D space and (ii) a rotation component that is represented as a vector of Euler angles". However, in an analogous field of endeavor, Wang teaches that a virtual camera may be defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image; for example, each virtual camera has, among other virtual camera parameters, a position and orientation which can be determined (Wang, Para. [0065]). K, R, and t are the intrinsic (K) and extrinsic (R, t) parameters, respectively, of each virtual camera:

$[R\,|\,t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}$

(i.e., the translation component is a 3-vector with x, y, and z values in 3D space) (Wang, Para. [0136]). Wang further teaches that positions and orientations (i.e., extrinsic parameters) of the plurality of multi-directional image capture apparatuses (i.e., the camera) may be determined based on the positions and orientations of the virtual cameras (Wang, Para. [0101]). If there are twelve virtual cameras (six from each panoramic image of the stereo-pair of panoramic images) corresponding to the multi-directional image capture apparatus, then twelve rotation matrices are obtained for the orientation of the multi-directional image capture apparatus. Each of these rotation matrices may then be converted into corresponding Euler angles to obtain a set of Euler angles for the multi-directional image capture apparatus (i.e., a rotation component that is represented as a vector of Euler angles) (Wang, Para. [0079]).
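As a worked illustration of the disputed limitation, the sketch below splits a 3x4 [R|t] matrix into the claimed translation 3-vector and a vector of Euler angles. The ZYX (yaw-pitch-roll) convention is an assumption; Wang does not fix one.

```python
import numpy as np

def extrinsics_from_Rt(Rt):
    """Split a 3x4 [R|t] extrinsic matrix into a translation 3-vector
    (x, y, z) and Euler angles, using the ZYX convention.
    Assumes the pose is away from gimbal lock (|pitch| < 90 degrees)."""
    R, t = Rt[:, :3], Rt[:, 3]
    yaw = np.arctan2(R[1, 0], R[0, 0])
    pitch = np.arcsin(-R[2, 0])
    roll = np.arctan2(R[2, 1], R[2, 2])
    return t, np.array([roll, pitch, yaw])
```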
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang, further in view of Sun and Ramnath, with the teachings of Wang by including, for each synthetic view of the virtual camera, intrinsic and extrinsic parameters comprising a translation component that is represented as a 3-vector with separate x, y, and z values in 3D space and a rotation component that is represented as a vector of Euler angles, and by determining the camera extrinsic parameters based on the extrinsic parameters of the virtual camera. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for determining camera extrinsic parameters based on virtual cameras, as recognized by Wang. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.

Regarding claim 2, Mallet in view of Chang, further in view of Sun, Ramnath and Wang, teaches the method of claim 1, and further teaches wherein the calibration design is asymmetric in at least one dimension (Mallet, Figs. 2-5, calibration targets 200, 300, 400, and 500; Mallet teaches calibration targets that have unique and asymmetrical fiducial marker patterns).

Regarding claim 3, Mallet in view of Chang, further in view of Sun, Ramnath and Wang, teaches the method of claim 2, and further teaches wherein the calibration design is a logo (Mallet, Para. [0025], Mallet teaches that calibrating a camera involves using a test object, or a calibration target, that can contain marks known as fiducial markings). As indicated by Applicant's specification (Para. [0013]: "the planar calibration pattern 30 may be a logo, background image or other design that may already normally appear in the camera's 10 field of view"), the indication of the calibration image as a logo adds no special characteristics or unique improvements to the method, and therefore the design could suitably be any test object or calibration target, as taught by Mallet.

Regarding claim 5, Mallet in view of Chang, further in view of Sun, Ramnath and Wang, teaches the method of claim 1, and further teaches wherein the multiple synthetic views are selected from a space of virtual camera parameters where the calibration image is within a field of view of the synthetic view (Wang, Para. [0065], a virtual camera may be defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image). The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, Ramnath, and Wang references presented in the rejection of claim 1, apply to claim 5 and are incorporated herein by reference. Thus, the method recited in claim 5 is met by Mallet in view of Chang, further in view of Sun, Ramnath, and Wang.

Regarding claim 6, Mallet in view of Chang, further in view of Sun, Ramnath and Wang, teaches the method of claim 1, and further teaches wherein identifying a set of features from each of the plurality of synthetic views and identifying the set of features in the digital camera image is performed using a feature detection module (Mallet, Para. [0045], lines 7-10, Mallet teaches that, without receiving input from a user, processor 804 can compare information from the captured images to the computer graphics model; the processor taught by Mallet can be a feature detection module).
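The "feature detection module" of claim 6 is, in practice, any detector run identically on the camera image and on each synthetic view. A minimal sketch follows; ORB is an arbitrary stand-in, since neither Mallet nor the claim fixes a particular detector.

```python
import cv2

def detect_features(image):
    """Detect keypoints and compute descriptors on one image; the same call
    would be applied to the digital camera image and to every synthetic view."""
    detector = cv2.ORB_create(nfeatures=1000)   # stand-in detector choice
    keypoints, descriptors = detector.detectAndCompute(image, None)
    return keypoints, descriptors
```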
Regarding claim 9, Mallet teaches a camera calibration module for determining the translation and rotation of a camera using a physical, planar calibration pattern, the camera calibration module comprising: a feature detector for extracting a set of features from an image captured from the camera (Mallet, Para. [0047], an identification procedure is performed on an image in order to identify parts of the image that correspond to particular fiducial markings), wherein each feature in the set of features is either (i) a predetermined portion of the image or one of the plurality of synthetic views or (ii) a descriptor that is representative of information derived from the predetermined portion (Mallet, Para. [0047], an identification procedure is performed on an image in order to identify parts of the image that correspond to particular fiducial markings; that is, the parts or segments of the image are identified for which a fiducial marking on the calibration target was the source in the physical scene).

Although Mallet teaches a computer graphics model of a calibration target and a processor that can compare information from the captured images to the computer graphics model (Mallet, Para. [0045]), Mallet does not explicitly teach extracting a set of features from "each of the plurality of synthetic views", "a feature matching module for finding correspondences by comparing each feature in the set of features of the digital camera image with each feature in each set of features of each synthetic view of the plurality of synthetic views" and "for each feature in the set of features of the digital camera image, identifying a best match by selecting, from among the sets of features of the plurality of synthetic views, a first feature that is determined to have a highest similarity and that is in a given one of the plurality of synthetic views". However, in an analogous field of endeavor, Chang teaches an image processing system that matches features in the reference and candidate sets of image features by determining at least one metric that characterizes the matching image features in the reference and candidate sets (e.g., weighted average or estimated probabilities) and then selects ones of the image features in the reference set from which the calibration-enabling data is derived based on the determined metric (i.e., highest similarity) (Chang, Col. 4, line 63 - Col. 5, line 14). The proposed combination, as well as the motivation for combining the Mallet and Chang references presented in the rejection of claim 1, apply to claim 9 and are incorporated herein by reference.

Although Mallet in view of Chang teaches a computer graphics model of a calibration target and a processor that can compare information from the captured images to the computer graphics model (Mallet, Para. [0045]), they do not explicitly teach "a synthetic pattern generator for generating a plurality of synthetic views of a digital calibration image corresponding to the physical calibration pattern". However, in an analogous field of endeavor, Sun teaches that a 3D CAD model or 3D data is used to render or generate synthetic orthographic views from any potential viewpoint from which a user or operator may look at the object in a real scene; the strategy for creating synthetic views may be random or may be based on planned sampling of the 3D space (Sun, Para. [0075]).
The proposed combination, as well as the motivation for combining the Mallet, Chang and Sun references presented in the rejection of claim 1, apply to claim 9 and are incorporated herein by reference.

Although Mallet in view of Chang, further in view of Sun, teaches determining matching features from a reference and candidate set based on the highest similarity (Chang, Col. 4, line 63 - Col. 5, line 14), they do not explicitly teach "comparing the first feature against each feature in the set of features in the digital camera image, so as to identify a second feature in the digital camera image that has a highest similarity as a cross-check feature" and "accepting the first feature as the best match in response to a determination that the second feature is the same as that feature for which the first feature was identified as the best match". However, in an analogous field of endeavor, Ramnath teaches passing potential matches through a verifier process that compares the local-feature descriptors of the image against the local-feature descriptors of the one or more potential matches; if there were only one potential match, the verifier process may verify that the potential match is a matching target (Ramnath, Para. [0033]). The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, and Ramnath references presented in the rejection of claim 1, apply to claim 9 and are incorporated herein by reference.

Although Mallet in view of Chang, further in view of Sun and Ramnath, teaches determining, among other parameters, the position, rotation, distortion and focal length of the camera with respect to the calibration target (Mallet, Para. [0045]), they do not explicitly teach "a calibration solver for calculating the translation and rotation of the camera using virtual camera parameters of the features associated with the best matches" and "wherein the translation of the camera is represented as a 3-vector with separate x, y, and z values in 3D space, and wherein the rotation of the camera is represented as (i) a vector of Euler angles, (ii) a 3x3 rotation matrix, or (iii) an angle-axis vector". However, in an analogous field of endeavor, Wang teaches that a virtual camera may be defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image; for example, each virtual camera has, among other virtual camera parameters, a position and orientation which can be determined (Wang, Para. [0065]). K, R, and t are the intrinsic (K) and extrinsic (R, t) parameters, respectively, of each virtual camera:

$[R\,|\,t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}$

(i.e., the translation component is a 3-vector with x, y, and z values in 3D space, and the rotation of the camera is represented by a 3x3 rotation matrix) (Wang, Para. [0136]). Wang further teaches that positions and orientations (i.e., extrinsic parameters) of the plurality of multi-directional image capture apparatuses (i.e., the camera) may be determined based on the positions and orientations of the virtual cameras (Wang, Para. [0101]). The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, Ramnath and Wang references presented in the rejection of claim 1, apply to claim 9 and are incorporated herein by reference. Thus, the camera calibration module recited in claim 9 is met by Mallet in view of Chang, further in view of Sun, Ramnath, and Wang.
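The cross-check limitation at issue in claims 1 and 9 amounts to mutual nearest-neighbor matching, sketched below under the assumption of L2 descriptor distance. OpenCV's BFMatcher with crossCheck=True implements the same idea; this NumPy version is only an illustration.

```python
import numpy as np

def cross_checked_matches(desc_cam, desc_syn):
    """Keep a correspondence only if it is mutual: the synthetic-view feature
    that best matches camera feature i must itself pick i as its best match."""
    # Pairwise L2 distances between every camera and synthetic descriptor.
    d = np.linalg.norm(desc_cam[:, None, :] - desc_syn[None, :, :], axis=2)
    fwd = d.argmin(axis=1)   # camera -> synthetic best matches
    bwd = d.argmin(axis=0)   # synthetic -> camera best matches
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]
```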
Regarding claim 10, Mallet in view of Chang, further in view of Sun, Ramnath, and Wang, teaches the camera calibration module of claim 9, and further teaches wherein the digital calibration image is asymmetric in at least one dimension (Mallet, Figs. 2-5, calibration targets 200, 300, 400, and 500; Mallet teaches calibration targets that have unique and asymmetrical fiducial marker patterns).

Regarding claim 11, Mallet in view of Chang, further in view of Sun, Ramnath, and Wang, teaches the camera calibration module of claim 10, and further teaches wherein each synthetic view of the plurality of synthetic views is associated with a set of translation and rotation coordinates that indicates placement of a corresponding one of a plurality of virtual cameras relative to the digital calibration image (Wang, Para. [0065], a virtual camera may be defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image; for example, each virtual camera has, among other virtual camera parameters, a position and orientation which can be determined. Para. [0136], K, R, and t are the intrinsic (K) and extrinsic (R, t) parameters, respectively, of each virtual camera). The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, Ramnath and Wang references presented in the rejection of claim 1, apply to claim 11 and are incorporated herein by reference. Thus, the camera calibration module recited in claim 11 is met by Mallet in view of Chang, further in view of Sun, Ramnath, and Wang.

Regarding claim 12, Mallet in view of Chang, further in view of Sun, Ramnath, and Wang, teaches the camera calibration module of claim 9, and further teaches wherein the extrinsic parameters and the virtual camera parameters comprise translation and rotation coordinates (Wang, Para. [0065], a virtual camera may be defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image; for example, each virtual camera has, among other virtual camera parameters, a position and orientation which can be determined. Para. [0136], K, R, and t are the intrinsic (K) and extrinsic (R, t) parameters, respectively, of each virtual camera). The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, Ramnath and Wang references presented in the rejection of claim 1, apply to claim 12 and are incorporated herein by reference. Thus, the camera calibration module recited in claim 12 is met by Mallet in view of Chang, further in view of Sun, Ramnath, and Wang.

Regarding claim 13, Mallet in view of Chang, further in view of Sun, Ramnath, and Wang, teaches the camera calibration module of claim 9, and further teaches wherein the plurality of synthetic views are selected from a space of virtual camera parameters where the calibration image is within a field of view of the synthetic view (Wang, Para. [0065], a virtual camera may be defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image). The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, Ramnath and Wang references presented in the rejection of claim 1, apply to claim 13 and are incorporated herein by reference. Thus, the camera calibration module recited in claim 13 is met by Mallet in view of Chang, further in view of Sun, Ramnath, and Wang.
Regarding claim 16, Mallet in view of Chang, further in view of Sun, Ramnath, and Wang, teaches a camera calibration system comprising: the camera calibration module of claim 9 (as described above); the camera (Mallet, Para. [0044], line 4, Mallet teaches a physical camera 802); and the physical planar calibration pattern (Mallet, Para. [0044], line 10, Mallet teaches a calibration target 810); wherein output from the camera is embedded with the translation and rotation of the camera (Mallet, Para. [0010], lines 8-9, Mallet teaches using the processor to determine the position, rotation, and focal length for the physical camera).

Claims 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Mallet (US 2015/0288951 A1) in view of Chang (US 9412164 B2), further in view of Sun (WO 2018/190805 A1), Ramnath (US 2020/01432328 A1, filed November 7, 2018) and Wang et al. (US 2019/0012804 A1), as applied to claims 1-6, 9-13, and 16 above, and further in view of Gossow (US 2017/0098395 A1).

Regarding claim 7, Mallet in view of Chang, further in view of Sun, Ramnath and Wang, teaches the method of claim 1, as described above. Although the combination teaches calculating the extrinsic parameters of the camera using the virtual camera parameters of the features associated with the best matches (Mallet, Para. [0045]), they do not explicitly teach "wherein finding correspondences comprising computing the elementwise difference between each feature and minimizing this difference for both the synthetic view to captured image and captured image to synthetic view". However, in an analogous field of endeavor, Gossow teaches identifying best matches by computing the elementwise difference between each feature (Gossow, Para. [0039], lines 6-7, Gossow teaches a difference between intensities of the pixels in the synthetic images and intensities of pixels in the actual images) and minimizing this difference for both the synthetic view to captured image and captured image to synthetic view (Gossow, Para. [0039], lines 5-7, Gossow teaches a calibration device that determines the value of the parameter vector that minimizes a difference between intensities of the pixels in the synthetic images and intensities of pixels in the actual images).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date to modify the method of Mallet in view of Chang, further in view of Sun, Ramnath and Wang, with the teachings of Gossow by identifying best matches through computing the elementwise difference between each feature and minimizing the difference for both the synthetic view to captured image and captured image to synthetic view. One having ordinary skill in the art would be motivated to combine the references since doing so would improve the speed and accuracy of camera calibration and extrinsic parameter determination, as recognized by Gossow. Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

Claim 14 recites a camera calibration module with elements corresponding to the steps recited in claim 7. Therefore, the recited elements of this claim are mapped to the proposed combination in the same manner as the corresponding steps in its corresponding method claim. Additionally, the rationale and motivation to combine the Mallet, Chang, Sun, Ramnath, Wang and Gossow references, presented in the rejection of claim 7, apply to this claim.
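Gossow's intensity-difference criterion can be pictured as scoring each candidate synthetic view by mean absolute pixel difference against the captured image. The brute-force scorer below is an illustration only, not Gossow's optimizer; same-size grayscale inputs are assumed.

```python
import numpy as np

def best_view_by_intensity(actual, synthetic_views):
    """Return the index of the synthetic view whose pixel intensities differ
    least from the captured image (images assumed same size, grayscale)."""
    diffs = [np.abs(actual.astype(float) - view.astype(float)).mean()
             for view in synthetic_views]
    return int(np.argmin(diffs))
```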
Claims 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Mallet (US 2015/0288951 A1) in view of Chang (US 9412164 B2), further in view of Sun (WO 2018/190805 A1), Ramnath (US 2020/01432328 A1, filed November 7, 2018) and Wang et al. (US 2019/0012804 A1), as applied to claims 1-6, 9-13, 16, and 20 above, and further in view of Hong (US 8861864 B2).

Regarding claim 8, Mallet in view of Chang, further in view of Sun, Ramnath and Wang, teaches the method of claim 1, as described above. Although the combination teaches calculating the extrinsic parameters of the camera using the virtual camera parameters of the features associated with the best matches (Mallet, Para. [0045]), they do not explicitly teach "identifying a region of interest of the digital camera image wherein the identifying the set of features in the digital camera image is performed only on the region of interest". However, in an analogous field of endeavor, Hong teaches identifying a region of interest of the digital camera image (Hong, Col. 1, line 59, Hong teaches a user selecting a desired region of interest), wherein identifying the set of features in the digital camera image is performed only on the region of interest (Hong, Col. 1, lines 59-61, Hong teaches that after a user selects a desired region of interest, a feature detector, such as SIFT, is applied to the region of interest).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang, further in view of Sun, Ramnath, and Wang, with the teachings of Hong by identifying a region of interest of the digital camera image and performing feature identification only on this region. One having ordinary skill in the art would be motivated to combine the references since doing so would improve the speed and accuracy of camera calibration and extrinsic parameter determination. Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

Claim 15 recites a camera calibration module with elements corresponding to the steps recited in claim 8. Therefore, the recited elements of this claim are mapped to the proposed combination in the same manner as the corresponding steps in its corresponding method claim. Additionally, the rationale and motivation to combine the Mallet, Chang, Sun, Ramnath, Wang and Hong references, presented in the rejection of claim 8, apply to this claim.
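The Hong citation (ROI-restricted detection) is straightforward to reproduce by masking the detector. A sketch assuming OpenCV's SIFT and a hypothetical (x, y, width, height) rectangle:

```python
import cv2
import numpy as np

def detect_in_roi(image, roi):
    """Run SIFT only inside a selected region of interest by masking out
    everything else; roi is a hypothetical (x, y, width, height) tuple."""
    x, y, w, h = roi
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    mask[y:y + h, x:x + w] = 255        # detector ignores zero-mask pixels
    sift = cv2.SIFT_create()            # requires OpenCV >= 4.4
    return sift.detectAndCompute(image, mask)
```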
Claims 17, 19 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Mallet et al. (US 2015/0288951 A1) in view of Chang et al. (US 9412164 B2), further in view of Sun et al. (WO 2018/190805 A1), Ramnath et al. (US 2020/01432328 A1, filed November 7, 2018), Wang et al. (US 2019/0012804 A1) and Shen et al. (US 2018/0260668 A1).

Regarding claim 17, Mallet teaches a method comprising: obtaining a graphic pattern in digital form (Mallet, Para. [0045], a computer graphics model of calibration target 810 including the positions and unique patterns of the fiducial markings on the target); obtaining an image of a physical representation of the graphic pattern that is generated by a camera (Mallet, Para. [0044], lines 11-12, Mallet teaches a camera captures images of the scene including images of a calibration target); and identifying the set of features in the image (Mallet, Para. [0045], lines 8-9, Mallet teaches a processor that can compare information from the captured images).

Although Mallet teaches a computer graphics model of a calibration target and a processor that can compare information from the captured images to the computer graphics model (Mallet, Para. [0045]), Mallet does not explicitly teach "identifying a set of features in each synthetic view of the multiple synthetic views, so as to identify multiple sets of features" and "for each feature in the set of features of the digital camera image, identifying a best match by selecting, from among the multiple sets of features identified in the multiple synthetic views, a given first feature that is determined to be most similar to that feature and that is in a given one of the multiple synthetic views". However, in an analogous field of endeavor, Chang teaches that the image processing system determines a reference set of image features from the electronic data file which specifies the reference image in a reference coordinate space (Chang, Col. 3, lines 38-44), and that the image processing system matches features in the reference and candidate sets of image features by determining at least one metric that characterizes the matching image features in the reference and candidate sets (e.g., weighted average or estimated probabilities) and then selects ones of the image features in the reference set from which the calibration-enabling data is derived based on the determined metric (i.e., highest similarity) (Chang, Col. 4, line 63 - Col. 5, line 14).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet with the teachings of Chang by determining a set of features in each synthetic view and determining a best match between the synthetic image and the digital camera image by selecting the image feature based on the highest similarity. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for calibrating an imaging system based on matched image features, as recognized by Chang.

Although Mallet in view of Chang teaches a computer graphics model of a calibration target and a processor that can compare information from the captured images to the computer graphics model (Mallet, Para. [0045]), they do not explicitly teach "providing the graphic pattern, as input, to a three-dimensional (3D) rendering framework that renders the graphic pattern into multiple synthetic views, wherein each synthetic view is representative of an image that depicts the graphic pattern under a projective transform that is calculable from corresponding virtual camera parameters". However, in an analogous field of endeavor, Sun teaches that a 3D CAD model or 3D data is used to render or generate synthetic orthographic views from any potential viewpoint from which a user or operator may look at the object in a real scene; the strategy for creating synthetic views may be random or may be based on planned sampling of the 3D space (Sun, Para. [0075]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang with the teachings of Sun by using a 3D CAD model or 3D data to render multiple synthetic orthographic views from any potential viewpoint from which a user or operator may look at the object in a real scene.
One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for determining the extrinsic camera parameters based on the plurality of generated synthetic views, as recognized by Sun. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.

Although Mallet in view of Chang, further in view of Sun, teaches determining matching features from a reference and candidate set based on the highest similarity (Chang, Col. 4, line 63 - Col. 5, line 14), they do not explicitly teach "comparing the first feature against each feature in the set of features in the digital camera image, so as to identify a second feature in the digital camera image that is determined to be the most similar to the first feature" and "accepting the first feature as the best match in response to a determination that the second feature is the same as that feature for which the first feature was identified as the best match". However, in an analogous field of endeavor, Ramnath teaches passing potential matches through a verifier process that compares the local-feature descriptors of the image against the local-feature descriptors of the one or more potential matches; if there were only one potential match, the verifier process may verify that the potential match is a matching target (Ramnath, Para. [0033]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang, further in view of Sun, with the teachings of Ramnath by including a verifier process that compares the potential best-match feature to the features in the image and verifies that the potential match is a matching target based on the comparison. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for verifying that a potential match is the best match, as recognized by Ramnath.

Although Mallet in view of Chang, further in view of Sun and Ramnath, teaches determining, among other parameters, the position, rotation, distortion and focal length of the camera with respect to the calibration target (Mallet, Para. [0045]), they do not explicitly teach "a corresponding set of translation and rotation coordinates, which is indicative of placement of a corresponding virtual camera with respect to the graphic pattern" and "calculating a translation and a rotation component of the camera based on the virtual camera parameters of the features identified in the image". However, in an analogous field of endeavor, Wang teaches that a virtual camera may be defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image; for example, each virtual camera has, among other virtual camera parameters, a position and orientation which can be determined (Wang, Para. [0065]). K, R, and t are the intrinsic (K) and extrinsic (R, t) parameters, respectively, of each virtual camera (Wang, Para. [0136]). Wang further teaches that positions and orientations (i.e., extrinsic parameters) of the plurality of multi-directional image capture apparatuses (i.e., the camera) may be determined based on the positions and orientations of the virtual cameras (Wang, Para. [0101]).
Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang, further in view of Sun and Ramnath, with the teachings of Wang by including, for each synthetic view of the virtual camera, intrinsic and extrinsic parameters including translation and rotation coordinates, and by determining the camera extrinsic parameters based on the extrinsic parameters of the virtual camera. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for determining camera extrinsic parameters based on virtual cameras, as recognized by Wang.

Although Mallet in view of Chang, further in view of Sun, Ramnath and Wang, teaches that a patch associated with a point of interest may be used to generate a local-feature descriptor (e.g., using a Scale Invariant Feature Transform (SIFT) or Speeded Up Robust Features (SURF)) that may function as a representation of the patch (Ramnath, Para. [0028]), they do not explicitly teach "wherein each feature in the set of features is representative of a patch of a corresponding one of the multiple synthetic views that is accompanied by a descriptor, the descriptor being representative of an encoding of information related to the patch in a lower-level dimensional space". However, in an analogous field of endeavor, Shen teaches an encoder that encodes the content, or feature vectors, of the input image into a low-dimensional space (Shen, Para. [0073]). Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the method of Mallet in view of Chang, further in view of Sun, Ramnath and Wang, with the teachings of Shen by encoding the local-feature descriptor representative of a patch, as taught by Ramnath, using the encoder of Shen that encodes the feature vector into a low-dimensional space. One having ordinary skill in the art would have been motivated to combine these references because doing so would allow for encoding feature representations that capture image contexts, as recognized by Shen. Thus, the claimed invention would have been obvious to one having ordinary skill in the art before the effective filing date.

Regarding claim 19, Mallet in view of Chang, further in view of Sun, Ramnath, Wang and Shen, teaches the method of claim 17, and further teaches wherein the translation component is represented as a vector with x-, y-, and z-values, and wherein the rotation component is represented as a vector of Euler angles, an angle-axis vector, or a 3x3 matrix (Wang, Para. [0136], extrinsic (R, t) parameters of each virtual camera:

$[R\,|\,t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}$).

The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, Ramnath, Wang and Shen references presented in the rejection of claim 17, apply to claim 19 and are incorporated herein by reference. Thus, the method recited in claim 19 is met by Mallet in view of Chang, further in view of Sun, Ramnath, Wang and Shen.
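The Shen limitation (descriptors encoded into a lower-dimensional space) can be illustrated with PCA as a simple stand-in for Shen's learned encoder; scikit-learn and the 32-dimension default are assumptions, not anything recited in the reference.

```python
import numpy as np
from sklearn.decomposition import PCA

def encode_descriptors(descriptors, dims=32):
    """Project patch descriptors (n_patches x n_features) into a
    lower-dimensional space; dims must not exceed either matrix dimension."""
    X = np.asarray(descriptors, dtype=float)
    return PCA(n_components=dims).fit_transform(X)
```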
Regarding claim 21, Mallet in view of Chang, further in view of Sun, Ramnath, Wang and Shen, teaches the method of claim 17, wherein to identify the set of features in each synthetic view of the multiple synthetic views, a Speeded Up Robust Features (SURF) algorithm or a Maximally Stable Extremal Regions (MSER) algorithm is executed that identifies the set of features in a fashion invariant to scaling and rotation (Ramnath, Para. [0028], a patch associated with a point of interest may be used to generate a local-feature descriptor (e.g., using a Scale Invariant Feature Transform (SIFT) or Speeded Up Robust Features (SURF)) that may function as a representation of the patch; the local-feature descriptor may include location information (e.g., an (x, y) coordinate) for the point of interest within the image). The proposed combination, as well as the motivation for combining the Mallet, Chang, Sun, Ramnath, Wang and Shen references presented in the rejection of claim 17, apply to claim 21 and are incorporated herein by reference. Thus, the method recited in claim 21 is met by Mallet in view of Chang, further in view of Sun, Ramnath, Wang and Shen.
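For claim 21, a minimal sketch of running MSER, one of the two recited scale- and rotation-tolerant detectors. OpenCV usage is assumed (SURF, by contrast, ships only in the opencv-contrib nonfree build).

```python
import cv2

def detect_mser_regions(gray_image):
    """Detect Maximally Stable Extremal Regions on a grayscale image; the
    regions remain stable across a range of intensity thresholds, which is
    what gives the detector its tolerance to scale and rotation changes."""
    mser = cv2.MSER_create()
    regions, bounding_boxes = mser.detectRegions(gray_image)
    return regions, bounding_boxes
```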
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Emma Rose Goebel, whose telephone number is (703) 756-5582. The examiner can normally be reached Monday - Friday, 7:30-5. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Amandeep Saini, can be reached at (571) 272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Emma Rose Goebel/
Examiner, Art Unit 2662

/AMANDEEP SAINI/
Supervisory Patent Examiner, Art Unit 2662

Prosecution Timeline

Dec 14, 2021: Application Filed
Jul 24, 2024: Non-Final Rejection — §103
Jan 09, 2025: Response Filed
Jan 29, 2025: Final Rejection — §103
Apr 18, 2025: Request for Continued Examination
Apr 21, 2025: Response after Non-Final Action
May 05, 2025: Non-Final Rejection — §103
Sep 09, 2025: Response Filed
Oct 07, 2025: Final Rejection — §103
Jan 09, 2026: Request for Continued Examination
Jan 23, 2026: Response after Non-Final Action
Mar 10, 2026: Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597236: FINE-TUNING JOINT TEXT-IMAGE ENCODERS USING REPROGRAMMING (2y 5m to grant; granted Apr 07, 2026)
Patent 12597129: METHOD FOR ANALYZING IMMUNOHISTOCHEMISTRY IMAGES (2y 5m to grant; granted Apr 07, 2026)
Patent 12597093: UNDERWATER IMAGE ENHANCEMENT METHOD AND IMAGE PROCESSING SYSTEM USING THE SAME (2y 5m to grant; granted Apr 07, 2026)
Patent 12597124: DEBRIS DETERMINATION METHOD (2y 5m to grant; granted Apr 07, 2026)
Patent 12588885: FAT MASS DERIVATION DEVICE, FAT MASS DERIVATION METHOD, AND FAT MASS DERIVATION PROGRAM (2y 5m to grant; granted Mar 31, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 53%
With Interview: 99% (+47.0%)
Median Time to Grant: 3y 0m
PTA Risk: High

Based on 45 resolved cases by this examiner. Grant probability is derived from the career allow rate.
