Last updated: May 29, 2026
Application No. 18/493,591
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM

Final Rejection: §103, §112
Filed: Oct 24, 2023
Priority: Apr 27, 2021 (JP 2021-075040, +1 more)
Examiner: SHENG, XIN
Art Unit: 2619
Tech Center: 2600 (Communications)
Assignee: Canon Kabushiki Kaisha
OA Round: 2 (Final)
Interview: Optional. The examiner has a relatively high allowance rate (72%) and a +17.2% interview lift. A written response may suffice.
Based on 404 resolved cases, 2023–2026
Examiner Intelligence

SHENG, XIN
Career Allowance Rate: 72% (293 granted / 404 resolved), above average; +10.5% vs TC avg
Interview Lift: +17.2% (resolved cases with interview vs. without)
Avg Prosecution: 2y 4m
Currently pending: 14
Total Applications: 421 (across all art units)
Statute-Specific Performance

§101: 1.6% (-38.4% vs TC avg)
§103: 94.5% (+54.5% vs TC avg)
§102: 1.0% (-39.0% vs TC avg)
§112: 0.3% (-39.7% vs TC avg)
Based on career data from 404 resolved cases
Office Action

§103 §112
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s amendments and remarks submitted 12/31/2025 have been entered and considered. Claims 1-7, 9, 11-14, and 16-17 are cancelled. Claims 8, 10, and 15 are amended. Claims 18-25 are new. This action is made final.

Response to Arguments
Applicant’s arguments filed on 12/31/2025 have been fully considered but are not persuasive.
Applicant argues “However, a review of OTA and NAKAMURA indicate OTA and NAKAMURA do not teach or suggest the aforementioned features recited in the presently amended claims. In particular, OTA and NAKAMURA do not teach or suggest at least “acquire first information indicating a position of three-dimensional shape data of a subject,… and generate a virtual viewpoint image based on the specified captured image and the three-dimensional shape data of the subject” as presently claimed. Thus, since the combination of OTA, NAKAMURA, and TATENO, whether considered individually or in proper combination, do not teach or suggest at least the aforementioned features as presently recited in independent Claims 18, 24, and 25, Applicant combination of OTA, NAKAMURA, and TATENO, whether considered individually or in proper combination, cannot render Claims 18, 24, and 25 prima facie obvious”. 
However, applicant should submit an argument under the heading “Remarks” pointing out disagreements with the examiner’s contentions. Applicant must also discuss the references applied against the claims, explaining how the claims avoid the references or distinguish from them. Simply stating that the prior art does not teach the claimed features is not persuasive.
Ota, Tateno and Nakamura are analogous art because they all teach generating a virtual image/video based on input position and orientation data. Ota further teaches user input of viewpoint position and orientation data for a virtual camera. Tateno further teaches adding texture and color information when generating the virtual viewpoint image/video. Nakamura further teaches inputting position and orientation data for a 3D object of interest. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to modify the method of generating a virtual image/video based on input position and orientation data (taught in Ota and Tateno) to further consider input of the position and orientation of the object of interest (taught in Nakamura), so as to generate images of moving objects (sports) from an arbitrary spatial position for broadcasting or motion analysis (Nakamura, [0001]).
See the detailed rejection below.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.


Claims 8 and 15 are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which they depend, or for failing to include all the limitations of the claim upon which they depend. Claims 8 and 15 depend on Claim 18, which is a later-numbered claim. A dependent claim must refer only to a claim previously set forth in the application. Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
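
The numbering problem behind this rejection can be illustrated with a short check. The sketch below is only an illustration: the `depends_on` mapping and the function name are hypothetical. The entries for Claims 8 and 15 come from the rejection above, and the entries for Claims 19-23 follow the claim text quoted later in this action. The check flags any dependent claim that refers to a higher-numbered claim.

```python
# Hypothetical mapping: dependent claim number -> claim number it refers to.
# Claims 8 and 15 referring to Claim 18 comes from the rejection above;
# Claims 19-23 referring to Claim 18 comes from the claim text in this action.
depends_on = {8: 18, 15: 18, 19: 18, 20: 18, 21: 18, 22: 18, 23: 18}


def improperly_ordered(depends_on):
    """Return the dependent claims that refer to a later-numbered claim."""
    return sorted(claim for claim, parent in depends_on.items() if parent > claim)


flagged = improperly_ordered(depends_on)
print(flagged)
```

Under these assumptions only Claims 8 and 15 are flagged, which matches the claims named in the rejection.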


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 8, 10, 15, and 18-25 are rejected under 35 U.S.C. 103 as being unpatentable over Ota (US 20190278803) in view of Tateno (US2022172447), and further in view of Nakamura et al. (JPH0779382).

Regarding Claim 18. Ota teaches An information processing system comprising:
one or more memories storing instructions; and
one or more processors executing the instructions to (Ota, abstract, the invention describes an image search system which accumulates virtual viewpoint video image data generated based on image data obtained by capturing an object from a plurality of directions by a plurality of cameras and a virtual viewpoint parameter used for generation of the virtual viewpoint video image data in association with each other. Then, the image search system extracts, in a case where a search condition is input via an input unit, virtual viewpoint video image data associated with a virtual viewpoint parameter corresponding to the search condition from the accumulated virtual viewpoint video image data. Further, the image search system presents information of the extracted virtual viewpoint video image data as results of the search.
[0018] FIG. 1 is a block diagram showing an example of a configuration of an image search system in a first embodiment. The image search system in the first embodiment includes a virtual viewpoint video image search apparatus (hereinafter, called an image search apparatus, or simply a search apparatus) 10 that searches for a virtual viewpoint video image as shown in FIG. 1.
[0019] The search apparatus 10 is a computer, for example, such as a PC (Personal Computer), a WS (Work Station), and various servers. The computer may be a tablet PC or a smartphone.
[0020] The accumulation unit 20 accumulates one piece or a plurality of pieces of virtual viewpoint video image data. The accumulation unit 20 is, for example, a storage device, such as a hard disk drive (HDD), a solid state drive (SSD), and a flash memory.):

Ota fails to explicitly teach, but Tateno teaches, acquire first information indicating a position of three-dimensional shape data of a subject, the three-dimensional shape data being generated based on a plurality of captured images obtained by a plurality of image capturing devices (Tateno, abstract, the invention relates to an image processing device, an image processing method, and a program that enable easy editing of a free viewpoint image. The present technology displays: a 3D stroboscopic image obtained by imaging, with a virtual camera, a stroboscopic model in which 3D models of an object at a plurality of times generated from a plurality of viewpoint images imaged from a plurality of viewpoints are arranged in a three-dimensional space; and an editing parameter to be edited in editing of a free viewpoint image obtained by imaging, with the virtual camera, free viewpoint data generated from the plurality of viewpoint images, the editing parameter being linked with the 3D stroboscopic image. The present technology can be applied to, for example, a case of editing a free viewpoint image.
[0044] Moreover, the free viewpoint image can be displayed on a head-up display (HUD) using a transparent display through which the further side can be seen, such as augmented reality (AR) glasses. In this case, in a three-dimensional space where the user actually exists, an object such as a person or a material body imaged in another three-dimensional space can be superimposed and displayed.
[0056] In a case where the 3D data including the 3D shape model and the color information is adopted as the free viewpoint data, the free viewpoint data generation unit 31 performs modeling by a visual hull or the like using the viewpoint images of the plurality of viewpoints from the imaging device 21, generates a 3D shape model and the like of the object reflected in the viewpoint images, and sets the 3D shape model and the like as the free viewpoint data together with the viewpoint images of the plurality of viewpoints serving as a texture.).
Ota and Tateno are analogous art because they both teach generating a virtual image/video based on input position and orientation data. Tateno further teaches adding texture and color information when generating the virtual viewpoint image/video. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to modify the method of generating a virtual image/video based on input position and orientation data (taught in Ota) to further add material data to an object in the virtual viewpoint image/video (taught in Tateno), so as to provide the user with a rich and realistic virtual viewpoint image/video.

The combination of Ota and Tateno fails to explicitly teach, however, Nakamura teaches change the position of the three-dimensional shape data of the subject indicated by the first information (Nakamura, abstract, the invention describes method to edit a motion of plural operating moving objects on different time spaces to a picture projected to an optional time space by providing a picture conversion means converting a time space map picture of the moving objects into a picture picked up from a different time space. The processing unit is provided with a camera head 1 picking up an object being a moving object to provide an output of a picture picked up together with 3-dimension information, a data recording section 2 recording the picture with the 3-dimension information and time information, and a moving picture edit section 3 executing time space mapping to the obtained picture to convert the picture, and also with a stop watch 5 providing an output of time information being a reference for the management of the time information and a time base unit 4 distributing the time information outputted from the stop watch 5 to the camera head 1 and the data recording section 2. Furthermore, as the configuration to extract the 3-dimension information of the mobile object, the 3-dimension space coordinate of the mobile object is calculated by an arithmetic operation unit provided separately.
[0042] As described above, in the present invention, the object of editing is a spatiotemporal map. For example, in the case of swimming, consider the case where swimmers in lane 1 and lane 6 are photographed by separate camera heads (cam1, cam2) as shown in FIG. In this case, the spatiotemporal map shows each image in a separate location, as shown in Figure 3. If you shoot this with the virtual camera mentioned above, both images will be too small to fit on the screen, so you will need to edit them. During editing, the spatiotemporal map itself can be cut and pasted as shown in FIG 6. In this example, if one entire lane of the pool were cut out and moved to another location as shown in Figure 6, it would be as if the two swimmers were swimming in adjacent lanes. Since the spatiotemporal map corresponds to actual spatial positions, performing this operation is equivalent to swapping the locations of actual pool lanes in the data. This editing operation is a conditional image movement on the spatiotemporal map, in which the figure is moved within the range of coordinates corresponding to the course to be moved, and is not moved within other ranges. This is also expressed as a matrix like Equation 3. However, as mentioned above, this movement is conditional, so this matrix E differs depending on the position (x, z) on the map.
Therefore, the position and orientation of the object of interest (the swimmers) are entered before a new virtual image is generated. The arrangement of the object of interest is modified from that in the original image.);
Ota, Tateno and Nakamura are analogous art because they all teach generating a virtual image/video based on input position and orientation data. Ota further teaches user input of viewpoint position and orientation data for a virtual camera. Nakamura further teaches inputting position and orientation data for an object of interest. Therefore, it would have been obvious to a person with ordinary skill in the art before the effective filing date of the claimed invention to modify the method of generating a virtual image/video based on input position and orientation data (taught in Ota and Tateno) to further consider input of the position and orientation of the object of interest (taught in Nakamura), so as to generate images of moving objects (sports) from an arbitrary spatial position for broadcasting or motion analysis (Nakamura, [0001]).
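
The conditional movement described in Nakamura [0042], in which only the part of the map inside the lane to be moved is shifted, can be sketched as follows. This is a minimal illustration and not Nakamura's implementation. The function name `move_lane`, the representation of the spatiotemporal map as a list of (x, z) points, and the lane bounds are all assumptions, and a plain translation stands in for the position-dependent matrix E.

```python
def move_lane(points, z_min, z_max, dz):
    """Shift points whose z lies in [z_min, z_max) by dz; leave the rest unchanged.

    points: list of (x, z) positions on a hypothetical spatiotemporal map.
    """
    moved = []
    for x, z in points:
        if z_min <= z < z_max:
            moved.append((x, z + dz))  # inside the lane being moved
        else:
            moved.append((x, z))       # outside the lane: not moved
    return moved


# Hypothetical layout: lanes are 2.5 units wide, so lane 1 spans z in [0, 2.5)
# and lane 6 spans z in [12.5, 15). Lane 6 is moved into the slot next to lane 1.
swimmers = [(10.0, 1.0), (12.0, 13.5)]
result = move_lane(swimmers, 12.5, 15.0, -10.0)
print(result)
```

In this sketch the swimmer in lane 1 keeps the original position, while the swimmer in lane 6 is shifted so that both appear in adjacent lanes.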

The combination of Ota, Tateno and Nakamura further teaches acquire second information indicating a parameter that represents a position of a virtual viewpoint and a line-of-sight direction from the virtual viewpoint (Ota, [0006] The image search system according to the present invention includes: an accumulation unit configured to accumulate virtual viewpoint video image data generated based on image data obtained by capturing an object from a plurality of directions by a plurality of cameras and a virtual viewpoint parameter used for generation of the virtual viewpoint video image data in association with each other;
[0022] FIG. 2 is a diagram showing an example of the data format of virtual viewpoint video image data accumulated in the accumulation unit 20 in the first embodiment. As shown in FIG. 2, virtual viewpoint video image data 202 in the present embodiment is stored in the accumulation unit 20 in a state where metadata 201 is given. The metadata 201 includes a virtual viewpoint parameter 211 and video image attached information 212.
[0023] The virtual viewpoint parameter 211 includes an orientation parameter 221 and a position parameter 222…. The video image-attached information 212 is metadata that is generally given to, for example, video image data, such as image capturing time information (hereinafter, described simply as time information), a resolution of a video image, and a color space.
[0024] Here, the virtual viewpoint parameter 211 is explained. The orientation parameter 221 of the virtual viewpoint parameter 211 is a parameter indicating the orientation of a virtual camera.
[0025]-[0029] … Next, a vector indicating the direction of the rotation axis is described as v=(xv, yv, zv) and a desired rotation angle is described as θ …. As a result of this, a point that is the point P rotated by θ about the axis in the direction of the vector v is obtained as coordinates (x, y, z).
[0030] The position parameter 222 is a parameter indicating the position of a virtual camera. It is assumed that the three-dimensional coordinates are three-dimensional coordinates (x, y, z) with the origin on the world coordinates being taken to be (0, 0, 0).);
change the position of the virtual viewpoint indicated by the second information based on an amount of change resulting from the change of the position of the three-dimensional shape data (Nakamura, [0047] When swimming, the size of the pool is limited, so if you want to swim long distances you will have to make repeated turns. At this point, you may want to compare the time before and after the turn, for example. In this case, edit as follows: When taking a photo, the direction of travel is reversed on the spacetime map before and after the turn, as shown in Figure 12. In the editing operation, the left and right of the space-time map after the turn are swapped and aligned with the map before the turn in order to reverse the direction of travel after the turn, as shown in FIG. At this time, care must be taken to align the position and time as described above. By this operation, it is possible to obtain an image as shown in FIG. 14, in which the athletes before and after the turn appear to be racing on adjacent courses at the same time. The positional relationship of the images to be compared by any of the above editing operations is displayed at a position based on the coordinates of the actual three-dimensional space, so it is also possible to measure the position and distance using the positional relationship on the images.
[0053] (Third embodiment) In the first embodiment, the virtual camera is operated by an automatic tracking method, but this operation may also be performed by a human being. For this operation, it is desirable to use a data input device that is similar to the operation of an actual camera, rather than a mouse or the like. For example, there is a camera parameter input device as shown in FIG.
[0055] In addition, instead of tracking the virtual camera to the target object, it can also perform a predetermined action. In the case of swimming, for example, by controlling the virtual camera so that the center of the screen is at the pace for the Japanese record, it is possible to see at a glance whether the athlete is likely to set a record based on the position on the screen where they are swimming, and what their pace distribution is.
[0056] Furthermore, the position of the virtual camera can be freely moved, so it can be controlled so that it is always positioned on the ceiling directly above the leading player, for example. In addition, the virtual camera can be given similar movements as when it is placed on a cart on rails and moved around while filming, as is done in television and movie shooting.
Therefore, the athlete’s (3D object’s) position changes as the athlete moves along the lane. The virtual camera’s virtual viewpoint moves according to the change in the athlete’s position, so as to keep the moving images of two different athletes adjacent to each other for comparison.);
specify a captured image based on the position of the virtual viewpoint after the change and the line-of-sight direction from the virtual viewpoint (Nakamura, [0039] Next, the image output from the video editing unit 3 will be described. Unlike conventional image editing, the present invention uses a spatiotemporal map as the target of editing, rather than an input image. The output image is the image obtained when the spatiotemporal map is photographed using a virtual camera. This virtual camera is basically unrelated to the camera head 1 used for input. Therefore, it is possible to take a photograph from any position with any angle of view. This can be captured by the concept of a new camera filming a pitch-black pool with a spotlight following the swimmer. As mentioned above, the spatiotemporal map corresponds to a position in real three-dimensional space. Therefore, the virtual camera can be operated in the same way as a camera head for image input. In other words, the camera position can be determined in three-dimensional space, and then the camera direction and angle of view can be determined from there. As mentioned above, the space-time map has a position in actual three-dimensional space, so if the position and direction of the virtual camera are known, the position and range to be photographed can also be determined. This is the exact opposite operation of mapping the image from the camera head. FIG. 5 explains this principle. In this example, since a constraint plane is used and the spatiotemporal map is a plane, the image transformation is expressed by the matrix shown in the following equation 2, just like in the case of mapping.
[0042] As described above, in the present invention, the object of editing is a spatiotemporal map. For example, in the case of swimming, consider the case where swimmers in lane 1 and lane 6 are photographed by separate camera heads (cam1, cam2) as shown in FIG. In this case, the spatiotemporal map shows each image in a separate location, as shown in Figure 3. If you shoot this with the virtual camera mentioned above, both images will be too small to fit on the screen, so you will need to edit them. During editing, the spatiotemporal map itself can be cut and pasted as shown in FIG 6. In this example, if one entire lane of the pool were cut out and moved to another location as shown in Figure 6, it would be as if the two swimmers were swimming in adjacent lanes. Since the spatiotemporal map corresponds to actual spatial positions, performing this operation is equivalent to swapping the locations of actual pool lanes in the data. This editing operation is a conditional image movement on the spatiotemporal map, in which the figure is moved within the range of coordinates corresponding to the course to be moved, and is not moved within other ranges. This is also expressed as a matrix like Equation 3. However, as mentioned above, this movement is conditional, so this matrix E differs depending on the position (x, z) on the map.
[0043] After this editing, when a picture is taken with the virtual camera, the image will look like that shown in FIG. 7, and the player images will not be too small.
This operation makes it appear as if swimmers who were originally swimming in separate lanes are now swimming in adjacent lanes. 
Therefore, the position and orientation of the object of interest (the swimmers) are entered before a new virtual image is generated. The arrangement of the object of interest is modified from that in the original image.); and
generate a virtual viewpoint image based on the specified captured image and the three-dimensional shape data of the subject (Nakamura, [0043] After this editing, when a picture is taken with the virtual camera, the image will look like that shown in FIG. 7, and the player images will not be too small. This operation makes it appear as if swimmers who were originally swimming in separate lanes are now swimming in adjacent lanes.).
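
The limitation in which the virtual viewpoint is moved based on the amount of change of the subject's position can be illustrated with plain vector arithmetic. The sketch below is an illustration under stated assumptions, not the claimed implementation or the method of any cited reference. It assumes that positions are (x, y, z) tuples in one shared coordinate system and that the viewpoint is translated by exactly the subject's displacement. All function and variable names are hypothetical.

```python
def sub(a, b):
    """Component-wise a - b for 3D tuples."""
    return tuple(p - q for p, q in zip(a, b))


def add(a, b):
    """Component-wise a + b for 3D tuples."""
    return tuple(p + q for p, q in zip(a, b))


def move_subject_and_viewpoint(subject_pos, viewpoint_pos, new_subject_pos):
    """Translate the viewpoint by the same displacement applied to the subject."""
    delta = sub(new_subject_pos, subject_pos)      # amount of change of the subject
    new_viewpoint_pos = add(viewpoint_pos, delta)  # same amount applied to the viewpoint
    return new_viewpoint_pos, delta


subject = (0.0, 0.0, 0.0)
viewpoint = (0.0, 2.0, -5.0)
new_subject = (10.0, 0.0, 3.0)
new_viewpoint, delta = move_subject_and_viewpoint(subject, viewpoint, new_subject)

# The offset from the viewpoint to the subject is the same before and after.
offset_before = sub(subject, viewpoint)
offset_after = sub(new_subject, new_viewpoint)
print(new_viewpoint, offset_before == offset_after)
```

Because the move is a pure translation, the line-of-sight direction does not need to change for the subject to stay framed the same way in this sketch.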

Regarding Claim 19. The combination of Ota, Tateno and Nakamura further teaches The information processing system according to Claim 18, wherein a color of the subject included in the virtual viewpoint image is determined based on the specified captured image (Tateno, abstract, the invention relates to an image processing device, an image processing method, and a program that enable easy editing of a free viewpoint image. The present technology displays: a 3D stroboscopic image obtained by imaging, with a virtual camera, a stroboscopic model in which 3D models of an object at a plurality of times generated from a plurality of viewpoint images imaged from a plurality of viewpoints are arranged in a three-dimensional space; and an editing parameter to be edited in editing of a free viewpoint image obtained by imaging, with the virtual camera, free viewpoint data generated from the plurality of viewpoint images, the editing parameter being linked with the 3D stroboscopic image. The present technology can be applied to, for example, a case of editing a free viewpoint image.
[0044] Moreover, the free viewpoint image can be displayed on a head-up display (HUD) using a transparent display through which the further side can be seen, such as augmented reality (AR) glasses. In this case, in a three-dimensional space where the user actually exists, an object such as a person or a material body imaged in another three-dimensional space can be superimposed and displayed.
[0056] In a case where the 3D data including the 3D shape model and the color information is adopted as the free viewpoint data, the free viewpoint data generation unit 31 performs modeling by a visual hull or the like using the viewpoint images of the plurality of viewpoints from the imaging device 21, generates a 3D shape model and the like of the object reflected in the viewpoint images, and sets the 3D shape model and the like as the free viewpoint data together with the viewpoint images of the plurality of viewpoints serving as a texture.).
The reasoning for combination of Ota, Nakamura and Tateno is the same as described in Claim 18.

Regarding Claim 20. The combination of Ota, Nakamura and Tateno further teaches The information processing system according to Claim 18, wherein the specified captured image is included in the plurality of captured images (Ota, [0006] The image search system according to the present invention includes: an accumulation unit configured to accumulate virtual viewpoint video image data generated based on image data obtained by capturing an object from a plurality of directions by a plurality of cameras and a virtual viewpoint parameter used for generation of the virtual viewpoint video image data in association with each other;
Nakamura, [0013] The image processing device may also include a means for converting a plurality of spatiotemporal map images of the same or different moving objects captured at different times into images that have been captured by arranging them on the spatiotemporal map so that they have the same time axis but are at different spatial positions.).
The reasoning for combination of Ota, Nakamura and Tateno is the same as described in Claim 18.

Regarding Claim 21. The combination of Ota, Nakamura and Tateno further teaches The information processing system according to Claim 18, wherein the amount of change resulting from the change of the position of the three-dimensional shape data is the same as an amount of change resulting from the change of the position of the virtual viewpoint (Nakamura, [0051] This gives us the input image that corresponds to the point in the output image. Input 1 (31) and input 2 (32) may be the outputs of the data recording unit 2 or the camera head 1 described above. Of the input data, the three-dimensional data and time data are used to obtain the matrix A1-1 when projecting onto the spatiotemporal map. Since the inputs are different parameters, a matrix A1-1 is required for each. On the other hand, the input image data is written into frame memories (FM) 35 and 36, respectively. The address at this time corresponds to (u, v). Each of these frame memories has two systems, and writing and reading are performed alternately. The virtual camera controller 39 generates the three-dimensional parameters of the virtual camera. Automatic tracking, for example, is calculated based on the input three-dimensional data. From this data, the matrix coefficient calculation circuit 40 determines the matrix B-1. A read address counter 44 generates a read address (u', v') corresponding to the raster scan of the output. Based on this, the matrix calculation circuit 45 determines the address (x, z) on the spatiotemporal map. Based on this, the editing controller 41 creates editing matrices E1-1 and E2-1. Furthermore, from the read addresses and their matrices, the write addresses (u, v) corresponding to the read addresses (u', v') are calculated using Equation 5. As mentioned above, there are parts on the space-time map where there is an image and parts where there is no image, and this can be determined by checking the value of the calculated write address. 
In other words, when a read address is specified, if the corresponding write address has a value that actually exists, then that is a portion where an image exists. This comparison is performed by an address manager circuit 48, which is made up of a comparator and a selector, for each of the inputs, and calculates which address corresponds to which input. The frame memories 35 and 36 are read out at the calculated read address and selected by the data selector circuit 49, thereby obtaining an image of the virtual camera. In this example, there are two inputs, but the calculation can be performed in the same way for any number of inputs.

[Image: media_image1.png]

Therefore, the virtual viewpoint image is calculated based on object movement data. It would have been obvious to a person with ordinary skill in the art that the virtual viewpoint is moved by the same amount and in the same direction as the object when the updated virtual viewpoint image is to stay locked on (keep up with) the moving virtual object. This is common in sports television, where the camera tracks one specific player running on the field.).
The reasoning for combination of Ota, Nakamura and Tateno is the same as described in Claim 18.
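
The read-address to write-address lookup described in Nakamura [0051] can be sketched as an inverse mapping from each output pixel back to the input frame memories. The sketch below is a simplified illustration and not the circuit of the reference. It assumes that frame memories are 2D lists, that each input has a hypothetical mapping function in place of the matrices named in the quoted passage, and that the first input whose computed write address actually exists supplies the pixel.

```python
def render(inputs, out_w, out_h, background=0):
    """Build the virtual camera image by inverse mapping.

    inputs: list of (frame, to_write_addr) pairs, where frame is a 2D list
    indexed [v][u] and to_write_addr maps an output read address (u', v')
    to a write address (u, v) in that frame.
    """
    out = [[background] * out_w for _ in range(out_h)]
    for v_out in range(out_h):
        for u_out in range(out_w):
            for frame, to_write_addr in inputs:
                u, v = to_write_addr(u_out, v_out)
                # An image exists here only if the write address is inside the frame.
                if 0 <= v < len(frame) and 0 <= u < len(frame[0]):
                    out[v_out][u_out] = frame[v][u]
                    break  # select this input for the pixel
    return out


frame1 = [[1, 1], [1, 1]]
frame2 = [[2, 2], [2, 2]]
inputs = [
    (frame1, lambda u, v: (u, v)),      # input 1 covers output columns 0-1
    (frame2, lambda u, v: (u - 2, v)),  # input 2 covers output columns 2-3
]
image = render(inputs, out_w=5, out_h=2)
print(image)
```

Output pixels for which no input yields a valid write address keep the background value, which corresponds to the parts of the map where there is no image.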

Regarding Claim 22. The combination of Ota, Nakamura and Tateno further teaches The information processing system according to Claim 18, wherein the virtual viewpoint image is generated based on the changed position of the virtual viewpoint (Nakamura, [0047] When swimming, the size of the pool is limited, so if you want to swim long distances you will have to make repeated turns. At this point, you may want to compare the time before and after the turn, for example. In this case, edit as follows: When taking a photo, the direction of travel is reversed on the spacetime map before and after the turn, as shown in Figure 12. In the editing operation, the left and right of the space-time map after the turn are swapped and aligned with the map before the turn in order to reverse the direction of travel after the turn, as shown in FIG. At this time, care must be taken to align the position and time as described above. By this operation, it is possible to obtain an image as shown in FIG. 14, in which the athletes before and after the turn appear to be racing on adjacent courses at the same time. The positional relationship of the images to be compared by any of the above editing operations is displayed at a position based on the coordinates of the actual three-dimensional space, so it is also possible to measure the position and distance using the positional relationship on the images.). 
The reasoning for combination of Ota, Nakamura and Tateno is the same as described in Claim 18.

Regarding Claim 23. The combination of Ota, Nakamura and Tateno further teaches The information processing system according to Claim 18, wherein the second information is acquired based on a user operation (Nakamura, [0042] As described above, in the present invention, the object of editing is a spatiotemporal map. For example, in the case of swimming, consider the case where swimmers in lane 1 and lane 6 are photographed by separate camera heads (cam1, cam2) as shown in FIG. In this case, the spatiotemporal map shows each image in a separate location, as shown in Figure 3. If you shoot this with the virtual camera mentioned above, both images will be too small to fit on the screen, so you will need to edit them. During editing, the spatiotemporal map itself can be cut and pasted as shown in FIG.
Cut and paste are well-known user editing operations.).
The reasoning for combining Ota, Nakamura and Tateno is the same as described for Claim 18.

Claim 24 is similar in scope to Claim 18 and is thus rejected under the same rationale.
Claim 25 is similar in scope to Claim 18 and is thus rejected under the same rationale.

Regarding Claim 8. The combination of Ota, Nakamura and Tateno further teaches The information processing system according to Claim 18, wherein the position of the subject in the virtual viewpoint video are a position of the three-dimensional shape data in a coordinate system for generating the virtual viewpoint video, and
the coordinate system is different from a coordinate system corresponding to an image-capturing space whose images are captured by the plurality of image capturing devices (Nakamura, [0039] Next, the image output from the video editing unit 3 will be described. Unlike conventional image editing, the present invention uses a spatiotemporal map as the target of editing, rather than an input image. The output image is the image obtained when the spatiotemporal map is photographed using a virtual camera. This virtual camera is basically unrelated to the camera head 1 used for input. Therefore, it is possible to take a photograph from any position with any angle of view…. Therefore, the virtual camera can be operated in the same way as a camera head for image input. In other words, the camera position can be determined in three-dimensional space, and then the camera direction and angle of view can be determined from there. … FIG. 5 explains this principle. In this example, since a constraint plane is used and the spatiotemporal map is a plane, the image transformation is expressed by the matrix shown in the following equation 2, just like in the case of mapping.
[0045] When editing using footage from multiple camera heads, first determine the relationship between the clock and field using one of the cameras as the reference timing. At this time, the output image is also obtained. The image from the camera head 1 contains time data recorded by the stopwatch 5 as well as three-dimensional coordinate data, so the time of each field can be determined.).
The reasoning for combining Ota, Nakamura and Tateno is the same as described for Claim 18.
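To make concrete what it means for shape data to have a position in a coordinate system that differs from the image-capturing space, the following sketch maps points from a capture-space frame into a separate frame by a rotation and a translation. The Y-up axis convention, the function name `to_virtual_space`, and the numeric values are assumptions for illustration; neither the claims nor the cited references specify this transform.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]

def to_virtual_space(points: List[Point], yaw_deg: float, offset: Point) -> List[Point]:
    """Rotate points about the Y axis (assumed to be 'up'), then translate them.

    The result expresses the same shape data in a second coordinate system
    whose origin and heading differ from the capture-space frame.
    """
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    ox, oy, oz = offset
    return [(c * x + s * z + ox, y + oy, -s * x + c * z + oz) for x, y, z in points]

# Illustrative values only: shift a single vertex without rotating it.
moved = to_virtual_space([(1.0, 2.0, 3.0)], yaw_deg=0.0, offset=(10.0, 0.0, -5.0))
```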

Regarding Claim 10. The combination of Ota, Nakamura and Tateno further teaches The information processing system according to Claim 8, wherein the coordinate system for generating the virtual viewpoint video is a coordinate system corresponding to a background in the virtual viewpoint video (Nakamura, [0039] Next, the image output from the video editing unit 3 will be described. Unlike conventional image editing, the present invention uses a spatiotemporal map as the target of editing, rather than an input image. The output image is the image obtained when the spatiotemporal map is photographed using a virtual camera. This virtual camera is basically unrelated to the camera head 1 used for input. Therefore, it is possible to take a photograph from any position with any angle of view…. Therefore, the virtual camera can be operated in the same way as a camera head for image input. In other words, the camera position can be determined in three-dimensional space, and then the camera direction and angle of view can be determined from there. … FIG. 5 explains this principle. In this example, since a constraint plane is used and the spatiotemporal map is a plane, the image transformation is expressed by the matrix shown in the following equation 2, just like in the case of mapping.
Therefore, the constraint plane is the background, which is the XZ coordinate plane.).
The reasoning for combining Ota, Nakamura and Tateno is the same as described for Claim 18.

Regarding Claim 15. The combination of Ota, Nakamura and Tateno further teaches The information processing system according to Claim 18, wherein the changing the position of the three-dimensional shape data of the subject is performed by a user operation (Nakamura, [0042] As described above, in the present invention, the object of editing is a spatiotemporal map. For example, in the case of swimming, consider the case where swimmers in lane 1 and lane 6 are photographed by separate camera heads (cam1, cam2) as shown in FIG. In this case, the spatiotemporal map shows each image in a separate location, as shown in Figure 3. If you shoot this with the virtual camera mentioned above, both images will be too small to fit on the screen, so you will need to edit them. During editing, the spatiotemporal map itself can be cut and pasted as shown in FIG.
Cut and paste are well-known user editing operations.).
The reasoning for combining Ota, Nakamura and Tateno is the same as described for Claim 18.


Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
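The timing rule in the preceding paragraph can be worked through as a small date calculation. The sketch below is illustrative only: the helper names `add_months` and `shortened_period_end` are hypothetical, the mailing date of Apr 01, 2026 is taken from the Prosecution Timeline for this action, the reply and advisory-action dates are invented for the example, and the code ignores any weekend or holiday adjustment. It is not a substitute for docketing the actual due dates.

```python
import calendar
from datetime import date
from typing import Optional

def add_months(d: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def shortened_period_end(mailed: date,
                         first_reply: Optional[date] = None,
                         advisory_mailed: Optional[date] = None) -> date:
    """End of the shortened statutory period under the rule quoted above.

    Defaults to three months from mailing. If a first reply was filed within
    two months and the advisory action was mailed after the three-month date,
    the period ends on the advisory mailing date, capped at six months.
    """
    three_months = add_months(mailed, 3)
    six_months = add_months(mailed, 6)
    end = three_months
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailed, 2)
            and advisory_mailed > three_months):
        end = advisory_mailed
    return min(end, six_months)

mailed = date(2026, 4, 1)                      # mailing date of this final action
default_end = shortened_period_end(mailed)     # no advisory action considered
# Hypothetical scenario: reply on May 20, 2026; advisory action mailed Jul 15, 2026.
late_advisory_end = shortened_period_end(mailed, date(2026, 5, 20), date(2026, 7, 15))
```

Under these assumptions the default period ends on Jul 1, 2026, the hypothetical late advisory action moves it to Jul 15, 2026, and the six-month outer limit falls on Oct 1, 2026.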
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XIN SHENG whose telephone number is (571)272-5734. The examiner can normally be reached M-F 9:30AM-3:30PM and 6:00PM-8:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jason Chan, can be reached at 571-272-3022. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Xin Sheng/           Primary Examiner, Art Unit 2619
Prosecution Timeline

Oct 24, 2023: Application Filed
Oct 01, 2025: Non-Final Rejection mailed (§103, §112)
Dec 31, 2025: Response Filed
Apr 01, 2026: Final Rejection mailed (§103, §112; current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/173,623 (Patent 12626326): IMAGE STITCHING WITH AN ADAPTIVE THREE-DIMENSIONAL BOWL MODEL OF THE SURROUNDING ENVIRONMENT FOR SURROUND VIEW VISUALIZATION. Granted May 12, 2026 (3y 2m to grant).
18/367,119 (Patent 12620165): SYSTEMS, METHODS, AND COMPUTER PROGRAM PRODUCTS FOR POPULATING ENVIRONMENT MODELS. Granted May 05, 2026 (2y 7m to grant).
18/367,115 (Patent 12614341): SYSTEMS, METHODS, AND COMPUTER PROGRAM PRODUCTS FOR POPULATING ENVIRONMENT MODELS. Granted Apr 28, 2026 (2y 7m to grant).
18/490,458 (Patent 12614337): SYSTEM AND METHODS FOR CUSTOMIZING 3D MODELS. Granted Apr 28, 2026 (2y 6m to grant).
18/796,576 (Patent 12614366): AUTOMATIC POINT CLOUD BUILDING ENVELOPE SEGMENTATION (AUTO-CuBES) USING MACHINE LEARNING. Granted Apr 28, 2026 (1y 8m to grant).
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 72%
With Interview (+17.2%): 90%
Median Time to Grant: 2y 4m (~0m remaining)
PTA Risk: Moderate
Based on 404 resolved cases by this examiner. Grant probability derived from career allowance rate.