Prosecution Insights
Last updated: April 19, 2026
Application No. 18/594,062

INFORMATION PROCESSING APPARATUS AND METHOD, AND STORAGE MEDIUM

Non-Final OA: §103, §112
Filed
Mar 04, 2024
Examiner
HOANG, PETER
Art Unit
2616
Tech Center
2600 — Communications
Assignee
Canon Kabushiki Kaisha
OA Round
1 (Non-Final)
81%
Grant Probability
Favorable
1-2
OA Rounds
2y 7m
To Grant
92%
With Interview

Examiner Intelligence

Grants 81% — above average
81%
Career Allow Rate
435 granted / 539 resolved
+18.7% vs TC avg
+11.7%
Interview Lift
Moderate lift (about +12%) across resolved cases with interview
Typical timeline
2y 7m
Avg Prosecution
12 currently pending
Career history
551
Total Applications
across all art units

Statute-Specific Performance

§101
11.8%
-28.2% vs TC avg
§103
54.8%
+14.8% vs TC avg
§102
6.7%
-33.3% vs TC avg
§112
13.6%
-26.4% vs TC avg
Black line = Tech Center average estimate • Based on career data from 539 resolved cases
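
A quick arithmetic check (a hedged reading, assuming each "vs TC avg" figure is a simple percentage-point difference between the examiner's rate and the black-line baseline): subtracting the delta from the examiner's rate recovers the implied Tech Center average, and every statute lands at roughly 40%, consistent with a single flat baseline estimate. A minimal Python sketch of that check:

# Hedged sketch: assumes each delta is (examiner rate - TC average), in percentage points.
statute_rates = {
    "101": (11.8, -28.2),
    "103": (54.8, 14.8),
    "102": (6.7, -33.3),
    "112": (13.6, -26.4),
}
for statute, (rate, delta) in statute_rates.items():
    implied_tc_avg = rate - delta  # recover the baseline the chart compares against
    print(f"§{statute}: examiner {rate:.1f}% vs implied TC average {implied_tc_avg:.1f}%")
# Each statute implies a baseline of about 40.0%.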

Office Action

§103 §112
Notice of Pre-AIA or AIA Status The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . 35 USC § 112 The following is a quotation of 35 U.S.C. 112(f): (f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph: An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked. As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph: (A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; (B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and (C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. 
Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: A first generating unit configured to generate, an obtaining unit configured to obtain, a second generating unit configured to generate, a combining unit configured to distinguishably combine, a designating unit configured to designate. Since the claim limitation(s) invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, Claims 1-20 has/have been interpreted to cover the corresponding structure described in the specification that achieves the claimed function, and equivalents thereof. A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph limitation: the first generating unit, obtaining unit, second generation unit, combining unit, and designating unit are all being interpreted to cover various functional processing units realized in a CPU 711, see [0047-0048], in reference to Figs. 2, 7-8, 10. If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. Claim Rejections - 35 USC § 112 The following is a quotation of 35 U.S.C. 112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention. The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph: The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention. Claim 18 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claim 18 states, “…wherein a difference between a virtual viewpoint image including all of the plurality of objects and a virtual viewpoint image of the object of the generation target, and the virtual viewpoint image of the object of the generation target displayed on the display screen in an identifiable manner…” but the claim appears to be an incomplete sentence, thus not clear and indefinite. 
Appropriate correction is required. Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. Claim(s) 1, 12-18, 20, 21-22 is/are rejected under 35 U.S.C. 103 as being unpatentable over Yao et al. (“Accurate silhouette extraction of multiple moving objects for free viewpoint sports video synthesis”) in view of Sabirin et al. (“Toward Real-Time Delivery of Immersive Sports Content”). Re claim 1, Yao teaches an information processing apparatus comprising: a first generating unit configured to generate, from a plurality of captured images obtained by a plurality of cameras, a plurality of silhouette images representing respective regions of a plurality of objects appearing in the plurality of captured images (see section IV: system is a static dual camera system, wherein two cameras are sparsely fixed in two different positions in the baseball court respectively, as shown in Fig. 6; and Fig. 1 section III C. After the false silhouette candidates are rejected out, the histogram-based thresholding method is adopted to refine each silhouette candidate in a set); an obtaining unit configured to obtain associating information representing a silhouette image associated with each object by associating the plurality of silhouette images with each object of the plurality of objects (see Fig. 2, obj 1-5 as a silhouette image of each object). Yao does not explicitly teach a second generating unit configured to generate, based on the associating information corresponding to an object designated from the plurality of objects and a designated virtual viewpoint, a virtual viewpoint image of the designated object. However, Sabirin teaches a second generating unit configured to generate, based on the associating information corresponding to an object designated from the plurality of objects and a designated virtual viewpoint, a virtual viewpoint image of the designated object (see p.
63, Content Creation: To generate high-quality free viewpoint video of sports events held in a spacious area with sparsely arranged cameras that not only switches between cameras but also reproduces arbitrary viewpoints where cameras cannot be mounted… As Figure 1 shows, videos of a sports event are first acquired from several cameras located around the venue. This data then goes through a series of semi-automated image-processing steps: camera calibration, object extraction, object tracking, and object separation. The term “semi-automatic” indicates possible user intervention at each stage, especially object separation, to fine-tune or correct the results of the automated methods. Finally, a free viewpoint video is generated by synthesizing the objects extracted from the input videos and embedding them in a virtual sports venue) and (see Fig. 1, 3, and 5, wherein an object, such as a soccer player, is designated from the plurality of objects and a designated virtual viewpoint, a virtual viewpoint image of the athlete is generated based on the tracking of a player object). Yao and Sabirin teach claim 1. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yao’s systems of generating silhouette images associated with each object to explicitly include generating a virtual viewpoint image based on a designated object, as taught by Sabirin, as the references are in the analogous art of silhouette image processing. An advantage of the modification is that it achieves the result of allowing for generation of a virtual viewpoint based on a designated object, such as players in a sports game. Re claim 12, Yao and Sabirin teach claim 1. Furthermore, Sabirin teaches further comprising a combining unit configured to distinguishably combine a virtual viewpoint image of the designated object and differences between all virtual viewpoint images of the plurality of objects obtained using the plurality of silhouette images and the virtual viewpoint image of the designated object (see p. 65: Our multi-camera object tracking method combines temporal information between consecutive frames with spatial information from multiple cameras (see Figure 2). It represents the 3D world coordinate of an object as a 2D xy-coordinate in the ground plane, as viewed from above, and calculates this by estimating a homographic matrix between each camera image and the ground plane. Using a particle filter set for each object in every camera image with the same object’s identifier among multiple cameras, our method estimates the object’s 2D xy-coordinate on the ground plane in every camera image. As the process continues frame by frame, the system checks each object for occlusion and modifies the regions of occluding and occluded objects by projecting the 2D xy-coordinates of those objects estimated using another camera in which occlusion was not observed) and (see Fig. 5, capturing environments with camera selection views, and object extraction/tracking/separation of designated objects). For motivation, see claim 1. Re claim 13, Yao and Sabirin teach claim 1. Furthermore, Sabirin teaches wherein the first generating unit separates a region for each object in each of the plurality of captured images, and generates the silhouette image based on the separated region of the object (see Fig.
1, 3-4-5, wherein for each object in an image of a plurality of captured images, objects are extracted and silhouette images are generated based on the separated regions of the objects). For motivation, see claim 1. Re claim 14, Yao and Sabirin teach claim 13. Furthermore, Sabirin teaches wherein the first generating unit separates a region for each object in each of the plurality of captured images by instance segmentation (see Fig. 1, 3-4-5, wherein for each object in an image of a plurality of captured images, objects are extracted and silhouette images are generated based on the separated regions of the objects, wherein the objects are segmented from each other and tracked). For motivation, see claim 1. Re claim 15, Yao in view of Sabirin teaches claim 1. Furthermore, Sabirin teaches further comprising a designating unit configured to designate an object as a generation target of the virtual viewpoint image from among the plurality of objects in accordance with a user operation (see p. 69, We also developed a simple viewer for handheld devices to evaluate free viewpoint video generated by our application. It operates seamlessly in handheld devices such as smartphones and tablets, enabling users to freely rotate and zoom in/out virtual viewpoints by swiping and pinching in/out the screen) (see Fig. 1, wherein the free viewpoint video shows a user swiping on a screen to change to a designated object in a particular viewpoint), and (see Fig. 5, free viewpoint video authoring application wherein users can manually supervise and adjust automatic approaches, including camera selection, object extraction/tracking/separation). For motivation, see claim 1. Re claim 16, Yao and Sabirin teach claim 15. Furthermore, Sabirin teaches wherein the designating unit determines an object of the generation target based on a position designated by the user operation on a display screen that displays one captured image of the plurality of captured images (see p. 69, We also developed a simple viewer for handheld devices to evaluate free viewpoint video generated by our application. It operates seamlessly in handheld devices such as smartphones and tablets, enabling users to freely rotate and zoom in/out virtual viewpoints by swiping and pinching in/out the screen) (see Fig. 1, wherein the free viewpoint video shows a user swiping on a screen to change to a designated object in a particular viewpoint), and (see Fig. 5, free viewpoint video authoring application wherein users can manually supervise and adjust automatic approaches, including camera selection of a particular viewpoint of a plurality of viewpoints and object detection, object extraction/tracking/separation). For motivation, see claim 1. Re claim 17, Yao and Sabirin teach claim 15. Furthermore, Sabirin teaches wherein the designating unit determines an object of the generation target based on a position designated by the user operation on a display screen that displays a virtual viewpoint image generated by the second generating unit (see p. 69, We also developed a simple viewer for handheld devices to evaluate free viewpoint video generated by our application. It operates seamlessly in handheld devices such as smartphones and tablets, enabling users to freely rotate and zoom in/out virtual viewpoints by swiping and pinching in/out the screen) (see Fig. 1, wherein the free viewpoint video shows a user swiping on a screen to change to a designated object in a particular viewpoint), and (see Fig.
5, free viewpoint video authoring application wherein users can manually supervise and adjust automatic approaches, including camera selection of a particular viewpoint of a plurality of viewpoints and object detection, object extraction/tracking/separation). For motivation, see claim 1. Re claim 18, Yao and Sabirin teach claim 17. Furthermore, Sabirin teaches wherein a difference between a virtual viewpoint image including all of the plurality of objects and a virtual viewpoint image of the object of the generation target, and the virtual viewpoint image of the object of the generation target displayed on the display screen in an identifiable manner (see p. 69, We also developed a simple viewer for handheld devices to evaluate free viewpoint video generated by our application. It operates seamlessly in handheld devices such as smartphones and tablets, enabling users to freely rotate and zoom in/out virtual viewpoints by swiping and pinching in/out the screen), (see Fig. 1, wherein the free viewpoint video shows a user swiping on a screen to change to a designated object in a particular viewpoint, and wherein automatic object extraction shows a plurality of objects and the automatic object tracking and semi-automatic object separation silhouette images are of designated objects), and (see Fig. 5, free viewpoint video authoring application wherein users can manually supervise and adjust automatic approaches, including camera selection of a particular viewpoint of a plurality of viewpoints and object detection, object extraction/tracking/separation). For motivation, see claim 1. Re claim 20, Yao and Sabirin teach claim 15. Furthermore, Sabirin teaches wherein the designating unit maintains a designated state of the object by tracking the object designated by the user operation (see p. 65, in view of Fig. 3, wherein objects are tracked using at least a first and second camera, and thus the designated state of the object is tracked). For motivation, see claim 1. Claims 21-22 recite limitations similar in scope to claim 1 and are rejected for at least the reasons above. Claim(s) 2-11, 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Yao et al. (“Accurate silhouette extraction of multiple moving objects for free viewpoint sports video synthesis”) in view of Sabirin et al. (“Toward Real-Time Delivery of Immersive Sports Content”) and Chen et al. (“A Fast Free-viewpoint Video Synthesis Algorithm for Sports Scenes”). Re claim 2, Yao and Sabirin teach claim 1. Furthermore, Yao and Sabirin are not relied upon to explicitly teach wherein the obtaining unit obtains 3d shape using the plurality of silhouette images and performs association for each object of the plurality of silhouette images is projected onto the 3d shape. However, Chen teaches wherein the obtaining unit obtains 3d shape using the plurality of silhouette images and performs association for each object of the plurality of silhouette images is projected onto the 3d shape (see p. 3211, C. Surface polygonization, in reference to Fig. 4-5, wherein silhouette images are projected into a 3d shape cell). Yao, Sabirin, and Chen teach claim 2. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Yao and Sabirin’s systems of generating silhouette images associated with each object to explicitly obtain a 3d shape using the plurality of silhouette images, as taught by Chen, as the references are in the analogous art of silhouette image processing.
An advantage of the modification is that it achieves the result of allowing for generation of a 3d shapes by performing association of objects using the projection of silhouette images onto a 3d shape. Re claim 3, Yao, Sabirin, and Chen teach claim 2. Furthermore, Chen teaches wherein the obtaining unit associates a silhouette image with an element present in the projection area of the silhouette image among a plurality of elements constituting the three-dimensional shape, and performs the association of the silhouette images based on a result of the association of the silhouette image in each of the plurality of elements (see p. 3211, in reference to Fig. 1, Fig. 4-5, wherein a plurality of elements of 3d ROI elements are associated with particular silhouette images). For motivation, see claim 2. Re claim 4, Yao, Sabirin, and Chen teaches claim 3. Furthermore, Chen teaches wherein the obtaining unit: determines, for the plurality of elements, visibility from the plurality of cameras, and associates a silhouette image obtained from a captured image obtained by a camera in which the element is determined to be visible with the element (see abstract, In this paper, we report on a parallel free-viewpoint video synthesis algorithm that can efficiently reconstruct a high-quality 3D scene representation of sports scenes. The proposed method focuses on a scene that is captured by multiple synchronized cameras featuring wide-baselines. The following strategies are introduced to accelerate the production of a free-viewpoint video taking the improvement of visual quality into account: (1) a sparse point cloud is reconstructed using a volumetric visual hull approach, and an exact 3D ROI is found for each object using an efficient connected components labeling algorithm. Next, the reconstruction of a dense point cloud is accelerated by implementing visual hull only in the ROIs; (2) an accurate polyhedral surface mesh is built by estimating the exact intersections between grid cells and the visual hull; (3) the appearance of the reconstructed presentation is reproduced in a view-dependent manner that respectively renders the non-occluded and occluded region with the nearest camera and its neighboring cameras. The production for volleyball and judo sequences demonstrates the effectiveness of our method in terms of both execution time and visual quality) and (See p. 3211-3213, wherein visibility is detected, including determination of occlusion, and view-dependent renderings are performed with non-occluded parts rendered with occluded regions from neighboring cameras). For motivation, see claim 2. Re claim 5, Yao, Sabirin, and Chen teach claim 3. Furthermore, Chen teaches the obtaining unit supplements the association between the element and the silhouette image based on association of a silhouette image in an element adjacent to the element (see abstract, In this paper, we report on a parallel free-viewpoint video synthesis algorithm that can efficiently reconstruct a high-quality 3D scene representation of sports scenes. The proposed method focuses on a scene that is captured by multiple synchronized cameras featuring wide-baselines. The following strategies are introduced to accelerate the production of a free-viewpoint video taking the improvement of visual quality into account: (1) a sparse point cloud is reconstructed using a volumetric visual hull approach, and an exact 3D ROI is found for each object using an efficient connected components labeling algorithm. 
Next, the reconstruction of a dense point cloud is accelerated by implementing visual hull only in the ROIs; (2) an accurate polyhedral surface mesh is built by estimating the exact intersections between grid cells and the visual hull; (3) the appearance of the reconstructed presentation is reproduced in a view-dependent manner that respectively renders the non-occluded and occluded region with the nearest camera and its neighboring cameras. The production for volleyball and judo sequences demonstrates the effectiveness of our method in terms of both execution time and visual quality) and (See p. 3211-3213, wherein visibility is detected, including determination of occlusion, and view-dependent renderings are performed with non-occluded parts rendered with occluded regions from neighboring cameras). For motivation, see claim 2. Re claim 6, Yao and Sabirin teaches claim 1. Yao and Sabirin do not explicitly teach wherein the second generating unit obtains a three-dimensional shape of the designated object using a silhouette image associated by the associating information corresponding to the designated object, and generates the virtual viewpoint image of the designated object based on the obtained three-dimensional shape. However, Chen teaches wherein the second generating unit obtains a three-dimensional shape of the designated object using a silhouette image associated by the associating information corresponding to the designated object, and generates the virtual viewpoint image of the designated object based on the obtained three-dimensional shape (see p. 3211, C. Surface polygonization, in reference to Fig. 4-5, wherein silhouette images are projected into a 3d shape cell) and (See Fig. 1-2, wherein silhouette image associates with designated objects and the fast free viewpoints are generated for 3d shapes based on designating objects of the silhouette images). Yao, Sabirin, and Chen teach claim 6. For motivation, see claim 2. Re claim 7, Yao, Sabirin, and Chen teaches claim 6. Furthermore, Sabirin teaches wherein the second generating unit, performs, when the designated object is two or more objects, logical sum synthesis of silhouette images corresponding to the two or more objects for each captured image, and generates the virtual viewpoint image using the silhouette image obtained by the logical sum synthesis (see p. 64- 65, object extraction and tracking, in reference to Fig. 3, wherein logical sum synthesis of silhouette images corresponding to two or more objects are designated in a silhouette image such as label=5 and label=6 , and virtual viewpoint images using the silhouette such as tr= 1, tr=2). For motivation, see claim 1. Re claim 8, Yao, Sabirin, and Chen teaches claim 6. Furthermore, Chen teaches wherein the second generating unit combines a silhouette image of the designated object and a silhouette image of an object in an occlusion relationship with the designated object in a distinguishable manner, and generates the virtual viewpoint image using the combined silhouette image (see abstract, In this paper, we report on a parallel free-viewpoint video synthesis algorithm that can efficiently reconstruct a high-quality 3D scene representation of sports scenes. The proposed method focuses on a scene that is captured by multiple synchronized cameras featuring wide-baselines. 
The following strategies are introduced to accelerate the production of a free-viewpoint video taking the improvement of visual quality into account: (1) a sparse point cloud is reconstructed using a volumetric visual hull approach, and an exact 3D ROI is found for each object using an efficient connected components labeling algorithm. Next, the reconstruction of a dense point cloud is accelerated by implementing visual hull only in the ROIs; (2) an accurate polyhedral surface mesh is built by estimating the exact intersections between grid cells and the visual hull; (3) the appearance of the reconstructed presentation is reproduced in a view-dependent manner that respectively renders the non-occluded and occluded region with the nearest camera and its neighboring cameras. The production for volleyball and judo sequences demonstrates the effectiveness of our method in terms of both execution time and visual quality) and (See p. 3211-3213, wherein visibility is detected, including determination of occlusion, and view-dependent renderings are performed with non-occluded parts rendered with occluded regions from neighboring cameras). For motivation, see claim 2. Re claim 9, Yao, Sabirin, and Chen teaches claim 8. Furthermore, Chen teaches wherein the virtual viewpoint image generated from the silhouette image of the object in the occlusion relationship is generated with a brightness different from the virtual viewpoint image generated from the silhouette image of the designated object (see p. 3212, wherein occlude parts are shown in grey in occlusion map of a specific camera in Fig. 1(d)). For motivation, see claim 2. Re claim 10, Yao, Sabirin, and Chen teach claim 6. Furthermore, Chen teaches the second generating unit generates, when the designated object is two or more objects, virtual viewpoint images of the two or more objects using silhouette images of the two or more objects, respectively; and combines the virtual viewpoint images of each of the two or more objects based on a distance between the designated virtual viewpoint and each of the two or more objects (see Fig. 2, silhouette extraction wherein selected region shows enlarged views with two or more objects, using distance map), (see p. 3211-3212, visibility detection and View-dependent rendering calculating distances from virtual viewpoint to each camera), and (see abstract, reconstructed presentation in a view-dependent manner including nearest camera and its neighbors). For motivation, see claim 2. Re claim 11, Yao, Sabirin, and Chen teaches claim 3. Furthermore, Chen teaches wherein the second generating unit generates the virtual viewpoint image of the designated object based on a three-dimensional shape obtained by deleting an element not associated with the silhouette image associated with the designated object by the associating information from a plurality of elements constituting the three-dimensional shape generated based on the plurality of silhouette images (see p. 3210-3211, removing noise, and remove point set if the number of voxels of the point set is less than a specified voxel number). For motivation, see claim 2. Re claim 19, Yao and Sabirin teaches claim 15. Yao and Sabirin do not explicitly teach wherein the designating unit determines, as an object of the generation target, an object present at a predetermined time in a specific region designated by the user operation in photographing regions of the plurality of cameras. 
However, Chen teaches wherein the designating unit determines, as an object of the generation target, an object present at a predetermined time in a specific region designated by the user operation in photographing regions of the plurality of cameras (see p. 3209, Introduction, Free-viewpoint video (FVV) is a well-known technique that provides an immersive user experience when viewing visual media. Compared with traditional fixed-viewpoint video, it allows users to select a viewpoint interactively and is capable of rendering a new view from a novel viewpoint) and (see Fig. 10, wherein Synthesized FVV of a judo sequence is captured at different predetermined times in a specific area designated by the user along with objects/regions of interest). Yao, Sabirin, and Chen teach claim 19. For motivation, see claim 2. Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to Peter Hoang whose telephone number is (571)270-1346. The examiner can normally be reached Monday-Friday 8:00 am - 5:00 pm PST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hajnik F. Daniel can be reached at (571) 272-7642. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /PETER HOANG/Primary Examiner, Art Unit 2616

Prosecution Timeline

Mar 04, 2024
Application Filed
Feb 20, 2026
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597192
INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD
2y 5m to grant • Granted Apr 07, 2026
Patent 12582906
SYSTEM FOR GENERATING ANIMATION WITHIN A VIRTUAL ENVIRONMENT
2y 5m to grant • Granted Mar 24, 2026
Patent 12561902
INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD
2y 5m to grant • Granted Feb 24, 2026
Patent 12555318
Systems and Methods for Adaptive Streaming of Point Clouds
2y 5m to grant • Granted Feb 17, 2026
Patent 12530841
INTELLIGENT METHOD TO DYNAMICALLY PRIORITIZE AND ORCHESTRATE SPATIAL COMPUTING DATA FEEDS LEVERAGING QUANTUM GENERATIVE ARTIFICIAL INTELLIGENCE
2y 5m to grant • Granted Jan 20, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

1-2
Expected OA Rounds
81%
Grant Probability
92%
With Interview (+11.7%)
2y 7m
Median Time to Grant
Low
PTA Risk
Based on 539 resolved cases by this examiner. Grant probability derived from career allow rate.
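
The headline projections can be reproduced from the career counts shown above. A minimal sketch, assuming the grant probability is simply the career allow rate and the interview figure adds the stated lift in percentage points:

# Hedged sketch: assumes grant probability == career allow rate,
# and the "with interview" figure adds the stated lift in points.
granted, resolved = 435, 539
interview_lift_points = 11.7

allow_rate = granted / resolved * 100                 # ~80.7%, shown rounded to 81%
with_interview = allow_rate + interview_lift_points   # ~92.4%, shown rounded to 92%

print(f"Career allow rate: {allow_rate:.1f}%")
print(f"Grant probability with interview: {with_interview:.1f}%")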

Free tier: 3 strategy analyses per month