Last updated: May 29, 2026
Application No. 18/617,584
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM

Non-Final OA §102 §103
Filed
Mar 26, 2024
Priority
Sep 28, 2021 — JP 2021-158214 +1 more
Examiner
ANSARI, TAHMINA N
Art Unit
2674
Tech Center
2600 — Communications
Assignee
Fujifilm Corporation
OA Round
1 (Non-Final)
Interview Optional

Examiner has a relatively high allowance rate (86%) and a +18.1% interview lift. A written response may suffice.
Based on 881 resolved cases, 2023–2026
Examiner Intelligence

ANSARI, TAHMINA N
Grants 86% — above average
Career Allowance Rate
753 granted / 881 resolved
+23.5% vs TC avg
Strong +18% interview lift
Interview Lift: +18.1% (resolved cases with interview, compared with those without)
Typical timeline
2y 6m
Avg Prosecution
22 currently pending
Career history
906
Total Applications
across all art units
Statute-Specific Performance

§101: 2.5% (-37.5% vs TC avg)
§103: 77.5% (+37.5% vs TC avg)
§102: 10.8% (-29.2% vs TC avg)
§112: 3.6% (-36.4% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 881 resolved cases
Office Action

§102 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-24 are pending in this application.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.


Claims 1-5, 7-14, 16, 18, and 20-24 are rejected under 35 U.S.C. 102(a)(1)/(a)(2) as being anticipated by Yang Liu et al. (US PGPub US20210377505A1, with priority to May 26, 2021, hereinafter referred to as “Liu”).

Consider Claims 1 and 21. 
Liu teaches: 
1. An information processing apparatus comprising: a processor, wherein the processor acquires an image, / 21. An information processing method executed by an information processing apparatus, the method comprising: acquiring an image; (Liu: abstract, A system for generating three-dimensional (3D) images from captured images of a target when executing digital magnification. A controller executes a digital magnification on the first image of the target captured by the first image sensor and on the second image captured by the second image sensor of the target. The controller crops the first image and the second image to overlap a first portion of the target captured by the first image sensor with a second portion of the target captured by the second image sensor. The controller adjusts the cropping of the first image and the second image to provide binocular overlap of the first portion of the target with the second portion of target. The displayed cropped first image and the cropped second image display the 3D image at the digital magnification to the user. [0022] FIG. 1A illustrates a schematic view of binocular overlap of human eyes configuration 100 where the region seen by both eyes is the overlapped region included in the scene seen by both eyes. [0024] FIG. 1B illustrates a block diagram of a two imaging sensor configuration 150 where two image sensors with two lenses are used in a side-by-side configuration. The two imaging sensor configuration 150 includes a right image sensor 130 a, a left image sensor 130 b, a right lens 140 a, and a left lens 140 b. FIG. 1C illustrates a block diagram a binocular overlap of two imaging sensor configuration 175 with the regions seen by both imaging sensors is the overlapped region.)
1. executes processing of region extraction to extract a region of a detection target from the image, / 21. executing processing of region extraction to extract a region of a detection target from the image; (Liu: [0024] FIG. 1B illustrates a block diagram of a two imaging sensor configuration 150 where two image sensors with two lenses are used in a side-by-side configuration. The two imaging sensor configuration 150 includes a right image sensor 130 a, a left image sensor 130 b, a right lens 140 a, and a left lens 140 b. FIG. 1C illustrates a block diagram a binocular overlap of two imaging sensor configuration 175 with the regions seen by both imaging sensors is the overlapped region. The binocular overlap of two imaging sensor configuration 175 includes a captured region by right image sensor 150 a, a captured region by left image sensor 150 b, and a binocular overlap region 150 c. FIG. 1C depicts the binocular overlap region 150 c that is generated when a right image sensor 130 a and a left image sensor 130 b are used in a side-by-side configuration as depicted in FIG. 1B. [0025] FIG. 2 depicts a schematic view of a conventional digital zoom configuration 200 where the original image is cropped and resized (from left to right). The cropped and resized images are displayed to the user after conventional digital zooming. Conventionally, digital zoom has been commonly used to zoom the image. The principle of conventional digital zoom is illustrated in FIG. 2. Although conventional digital zoom can magnify the images without the need of zoom lenses, it is not suitable for 3D magnification. [0094])
1. generates position information of the region from region information of the extracted region, / 21. generating position information of the region from region information of the extracted region; (Liu: [0027] The digital magnification of a 3D image system 300 may generate 3D images from captured images of a target when executing digital magnification on the captured images to maintain the 3D images generated of the target after digital magnification. A first image sensor (such as right image sensor 330 a) may capture a first image at an original size of the target. A second image sensor (such as left image sensor 330 b) may be positioned on a common x-axis with the first image sensor 330 a to capture a second image at the original size of the target. It should be appreciated that the first image sensor 330 a and the second image sensor 330 b may be positioned with either a converging angle or a diverging angle. [0028] A controller 310 may execute a digital magnification on the first image captured by the first image sensor 330 a at the original size of the target and on the second image captured by the second image sensor 330 b at the original size of the target. The controller 310 may crop the first image captured by the first image sensor 330 a and the second image captured by the second image sensor 330 b to overlap a first portion of the target captured by the first image sensor 330 a with a second portion of the target captured by the second image sensor 330 b. The first portion of the target captured by the first image sensor 330 a overlaps with the second portion of the target captured by the second image sensor 330 b. In one aspect, the first image sensor 330 a is further coupled with a first autofocus lens and the second image sensor 330 b is further coupled with a second autofocus lens. The autofocus lenses may enable autofocus.)
1. and displays a result of the processing of the region extraction by switching between a first display aspect in which information for informing of a position of the region is displayed on a display screen based on the position information in a visually appealing manner / 21. and displaying a result of the processing of the region extraction by switching between a first display aspect in which information for informing of a position of the region is displayed on a display screen based on the position information in a visually appealing manner (Liu: [0029] The controller 310 may adjust the cropping of the first image and the second image to provide binocular overlap of the first portion of the target with the second portion of the target. The binocular overlap of the first image and the second image is an overlap threshold that when satisfied results in a 3D image of the target displayed to a user after the digital magnification is executed. The controller may instruct a display (such as near-eye 3D display 320) to display the cropped first image and the cropped second image that includes the binocular overlap to the user. The displayed cropped first image and the cropped second image display the 3D image at the digital magnification to the user. [0030])
1. and a second display aspect in which the region information is displayed in an aspect different from the first display aspect according to at least one of a size of the extracted region or a display size of the region displayed on the display screen. / 21. and a second display aspect in which the region information is displayed in an aspect different from the first display aspect according to at least one of a size of the extracted region or a display size of the region displayed on the display screen. (Liu: [0030] The controller 310 may resize the cropped first image to the original size of the first image captured by the first image sensor 330 a and the cropped second image to the original size of the second image captured by the second image sensor 330 b. The cropped first image as resized and the cropped second image resized includes the binocular overlap of the first image and the second image. The controller 310 may instruct the near-eye 3D display 320 to display the resized and cropped first image and the resized and cropped second image that includes the binocular overlap to the user. The displayed resized and cropped first image and the resized and cropped second image display the 3D image at the digital magnification to the user. It should be appreciated that in one embodiment the controller 310 may crop the first image captured by the first image sensor 330 a, to generate both left cropped image and right cropped image. In this embodiment, the second image captured by the second image sensor 330 b is not used. [0095] In another example, the controller 310 may conduct using calibration and disparity map to find the working distance of desired object. The controller 310 may use previously calibrated frames to extract a partial or full disparity or depth map. Then controller 310 may use a region of interest or a point in a specific part of the image to assess the distance to the desired object or plane of operation (working distance), and use the distance to determine proper value for autofocus from either a distance dependent equation or a pre-determined look-up-table (LUT).)
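For illustration only, the conventional digital zoom that the cited passage of Liu ([0025]) describes, in which the original image is cropped and then resized, can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Liu or from the application: the function name `digital_zoom`, the default centered crop, and the nearest-neighbor resampling are all hypothetical choices made to keep the example short.

```python
import numpy as np


def digital_zoom(image, magnification, center=None):
    """Crop a window around `center` and resize it back to the original size.

    Hypothetical helper: nearest-neighbor resampling is an assumption,
    chosen only to avoid extra dependencies.
    """
    if magnification < 1:
        raise ValueError("magnification must be at least 1")
    h, w = image.shape[:2]
    # The crop window shrinks in proportion to the magnification.
    crop_h = max(1, int(round(h / magnification)))
    crop_w = max(1, int(round(w / magnification)))
    cy, cx = (h // 2, w // 2) if center is None else center
    # Clamp the window so it stays inside the image.
    top = min(max(cy - crop_h // 2, 0), h - crop_h)
    left = min(max(cx - crop_w // 2, 0), w - crop_w)
    cropped = image[top:top + crop_h, left:left + crop_w]
    # Map every output pixel back to a source pixel in the crop.
    rows = np.arange(h) * crop_h // h
    cols = np.arange(w) * crop_w // w
    return cropped[rows[:, None], cols]


image = np.arange(16).reshape(4, 4)
zoomed = digital_zoom(image, 2)
print(zoomed.shape)
```

The output keeps the original height and width, so the zoomed frame can be shown on the same display without further scaling.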

Consider Claim 23. 
Liu teaches: 
23. An information processing method executed by an information processing apparatus, the method comprising:  (Liu: abstract, A system for generating three-dimensional (3D) images from captured images of a target when executing digital magnification. A controller executes a digital magnification on the first image of the target captured by the first image sensor and on the second image captured by the second image sensor of the target. The controller crops the first image and the second image to overlap a first portion of the target captured by the first image sensor with a second portion of the target captured by the second image sensor. The controller adjusts the cropping of the first image and the second image to provide binocular overlap of the first portion of the target with the second portion of target. The displayed cropped first image and the cropped second image display the 3D image at the digital magnification to the user. [0022] FIG. 1A illustrates a schematic view of binocular overlap of human eyes configuration 100 where the region seen by both eyes is the overlapped region included in the scene seen by both eyes. [0024] FIG. 1B illustrates a block diagram of a two imaging sensor configuration 150 where two image sensors with two lenses are used in a side-by-side configuration. The two imaging sensor configuration 150 includes a right image sensor 130 a, a left image sensor 130 b, a right lens 140 a, and a left lens 140 b. FIG. 1C illustrates a block diagram a binocular overlap of two imaging sensor configuration 175 with the regions seen by both imaging sensors is the overlapped region.)
23. acquiring region information of a region of a detection target in an image and position information of the region;  (Liu: [0024] FIG. 1B illustrates a block diagram of a two imaging sensor configuration 150 where two image sensors with two lenses are used in a side-by-side configuration. The two imaging sensor configuration 150 includes a right image sensor 130 a, a left image sensor 130 b, a right lens 140 a, and a left lens 140 b. FIG. 1C illustrates a block diagram a binocular overlap of two imaging sensor configuration 175 with the regions seen by both imaging sensors is the overlapped region. The binocular overlap of two imaging sensor configuration 175 includes a captured region by right image sensor 150 a, a captured region by left image sensor 150 b, and a binocular overlap region 150 c. FIG. 1C depicts the binocular overlap region 150 c that is generated when a right image sensor 130 a and a left image sensor 130 b are used in a side-by-side configuration as depicted in FIG. 1B. [0025] FIG. 2 depicts a schematic view of a conventional digital zoom configuration 200 where the original image is cropped and resized (from left to right). The cropped and resized images are displayed to the user after conventional digital zooming. Conventionally, digital zoom has been commonly used to zoom the image. The principle of conventional digital zoom is illustrated in FIG. 2. Although conventional digital zoom can magnify the images without the need of zoom lenses, it is not suitable for 3D magnification. [0094])
23. and switching between a first display aspect in which information for informing of a position of the region is displayed on a display screen based on the position information in a visually appealing manner (Liu: [0027] The digital magnification of a 3D image system 300 may generate 3D images from captured images of a target when executing digital magnification on the captured images to maintain the 3D images generated of the target after digital magnification. A first image sensor (such as right image sensor 330 a) may capture a first image at an original size of the target. A second image sensor (such as left image sensor 330 b) may be positioned on a common x-axis with the first image sensor 330 a to capture a second image at the original size of the target. It should be appreciated that the first image sensor 330 a and the second image sensor 330 b may be positioned with either a converging angle or a diverging angle. [0028] A controller 310 may execute a digital magnification on the first image captured by the first image sensor 330 a at the original size of the target and on the second image captured by the second image sensor 330 b at the original size of the target. The controller 310 may crop the first image captured by the first image sensor 330 a and the second image captured by the second image sensor 330 b to overlap a first portion of the target captured by the first image sensor 330 a with a second portion of the target captured by the second image sensor 330 b. The first portion of the target captured by the first image sensor 330 a overlaps with the second portion of the target captured by the second image sensor 330 b. In one aspect, the first image sensor 330 a is further coupled with a first autofocus lens and the second image sensor 330 b is further coupled with a second autofocus lens. The autofocus lenses may enable autofocus.)
23. and a second display aspect in which the region information is displayed in an aspect different from the first display aspect according to at least one of a size of the region or a display size of the region displayed on the display screen. (Liu: [0029] The controller 310 may adjust the cropping of the first image and the second image to provide binocular overlap of the first portion of the target with the second portion of the target. The binocular overlap of the first image and the second image is an overlap threshold that when satisfied results in a 3D image of the target displayed to a user after the digital magnification is executed. The controller may instruct a display (such as near-eye 3D display 320) to display the cropped first image and the cropped second image that includes the binocular overlap to the user. The displayed cropped first image and the cropped second image display the 3D image at the digital magnification to the user. [0030] The controller 310 may resize the cropped first image to the original size of the first image captured by the first image sensor 330 a and the cropped second image to the original size of the second image captured by the second image sensor 330 b. The cropped first image as resized and the cropped second image resized includes the binocular overlap of the first image and the second image. The controller 310 may instruct the near-eye 3D display 320 to display the resized and cropped first image and the resized and cropped second image that includes the binocular overlap to the user. The displayed resized and cropped first image and the resized and cropped second image display the 3D image at the digital magnification to the user. It should be appreciated that in one embodiment the controller 310 may crop the first image captured by the first image sensor 330 a, to generate both left cropped image and right cropped image. In this embodiment, the second image captured by the second image sensor 330 b is not used. [0095] In another example, the controller 310 may conduct using calibration and disparity map to find the working distance of desired object. The controller 310 may use previously calibrated frames to extract a partial or full disparity or depth map. Then controller 310 may use a region of interest or a point in a specific part of the image to assess the distance to the desired object or plane of operation (working distance), and use the distance to determine proper value for autofocus from either a distance dependent equation or a pre-determined look-up-table (LUT).)
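For illustration only, the step quoted from Liu [0095] (assessing the working distance from a region of interest in a disparity map, then reading an autofocus value from a look-up table) might be sketched as below. The sketch assumes the standard rectified-stereo relation Z = f·B/d, which the quoted passage does not state; the function names, the median over the region, the linear interpolation between table entries, and all numeric values are hypothetical.

```python
import numpy as np


def working_distance_mm(disparity, roi, focal_length_px, baseline_mm):
    """Estimate distance to the object inside `roi` = (top, left, height, width).

    Assumes rectified stereo, so depth Z = f * B / d (an assumption, not
    something the quoted passage specifies).
    """
    top, left, height, width = roi
    patch = disparity[top:top + height, left:left + width]
    valid = patch[patch > 0]  # ignore pixels without a disparity estimate
    if valid.size == 0:
        raise ValueError("no valid disparity in the region of interest")
    return focal_length_px * baseline_mm / float(np.median(valid))


def focus_setting_from_lut(distance_mm, lut_distances_mm, lut_settings):
    """Interpolate linearly between look-up-table entries (hypothetical LUT)."""
    return float(np.interp(distance_mm, lut_distances_mm, lut_settings))


# Hypothetical calibration values, for demonstration only.
disparity = np.full((10, 10), 20.0)
distance = working_distance_mm(disparity, (2, 2, 4, 4),
                               focal_length_px=1000.0, baseline_mm=60.0)
setting = focus_setting_from_lut(distance, [1000.0, 2000.0, 4000.0],
                                 [10.0, 20.0, 40.0])
print(distance, setting)
```

Using the median rather than the mean is a design choice that makes the estimate less sensitive to a few mismatched pixels inside the region of interest.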

Consider Claim 2. 
Liu teaches: 2. The information processing apparatus according to claim 1, wherein the processor uses a segmentation model that performs image segmentation to execute the processing of the region extraction. (Liu: [0052] In yet another example, neural networks, convolutional neural networks, or deep learning are used for object recognition, image classification, object localization, image segmentation, image registration, or a combination thereof. Neural network based systems are advantageous in many cases for image segmentation, recognition and registration tasks. [0053] In one example, U-Net is used, which has a contraction path and expansion path. The contraction path has consecutive convolutional layers and max-pooling layer. The expansion path performs up-conversion and may have convolutional layers. The convolutional layer(s) prior to the output maps the feature vector to the required number of target classes in the final segmentation output. In one example, V-net is implemented for image segmentation to isolate the organ or tissue of interest (e.g. vertebral bodies). In one example, Autoencoder based Deep Learning Architecture is used for image segmentation to isolate the organ or tissue of interest. In one example, backpropagation is used for training the neural networks.)

Consider Claim 3. 
Liu teaches: 3. The information processing apparatus according to claim 2, wherein the segmentation model is a learning model trained using machine learning to extract the region of the detection target from an input image. (Liu: [0054] In yet another example, deep residual learning is performed for image recognition or image segmentation, or image registration. A residual learning framework is utilized to ease the training of networks. A plurality of layers is implemented as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. One example of network that performs deep residual learning is deep Residual Network or ResNet. [0055] In another embodiment, a Generative Adversarial Network (GAN) is used for image recognition or image segmentation, or image registration. In one example, the GAN performs image segmentation to isolate the organ or tissue of interest. In the GAN, a generator is implemented through neural network to models a transform function which takes in a random variable as input and follows the targeted distribution when trained. A discriminator is implemented through another neural network simultaneously to distinguish between generated data and true data. In one example, the first network tries to maximize the final classification error between generated data and true data while the second network attempts to minimize the same error. Both networks may improve after iterations of the training process. [0056] In yet another example, ensemble methods are used, wherein multiple learning algorithms are used to obtain better predictive performance.)

Consider Claim 4. 
Liu teaches: 4. The information processing apparatus according to claim 1, wherein the position information includes information indicating a position of a centroid or a center of a circumscribed rectangle of the extracted region. (Liu: [0049]-[0050] In another example, machine learning algorithms are used for determining a center of cropping for the left image, or a center of cropping for the right image, or both centers, during the digital magnification process. In one aspect, object recognition and localization based on machine learning (e.g. recognize surgical field, or recognize surgical instrument, or recognize tissues, etc.) may determine at least one center of the cropping. For example, the surgical bed is recognized and localized based on the left image, and a location within the surgical bed (e.g. centroid) is assigned to be the center of cropping for the left image, and the center of cropping for the right image is calculated based on the center of cropping for the left image and the desirable binocular overlap to be maintained. [0051] In one aspect, supervised learning can be implemented. In another aspect, unsupervised learning can be implemented. In yet another aspect, reinforcement learning can be implemented. It should be appreciated that feature learning, sparse dictionary learning, anomaly detection, association rules may also be implemented. Various models may be implemented for machine learning. In one aspect, artificial neural networks are used. In another aspect, decision trees are used. In yet another aspect, support vector machines are used. In yet another aspect, Bayesian networks are used. In yet another aspect, genetic algorithms are used. [0052] In yet another example, neural networks, convolutional neural networks, or deep learning are used for object recognition, image classification, object localization, image segmentation, image registration, or a combination thereof. Neural network based systems are advantageous in many cases for image segmentation, recognition and registration tasks.)
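For illustration only, the two kinds of position information recited in claim 4 (a centroid, or the center of a circumscribed rectangle of the extracted region) can be computed from a binary region mask as sketched below. This is a minimal sketch and not the applicant's or Liu's implementation: the function name `region_positions`, the mask representation, and the (row, column) coordinate convention are assumptions.

```python
import numpy as np


def region_positions(mask):
    """Return the centroid and circumscribed-rectangle center of a binary mask.

    Hypothetical helper; coordinates are (row, column) pixel indices.
    Returns None when the mask contains no region pixels.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    # Centroid: mean position of all region pixels.
    centroid = (float(ys.mean()), float(xs.mean()))
    # Circumscribed rectangle: tightest axis-aligned box around the region.
    top, bottom = int(ys.min()), int(ys.max())
    left, right = int(xs.min()), int(xs.max())
    box_center = ((top + bottom) / 2.0, (left + right) / 2.0)
    return {"centroid": centroid,
            "box": (top, left, bottom, right),
            "box_center": box_center}


# An L-shaped region, where the two definitions give different points.
mask = np.zeros((5, 5), dtype=bool)
mask[1, 1:4] = True
mask[2:4, 1] = True
print(region_positions(mask))
```

For a non-convex region such as this one, the centroid and the rectangle center differ, which is why the claim names them as alternatives.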

Consider Claim 5. 
Liu teaches: 5. The information processing apparatus according to claim 1, wherein the first display aspect includes displaying a rectangular frame or a circular frame as the information for informing of the position of the extracted region. (Liu: [0050] In another example, machine learning algorithms are used for determining a center of cropping for the left image, or a center of cropping for the right image, or both centers, during the digital magnification process. In one aspect, object recognition and localization based on machine learning (e.g. recognize surgical field, or recognize surgical instrument, or recognize tissues, etc.) may determine at least one center of the cropping. For example, the surgical bed is recognized and localized based on the left image, and a location within the surgical bed (e.g. centroid) is assigned to be the center of cropping for the left image, and the center of cropping for the right image is calculated based on the center of cropping for the left image and the desirable binocular overlap to be maintained. [0051]-[0052], [0092] The system 300 may need stereoscopic calibration to enable accurate 3D digital magnification. In one example, after mechanical fixture to achieve vertical calibration, a single calibration (through repeated capture of similar calibration pattern such as fiducials or chessboard) on left and right sensors, based on that an initial homography transformation and cropping is applied to the pair of images to achieve a high accuracy alignment between the two in executed. This is similar to finding the epipolar geometry between two sensors and bringing the two frames into a single plane through calibration to have: (1) Identical scales of the captured geometry, through virtual identical focal length, (2) Identical peripheral alignment of captured scene, through undistortion, and (3) Identical vertical alignment of captured frames, through homography (projective) transformation. The new calibrated frames (rectified frames) may be used for subsequent digital 3D magnification and visualization processes, as previously described.)

Consider Claim 7. 
Liu teaches: 7. The information processing apparatus according to claim 1, wherein the processor performs display in the first display aspect in a case in which the size of the extracted region is smaller than a first reference size, and performs display in the second display aspect in a case in which the size of the extracted region is larger than the first reference size. (Liu: [0050] In another example, machine learning algorithms are used for determining a center of cropping for the left image, or a center of cropping for the right image, or both centers, during the digital magnification process. In one aspect, object recognition and localization based on machine learning (e.g. recognize surgical field, or recognize surgical instrument, or recognize tissues, etc.) may determine at least one center of the cropping. For example, the surgical bed is recognized and localized based on the left image, and a location within the surgical bed (e.g. centroid) is assigned to be the center of cropping for the left image, and the center of cropping for the right image is calculated based on the center of cropping for the left image and the desirable binocular overlap to be maintained. [0051]-[0052], [0092] The system 300 may need stereoscopic calibration to enable accurate 3D digital magnification. In one example, after mechanical fixture to achieve vertical calibration, a single calibration (through repeated capture of similar calibration pattern such as fiducials or chessboard) on left and right sensors, based on that an initial homography transformation and cropping is applied to the pair of images to achieve a high accuracy alignment between the two in executed. This is similar to finding the epipolar geometry between two sensors and bringing the two frames into a single plane through calibration to have: (1) Identical scales of the captured geometry, through virtual identical focal length, (2) Identical peripheral alignment of captured scene, through undistortion, and (3) Identical vertical alignment of captured frames, through homography (projective) transformation. The new calibrated frames (rectified frames) may be used for subsequent digital 3D magnification and visualization processes, as previously described.)
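For illustration only, the size-dependent switching recited in claim 7 can be sketched as a simple threshold test on the extracted region. This sketch reflects the claim language, not any implementation disclosed by the applicant or by Liu. The names `FIRST_ASPECT`, `SECOND_ASPECT`, and `choose_display_aspect` are hypothetical, measuring region size as a pixel count is an assumption, and so is the handling of a region exactly equal to the reference size, which the claim leaves open.

```python
import numpy as np

# Hypothetical labels for the two display aspects.
FIRST_ASPECT = "position_marker"   # e.g. a frame that informs of the position
SECOND_ASPECT = "region_overlay"   # e.g. the region information itself


def choose_display_aspect(mask, reference_size_px):
    """Pick a display aspect from the size of the extracted region.

    Size is taken as the number of region pixels (an assumption).
    A region exactly at the reference size falls into the second
    aspect here; the claim does not address that case.
    """
    region_size = int(np.count_nonzero(mask))
    if region_size < reference_size_px:
        return FIRST_ASPECT
    return SECOND_ASPECT


small = np.zeros((64, 64), dtype=bool)
small[10:12, 10:12] = True          # 4 pixels
large = np.zeros((64, 64), dtype=bool)
large[10:30, 10:30] = True          # 400 pixels
print(choose_display_aspect(small, 100), choose_display_aspect(large, 100))
```

The same test could be applied to the on-screen display size instead of the extracted size, which is the variant recited in claim 10.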

Consider Claim 8. 
Liu teaches: 8. The information processing apparatus according to claim 7, wherein the first display aspect is an aspect which causes the information for informing of the position of the region to be displayed with a size set in advance. (Liu: [0043] FIG. 6 depicts a schematic view of a digitally magnified stereo images with preservation of vertical alignment configuration 600 whereas the digital magnification is applied, binocular overlap between the cropped left images and cropped right images gradually decreases. The digitally magnified stereo images with preservation of vertical alignment configuration 600 includes digitally magnified right images 610 a are vertically aligned with the digitally magnified left images 610 b. For example, at a 2.3× magnification, the binocular overlap decreases from 75% to 50% resulting in a decrease in 3D visualization. At a 5.3× magnification, the binocular overlap decreases from 75% to 0%. The vertical alignment preservation without the preservation of binocular overlap may result in the gradual decrease in binocular overlap with each digital magnification. [0044] After executing a first digital magnification at a first digital magnification level on the first image captured by the first image sensor 330 b and on the second image captured by the second image sensor 330 a, the controller 310 may maintain the binocular overlap generated by adjusting the cropping of the first image and the second image to satisfy the overlap threshold. In one aspect, during the digital magnification process a fixed binocular overlap number is maintained, such as 80%, 90% or 100%. In another aspect, during the digital magnification process a range of binocular overlap number is maintained, such as 60%-90%. [0045])

Consider Claim 9. 
Liu teaches: 9. The information processing apparatus according to claim 7, wherein the first display aspect is an aspect which causes the information for informing of the position of the region to be displayed with a fixed size set in advance. (Liu: [0043]-[0045], digital magnification level. [0046] After executing each previous digital magnification at each previous digital magnification level on the first image and the second image, the controller 310 may maintain the binocular overlap and the vertical alignment determined when executing the first digital magnification at the first digital magnification level on the first image and the second image. The controller 310 may continue to maintain the binocular overlap and the vertical alignment determined from the adjusting of the cropping of the first image and the second image to satisfy the overlap threshold after executing the first digital magnification at the first digital magnification level on the first image and the second image for each subsequent digital magnification level. Each subsequent digital magnification level is increased from each previous digital magnification level. For example, the overlap threshold may be satisfied when the binocular overlap includes 75% overlap of the first image and the second image is maintained for each subsequent digital magnification at each subsequent digital magnification level. In one embodiment, each subsequent digital magnification from the previous magnification level (e.g. increase from 1× to 2×, and increase 2× to 4×) may be a recursive function. [0047] The controller 310 may execute first digital magnification at the first digital magnification level on a non-concentric portion of the first image and a non-concentric portion of the second image. The non-concentric portion of the first image and the second image is a portion of the first image and the second image that differs from a center of the first image and the second image. The controller 310 may adjust the cropping of the first image and the second image to provide binocular overlap of the non-concentric portion of the first image and the non-concentric portion of the second image. The binocular overlap of the non-concentric portion of the first image and the non-concentric portion of the second image satisfies the overlap threshold either specified as a fixed number or a range. The controller 310 may continue to crop a non-concentric portion of the first image and a non-concentric portion of the second image for each subsequent digital magnification at each subsequent digital magnification level. The binocular overlap of the non-concentric portion of the first image and the non-concentric portion of the second image is maintained from the first digital magnification at the first digital magnification level. [0048] The non-concentric portion of the first image and the non-concentric portion of the second image may be resized to display to the user. In one aspect, at each magnification level, a first center of cropping of the non-concentric portion of the first image and a second center of cropping of the non-concentric portion of the second image are determined by the system 300. In one embodiment, the first center of cropping is fixed at the particular part of the first image, and second center of cropping at each magnification level is determined based on the location of the corresponding first center of cropping and the targeted binocular overlap. It should be appreciated that in some embodiment and at one or more magnification level, the digital magnification on either left image or right image may be concentric. For example, digital magnification on the left image is concentric but the digital magnification on the right image is non-concentric to maintain the binocular overlap.)

Consider Claim 10. 
Liu teaches:10. The information processing apparatus according to claim 1, wherein the processor performs display in the first display aspect in a case in which the display size of the region displayed on the display screen is smaller than a second reference size, and performs display in the second display aspect in a case in which the display size is larger than the second reference size. (Liu: [0030] The controller 310 may resize the cropped first image to the original size of the first image captured by the first image sensor 330 a and the cropped second image to the original size of the second image captured by the second image sensor 330 b. The cropped first image as resized and the cropped second image resized includes the binocular overlap of the first image and the second image. The controller 310 may instruct the near- eye 3D display 320 to display the resized and cropped first image and the resized and cropped second image that includes the binocular overlap to the user. The displayed resized and cropped first image and the resized and cropped second image display the 3D image at the digital magnification to the user. It should be appreciated that in one embodiment the controller 310 may crop the first image captured by the first image sensor 330 a, to generate both left cropped image and right cropped image. In this embodiment, the second image captured by the second image sensor 330 b is not used. [0094] Autofocus can be achieved through mechanical structure such as motors/actuators or through liquid lenses. In one example, the controller 310 may conduct brightness assessment to find a high contrast image, high frequency values, etc. through a method of Sobel filter or similar that extracts edges and high frequency features of the left and/or right images. 
The autofocus lens may test a large range of focus (course focus) to find a course focus, and subsequently conduct a smaller range of focus (fine focus) based in the neighborhood near the course focus. In one example, the right lens 340 a and left lens 340 b may be assigned to 2 ends of the focus range and progress towards the middle. Once the an optical focus value is found, both lenses will assigned the same value or similar value, to avoid 2 lenses focusing on different image planes. [0095] In another example, the controller 310 may conduct using calibration and disparity map to find the working distance of desired object. The controller 310 may use previously calibrated frames to extract a partial or full disparity or depth map. Then controller 310 may use a region of interest or a point in a specific part of the image to assess the distance to the desired object or plane of operation (working distance), and use the distance to determine proper value for autofocus from either a distance dependent equation or a pre-determined look-up-table (LUT).)

Consider Claim 11. 
Liu teaches:11. The information processing apparatus according to claim 10, wherein the processor receives an instruction to perform enlargement display and reduction display, changes a display magnification of the display screen in accordance with the received instruction, and switches between the first display aspect and the second display aspect in accordance with the display magnification. (Liu: [0030] The controller 310 may resize the cropped first image to the original size of the first image captured by the first image sensor 330 a and the cropped second image to the original size of the second image captured by the second image sensor 330 b. The cropped first image as resized and the cropped second image resized includes the binocular overlap of the first image and the second image. The controller 310 may instruct the near- eye 3D display 320 to display the resized and cropped first image and the resized and cropped second image that includes the binocular overlap to the user. The displayed resized and cropped first image and the resized and cropped second image display the 3D image at the digital magnification to the user. It should be appreciated that in one embodiment the controller 310 may crop the first image captured by the first image sensor 330 a, to generate both left cropped image and right cropped image. In this embodiment, the second image captured by the second image sensor 330 b is not used. [0043] FIG. 6 depicts a schematic view of a digitally magnified stereo images with preservation of vertical alignment configuration 600 whereas the digital magnification is applied, binocular overlap between the cropped left images and cropped right images gradually decreases. The digitally magnified stereo images with preservation of vertical alignment configuration 600 includes digitally magnified right images 610 a are vertically aligned with the digitally magnified left images 610 b. 
For example, at a 2.3× magnification, the binocular overlap decreases from 75% to 50% resulting in a decrease in 3D visualization. At a 5.3× magnification, the binocular overlap decreases from 75% to 0%. The vertical alignment preservation without the preservation of binocular overlap may result in the gradual decrease in binocular overlap with each digital magnification. [0044] After executing a first digital magnification at a first digital magnification level on the first image captured by the first image sensor 330 b and on the second image captured by the second image sensor 330 a, the controller 310 may maintain the binocular overlap generated by adjusting the cropping of the first image and the second image to satisfy the overlap threshold. In one aspect, during the digital magnification process a fixed binocular overlap number is maintained, such as 80%, 90% or 100%. In another aspect, during the digital magnification process a range of binocular overlap number is maintained, such as 60%-90%. [0045], [0095] In another example, the controller 310 may conduct using calibration and disparity map to find the working distance of desired object. The controller 310 may use previously calibrated frames to extract a partial or full disparity or depth map. Then controller 310 may use a region of interest or a point in a specific part of the image to assess the distance to the desired object or plane of operation (working distance), and use the distance to determine proper value for autofocus from either a distance dependent equation or a pre-determined look-up-table (LUT).)

Consider Claim 12. 
Liu teaches:12. The information processing apparatus according to claim 11, further comprising: an input device that receives an input of the instruction to perform the enlargement display and the reduction display. (Liu: [0030] The controller 310 may resize the cropped first image to the original size of the first image captured by the first image sensor 330 a and the cropped second image to the original size of the second image captured by the second image sensor 330 b. The cropped first image as resized and the cropped second image resized includes the binocular overlap of the first image and the second image. The controller 310 may instruct the near- eye 3D display 320 to display the resized and cropped first image and the resized and cropped second image that includes the binocular overlap to the user. The displayed resized and cropped first image and the resized and cropped second image display the 3D image at the digital magnification to the user. It should be appreciated that in one embodiment the controller 310 may crop the first image captured by the first image sensor 330 a, to generate both left cropped image and right cropped image. In this embodiment, the second image captured by the second image sensor 330 b is not used. [0043] FIG. 6 depicts a schematic view of a digitally magnified stereo images with preservation of vertical alignment configuration 600 whereas the digital magnification is applied, binocular overlap between the cropped left images and cropped right images gradually decreases. The digitally magnified stereo images with preservation of vertical alignment configuration 600 includes digitally magnified right images 610 a are vertically aligned with the digitally magnified left images 610 b. For example, at a 2.3× magnification, the binocular overlap decreases from 75% to 50% resulting in a decrease in 3D visualization. At a 5.3× magnification, the binocular overlap decreases from 75% to 0%. 
The vertical alignment preservation without the preservation of binocular overlap may result in the gradual decrease in binocular overlap with each digital magnification. [0044] After executing a first digital magnification at a first digital magnification level on the first image captured by the first image sensor 330 b and on the second image captured by the second image sensor 330 a, the controller 310 may maintain the binocular overlap generated by adjusting the cropping of the first image and the second image to satisfy the overlap threshold. In one aspect, during the digital magnification process a fixed binocular overlap number is maintained, such as 80%, 90% or 100%. In another aspect, during the digital magnification process a range of binocular overlap number is maintained, such as 60%-90%. [0045], [0095] In another example, the controller 310 may conduct using calibration and disparity map to find the working distance of desired object. The controller 310 may use previously calibrated frames to extract a partial or full disparity or depth map. Then controller 310 may use a region of interest or a point in a specific part of the image to assess the distance to the desired object or plane of operation (working distance), and use the distance to determine proper value for autofocus from either a distance dependent equation or a pre-determined look-up-table (LUT).)

Consider Claim 13. 
Liu teaches:13. The information processing apparatus according to claim 1, further comprising: a display device that displays the result of the processing. (Liu: [0028]-[0030], [0031] In one aspect, the display 320 is a near-eye display. In one embodiment, the display 320 is a 2D display. In another embodiment, the display 320 is a 3D display. It should be further appreciated that the near-eye display 320 may comprise LCD (liquid crystal) microdisplays, LED (light emitting diode) microdisplays, organic LED (OLED) microdisplays, liquid crystal on silicon (LCOS) microdisplays, retinal scanning displays, virtual retinal displays, optical see-through displays, video see-through displays, convertible video-optical see-through displays, wearable projection displays, projection display, and the like. It should be the appreciated that the display 320 may be stereoscopic to enable displaying of 3D content. In another embodiment, the display 320 is a projection display. It should be appreciated that the display 320 may be a monitor placed near the user. [0032] It should be further appreciated that the display 320 may be a 3D monitor placed near the user and the user will wear a polarizing glass or active shutter glasses. It should be further appreciated that the display 320 may be a half transparent mirror placed near the user to reflect the image projected by a projector. It should be further be appreciated that the said projector may be 2D or 3D. It should be further appreciated that the said projector may be used with the user wearing a polarizing glass or active shutter glasses. In one embodiment, the display 320 is a flat panel 2D monitor or TV. In another embodiment, the display 320 is a flat panel 3D monitor or 3D TV. The 3D monitor/TV may need to work with passive polarizers or active shutter glasses. In one aspect, the 3D monitor/TV is glass-free. It should be appreciated that the display 320 can be a touchscreen, or a projector. 
In one example, the display 320 comprises a half transparent mirror that can reflect projection of images to the eyes of the user. The images being projected may be 3D, and the user may wear 3D glasses (e.g. polarizer; active shutter 3D glasses) to visualize the 3D image data reflected by the half transparent mirror. The half transparent mirror may be placed on top of the surgical field to allow the user to see through the half transparent mirror to visualize the surgical field.)

Consider Claim 14. 
Liu teaches:14. The information processing apparatus according to claim 1, wherein the image is an X-ray transmission image. (Liu: [0088] In yet another embodiment, the digital magnification wearable device configuration 800 further includes additional input devices, such as a foot pedal, a wired or a wireless remote control, one or more button, a touch screen, microphone with voice control, gesture control device such as Microsoft Kinect, etc. It should be appreciated that the controller can be useable or disposable. It should be appreciated that a sterile sheet or wrap may be placed around the input device. In yet another embodiment, the digital magnification wearable device configuration 800 may display medical images such as MRI (magnetic resonance image) image data, computed tomography (CT) image data, positron emission tomography (PET) image data, single-photon emission computed tomography (SPECT), PET/CT, SPECT/CT, PET/MRI, gamma scintigraphy, X-ray radiography, ultrasound, and the like. In yet another embodiment, the digital magnification wearable device configuration 800 may include digital storage hardware, to enable recording the magnification data, and/or the original image data from image sensors, and/or audio data, and/or other sensor data.)

Consider Claim 16. 
Liu teaches:16. The information processing apparatus according to claim 1, wherein the detection target is a defect. (Liu: [0023] The present invention describes the apparatus, systems, and methods for constructing augmented reality devices for medical and dental magnification. One of the key concepts in 3D imaging and visualization is binocular overlap 120 c. Binocular overlap 120 c describes the overlap between the image as seen by the left eye 120 b, versus the image as seen by the right eye 120 a. For human being, a binocular overlap 120 c is approximately 70%. [0024] FIG. 1B, FIG. 1C [0053] In one example, U-Net is used, which has a contraction path and expansion path. The contraction path has consecutive convolutional layers and max-pooling layer. The expansion path performs up-conversion and may have convolutional layers. The convolutional layer(s) prior to the output maps the feature vector to the required number of target classes in the final segmentation output. In one example, V-net is implemented for image segmentation to isolate the organ or tissue of interest (e.g. vertebral bodies). In one example, Autoencoder based Deep Learning Architecture is used for image segmentation to isolate the organ or tissue of interest. In one example, backpropagation is used for training the neural networks. [0055] In another embodiment, a Generative Adversarial Network (GAN) is used for image recognition or image segmentation, or image registration. In one example, the GAN performs image segmentation to isolate the organ or tissue of interest. In the GAN, a generator is implemented through neural network to models a transform function which takes in a random variable as input and follows the targeted distribution when trained. A discriminator is implemented through another neural network simultaneously to distinguish between generated data and true data. 
In one example, the first network tries to maximize the final classification error between generated data and true data while the second network attempts to minimize the same error. Both networks may improve after iterations of the training process.)

Consider Claim 18. 
Liu teaches:18. The information processing apparatus according to claim 1, wherein the second display aspect includes causing a contour line of the extracted region to be displayed. (Liu: [0064] In another embodiment, the digital magnification method further comprises of an additional condition to satisfy: the left cropped image shares the same geometrical center as that of the left original image. The right cropped image may be calculated by the controller 310 and generated accordingly by the controller 310 based on the cropping of the left cropped image, while preserving the binocular overlap and binocular vertical alignment. The benefit of this implementation is: the digital magnification process may be coaxial along the center of the left image (the optical axis), and the progression of digital magnification may align with the line of sight of the user's left eye. Alternatively, the cropped right image may share the same center as the right original image. The left cropped image may be calculated by the controller 310 and generated accordingly by the controller 310 based on the position and cropping of the right cropped image, while preserving the binocular overlap and binocular vertical alignment.
[0065] In another embodiment, the acceptable binocular overlap of cropped images may be specified as a range, rather than a specific number. For instance, the binocular overlap of cropped left and right images may be specified to be within a range between 60% to 90%. Any number between 60% and 90% may be considered satisfactory for digital magnification. With an acceptable range of binocular overlap as a guideline for cropping left and right images, the left image sensor 330 b with the left lens 340 b that are worn by the user controller 310 may capture a left image. The right image sensor 330 a and the right lens 340 a that are worn by the user may capture a right image. The left image and the right image may be provided to the controller 310.)

Consider Claim 20. 
Liu teaches:20. The information processing apparatus according to claim 1, wherein, in a case where the size of the extracted region is smaller than a first reference size, the processor displays the region information, and displays the information for informing of the position of the region with the first display aspect. (Liu: [0030] The controller 310 may resize the cropped first image to the original size of the first image captured by the first image sensor 330 a and the cropped second image to the original size of the second image captured by the second image sensor 330 b. The cropped first image as resized and the cropped second image resized includes the binocular overlap of the first image and the second image. The controller 310 may instruct the near- eye 3D display 320 to display the resized and cropped first image and the resized and cropped second image that includes the binocular overlap to the user. The displayed resized and cropped first image and the resized and cropped second image display the 3D image at the digital magnification to the user. It should be appreciated that in one embodiment the controller 310 may crop the first image captured by the first image sensor 330 a, to generate both left cropped image and right cropped image. In this embodiment, the second image captured by the second image sensor 330 b is not used. [0043] FIG. 6 depicts a schematic view of a digitally magnified stereo images with preservation of vertical alignment configuration 600 whereas the digital magnification is applied, binocular overlap between the cropped left images and cropped right images gradually decreases. The digitally magnified stereo images with preservation of vertical alignment configuration 600 includes digitally magnified right images 610 a are vertically aligned with the digitally magnified left images 610 b. For example, at a 2.3× magnification, the binocular overlap decreases from 75% to 50% resulting in a decrease in 3D visualization. 
At a 5.3× magnification, the binocular overlap decreases from 75% to 0%. The vertical alignment preservation without the preservation of binocular overlap may result in the gradual decrease in binocular overlap with each digital magnification. [0044] After executing a first digital magnification at a first digital magnification level on the first image captured by the first image sensor 330 b and on the second image captured by the second image sensor 330 a, the controller 310 may maintain the binocular overlap generated by adjusting the cropping of the first image and the second image to satisfy the overlap threshold. In one aspect, during the digital magnification process a fixed binocular overlap number is maintained, such as 80%, 90% or 100%. In another aspect, during the digital magnification process a range of binocular overlap number is maintained, such as 60%-90%. [0045], [0095] In another example, the controller 310 may conduct using calibration and disparity map to find the working distance of desired object. The controller 310 may use previously calibrated frames to extract a partial or full disparity or depth map. Then controller 310 may use a region of interest or a point in a specific part of the image to assess the distance to the desired object or plane of operation (working distance), and use the distance to determine proper value for autofocus from either a distance dependent equation or a pre-determined look-up-table (LUT).)

Consider Claim 22. 
Liu teaches:22. A non-transitory, computer-readable tangible recording medium on which a program for causing, when read by a computer, the computer to execute the information processing method according to claim 21 is recorded. (Liu: [0019], [0097] The controller 310 comprises the hardware and software necessary to implement the aforementioned methods. In one embodiment, the controller 310 involves a computer-readable medium comprising processor-executable instructions configured to implement one or more of the techniques presented herein. An example embodiment of a computer-readable medium or a computer-readable device comprises a computer-readable medium, such as a SSD, CD-R, DVD-R, flash drive, a platter of a hard disk drive, etc., on which is encoded computer-readable data. This computer-readable data, such as binary data comprising at least one of a zero or a one, in turn comprises a set of computer instructions configured to operate according to one or more of the principles set forth herein. In some embodiments, the set of computer instructions are configured to perform a method, such as at least some of the exemplary methods described herein, for example. In some embodiments, the set of computer instructions are configured to implement a system, such as at least some of the exemplary systems described herein, for example. Many such computer-readable media are devised by those of ordinary skill in the art that are configured to operate in accordance with the techniques presented herein. [0101])

Consider Claim 24. 
Liu teaches:24. A non-transitory, computer-readable tangible recording medium on which a program for causing, when read by a computer, the computer to execute the information processing method according to claim 23 is recorded. (Liu: [0019], [0097] The controller 310 comprises the hardware and software necessary to implement the aforementioned methods. In one embodiment, the controller 310 involves a computer-readable medium comprising processor-executable instructions configured to implement one or more of the techniques presented herein. An example embodiment of a computer-readable medium or a computer-readable device comprises a computer-readable medium, such as a SSD, CD-R, DVD-R, flash drive, a platter of a hard disk drive, etc., on which is encoded computer-readable data. This computer-readable data, such as binary data comprising at least one of a zero or a one, in turn comprises a set of computer instructions configured to operate according to one or more of the principles set forth herein. In some embodiments, the set of computer instructions are configured to perform a method, such as at least some of the exemplary methods described herein, for example. In some embodiments, the set of computer instructions are configured to implement a system, such as at least some of the exemplary systems described herein, for example. Many such computer-readable media are devised by those of ordinary skill in the art that are configured to operate in accordance with the techniques presented herein. [0101])

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

Claims 6 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (US PGPub US20210377505A1, with priority to May 26, 2021, hereby referred to as “Liu”), in view of Sevastopolskiy et al. (US PGPub US20220157014A1, with foreign priority to November 19, 2020, hereby referred to as “Sevastopolskiy”).

Consider Claims 6 and 19.
Liu does teach: The information processing apparatus of Claim 1. 
Liu does not teach: the features claimed in claim 6 or claim 19, as presented below: 
- 6. wherein the second display aspect includes segmentation mask display in which the extracted region is filled and displayed.
- 19. wherein the first display aspect includes causing the information for informing of the position of the region to be displayed in a blinking manner.
Sevastopolskiy teaches: 
1. An information processing apparatus comprising: a processor, wherein the processor acquires an image, / 21. An information processing method executed by an information processing apparatus, the method comprising: acquiring an image; (Sevastopolskiy: The disclosure provides a method for generating relightable 3D portrait using a deep neural network and a computing device implementing the method. A possibility of obtaining, in real time and on computing devices having limited processing resources, realistically relighted 3D portraits having quality higher or at least comparable to quality achieved by prior art solutions, but without utilizing complex and costly equipment is provided. A method for rendering a relighted 3D portrait of a person, the method including: receiving an input defining a camera viewpoint and lighting conditions, rasterizing latent descriptors of a 3D point cloud at different resolutions based on the camera viewpoint to obtain rasterized images, wherein the 3D point cloud is generated based on a sequence of images captured by a camera with a blinking flash while moving the camera at least partly around an upper body, the sequence of images comprising a set of flash images and a set of no-flash images, processing the rasterized images with a deep neural network to predict albedo, normals, environmental shadow maps, and segmentation mask for the received camera viewpoint, and fusing the predicted albedo, normals, environmental shadow maps, and segmentation mask into the relighted 3D portrait based on the lighting conditions.)
1. executes processing of region extraction to extract a region of a detection target from the image, / 21. executing processing of region extraction to extract a region of a detection target from the image; (Sevastopolskiy: [0029] FIG. 2 is a flowchart illustrating an example method for rendering a relighted 3D portrait of a person according to various embodiments. A sequence of images featuring an upper body of the person is captured at S200 by a camera with blinking flash. During the capturing (e.g. photographing or video capturing) the camera may be moved by the person or a third person at least partly around the upper body of the person. The resulting sequence of images comprises a set of flash images and a set of no-flash images. [0030] A 3D point cloud is generated at S205 based on the captured sequence of images. The 3D point cloud may be generated using, for example, Structure-from-Motion (SfM) technique or any other known techniques allowing to reconstruct 3D structure of a scene or object based on the sequence of 2D images of that scene or object. During the training stage that will be described in greater detail below, the method may include augmenting each point in the 3D point cloud 10 with latent descriptor being a multi-dimensional latent vector characterizing properties of the point.)
1. generates position information of the region from region information of the extracted region, / 21. generating position information of the region from region information of the extracted region; (Sevastopolskiy: [0025] The camera viewpoint may be specified in an input directly, for example, and without limitation, in the form of particular coordinates in AR/VR environment, camera direction, focal length and other camera intrinsic parameters, or indirectly, for example, and without limitation, a touch input to a corresponding viewpoint in AR/VR environment, an input corresponding to a viewpoint based on a position and an orientation of a computing device (e.g. a smartphone, AR/VR smart glasses) used for manipulating in AR/VR environment or for displaying certain information therein. [0031] One or more inputs defining a camera viewpoint and/or lighting conditions are received at S210. Latent descriptors of the 3D point cloud, which are previously obtained during the training stage, are rasterized at S215 at different resolutions according to the camera viewpoint to obtain rasterized images. Neural rendering is performed by processing at S220 the rasterized images with a deep neural network to predict albedo, normals, environmental shadow maps, and segmentation mask for the received (e.g., new) camera viewpoint. Before the inference stage the deep neural network is trained as will be discussed in greater detail below with reference to FIG. 4. The relighted 3D portrait is rendered by fusing at S225 the predicted albedo, normals, environmental shadow maps, and segmentation mask into the relighted 3D portrait according to the received lighting conditions. [0032] FIG. 3 is a flowchart illustrating example details of operation S205 of generating the 3D point cloud based on the sequence of images according to various embodiments. Generating the 3D point cloud at S205 may further include estimating S205.1 camera viewpoints with which the sequence of images is captured. 
The camera viewpoints estimated at step S205.1 may be used for generating the 3D point cloud as well as for training the deep neural network, latent descriptors for the 3D point cloud, and auxiliary parameters during the training stage. The camera viewpoint and the lighting conditions received at S210 (e.g., new or arbitrary camera viewpoint and lighting conditions requested by a user) via the input differ from camera viewpoints estimated at S205.1 and lighting conditions with which the sequence of images is captured (e.g., current camera viewpoints and lighting conditions during capturing). Estimating camera viewpoints at S205.1 may be performed using, for example, SfM. Points of the originally generated 3D point cloud at least partly corresponding to an upper body of the person.)
1. and displays a result of the processing of the region extraction by switching between a first display aspect in which information for informing of a position of the region is displayed on a display screen based on the position information in a visually appealing manner / 21. and displaying a result of the processing of the region extraction by switching between a first display aspect in which information for informing of a position of the region is displayed on a display screen based on the position information in a visually appealing manner (Sevastopolskiy: [0018] FIG. 1 is a diagram illustrating an example of an overall processing pipeline according to various embodiments. The illustrated processing pipeline may be implemented fully or partly on currently available computing devices such as, for example, and without limitation, a smartphone, a tablet, VR (Virtual Reality) display device, AR (Augmented Reality) display device, 3D display device, etc. VR, AR, and 3D display devices may, for example, be in the form of smart glasses. When implemented partly, the computationally demanding operations (e.g., operations of training stage) may be performed in the cloud, for example by a server. [0019] For generating a relighted 3D portrait of a person the disclosed system may utilize a sequence of images featuring a person. Such sequence of images may be captured by conventional cameras of currently available handheld devices (e.g. by a smartphone camera). The sequence of images may be provided to the pipeline from user gallery (when access to the gallery is permitted by the user) or downloaded from a web-resource. However, it will be understood that the source of the sequence of images is not limited.
[0058] Given the output of the deep neural network, the final rendered image (e.g., the relighted 3D portrait) is defined by fusing the albedo, normals, and environmental shadow maps, and segmentation mask as prescribed by the lighting model (4) (e.g., according to the lighting conditions): Equation 5, [0059]-[0060])
6. The information processing apparatus according to claim 1, wherein the second display aspect includes segmentation mask display in which the extracted region is filled and displayed. (Sevastopolskiy: [0053]-[0061], [0053] The overall processing pipeline according to various embodiments is outlined in FIG. 1. The last layer of the deep neural network may output an eight-channel tensor with several groups: [0058] Given the output of the deep neural network, the final rendered image (e.g., the relighted 3D portrait) is defined by fusing the albedo, normals, and environmental shadow maps, and segmentation mask as prescribed by the lighting model (4) (e.g., according to the lighting conditions): Equation 5, [0059]-[0060])
19. The information processing apparatus according to claim 1, wherein the first display aspect includes causing the information for informing of the position of the region to be displayed in a blinking manner. (Sevastopolskiy: [0020]-[0021] [0029] FIG. 2 is a flowchart illustrating an example method for rendering a relighted 3D portrait of a person according to various embodiments. A sequence of images featuring an upper body of the person is captured at S200 by a camera with blinking flash. During the capturing (e.g. photographing or video capturing) the camera may be moved by the person or a third person at least partly around the upper body of the person. The resulting sequence of images comprises a set of flash images and a set of no-flash images. [0076] FIG. 5 is a block diagram illustrating an example configuration of a computing device 50 configured to render 3D portrait of a person according to various embodiments. The computing device 50 comprises processor (e.g., including processing circuitry) 50.1, camera 50.2 (camera 50.2 is optional; thus in FIG. 5 it is outlined with a dotted line), and memory 50.3 interconnected with each other. Illustration of interconnections between processor 50.1, camera 50.2, and memory 50.3 should not be construed as the limitation, because it is clear that processor 50.1, camera 50.2, and memory 50.3 may be interconnected differently. Camera 50.2 is configured to capture a sequence of images of a person using blinking flash. Memory 50.3 is configured to store processor-executable instructions instructing the computing device 50 to perform any step or substep of the disclosed method, as well as weights of the deep neural network, latent descriptors, and auxiliary parameters obtained during the training stage. The processor 50.1, upon execution of the processor-executable instructions, is configured to cause the computing device 50 to carry out the disclosed method for rendering a relighted 3D portrait of a person.)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify Liu’s machine learning method and system for digital magnification of 3D images with the improved relighted rendering of Sevastopolskiy. The determination of obviousness is predicated upon the following findings: they are both directed towards the field of improved machine learning algorithms for the rendering and display of 3D images isolating regions of interest using enhanced segmentation masking operations, and one skilled in the art would have been motivated to modify Liu in this manner in order to improve the overall quality of rendered and displayed data to incorporate in realistic relightable 3D models. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and/or programming techniques, without changing a “fundamental” operating principle of Liu, while the teaching of Sevastopolskiy continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of using enhanced segmentation masking operations and lighting conditions in order to improve the overall machine learning algorithms for the rendering and display of 3D images isolating regions of interest. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.

Claims 15-17 are further rejected under 35 U.S.C. 103 as being unpatentable over Liu et al. (US PGPub US20210377505A1, with priority to May 26, 2021, hereby referred to as “Liu”), in view of Kaneko et al. (US PGPub US 20210072165, filed November 17, 2020), hereby referred to as “Kaneko”.

Consider Claims 15-17.
Liu does teach: The information processing apparatus of Claim 1 and Claim 16. 
Liu does not teach: the features claimed in claims 15 or 17:
15. The information processing apparatus according to claim 1, wherein the image is an X-ray transmission image of a cast metal component, a forged metal component, or a welded metal component.
17. The information processing apparatus according to claim 16, wherein the defect includes at least one of an air bubble, a porosity, foreign material less dense (FMLD), or foreign material more dense (FMMD).
Kaneko teaches: 
1. An information processing apparatus comprising: a processor, wherein the processor acquires an image, (Kaneko: [0036]-[0052], Figures 1 and 2, [0036] FIG. 1 is a block diagram illustrating a defect inspection device according to an embodiment of the present invention. [0037] A defect inspection device 10 according to this embodiment is a device with which a user (image interpreter) performs nondestructive inspection of an industrial product such as a casting by using a radiographic image of the industrial product. The inspection-target industrial product is hereinafter referred to as an object OBJ. [0046] Next, the defect detection and display function of the defect inspection device will be described. As illustrated in FIG. 1, the control unit 12 includes an image acquisition unit 12A, a defect detection unit 12C, a defect information acquisition unit 12B, a defect selection unit 12D, and a display control unit 12E. [0047] First, for defect detection, the image acquisition unit 12A acquires a radiographic image (for example, an X-ray transmission image) of the object OBJ from the imaging system 100 or the like.)
1. executes processing of region extraction to extract a region of a detection target from the image, generates position information of the region from region information of the extracted region, (Kaneko: [0045] The defect inspection device 10 is capable of accepting an input of a radiographic image from the imaging system 100 via the communication I/F 20. The method for inputting a radiographic image to the defect inspection device 10 is not limited to communication via a network. For example, a USB (Universal Serial Bus) cable, Bluetooth (registered trademark), infrared communication, or the like may be used. Alternatively, a recording medium (for example, a memory card) removably attachable to and readable by the defect inspection device 10 may store a radiographic image, and an input of the radiographic image may be accepted via the recording medium. [0048] The defect detection unit 12C analyzes the radiographic image of the object OBJ, checks the radiographic image against design data (for example, CAD (Computer-Aided Design) data) of the object OBJ, and detects defects included in the object OBJ. [0049] Defects occurring in industrial products such as castings can be classified according to the shape and cause. Examples of the type of defect occurring in industrial products such as castings include stains, cracks, chipping, defects caused by contamination with foreign substances and dissimilar kinds of metals, and bubble-like defects caused by contamination of a mold with air during casting.)
1. and displays a result of the processing of the region extraction by switching between a first display aspect in which information for informing of a position of the region is displayed on a display screen based on the position information in a visually appealing manner. (Kaneko: [0046] Next, the defect detection and display function of the defect inspection device will be described. As illustrated in FIG. 1, the control unit 12 includes an image acquisition unit 12A, a defect detection unit 12C, a defect information acquisition unit 12B, a defect selection unit 12D, and a display control unit 12E. [0050] The defect detection unit 12C identifies the type of defect on the basis of the size and shape of defects detected by image analysis, and the luminance differences between the defect and neighboring pixels, which are caused by the transmittance and scattering of radiation through the object OBJ. Then, the defect detection unit 12C assigns an identifier for identifying defects and generates defect information DAT1 in association with information on the type of defect. The defect detection unit 12C generates the defect information DAT1 for each defect and stores the defect information DAT1 in the storage unit 18 in association with the radiographic image. [0051], [0060]-[0061], [0059] Then, to display defects, the image acquisition unit 12A acquires a radiographic image of the object OBJ from the storage unit 18, and the defect information acquisition unit 12B acquires the defect information DAT1 associated with the radiographic image of the object OBJ.)
15. The information processing apparatus according to claim 1, wherein the image is an X-ray transmission image of a cast metal component, a forged metal component, or a welded metal component. (Kaneko: [0045] The defect inspection device 10 is capable of accepting an input of a radiographic image from the imaging system 100 via the communication I/F 20. The method for inputting a radiographic image to the defect inspection device 10 is not limited to communication via a network. For example, a USB (Universal Serial Bus) cable, Bluetooth (registered trademark), infrared communication, or the like may be used. Alternatively, a recording medium (for example, a memory card) removably attachable to and readable by the defect inspection device 10 may store a radiographic image, and an input of the radiographic image may be accepted via the recording medium. [0046]-[0047] First, for defect detection, the image acquisition unit 12A acquires a radiographic image (for example, an X-ray transmission image) of the object OBJ from the imaging system 100 or the like.)
16. The information processing apparatus according to claim 1, wherein the detection target is a defect. (Kaneko: [0046]-[0047] First, for defect detection, the image acquisition unit 12A acquires a radiographic image (for example, an X-ray transmission image) of the object OBJ from the imaging system 100 or the like.)
17. The information processing apparatus according to claim 16, wherein the defect includes at least one of an air bubble, a porosity, foreign material less dense (FMLD), or foreign material more dense (FMMD). (Kaneko: [0049] Defects occurring in industrial products such as castings can be classified according to the shape and cause. Examples of the type of defect occurring in industrial products such as castings include stains, cracks, chipping, defects caused by contamination with foreign substances and dissimilar kinds of metals, and bubble-like defects caused by contamination of a mold with air during casting. [0050] The defect detection unit 12C identifies the type of defect on the basis of the size and shape of defects detected by image analysis, and the luminance differences between the defect and neighboring pixels, which are caused by the transmittance and scattering of radiation through the object OBJ. Then, the defect detection unit 12C assigns an identifier for identifying defects and generates defect information DAT1 in association with information on the type of defect. The defect detection unit 12C generates the defect information DAT1 for each defect and stores the defect information DAT1 in the storage unit 18 in association with the radiographic image. [0059] Then, to display defects, the image acquisition unit 12A acquires a radiographic image of the object OBJ from the storage unit 18, and the defect information acquisition unit 12B acquires the defect information DAT1 associated with the radiographic image of the object OBJ. [0060] The defect selection unit 12D selects defects in accordance with an instruction input from the input unit 14. The defect selection unit 12D accepts, for example, an input of selection criteria such as the defect type or size, or the thickness of the object OBJ at the position of defects, and the density of neighboring defects. 
Then, the defect selection unit 12D selects defects that match the selection criteria on the basis of the defect information DAT1 (see FIG. 4 and FIG. 5).)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to modify Liu’s machine learning method and system for digital magnification of 3D images with the improved relighted defect determination and display system of Kaneko. The determination of obviousness is predicated upon the following findings: they are both directed towards the field of using radiation imaging for defect detection and display isolating regions of interest, and one skilled in the art would have been motivated to modify Liu in this manner in order to improve the overall quality of rendered and displayed data to incorporate in defect detection across radiographic image data of industrial products. Furthermore, the prior art collectively includes each element claimed (though not all in the same reference), and one of ordinary skill in the art could have combined the elements in the manner explained above using known engineering design, interface and/or programming techniques, without changing a “fundamental” operating principle of Liu, while the teaching of Kaneko continues to perform the same function as originally taught prior to being combined, in order to produce the repeatable and predictable result of using enhanced machine learning models for region segmentation in radiographic images and applying it to the field of Kaneko and ensuring the applicability to industrial products and defect detection as well. It is for at least the aforementioned reasons that the examiner has reached a conclusion of obviousness with respect to the claim in question.
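
For readers mapping the cited Kaneko passages ([0050], [0060]) onto the claim language, the following is a minimal illustrative sketch, not taken from Kaneko or from the application, of per-defect records and criteria-based selection. The names `DefectInfo` and `select_defects`, the field layout, and the millimetre unit are all hypothetical assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DefectInfo:
    """Hypothetical stand-in for one per-defect information record."""
    defect_id: int             # identifier assigned to each detected defect
    defect_type: str           # e.g. "bubble", "crack" (labels are assumptions)
    size_mm: float             # illustrative size measure; the unit is an assumption
    position: Tuple[int, int]  # (x, y) pixel position in the radiographic image


def select_defects(
    defects: List[DefectInfo],
    defect_type: Optional[str] = None,
    min_size_mm: Optional[float] = None,
) -> List[DefectInfo]:
    """Return the defects that satisfy every criterion that was supplied."""
    selected = []
    for d in defects:
        if defect_type is not None and d.defect_type != defect_type:
            continue
        if min_size_mm is not None and d.size_mm < min_size_mm:
            continue
        selected.append(d)
    return selected


# Made-up example records, not data from any cited reference.
defects = [
    DefectInfo(1, "bubble", 0.8, (120, 45)),
    DefectInfo(2, "crack", 2.5, (300, 210)),
    DefectInfo(3, "bubble", 1.6, (64, 512)),
]

large_bubbles = select_defects(defects, defect_type="bubble", min_size_mm=1.0)
print([d.defect_id for d in large_bubbles])
```

The sketch only shows the general shape of associating a type and position with each detected region and filtering on user-supplied criteria; it does not represent how either reference or the claimed apparatus is actually implemented.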

Conclusion
The prior art made of record in form PTO-892 and not relied upon is considered pertinent to applicant's disclosure. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAHMINA ANSARI whose telephone number is 571-270-3379.  The examiner can normally be reached on IFP Flex - Monday through Friday 9 to 5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, O’NEAL MISTRY can be reached on 313-446-4912.  The fax phone numbers for the organization where this application or proceeding is assigned are 571-273-8300 for regular communications and 571-273-8300 for After Final communications. TC 2600’s customer service number is 571-272-2600.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the receptionist whose telephone number is 571-272-2600.

/Tahmina Ansari/

May 8, 2026
/TAHMINA N ANSARI/Primary Examiner, Art Unit 2674
Prosecution Timeline

Mar 26, 2024
Application Filed
May 19, 2026
Non-Final Rejection mailed — §102, §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/936,626, Patent 12639808: SYSTEMS AND METHODS FOR IMAGE PROCESSING TO DETERMINE CASE OPTIMIZATION (3y 8m to grant; granted May 26, 2026)
18/127,657, Patent 12639928: SELF-SUPERVISED LEARNING METHOD AND APPARATUS FOR IMAGE FEATURES, DEVICE, AND STORAGE MEDIUM (3y 1m to grant; granted May 26, 2026)
17/795,523, Patent 12614246: SENSOR PRIORITIZATION FOR COMPOSITE IMAGE CAPTURE (3y 9m to grant; granted Apr 28, 2026)
17/989,722, Patent 12614297: LIGHT FIELD ENCODED IMAGING METHOD AND APPARATUS FOR SCATTERING SCENE (3y 5m to grant; granted Apr 28, 2026)
18/068,590, Patent 12586249: PROCESSING APPARATUS, PROCESSING METHOD, AND STORAGE MEDIUM FOR CALIBRATING AN IMAGE CAPTURE APPARATUS (3y 3m to grant; granted Mar 24, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 86%
With Interview (+18.1%): 99%
Median Time to Grant: 2y 6m (~4m remaining)
PTA Risk: Low
Based on 881 resolved cases by this examiner. Grant probability derived from career allowance rate.