DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDSs) submitted on 06/28/2024 and 09/17/2024 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements have been considered by the examiner.
Specification
1 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, requires the specification to be written in “full, clear, concise, and exact terms.” The specification is replete with terms that are not clear, concise, and exact. The specification should be revised carefully in order to comply with 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph. An example of an unclear, inexact, or verbose term used in the specification is: “background pose”.
Claim Objections
2 Claim 20 is objected to because of the following informalities: the word “comprises” appears twice in succession. Appropriate correction is required.
Claim Rejections - 35 USC § 103
6 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
7 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
8 Claims 1-6, 8-9, 11-17, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan et al. (US 20200213532 A1) in view of Pang et al. (US 20180020204 A1).
9 Regarding claim 1, Yuan teaches an image display method, comprising:
acquiring a converted image corresponding to each video frame in a target video ([0011] reciting “In some embodiments, the method further comprises: acquiring a virtual component for the target object; and determining a position of the virtual component in the image; and synthesizing the video frame with the image based on the position of the foreground portion of the video frame in the image.”), wherein the converted image is an image obtained after converting a pixel point located in an image coordinate system in a foreground image into an augmented reality coordinate system, the foreground image is an image comprising a foreground object and extracted from the video frame ([Abstract] reciting “…configuring alpha channel values for pixels corresponding to the foreground portion and pixels corresponding to the background portion of the video frame…”), and the target video comprises a free perspective video or a light field video ([0005] reciting “…determining a position of the foreground portion of the video frame in the image…”);
acquiring a background pose of a background capturing device at a target moment, and determining a perspective image corresponding to the background pose from at least one converted image corresponding to the target moment ([0029] reciting “After a background portion of the video file is processed to make the background portion of the video frame transparent, the video file can be synthesized onto a video image captured by the user…”; [0064] reciting “After the alpha channel value is configured, the background portion of the video frame becomes transparent, while the foreground portion remains unchanged. The video frame obtained in this way can be deemed as a transparent texture carrying an alpha channel.”);
converting a pixel point in the perspective image into a background capturing coordinate system where the background capturing device is located according to the background pose to obtain a target image; and
combining a background image captured by the background capturing device at the target moment with the target image to obtain an augmented reality image, and displaying the augmented reality image (See Fig. 2).
[Image: media_image1.png, greyscale, 671 × 441 (Yuan, Fig. 2)]
10 Yuan does not explicitly teach … wherein the converted image is an image obtained after converting a pixel point located in an image coordinate system in a foreground image into an augmented reality coordinate system … converting a pixel point in the perspective image into a background capturing coordinate system where the background capturing device is located according to the background pose to obtain a target image…
11 Pang teaches … wherein the converted image is an image obtained after converting a pixel point located in an image coordinate system in a foreground image into an augmented reality coordinate system ([0346] reciting “Numerous data representations are possible for video data for fully immersive virtual reality and/or augmented reality (hereafter “immersive video”).”; [0242] reciting “In at least one embodiment, all pixels in all the light-field cameras in the tiled array may be mapped to light-field volume coordinates. This mapping may facilitate the generation of images for different viewpoints within the light-field volume.”; [0415] reciting “Image-based and/or video-based lossless compression techniques may advantageously be applied to the depth map data. Inter-vantage prediction techniques are applicable to depth map compression as well. Depth values may need to be geometrically re-calculated to another vantage with respect to an origin reference. In a manner similar to that of color, the (x,y) coordinate can be geometrically re-projected to another vantage.”) … converting a pixel point in the perspective image into a background capturing coordinate system where the background capturing device is located according to the background pose to obtain a target image ([0250] reciting “The rays that are successfully traced from the pixel and out through the objective lens may be aggregated in some manner…The camera-centric world coordinates may then be transformed based on the camera's location within the tiled array, into world coordinates that are consistent to all cameras in the array. Finally, each transformed ray in the consistent world coordinate space may be traced and intersections calculated for the inner and outer spheres that define the light-field volume coordinates.”)…
12 It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method taught by Yuan to incorporate the teachings of Pang regarding the types of coordinate systems used in a grid, applied to the conversion of images into the target image taught by Yuan. Doing so would provide smooth transitions for view-dependent lighting and rendering, as stated by Pang ([0013]).
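For technical context only: the conversion Pang describes in [0250], tracing camera-centric rays into world coordinates consistent across the camera array, is in substance a back-projection of image pixels through the camera intrinsics followed by a pose transform. The same unprojection underlies the calibration-based conversion recited in claim 2 below. A minimal Python/NumPy sketch, with all names and values hypothetical rather than drawn from either reference:

```python
import numpy as np

def pixel_to_world(u, v, depth, K, R, t):
    """Back-project an image pixel into camera coordinates, then
    transform it into a world coordinate system shared by all cameras
    (cf. Pang [0250]).

    u, v  : pixel coordinates in the image coordinate system
    depth : distance along the camera ray (assumed known)
    K     : 3x3 intrinsic matrix (the calibration result)
    R, t  : camera pose, rotation (3x3) and translation (3,), from
            camera coordinates to world coordinates
    """
    pixel_h = np.array([u, v, 1.0])                # homogeneous pixel
    p_cam = depth * (np.linalg.inv(K) @ pixel_h)   # image -> camera
    return R @ p_cam + t                           # camera -> world

# Hypothetical usage: 500 px focal length, principal point (320, 240),
# camera at the world origin with axes aligned to the world frame.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
print(pixel_to_world(320, 240, 2.0, K, np.eye(3), np.zeros(3)))  # [0. 0. 2.]
```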
13 Regarding claim 2, Yuan in view of Pang teaches the method according to claim 1 (see claim 1 rejection above), further comprising:
extracting, for each video frame, the foreground image from the video frame (Yuan; [0040] reciting “As an example, the comparison process may include extracting image features of the merchant-related image and image features of the captured image”);
acquiring a calibration result of a foreground capturing device used to capture the video frame (Yuan; [0050] reciting “A motion state of the virtual component in a dynamic form may be changed by the device according to a set motion rule or adjusted according to changes of some targets recognized from the captured image, or the like”) (Pang; [0252] reciting “FIG. 26 shows a diagram 2600 with a set of two charts that may be used to calibrate the mapping function. More specifically, the diagram 2600 includes a cylindrical inner calibration chart, or chart 2610, and a cylindrical outer calibration chart, or chart 2620…Each of the chart 2610 and the chart 2620 contains a pattern so that locations on images may be precisely calculated.”);
converting the pixel point located in the image coordinate system in the foreground image into a foreground capturing coordinate system where the foreground capturing device is located according to the calibration result to obtain a calibration image (Yuan; [0046] reciting “For a video file that carries an alpha channel, alpha channel values of the background portion can be changed after the foreground portion is distinguished from the background portion, to make the background portion transparent.”) (Pang; See claim 1 rejection regarding “image coordinate system”); and
converting a pixel point in the calibration image into the augmented reality coordinate system to obtain the converted image (Yuan; [0050] reciting “A motion state of the virtual component in a dynamic form may be changed by the device according to a set motion rule or adjusted according to changes of some targets recognized from the captured image, or the like. The construction of the virtual component can be flexibly set according to desires, which is not limited by the embodiments. Where the virtual component is involved, the method provided by the embodiments of the specification may further comprise: acquiring a virtual component for the target object. Synthesizing, after determining the position of the foreground portion in a currently captured image, the processed video frame with the currently captured image comprises: synthesizing, after respectively determining the positions of the foreground portion and of the virtual component in the currently captured image”) (Pang; [0250] reciting “The rays that are successfully traced from the pixel and out through the objective lens may be aggregated in some manner…The camera-centric world coordinates may then be transformed based on the camera's location within the tiled array, into world coordinates that are consistent to all cameras in the array. Finally, each transformed ray in the consistent world coordinate space may be traced and intersections calculated for the inner and outer spheres that define the light-field volume coordinates.”).
14 Regarding claim 3, Yuan in view of Pang teaches the method according to claim 2, wherein the converting a pixel point in the calibration image into the augmented reality coordinate system to obtain the converted image comprises (see claims 1-2 rejections above):
acquiring a fixed-axis coordinate system, wherein the fixed-axis coordinate system is a coordinate system determined according to a foreground pose of at least one foreground capturing device or the video frame captured (Pang; [0319] reciting “For example, a foreground object may block the view of the background from the perspective of the physical cameras in the capture system. However, the synthetic ray may intersect with the background object in the occluded region. As no color information is available at that location on the background object, the value for the color of the synthetic ray may be guessed or estimated using an infill algorithm. Any suitable pixel infill algorithms may be used.”; [0325] reciting “In another embodiment, “hole” data may be filled in using pixel infill algorithms. In a specific example, to fill hole data for the index with coordinates (rho1, theta1, rho2, theta2), a two-dimensional slice of data may be generated by keeping rho1 and theta1 fixed.”);
converting the pixel point in the calibration image into the fixed-axis coordinate system to obtain a fixed-axis image (Pang; [0229] reciting “At each time of capture, the sensor array 2000 may capture images from a subset of the arrays of objective lenses 2010. The arrays of objective lenses 2010 may maintain a fixed position while the array of light-field sensors 2020 may rotate.”; [0325] reciting “In another embodiment, “hole” data may be filled in using pixel infill algorithms. In a specific example, to fill hole data for the index with coordinates (rho1, theta1, rho2, theta2), a two-dimensional slice of data may be generated by keeping rho1 and theta1 fixed.”); and
converting a pixel point in the fixed-axis image into the augmented reality coordinate system to obtain the converted image (Pang; [0229] reciting “At each time of capture, the sensor array 2000 may capture images from a subset of the arrays of objective lenses 2010. The arrays of objective lenses 2010 may maintain a fixed position while the array of light-field sensors 2020 may rotate.”) (Yuan; [0050] reciting “A motion state of the virtual component in a dynamic form may be changed by the device according to a set motion rule or adjusted according to changes of some targets recognized from the captured image, or the like. The construction of the virtual component can be flexibly set according to desires, which is not limited by the embodiments. Where the virtual component is involved, the method provided by the embodiments of the specification may further comprise: acquiring a virtual component for the target object. Synthesizing, after determining the position of the foreground portion in a currently captured image, the processed video frame with the currently captured image comprises: synthesizing, after respectively determining the positions of the foreground portion and of the virtual component in the currently captured image”).
15 Regarding claim 4, Yuan in view of Pang teaches the method according to claim 3, wherein the converting the pixel point in the calibration image into the fixed-axis coordinate system to obtain a fixed-axis image comprises (see claims 1-3 rejection above):
acquiring a first homography matrix from the foreground capturing coordinate system to the fixed-axis coordinate system, and converting the pixel point in the calibration image into the fixed- axis coordinate system based on the first homography matrix to obtain the fixed-axis image (Pang; [0347] reciting “Various vantage arrangements may be used, such as a rectangular grid, a polar (spherical) matrix, a cylindrical matrix, and/or an irregular matrix. Each vantage may contain a projected view, such as an omnidirectional view projected onto the interior of a sphere, of the scene at a given coordinate in the sampling grid. This projected view may be encoded into video data for that particular vantage.”; [0202] reciting “Referring to FIGS. 16A through 16C, a sensor array 1600 may be a sparsely populated ring of plenoptic light-field cameras 1610. Each successive frame may capture a different set of angles than the previous frame.”; [0209] reciting “At each time of capture, the sensor array 1700 may capture images from a subset of the objective lenses 1710. The objective lenses 1710 may maintain a fixed position while the array of light-field sensors 1720 may rotate.”).
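As context for the homography-based conversions of claims 4 and 5: applying a 3x3 homography to pixel coordinates is a standard projective mapping, a homogeneous multiplication followed by a perspective divide. A minimal sketch; the matrix values here are purely illustrative and not taken from the references:

```python
import numpy as np

def apply_homography(H, points):
    """Map 2D pixel coordinates through a 3x3 homography matrix.

    H      : 3x3 homography (e.g., from the foreground capturing
             coordinate system to the fixed-axis coordinate system)
    points : (N, 2) array of pixel coordinates
    """
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # homogeneous
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # perspective divide

# Illustrative first homography: a 90-degree in-plane rotation.
H1 = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
print(apply_homography(H1, np.array([[10.0, 0.0]])))  # -> [[ 0. 10.]]
```

The second homography of claim 5, addressed next, is the same operation with a different matrix, mapping the fixed-axis coordinate system into the augmented reality coordinate system.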
16 Regarding claim 5, Yuan in view of Pang teaches the method according to claim 3, wherein the converting a pixel point in the fixed-axis image into the augmented reality coordinate system to obtain the converted image comprises:
acquiring a second homography matrix from the fixed-axis coordinate system to the augmented reality coordinate system, and converting the pixel point in the fixed-axis image into the augmented reality coordinate system based on the second homography matrix to obtain the converted image (Pang; [0347] reciting “Various vantage arrangements may be used, such as a rectangular grid, a polar (spherical) matrix, a cylindrical matrix, and/or an irregular matrix. Each vantage may contain a projected view, such as an omnidirectional view projected onto the interior of a sphere, of the scene at a given coordinate in the sampling grid. This projected view may be encoded into video data for that particular vantage.”; See claims 2-3 rejections for obtaining the converted image).
17 Regarding claim 6, Yuan in view of Pang teaches the method according to claim 1, wherein the combining a background image captured by the background capturing device at the target moment with the target image to obtain an augmented reality image, and displaying the augmented reality image comprises (see claim 1 rejection above):
acquiring the background image captured by the background capturing device at the target moment (Yuan; [0029] reciting “A target object in which a user is possibly interested can be recognized from a captured image when the user takes images of a real environment, and a video file associated with the target object is further searched for. After a background portion of the video file is processed to make the background portion of the video frame transparent, the video file can be synthesized onto a video image captured by the user”);
fusing the target image and the background image based on transparency information of a pixel point in the target image to obtain the augmented reality image, and displaying the augmented reality image (Yuan; [Abstract] reciting “…configuring alpha channel values for pixels corresponding to the foreground portion and pixels corresponding to the background portion of the video frame, to make the background portion of the video frame transparent…”; See Fig. 2 from claim 1 above).
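The fusing step recited here corresponds to standard alpha compositing over the per-pixel transparency that Yuan's Abstract describes. A minimal sketch, assuming float images with values in [0, 1]:

```python
import numpy as np

def fuse_alpha(target_rgba, background_rgb):
    """Composite the target image over the background image using the
    transparency (alpha) information of the target's pixel points.

    target_rgba    : (H, W, 4) float array; alpha of 0 marks the
                     transparent former-background pixels
    background_rgb : (H, W, 3) float array
    """
    alpha = target_rgba[..., 3:4]  # keep a trailing axis for broadcasting
    return alpha * target_rgba[..., :3] + (1.0 - alpha) * background_rgb
```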
18 Regarding claim 8, Yuan in view of Pang teaches the method according to claim 1, wherein the determining a perspective image corresponding to the background pose from at least one converted image corresponding to the target moment comprises (see claim 1 rejection above):
taking a video frame corresponding to an augmented reality image displayed at a previous moment of the target moment as a previous frame, and determining at least one next frame of the previous frame from at least one video frame (Pang; [0387] reciting “In the encoding schemes 4000, 4100, 4200, 4300, and 4400, the I-frames are keyframes that can be independently determined. The P-frames are predicted frames with a single dependency on a previous frame, and the B-frames are predicted frames with more than one dependency on other frames, which may include future and/or past frames.”);
taking at least one converted image respectively corresponding to at least one next frame as the at least one converted image corresponding to the target moment, respectively acquiring at least one capturing perspective of the at least one converted image corresponding to the target moment (Yuan; [0031] reciting “In step 101, the method may include acquiring an image captured by a device, and recognizing a target object in the captured image.”) (Pang; [0387] reciting “In the encoding schemes 4000, 4100, 4200, 4300, and 4400, the I-frames are keyframes that can be independently determined. The P-frames are predicted frames with a single dependency on a previous frame, and the B-frames are predicted frames with more than one dependency on other frames, which may include future and/or past frames.”);
determining a background perspective corresponding to the background pose from the at least one capturing perspective and taking the converted image having the background perspective from the at least one converted image corresponding to the target moment as the perspective image (Yuan; [0029] reciting “After a background portion of the video file is processed to make the background portion of the video frame transparent, the video file can be synthesized onto a video image captured by the user, so that the captured image viewed by the user further includes a non-transparent foreground portion of the video file, thereby achieving a better augmented reality effect and a better visual effect.”).
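Neither reference spells out how the background perspective is matched to the capturing perspectives; one plausible reading is a nearest-view selection. A sketch under the assumption (not stated in the record) that each perspective is summarized by a unit viewing direction:

```python
import numpy as np

def pick_perspective(background_dir, capture_dirs):
    """Return the index of the converted image whose capturing
    perspective best matches the background pose.

    background_dir : (3,) unit viewing direction of the background
                     capturing device at the target moment
    capture_dirs   : (N, 3) unit viewing directions of the candidate
                     converted images
    """
    # A larger dot product means a smaller angle between directions.
    return int(np.argmax(capture_dirs @ background_dir))

# Hypothetical usage: the background camera looks along +x, so the
# first of three candidate perspectives is selected.
dirs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(pick_perspective(np.array([1.0, 0.0, 0.0]), dirs))  # -> 0
```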
19 Regarding claim 9, Yuan in view of Pang teaches the method according to claim 1, wherein combining a background image captured by the background capturing device at the target moment with the target image to obtain an augmented reality image, and displaying the augmented reality image comprises (see claim 1 rejection above):
acquiring the background image captured by the background capturing device at the target moment (Yuan; [0031] reciting “In step 101, the method may include acquiring an image captured by a device, and recognizing a target object in the captured image.”); identifying a background plane in the background image, and obtaining a plane position of the background plane in the background image; combining the background image with the target image based on the plane position so that the foreground object in the augmented reality image lies on the background plane (Yuan; [0049] reciting “The obtained transparent-background video frame can be synthesized with the captured image to obtain a synthesized video frame. The synthesized video frame can be rendered on the display screen of the device; and accordingly, the user can view the captured image with a video rendered. In some embodiments, before synthesis, the transparent-background video frame can be subject to other related processing. In some examples, the foreground portion may be rotated, scaled, or stretched as desired. In other examples, three-dimensional processing may also be performed. For example, a three-dimensional space plane may be determined according to a three-dimensional effect desired to be displayed, and the processed video frame is rendered onto the three-dimensional space plane to enable the foreground portion to have a three-dimensional effect, such as a shadow effect or a depth-of-field effect.”);
displaying the augmented reality image (Yuan; See Fig. 2 from claim 1 above).
20 Claims 11 and 12 have limitations similar to those of claim 1; therefore, they are rejected under the same rationale as claim 1.
21 Claims 13 and 21 have limitations similar to those of claim 2; therefore, they are rejected under the same rationale as claim 2.
22 Claim 14 has limitations similar to those of claim 3; therefore, it is rejected under the same rationale as claim 3.
23 Claim 15 has limitations similar to those of claim 4; therefore, it is rejected under the same rationale as claim 4.
24 Claim 16 has limitations similar to those of claim 5; therefore, it is rejected under the same rationale as claim 5.
25 Claim 17 has limitations similar to those of claim 6; therefore, it is rejected under the same rationale as claim 6.
26 Claim 19 has limitations similar to those of claim 8; therefore, it is rejected under the same rationale as claim 8.
27 Claim 20 has limitations similar to those of claim 9; therefore, it is rejected under the same rationale as claim 9.
28 Claims 7 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Yuan et al. (US 20200213532 A1) in view of Pang et al. (US 20180020204 A1), as applied to claims 1 and 6 above, and further in view of Zhang et al. (US 20200213533 A1).
29 Regarding claim 7, Yuan in view of Pang teaches the method according to claim 6 (see claims 1 and 6 rejections above), but does not explicitly teach: before the fusing the target image and the background image based on transparency information of a pixel point in the target image, acquiring a color temperature of the background image; adjusting an image parameter of the target image based on the color temperature and updating the target image according to an adjustment result, wherein the image parameter comprises at least one of white balance or brightness.
30 Zhang teaches acquiring a color temperature of the background image ([Claim 2] reciting “…wherein the preset parameter comprises a color temperature, acquiring the background image and the portrait region image of the current user comprises: detecting a color temperature of a scene where the current user is located…or adjusting the color temperature of the background image to be merged using the color temperature of the scene, such that the color temperature of the scene matches the color temperature of the background image…”);
adjusting an image parameter of the target image based on the color temperature and updating the target image according to an adjustment result, wherein the image parameter comprises at least one of white balance or brightness ([Claim 2] reciting “…turning on a virtual light source matching the color temperature of the scene to adjust the color temperature of the background image to be merged…”; [0054] reciting “After the color temperature of the scene is obtained, in order to well merge the portrait and the background image to be merged, the processor 20 may be configured to adjust the color temperature of the scene and/or the background image to be merged based on the color temperature of the scene, such that the color temperature of the scene matches the color temperature of the background image and differences of the color temperatures in the merged image are almost invisible to human eyes.”).
31 It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the method taught by Yuan in view of Pang to incorporate the teachings of Zhang, acquiring a color temperature and adjusting the image parameters of the target and/or background images provided by Yuan in view of Pang. Doing so would yield a good merging effect and improve the user experience, as stated by Zhang ([0054]).
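As context for Zhang's color-temperature matching: white-balance adjustment of this kind is commonly implemented as per-channel gains applied to the image. A minimal sketch; the gain values are hypothetical and not derived from Zhang:

```python
import numpy as np

def match_white_balance(target_rgb, gains):
    """Adjust the white balance of the target image with per-channel
    gains chosen from the background's color temperature, so that the
    merged image has consistent color (cf. Zhang [0054]).

    target_rgb : (H, W, 3) float array with values in [0, 1]
    gains      : (3,) per-channel (R, G, B) gains
    """
    return np.clip(target_rgb * gains, 0.0, 1.0)

# Hypothetical gains warming the target toward a lower color
# temperature, e.g. to match a ~3000 K background.
warm = np.array([1.15, 1.0, 0.85])
img = np.full((2, 2, 3), 0.5)
print(match_white_balance(img, warm)[0, 0])  # -> [0.575 0.5   0.425]
```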
32 Claim 18 has limitations similar to those of claim 7; therefore, it is rejected under the same rationale as claim 7.
Conclusion
33 Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNNY TRAN LE whose telephone number is (571)272-5680. The examiner can normally be reached Mon-Thu: 7:30am-5pm; First Fridays Off; Second Fridays: 7:30am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kent Chang can be reached at (571) 272-7667. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JOHNNY T LE/Examiner, Art Unit 2614
/KENT W CHANG/Supervisory Patent Examiner, Art Unit 2614