DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Status
Claims 1, 3, 17, and 18 are amended.
Claims 2 and 4 are canceled.
No claims are newly added.
Claims 1, 3, and 5-18 are presented for examination.
Response to Arguments
Applicant's arguments filed in the amendment of 9/18/2025 have been fully considered but are moot in view of the new grounds of rejection.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 6, 7, 9, 10, and 12-18 are rejected under 35 U.S.C. 103 as being unpatentable over Imanishi (US 20100251173, part of the Information Disclosure Statement of 3/4/2023), in view of Xiaojie (US 20200288112), in further view of Arai (US 20200082603).
Regarding claim 1, Imanishi discloses, an information processing apparatus (Par. 0058, fig. 1, The contents server 100) comprising:
a processor (Par. 0148, Fig. 21, a CPU (Central Processing Unit) 902); and
a memory built in or connected to the processor (Par. 0149, fig. 21, The CPU 902, the ROM 904 and the RAM 906 are connected to one another through a bus 910),
wherein the information processing apparatus generates an image for viewing to be viewed by a viewer based on an image captured by imaging with an imaging apparatus (Par. 0066, The contents server 100 processes and delivers video and sound that are output from the imaging device 102 associated with a selected user position to the user terminal 150 of a user who has selected one of the plurality of user positions in the virtual space 10),
the processor (Par. 0148, Fig. 21, a CPU (Central Processing Unit) 902, i.e. in the contents server 100)
acquires request information for requesting generation of the image for viewing (Par. 0080-0084, a user list is generated based on the user selecting one of the plurality of user positions 14A-14E in virtual space 10 (which, as shown in fig. 2, correspond to different outfields of a baseball stadium in real space, par. 0065); the situation where users who wish to watch a game together are located close to each other in one user position can be produced in the virtual space 10; here the user's selection of a position = acquiring request information for viewing the game from that selected position. Par. 0084, when delivering contents to a user, the contents processing unit 134 receives moving images containing video and sound from the imaging device 102 that is associated with the user position 14 in which a delivery destination user is located, i.e. generation of the image is executed from the imaging device 102 associated with the one of the user positions 14A-14E that has been selected by the user),
executes generation processing of generating the image for viewing in accordance with the acquired request information and the image captured by imaging with the imaging apparatus (Par. 0084, when delivering contents to a user, the contents processing unit 134 receives moving images containing video and sound from the imaging device 102 that is associated with the user position 14 in which a delivery destination user is located, i.e. generation of the image is executed from the imaging device 102 associated with the one of the user positions 14A-14E selected by the user),
the request information includes setting information indicating setting of the image for viewing (Par. 0080-0084, a user list is generated based on the user selecting one of the plurality of user positions 14A-14E in virtual space 10; the user selecting a specific user position = a request from the user to view the image from a specific location (read as setting information) in virtual space 10), and
the generation processing is processing of generating the image (Par. 0084, when delivering contents to a user, the contents processing unit 134 receives moving images containing video and sound from the imaging device 102 that is associated with the user position 14 in which a delivery destination user is located).
Imanishi does not disclose, wherein the image for viewing includes a virtual viewpoint image which has a viewpoint different from the image captured by imaging with the imaging apparatus,
the setting information includes a specific object to be viewed in the virtual viewpoint image,
wherein the virtual viewpoint image has a field of view from a different viewpoint position with respect to the image captured by imaging with an imaging apparatus,
generating the virtual viewpoint image based on a position of the specific object and the image captured by imaging the specific object with the imaging apparatus.
Xiaojie discloses, wherein the image for viewing includes a virtual viewpoint image which has a viewpoint different from the image captured by imaging with the imaging apparatus (Fig. 3, par. 0119, shows positions of multiple capturing devices, generating multi-angle free-perspective data for displaying and performing virtual viewpoint switching; the reconstructed image generated based on the multi-angle free-perspective data is displayed. The reconstructed image corresponds to the virtual viewpoint. According to the user instruction, reconstructed images corresponding to different virtual viewpoints may be displayed, and the viewing position and viewing angle may be switched, i.e. the virtual viewpoint can be switched per the user selecting a viewpoint and angle different from the image captured by the capturing device),
the setting information includes a specific object to be viewed in the virtual viewpoint image (Par. 0378, the user instruction includes selection of an object in the to-be-viewed area, e.g. the user's selection of a player in a basketball game),
wherein the virtual viewpoint image has a field of view from a different viewpoint position with respect to the image captured by imaging with an imaging apparatus (Par. 0111, for the output generated image, a method for fusing texture images from two cameras is used. The fusion weight is a global weight and is determined by the distance of the position of the virtual viewpoint from the position of the reference camera, i.e. the virtual viewpoint has a different field of view that is obtained by fusing images from the cameras' points of view. Fig. 3, par. 0119, shows positions of multiple capturing devices, generating multi-angle free-perspective data for displaying and performing virtual viewpoint switching; the reconstructed image generated based on the multi-angle free-perspective data is displayed. The reconstructed image corresponds to the virtual viewpoint. According to the user instruction, reconstructed images corresponding to different virtual viewpoints may be displayed, and the viewing position and viewing angle may be switched, i.e. the virtual viewpoint can be switched per the user selecting a viewpoint and angle different from the image captured by the capturing device),
generating the virtual viewpoint image based on a position of the specific object and the image captured by imaging the specific object with the imaging apparatus (Par. 0312, the depth maps of the position of the virtual viewpoint obtained by the forward projection are processed, so that the generated depth maps may more truly reflect the positional relationship of objects in the scenario viewed at the position of the virtual viewpoint, i.e. the generated image created from the captured image is based on a position of the object in the scene, as the depth map is used in generating the image. Par. 0378, the picture under the virtual viewpoint is provided to the user).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Imanishi, by the teaching that the image for viewing includes a virtual viewpoint image which has a viewpoint different from the image captured by imaging with the imaging apparatus, that the setting information includes a specific object to be viewed in the virtual viewpoint image, and that the virtual viewpoint image has a field of view from a different viewpoint position with respect to the image captured by imaging with an imaging apparatus, as taught by Xiaojie, to generate an image scene for a free-perspective virtual viewpoint that improves over a fixed-perspective user experience, as disclosed in Xiaojie, par. 0003, 0091.
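As an illustrative aside, the global distance-based fusion weight paraphrased above from Xiaojie par. 0111 can be sketched as follows (Python; all names are hypothetical and the snippet is only a minimal reading of the cited paragraph, not code from any reference):

    import numpy as np

    def fuse_textures(tex_a, tex_b, cam_a_pos, cam_b_pos, virtual_pos):
        """Blend texture images from two reference cameras with a single
        global weight set by the virtual viewpoint's distance to each
        camera: the nearer camera contributes more to the fused output."""
        d_a = np.linalg.norm(virtual_pos - cam_a_pos)
        d_b = np.linalg.norm(virtual_pos - cam_b_pos)
        w_a = d_b / (d_a + d_b)  # weight of camera A rises as the viewpoint nears A
        return w_a * tex_a + (1.0 - w_a) * tex_b

    # A virtual viewpoint coincident with camera A takes camera A's texture entirely.
    fused = fuse_textures(np.ones((2, 2)), np.zeros((2, 2)),
                          cam_a_pos=np.array([0.0, 0.0, 0.0]),
                          cam_b_pos=np.array([4.0, 0.0, 0.0]),
                          virtual_pos=np.array([0.0, 0.0, 0.0]))  # -> all ones

Under this reading, switching the virtual viewpoint per the user instruction (par. 0119) simply re-evaluates the weight against the new viewpoint position.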
Imanishi in view of Xiaojie does not disclose, wherein the virtual viewpoint image is generated from the viewpoint position and a visual line direction determined to face the specific object based on a gaze position that is the position of the specific object, and the gaze position including a region having a radius about coordinate information of the specific object.
Arai discloses, wherein the virtual viewpoint image is generated from the viewpoint position and a visual line direction determined to face the specific object based on a gaze position that is the position of the specific object, and the gaze position including a region having a radius about coordinate information of the specific object (Par. 0070, Figs. 8 and 9, the determination unit 106 determines the line that starts from the position of the user 10 and extends linearly toward the position of the gaze point 801 as a movement path. The virtual viewpoint is controlled so as to come close to the gaze point 801 from the position of the user 10 along this movement path. At this time, the line-of-sight direction of the virtual viewpoint is caused to face the gaze point 801. Par. 0063, The gaze point ID includes, for example, alphabets, figures, and the like and is an identification number assigned to every gaze point. The position information indicates the latitude and longitude of the center coordinates of a gaze point in the degree format. The radius is a distance from the center coordinates of a gaze point and indicates an effective range in a case where a virtual viewpoint image is generated. In the following, a circular area indicating the effective range specified by the center coordinates and the radius is referred to as a gaze point unit).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Imanishi in view of Xiaojie, by the teaching that the virtual viewpoint image is generated from the viewpoint position and a visual line direction determined to face the specific object based on a gaze position that is the position of the specific object, and that the gaze position includes a region having a radius about coordinate information of the specific object, as taught by Arai, to indicate the effective range of the virtual viewpoint by a circular area specified by center coordinates and a radius, based on the line-of-sight direction facing the gaze point, as disclosed in Arai, par. 0063, 0070.
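As an illustrative aside, the gaze point geometry paraphrased above from Arai par. 0063 and 0070 can be sketched as follows (Python; hypothetical names, a minimal reading of the cited paragraphs rather than code from any reference):

    import numpy as np

    def line_of_sight(viewpoint, gaze_center):
        """Unit vector from the virtual viewpoint toward the gaze point,
        so that the visual line direction always faces the gaze point."""
        v = gaze_center - viewpoint
        return v / np.linalg.norm(v)

    def in_effective_range(point, gaze_center, radius):
        """True if a point lies inside the circular area (center coordinates
        plus radius) that marks the gaze point's effective range."""
        return np.linalg.norm(point - gaze_center) <= radius

    def position_on_path(user_pos, gaze_center, t):
        """Point at fraction t (0..1) along the straight movement path from
        the user's position toward the gaze point."""
        return (1.0 - t) * user_pos + t * gaze_center

Here the viewpoint advances along the straight movement path while its line of sight is recomputed at each step to face the gaze point, matching the control described in par. 0070.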
Regarding claim 6, the information processing apparatus according to claim 1,
Imanishi further discloses, wherein the processor generates the image for viewing by superimposing the viewer information related to the viewer of which the setting information is within the predetermined range on the virtual viewpoint image (Par. 0093, by changing the size of the user video superimposed on the main video, the volume of the user sound superimposed on the main sound or the like according to the distance between users in the virtual space 10, a user can more strongly recognize the action of other users located nearby. Particularly, by increasing the weight of the user video or the user sound of users who are located in the same user position, a user can share the main video or the main sound while more closely feeling the video or the sound of the users who cheer for the same team or player, i.e. the displayed image includes superimposed video and sound from nearby users located in the same viewpoint space).
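As an illustrative aside, the distance-dependent weighting paraphrased above from Imanishi par. 0093 might be sketched as follows (Python; hypothetical names, a minimal reading of the cited paragraph rather than code from any reference):

    def overlay_weight(distance, max_distance=50.0):
        """Weight in [0, 1] for a superimposed user video/sound overlay:
        1.0 when the other viewer shares the same user position (distance 0),
        decaying linearly to 0 at max_distance in the virtual space."""
        return max(0.0, 1.0 - distance / max_distance)

    # Viewers at the same user position get a full-size, full-volume overlay;
    # viewers beyond max_distance fade out entirely.
    same_position = overlay_weight(0.0)    # -> 1.0
    far_away = overlay_weight(60.0)        # -> 0.0

The linear falloff and the 50.0 cutoff are assumptions for illustration; Imanishi only requires that the overlay weight increase as the virtual-space distance between users decreases.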
Regarding claim 7, the information processing apparatus according to claim 1,
Imanishi further discloses, wherein the image for viewing includes at least one of audible data related to the viewer of which the setting information is within the predetermined range or visible data related to the viewer of which the setting information is within the predetermined range (Par. 0093, by changing the size of the user video superimposed on the main video, the volume of the user sound superimposed on the main sound or the like according to the distance between users in the virtual space 10, a user can more strongly recognize the action of other users located nearby. Particularly, by increasing the weight of the user video or the user sound of users who are located in the same user position, a user can share the main video or the main sound while more closely feeling the video or the sound of the users who cheer for the same team or player, i.e. the displayed image includes superimposed sound from nearby users located in the same viewpoint space).
Regarding claim 9, the information processing apparatus according to claim 1,
Imanishi further discloses, wherein the image for viewing includes a viewer specification image for visually specifying the viewer of which the setting information is within the predetermined range (Par. 0086-0088, fig. 7 discloses displaying a main video display area 137 and user video display areas 138a, 138b, and 139 (i.e. video from user devices along with the main video in area 137 from the imaging device); the sizes of the three user video display areas 138a, 138b and 139 can be determined so as to be in proportion to the weight according to the predetermined distance in the virtual space 10 for each positional relationship between users represented by the user list of the user position data 130, i.e. the size of each user video area visually specifies which viewer (i.e. viewer specification image) has setting information (i.e. user location) closer to the user).
Regarding claim 10, the information processing apparatus according to claim 1,
Imanishi further discloses, wherein the processor stores the viewer information in the memory, and generates the image for viewing to which the viewer information stored in the memory is reflected (Par. 0072, The data storage unit 126 mainly stores user data 128 and user position data 130 that are used for the contents delivery service by the contents server 100; the user data 128 has three data items: "user ID", "status" and "friend user". Par. 0086, fig. 7, an explanatory view showing a frame 136a as an example of a frame contained in video that is delivered from the contents server 100 to the user terminal 150. Referring to FIG. 7, the frame 136a contains a main video display area 137 and user video display areas 138a, 138b and 139, i.e. as shown in fig. 7, viewer IDs U12 and U14 are reflected in the generated image to be displayed on the user device).
Regarding claim 12, the information processing apparatus according to claim 1,
Imanishi further discloses, wherein the request information includes the viewer information (Par. 0082, the user U15 who is a new user selects the user position 14B on the position selection screen. In this case, the screen control unit 118 adds the user U15 to the end of the user list of the user position 14B of the user position data 130 stored in the data storage unit 126, i.e. the request to select user position 14B (in virtual space 10, as shown in fig. 2) includes the user ID, in this example U15).
Regarding claim 13, the information processing apparatus according to claim 1,
Imanishi further discloses, wherein the setting information includes information related to which of a plurality of videos obtained by imaging with a plurality of the imaging apparatuses is to be viewed (Par. 0065, fig. 2, The user positions 14A to 14E are respectively associated with imaging devices 102A to 102E; the imaging devices 102A to 102E are placed in the positions corresponding to the respective user positions of a stadium in the real space. Par. 0066, The contents server 100 processes and delivers video and sound that are output from the imaging device 102 associated with a selected user position (i.e. the setting information = the user's selection of a specific viewpoint in the virtual space 10 of fig. 2) to the user terminal 150 of a user who has selected one of the plurality of user positions in the virtual space 10).
Regarding claim 14, the information processing apparatus according to claim 13,
Imanishi further discloses, wherein the processor generates a video for viewing by superimposing the viewer information related to the viewer of which the setting information is within the predetermined range on the video to be viewed (Par. 0084, When delivering contents to a user, the contents processing unit 134 receives moving images containing video and sound (i.e. main video) from the imaging device 102 that is associated with the user position 14 in which a delivery destination user is located. Par. 0093, the user video is superimposed on the main video according to the distance between users in the virtual space 10, so a user can more strongly recognize the action of other users located nearby. Particularly, by increasing the weight of the user video of users who are located in the same user position (i.e. the user-selected viewpoint is within a predetermined range of other users), a user can share the main video or the main sound while more closely feeling the video or the sound of the users who cheer for the same team or player, i.e. the displayed image includes superimposed video and sound from nearby users located in the same viewpoint space).
Regarding claim 15, the information processing apparatus according to claim 1,
Imanishi further discloses, wherein the setting information includes information related to which of a plurality of edited videos created based on a plurality of videos obtained by imaging with a plurality of the imaging apparatuses is viewed (Par. 0065, fig. 2, The user positions 14A to 14E are respectively associated with imaging devices 102A to 102E; the imaging devices 102A to 102E are placed in the positions corresponding to the respective user positions of a stadium in the real space. Par. 0066, The contents server 100 processes and delivers video and sound that are output from the imaging device 102 associated with a selected user position (i.e. the setting information = the user's selection of a specific viewpoint in the virtual space 10 of fig. 2) to the user terminal 150 of a user who has selected one of the plurality of user positions in the virtual space 10. Par. 0085, the contents processing unit 134 may process the contents to be delivered to the user terminal 150 according to the distance between the user positions of two or more users in the virtual space 10. Specifically, the contents processing unit 134 may synthesize each frame of the main video (i.e. editing the main video obtained from the imaging device to include video from users located in the same space) and each frame of the user video whose size has been changed depending on the distance between the user positions, in each frame of video to be delivered to the user terminal 150).
Regarding claim 16, the information processing apparatus according to claim 15,
Imanishi further discloses, wherein the processor generates a video for viewing by superimposing the viewer information related to the viewer of which the setting information is within the predetermined range on the edited video to be viewed (Par. 0084, When delivering contents to a user, the contents processing unit 134 receives moving images containing video and sound (i.e. main video) from the imaging device 102 that is associated with the user position 14 in which a delivery destination user is located. Par. 0093, the user video is superimposed on the main video according to the distance between users in the virtual space 10, so a user can more strongly recognize the action of other users located nearby. Particularly, by increasing the weight of the user video of users who are located in the same user position (i.e. the user-selected viewpoint is within a predetermined range of other users), a user can share the main video or the main sound while more closely feeling the video or the sound of the users who cheer for the same team or player, i.e. the displayed image includes superimposed video and sound from nearby users located in the same viewpoint space).
Regarding claim 17, Imanishi in view of Xiaojie in further view of Arai meets the claim limitations as set forth in claim 1.
Regarding claim 18, Imanishi in view of Xiaojie in further view of Arai likewise meets the claim limitations as set forth in claim 1; Imanishi further discloses a non-transitory computer-readable storage medium storing a program executable by a computer to perform information processing (see par. 0147).
Claims 3-5 are rejected under 35 U.S.C. 103 as being unpatentable over Imanishi (US 20100251173, part of the Information Disclosure Statement of 3/4/2023), in view of Xiaojie (US 20200288112), in further view of Arai (US 20200082603), in further view of Hanamoto et al. (US 20190213791, part of the Information Disclosure Statement of 3/4/2023).
Regarding claim 3, The information processing apparatus according to claim 1,
Imanishi in view of Xiaojie in further view of Arai does not disclose, wherein the setting information includes gaze position specification information for specifying a gaze position used to generate the virtual viewpoint image in a region indicated by the image.
Hanamoto discloses, wherein the setting information includes gaze position specification information for specifying a gaze position used to generate the virtual viewpoint image in a region indicated by the image (Par. 0058 allows a user to specify the movement path of a virtual camera by using the bird's eye image display area 300 and for the movement path of a gaze point (i.e. gaze position) to be determined automatically in accordance with the movement of a player or the like).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Imanishi in view of Xiaojie in further view of Arai, by the teaching that the setting information includes gaze position specification information for specifying a gaze position used to generate the virtual viewpoint image in a region indicated by the image, as taught by Hanamoto, to capture the image in accordance with the movement of a specific object of interest in the field of view, such as a player's movement, as disclosed in Hanamoto, par. 0058.
Regarding claim 4, The information processing apparatus according to claim 3,
Imanishi in view of Xiaojie in further view of Arai in further view of Hanamoto further discloses, wherein the gaze position is the position of the specific object included in the region (Hanamoto Par. 0058, a user specifies the movement path of a virtual camera by using the bird's eye image display area 300, and the movement path of a gaze point (i.e. gaze position) is determined automatically in accordance with the movement of a player or the like, i.e. the object being a player in the field 201 as shown in fig. 2).
Regarding claim 5, The information processing apparatus according to claim 3,
Imanishi in view of Xiaojie in further view of Arai in further view of Hanamoto further discloses, wherein the gaze position specification information includes gaze position path information indicating a path of the gaze position (Hanamoto Par. 0058, a user specifies the movement path of a virtual camera by using the bird's eye image display area 300, and the movement path of a gaze point (i.e. gaze position) is determined automatically in accordance with the movement of a player or the like).
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Imanishi (US 20100251173, part of the Information Disclosure Statement of 3/4/2023), in view of Xiaojie (US 20200288112), in further view of Arai (US 20200082603), in further view of Whitelaw et al. (US 20100096491).
Regarding claim 8, The information processing apparatus according to claim 7,
Imanishi discloses, wherein the image for viewing is a video (Par. 0059, the contents delivered from the contents server 100 may be video in virtual reality such as a video game space),
the processor generates the image for viewing to which the viewer information is reflected, by adding at least one of the audible data or the visible data to the image (Par. 0093, by changing the size of the user video superimposed on the main video, the volume of the user sound superimposed on the main sound or the like according to the distance between users in the virtual space 10, i.e. superimposed sound from nearby users located in the same viewpoint space is added to the displayed image).
Imanishi in view of Xiaojie in further view of Arai does not disclose, adding at least one of the audible data or the visible data to the image for viewing at a timing set by the viewer at a time of playback of the image for viewing.
Whitelaw discloses, adding at least one of the audible data or the visible data to the image for viewing at a timing set by the viewer at a time of playback of the image for viewing (Par. 0142 discloses a spectator clicking (i.e. a timing set by the viewer) during streaming of a live race to include listening in on cockpit conversation or other audibles, or viewing a virtual instrument cluster driven with real-time telemetry data (i.e. the user elects to add other data to the racing video in real time; selecting to add real-time data = a timing that is instant or real-time); options might allow the spectator to stream a video of the pilot's face).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Imanishi in view of Xiaojie in further view of Arai, by the teaching of adding at least one of the audible data or the visible data to the image for viewing at a timing set by the viewer at a time of playback of the image for viewing, as taught by Whitelaw, to provide the user with an interactive experience of the event, allowing the viewer to change the additional options of interactive viewing at a time of their choosing, as disclosed in Whitelaw, par. 0142.
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Imanishi (US 20100251173, part of the Information Disclosure Statement of 3/4/2023), in view of Xiaojie (US 20200288112), in further view of Arai (US 20200082603), in further view of Makinen et al. (US 20210160549).
Regarding claim 11, The information processing apparatus according to claim 1,
Imanishi in view of Xiaojie in further view of Arai does not disclose, wherein the viewer information includes an attribute related to a taste of the viewer.
Makinen discloses, wherein the viewer information includes an attribute related to a taste of the viewer (Par. 0035, users are provided an option to request video of the event from desired angles; Par. 0064, a user may prefer particular viewing angles, a particular team, or a particular player, i.e. the viewer information includes preferences for viewing angles, a particular team, or a particular player; preferences = taste).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Imanishi in view of Xiaojie in further view of Arai, by the teaching that the viewer information includes an attribute related to a taste of the viewer, as taught by Makinen, to automatically provide the user with content of interest from different positions and angles of the capturing device as per the viewer's preference, as disclosed in Makinen, par. 0064.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AKSHAY DOSHI whose telephone number is (571)272-2736. The examiner can normally be reached M-F 9:30 AM to 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, JOHN W MILLER can be reached at (571)272-7353. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/A.D./Examiner, Art Unit 2422
/JOHN W MILLER/Supervisory Patent Examiner, Art Unit 2422