DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed on 10/29/2025 have been fully considered but they are not persuasive.
Applicant argues: “Claim 1 as amended recites …”
Examiner notes the updated reasons for rejection below.
Applicant argues regarding the newly amended language, “In other words, Gadnir does not disclose detecting vertical movement (e.g., standing) of a participant …”
Examiner notes that Gadnir teaches detecting movement of the participants as cited under section 103 below. However, the present specification does not appear to support this claim feature as noted under section 112 below.
Applicant argues: “By the foregoing amendments, claim 21 has been amended to recite … Gadnir does not teach or suggest that a dimension of an image is automatically reduced while maintaining a dimension of a display area for that image.”
Examiner notes the updated reasons for rejection below. Further, Gadnir indicates that the window size can be freely adjusted to suit the image content or the application. Applicant’s preference for a particular image arrangement out of the available arrangements in the prior art is not sufficient to overcome obviousness.
Examiner suggests claiming an optimized automation or algorithm for sizing images and windows based on detected data.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claim 27 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 27 recites “determining that the change in the vertical movement of the one of the participants increases a height of the one of the participants in the height direction”; however, Examiner did not find support for determination of vertical movement in the Specification. For example, Specification Paragraph 45 is directed to “the panoramic image 203 in the case where some of the participants 120 are standing. The first image generation unit 62 increases the height of the panoramic image 203 such that the panoramic image 203 includes faces of all the participants 120.” However, this is directed to detecting contents of a static image, not to determining motion.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4, 5, 8-9, 11-12, 14-15, 19-23, and 26-27 are rejected under 35 U.S.C. 103 as being unpatentable over US 9936162 to Gadnir (“Gadnir”) in view of US 20200351435 to Therkelsen (“Therkelsen”) in view of US 20220070385 to Van (“Van”).
Regarding Claim 1: “An information processing apparatus comprising:
circuitry; and a memory storing instructions that cause the circuitry to execute: (“The memory 228 can be any computer readable medium, such as a random access memory (RAM) or other dynamic storage device ( e.g. dynamic RAM, static RAM, synchronous RAM, etc.) coupled to the bus 244 for storing information and instructions to be executed by the processor 236.” Gadnir, Column 8, lines 57-61.)
setting in advance a plurality of targets to be displayed, (Under the broadest reasonable interpretation consistent with the specification and ordinary skill in the art, the plurality of targets are object types or reference objects to be matched in the video frame, such as faces of people or electronic devices. The claim does not require targets to be particular people or particular objects. Prior art teaches this functionality: “The participant monitor 252 can acquire the facial images of each participant in the captured image using face detection techniques, acquire other object images in the captured image (such as a whiteboard, table, chair, and the like) using digital processing techniques, determine an identity of each acquired facial image by face recognition techniques using an identified biometric information of the participant,” Gadnir, Column 9, lines 37-39. This indicates that prior art relies on a database of object and people recognition templates and identifying information of the participants to be known in advance.)
the plurality of targets including participants in remote communication; (For example “image should be provided to remote participant endpoints at any point in time during the communication session, or to define a set of optimal views for the video conferencing session” Gadnir, Column 9, lines 53–56.)
detecting one or more targets among the plurality of targets (“The participant monitor 252 can acquire the facial images of each participant in the captured image using face detection techniques, acquire other object images in the captured image (such as a whiteboard, table, chair, and the like) using digital processing techniques, determine an identity of each acquired facial image by face recognition techniques using an identified biometric information of the participant,” Gadnir, Column 9, lines 37-39.)
from a wide-angle image captured by an image-capturing device; (“The use of a high resolution and wide-angle digital camera” Gadnir, Column 12, lines 13-16.)
in a case where a plurality of targets is detected from the wide-angle image, (“The use of a high resolution and wide-angle digital camera can permit the remote endpoints to display a layout, … the participant monitor 252 acquires and analyzes participants and non-participant objects of interest in the monitored area” Gadnir, Column 12, lines 13-16 and 42-44.)
generating a first image including the plurality of targets; (“Then, the camera can zoom out to get a full view of the room” Gadnir, Column 14, lines 5-6. Similarly, “selects as the optimal view a view having the first, second, and third participants 300a-c in frame, in focus and centralized,” Gadnir, Column 11, lines 24-27, and Column 12, lines 22-39.)
generating a second image of one or more participants speaking among the participants in the remote communication, (“The images positioned in each window can be based on a number of criteria, such as current active speaker participant … wide angle digital image information can permit extraction of multiple high quality images, which can be displayed in the windows as desired.” Gadnir, Column 12, lines 22-35.)
the second image clipped from the first image; (“wide angle digital image information can permit extraction of multiple high quality images, which can be displayed in the windows as desired.” Gadnir, Column 12, lines 22-35. See similarly in Therkelsen, Paragraph 26.)
controlling a communication terminal to display a combined image including the first image and the second image arranged adjacent to each other on a screen … that has a display area for displaying the combined image, the display area including a first display area for displaying the first image and a second display area for displaying the second image, (“The layout 500 includes first, second, and third windows [display areas] 504, 508, and 512, respectively. Each of the first, second, and third windows 504, 508, and 512 displays a corresponding image. As will be appreciated, the layout 500 can have any configuration and any number of windows” Gadnir, Column 12, lines 16-21 and Fig. 5. Note that the prior art examples are similar to Figs 2A-2C in the Specification.)
the second display area including a third display area and a fourth display area; and (“As will be appreciated, the layout 500 can have any configuration and any number of windows” Gadnir, Column 12, lines 16-21 and Fig. 5 in which the top display area corresponds to the claimed second display area and includes areas 504 and 512 that correspond to the claimed third and the fourth display areas. Note that the prior art examples are similar to Figs 2A-2C in the Specification.)
changing a ratio of the first display area for the first image relative to the second display area for the second image in a height direction on the screen such that the first image includes all of the plurality of targets set in advance while maintaining a dimension of the display area for the combined image in the height direction on the screen (Under the broadest reasonable interpretation consistent with the specification and ordinary skill in the art, increasing the height can be directed to zooming, cropping, or resizing the image to properly display the targeted objects on the screen.
Prior art teaches this functionality. First, prior art selects optimum sizes of the images, including the claimed first image: “the imaging controller 256 selects as the optimal view a view having the first, second, and third participants 300a-c in frame, in focus and centralized.” Gadnir, Column 10, lines 58-63. Also, for the second area including the third and fourth images of the individuals, prior art “acquires and digitally crops and/or zooms an image of the selected object from the digital image and optionally normalizes or resizes the selected object image so that all of the images for the different objects appear to be equally sized … The normalization may be accomplished by resizing the width and/or height of the facial image … vertical scaling may be performed … Other scaling techniques may alternatively be used.” Gadnir, Column 14, lines 30-48 and example layout in Fig. 5. Finally, prior art teaches sizing and arranging the windows to fit the images that are displayed, where “Each of the first, second, and third windows 504, 508, and 512 displays a corresponding image. As will be appreciated, the layout 500 can have any configuration …” Gadnir, Column 12, lines 17-23.
Thus, both the image shapes and sizes (heights, widths) and the image layout configuration on the display can be adjusted to ensure optimal viewing of the participants on the screen. See additional treatment of window resizing below.)
and changing a dimension of the third display area and a dimension of the fourth display area in the height direction on the screen by a same amount in accordance with the changing the ratio of the first display area relative to the second display area, (As noted above for the third and fourth display areas: “As will be appreciated, the layout 500 can have any configuration and any number of windows depending on the application.” Gadnir, Column 12, lines 16-21 and Fig. 5, in which the bottom area corresponds to the claimed first area, and the top display area corresponds to the claimed second display area and includes areas 504 and 512 that correspond to the claimed third and fourth display areas. Thus, in the case where image area two (top) has a configuration where it is resized in height, image areas three and four (504 and 512) are inherently resized in height because they are bound by image area two. Note that the prior art examples are similar to Figs 2A-2C in the Specification. See additional treatment of window resizing below.)
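For clarity of the record, the display-area arithmetic discussed in the two limitations above can be summarized in a minimal sketch. This is illustrative only; the total height, ratios, and names are hypothetical and are not drawn from the prior art or the Specification:

```python
# Illustrative sketch of the claimed layout arithmetic (hypothetical values;
# not taken from the prior art or the Specification).
TOTAL_HEIGHT = 1080  # fixed height of the combined image's display area, in pixels

def split_heights(first_ratio):
    """Return display-area heights for a given first-area/second-area ratio."""
    first = round(TOTAL_HEIGHT * first_ratio)
    second = TOTAL_HEIGHT - first  # the combined-image height is maintained
    # The third and fourth areas sit side by side inside the second area,
    # so each has the second area's height.
    return {"first": first, "second": second, "third": second, "fourth": second}

before = split_heights(0.5)  # e.g., all participants seated
after = split_heights(0.7)   # e.g., a participant stands; the first area grows

assert before["first"] + before["second"] == TOTAL_HEIGHT
assert after["first"] + after["second"] == TOTAL_HEIGHT
# The third and fourth areas change in height by the same amount.
assert before["third"] - after["third"] == before["fourth"] - after["fourth"]
```

The sketch shows only the proportionality relationship: because the third and fourth areas are bound by the second area, any change to the ratio changes both by the same amount while the combined height stays fixed.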
“wherein the ratio of the first display area relative to the second display area in the height direction is determined and changed in response to detecting a change in a vertical movement of one of the participants appearing in the first image along the height direction.” (For example, “Where a participant has moved to a seat outside the field of view of the camera or a new participant has entered the room and selected a seat outside the field of view, the imaging controller 256 may elect to adjust the view in response to the participant entry or relocation” Gadnir, Column 13, lines 25-30. As noted above, “when the selected object image is selected to replace a previously acquired image, … vertical scaling may be performed.” Gadnir, Column 14, lines 29-45.)
Gadnir does not teach an embodiment of detecting “participants speaking” from the contents of a video, as opposed to audio information from multiple microphones.
Therkelsen teaches using either embodiment in the context of capturing and displaying a video with multiple objects and human participants: “In some embodiments, the auditorium endpoint 104(1) may further use participant detection techniques ( e.g., facial detection techniques, motion detection techniques, upper body detection techniques, etc.) to detect participants 106 (2)-106(M) and their location with respect to microphones 118(1)-118(2) in order to confirm that a participant 106(2)- 106(M) is actively speaking or is about to actively speak into the microphone” Therkelsen, Paragraph 26.
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to supplement the teachings of human action detection in Gadnir to detect a participant as actively speaking based on sound and facial feature recognition as taught in Therkelsen, in order “to present a close-up or zoomed in view of the actively speaking participant.” Therkelsen, Paragraph 26.
Finally, in reviewing the present application, there does not seem to be objective evidence that the claim limitations are particularly directed to: addressing a particular problem which was recognized but unsolved in the art, producing unexpected results at the level of the ordinary skill in the art, or any other objective indicators of non-obviousness.
Gadnir and Therkelsen do not explicitly teach that resizing the image to optimize display of the content is connected to resizing the window in which the image is displayed. As noted above, Gadnir indicates that any configuration can be selected for the windows and for the images.
Van explicitly teaches resizing the window with the image in the context of displaying images of people and objects: “shifts and resizes the position of various participant windows 1255, as shown in FIG. 12AP. In some embodiments, device 600 enlarges a respective participant window when the participant is speaking. For example, in FIG. 12AP, device 600 detects audio 1276 (e.g., laughter) from the woman represented in participant image data 1204-2 and, in response, enlarges the woman's participant window 1255-2.” Van, Paragraph 476.
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to supplement the teachings of Gadnir and Therkelsen to resize the image as well as the window in which the image is displayed as taught in Van, “for displaying visual effects in a live video communication session”. Van, Paragraph 477.
Finally, in reviewing the present application, there does not seem to be objective evidence that the claim limitations are particularly directed to: addressing a particular problem which was recognized but unsolved in the art, producing unexpected results at the level of the ordinary skill in the art, or any other objective indicators of non-obviousness. Adjusting images and application windows for optimal view or for visual effect are well established features in the field of personal computers.
Regarding Claim 4: “The information processing system according to claim 1, wherein the circuitry is caused to change a dimension of the first display area for the first image on the communication terminal such that the first image having an increased area is displayed in the display area.” (For example, for the first image “the imaging controller 256 selects as the optimal view a view having the first, second, and third participants 300a-c in frame, in focus and centralized.” Gadnir, Column 10, lines 58-63. “Each of the first, second, and third windows 504, 508, and 512 displays a corresponding image.” Gadnir, Column 12, lines 16-21 and Fig. 5. Finally, the images of the speaking participants, corresponding to the second image of the claims, can be resized and reshaped accordingly, for example so that the faces in each image have the same size, making the second images occupy smaller display areas than the first, as noted in Gadnir, Column 14, lines 30-48 and in Fig. 5.)
Regarding Claim 5: “The information processing apparatus according to claim 4, wherein the circuitry is caused to change the dimension of the display area for the first image in the height direction.” (“acquires and digitally crops and/or zooms an image of the selected object from the digital image and optionally normalizes or resizes the selected object image so that all of the images for the different objects appear to be equally sized or spaced equidistant from the capture device. … The normalization may be accomplished by resizing the width and/or height of the facial image … Other scaling techniques may alternatively be used.” Gadnir, Column 14, lines 30-48 and example layout in Fig. 5.)
Regarding Claim 8: “The information processing system according to claim 1, wherein the circuitry is caused to generate the first image in which a target among the plurality of targets is arranged at a center of the first image in a horizontal direction.” (Prior art teaches two examples of this: “an optimal view could include having all participants in frame, in focus and centralized in the captured image … presenter, speaker, etc.) with the selected meeting participant being in focus and centralized in the captured image. Other optimal views will be appreciated …” Gadnir, Column 9, line 61 – Column 10, line 3.)
Regarding Claim 9: “The information processing apparatus according to claim 1, wherein the circuitry is caused to, in a case where the first image does not include the plurality of targets set in advance, increase a width of the first image such that the first image includes the plurality of targets.” (“an optimal view could include having all participants in frame, in focus and centralized in the captured image … Once the target view is identified, the imaging controller 256 adjusts the captured image (e.g., moves the pan, tilt, and zoom of the camera) to produce this view.” Gadnir, Column 9, line 61 – Column 10, line 7. Also note that “Each of the first, second, and third windows 504, 508, and 512 displays a corresponding image.” Gadnir, Column 12, lines 17-20. And that each image can be resized and reshaped to fit the subject matter on the screen layout. Gadnir, Column 14, lines 29-48.)
Regarding Claim 11: “The information processing apparatus according to claim 1, wherein the plurality of targets set in advance includes a face of a person.” (“The participant monitor 252 can acquire the facial images of each participant in the captured image using face detection techniques, acquire other object images in the captured image (such as a whiteboard, table, chair, and the like) using digital processing techniques, determine an identity of each acquired facial image by face recognition techniques using an identified biometric information of the participant,” Gadnir, Column 9, lines 37-39.)
Regarding Claim 12: “The information processing apparatus according to claim 1, wherein the one or more targets preset in the detection setting includes an electronic device.” (“acquire other object images in the captured image (such as a whiteboard, table, chair, and the like)” Gadnir, Column 9, lines 37-39. The object can be an electronic device, such as a microphone: “detecting/recognizing the microphones 118 (1)-118(2) in the video outputs of the video cameras 112” See Therkelsen, Paragraph 25, Fig. 2, and the statement of motivation in Claim 1.)
Regarding Claim 14: “The information processing apparatus according to claim 12, wherein the circuitry is caused to: (Note that “caused to” indicates an external cause because neither the claimed circuitry itself nor the memory is configured to perform this function. Thus, this claim is rejected for reasons stated for Claim 4. Cumulatively, prior art teaches the features noted below.)
collect a sound output by the electronic device; (“When the auditorium endpoint 104(1) determines that a person within proximity of the detected microphones 118(1)-118(2) is about to speak or is actively speaking into one of the detected microphone 118(1)-118(2),” Therkelsen, Paragraph 26. See statement of motivation in Claim 1.)
detect a direction from which the sound is collected; and (“When the auditorium endpoint 104(1) determines that a person within proximity [direction] of the detected microphones 118(1)-118(2) is about to speak or is actively speaking into one of the detected microphone 118(1)-118(2),” Therkelsen, Paragraph 26. See statement of motivation in Claim 1.)
generate the first image including the electronic device, based on the detected direction of the electronic device.” (“alter one of the video outputs of the video cameras 112(1)-112(3) to present a close-up or zoomed in view of the actively speaking participant (e.g., 106(2) or 106(3) as illustrated in FIG. 2).” Therkelsen, Paragraph 26. Also note that a close up image of the participant includes an image of the detected microphone as noted in Therkelsen, Paragraph 42 and Fig. 8C. See statement of motivation in Claim 1.)
Regarding Claim 15: “The information processing apparatus according to claim 12, wherein the circuitry is configured to: (Note that “caused to” indicates an external cause because neither the claimed circuitry itself nor the memory is configured to perform this function. Thus, this claim is rejected for reasons stated for Claim 4. Cumulatively, prior art teaches the features noted below.)
recognize the electronic device through image processing; and (“In step 600, the participant monitor 252 acquires and analyzes participants and non-participant objects of interest in the monitored area for the communication session. It determines where participants are seated, what objects are in the room” Gadnir, Column 12, lines 42-46. The object can be an electronic device, such as a microphone: “detecting/recognizing the microphones 118 (1)-118(2) in the video outputs of the video cameras 112” See Therkelsen, Paragraph 25, Fig. 2, and the statement of motivation in Claim 1.)
generate the first image including the electronic device recognized.” (“alter one of the video outputs of the video cameras 112(1)-112(3) to present a close-up or zoomed in view of the actively speaking participant (e.g., 106(2) or 106(3) as illustrated in FIG. 2).” Therkelsen, Paragraph 26. Also note that a close up image of the participant includes an image of the detected microphone as noted in Therkelsen, Paragraph 42 and Fig. 8C. See statement of motivation in Claim 1.)
Claim 19 is rejected for reasons stated for Claim 1, because the system elements of Claim 1 implement the method steps of Claim 19.
Regarding Claim 20: “The information processing apparatus according to claim 1, (Note that “caused to” indicates an external cause because neither the claimed circuitry itself nor the memory is configured to perform this function. Thus, this claim is rejected for reasons stated for Claim 4. Cumulatively, prior art teaches the features noted below.)
wherein the circuitry is caused to display a display-range fixing button on the communication terminal to fix the first display area for the first image.” (“For example, the participant user can press a button in the remote control that invokes the view optimization.” Gadnir, Column 14, lines 3-4. See view optimization for display layout in Claim 1.)
Regarding Claim 21: “The information processing apparatus according to claim 20, wherein
in a case where the display-range fixing button is turned on, the first display area for the first image in the combined image is set to the fixed value, and (“For example, the participant user can press [turn on] a button in the remote control that invokes the view optimization. Then, the camera can zoom out to get a full view of the room, and then zoom in and adjust according to scene analysis,” which adjusts the content of the image but not the area allotted to it on the display of the combined image. Gadnir, Column 14, lines 3-4. “normalizing the facial images allows for the presentation of facial images in standard sized viewports” Gadnir, Column 14, lines 45-47. See view optimization of images for display layout in Claim 1.)
a dimension of the first image is automatically reduced, while maintaining a dimension of the first display area, such that the first image includes all of the plurality of targets, and (“For example, the participant user can press [turn on] a button in the remote control that invokes the view optimization. Then, the camera can zoom out to get a full view of the room, and then zoom in and adjust according to scene analysis,” which adjusts the content of the image but not the area allotted to it on the display of the combined image. Gadnir, Column 14, lines 3-4. See view optimization of images for display layout in Claim 1.)
in a case where the display-range fixing button is turned off, the first display area for the first image in the combined image is not set to the fixed value.” (Prior art teaches a display-range fixing button as a view optimization button noted above and in Gadnir, Column 14, lines 3-4. When the participant is not pressing the button, “the participant monitor 252, … acquires and digitally crops and/or zooms an image of the selected object from the digital image and optionally normalizes or resizes the selected object image so that all of the images for the different objects appear to be equally sized or spaced equidistant from the capture device,” which does adjust the area allotted to the image on the display based on the content of this and other images on display. Gadnir, Column 14, lines 29-35, and Fig. 5. And, prior art indicates that this feature can be turned on or off as needed by the application: “layout 500 can have any configuration and any number of windows depending on the application.” Gadnir, Column 12, lines 20-22 and similarly that visual effects like window resizing can be enabled or not in Van, Paragraphs 468-471, 477. See statement of motivation in Claim 1.)
Regarding Claim 22: “The information processing apparatus according to claim 1, wherein the circuitry is caused to set the plurality of targets in advance by learning or controlling the communication terminal to learn a shape of each of the plurality of targets using machine learning.” (“The term "facial recognition" or "face recognition" refers to an algorithm for identifying a person's identity based on a detected facial image of the person by applying digital image processing techniques to image information (either still or video frame). One of the ways to do this is by comparing selected facial features from the image and a facial database. For example, an algorithm may analyze [learn] the relative position, size, and/or shape of the eyes, nose, cheekbones, and jaw. … the Multilinear Subspace Learning using tensor representation …” Gadnir, Column 4, lines 12-36. Also: “The control unit 212 can communicate (i.e. exchange audio and video information and/or any additional data), over the communications network 112, … access an enterprise database 260 comprising subscriber information,” Gadnir, Column 7, lines 45-49. Also note the use of trained neural networks to identify objects in Therkelsen, Paragraph 41, and the statement of motivation in Claim 1.)
Regarding Claim 23: “The information processing apparatus according to claim 1, wherein
the circuitry is caused to detect the one or more participant speaking by performing face detection around a sound direction of the speaking and (“determine an active speaker using speaker localization and a microphone array, … The participant monitor 252, using face detection techniques, microphone array analysis” Gadnir, Column 9, lines 33-47, 60-62.)
clipping 15-degree leftward and rightward ranges relative to a center of a face of the one or more participant speaking from the wide-angle image.” (“The digital camera typically has a horizontal field or angle of view of at least about 100 degrees, more typically at least about 110 degrees, and more typically at least about 120 degrees. … the monitor 252 can extract and crop (or digitally zoom) an image of each of the first participant 400, second participant 404, third participant 408,” Gadnir, Column 11, lines 47-49. “digitally crops and/or zooms an image of the selected object from the digital image and optionally normalizes or resizes the selected object image so that all of the images for the different objects appear to be equally sized.” Gadnir, Column 14, lines 29-35. Thus, Gadnir crops the image to a fixed narrow field of view based on sound localization and face detection of the speaker. Although Gadnir does not specify the narrow field of view to be 30 degrees out of 100, it is reasonable that, for the example of three participants captured in the video, the speaker’s face would be within around 30% of the field of view in the horizontal direction, and thus the cropped image would occupy around +/- 15 degrees of horizontal field of view around the speaker.)
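The field-of-view proportion discussed above can be checked with simple arithmetic. The sketch below is illustrative only; the 100-degree field of view follows Gadnir’s “at least about 100 degrees,” while the image width, function name, and linear angle-to-pixel mapping are hypothetical simplifications:

```python
# Illustrative check of the field-of-view arithmetic (hypothetical values).
FOV_DEG = 100.0      # horizontal field of view, per Gadnir's "at least about 100 degrees"
IMAGE_WIDTH = 4000   # wide-angle image width in pixels (hypothetical)

def crop_bounds(face_center_px, half_angle_deg=15.0):
    """Pixel bounds of a crop spanning +/- half_angle_deg around a detected
    face center, using a linear angle-to-pixel mapping for simplicity."""
    half_width = round(half_angle_deg * IMAGE_WIDTH / FOV_DEG)
    return face_center_px - half_width, face_center_px + half_width

left, right = crop_bounds(2000)  # face detected at the image center
crop_fraction = (right - left) / IMAGE_WIDTH  # 30 of 100 degrees -> 0.3
```

Under these assumptions, a ±15-degree clip spans 30 of 100 degrees, i.e., about 30% of the image width, consistent with the proportionality reasoning above.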
Regarding Claim 26: “The information processing apparatus according to claim 1, wherein the height direction is a direction in which the plurality of targets stand.” (“The normalization may be accomplished by resizing the width and/or height of the facial image” and a height direction of a face is a height direction in which the face stands. Gadnir, Column 14, lines 37-39.)
Regarding Claim 27: “The information processing apparatus according to claim 1, wherein in response to determining that the change in the vertical movement of the one of the participants increases a height of the one of the participants in the height direction, the ratio of the first display area relative to the second display area in the height direction is increased.” (See rejection under section 112 above. Cumulatively, prior art teaches an example of this: “the imaging controller 256 selects as the optimal view a view having the first, second, and third participants 300a-c in frame, in focus and centralized, with minimal background in the captured image. The imaging controller 256 adjusts the pan, tilt, and zoom of the camera 216 to produce this view. After the video conferencing communication session commences and after a selected time interval has elapsed, the second participant 300b becomes the active speaker and stands and walks to the whiteboard 304. The imaging controller 256, in response, selects as the optimal view a view having the whiteboard and second participant in frame, in focus and centralized,” including height and pan adjustments. Gadnir, Column 12, lines 48-53. See adjusting the window in correspondence with the image in Claim 1.)
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over US 9936162 to Gadnir (“Gadnir”) in view of US 20200351435 to Therkelsen (“Therkelsen”) in view of US 20220070385 to Van (“Van”) and in view of US 20180091738 to Takahashi (“Takahashi”).
Regarding Claim 10: “The information processing apparatus according to claim 1,
Gadnir and Therkelsen do not teach the claim language below. Gadnir teaches a similar function: “acquires and digitally crops and/or zooms an image of the selected object from the digital image and optionally normalizes or resizes the selected object image so that all of the images for the different objects appear to be equally sized or spaced equidistant from the capture device.” Gadnir, Column 14, lines 30-35. However, Gadnir does not base this function on an explicit determination of a space between objects.
Takahashi teaches the claim features below in the context of compositing video:
“wherein the circuitry is caused to, based on a determination that a space between a first target and a second target among the plurality of targets is greater than or equal to a threshold, (Note that “caused to” indicates an external cause because the claimed circuitry itself or the memory are not configured to perform this function. Thus, this claim is rejected for the reasons stated for Claim 1. Cumulatively, prior art teaches: “The shift amounts are set in accordance with the movement amount X of the face of the person B and the distance (depth distance) from the display screen of the display 5 of the person A, … as shown in FIG. 7B. It should be understood that this distance dw is a distance larger than the reference [threshold] distance dl, which is the depth distance of the person A.” Takahashi, Paragraphs 110, 112.)
generate the first image from which an excessive space between the first target and the second target is omitted.” (“When the movement angle θ and the respective depth distances dl and dw of the person A and the background are identified, the composite image display unit 16 sets the respective shift amounts for the display position of the image of the person A and the display range of the background image in the composite image.” Takahashi, Paragraphs 112, 113.)
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to supplement the teachings of Gadnir and Therkelsen to implement the claimed compositing features above, as taught in Takahashi, so that “the display screen feels that as if the user were facing the conversation partner”. Takahashi, Paragraph 3.
Finally, in reviewing the present application, there does not seem to be objective evidence that the claim limitations are particularly directed to: addressing a particular problem which was recognized but unsolved in the art, producing unexpected results at the level of the ordinary skill in the art, or any other objective indicators of non-obviousness.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over US 9936162 to Gadnir (“Gadnir”) in view of US 20200351435 to Therkelsen (“Therkelsen”) in view of US 20220070385 to Van (“Van”) and in view of US 20160253859 to Staffer (“Staffer”).
Regarding Claim 13: “The information processing apparatus according to claim 12,
Gadnir and Therkelsen do not teach: “wherein the circuitry is caused to: detect a two-dimensional code displayed by an electronic device; and generate the first image including the electronic device detected based on the two-dimensional code.”
Staffer teaches the above function in the context of obtaining images of objects and persons using wide-angle cameras: “This wide-angle objective captures the near range for bar-code reading and the far range for recording the operating person's actions at the same depth of field” Staffer, Paragraph 52.
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to supplement the teachings of Gadnir and Therkelsen to “detect a two-dimensional code displayed by an electronic device; and generate the first image including the electronic device detected based on the two-dimensional code” in the manner taught in Staffer, so that a camera can be used as “a common device by means of which both the actions carried out by an operating person while operating the system 1 can be recorded and additional information items, in particular in the form of a bar code.” Staffer, Paragraph 52.
Finally, in reviewing the present application, there does not seem to be objective evidence that the claim limitations are particularly directed to: addressing a particular problem which was recognized but unsolved in the art, producing unexpected results at the level of the ordinary skill in the art, or any other objective indicators of non-obviousness.
Claim 28 is rejected under 35 U.S.C. 103 as being unpatentable over US 9936162 to Gadnir (“Gadnir”) in view of US 20200351435 to Therkelsen (“Therkelsen”) in view of US 20220070385 to Van (“Van”) and in view of US 20220245850 to Liu (“Liu”).
Regarding Claim 28: “The information processing apparatus according to claim 1,
wherein the first image is a panoramic image, (“the high resolution and wide angle digital image information can permit extraction of multiple high quality images, which can be displayed in the windows as desired.” Gadnir, Column 12, lines 33-55. In this case a panoramic image can be a whole or a part of a wide angle image from a camera that “has a horizontal field or angle of view of at least about 100 degrees, more typically at least about 110 degrees, and more typically at least about 120 degrees.” Gadnir, Column 11, lines 46-50.)
Gadnir does not teach: “wherein the wide-angle image is generated from a spherical image with a predetermined angle of view, the spherical image acquired by capturing surroundings by a camera, and”; however, this is a well-established way of generating images in the art:
Liu teaches the above claim feature in the context of imaging people: “The image processing apparatus 10 panoramically expands an input fisheye image, and generates a panoramic image.” Liu, Paragraphs 40, 47. See a predetermined angle of view in Figs 6-10.
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to supplement the teachings of Gadnir, Therkelsen, and Van to generate a wide-angle image and a partial wide-angle image as a panoramic image, as taught in Liu, in order to use a fisheye camera to capture multiple objects in different locations. See Liu, Paragraph 47 and Figs. 6-10.
Finally, in reviewing the present application, there does not seem to be objective evidence that the claim limitations are particularly directed to: addressing a particular problem which was recognized but unsolved in the art, producing unexpected results at the level of the ordinary skill in the art, or any other objective indicators of non-obviousness.
wherein the panoramic image is generated from a portion of the spherical image to include all of the plurality of targets.” (For example, “selects as the optimal view a view having the first, second, and third participants 300a-c in frame, in focus and centralized,” Gadnir, Column 11, lines 24-27, and Column 12, lines 22-39. This can be done by controlling the camera or by a computer that “digitally crops and/or zooms an image of the selected object from the digital image”, thus providing a panoramic image as a portion of a wide-angle digital image from a camera. See Gadnir, Column 14, lines 19-45. Note that a wide-angle digital image can be acquired from a spherical image camera as noted in Liu, Paragraphs 40, 47. See the statement of motivation above.)
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MIKHAIL ITSKOVICH whose telephone number is (571)270-7940. The examiner can normally be reached Mon. - Thu. 9am - 8pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Joseph Ustaris can be reached at (571)272-7383. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MIKHAIL ITSKOVICH/Primary Examiner, Art Unit 2483