DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claims 1-15 are objected to because of the following informalities: each independent claim recites a “rending command” in the final clause of each respective claim. This is recognized as a typographical error where “rendering” is meant instead. The claims will be interpreted such that “rendering” is recited instead of “rending”. Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-15 are rejected under 35 U.S.C. 103 as being unpatentable over Lorrain et al.1 (“Lorrain”) in view of Oz et al.2 (“Oz”).
Regarding claim 1, Lorrain teaches an imaging method used for performing video conference on a display, wherein the imaging method comprises (note that “imaging method” is any method that involves creating, processing, manipulating, or generating images in any form where imaging can include any image-related operations such as capture, processing, transformation, synthesis, rendering and/or display; note that such a method being “used for performing video conference on a display” establishes an intended purpose and such a method must be capable of being used for such a purpose where a video conference performed on a display is any display of a conference, which is some real-time or near-real-time video communication between two or more participants or endpoints; here note that as the “video conference stream” and the “display” are further recited in the body of the claim, the preamble is given patentable weight and limits the claims to methods used in video conferencing contexts on displays based on video conference streams, but also note that the body of the claim does not recite that the video conference stream is necessarily related to the “2D frames shown on the display”; see Lorrain, as applied to the below limitations which comprise a method used for performing video on a display):
receiving at least one video conference stream (note that this video conference stream is not functionally tied to any other limitation other than the preamble but the following “2D frames shown on the display” are not necessarily 2D frames from the at least one video conference stream, however it should be noted that in the interest of compact prosecution, for prior art purposes at least, the video stream is considered essentially the source of the 2D frames shown on the display; see Lorrain, paragraphs 0037-0046 teaching “a system 100 for generating and distributing 3D video segments 122 to user devices 152 according to an aspect” and “system 100 includes a video manager 102 configured to generate a 3D video segment 122 from 2D video content 134 … enabling 3D video content 121 generated from the 3D video segment 122 to be displayed in an interface 140 on a display 123 of a user device 152” and “system 100 may enable real-time rendering of the 3D video segment 122 on user devices 152 by using a streaming engine 120 (e.g., a cloud-based 3D pixel streaming engine) that generates (e.g., renders) the 3D video content 121 from the 3D video segment 122 and re-generates (e.g., re-renders) the 3D video content 121 according to any user selections 171” and “3D video segment 122 may be a 3D video highlight (e.g., a 3D sports highlight). However, the techniques discussed herein may be applied to any type of underlying video content having one or more object 108 that move in 2D video content 134” and “video manager 102 includes a 3D video generator 104 configured to receive 2D video content 134 and generate a 3D video segment 122 based on the 2D video content 134” and “2D video content 134 includes video data (and, in some examples, audio data) captured from a camera system 132. The camera system 132 may include one or more camera devices configured to capture video in two dimensions, representing a scene as a flat image or a sequence of frames with height and width. Each frame of the 2D video content 134 includes a 2D array of pixels, where each pixel includes color and brightness information. In some examples, the camera system 132 includes an audio system having one or more microphones configured to capture audio data” and “2D video content 134 is displayed on the display 123 of the user device 152” and “the 2D video content 134 is streamed to the user device 152 via a media platform (e.g., a streaming platform) while the camera system 132 is generating the 2D video content 134. In some examples, the 2D video content 134 is stored on a remote server computer and streamed to the user device 152 from the remote server computer. In some examples, the 2D video content 134 is stored on the user device 152” and “3D video generator 104 may obtain the 2D video content 134 from a streaming platform that distributes and/or stores the 2D video content 134 captured from the camera system 132” such that here at least one video stream is received and as explained below such stream is where the 2D frames shown on the display are sourced from);
capturing a plurality of 2D frames shown on the display, when a 3D modeling command is received (note that the manner of such capturing is not limited and for example any obtaining of such 2D frames that have been shown on the display at any point in time is such capturing, and note that all of such 2D frames need not be captured “when a 3D modeling command is received”, and note that the claim does not specify what such a command is other than it is related to 3D modeling, nor its format, nor its source such that if there is a command to model something in 3D which is provided by a system or the user and such capturing occurs then the claim limitations are met; see Lorrain, paragraphs 0037-0046 teaching “video manager 102 includes a 3D video generator 104 configured to receive 2D video content 134 and generate a 3D video segment 122 based on the 2D video content 134” and “2D video content 134 includes video data (and, in some examples, audio data) captured from a camera system 132. The camera system 132 may include one or more camera devices configured to capture video in two dimensions, representing a scene as a flat image or a sequence of frames with height and width. Each frame of the 2D video content 134 includes a 2D array of pixels, where each pixel includes color and brightness information. In some examples, the camera system 132 includes an audio system having one or more microphones configured to capture audio data” and “2D video content 134 is displayed on the display 123 of the user device 152” and “the 2D video content 134 is streamed to the user device 152 via a media platform (e.g., a streaming platform) while the camera system 132 is generating the 2D video content 134. In some examples, the 2D video content 134 is stored on a remote server computer and streamed to the user device 152 from the remote server computer. In some examples, the 2D video content 134 is stored on the user device 152” and “3D video generator 104 may obtain the 2D video content 134 from a streaming platform that distributes and/or stores the 2D video content 134 captured from the camera system 132” such that this is capturing of the plurality of 2D frames of the 2D video content and these are captured when a 3D modeling command is received such as to “generate a 3D video segment” and for example such a command may come from a user where “the user may view the 2D video content 134 in an interface 140 of the application 146. The interface 140 may include a UI element 107 configured to enable the user to view the 2D video content 134 in a 3D format. In some examples, the UI element 107 is a UI control that enables the user to view the 2D highlight in a 3D format. In some examples, selection of the UI element 107 causes a 3D viewing request 184 to be generated and transmitted to the video manager 102. The 3D viewing request 184 may include information that identifies the 2D video content 134. In some examples, the 3D viewing request 184 includes the resource identifier of the 2D video content 134. In some examples, the 3D viewing request 184 may include the 2D video content 134” such that here the user may view the 2D video frames shown on the display and may command a 3D model to be built relating to the video content shown);
building at least one 3D model based on the plurality of 2D frames (see Lorrain, paragraphs 0037-0046 as explained above where the plurality of 2D frames are the basis for generating the 3D model where “video manager 102 includes a 3D video generator 104 configured to receive 2D video content 134 and generate a 3D video segment 122 based on the 2D video content 134” where such 2D video content is the plurality of frames shown on the display as explained above and the 3D video segment is a 3D model that is built based on those frames as further explained in paragraphs 0065-0073 teaching “3D object model 116 may include information about the geometry, topology, and appearance of a 3D object. The 3D object model 116 may define the shape and structure of the 3D object, and may include information about vertices, edges, and/or faces that form the object's surfaces. The 3D object model 116 may include information about the connectivity and relationships between the geometric elements of the model, and may define how the vertices, edges, and faces are connected to form the object's structure. The 3D object model 116 may include texture coordinates that define how textures or images are mapped into the surfaces of the model and may provide a correspondence between the points on the 3D surface and the pixels in a 2D texture image. In some examples, the 3D object model 116 may include information about normals (e.g., vectors perpendicular to the surface at each vertex or face) that determine the orientation and direction of the surfaces, indicating how light interacts with the surface during shading calculating. The 3D object model 116 may include information about material properties that describe the visual appearance and characteristics of the 3D object's surfaces, and may include information such as color, reflectivity, transparency, shininess, and other parameters that affect how the surface interacts with light” and “the 3D object engine 112 uses the video footage of the object 108 in the 2D video content 134 to generate the 3D object model 116” where “the 3D object engine 112 may include a mesh generator 182 configured to receive the 2D video content 134 that includes the object 108 and generate the 3D object model 116 to have characteristics that mimic the object's characteristics in the 2D video content 134” such that here the 3D object is built based on the plurality of 2D frames); and
rendering an adjusted image corresponding to a viewing angle information according to the at least one 3D model, when a rendering command corresponding to the viewing angle information is received (see Lorrain, paragraphs 0075-0079 teaching rendering an adjusted image corresponding to a viewing angle information according to the at least one 3D model as where “video manager 102 system may include a streaming engine 120 configured to generate 3D video content 121 from the 3D video segment 122. In some examples, the streaming engine 120 renders video frames from the 3D video segment 122, where the 3D video content 121 includes the rendered video frames” and “streaming engine 120 may enable near real-time streaming and rendering of 3D graphics and interactive content over a network 150” and this rendering of the 3D video content is display of an adjusted image which has now been adjusted to a 3D format from a 2D format and this rendering command is corresponding to viewing angle information that is received as the 2D content is initially displayed at the viewing angle captured by the cameras as explained above such that when rendering the 3D video and 3D model of the 3D video that was built, this renders an adjusted image based on the 3D model and as the user has requested the 3D model to be rendered based on viewing the video at such a viewing angle then this happens when a rendering command corresponding to that viewing angle is received; furthermore a user is also able to input a command to render the content corresponding to a certain viewing angle that is different from the initial video content such that this also renders an adjusted image corresponding to that viewing angle as in paragraphs 0095-0102 teaching an instance of live 2D content being broadcast and shown to a user where the user can control the camera angle from which the content is rendered as where “the interface 240 includes a UI element 207 a, which, when selected, enables the user to view the 2D video segment 234 in a 3D format. The UI element 207 a may be an example of the UI element 107 of FIGS. 1A through 1N. The user may select the UI element 207 a at any time during the display of the 2D video segment 234 (or after the 2D video segment 234 has ended). In some examples, in response to selection of the UI element 207 a, as shown in FIG. 2B, the interface 240 may display a UI information element 213 (e.g., “swipe to rotate”) and a UI information element 215 (“pinch to zoom”) about user selections which can adjust the view of the 3D video segment. The interface 240 may also display a UI element 207 b (“explore in 3D”), which, when selected, causes a 3D viewing request (e.g., the 3D viewing request 184 of FIGS. 1A through 1N) to be transmitted to a streaming engine 120 (e.g., the streaming engine 120 of FIGS. 1A through 1N). In some examples, selection of the UI element 207 a causes the 3D viewing request to be transmitted to the streaming engine” and “for an initial view, the streaming engine may select default settings for the user controls 242 and generate the 3D video content 221 a from the 3D video segment according to the default settings” and “user controls 242 include a viewing angle control 231 configured to enable the user to adjust the viewing angle” and “In response to selection of the viewing angle control 231, as shown in FIG. 2D, the interface 240 displays a menu with a plurality of viewing angles 211. 
The viewing angles 211 may include viewing angle 211 a (drone cam), viewing angle 211 b (bird's-eye view), viewing angle 211 c (in the game), viewing angle 211 d (quarterback point of view), viewing angle 211 e (be the ball), and viewing angle 211 f (on the sideline)” and “streaming engine may generate 3D video content 221 b from the 3D video segment according to the user selection and transmit the 3D video content 221 b to the application for display on the interface 240, as shown in FIG. 2E”).
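For illustration of the technique mapped above, and not as a characterization of Lorrain, Oz, or the claimed invention, the following minimal Python sketch shows the general sequence at issue in claim 1: capturing 2D frames, building a sparse 3D model from them, and rendering an adjusted image for a commanded viewing angle. All function names, the stream source, and the camera intrinsics K are assumptions of the sketch rather than anything taught by the references.

import cv2
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # assumed camera intrinsics

def capture_frames(stream_url, n=2):
    # Grab n 2D frames from the stream shown on the display (URL is a placeholder).
    cap = cv2.VideoCapture(stream_url)
    frames = [cap.read()[1] for _ in range(n)]
    cap.release()
    return frames

def build_model(f0, f1):
    # Build a sparse 3D point model from two 2D frames (SfM-style triangulation).
    orb = cv2.ORB_create(2000)
    k0, d0 = orb.detectAndCompute(cv2.cvtColor(f0, cv2.COLOR_BGR2GRAY), None)
    k1, d1 = orb.detectAndCompute(cv2.cvtColor(f1, cv2.COLOR_BGR2GRAY), None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d0, d1)
    p0 = np.float32([k0[m.queryIdx].pt for m in matches])
    p1 = np.float32([k1[m.trainIdx].pt for m in matches])
    E, _ = cv2.findEssentialMat(p0, p1, K)
    _, R, t, _ = cv2.recoverPose(E, p0, p1, K)
    P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P1 = K @ np.hstack([R, t])
    pts4 = cv2.triangulatePoints(P0, P1, p0.T, p1.T)
    return (pts4[:3] / pts4[3]).T  # Nx3 3D model points

def render_from_angle(points3d, yaw_deg):
    # Render the "adjusted image" of the model for the commanded viewing angle (yaw about Y).
    rvec = np.array([0.0, np.deg2rad(yaw_deg), 0.0])
    img_pts, _ = cv2.projectPoints(points3d, rvec, np.zeros(3), K, None)
    canvas = np.zeros((480, 640, 3), np.uint8)
    for x, y in img_pts.reshape(-1, 2).astype(int):
        if 0 <= x < 640 and 0 <= y < 480:
            cv2.circle(canvas, (x, y), 1, (255, 255, 255), -1)
    return canvas

In such a sketch, a rendering command carrying viewing angle information would simply map to a call such as render_from_angle(model, 30.0).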
Lorrain teaches all that is required as explained above but fails to specifically teach that the performing of the video on a display is performing a video conference on a display and, relatedly, that the at least one video stream is a video conference stream. Note that a displayed video conference does not require an active video conference to be occurring, as a video conference could be a video of a conference or other event, and note that the received video conference stream is not limited to any specific format or capability other than being video related to a video conference in some way. Lorrain does teach receiving a video stream, where such a video stream may be from a live event being captured by a camera and such video is shown on a display to a remote user, as explained above. While the majority of Lorrain's examples are directed to “sports” environments, Lorrain specifically suggests that the technique may be applied to any video content having one or more objects that move in 2D video content, where a video conference is exactly such a type of video content (see Lorrain, paragraph 0041 teaching “3D video segment 122 may be a 3D video highlight (e.g., a 3D sports highlight). However, the techniques discussed herein may be applied to any type of underlying video content having one or more object 108 that move in 2D video content 134”). Thus Lorrain stands as a base device upon which the claimed invention can be seen as an improvement through the ability to utilize video conference stream images to enable display of objects based on the images shown, which provides benefits to those engaged in more specific live broadcasts such as video conferences.
In the same field of endeavor relating to capturing images of a video conference and building and displaying 3D models from such images displayed to a user (see Oz, paragraphs ), Oz teaches that it is known to perform display of a video conference by receiving at least one video conference stream, capturing images shown on the display to the user and building a 3D model from the images shown on the display (see Oz, paragraph 0052 teaching “generation of the first avatar and/or the inclusion of the first avatar may be responsive to information gained by the device of the first user or to a camera or sensor associated with the device of the first user. A non-limiting example of information may include information regarding the first participant and/or information regarding to the acquisition of images of the first participant (for example camera setting, illumination and/or ambient conditions)” and paragraph 0144 teaching “a new view is created based on a real-time image obtained from a video camera and the position of the new point of view (virtual camera)” and as in paragraphs 0160-0163, “3D models and texture maps of the users can be created on the fly from a 2D or 3D video camera or can be prepared before the beginning of the 3D video conference call” and “video camera is a 2D camera, then computerized models… may be used to create a 3D model from the 2D images” where “3D models can be created in several ways” and “At runtime, such a network would receive an image of a person as an input and a camera position from which the person should be rendered. The network would render an image of that person from the different camera position” and as in paragraph 0234 “During the communication session, i.e., a 3D video conference call between several users, a 2D or 3D camera (or several cameras) grabs videos of the users. From these videos a 3D model (for example—the best fitting 3D model) of the user may be created at a high frequency” and note paragraphs 0281-0293 teach an instance of “a person speaking in a conference may be holding an object that is significant to the specific conference call or may not be significant at all. The speaker may be holding a pen which has no significance to the meeting or a diagram which is very significant to the meeting. To transmit these objects to the other viewers, they may be recognized and modelled as 3D objects. The model may be transmitted to the other users for reconstruction” and “user will be able to control the part or parts of the narrow field of view image or images by using a mouse, a keyboard, a touch pad or a joystick or any other device that allows to pan and zoom in or out of an image” and finally as in paragraph 0525 “Note that the process mentioned above may be not limited to rendering images of people and can be also used to render animals or any other objects”). Thus Oz provides the above known techniques applicable to the base system of Lorrain.
Therefore, it would have been obvious for one of ordinary skill in the art before the effective filing date of the invention to modify Lorrain by applying the known techniques of Oz above, as doing so would be no more than the application of a known technique to a base device ready for improvement, would yield predictable results, and would result in an improved system. The predictable result of applying Oz’s video conferencing technique to Lorrain’s streaming reconstruction system would be a video conferencing system capable of generating 3D models of conference participants and any objects in the environment from 2D video streams and rendering those 3D models from any viewable angle. The results would be predictable because Lorrain already teaches applying the technique to live-streamed content in real time and suggests it can be used on any content in which objects are moving in a 2D video, and Oz provides such video showing objects that can be modelled. The combination would result in an improved system because the user of Lorrain as modified by Oz would have the additional capability to utilize video conference stream images to enable display of objects based on the images shown, which provides benefits to those engaged in more specific live broadcasts such as video conferences and the like.
Regarding claim 2, Lorrain as modified teaches all that is required as applied to claim 1 above and further teaches wherein the at least one video conference stream is directly received from a network (note that while the claim specifies that the stream is directly received from a network, it is not limited as to who, when, or what receives it directly, and thus if in the system the stream is directly received from a network at some point then the limitation is met; see Lorrain as modified by Oz where the video stream of Lorrain is a video conference stream in the combination and Lorrain teaches such video stream is directly received from a network as in paragraphs 0090-0093 teaching a server or user device may directly receive the stream from a network where “a video manager portion 102-1 is executable by the server computer(s) 160, a video manager portion 102-2 is executable by the user device 152. For example, some of the operations of the video manager 102 may be executable by the server computer(s) 160 and some of the operations of the video manager 102 may be executable by the user device 152. In some examples, the video manager portion 102-1 includes the 3D video generator 104, and the video manager portion 102-2 includes the streaming engine 120. In some examples, the video manager portion 102-1 includes the object tracker(s) 106 and the 3D object engine 112. In some examples, the video manager portion 102-1 includes the object tracker(s) 106, and the video manager portion 102-2 includes the 3D object engine 112” and “server computer(s) 160 may be computing devices that take the form of a number of different devices, for example a standard server, a group of such servers, or a rack server system. In some examples, the server computer(s) 160 may be a single system sharing components such as processors and memories. In some examples, the server computer(s) 160 may be multiple systems that do not share processors and memories. The network 150 may include the Internet and/or other types of data networks, such as a local area network (LAN), a wide area network (WAN), a cellular network, satellite network, or other types of data networks. The network 150 may also include any number of computing devices (e.g., computers, servers, routers, network switches, etc.) that are configured to receive and/or transmit data within network 150. Network 150 may further include any number of hardwired and/or wireless connections” and as in paragraph 0045 Lorrain teaches “3D video generator 104 may obtain the 2D video content 134 from a streaming platform that distributes and/or stores the 2D video content 134 captured from the camera system 132. In some examples, the 3D video generator 104 may obtain the 2D video content 134 from the streaming platform while the camera system 132 is generating the 2D video content 134” such that here the stream can be sent either to a server or user device directly from a network such as network 150).
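As a purely illustrative aside, and not something drawn from Lorrain or Oz, directly receiving a stream from a network as claim 2 is read above can be as simple as opening a network endpoint rather than a local file; the RTSP URL below is a hypothetical placeholder.

import cv2

cap = cv2.VideoCapture("rtsp://conference.example.com/live/room1")  # hypothetical network endpoint
ok, frame = cap.read()  # each 2D frame arrives directly over the network connection
if ok:
    cv2.imwrite("received_frame.png", frame)
cap.release()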
Regarding claim 3, Lorrain as modified teaches all that is required as applied to claim 1 above and further teaches wherein the step of building the at least one 3D model and the step of displaying the adjusted image corresponding to the viewing angle information are performed concurrently (see Lorrain as modified by Oz where Lorrain provides such functionality already in combination as in paragraphs 0029-0031 teaching “system may enable real-time rendering of the 3D video segment on user devices by using a streaming engine (e.g., a cloud-based 3D pixel streaming engine) that generates (e.g., renders) the 3D video content from the 3D video segment and re-generates (e.g., re-renders) the 3D video content according to any user selections” and “a user may view the 2D video content using their user device and may initiate a 3D viewing request to view the 2D video content in a 3D format. For example, the interface may include a UI element, which, when selected, causes the 3D viewing request to be transmitted to the video manager. In response to the 3D viewing request, the 3D video generator may generate the 3D video segment (e.g., generates the 3D video segment on the fly or in near-real time)” such that the 3D model built above is built “on the fly or in near-real time”, meaning that the building of the 3D model happens at the same time as displaying the 3D model, as this enables the real-time viewing of the 3D model according to the request that, as explained above, can specify a viewing angle).
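For illustration of the concurrency reading applied above, building the model while display proceeds, and not as the implementation of either reference, a minimal Python sketch with a background builder thread is shown below; the per-frame reconstruction and rendering steps are reduced to placeholders.

import queue
import threading
import time
import numpy as np

frame_queue = queue.Queue()
model_lock = threading.Lock()
model_points = []  # shared, incrementally built 3D point model

def build_model_worker():
    # Consume captured 2D frames and keep extending the 3D model (placeholder reconstruction).
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        new_points = np.random.rand(10, 3)  # stands in for a real SfM triangulation step
        with model_lock:
            model_points.append(new_points)

def display_loop(frames=30):
    # Concurrently render/display using whatever portion of the model exists so far.
    for _ in range(frames):
        with model_lock:
            count = sum(len(p) for p in model_points)
        print(f"rendering adjusted image from {count} model points")  # stands in for rendering
        time.sleep(1 / 30)

threading.Thread(target=build_model_worker, daemon=True).start()
for _ in range(30):  # feed synthetic "captured 2D frames"
    frame_queue.put(np.zeros((480, 640, 3), np.uint8))
display_loop()
frame_queue.put(None)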
Regarding claim 4, Lorrain as modified teaches all that is required as applied to claim 1 above and further teaches wherein a relative angle relationship is built in the step of building the at least one 3D model (note that the claim does not recite what a relative angle relationship is in relation to and thus if any relative angle relationship between any two elements is somehow part of building the 3D model then the limitation is met, where this could comprise any one or more of camera angle relationships, object orientation relationships, viewing angle relationships, rotational relationships or spatial angular relationships for example; see Lorrain as modified where Lorrain already provides such functionality within the combination such as in paragraphs 0071-0073 as explained above where the 3D reconstruction from the 2D images establishes numerous relative angle relationships between camera viewpoints as well as relative angles at which 3D objects appear in the 3D view of the scene at the commanded viewing angle where for example “mesh generator 182 may obtain or determine the intrinsic parameters of the camera system 132, such as focal length, principal point, and lens distortion. This information may be used to accurately project the 3D geometry into the 2D image coordinates. In order to reconstruct the 3D geometry, the mesh generator 182 may extract the features or keypoints from the 2D video frames. These features can be points, edges, corners, or other distinctive elements. Feature tracking algorithms are used to match corresponding features across different frames, allowing the tracking of their movement over time. In some examples, the mesh generator 182 may include one or more structure-from-motion (SfM) techniques. SfM is a technique used to estimate the camera poses and reconstruct the 3D structure of the scene from a set of 2D images. It utilizes the tracked features and camera calibration information to determine the camera positions and orientations at different time instants. SfM algorithms estimate the 3D structure by triangulating the corresponding feature points from multiple camera viewpoints”).
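For illustration only, using synthetic data and not material from either reference, one concrete form of a relative angle relationship built during reconstruction is the relative rotation between two camera viewpoints recovered from matched 2D points, as sketched below.

import cv2
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
pts3d = np.random.rand(80, 3) * 2 + [0.0, 0.0, 6.0]  # synthetic scene points in front of the cameras
rvec = np.array([0.0, np.deg2rad(20.0), 0.0])         # second view rotated 20 degrees in yaw
tvec = np.array([0.5, 0.0, 0.0])                      # small baseline between the views

p0, _ = cv2.projectPoints(pts3d, np.zeros(3), np.zeros(3), K, None)
p1, _ = cv2.projectPoints(pts3d, rvec, tvec, K, None)
p0, p1 = p0.reshape(-1, 2).astype(np.float32), p1.reshape(-1, 2).astype(np.float32)

# Recover the relative camera pose from the 2D correspondences, then the relative angle from R.
E, _ = cv2.findEssentialMat(p0, p1, K)
_, R, t, _ = cv2.recoverPose(E, p0, p1, K)
angle = np.degrees(np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0)))
print(f"relative angle between the two camera viewpoints: {angle:.1f} degrees")  # about 20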
Regarding claim 5, Lorrain as modified teaches all that is required as applied to claim 1 above and further teaches wherein a light source direction is built in the step of building the at least one 3D model (note that the form of the “light source direction” is not specified, nor exactly what light source or direction, nor is the manner in which this is somehow “built” in the building step, such that if during building of the 3D model there is any establishing, computing, estimating or otherwise incorporating of any light direction information then the limitation is met, and this would comprise, for example, any one or more of estimation of scene lighting as part of reconstruction based on display conditions and image information, computation of surface normals that encode lighting response, incorporation of shading into a model, or determination of illumination parameters for rendering or that otherwise build lighting properties into a 3D model; see Lorrain, paragraph 0066 teaching “The 3D object model 116 may include texture coordinates that define how textures or images are mapped into the surfaces of the model and may provide a correspondence between the points on the 3D surface and the pixels in a 2D texture image. In some examples, the 3D object model 116 may include information about normals (e.g., vectors perpendicular to the surface at each vertex or face) that determine the orientation and direction of the surfaces, indicating how light interacts with the surface during shading calculating. The 3D object model 116 may include information about material properties that describe the visual appearance and characteristics of the 3D object's surfaces, and may include information such as color, reflectivity, transparency, shininess, and other parameters that affect how the surface interacts with light” and paragraphs 0074-0076 teaching “3D video generator 104 may generate the 3D video segment 122 to include other computer-generated model objects (e.g., static and/or animated models) that represent other objects in the 3D scene such the court (e.g., an outside basketball court, a basketball within an arena within an area), the hoop, the fans, lighting and shading that indicate time of day (e.g., nighttime or daytime), etc” and “streaming engine 120 may enable near real-time streaming and rendering of 3D graphics and interactive content over a network 150, which may include one or more rendering operations such as geometry processing, shading and material calculations, camera and viewpoint calculations, frame composition, video encoding, and/or network transmission” such that here there are numerous instances of a light source direction being built in the step of building the 3D model).
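As a further illustration only, using synthetic data and a standard Lambertian shading model rather than anything taken from the references, a light source direction can be built into a model by least-squares fitting the observed shading against the surface normals, as sketched below.

import numpy as np

rng = np.random.default_rng(0)
true_light = np.array([0.3, 0.5, 0.81])
true_light /= np.linalg.norm(true_light)

normals = rng.normal(size=(500, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)  # unit surface normals of the model
shading = normals @ true_light                             # observed Lambertian intensities
front_lit = shading > 0                                    # keep samples facing the light

# Least-squares estimate of the light source direction from (normal, intensity) pairs.
light_dir, *_ = np.linalg.lstsq(normals[front_lit], shading[front_lit], rcond=None)
light_dir /= np.linalg.norm(light_dir)
print("estimated light source direction:", np.round(light_dir, 3))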
Regarding claim 6, Lorrain as modified teaches all that is required as applied to claim 1 above and further teaches wherein the viewing angle information is adjusted by an input device, and the adjusted image is displayed on an imaging window (note that “an imaging window” is considered any defined visual area, region, or frame in which images are displayed or a portion of a display screen designated for showing image content and could be for example application windows, browser windows, video player frames, display viewports, PiP windows, full-screen displays, or any defined area of a screen or display where images appear; see Lorrain as modified, where Lorrain already teaches in the combination such displaying in an imaging window and where the viewing angle information is adjusted by an input device as in paragraph 0079 teaching for example “user controls 142 may include one or more customization controls 142 a that may modify the playback of the 3D video segment 122. The customization controls 142 a may include a viewing angle control 131 configured to enable the user to adjust a viewing angle 111 of the 3D video segment 122. The viewing angle control 131 may provide a plurality of viewing angles 111 including a viewing angle 111 a and a viewing angle 111 b. Although two viewing angles 111 are shown in FIG. 1I, it is noted that the 3D video segment 122 may be replayed from many viewing angles 111 (including all viewing angles 111). In some examples, the viewing angles 111 include a predefined set of viewing angles 111” and as in paragraphs 0100-0102 “In response to selection of viewing angle 211 d, the application may transmit information about the user selection (e.g., the viewing angle 211 d) to the streaming engine. The streaming engine may generate 3D video content 221 b from the 3D video segment according to the user selection and transmit the 3D video content 221 b to the application for display on the interface 240, as shown in FIG. 2E” such that here the user can adjust the viewing angle using an input device and it can then be displayed to them as adjusted on the display).
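Purely for illustration of the interpretation above, an input device adjusting the viewing angle and an imaging window displaying the adjusted image, the following sketch uses keyboard input and an OpenCV window; the cube points stand in for a built 3D model and are an assumption of the sketch, not content of either reference.

import cv2
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
cube = np.float64([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])  # stand-in 3D model
tvec = np.array([0.0, 0.0, 6.0])

yaw = 0.0
while True:
    rvec = np.array([0.0, np.deg2rad(yaw), 0.0])
    pts, _ = cv2.projectPoints(cube, rvec, tvec, K, None)
    canvas = np.zeros((480, 640, 3), np.uint8)
    for x, y in pts.reshape(-1, 2).astype(int):
        cv2.circle(canvas, (x, y), 4, (255, 255, 255), -1)
    cv2.imshow("imaging window", canvas)  # the adjusted image shown in an imaging window
    key = cv2.waitKey(30) & 0xFF
    if key == ord('a'):
        yaw -= 5.0                        # input device (keyboard) adjusts the viewing angle
    elif key == ord('d'):
        yaw += 5.0
    elif key == ord('q'):
        break
cv2.destroyAllWindows()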
Regarding claim 7, Lorrain as modified teaches all that is required as applied to claim 1 above and further teaches wherein the step of building the at least one 3D model comprises: capturing a plurality of feature points from the plurality of 2D frames; and superimposing the plurality of feature points to link the plurality of 2D frames (see Lorrain as modified where Lorrain already teaches such limitations in the reconstructing that builds the 3D model, where the limitations merely describe a photogrammetry or structure-from-motion (SfM) reconstruction technique at a high level and which corresponds to Lorrain at paragraph 0071 teaching such capturing of feature points and superimposing of points to link the frames as in the feature matching and triangulation steps of an SfM technique as where “to reconstruct the 3D geometry, the mesh generator 182 may extract the features or keypoints from the 2D video frames. These features can be points, edges, corners, or other distinctive elements. Feature tracking algorithms are used to match corresponding features across different frames, allowing the tracking of their movement over time. In some examples, the mesh generator 182 may include one or more structure-from-motion (SfM) techniques. SfM is a technique used to estimate the camera poses and reconstruct the 3D structure of the scene from a set of 2D images. It utilizes the tracked features and camera calibration information to determine the camera positions and orientations at different time instants. SfM algorithms estimate the 3D structure by triangulating the corresponding feature points from multiple camera viewpoints”).
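By way of illustration of the feature-point capture and linking discussed above, and not the code of either reference, a minimal sketch follows; the frames here are synthetic placeholders.

import cv2
import numpy as np

rng = np.random.default_rng(1)
frame0 = (rng.random((480, 640)) * 255).astype(np.uint8)  # placeholder 2D frame
frame1 = np.roll(frame0, shift=15, axis=1)                 # second frame, simulating slight camera motion

orb = cv2.ORB_create(500)
kp0, des0 = orb.detectAndCompute(frame0, None)             # capture feature points from frame 0
kp1, des1 = orb.detectAndCompute(frame1, None)             # capture feature points from frame 1

matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des0, des1)
matches = sorted(matches, key=lambda m: m.distance)[:50]

# Superimpose the matched feature points to link the two frames.
linked = cv2.drawMatches(frame0, kp0, frame1, kp1, matches, None)
cv2.imwrite("linked_frames.png", linked)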
Regarding claim 8, the instant claim recites an “electronic device, comprising” a plurality of elements in the form of modules that are functionally recited, where the modules perform the various functions that are performed in relation to the limitations of claim 1 that have been addressed above. Note that while such modules are defined by their functions, the structure responsible for performing these functions is the “electronic device”, which is interpreted as some combination of hardware and/or software elements, where such device must be configured in some manner to be capable of performing the recited functions, which correspond to the functions of claim 1 above. Note that an “electronic device” does not limit the claim to a single piece of connected hardware but may for example comprise a system of distributed components that act as a device. In light of this, Lorrain as modified as applied to claim 1 already teaches such a device consisting of the components performing the functions as explained above in the rejection of claim 1, such as the “system that generates a three-dimensional model (3D) video segment (e.g., a 3D model data) from two-dimensional (2D) video content and may render, in near-real time, 3D video content, which is transmitted to and displayed on an interface of an application executable by a user device” (see Lorrain, paragraph 0028 and figure 1 for example showing such a device). Accordingly, the limitations of claim 8 correspond to the limitations of claim 1 above; thus it is rejected on the same grounds as claim 1 above.
Regarding claims 9-14, the instant claim limitations correspond to the claim limitations of claims 2-7, respectively in the same manner that claim 8 corresponds to claim 1. In light of this, the limitations of claims 9-14 correspond to the limitations of claims 2-7, respectively; thus they are rejected on the same grounds as claims 2-7, respectively.
Regarding claim 15, the instant claim recites the techniques embodied as a “video conference system, comprising” the same functional limitations required by, and as would be performed by, the method of claim 1. The claim further specifies that the system has some “first electronic device” that is related to the capturing of the at least one video conference stream, and such device must broadly, in some sense, be “used for capturing at least one video conference stream and uploading the at least one video conference stream to a network”. The manner in which the device is “used for” this is not limited and does not require such device to actually capture the stream itself or upload the stream itself, so long as it is “used for” those functions in any manner; thus it could be a camera used to capture content that is uploaded, or a module within the processor that causes capturing of the stream and/or uploading of the stream to a network. Additionally, a “second electronic device” is specified, though very broadly, as comprising modules for receiving the video, where such device has a display of some sort and either comprises the functionality of the processing module itself or has access to resources to perform such steps. Lorrain as modified already teaches all of such functionality of claim 15 as explained in the rejection of claim 1 above, and also already teaches, in the rejections cited above, such first and second devices, as Lorrain as modified teaches a system with a first device such as a camera for capturing a stream which is uploaded to a network, where some second device can receive such a stream through the network and display the rendered information on a display of a user, for example. In light of this, the limitations of claim 15 correspond to the limitations of claim 1; thus it is rejected on the same grounds as claim 1.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SCOTT E SONNERS whose telephone number is (571)270-7504. The examiner can normally be reached Mon-Friday 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Xiao Wu can be reached at (571) 272-7761. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SCOTT E SONNERS/Examiner, Art Unit 2613
/XIAO M WU/Supervisory Patent Examiner, Art Unit 2613
1 US PGPUB No. 20250245898
2 US PGPUB No. 20210360197