DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claims 1-5, 8, 9, 11, 13, 16, 17, 20, 22, 24, 26, 28, 32-34, 36, and 38 are pending in the present application, with claims 1 and 20 being independent. Claims 2, 5, 17, and 24 have been amended and claims 6, 7, 10, 12, 14, 15, 18, 19, 21, 23, 25, 27, 29-31, 35, and 37 have been cancelled by preliminary amendment.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 15 August 2024 has been considered by the examiner.
The listing of references in the specification is not a proper information disclosure statement. 37 CFR 1.98(b) requires a list of all patents, publications, or other information submitted for consideration by the Office, and MPEP § 609.04(a) states, "the list may not be incorporated into the specification but must be submitted in a separate paper." Therefore, unless the references have been cited by the examiner on form PTO-892, they have not been considered.
Specification
The disclosure is objected to because it contains an embedded hyperlink and/or other form of browser-executable code. Applicant is required to delete the embedded hyperlink and/or other form of browser-executable code; references to websites should be limited to the top-level domain name without any prefix such as http:// or other browser-executable code. See MPEP § 608.01.
Claim Interpretation
The term “his” in “his performance” is being interpreted as a gender-neutral pronoun, such that “his performance” refers to the performance of the respective player.
Claim Objections
Claims 1-5, 8, 9, 11, 13, 16, 17, 20, 22, 24, 26, 28, 32-34, 36, and 38 is/are objected to because of the following informalities:
Claims 1 and 20 should recite “before said real game, generating a 3D model of each player”
The claims should use the same terminology consistently throughout. For instance, in claim 1 applicant refers to both “each frame of said video footages” and “each frame” on its own. Similarly, the claims refer to both “said game field” and “said real game field”.
The dependent claims should recite “the method” or “the system”, since they refer back to the independent claims, which recite “A method” and “A system”, respectively.
Claims should be one sentence. Claim 28 is two sentences.
Claims 3, 13, 22, 32, and 34 should have an “and” or “or” before the last limitation.
Claim 3 should recite “allowing each spectator of a VR user interface of the software application...”
Claim 4 should recite “key points from”
Claim 5 should recite “2D display screen”
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-5, 8, 9, 11, 13, 16, 17, 20, 22, 24, 26, 28, 32-34, 36, and 38 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
With respect to claim 1, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to the scope of “generating a 3D model of each player according to his performance within previously taken video streams”. How is the 3D model generated according to the performance of the athlete? “Said real game” has not been previously defined in the claims. Is each player a player in said real game? How are the skeletal and skin features extracted for each player from said video footages using multiple deep learning models? It is not immediately clear as to how the extraction is performed using the models. Does “all players” relate to “each player” and the players in said game? Is it all players, even those not in said real game (e.g., players who may not be playing in said game, but have played previously)? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
Claim 20 recites similar limitations as to that of claim 1 and is accordingly rejected using substantially similar rationale as to that set forth with respect to claim 1.
Claims depending thereon do not cure all of the noted deficiencies and are accordingly also rejected using substantially similar rationale as to that set forth for the claims from which they depend.
With respect to claim 2, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to what the “or more of the following steps” is referencing, since the claim has a single step, which is “acquiring video footages of a real game ball”. The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 3, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to whether claim 3 requires that the user device be a VR device or whether VR in this case is just an interface within the software application. In addition, point of view, direction of view, zoom level, etc., have not been previously defined. It is also unclear as to how controlling the zoom level allows close-up views from any direction and from any desired angle. The desired angle and direction would seem to flow from viewpoint determination. The examiner respectfully requests applicant clarify the scope of the claimed limitations.
Claim 22 recites similar limitations as to that of claim 3 and is accordingly rejected using substantially similar rationale as to that set forth with respect to claim 3.
With respect to claim 4, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to how the skeletal features are extracted using a deep learning model and how said deep learning model determines key points from the character’s geometric skeleton of each player. What is the character’s geometric skeleton referencing? How do skin features include deformation of the player’s clothes? Are the pose features in claim 4 also done over time, as set forth in claim 1? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 5, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to how claim 5 fits in with claim 1, since claim 1 is at a spectator side, and claim 5 is at the client side. The spectator has not been previously set forth. It is also unclear as to the “without any intervention” portion of the claim, since the previous portion of the claim views the game on a 2D display or with goggle/smart glasses. What is meant by “without any intervention”, and how is the 3D synthesized game viewed without any intervention? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
Claim 24 recites similar limitations as to that of claim 5 and is accordingly rejected using substantially similar rationale as to that set forth with respect to claim 5.
With respect to claim 8, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to what is meant by “forced to have”. How is the 3D model of the player forced to have the same pose and position in the virtual game field, according to the actual game field? In addition, virtual game field and actual game field have not been previously defined. Claim 1 recites a synthesized game field. The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 9, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to what is meant by the movements of every player is tracked over the available set of cameras, while selecting and the best view in terms of visibility and coherence. What constitutes the “available set of cameras”? Are some of the cameras not available? What is meant by “selecting and the best in terms of visibility and coherence”? Coherence and visibility of what? Selecting? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 11, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to whether the deep learning model of claim 11 is part of the deep learning models of claim 1. What features are being extracted using a CNN? How are the transformers applied to map features to a skeleton and skin features? Do the skeleton and skin features relate to the extracted skeletal and skin features of claim 1? Are they different? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 13, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to what is meant by “thereby determining the poses variation over time”. Is the claim calling for determining the variation in pose over time for a given player? Is the “thereby” language intended-use language? It is not clear as to whether the claim requires determining pose variation over time. “The poses” has not been previously set forth. It is also unclear as to how claim 13 relates to claim 1. How is the transformer module adapted to perform said steps, and how does said claim fit within claim 1? Does the pose in claim 13 relate to the pose of claim 1? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
Claim 32 recites similar limitations as to that of claim 13 and is accordingly rejected using substantially similar rationale as to that set forth with respect to claim 13.
With respect to claim 16, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to what is meant by “player’s model” (is it the 3D model in claim 1, a different model, etc.). A player’s model has not been previously defined. How is the player’s model obtained using manual modeling, 3D scanning, and model fitting? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 17, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to if the deep learning model of claim 17 relates to the deep learning models of claim 1, how the deep learning model determines the sequence of actions/poses, and is applied to fill missing gaps in the synthesized game. What is meant by deep learning techniques and how are they used to apply character pose estimation and extract skeletal and skin pose features? Do the skeletal and skin pose features of claim 17 relate to those in claim 1? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
Claim 36 recites similar limitations as to that of claim 17 and is accordingly rejected using substantially similar rationale as to that set forth with respect to claim 17.
With respect to claim 26, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to how an animation model is stored for filling gaps and missing players from one or more video footage frames and how this provides smooth animation to the avatars. The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 28, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to what is meant by deep learning techniques and how they are used to apply character pose estimation and extract skeletal and skin pose features. Do the skeletal and skin pose features of claim 17 relate to those in claim 20? What is meant by the movements of every player is tracked over the available set of cameras, while selecting and the best view in terms of visibility and coherence? What constitutes the “available set of cameras”? Are some of the cameras not available? What is meant by “selecting and the best in terms of visibility and coherence”? Coherence and visibility of what? Selecting? In addition, claim 28 should be a single sentence; as currently constructed, it is not immediately clear as to what is being claimed. The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 33, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to how claim 33 fits in with claim 20, since remote spectators and client side have not been previously defined. Are the pose features transmitted along with the other items listed in claim 20? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
With respect to claim 34, given the plain and ordinary meaning of the words when interpreted in light of the corresponding disclosure, the scope of the claimed invention is unclear. For instance, it is not immediately clear as to how claim 34 fits in with claim 20 and whether claim 34 requires compliance with each specification or with one of them. Also, it is unclear as to what version, etc., of said service applicant is claiming. The words should be spelled out before using the abbreviation. The streaming architecture has not been previously defined. How is compliance ensured (complies with, for instance, the HTTP specification)? The examiner respectfully requests applicant clarify the scope of the claimed limitations.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1-5, 8, 13, 16, 20, 22, 24, 32, and 38 is/are rejected under 35 U.S.C. 102(a)(1)/(a)(2) as being anticipated by Van Hoff et al. (US PG Publication 2020/0134911).
Regarding claim 1, Van Hoff teaches a method for controlling the rendering of a broadcasted game at a spectator side (see for instance, paragraphs 101-103), comprising:
a) deploying a set of cameras around a real game field (Video capture devices capture imagery associated with a bounded real-world scene that includes a plurality of real-world objects, see for instance, paragraph 48. If bounded real-world scene includes a football stadium, objects may represent the players on each team, the football, the goal posts, the referees, and so forth...if bounded real-world scene includes a tennis court or concert stage, objects may represent any of the specific types of objects described above in relation to Fig. 1, see for instance, paragraph 50);
b) acquiring, in real time, video footages being a sequence of frames of said real game field and players in said real game (see for instance, paragraphs 42, 48, 50, and 55);
c) before said game, generating a 3D model of each player according to his performance within previously taken video streams (see for instance, paragraphs 28, 38, 48, 50, and 78. The system may generate a 3D model of bounded real-world scene, such as a tennis court, see for instance, paragraph 28. Video capture devices capture imagery associated with a bounded real-world scene that includes a plurality of real-world objects, see for instance, paragraph 48. If bounded real-world scene includes a football stadium, objects may represent the players on each team, the football, the goal posts, the referees, and so forth...if bounded real-world scene includes a tennis court or concert stage, objects may represent any of the specific types of objects described above in relation to Fig. 1, see for instance, paragraph 50. Each 3D model accessed by access system may include representation of the scene of object that has been previously generated, see for instance, paragraph 78. Processing facility 204 may access a predefined 3D model of the bounded real-world scene and a 3D model of the real-world object, see for instance, paragraph 39);
d) identifying said each player and said real game field in each frame of said video footages, by an object detection module that receives and processes said video footages (see for instance, paragraphs 38, 52, and 77. Based on the 2D video images and the depth data, processing facility 204 may identify a particular object depicted by the 2D video images, and generate a volumetric 3D model for the particular object, see for instance, paragraph 38. Access system 702 may be configured not only to access 2D video images captured by video capture device 602 and provided by video capture system 302, but may further access data representative of any 3D models that may be used for the 3D simulation, see for instance, paragraph 77);
e) determining the location of each player on said game field in each frame (see for instance, paragraphs 81, 86-92 and 94. Spatial characteristic tracking system 704 may receive 2D video image content depicting bounded real-world scene 404 and objects 406, and may be configured to track spatial characteristics of each object 406 relative to bounded real-world scene 404 based on the 2D video images, see for instance, paragraph 81);
f) extracting skeletal and skin features of each player from said video footages using deep learning models (see for instance, paragraphs 86-92 and 94. When determining the pose of a person (e.g., a tennis player in one example), a skeletal model of the person may be generated and markers placed at key points along the skeletal model.... markers may be associated with the person's hands, feet, head, and various joints (e.g., elbows, knees, shoulders, waist, etc.), see paragraph 86. Based on metadata stored for different voxels or triangles of the compressed mesh, the volumetric modeling system may apply texture from a texture atlas based on captured 2D color data to finish the volumetric 3D model by making the model look like the object, along with taking the shape and form of the object, see for instance, paragraph 118. Machine learning technologies may include one or more neural networks, deep neural networks, convolutional neural networks, recurrent neural networks, training sets (e.g., videos of human body movements, videos of tennis matches, etc.), and/or any other components as may serve a particular implementation, see for instance, paragraph 94. Machine learning algorithm and training procedures may enable system 200 to reliably perform the spatial characteristic tracking without needing to rely extensively on a manually programmed tracking algorithm, see for instance, paragraph 93);
g) generating 3D avatars for all players using said 3D model and said extracted features, to animate the respective 3D avatars (see for instance, paragraphs 63 and 73. Simulation presentation system 708 may include a 3D graphics simulation engine (e.g., a video game engine, etc.). Such a simulator may receive input data indicative of what certain objects are (e.g., object classification data), how the objects move and/or are positioned and oriented in space (e.g., spatial characteristic data), and, based on this data, may render graphics that pose 3D models associated with the objects (e.g., models that appear identical or similar to the objects, models that act as avatars for the objects and look different from the objects, etc.) in corresponding ways, see for instance, paragraph 73. For example, the input data may indicate that the feet and arms of a character are swinging as the body of the character is moving forward in space, and the 3D graphics simulator may thus simulate a 3D model of the character to be walking forward....input data indicates that the character's body has jumped, the 3D graphics simulator simulates the jump, and so forth, see for instance, paragraph 73);
h) continuously tracking the location and movements of each player over the acquired video footages (see for instance, paragraphs 81, 86-92 and 94. Spatial characteristic tracking system 704 may receive 2D video image content depicting bounded real-world scene 404 and objects 406, and may be configured to track spatial characteristics of each object 406 relative to bounded real-world scene 404 based on the 2D video images, see for instance, paragraph 81);
i) determining pose features of each player over time (see for instance, paragraphs 81, 86-92 and 94. Spatial characteristic tracking system 704 may receive 2D video image content depicting bounded real world scene 404 and objects 406, and may be configured to track spatial characteristics of each object 406 relative to bounded real-world scene 404 based on the 2D video images, see for instance, paragraph 81. The tracked spatial characteristics of a particular real-world object may be a pose of the real-world object, see for instance, paragraph 81);
j) transmitting data related to said game field, the location data and said 3D avatars to a software application at the spectator side (Network 306 may be configured to provide data delivery between server-side provider systems (e.g., system 200, video capture system 302, etc.) and client-side systems and devices (e.g., media player devices 308), see paragraph 100. Configuration 300 includes system 200 and a video capture system 302 that is communicatively coupled to a plurality of video capture devices 304 (e.g., video capture devices 304-1 through 304-M) on a server-side of a network 306. System 200 and video capture system 302 are also communicatively coupled, by way of network 306 to a plurality of media player devices 308 (e.g., media player devices 308-1 through 308-N) that are on a client-side of network 306, see paragraph 43. After generating a 3D simulation based on the 2D video images provided by video capture system 302, 3D simulation system 200-2 may provide data required for presentation of the 3D simulation to a simulation presentation system 708, see paragraph 72. Simulation presentation system 708 may be implemented by or otherwise associated with a media player device such as one of media player devices 308 described above, and may be communicatively coupled with the media player device 308 by way of a network such as network 306, see paragraph 72); and
k) generating, on a computerized terminal device that executes a software application at said spectator side, a synthesized game field and animating, by said software application, said 3D avatars according to pose features of each player (Simulation presentation system 708 may include a 3D graphics simulation engine (e.g., a video game engine, etc.). Such a simulator may receive input data indicative of what certain objects are (e.g., object classification data), how the objects move and/or are positioned and oriented in space (e.g., spatial characteristic data), and, based on this data, may render graphics that pose 3D models associated with the objects (e.g., models that appear identical or similar to the objects, models that act as avatars for the objects and look different from the objects, etc.) in corresponding ways, see for instance, paragraph 73. For example, the input data may indicate that the feet and arms of a character are swinging as the body of the character is moving forward in space, and the 3D graphics simulator may thus simulate a 3D model of the character to be walking forward....input data indicates that the character's body has jumped, the 3D graphics simulator simulates the jump, and so forth, see for instance, paragraph 73).
Regarding claim 2, Van Hoff teaches the method according to claim 1 and further teaches one or more of the following steps: acquiring video footages of a real game ball (see for instance, paragraphs 19, 20, 26, 27, and 50. A live sporting event such as a tennis match ....may be viewed by a user using an AR media player device that is associated with a 3D simulation system, see for instance, paragraph 19. If bounded real-world scene includes a football stadium, objects may represent the players on each team, the football, the goal posts, the referees, and so forth...if bounded real-world scene includes a tennis court or concert stage, objects may represent any of the specific types of objects described above in relation to Fig. 1, see for instance, paragraph 50).
Regarding claim 3, Van Hoff teaches a method according to claim 1 and further teaches allowing each spectator a VR user interface of the software application to manipulate the rendering of the synthesized game by: a) changing the point of view during the game; b) changing the direction of view during said game; c) stop and resume said game; d) re-playing selected segments of said game; e) controlling the zoom level to get close-up views from any direction and from any desired angle (see for instance, paragraphs 31, 33, 102, and 103. User 310 may indicate a particular viewpoint, within an extended reality world corresponding to a bounded real-world scene captured by video capture devices 304, from which the user 310 wishes to view the world by moving an avatar or virtual camera around within the extended reality world, see for instance, paragraph 102).
Regarding claim 4, Van Hoff teaches a method according to claim 1 and further teaches wherein the pose features comprise: a) skeletal features extracted using a deep learning model, which determines key points form the character's geometric skeleton of each player (see for instance, paragraphs 86-92 and 94); and b) skin features including the deformation of the player's clothes (see for instance, paragraphs 73-75).
Regarding claim 5, Van Hoff teaches a method according to claim 1 and further teaches wherein the spectator at the client side views the 3D synthesized game on a 2D display screed, or by using a 3D VR goggle/smart glasses or the 3D synthesize intervention (see for instance, paragraph 101).
Regarding claim 8, Van Hoff teaches a method according to claim 1 and further teaches wherein the 3D model of the player is forced to have the same pose and position in the virtual game field, according to the actual game field (see for instance, paragraphs 20, 63, 73, and 119. The 3D simulation may be generated based on the tracked pose of the person and the 3D models of the bounded real-world scene, such that the person may be simulated in accordance with the tracked pose so as to mirror the actual person in the actual real-world scene, see for instance, paragraph 20).
Regarding claim 13, Van Hoff teaches a method according to claim 1 and further teaches wherein a transformer module is adapted to: a) receive a collection of features in each frame and translates said features to a pose of each player in each frame, thereby determining the poses variation over time; b) output, for each frame, a skeletal representation of the pose of each player in that frame (see for instance, paragraphs 20, 63, 73, 81, 86-92, 94, and 119. The 3D simulation may be generated based on the tracked pose of the person and the 3D models of the bounded real-world scene, such that the person may be simulated in accordance with the tracked pose so as to mirror the actual person in the actual real-world scene, see for instance, paragraph 20).
Regarding claim 16, Van Hoff teaches a method according to claim 1, wherein the player's model is obtained using manual modeling, 3D scanning and model fitting (see for instance, paragraphs 55 and 80 and Figs. 5-7).
Regarding claim 20, Van Hoff teaches a system for controlling the rendering of a broadcasted game at a spectator side (see for instance, paragraphs 101-103), comprising:
a) a set of cameras deploying around a real game field (Video capture devices capture imagery associated with a bounded real-world scene that includes a plurality of real-world objects, see for instance, paragraph 48. If bounded real-world scene includes a football stadium, objects may represent the players on each team, the football, the goal posts, the referees, and so forth...if bounded real-world scene includes a tennis court or concert stage, objects may represent any of the specific types of objects described above in relation to Fig. 1, see for instance, paragraph 50);
b) a memory for storing: b.1) acquired video footages being a sequence of frames of said real game field and players in said real game (see for instance, paragraphs 42, 48, 50, 55, and 153);
c) a 3D model of each player, generated before said game according to the performance of said each player, within previously taken video streams (see for instance, paragraphs 28, 38, 48, 50, and 78. The system may generate a 3D model of bounded real-world scene, such as a tennis court, see for instance, paragraph 28. Video capture devices capture imagery associated with a bounded real-world scene that includes a plurality of real-world objects, see for instance, paragraph 48. If bounded real-world scene includes a football stadium, objects may represent the players on each team, the football, the goal posts, the referees, and so forth...if bounded real-world scene includes a tennis court or concert stage, objects may represent any of the specific types of objects described above in relation to Fig. 1, see for instance, paragraph 50. Each 3D model accessed by access system may include representation of the scene of object that has been previously generated, see for instance, paragraph 78. Processing facility 204 may access a predefined 3D model of the bounded real-world scene and a 3D model of the real-world object, see for instance, paragraph 39);
d) an object detection module comprising at least one processor which is adapted to:
d.1) receive and processes said video footages (see for instance, paragraphs 38, 52, and 77);
d.2) identify said each player and said real game field in each frame of said video footages (see for instance, paragraphs 38, 52, and 77. Based on the 2D video images and the depth data, processing facility 204 may identify a particular object depicted by the 2D video images, and generate a volumetric 3D model for the particular object, see for instance, paragraph 38. Access system 702 may be configured not only to access 2D video images captured by video capture device 602 and provided by video capture system 302, but may further access data representative of any 3D models that may be used for the 3D simulation, see for instance, paragraph 77);
d.3) determine the location of each player on said game field in each frame (see for instance, paragraphs 81, 86-92 and 94. Spatial characteristic tracking system 704 may receive 2D video image content depicting bounded real-world scene 404 and objects 406, and may be configured to track spatial characteristics of each object 406 relative to bounded real-world scene 404 based on the 2D video images, see for instance, paragraph 81);
d.4) extract skeletal and skin features of each player from said video footages using deep learning models (see for instance, paragraphs 86-92 and 94. When determining the pose of a person (e.g., a tennis player in one example), a skeletal model of the person may be generated and markers placed at key points along the skeletal model.... markers may be associated with the person's hands, feet, head, and various joints (e.g., elbows, knees, shoulders, waist, etc.), see paragraph 86. Based on metadata stored for different voxels or triangles of the compressed mesh, the volumetric modeling system may apply texture from a texture atlas based on captured 2D color data to finish the volumetric 3D model by making the model look like the object, along with taking the shape and form of the object, see for instance, paragraph 118. Machine learning technologies may include one or more neural networks, deep neural networks, convolutional neural networks, recurrent neural networks, training sets (e.g., videos of human body movements, videos of tennis matches, etc.), and/or any other components as may serve a particular implementation, see for instance, paragraph 94. Machine learning algorithm and training procedures may enable system 200 to reliably perform the spatial characteristic tracking without needing to rely extensively on a manually programmed tracking algorithm, see for instance, paragraph 93);
d.5) generate 3D avatars for all players using said 3D model and said extracted features, to animate the respective 3D avatars (see for instance, paragraphs 63 and 73. Simulation presentation system 708 may include a 3D graphics simulation engine (e.g., a video game engine, etc.). Such a simulator may receive input data indicative of what certain objects are (e.g., object classification data), how the objects move and/or are positioned and oriented in space (e.g., spatial characteristic data), and, based on this data, may render graphics that pose 3D models associated with the objects (e.g., models that appear identical or similar to the objects, models that act as avatars for the objects and look different from the objects, etc.) in corresponding ways, see for instance, paragraph 73. For example, the input data may indicate that the feet and arms of a character are swinging as the body of the character is moving forward in space, and the 3D graphics simulator may thus simulate a 3D model of the character to be walking forward.... input data indicates that the character's body has jumped, the 3D graphics simulator simulates the jump, and so forth, see for instance, paragraph 73);
d.6) continuously track the location and movements of each player over the acquired video footages (see for instance, paragraphs 81, 86-92 and 94. Spatial characteristic tracking system 704 may receive 2D video image content depicting bounded real-world scene 404 and objects 406, and may be configured to track spatial characteristics of each object 406 relative to bounded real-world scene 404 based on the 2D video images, see for instance, paragraph 81);
d.7) determine pose features of each player over time (see for instance, paragraphs 81, 86-92 and 94. Spatial characteristic tracking system 704 may receive 2D video image content depicting bounded real-world scene 404 and objects 406, and may be configured to track spatial characteristics of each object 406 relative to bounded real-world scene 404 based on the 2D video images, see for instance, paragraph 81. The tracked spatial characteristics of a particular real-world object may be a pose of the real-world object, see for instance, paragraph 81);
e) a transmitter for transmitting data related to said game field, the location data and said 3D avatars to a software application at the spectator side (Network 306 may be configured to provide data delivery between server-side provider systems (e.g., system 200, video capture system 302, etc.) and client-side systems and devices (e.g., media player devices 308), see paragraph 100. Configuration 300 includes system 200 and a video capture system 302 that is communicatively coupled to a plurality of video capture devices 304 (e.g., video capture devices 304-1 through 304-M) on a server-side of a network 306. System 200 and video capture system 302 are also communicatively coupled, by way of network 306, to a plurality of media player devices 308 (e.g., media player devices 308-1 through 308-N) that are on a client-side of network 306, see paragraph 43. After generating a 3D simulation based on the 2D video images provided by video capture system 302, 3D simulation system 200-2 may provide data required for presentation of the 3D simulation to a simulation presentation system 708, see paragraph 72. Simulation presentation system 708 may be implemented by or otherwise associated with a media player device such as one of media player devices 308 described above, and may be communicatively coupled with the media player device 308 by way of a network such as network 306, see paragraph 72); and
f) a computerized terminal device at the spectator side that executes said software application to thereby generate a synthesized game field and animate said 3D avatars according to pose features of each player (Simulation presentation system 708 may include a 3D graphics simulation engine (e.g., a video game engine, etc.). Such a simulator may receive input data indicative of what certain objects are (e.g., object classification data), how the objects move and/or are positioned and oriented in space (e.g., spatial characteristic data), and, based on this data, may render graphics that pose 3D models associated with the objects (e.g., models that appear identical or similar to the objects, models that act as avatars for the objects and look different from the objects, etc.) in corresponding ways, see for instance, paragraph 73. For example, the input data may indicate that the feet and arms of a character are swinging as the body of the character is moving forward in space, and the 3D graphics simulator may thus simulate a 3D model of the character to be walking forward.... input data indicates that the character's body has jumped, the 3D graphics simulator simulates the jump, and so forth, see for instance, paragraph 73).
Regarding claim 22, Van Hoff teaches the system according to claim 20, and further teaches a VR user interface for allowing each spectator using the software application on his terminal device, to manipulate the rendering of said synthesized game by:
a) changing the point of view during the game;
b) changing the direction of view during said game;
c) stop and resume said game;
d) re-playing selected segments of said game;
e) controlling the zoom level to get close-up views from any direction and from any desired angle (see for instance, paragraphs 31, 33, 102, and 103. User 310 may indicate a particular viewpoint, within an extended reality world corresponding to a bounded real-world scene captured by video capture devices 304, from which the user 310 wishes to view the world by moving an avatar or virtual camera around within the extended reality world, see for instance, paragraph 102).
Regarding claim 24, Van Hoff teaches the system according to claim 20, and further teaches in which the spectator at the client side views the 3D synthesized game on a 2D display screen, or by using a 3D VR goggle/smart glasses, or the 3D synthesized game without any intervention (see for instance, paragraph 101).
Regarding claim 32, Van Hoff teaches the system according to claim 20, and further teaches a transformer module which is adapted to: a) receive a collection of features in each frame and translates said features to a pose of each player in each frame, thereby determining the poses variation over time; b) output, for each frame, a skeletal representation of the pose of each player in that frame (see for instance, paragraphs 20, 63, 73, 81, 86-92, 94, and 119. The 3D simulation may be generated based on the tracked pose of the person and the 3D models of the bounded real-world scene, such that the person may be simulated in accordance with the tracked pose so as to mirror the actual person in the actual real-world scene, see for instance, paragraph 20).
Regarding claim 38, Van Hoff teaches the system according to claim 20, and further teaches in which the terminal device is selected from the group of: a smartphone; a tablet; a desktop computer; a laptop computer; a smart TV (see for instance, paragraphs 101 and 107).
Allowable Subject Matter
Since no prior art is being applied, claims 9, 11, 17, 26, 28, 33, 34, and 36 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US PG Publication 2016/0193530 to Parker et al. teaches a virtual playbook with user controls.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL J COBB whose telephone number is (571) 270-3875. The examiner can normally be reached Monday - Friday, 11am - 7pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alicia Harrington, can be reached at 571-272-2330. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL J COBB/ Primary Examiner, Art Unit 2615