DETAILED ACTION
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/2/26 has been entered.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-42 are pending in the application. Claims 1-4, 6-14, 16, 18-20, 22, 24-26, 28, 30-32, 34 and 36-42 have been amended.
Response to Arguments
The amendment filed 2/2/26 recites limitations regarding audio data generated by the video game application corresponding to the object in the video frame as training data. The amendment overcomes the prior art of record but introduces new matter, and therefore raises issues under 35 U.S.C. 112(a). See the 112 section below for a detailed analysis.
Claim Objections
Claims 3, 11, 25, and 37 are objected to because of the following informalities:
In claim 11, line 3, "of a of the" should read "of the".
In claim 25, line 5, "generate" should read "to generate".
In claims 3 and 37, "to weight" should read "to weigh".
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-42 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 1 recites (annotation added):
1. One or more processors, comprising:
circuitry to:
use one or more neural networks to generate a saliency map to predict a likelihood of a gaze of one or more users toward an object in a video frame generated by a video game application (Part A) based on one or more prior gazes by the one or more users to one or more other frames and audio data generated by the video game application corresponding to the object in the video frame (Part B); and
determine one or more portions of the video frame to be compressed, to be transmitted over a network, based, at least in part, on the saliency map.
In annotated claim 1 above, Part A recites an inference process on a current video frame (see also claim 6), and Part B recites the training data used to train the one or more neural networks. According to the specification, the audio data is associated with an object that is also associated with a user's gaze. For example, para. [0045] (PGPub) describes "… audio data can be analyzed as well, such as where a sound is generated corresponding to golf club 104 hitting golf ball 106, which may draw a viewer's gaze or attention to a specific location or region of displayed video". The training data recited in Part B above is therefore not described in the specification, because in claim 1 the gaze data is from one or more other frames while the audio data is from the video frame (the current inference frame). A similar issue exists in claims 2, 7, 13, and 20.
Claim 19 recites "use one or more neural networks to generate a saliency map to predict a gaze of one or more users to an object in a video frame generated by a video game application (Part A) based on one or more prior gazes by one or more users to one or more other frames and audio data generated by the video game application corresponding to the object (Part B)". Similarly, Part A recites inference on a current video frame, and Part B recites the training data used to train the one or more neural networks. Note that the video game application and the object are recited in the inference process of the current video frame. The training data recited in Part B above is therefore not described in the specification, because in claim 19 the gaze data is from one or more other frames while the audio data is from the video frame (the object). A similar issue exists in claims 8, 14, 25-26, and 31.
Allowable Subject Matter
Claims 1-42 would be allowable if the 112(a) rejections identified above were overcome.
The following is a statement of reasons for the indication of allowable subject matter.
As per claim 1, the closest prior art includes Fisher (US Publication 2015/0339589 A1), Azar (US Publication 2014/0292751 A1), and LENKE et al. (US Publication 2020/0110572 A1, hereafter LENKE). Fisher teaches generating a saliency map to predict a likelihood of gaze of one or more users to an object in a video frame using one or more neural networks trained based on prior gazes by one or more users to other frames. Azar teaches adaptive compression of video data in a video game application based on a gamer's attention area, and transmitting the compressed video data over a network. LENKE further teaches training neural networks using video as well as audio associated with the video in order to perform target identification. The combination of the prior art does not teach or suggest training one or more neural networks to predict a likelihood of gaze of one or more users based on one or more prior gazes by the one or more users to one or more other frames and audio data generated by the video game application corresponding to the object in the video frame.
Independent claims 7, 13, 19, 25, and 31 recite allowable subject matter similar to that of claim 1.
Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to XUEMEI G CHEN whose telephone number is (571)270-3480. The examiner can normally be reached Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, John M Villecco can be reached on (571) 272-7319. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/XUEMEI G CHEN/Primary Examiner, Art Unit 2661