DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 06/19/2025 has been considered by the examiner.
Drawings
The drawings were received on 01/18/2025. These drawings are acceptable.
Claim Objections
Claim 10 is objected to because of the following informality: claim 10 recites "The method of operating an electronic device, comprising:", which appears to be a typographical error ("A method of operating an electronic device, comprising:" appears to be intended). Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-15 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. US 2021/0081676 A1, hereafter Kim, in view of Bouguerra US 2012/0069028 A1, hereafter Bouguerra.
Regarding claim 1, Kim discloses an electronic device (system) [title] comprising:
a communication device (transmitter/receiver or a wired port via wired/wireless electrical connection) [0088];
a storage device (database (DB)) [0080] storing at least one artificial intelligence model (a model based on the CNN algorithm) [0096]; and
at least one processor (Central Processing Unit (CPU)) [0079], wherein the at least one processor is configured to:
obtain a first video including a plurality of images (video made up of successive frames) [0085] from a user device connected to the electronic device through the communication device (may acquire the source video…via wired/wireless electrical connection) [0088],
wherein the at least one artificial intelligence model is trained to:
obtain scene information about the first video, wherein the scene information is information related to scene understanding of the plurality of images (scene understanding unit 17 is configured to determine interaction associated with the source object of the tube) [0143],
obtain a second video by editing the first video based on the scene information (generate synopsis video based on arranged tubes and background S655) [FIG. 8].
While Kim discloses using an artificial intelligence model to generate a synopsis video from an input video, Kim does not explicitly disclose at least one artificial intelligence model trained to generate an emoticon based on an input video; and obtain a first emoticon by inputting the video to the at least one artificial intelligence model, and transmit the first emoticon to output the first emoticon by the user device, through the communication device.
Bouguerra, in an analogous environment, discloses at least one artificial intelligence model trained to generate an emoticon based on an input video (a video emoticon may be invoked based on an analysis of the video stream…patterns of features in the video stream are dynamically inferred using, for example, machine vision learning techniques…detection of a smile by the user may cause the ‘smiley’ video emoticon to be automatically invoked, without user input) [0062]; and
obtain a first emoticon by inputting the video to the at least one artificial intelligence model (a video emoticon may be invoked based on an analysis of the video stream…patterns of features in the video stream are dynamically inferred using, for example, machine vision learning techniques…detection of a smile by the user may cause the ‘smiley’ video emoticon to be automatically invoked, without user input) [0062], and transmit the first emoticon to output the first emoticon by the user device, through the communication device (video chat server device (VCSD) 120, video chat client device 101; video chat server module 357 may…generate video emoticons in these video streams) [FIG. 1; 0058].
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to use the synopsis video of Kim as the input video for Bouguerra to generate an emoticon using machine learning, the motivation being to convey emotions in a video (Bouguerra [0004]).
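By way of illustration only, the combination relied upon above may be sketched as follows. The helper names (generate_emoticon_from_video, synopsis_model, expression_model, emoticon_map) are hypothetical and do not reproduce the actual implementations disclosed by Kim or Bouguerra.

# Illustrative sketch only; all names are hypothetical and are not drawn from Kim or Bouguerra.
def generate_emoticon_from_video(frames, synopsis_model, expression_model, emoticon_map):
    """Combine a Kim-style synopsis step with a Bouguerra-style emoticon step."""
    # Kim: summarize the source video into a shorter synopsis video
    synopsis_frames = synopsis_model.summarize(frames)
    # Bouguerra: infer an expression pattern (e.g., a smile) from the video frames
    expression = expression_model.infer(synopsis_frames)
    # Invoke the corresponding emoticon automatically, without user input
    return emoticon_map.get(expression)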
Regarding claim 2, Kim and Bouguerra address all of the features with respect to claim 1 as outlined above.
Kim further discloses the at least one processor is configured to: obtain image groups by grouping a plurality of images included in the first video by reference units (the concatenation of images representing the object or activity across successive frames of a video are referred to as a “tube”) [0076], and obtain the second video by summarizing the first video based on the image groups (generate synopsis video based on arranged tubes and background S655) [FIG. 8].
Regarding claim 3, Kim and Bouguerra address all of the features with respect to claim 2 as outlined above.
Kim further discloses the image groups are obtained based on scene change of the first video identified based on the scene information (the concatenation of images representing the object or activity across successive frames of a video are referred to as a “tube”) [0076].
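The scene-change-based grouping addressed in claims 2 and 3 may be illustrated, purely by way of example, by the sketch below. The mean-pixel-difference heuristic and the threshold value are assumptions made for illustration and are not Kim's disclosed tube-generation algorithm, which is object- and motion-based.

import numpy as np

# Illustrative heuristic only; not Kim's object/motion-based tube generation.
def group_frames_by_scene_change(frames, threshold=0.3):
    """Split a non-empty list of frames (H x W x 3 uint8 arrays) into groups at scene changes."""
    groups, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        # Mean absolute pixel difference as a crude scene-change score in [0, 1]
        diff = np.abs(cur.astype(np.float32) - prev.astype(np.float32)).mean() / 255.0
        if diff > threshold:  # treat a large jump as a scene change
            groups.append(current)
            current = []
        current.append(cur)
    groups.append(current)
    return groups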
Regarding claim 4, Kim and Bouguerra address all of the features with respect to claim 1 as outlined above.
Kim further discloses the at least one processor is configured to, through the at least one artificial intelligence model: generate at least one content to be added to the first video based on the scene information (generate source object tube including source object on which motion is detected S615) [FIG. 6], and obtain the second video by adding the at least one content to the first video (generate synopsis video based on arranged tubes and background S655) [FIG. 8].
Regarding claim 5, Kim and Bouguerra address all of the features with respect to claim 4 as outlined above.
Kim further discloses the at least one processor is configured to, through the at least one artificial intelligence model: obtain a point set and a bounding box set for at least one object included in the first video, identify contour of the at least one object based on the point set and bounding box set, and obtain object information about the at least one object based on the scene information and the contour of the at least one object (before determining the class of the object in the video, the source object detection unit 11 may be configured to set a proposed region where the object is located in the video, and analyze the set image. The set region may be referred to as a region of interest (ROI) candidate box, and it is a sort of object localization and corresponds to an initial object detection operation) [0093].
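The derivation of an object contour from an ROI candidate box, as addressed for claim 5, may be illustrated by the following sketch. The Otsu-thresholding approach and the OpenCV 4.x calls are assumptions for illustration only and are not Kim's disclosed detection pipeline.

import cv2
import numpy as np

# Illustrative only; assumes an OpenCV 4.x findContours return signature.
def contour_in_roi(gray_frame, box):
    """gray_frame: H x W uint8 image; box: (x, y, w, h) ROI candidate box."""
    x, y, w, h = box
    roi = gray_frame[y:y + h, x:x + w]
    # Otsu threshold to separate the object from the background inside the ROI
    _, mask = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # Largest contour, shifted back into full-frame coordinates
    largest = max(contours, key=cv2.contourArea)
    return largest + np.array([x, y])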
Regarding claim 6, Kim and Bouguerra address all of the features with respect to claim 5 as outlined above.
Kim further discloses the at least one processor is configured to, through the at least one artificial intelligence model: determine a position at which the at least one content is to be added in relation to the first video based on the object information, and obtain the second video by adding the at least one content to the first video based on the position (the synopsis video generation unit 55 may stitch the synopsis object tube and the selected background by applying the position of the synopsis object to the selected background) [0219].
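The position-based addition of content addressed for claim 6, analogous to Kim's stitching of a synopsis object tube onto a selected background, may be illustrated by the sketch below; the overlay logic is an assumption for illustration only.

import numpy as np

# Illustrative only; not Kim's disclosed stitching algorithm.
def overlay_at_position(frame, content, position):
    """Paste an RGB `content` patch onto `frame` at (x, y), clipped to the frame bounds."""
    x, y = position
    h, w = content.shape[:2]
    h = max(0, min(h, frame.shape[0] - y))
    w = max(0, min(w, frame.shape[1] - x))
    out = frame.copy()
    out[y:y + h, x:x + w] = content[:h, :w]
    return out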
Regarding claim 7, Kim and Bouguerra address all of the features with respect to claim 1 as outlined above.
Bouguerra further discloses the at least one processor is configured to: obtain a user input to generate an emoticon (receiving selection of a video emoticon 402) [FIG. 4], and obtain a second emoticon based on the user input and the first video, wherein the at least one artificial intelligence model is trained to: identify a user intention based on the user input (a video emoticon may be invoked based on an analysis of the video stream…patterns of features in the video stream are dynamically inferred using, for example, machine vision learning techniques…detection of a smile by the user may cause the ‘smiley’ video emoticon to be automatically invoked, without user input) [0062], obtain a third video obtained by editing the first video based on the user intention and the scene information about the first video, and generate the second emoticon based on the third video (augmenting tracking one or more features in the video stream based on selected type of video emoticon 408) [FIG. 4].
Regarding claim 8, Kim and Bouguerra address all of the features with respect to claim 7 as outlined above.
Bouguerra further discloses the at least one processor is configured to, through the at least one artificial intelligence model: generate at least one content to be added to the first video based on the user input and the scene information (a video emoticon may be invoked based on an analysis of the video stream…patterns of features in the video stream are dynamically inferred using, for example, machine vision learning techniques…detection of a smile by the user may cause the ‘smiley’ video emoticon to be automatically invoked, without user input) [0062]; and obtain the third video by adding the at least one content to the first video (augmenting tracking one or more features in the video stream based on selected type of video emoticon 408) [FIG. 4].
Regarding claim 9, Kim and Bouguerra address all of the features with respect to claim 1 as outlined above.
Bouguerra further discloses the at least one processor is configured to: obtain a user input to modify the first emoticon through the user device (a selection of an animated video emoticon is received from a user) [0059]; obtain a third emoticon generated based on the user input and the scene information through the at least one artificial intelligence model (the user may select a video emoticon that conveys surprise) [0060]; and transmit the third emoticon to the user device, through the communication device, to output the third emoticon (video chat server device (VCSD) 120, video chat client device 101; video chat server module 357 may…generate video emoticons in these video streams) [FIG. 1; 0058].
Claims 10-14 are drawn to a method implemented by the electronic device of claims 1, 2 and 4-6, and are therefore rejected in the same manner as above.
Computer readable medium claim 15 is drawn to the instructions corresponding to the method implemented by the electronic device of claim 1. Computer readable medium claim 15 therefore corresponds to electronic device claim 1 and is rejected for the same reasons of unpatentability as set forth above.
Citation of Pertinent Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Dahllof et al. (US 2010/0177116 A1) discloses generating emoticons for text from video of facial expressions.
Huang et al. (US 2021/0158594 A1) discloses processing video data to synthesize an animated emoticon.
Degani (US 2015/0286371 A1) discloses custom emoticon generation from video.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to STEFAN GADOMSKI whose telephone number is (571)270-5701. The examiner can normally be reached Monday - Friday, 12-8PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jay Patel, can be reached at 571-272-2988. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/STEFAN GADOMSKI/Primary Examiner, Art Unit 2485