Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This office action is in response to application 18/772,299, which was filed 07/15/24. Claims 1-26 are pending in the application and have been considered.
Foreign Priority
Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 7-9, 14-17, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Navin et al. (US 20210377607) in view of Lewis, II (US 20140064706).
Consider claim 1, Navin discloses a playback mode switching method (switching device settings during playback of a movie or sports, [0046]) comprising:
a multimedia playback apparatus establishing an index table corresponding to a plurality of preset keyword data and a plurality of playback modes (environment for movies and television playback stores profiles corresponding to drama, sports, video games, i.e. keyword data, which include device settings for playback of media content tagged with the profile, i.e. playback modes, [0045-0046], considered an “index table” in that the first row stores “drama” and drama playback settings, the second row stores sports and sports playback settings, etc., Fig 4 elements 406, 408, [0046]);
the multimedia playback apparatus playing multimedia content (playback of movies, sports, video games, etc., [0046]);
the multimedia playback apparatus extracting at least one frame image from the multimedia content and generating at least one query information to query a network artificial intelligence model (extracting certain frames from the media stream, which are sent to a remote server for content matching via machine learning techniques, [0018], and artificial intelligence techniques, [0029], over network 206, Fig 2, [0032], this is considered a request for the remote server to identify the content of the frame, i.e. a query);
the network artificial intelligence model transmitting a first keyword data according to the query information (ACR service receives content fingerprint and assigns tags to categorize the content, e.g. “drama”, “western”, “sports”, “video game”, etc., which are transmitted to setting recommendation service, Fig 3, [0036]); and
the multimedia playback apparatus analyzing the first keyword data and the index table to determine whether to perform a mode switching operation for adjusting a playback parameter setting of the multimedia content (user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Navin does not specifically mention transmitting a first keyword data back to the multimedia playback apparatus.
Lewis, II discloses transmitting a first keyword data back to the multimedia playback apparatus (video tagging server 118 identifies audiovisual components by sound or image analysis to assign tags such as the name of an actor or character, [0081], which are transmitted back to the playback device for presentation as an overlay during video playback, [0086], and used for playback controls, [0051]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin by transmitting a first keyword data back to the multimedia playback apparatus in order to customize playback to better align with user interests, as suggested by Lewis, II ([0051]). Doing so would have led to predictable results of assisting the user in avoiding content they find annoying, as suggested by Lewis, II ([0051]). The references cited are analogous art in the same field of multimedia.
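For illustration only, the index-table analysis mapped above for claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Navin, Lewis, II, or the application: the table contents, the setting names, and the function `decide_mode_switch` are all hypothetical, and the keyword is assumed to have already been returned by the remote model.

```python
from typing import Optional

# Hypothetical index table: preset keyword data -> playback mode settings.
# The keywords mirror the tags discussed above; the settings are invented.
INDEX_TABLE = {
    "drama": {"picture": "cinema", "audio": "dialogue"},
    "sports": {"picture": "vivid", "audio": "stadium"},
    "video game": {"picture": "low-latency", "audio": "surround"},
}


def decide_mode_switch(first_keyword: str, current_mode: Optional[str]) -> Optional[dict]:
    """Return the settings to apply, or None when no mode switch is performed."""
    key = first_keyword.strip().lower()
    if key not in INDEX_TABLE:
        # Keyword does not match any preset keyword data: no switch.
        return None
    if key == current_mode:
        # Already in the matching mode: nothing to adjust.
        return None
    return INDEX_TABLE[key]
```

In this sketch an unmatched keyword such as "western" simply results in no switch, which corresponds to one of the alternatives discussed for claim 9.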
Consider claim 14, Navin discloses a multimedia playback apparatus connected to a network artificial intelligence model (user device such as a television for playing movies connected to machine learning module over network 306, Fig 3, [0018]-[0019], [0034]-[0035]), the multimedia playback apparatus comprising:
a multimedia playback device for playing a multimedia content (user device for playback of content such as a video on a television, [0018]-[0019]);
a multimedia processing device electrically connected to the multimedia playback device, for establishing in advance an index table corresponding to a plurality of preset keyword data and a plurality of playback modes (content profile service on device for movies and television playback stores profiles corresponding to drama, sports, video games, i.e. keyword data, which include device settings for playback of media content tagged with the profile, i.e. playback modes, [0045-0046], considered an “index table” in that the first row stores “drama” and drama playback settings, the second row stores sports and sports playback settings, etc., Fig 4 elements 406, 408, [0046]),
extracting at least one frame image from the multimedia content (extracting certain frames from the media stream, which are sent to a remote server for content matching via machine learning techniques, [0018]), and
generating at least one query information according to the at least one frame image (extracting certain frames from the media stream, which are sent to a remote server for content matching via machine learning techniques, [0018], and artificial intelligence techniques, [0029], over network 206, Fig 2, [0032], this is considered a request for the remote server to identify the content of the frame, i.e. a query); and
a network transmission device electrically connected to the multimedia processing device, for a first keyword data by the network artificial intelligence model according to the at least one query information back to the multimedia processing device (ACR service receives content fingerprint and assigns tags to categorize the content, e.g. “drama”, “western”, “sports”, “video game”, etc., which are transmitted to setting recommendation service, Fig 3, [0036]); and
the multimedia processing device analyzing the first keyword data and the index table to determine whether to control the multimedia playback device to perform a mode switching operation for adjusting a playback parameter setting of the multimedia content (user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Navin does not specifically mention transmitting a first keyword data returned.
Lewis, II discloses transmitting a first keyword data returned (video tagging server 118 identifies audiovisual components by sound or image analysis to assign tags such as the name of an actor or character, [0081], which are transmitted back to the playback device for presentation as an overlay during video playback, [0086], and used for playback controls, [0051]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin by transmitting a first keyword data returned for reasons similar to those for claim 1.
Consider claim 2, Navin discloses: the multimedia playback apparatus generating a voice keyword data according to a voice input data (auditory samples from various actors, i.e. voice data, are used as the fingerprint to identify tags such as sports, movie, video game, [0029], [0030], [0036]); and the multimedia playback apparatus analyzing the voice keyword data and the index table to determine whether to perform the mode switching operation (user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Consider claim 3, Navin discloses when the multimedia playback apparatus determines that the voice input data matches one of the preset keyword data, the multimedia playback apparatus performs the mode switching operation according to the index table (ACR service receives content fingerprint from voice data, [0029-0030], and assigns tags to categorize the content, e.g. “drama”, “western”, “sports”, “video game”, etc., which are transmitted to setting recommendation service, Fig 3, [0036], which determines whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Consider claim 4, Navin discloses when the multimedia playback apparatus determines that the voice input data does not match any preset keyword data, the multimedia playback apparatus queries the network artificial intelligence model according to the at least one query information (ACR service uses media content database auditory samples from various actors to narrow down a range of potential matching content, i.e. the voice data does not match any of the content items ruled out, [0029], and auditory samples are sent to a remote server for content matching via machine learning techniques, [0018], and artificial intelligence techniques, [0029], over network 206, Fig 2, [0032]).
Consider claim 7, Navin discloses when the multimedia playback apparatus determines that the first keyword data matches one of the preset keyword data, the multimedia playback apparatus performs the mode switching operation according to the index table (user device and setting recommendation service determine to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Consider claim 8, Navin discloses the at least one query information queries the network artificial intelligence model in a form of a closed-ended choice question about an image type (the output of the neural network is a classification into one or more discrete classes, e.g. “sports”, “movie”, or “video game”; this is effectively considered a form of a closed-ended question for the classifier, [0030]).
Consider claim 9, Navin discloses when the multimedia playback apparatus determines that the first keyword data does not match any preset keyword data, the multimedia playback apparatus does not perform the mode switching operation, or alternatively, the multimedia playback apparatus queries the network artificial intelligence model about an image type via a question according to the at least one frame image, and the multimedia playback apparatus establishes a correspondence relationship in the index table between a second keyword data transmitted from the network artificial intelligence model and one of the plurality of playback modes (the output of the neural network is a classification into one or more discrete classes, e.g. “sports”, “movie”, or “video game”; this is effectively considered a form of a closed-ended “question” for the classifier, [0030], and user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Consider claim 15, Navin discloses the multimedia playback apparatus further comprises: a voice processing device electrically connected to the multimedia processing device, for generating a voice input data (device uses auditory samples from various actors, i.e. voice data, as the fingerprint to identify tags such as sports, movie, video game, [0029], [0030], [0036]); wherein the multimedia processing device generates a voice keyword data according to the voice input data, and the multimedia processing device analyzes the voice keyword data and the index table to determine whether to control the multimedia playback device to perform the mode switching operation (user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Consider claim 16, Navin discloses when the multimedia processing device determines that the voice input data matches one of the preset keyword data, the multimedia processing device controls the multimedia playback device to perform the mode switching operation according to the index table (ACR service receives content fingerprint from voice data, [0029-0030], and assigns tags to categorize the content, e.g. “drama”, “western”, “sports”, “video game”, etc., which are transmitted to setting recommendation service, Fig 3, [0036], which determines whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Consider claim 17, Navin discloses when the multimedia playback apparatus determines that the voice input data does not match any preset keyword data, the multimedia playback apparatus queries the network artificial intelligence model according to the at least one query information (ACR service uses media content database auditory samples from various actors to narrow down a range of potential matching content, i.e. the voice data does not match any of the content items ruled out, [0029], and auditory samples are sent to a remote server for content matching via machine learning techniques, [0018], and artificial intelligence techniques, [0029], over network 206, Fig 2, [0032]).
Consider claim 20, Navin discloses when the multimedia playback apparatus determines that the first keyword data matches one of the preset keyword data, the multimedia playback apparatus performs the mode switching operation according to the index table (user device and setting recommendation service determine to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Consider claim 21, Navin discloses the at least one query information queries the network artificial intelligence model in a form of a closed-ended choice question about an image type (the output of the neural network is a classification into one or more discrete classes, e.g. “sports”, “movie”, or “video game”; this is effectively considered a form of a closed-ended question for the classifier, [0030]).
Consider claim 22, Navin discloses when the multimedia playback apparatus determines that the first keyword data does not match any preset keyword data, the multimedia playback apparatus does not perform the mode switching operation, or alternatively, the multimedia playback apparatus queries the network artificial intelligence model about an image type via a question according to the at least one frame image, and the multimedia playback apparatus establishes a correspondence relationship in the index table between a second keyword data transmitted from the network artificial intelligence model and one of the plurality of playback modes (the output of the neural network is a classification into one or more discrete classes, e.g. “sports”, “movie”, or “video game”; this is effectively considered a form of a closed-ended “question” for the classifier, [0030], and user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Claims 5 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Navin et al. (US 20210377607) in view of Lewis, II (US 20140064706), in further view of Raichelgauz et al. (US 20130166276).
Consider claim 5, Navin discloses a multimedia playback apparatus and network artificial intelligence model (user device 202 and machine learning module 216, Fig 2, [0029-0030]).
Navin and Lewis, II do not specifically mention utilizing a translation service to convert a language corresponding to the at least one query information to a default language of the network artificial intelligence model.
Raichelgauz discloses utilizing a translation service to convert a language corresponding to the at least one query information to a default language (multimedia element is uploaded to input to the querying server with a translation request designating at least the target natural language of the translated text, [0025]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the multimedia playback apparatus of Navin utilizes a translation service to convert a language corresponding to the at least one query information to a default language, as in Raichelgauz, of the network artificial intelligence model of Navin in order to assist viewers who cannot read English, as suggested by Raichelgauz ([0007]), predictably increasing the amount of media available for the user to view, as suggested by Raichelgauz ([0007]). The references cited are analogous art in the same field of multimedia.
Consider claim 18, Navin discloses a multimedia playback apparatus and network artificial intelligence model (user device 202 and machine learning module 216, Fig 2, [0029-0030]).
Navin and Lewis, II do not specifically mention utilizing a translation service to convert a language corresponding to the at least one query information to a default language of the network artificial intelligence model.
Raichelgauz discloses utilizing a translation service to convert a language corresponding to the at least one query information to a default language (multimedia element is uploaded to input to the querying server with a translation request designating at least the target natural language of the translated text, [0025]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the multimedia playback apparatus of Navin utilizes a translation service to convert a language corresponding to the at least one query information to a default language, as in Raichelgauz, of the network artificial intelligence model of Navin for reasons similar to those for claim 5.
Claims 6 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Navin et al. (US 20210377607) in view of Lewis, II (US 20140064706), in further view of Sullivan et al. (US 9892188).
Consider claim 6, Navin discloses: the multimedia playback apparatus sending to the network artificial intelligence model for evolution of the network artificial intelligence model (user device prompts user for feedback on the settings, which is sent to machine learning model for tuning, [0025], [0055]).
Navin and Lewis, II do not specifically mention sending the index table.
Sullivan discloses sending the index table (transmit the index table, Col 14 lines 4-24).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II by sending the index table in order to increase efficiency, as suggested by Sullivan (Col 14 lines 24-30), predictably reducing latency, as suggested by Sullivan (Col 14 lines 19-34). The references cited are analogous art in the same field of multimedia.
Consider claim 19, Navin discloses: the multimedia playback apparatus sending to the network artificial intelligence model for evolution of the network artificial intelligence model (user device prompts user for feedback on the settings, which is sent to machine learning model for tuning, [0025], [0055]).
Navin and Lewis, II do not specifically mention sending the index table.
Sullivan discloses sending the index table (transmit the index table, Col 14 lines 4-24).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II by sending the index table for reasons similar to those for claim 6.
Claims 10-12 and 23-25 are rejected under 35 U.S.C. 103 as being unpatentable over Navin et al. (US 20210377607) in view of Lewis, II (US 20140064706), in further view of Hardee et al. (US 20200160091).
Consider claim 10, Navin and Lewis, II do not, but Hardee discloses the step of the multimedia playback apparatus playing the multimedia content comprises: the multimedia playback apparatus sending the multimedia content to the network artificial intelligence model to determine whether the multimedia content is copyrighted or whether a URL (Uniform Resource Locator) of the multimedia content is unsafe, to decide whether to play the multimedia content (object recognition component 110 determines whether the video is copyrighted, [0035], [0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the step of the multimedia playback apparatus playing the multimedia content comprises: the multimedia playback apparatus sending the multimedia content to the network artificial intelligence model to determine whether the multimedia content is copyrighted or whether a URL (Uniform Resource Locator) of the multimedia content is unsafe, to decide whether to play the multimedia content in order to effectively and efficiently prevent unauthorized usage of media on the internet, as suggested by Hardee ([0021]), predictably protecting content owners, as suggested by Hardee ([0021]). The references cited are analogous art in the same field of multimedia.
Consider claim 11, Navin and Lewis, II do not, but Hardee discloses the multimedia playback apparatus determines a legality of the multimedia content according to a content description information of the multimedia content (object recognition component 110 determines whether the video is being viewed in an unauthorized manner, [0033], [0035], [0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the multimedia playback apparatus determines a legality of the multimedia content according to a content description information of the multimedia content for reasons similar to those for claim 10.
Consider claim 12, Navin and Lewis, II do not, but Hardee discloses the multimedia playback apparatus determines whether the multimedia content contains a restricted content via the network artificial intelligence model (object recognition component 110 determines whether the video is restricted for usage, [0023], [0033], [0035], [0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the multimedia playback apparatus determines whether the multimedia content contains a restricted content via the network artificial intelligence model for reasons similar to those for claim 10.
Consider claim 23, Navin and Lewis, II do not, but Hardee discloses the multimedia playback apparatus determines whether the multimedia content is copyrighted or whether a URL (Uniform Resource Locator) of the multimedia content is unsafe via the network artificial intelligence model, to decide whether to play the multimedia content (object recognition component 110 determines whether the video is copyrighted, [0035], [0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the multimedia playback apparatus determines whether the multimedia content is copyrighted or whether a URL (Uniform Resource Locator) of the multimedia content is unsafe via the network artificial intelligence model, to decide whether to play the multimedia content for reasons similar to those for claim 10.
Consider claim 24, Navin and Lewis, II do not, but Hardee discloses the multimedia playback apparatus determines a legality of the multimedia content according to a content description information of the multimedia content (object recognition component 110 determines whether the video is being viewed in an unauthorized manner, [0033], [0035], [0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the multimedia playback apparatus determines a legality of the multimedia content according to a content description information of the multimedia content for reasons similar to those for claim 10.
Consider claim 25, Navin and Lewis, II do not, but Hardee discloses the multimedia playback apparatus determines whether the multimedia content contains a restricted content via the network artificial intelligence model (object recognition component 110 determines whether the video is restricted for usage, [0023], [0033], [0035], [0036]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the multimedia playback apparatus determines whether the multimedia content contains a restricted content via the network artificial intelligence model for reasons similar to those for claim 10.
Claims 13 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Navin et al. (US 20210377607) in view of Lewis, II (US 20140064706), in further view of Li et al. (US 20060212897).
Consider claim 13, Navin discloses analyzing the specific keyword with the index table to determine whether to perform the mode switching operation (user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Navin and Lewis, II do not specifically mention the first keyword data comprises a plurality of keywords, and the multimedia playback apparatus assigns a weight value to each keyword; when the multimedia playback apparatus determines that a summed weight value for a specific keyword exceeds a threshold, the multimedia playback apparatus analyzes the specific keyword.
Li discloses the first keyword data comprises a plurality of keywords, and the multimedia playback apparatus assigns a weight value to each keyword (advertising keywords are assigned weights, [0037], [0060]); when the multimedia playback apparatus determines that a summed weight value for a specific keyword exceeds a threshold, the multimedia playback apparatus analyzes the specific keyword (the reweighted keyword vectors are compared to a threshold for ad selection based on the keyword, [0067-0068]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the first keyword data comprises a plurality of keywords, and the multimedia playback apparatus assigns a weight value to each keyword; when the multimedia playback apparatus determines that a summed weight value for a specific keyword exceeds a threshold, the multimedia playback apparatus analyzes the specific keyword in order to increase advertising efficiency, as suggested by Li ([0004]), predictably resulting in increased profits, as suggested by Li ([0004]). The references cited are analogous art in the same field of multimedia.
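For illustration only, the weighted-keyword limitation addressed above for claim 13 can be sketched as follows. This is a minimal sketch under stated assumptions, not code from Li, Navin, or the application: the function `keywords_over_threshold`, the integer weights, and the strict greater-than comparison are hypothetical choices made for the example.

```python
from collections import defaultdict


def keywords_over_threshold(weighted_keywords, threshold):
    """Sum the weights per keyword and return keywords whose total exceeds the threshold.

    weighted_keywords: iterable of (keyword, weight) pairs; a keyword may repeat.
    threshold: assumed cutoff; only totals strictly greater than it qualify.
    """
    totals = defaultdict(int)
    for keyword, weight in weighted_keywords:
        totals[keyword] += weight
    # Sorted so the result does not depend on input order.
    return sorted(k for k, total in totals.items() if total > threshold)
```

Only the keywords returned here would then be analyzed against the index table; keywords whose summed weight stays at or below the threshold are ignored in this sketch.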
Consider claim 26, Navin discloses analyzing the specific keyword with the index table to determine whether to perform the mode switching operation (user device and setting recommendation service determine whether to adjust the visual and audio settings for playback based on the settings associated with the tags, [0036-0037], [0045-0046]).
Navin and Lewis, II do not specifically mention the first keyword data comprises a plurality of keywords, and the multimedia playback apparatus assigns a weight value to each keyword; when the multimedia playback apparatus determines that a summed weight value for a specific keyword exceeds a threshold, the multimedia playback apparatus analyzes the specific keyword.
Li discloses the first keyword data comprises a plurality of keywords, and the multimedia playback apparatus assigns a weight value to each keyword (advertising keywords are assigned weights, [0037], [0060]); when the multimedia playback apparatus determines that a summed weight value for a specific keyword exceeds a threshold, the multimedia playback apparatus analyzes the specific keyword (the reweighted keyword vectors are compared to a threshold for ad selection based on the keyword, [0067-0068]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Navin and Lewis, II such that the first keyword data comprises a plurality of keywords, and the multimedia playback apparatus assigns a weight value to each keyword; when the multimedia playback apparatus determines that a summed weight value for a specific keyword exceeds a threshold, the multimedia playback apparatus analyzes the specific keyword for reasons similar to those for claim 13.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20230206632 Shah discloses fine-grained video frame classification
US 20200136580 Renner discloses dynamic audio equalization during playback based on classifying the audio signal
US 20120215630 Surendran discloses keyword based video contextual advertisements based on speech recognized keywords from the video
US 20080129877 Ohno discloses image control for playback based on video content genre, see Fig 3
US 20100040349 Landy discloses real-time synchronization of a video resource and different audio resources
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias whose telephone number is 571/270-5135. The examiner can normally be reached on M-F 8:00 AM - 4:30 PM. The examiner’s fax number is 571/270-6135.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached on 571/272-7516.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Jesse S Pullias/
Primary Examiner, Art Unit 2655 01/10/26