DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action is responsive to: Application filed 15 Apr. 2024.
Claims 1-20 are pending in this case. Claims 1, 13, and 18 are independent claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Burkitt et al. (Pub. No.: US 2012/0254917 A1; Filed: Apr. 1, 2011) (hereinafter “Burkitt”) in view of Munoz et al. (Pub. No.: US 2024/0386185 A1; Filed: May 19, 2023) (hereinafter “Munoz”).
Regarding independent claim 1, Burkitt discloses a computer-implemented method for analyzing individual components of video content separately and correlating the separate analysis to provide timely content compliance checks against rules, the method comprising (0031; 0038; 0099; 0124; 0132):
receiving, by a categorization engine stored in non-transitory memory of a computer system and executable by a processor of the computer system, a content item for presentation in a next available slot that is based on opportunistic scheduling (0423; 0655);
extracting, by the categorization engine, an audio component and a frames component from the content item, wherein the frames component comprises a set of frames (0126; 0467);
analyzing, by the categorization engine, the audio component and the frames component separately by:
extracting, by the categorization engine, first text from the audio component (0038; 0094; 0096; 0099; 0124; 0132);
determining, by the categorization engine, based at least in part on the audio component, first timing information associated with the first text (0094-0097);
extracting, by the categorization engine, second text from the frames component (0099; 0126; 0138; 0190; 0379);
determining, by the categorization engine, based at least in part on the frames component, second timing information and spatial information associated with the second text (0126; 0138; 0190; 0379);
determining, by the categorization engine, based on a correlation of the first timing information to the second timing information, a temporal relationship between the first text from the audio component and the second text from the frames component (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633);
applying, by a rules engine stored in the non-transitory memory of the computer system and executable by the processor of the computer system, rules that are based on the determined one or more categories to at least one of the first timing information associated with the first text or the second timing information and the spatial information associated with the second text to determine whether the rules are satisfied, wherein the applying comprises (0099; 0126; 0138; 0190; 0379):
determining whether the temporal relationship between the first text from the audio component and the second text from the frames component satisfies a temporal relationship requirement between audio text and visual text specified by a first rule of the rules (0357);
determining whether the second timing information of the second text satisfies a visual text timing requirement specified by a second rule of the rules (0362-0365); and
inserting a different content item in the next available slot rather than the content item in response to determining at least one of the temporal relationship between the first text from the audio component and the second text from the frames component fails to satisfy the temporal relationship requirement specified by the first rule, the second timing information of the second text fails to satisfy the visual text timing requirement specified by the second rule, or the spatial information of the second text fails to satisfy the visual text sizing and location requirement specified in the third rule (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633).
Burkitt does not expressly disclose determining, by the categorization engine, one or more categories of the content item using one or more large language models (LLMs) to generate, using the first text and the second text, a response to one or more questions associated with the one or more categories;
determining whether the spatial information of the second text satisfies a visual text sizing and location requirement specified by a third rule of the rules;
Munoz teaches determining, by the categorization engine, one or more categories of the content item using one or more large language models (LLMs) to generate, using the first text and the second text, a response to one or more questions associated with the one or more categories (0024; 0028; 0036; 0040);
determining whether the spatial information of the second text satisfies a visual text sizing and location requirement specified by a third rule of the rules (0024; 0030-0031).
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Munoz with Burkitt for the benefit of ensuring that documents consistently follow a given formatting requirement, thereby preventing the readability and usability of the documents from falling below acceptable standards (0005).
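For illustration only and not as part of the rejection, the overall flow recited in claim 1 (separate audio/frame analysis, temporal correlation, rule checks, and substitution of a different content item on failure) can be sketched as follows; the class and function names, thresholds, and sample values are hypothetical and are not drawn from Burkitt or Munoz.

    from dataclasses import dataclass

    @dataclass
    class AudioText:          # first text with first timing information
        text: str
        start: float          # seconds
        end: float

    @dataclass
    class VisualText:         # second text with second timing and spatial information
        text: str
        start: float          # seconds
        end: float
        bbox: tuple           # (x, y, width, height) in pixels

    def temporal_rule_ok(a: AudioText, v: VisualText, max_offset_s: float = 1.0) -> bool:
        # First rule: audio text and matching visual text must appear within a
        # hypothetical maximum offset of one another.
        return abs(a.start - v.start) <= max_offset_s

    def visual_timing_rule_ok(v: VisualText, min_on_screen_s: float = 3.0) -> bool:
        # Second rule: visual text must remain on screen for a minimum duration.
        return (v.end - v.start) >= min_on_screen_s

    def select_for_slot(content_item, fallback_item, a: AudioText, v: VisualText):
        # Insert a different content item in the next available slot when any rule fails.
        if temporal_rule_ok(a, v) and visual_timing_rule_ok(v):
            return content_item
        return fallback_item

    print(select_for_slot("ad_A", "ad_B",
                          AudioText("terms apply", 4.0, 5.0),
                          VisualText("terms apply", 4.5, 8.0, (50, 900, 600, 40))))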
Regarding dependent claim 2, Burkitt discloses the method of claim 1, wherein the first timing information associated with the first text from the audio component comprises at least one of a start time, an end time, or a duration of each of one or more words in the first text (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633).
Regarding dependent claim 3, Burkitt in view of Munoz discloses the method of claim 1, wherein the spatial information associated with the second text from the frames component comprises at least one of a size or a pixel coordinate of each of one or more textual characters, and wherein the second timing information associated with the second text comprises a start time of each of the one or more textual characters based on a respective pixel coordinate (0024; 0030-0031).
Regarding dependent claim 4, Burkitt in view of Munoz discloses the method of claim 3, further comprising:
calculating, by the categorization engine, the size of an individual textual character of the one or more textual characters as a percentage of a frame size of an individual frame of the set of frames (0024; 0030-0031).
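As a purely illustrative reading of the calculation recited in claim 4, one way to express a textual character's size as a percentage of the frame size is shown below; the pixel values are hypothetical examples.

    def char_size_percent(char_w_px: float, char_h_px: float,
                          frame_w_px: int, frame_h_px: int) -> float:
        # Express the character's area as a percentage of the individual frame's area.
        return 100.0 * (char_w_px * char_h_px) / (frame_w_px * frame_h_px)

    # A 24x40 px character in a 1920x1080 frame covers roughly 0.046% of the frame.
    print(round(char_size_percent(24, 40, 1920, 1080), 3))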
Regarding dependent claim 5, Burkitt in view of Munoz discloses the method of claim 1, wherein the extracting the first text from the audio component comprises:
detecting, by the categorization engine, using a speech recognition model, speech from the audio component (0025-0026; 0040-0042); and
converting, by the categorization engine, using a speech-to-text model, the detected speech into the first text (0025-0026; 0040-0042).
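For illustration, the two-stage extraction recited in claim 5 (speech recognition followed by speech-to-text conversion) might be organized as sketched below; both model classes are stubs returning canned example output and are not an implementation from either reference.

    class SpeechDetector:
        def detect(self, audio_samples):
            # Stub: return (start_s, end_s) spans that contain speech.
            return [(0.0, 2.5)]

    class SpeechToText:
        def transcribe(self, audio_samples, span):
            # Stub: return (word, start_s, end_s) tuples for one speech span.
            return [("limited", 0.2, 0.7), ("time", 0.8, 1.1), ("offer", 1.2, 1.8)]

    def extract_first_text(audio_samples, detector, stt):
        words = []
        for span in detector.detect(audio_samples):            # speech recognition model
            words.extend(stt.transcribe(audio_samples, span))  # speech-to-text model
        return words  # first text together with first timing information

    print(extract_first_text(b"", SpeechDetector(), SpeechToText()))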
Regarding dependent claim 6, Burkitt discloses the method of claim 1, wherein the extracting the second text from the frames component comprises:
identifying, by the categorization engine, using an optical character recognition (OCR) model, textual characters from the frames component (0124; 0132; 0176); and
generating, by the categorization engine, from the identified textual characters and based on the second timing information and the spatial information associated with the second text, a sequence of words corresponding to the second text (0099; 0126; 0138; 0190; 0379).
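A minimal sketch of the word-sequence generation recited in claim 6, assuming per-character OCR output that carries an appearance time and pixel coordinates; the OcrChar structure and sample data are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class OcrChar:
        char: str
        t: float    # seconds at which the character first appears
        x: int      # left pixel coordinate
        y: int      # top pixel coordinate

    def assemble_second_text(chars):
        # Order characters by appearance time, then top-to-bottom, then left-to-right,
        # to recover a reading-order sequence of characters and words.
        ordered = sorted(chars, key=lambda c: (c.t, c.y, c.x))
        return "".join(c.char for c in ordered)

    sample = [OcrChar("i", 0.0, 130, 40), OcrChar("H", 0.0, 100, 40)]
    print(assemble_second_text(sample))   # -> "Hi"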
Regarding dependent claim 7, Burkitt discloses the method of claim 1, wherein:
the determining the first timing information associated with the first text is further based on a correlation of a timeline of the audio component to a timeline of the frames component and a video timeline of the content item (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633),
the determining the second timing information associated with the second text is further based on a correlation of the timeline of the frames component to the timeline of the audio component and the video timeline of the content item (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633), and
the determining the temporal relationship between the first text from the audio component and the second text from the frames component is further based on a timing of the first text from the audio component with respect to the video timeline and a timing of the second text from the frames component with respect to the video timeline (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633).
Regarding dependent claim 8, Burkitt discloses the method of claim 7, further comprising:
correlating, by the categorization engine, the timeline of the frames component to the video timeline based on a first frame rate associated with the frames component and a second frame rate associated with a rendering of the content item, wherein the first frame rate is different than the second frame rate (0126; 0467).
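The frame-rate correlation recited in claim 8 can be illustrated with simple arithmetic; the sampling and rendering rates below are hypothetical examples.

    def map_to_video_timeline(frame_index: int, analysis_fps: float,
                              rendering_fps: float) -> tuple:
        # A frame sampled for analysis at analysis_fps occurs at frame_index / analysis_fps
        # seconds on the video timeline; the rendering frame rate identifies which
        # rendered frame that instant corresponds to.
        t = frame_index / analysis_fps
        rendered_frame = round(t * rendering_fps)
        return t, rendered_frame

    # Analysis frame 30 sampled at 10 fps lies at t = 3.0 s, i.e. rendered frame 90 at 30 fps.
    print(map_to_video_timeline(30, 10.0, 30.0))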
Regarding dependent claim 9, Burkitt discloses the method of claim 1, wherein the one or more categories of the content item comprises at least one of politics, healthcare, pharmaceutical, cannabidiol (CBD), alcohol, adult content, gun, or gambling (0190).
Regarding dependent claim 10, Burkitt discloses the method of claim 1, further comprising:
identifying, by the categorization engine, using an object detection model, an object in at least one frame of the set of frames (0362-0365); and
determining, by the categorization engine, based at least in part on the frames component, third timing information and the spatial information associated with the object, wherein the applying the rules further comprises (0362-0365):
determining whether the third timing information of the object satisfies a visual object timing requirement specified by a fourth rule of the rules (0362-0365); and
determining whether the spatial information of the object satisfies a visual object sizing and location requirement specified by a fifth rule of the rules (Munoz 0024; 0030-0031).
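For illustration only, the fourth and fifth rules recited in claim 10 (visual object timing and sizing/location requirements) might be checked as below; the thresholds and sample values are hypothetical.

    def object_rules_ok(obj_start_s, obj_end_s, obj_bbox, frame_size,
                        min_duration_s=2.0, min_area_pct=1.0):
        # Fourth rule: the detected object must stay visible for a minimum duration.
        # Fifth rule: its bounding box must cover a minimum share of the frame area.
        x, y, w, h = obj_bbox
        frame_w, frame_h = frame_size
        duration_ok = (obj_end_s - obj_start_s) >= min_duration_s
        area_ok = 100.0 * (w * h) / (frame_w * frame_h) >= min_area_pct
        return duration_ok and area_ok

    print(object_rules_ok(1.0, 4.0, (100, 100, 300, 200), (1920, 1080)))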
Regarding dependent claim 11, Burkitt in view of Munoz discloses the method of claim 1, wherein the applying the rules comprises:
determining, by the rules engine, that the first text extracted from the audio component includes a specific message satisfying a message requirement specified by a fourth rule of the rules, using at least one LLM to generate, using the first text, a response to one or more questions associated with the specific message (0024; 0028; 0036; 0040).
Regarding dependent claim 12, Burkitt in view of Munoz discloses the method of claim 1, wherein:
the applying the rules comprises:
determining, by the rules engine, that the second text extracted from the frames component includes a specific message satisfying a message requirement specified by a fourth rule of the rules, using at least one LLM to generate, using the second text, a response to one or more questions associated with the specific message (0024; 0028; 0036; 0040),
the determining whether the second timing information of the second text satisfies the visual text timing requirement comprises:
determining, by the rules engine, based on the second timing information, at least one of a start time, an end time, or a duration associated with the specific message (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633); and
determining, by the rules engine, whether the at least one of the start time, the end time, or the duration with the specific message satisfies the visual text timing requirement, and
the determining whether the spatial information of the second text satisfies the visual text sizing and location requirement comprises (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633):
determining, by the rules engine, based on the spatial information associated with the second text, a size of an individual textual character in the specific message with respect to a size of an individual frame of the set of frames and a location of the individual textual character with respect to the individual frame (0024; 0030-0031); and
determining whether the size and the location of the individual textual character in the specific message satisfies the visual text sizing and location requirement (0024; 0030-0031).
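As a hypothetical illustration of the LLM-based message check recited in claims 11 and 12, an affirmative answer to a yes/no question could drive the rule determination; ask_llm stands in for any model call, and the prompt wording is only an example.

    def message_rule_ok(ask_llm, extracted_text: str, required_message: str) -> bool:
        # Pose one yes/no question about whether the required disclosure appears
        # in the text extracted from the audio or frames component.
        prompt = (f"Does the following text contain the disclosure "
                  f"'{required_message}'? Answer yes or no.\n\n{extracted_text}")
        return ask_llm(prompt).strip().lower().startswith("yes")

    # Example with a stand-in model that always answers "Yes".
    print(message_rule_ok(lambda p: "Yes", "Terms and conditions apply.", "terms and conditions"))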
Regarding independent claim 13, Burkitt discloses a computer-implemented method for analyzing individual portions of media content to provide content categorization and content compliance checks against rules depending on the content categorization, the method comprising:
extracting, by a categorization engine stored in non-transitory memory of a computer system and executable by a processor of the computer system, an audio portion and a frames portion from a content item (0126; 0467);
extracting, by the categorization engine, first text from the audio portion of the content item (0038; 0094; 0096; 0099; 0124; 0132);
determining, by the categorization engine, based at least in part on the audio portion, first timing information associated with the first text (0094-0097);
extracting, by the categorization engine, at least one of second text or an object from the frames portion of the content item (0099; 0126; 0138; 0190; 0379);
determining, by the categorization engine, based at least in part on the frames portion, second timing information and spatial information associated with the at least one of the second text or the object (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633);
determining, by a rules engine stored in the non-transitory memory of the computer system and executable by the processor of the computer system, one or more rules based on the determined one or more categories (0072-0078; 0190);
applying, by the rules engine, the one or more rules to at least one of the first timing information associated with the first text, the second timing information associated with the at least one of the second text or the object, or the spatial information associated with the at least one of the second text or the object to determine whether the rules are satisfied (0099; 0126; 0138; 0190; 0379); and
displaying, via a user interface (UI) of the computer system, at least a portion of the content item with at least one indicator indicating an issue in the portion of the content item, wherein the issue is based on a determination that the at least one of the first text in association with the first timing information or the second text in association with the spatial information fails to satisfy one or more of the rules (0145; 0300; 0362-0365).
Burkitt does not expressly disclose determining, by the categorization engine, one or more categories of the content item using one or more large language models (LLMs) to generate, based on the first text and the at least one of the second text or the object, a response to one or more prompts associated with the one or more categories.
Munoz teaches determining, by the categorization engine, one or more categories of the content item using one or more large language models (LLMs) to generate, based on the first text and the at least one of the second text or the object, a response to one or more prompts associated with the one or more categories (0024; 0028; 0036; 0040).
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Munoz with Burkitt for the benefit of ensuring that documents consistently follow a given formatting requirement, thereby preventing the readability and usability of the documents from falling below acceptable standards (0005).
Regarding dependent claim 14, Burkitt discloses the method of claim 13, wherein:
the extracting the at least one of the second text or the object from the frames portion comprises:
identifying, by the categorization engine, using an object detection model, the object in at least one frame of the frames portion (0126; 0467), and
the determining the one or more categories of the content item comprises:
determining, by the categorization engine, a subject matter of the content item based on the first text extracted from the audio portion (0038; 0094; 0096; 0099; 0124; 0132); and
determining, by the categorization engine, contextual information associated with the content item based on the subject matter and a frequency of appearance of the at least one of the second text or the object in the frames portion (0126; 0138; 0190; 0379), and
the determining the one or more categories of the content item is further based on the contextual information (0096; 0132-0134).
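The frequency-based contextual determination recited in claim 14 can be sketched as below; the labels and the simple per-frame counting heuristic are hypothetical.

    from collections import Counter

    def contextual_info(subject_matter: str, per_frame_labels):
        # Count how often each detected text or object label appears across the
        # sampled frames and pair that with the audio-derived subject matter.
        counts = Counter(label for frame in per_frame_labels for label in frame)
        total = max(len(per_frame_labels), 1)
        frequency = {label: n / total for label, n in counts.items()}
        return {"subject": subject_matter, "frequency": frequency}

    print(contextual_info("beverages", [["logo"], ["logo", "bottle"], ["bottle"]]))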
Regarding dependent claim 15, Burkitt discloses the method of claim 13, wherein the determining the one or more rules is further based on at least one of a customer policy (0438; 0476).
Regarding dependent claim 16, Burkitt in view of Munoz discloses the method of claim 13, further comprising:
determining, by the categorization engine, a series of prompts based on a plurality of specific categories, wherein the series of prompts comprise the one or more prompts (0026).
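For illustration, the series of prompts recited in claim 16 could be generated with one question per candidate category; the categories and wording below are placeholders.

    def build_prompt_series(categories):
        # One yes/no question per candidate category.
        return [f"Based on the transcript and on-screen text, is this content item "
                f"about {category}? Answer yes or no." for category in categories]

    for prompt in build_prompt_series(["politics", "alcohol", "gambling"]):
        print(prompt)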
Regarding dependent claim 17, Burkitt discloses the method of claim 13, wherein the displaying further comprises displaying, via the UI, a recommendation to correct the issue in the portion of the content item (0145; 0300; 0362-0365).
Regarding independent claim 18, Burkitt discloses a computer-implemented method for analyzing individual portions of media content to provide content compliance checks against rules (0031; 0038; 0099; 0124; 0132), the method comprising:
receiving, by a categorization engine stored in non-transitory memory of a computer system and executable by a processor of the computer system, a content item (0423; 0655);
analyzing, by the categorization engine, an audio portion and a frames portion of the content item separately, wherein the analyzing comprises:
extracting, by the categorization engine, first text from the audio portion (0038; 0094; 0096; 0099; 0124; 0132);
determining, by the categorization engine, based at least in part on the audio portion, first timing information associated with the first text in metadata (0094-0097);
extracting, by the categorization engine, second text from the frames portion (0099; 0126; 0138; 0190; 0379);
determining, by the categorization engine, based at least in part on the frames portion, second timing information and spatial information associated with the second text (0126; 0138; 0190; 0379);
determining, by the categorization engine, based on the first text and the second text, one or more categories of the content item (0359; 0378; 0440; 0459-0461; 0496-0497; 0630; 0633);
applying, by a rules engine stored in the non-transitory memory of the computer system and executable by the processor of the computer system, one or more rules that are based on the determined one or more categories to determine whether at least one of the first text, the first timing information associated with the first text, the second text, the second timing information associated with the second text, or the spatial information associated with the second text satisfy the one or more rules (0099; 0126; 0138; 0190; 0379).
Burkitt does not expressly disclose outputting, by a presentation component stored in the non-transitory memory of the computer system and executable by the processor of the computer system, based on the applying, an indication of whether there is an issue in at least a portion of the content item.
Munoz teaches outputting, by a presentation component stored in the non-transitory memory of the computer system and executable by the processor of the computer system, based on the applying, an indication of whether there is an issue in at least a portion of the content item (0024; 0026-0028; 0066-0067).
Therefore, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Munoz with Burkitt for the benefit of ensuring that documents consistently follow a given formatting requirement, thereby preventing the readability and usability of the documents from falling below acceptable standards (0005).
Regarding dependent claim 19, Burkitt in view of Munoz discloses the method of claim 18, further comprising:
modifying, by an action component stored in the non-transitory memory of the computer system and executable by the processor of the computer system, based on the indication, the at least the portion of the content item (0024; 0026-0028; 0066-0067).
Regarding dependent claim 20, Burkitt discloses the method of claim 18, further comprising:
providing, by an action component stored in the non-transitory memory of the computer system and executable by the processor of the computer system, based on the indication, the content item for streaming (0265; 0638-0639).
NOTE
It is noted that any citations to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art. See MPEP 2123.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAMES J DEBROW, whose telephone number is (571) 272-5768. The examiner can normally be reached 9:00 AM - 6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Bashore can be reached on 571-272-4088. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center and the Private Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from Patent Center or Private PAIR. Status information for unpublished applications is available through Patent Center or Private PAIR to authorized users only. Should you have questions about access to Patent Center or the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated- interview-request-air-form.
/James J Debrow/
Primary Patent Examiner
Art Unit 2174
571-272-5768