Prosecution Insights
Last updated: April 19, 2026
Application No. 18/337,667

SYSTEM AND METHOD FOR DETERMINING MULTI-PARTY COMMUNICATION INSIGHTS

Non-Final OA §101 §103 §112
Filed
Jun 20, 2023
Examiner
CHUNG, DANIEL WONSUK
Art Unit
2659
Tech Center
2600 — Communications
Assignee
Verizon Patent and Licensing Inc.
OA Round
3 (Non-Final)
54%
Grant Probability
Moderate
3-4
OA Rounds
2y 10m
To Grant
92%
With Interview

Examiner Intelligence

Grants 54% of resolved cases
54%
Career Allow Rate
24 granted / 44 resolved
-7.5% vs TC avg
Strong +37.5% interview lift
+37.5%
Interview Lift
resolved cases with an interview vs. without
Typical timeline
2y 10m
Avg Prosecution
33 currently pending
Career history
77
Total Applications
across all art units
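
The headline figures above follow from the raw counts by simple arithmetic. A minimal sketch in Python, assuming the tool adds the interview lift directly to the base allow rate (an inference from the displayed 54% and 92% figures, not a documented methodology):

```python
# Reproducing the dashboard metrics from the counts shown above.
# The additive interview adjustment is an assumption inferred from
# the displayed 54% -> 92% jump, not a documented formula.
granted, resolved = 24, 44

allow_rate = granted / resolved                # 24/44 = 0.545 (shown as 54%)
interview_lift = 0.375                         # the +37.5% lift shown above
with_interview = allow_rate + interview_lift   # 0.920 (shown as 92%)

print(f"Career allow rate: {allow_rate:.1%}")      # 54.5%
print(f"With interview:    {with_interview:.1%}")  # 92.0%
```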

Statute-Specific Performance

Statute   Rate     vs TC Avg
§101      25.2%    -14.8%
§103      52.3%    +12.3%
§102      17.3%    -22.7%
§112       5.2%    -34.8%

Tech Center averages are estimates • Based on career data from 44 resolved cases

Office Action

§101 §103 §112
DETAILED ACTION

This communication is in response to the Amendments and Arguments filed on 1/13/2026. Claims 1-20 are pending and have been examined. All previous objections/rejections not mentioned in this Office Action have been withdrawn by the examiner.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendments

Applicant’s arguments filed on 1/13/2026 have been fully considered, but they are not persuasive. Applicant has amended independent claims 1, 8, and 14.

Regarding Applicant’s arguments for the rejections under 35 U.S.C. § 101, Applicant asserts that the independent claim limitations cannot be performed in the mind and are directed to a technological solution that improves computer functionality in processing and analyzing multi-party communication data. Applicant refers to specific steps in the claim language and explains that, due to their processing complexity or technical requirements, the steps cannot be performed in the mind. Examiner respectfully disagrees.

During patent examination, pending claims must be “given their broadest reasonable interpretation consistent with the specification.” MPEP 2111. Also, claims should not be interpreted by reading limitations of the specification into the claim, narrowing the scope of the claim by implicitly adding disclosed limitations that have no express basis in the claim language. In re Prater, 415 F.2d 1393. Here, the steps in the claim language are broad, and the examiner interprets the claims broadly.

First, the steps recited in the claim limitations can be performed in the mind. Specifically, the human mind can hear segments of audio; the human mind can segment audio segments that are overlapping; the human mind can determine time-domain features for each segment, such as pitch or prosody; a trained machine learning model can be interpreted as a set of rules or instructions, and the human mind can follow determined rules or instructions; the use of different types of vectors can be interpreted as pitch or prosody of the speech segments, which the human mind can use to determine the speaker for speech segments; the human mind can think of a summary of the conversation; the human mind can think of a main topic of the conversation; and a human can write the main topic on paper using a pen or pencil. The claims encompass mental observations or evaluations that can be practically performed in the human mind.

Second, MPEP 2106.05(f) provides the following considerations for determining whether a claim simply recites a judicial exception with the words “apply it” (or an equivalent), such as mere instructions to implement an abstract idea on a computer: (1) whether the claim recites only the idea of a solution or outcome, i.e., the claim fails to recite details of how a solution to a problem is accomplished; (2) whether the claim invokes computers or other machinery merely as a tool to perform an existing process; and (3) the particularity or generality of the application of the judicial exception. Here, the steps to obtain multi-party communication data, generate diarized MPC data by segmenting audio, generate processed diarized MPC data, apply a summarization model to generate an MPC summary, determine an MPC insight based on the MPC summary, and output the MPC insight on a display are limitations that only recite an outcome and do not include any details of how the steps are accomplished.
The claims utilize a trained machine learning model and extracted features, but fail to provide technical details of how the machine learning model is utilized, how and what type of features are extracted, and how the extracted features are utilized to diarize speech. Therefore, the claims as currently recited do not overcome the 35 U.S.C. § 101 abstract idea rejection.

Regarding Applicant’s arguments for the rejections under 35 U.S.C. § 103, Applicant has amended the independent claims to add new limitations. The added limitations raise new grounds for rejection. Since Applicant’s arguments are directed towards the new amendments, the arguments are moot in view of the new grounds for rejection. Therefore, the claims as currently recited do not overcome the prior art references.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 8, and 14 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Specifically, the as-filed disclosure does not disclose “the event timeline comprising a speaker identity and a content attributed to the speaker identity” or “the identification comprising matching a text transcript associated with a respective segment to the content attributed to the speaker identity in the event timeline”. The as-filed specification discloses an event timeline comprising an anonymized speaker label for speech segments that have start and end times. Moreover, the as-filed disclosure does not disclose a matching procedure to match a respective segment to the content attributed to the speaker identity.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claims 1, 8, and 14, the limitations of “obtaining multi-party communication (MPC) data corresponding to an MPC, the MPC data comprising an audio component and an event timeline, the event timeline comprising a speaker identity and a content attributed to the speaker identity”; “generating diarized MPC data based on the MPC data by: segmenting the audio component into overlapping sliding window segments, extracting at least one of a spectral feature or a time-domain feature from each segment, determining, via at least one trained machine learning model, based on the at least one extracted feature from each segment, at least two of an i_Vector, a d_Vector, and an x_Vector for each segment, and assigning, based on at least two vectors for each segment, an anonymized speaker label to each segment”; “generating processed diarized MPC data by identifying the speaker corresponding to the anonymized speaker label based on the event timeline, the identification comprising matching a text transcript associated with a respective segment to the content attributed to the speaker identity in the event timeline”; “applying a summarization model to the processed diarized MPC data to generate an MPC summary”; “determining an MPC insight corresponding to the MPC based on the MPC summary and the MPC data”; and “outputting, for display on a device, the MPC insight”, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. More specifically, the claims recite the mental process of a human hearing segments of audio, segmenting audio segments that are overlapping, determining time-domain features for each segment such as pitch or prosody, following determined rules or instructions with the use of pitch or prosody for the speech segments, determining the speaker for speech segments, thinking of a summary of the conversation, thinking of a main topic of the conversation, and writing the main topic on paper using a pen or pencil.

This judicial exception is not integrated into a practical application because the recitation of a non-transitory computer-readable storage medium in claim 8 and a device in claim 14 reads on generalized computer components, based upon the claim interpretation wherein the structure is interpreted using P0113-P0122 of the specification. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.

The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using generalized computer components to hear segments of audio, segment audio segments that are overlapping, determine time-domain features for each segment such as pitch or prosody, follow determined rules or instructions with the use of pitch or prosody for the speech segments, determine the speaker for speech segments, think of a summary of the conversation, think of a main topic of the conversation, and write the main topic on paper using a pen or pencil amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.
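
For orientation, the diarization steps recited in claim 1 (overlapping sliding windows, spectral and time-domain features, per-segment embedding vectors) can be pictured with a toy sketch. This is an illustration only, not the applicant's implementation: the window sizes, the feature choices, and the stand-in embedding function are all assumptions, and a real system would use trained i-vector/d-vector/x-vector extractors.

```python
# Minimal, illustrative sketch of the claimed diarization flow.
# All parameters and the stand-in "embed" function are assumptions.
import numpy as np

def sliding_windows(audio: np.ndarray, sr: int, win_s=1.5, hop_s=0.75):
    """Segment audio into overlapping sliding-window segments."""
    win, hop = int(win_s * sr), int(hop_s * sr)
    return [audio[i:i + win] for i in range(0, len(audio) - win + 1, hop)]

def time_domain_features(seg: np.ndarray) -> np.ndarray:
    """Simple time-domain features: energy and zero-crossing rate."""
    energy = float(np.mean(seg ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(seg))) > 0))
    return np.array([energy, zcr])

def spectral_features(seg: np.ndarray) -> np.ndarray:
    """Simple spectral feature: centroid of the magnitude spectrum."""
    mag = np.abs(np.fft.rfft(seg))
    bins = np.arange(len(mag))
    centroid = float((bins * mag).sum() / (mag.sum() + 1e-9))
    return np.array([centroid])

def embed(seg: np.ndarray) -> np.ndarray:
    """Stand-in for a trained embedding model (i/d/x-vector extractor)."""
    return np.concatenate([time_domain_features(seg), spectral_features(seg)])

sr = 16000
audio = np.random.randn(sr * 10)  # 10 s of placeholder audio
segments = sliding_windows(audio, sr)
embeddings = np.stack([embed(s) for s in segments])
print(embeddings.shape)  # (num_segments, feature_dim), one vector per window
```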
With respect to claims 2, 9, and 15, the claims recite “combining the at least two vectors for each segment”, “clustering the combined vectors into an N-number of clusters, the N-number of clusters corresponding to a number of speakers associated with the MPC”, and “assigning the anonymized speaker label to each cluster”, which reads on a human thinking of segments of a conversation, writing the voice characteristics and prosody of the speech segments together on paper using a pen or pencil, determining which speech segments have similar voice characteristics and prosody, and labeling the determined speech segments with an anonymized speaker on paper. No additional limitations are present.

With respect to claims 3, 10, and 16, the claims recite “extracting a plurality of video frames from a video component of the MPC data”, “deduplicating the video frames to obtain unique frames”, “performing Optical Character Recognition (OCR) on each unique frame to obtain a plurality of keywords”, and “generating the set of MPC keywords by ranking the plurality of keywords”, which reads on a human thinking of points in time while watching people have a conversation and reading and thinking of words that were visually presented. No additional limitations are present.

With respect to claims 4, 11, and 17, the claims recite “ranking the plurality of keywords using a Term Frequency - Inverse Document Frequency (TF-IDF) technique”, which reads on a human thinking of how many times specific words were presented visually. No additional limitations are present.

With respect to claims 5, 12, and 18, the claims recite “wherein the MPC insight is a set of MPC chapter names, the method further comprising determining the MPC insight by applying a chapterization model to the MPC keywords”, which reads on a human thinking of topics that were discussed by thinking of the words that were presented visually. No additional limitations are present.

With respect to claims 6, 13, and 19, the claims recite “wherein the chapterization model includes a Bidirectional and Auto-Regressive Transformers (BART) architecture”, which reads on a human thinking of topics that were discussed by thinking of the words that were presented visually. The use of the BART architecture is a conventional technique in natural language processing and does not show an improvement in functionality that demonstrates patent eligibility. MPEP 2106.05(a). No additional limitations are present.

With respect to claims 7 and 20, the claims recite “wherein the chapterization model is evaluated during training on embedding based distance metrics”, which reads on a human thinking of topics that were discussed by thinking of the words that were presented visually. Training and using distance metrics are generally commonplace methods and do not show an improvement in functionality that demonstrates patent eligibility. MPEP 2106.05(a). No additional limitations are present.

These dependent claims likewise fail to integrate the judicial exception into a practical application and fail to include additional elements that are sufficient to amount to significantly more than the judicial exception.
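
Claim 2's combine/cluster/label steps map onto standard clustering machinery. A minimal sketch, assuming agglomerative clustering and random stand-in vectors; the claim itself names no particular algorithm:

```python
# Illustrative sketch of the claim 2 steps: combine per-segment vectors,
# cluster into N speaker clusters, assign anonymized labels.
# The clustering algorithm and toy data are assumptions.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
x_vectors = rng.normal(size=(30, 64))   # stand-ins for per-segment x-vectors
d_vectors = rng.normal(size=(30, 64))   # stand-ins for per-segment d-vectors

# "combining the at least two vectors for each segment"
combined = np.concatenate([x_vectors, d_vectors], axis=1)

# "clustering the combined vectors into an N-number of clusters"
n_speakers = 3
clusters = AgglomerativeClustering(n_clusters=n_speakers).fit_predict(combined)

# "assigning the anonymized speaker label to each cluster"
labels = [f"SPEAKER_{c:02d}" for c in clusters]
print(labels[:5])
```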
Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 8, 9, 14, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Fanelli et al. (U.S. PG Pub No. 20240160849), hereinafter Fanelli, in view of Daredia et al. (U.S. PG Pub No. 20200403817), hereinafter Daredia.

Regarding claims 1, 8, and 14, Fanelli teaches:

(Claim 1) A method comprising: (P0005, Method comprises.)

(Claim 8) A non-transitory computer-readable storage medium for storing instructions executable by a processor, the instructions comprising: (P0023, A non-transitory computer-readable storage medium stores at least one program for execution by at least one processor of an electronic device.)

(Claim 14) A device comprising a processor configured to: (P0023, A non-transitory computer-readable storage medium stores at least one program for execution by at least one processor of an electronic device.)

obtaining multi-party communication (MPC) data corresponding to an MPC, the MPC data comprising an audio component and an event timeline, the event timeline comprising a speaker identity and a content attributed to the speaker identity; (P0050, Diarization pipeline takes as input media data (e.g., a mono or stereo audio file) containing utterances (e.g., speech) and returns as output corresponding segments (e.g., speech segments) with an associated tag for each speaker detected in the audio file.)

generating diarized MPC data based on the MPC data by: segmenting the audio component into overlapping sliding window segments, extracting at least one of a spectral feature or a time-domain feature from each segment, determining, via at least one trained machine learning model, based on at least one extracted feature from each segment, at least two of an i_Vector, a d_Vector, and an x_Vector for each segment, and assigning, based on at least two vectors for each segment, an anonymized speaker label to each segment; (P0055, VAD detects the speech of multiple speakers and performs overlapped speech detection. The speech detections are used to isolate and label speech segments, i.e., portions of the blocks containing speech … Each audio channel is processed by a block-based embeddings extraction component, which performs feature extraction.; P0056, The isolated speech segments are input into the embedding extraction component together with data for overlapped speech detection and speaker change detection. … Embedding extraction component and embedding generation model to extract embeddings (e.g., a multidimensional vectorial representation of a speech segment) from the isolated speech segments.; P0059, The embedding model is a deep neural network (DNN) trained to have a speaker-discriminative embeddings layer.; P0060, Multiple embeddings of speech segment 1 are generated and features are extracted from each embedding of SEGM1.; P0073, After embeddings extraction and processing, embeddings are clustered into different speakers.)
generating processed diarized MPC data by identifying the speaker corresponding to the anonymized speaker label based on the event timeline, the identification comprising matching a text transcript associated with a respective segment to the content attributed to the speaker identity in the event timeline; (P0005, Identifying, with the at least one processor, segments of each block of the plurality of blocks associated with a single speaker. … Assigning, with the at least one processor, a speaker label to each of the embeddings for the identified segments in accordance with a result of the clustering; and outputting, with the at least one processor, speaker diarization information associated with the media data based in part on the speaker labels.; P0050, Diarization pipeline takes as input media data (e.g., a mono or stereo audio file) containing utterances (e.g., speech) and returns as output corresponding segments (e.g., speech segments) with an associated tag for each speaker detected in the audio file.)

Fanelli does not specifically teach: obtaining multi-party communication (MPC) data corresponding to an MPC, the MPC data comprising an audio component and an event timeline comprising a speaker identity and a content attributed to the speaker identity; applying a summarization model to the processed diarized MPC data to generate an MPC summary; determining an MPC insight corresponding to the MPC based on the MPC summary and the MPC data; and outputting, for display on a device, the MPC insight.

Daredia, however, teaches:

obtaining multi-party communication (MPC) data corresponding to an MPC, the MPC data comprising an audio component and an event timeline comprising a speaker identity and a content attributed to the speaker identity; (P0017, Receives media data (e.g., audio data or video data) and information associated with user input(s) to client device(s) of users participating in a meeting (e.g., where the user inputs are provided by the users to indicate important or relevant portions of the meeting).)

applying a summarization model to the processed diarized MPC data to generate an MPC summary; (P0017, Digital content management system that generates meeting insights (e.g., summaries, highlights, or action items) for providing to one or more users based on media data, documentation, and user inputs to client devices associated with a meeting.; P0024, The content management system also uses data from past meetings to train a machine-learning model to automatically tag or suggest insights for meetings.)

determining an MPC insight corresponding to the MPC based on the MPC summary and the MPC data; and (P0103, The content management system can then identify the text corresponding to the relevant portions and paste the text or summarize/rephrase the text into the meeting summary as highlights. The highlights can include bullet points or other easily digestible representations of the discussion points from the meeting.)

outputting, for display on a device, the MPC insight. (P0075, Display meeting insights.; P0145, Devices for presenting output to a user, including, but not limited to, a graphics engine, a display (e.g., a display screen).; P0005, The meeting insights can include, for example, a summary of the meeting, a list of highlights from the information covered during the meeting.)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to generate a summary of a conversation between speakers.
It would have been obvious to combine the references because creating a summary saves time in reading transcripts or listening to an entire recorded meeting when participants need to review past meetings. (Daredia, P0003)

Regarding claims 2, 9, and 15, Fanelli in view of Daredia teaches claims 1, 8, and 14. Fanelli further teaches:

combining the at least two vectors for each segment; (P0056, Embedding extraction component and embedding generation model to extract embeddings (e.g., a multidimensional vectorial representation of a speech segment) from the isolated speech segments.; P0060, Multiple embeddings of speech segment 1 are generated and features are extracted from each embedding of SEGM1.)

clustering the combined vectors into an N-number of clusters, the N-number of clusters corresponding to a number of speakers associated with the MPC; and (P0059, The rationale behind generating embeddings is to facilitate the clustering of different speakers, thus succeeding in performing speaker diarization. Models, such as, for example, an embedding generation model, are used to generate embeddings by mapping or converting speech segments to a representation in multidimensional space.)

assigning the anonymized speaker label to each cluster. (P0061, The average embeddings for Blocks 1, 2 and 3 are then input into clustering processes (e.g., short and long segment clustering). Each cluster represents the speech from one of the 3 speakers, e.g., 3 clusters in three different regions of the multidimensional feature space.; P0075, The resulting improved and postprocessed embeddings are used by the segmentation component to segment the audio file into segments with associated speaker labels/identifiers.)

Claims 3-5, 10-12, and 16-18 are rejected under 35 U.S.C. 103 as being unpatentable over Fanelli in view of Daredia and further in view of Erol et al. (U.S. Patent No. 7298930), hereinafter Erol.

Regarding claims 3, 10, and 16, Fanelli in view of Daredia teaches claims 1, 8, and 14. Fanelli in view of Daredia does not specifically teach: extracting a plurality of video frames from a video component of the MPC data; deduplicating the video frames to obtain unique frames; performing Optical Character Recognition (OCR) on each unique frame to obtain a plurality of keywords; and generating the set of MPC keywords by ranking the plurality of keywords.

Erol, however, teaches:

extracting a plurality of video frames from a video component of the MPC data; (Col. 7, Lines 32-37, The video information collected by the video capture functionality. In accordance with an embodiment of the present invention, the video is captured with an omni-directional camera and converted to the MPEG-2 data format for storage and subsequent access.; Col. 12, Lines 16-18, A visual significance score (Va) can be produced for each video frame of the video component of the captured meeting recording.)
deduplicating the video frames to obtain unique frames; (Col. 7, Lines 45-58, A visual significance measure is generated based on local luminance changes in a video sequence. A large luminance difference between two consecutive frames is generally an indication of a significant content change, such as a person getting up and moving around. However, insignificant events, such as dimming the lights or all the participants moving slightly, may also result in a large luminance difference between two frames. In order to reduce the likelihood of identifying such events as being significant, the visual significance measure, according to an embodiment of the invention, can be determined by considering luminance changes occurring in small windows of a video frame rather than a single luminance change of the whole frame.; Col. 9, Lines 47-49, The visual significance metric in turn can serve as a basis for identifying visually significant events in a video recording.)

performing Optical Character Recognition (OCR) on each unique frame to obtain a plurality of keywords; and (Col. 15, Lines 42-45, Optical character recognition (OCR) techniques can be applied to the captured slide images to extract textual content from the slides.; Col. 15, Lines 66-67, and Col. 16, Lines 1-2, Each keyword k is associated with the audio clip (keyword audio segment) having the highest DOk as described above. The slide(s) presented during the time period spanned by that audio clip can be associated with that keyword.)

generating the set of MPC keywords by ranking the plurality of keywords. (Col. 15, Lines 42-49, Optical character recognition (OCR) techniques can be applied to the captured slide images to extract textual content from the slides. TF-IDF analysis can be performed on the text and associated with “events” from the slide presentation.; Col. 12, Lines 34-36, The computed scores generated in step 604 can be ranked. … Computed keyword audio segments (those with the highest DOk values) are sorted according to the TF-IDF of the keywords.)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize OCR to obtain keywords. It would have been obvious to combine the references because OCR and TF-IDF allow the capture of events in a slide presentation, where the events encompass an outline of a presentation, when summarizing the meeting. (Erol, Col. 15, Lines 42-49)

Regarding claims 4, 11, and 17, Fanelli in view of Daredia and further in view of Erol teaches claims 3, 10, and 16. Fanelli in view of Daredia does not specifically teach: ranking the plurality of keywords using a Term Frequency - Inverse Document Frequency (TF-IDF) technique.

Erol, however, teaches:

ranking the plurality of keywords using a Term Frequency - Inverse Document Frequency (TF-IDF) technique. (Col. 15, Lines 42-49, Optical character recognition (OCR) techniques can be applied to the captured slide images to extract textual content from the slides. TF-IDF analysis can be performed on the text and associated with “events” from the slide presentation.; Col. 12, Lines 34-36, The computed scores generated in step 604 can be ranked. … Computed keyword audio segments (those with the highest DOk values) are sorted according to the TF-IDF of the keywords.)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize OCR to obtain keywords. It would have been obvious to combine the references because OCR and TF-IDF allow the capture of events in a slide presentation, where the events encompass an outline of a presentation, when summarizing the meeting. (Erol, Col. 15, Lines 42-49)
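
The OCR-plus-TF-IDF flow discussed for claims 3 and 4 is conventional enough to sketch in a few lines. The strings below are placeholders standing in for per-frame OCR output, and scikit-learn's TfidfVectorizer stands in for whatever TF-IDF variant the application actually describes:

```python
# Illustrative sketch of claim 4's TF-IDF keyword ranking over OCR text
# pulled from deduplicated video frames. OCR strings are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder OCR output, one string per unique (deduplicated) frame.
frame_texts = [
    "quarterly revenue forecast roadmap",
    "roadmap milestones launch timeline",
    "revenue growth launch metrics",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(frame_texts)

# Rank keywords by their maximum TF-IDF weight across frames.
scores = tfidf.max(axis=0).toarray().ravel()
ranked = sorted(zip(vectorizer.get_feature_names_out(), scores),
                key=lambda kv: -kv[1])
print(ranked[:5])  # top-ranked MPC keywords
```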
Regarding claims 5, 12, and 18, Fanelli in view of Daredia and further in view of Erol teaches claims 3, 10, and 16. Fanelli does not specifically teach: wherein the MPC insight is a set of MPC chapter names, the method further comprising determining the MPC insight by applying a chapterization model to the MPC keywords.

Daredia, however, teaches:

wherein the MPC insight is a set of MPC chapter names, the method further comprising determining the MPC insight by applying a chapterization model to the MPC keywords. (P0022, The generated meeting insights can include content based on the analyzed data associated with the meeting. For example, a meeting insight can include a meeting summary, highlights from the meeting, action items, etc.; P0024, The content management system also uses data from past meetings to train a machine-learning model to automatically tag or suggest insights for meetings.; P0109, The content management system can detect keywords, phrases, or other content in a meeting and then take an action to display insights on one or more client devices associated with the meeting.)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to determine insights by utilizing keywords. It would have been obvious to combine the references because keywords can determine relevant portions of a speech segment that are important in determining highlights of the meeting. (Daredia, P0105)

Claims 6, 7, 13, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Fanelli in view of Daredia, in view of Erol, and further in view of Manuvinakurike et al. (U.S. PG Pub No. 20230343331), hereinafter Manuvinakurike.

Regarding claims 6, 13, and 19, Fanelli in view of Daredia and further in view of Erol teaches claims 5, 12, and 18. Fanelli in view of Daredia and further in view of Erol does not specifically teach: wherein the chapterization model includes a Bidirectional and Auto-Regressive Transformers (BART) architecture.

Manuvinakurike, however, teaches:

wherein the chapterization model includes a Bidirectional and Auto-Regressive Transformers (BART) architecture. (P0055, In a meeting between users A, B, C, and D, the summary extractor summarizes the content of the meeting by putting emphasis on details classified as important. In this example, a BERT language base is leveraged. In other examples, a language model such as BART, GPT-4 or another large-language model may be used as the language base.)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize a BART architecture. It would have been obvious to combine the references because utilizing BART is applying a known technique to yield a predictable result of inputting speech segments and outputting speech highlights of conversations in natural language processing.
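
Claim 6's BART-based chapterization can be approximated with an off-the-shelf summarizer. A hedged sketch: facebook/bart-large-cnn is a public generic-summarization checkpoint used here as a stand-in, and the claimed chapterization model would presumably be fine-tuned to emit chapter names rather than prose summaries.

```python
# Illustrative use of a BART model on meeting text, not the applicant's
# actual chapterization model. The checkpoint and text are stand-ins.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

chapter_text = (
    "Speaker 1 walked through the quarterly revenue forecast and the "
    "updated launch timeline. Speaker 2 raised risks around staffing "
    "and asked for a revised roadmap before the next review."
)
# Short max_length nudges the output toward a chapter-title-like summary.
result = summarizer(chapter_text, max_length=20, min_length=5, do_sample=False)
print(result[0]["summary_text"])
```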
Regarding claims 7 and 20, Fanelli in view of Daredia and further in view of Erol teaches claims 6 and 19. Fanelli does not specifically teach: wherein the chapterization model is evaluated during training on embedding based distance metrics.

Daredia, however, teaches:

wherein the chapterization model is evaluated during training on embedding based distance metrics. (P0024, In particular, the content management system can use a training dataset including manually-labeled insight data corresponding to past meetings to train the machine-learning model. The content management system can then input audio data for a meeting into the trained machine-learning model, which outputs insights or suggestions by analyzing the audio data and other information associated with the later meeting.; P0070, As the machine-learning model generates highlights, action items, and summaries, the content management system generates a loss function for the machine-learning model. Specifically, the content management system uses labeled audio data and insights to create the loss function. More specifically, the content management system uses curated audio data with portions of the audio data marked as relevant, in addition to using manually generated insights (e.g., manually generated highlights, action items, and summaries). The content management system can compare the labeled audio data and insights to the outputs of the machine-learning model and then generate the loss function based on the difference.)

It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize training of a model based on distance. It would have been obvious to combine the references because training a model utilizing a loss function is applying a known technique to yield a predictable result of training a model and adjusting its parameters according to the produced output and training samples.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL WONSUK CHUNG, whose telephone number is (571) 272-1345. The examiner can normally be reached Monday - Friday, 7am-4pm PT.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, PIERRE-LOUIS DESIR, can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/DANIEL W CHUNG/
Examiner, Art Unit 2659

/PIERRE LOUIS DESIR/
Supervisory Patent Examiner, Art Unit 2659

Prosecution Timeline

Jun 20, 2023
Application Filed
May 02, 2025
Non-Final Rejection — §101, §103, §112
Aug 01, 2025
Response Filed
Oct 06, 2025
Final Rejection — §101, §103, §112
Jan 13, 2026
Request for Continued Examination
Jan 26, 2026
Response after Non-Final Action
Apr 02, 2026
Non-Final Rejection — §101, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12579471
DATA AUGMENTATION AND BATCH BALANCING METHODS TO ENHANCE NEGATION AND FAIRNESS
2y 5m to grant · Granted Mar 17, 2026
Patent 12493892
METHOD AND SYSTEM FOR EXTRACTING CONTEXTUAL PRODUCT FEATURE MODEL FROM REQUIREMENTS SPECIFICATION DOCUMENTS
2y 5m to grant · Granted Dec 09, 2025
Patent 12400078
INTERPRETABLE EMBEDDINGS
2y 5m to grant · Granted Aug 26, 2025
Patent 12387000
PRIVACY-PRESERVING AVATAR VOICE TRANSMISSION
2y 5m to grant · Granted Aug 12, 2025
Patent 12380875
SPEECH SYNTHESIS WITH FOREIGN FRAGMENTS
2y 5m to grant · Granted Aug 05, 2025
Study what changed to get past this examiner. Based on the 5 most recent grants.

Prosecution Projections

3-4
Expected OA Rounds
54%
Grant Probability
92%
With Interview (+37.5%)
2y 10m
Median Time to Grant
High
PTA Risk
Based on 44 resolved cases by this examiner. Grant probability derived from career allow rate.
