Prosecution Insights
Last updated: April 19, 2026
Application No. 18/654,019

SYSTEMS AND METHODS FOR GENERATING SYNTHETIC DATA, AND TRAINING AND TESTING CONVERSATIONAL ARTIFICIAL INTELLIGENCE PLATFORMS

Non-Final Office Action: §102, §103, §112
Filed: May 03, 2024
Examiner: CAUDLE, PENNY LOUISE
Art Unit: 2657
Tech Center: 2600 — Communications
Assignee: Sestek Ses Ve Iletisim Bilgisayar Teknolojileri San Ve Tic A S
OA Round: 1 (Non-Final)
Grant Probability: 67% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 3y 2m
Grant Probability with Interview: 82%

Examiner Intelligence

Career Allow Rate: 67% (46 granted / 69 resolved), +4.7% vs TC average (above average)
Interview Lift: +15.5% higher allowance among resolved cases with an interview
Typical Timeline: 3y 2m average prosecution; 19 applications currently pending
Career History: 88 total applications across all art units

Statute-Specific Performance

§101: 21.0% (-19.0% vs TC avg)
§103: 43.7% (+3.7% vs TC avg)
§102: 15.8% (-24.2% vs TC avg)
§112: 17.1% (-22.9% vs TC avg)
Tech Center averages are estimates. Based on career data from 69 resolved cases.
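The headline figures above can be sanity-checked directly from the underlying counts. A minimal sketch, assuming (as the page's footnote suggests) that the allow rate is the simple ratio of granted to resolved cases and that the TC average is implied by the stated delta:

```python
# Reproduce the examiner statistics shown above from the raw counts
# (46 granted / 69 resolved and the "+4.7% vs TC avg" delta are taken
# from this page; deriving them as simple ratios is an assumption).
granted, resolved = 46, 69

allow_rate = granted / resolved          # career allow rate
print(f"{allow_rate:.1%}")               # → 66.7% (displayed as 67%)

# Tech Center average implied by the +4.7% delta
tc_avg = allow_rate - 0.047
print(f"{tc_avg:.1%}")                   # → 62.0%
```

The per-statute percentages cannot be rebuilt the same way, since the page does not give per-statute case counts.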

Office Action

Rejections under §102, §103 and §112
DETAILED ACTION

This examination is in response to the communication filed on 05/03/2024. Claims 1-20 are currently pending, where claims 1, 8 and 17 are independent.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 18-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 18 recites “whether the synthetic data is first synthetic data and the method further comprises…” in line 1. The term “whether” makes it unclear if the additional limitation of generating second synthetic data is dependent on a determination of “whether” the synthetic data is first synthetic data, or if the term is a typographical error and should read “wherein the synthetic data is first synthetic data”. For purposes of examination, the “whether” term is interpreted as a typographical error which should read “wherein.” Claims 19 and 20 depend from claim 18 and therefore are rejected for the same reasons as claim 18.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C.
102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 6, 8-11 and 15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Golany et al., “Efficient Data Generation for Source-grounded Information-seeking Dialogs: A Use Case for Meeting Transcripts,” Google Research, arXiv:2405.01121v1 [cs.CL], May 2, 2024 (herein “Golany”).

Regarding claims 1 and 8, Golany teaches a method for training a neural network (Page 4, Section 5 teaches “This test set was created in order to assess whether training models on the MISeD data leads to better performance…”) and a system for generating synthetic data for training and/or testing of a neural network (Page 4, Section 5 teaches “This test set was created in order to assess whether training models on the MISeD data leads to better performance…”) comprising: at least one Large Language Model (LLM) module, a data storing unit, and a bot builder service module, implemented by at least one processor (page 1, Figure 1 “LLM” and page 2, 1st column teaches “In summary, our main contributions are as follows: (1) presenting an LLM-based data generation methodology for information-seeking dialogs; (2) creating the MISeD dataset” and Fig.
1, full generated dialog transcript requires a storage unit), configured to perform the method comprising: generating synthetic data by defining roles for a plurality of speakers (Page 1, Abstract teaches “Instead of the labor-intensive Wizard-of-Oz (WOZ) method, where two annotators generate a dialog from scratch, role-playing agent and user, we use LLM generation to simulate the two roles.” The agent and user roles are interpreted as a plurality of speakers), inputting the roles to at least one Large Language Model (LLM) implemented by at least one first processor (Page 1, 2nd column teaches “We utilize separate prompts to guide the LLM’s generation of both the user queries and the agent responses” See also, Fig. 1, “Query Prompt”), requesting the at least one LLM to generate a first statement based on the role of a first speaker of the plurality of speakers (Page 1, 2nd column teaches “We utilize separate prompts to guide the LLM’s generation of both the user queries and the agent responses” See also, Fig. 1, “Query Prompt”), instructing the at least one LLM to generate a second statement based on the role of a second speaker of the plurality of speakers that is responsive to the first statement (Page 1, 2nd column teaches “We utilize separate prompts to guide the LLM’s generation of both the user queries and the agent responses” See also, Fig. 1, “Response Prompt”), storing a dialog between the first speaker and the second speaker comprising the first and second statements (Fig. 1, transcript of full generated dialog), iterating (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. 
Iterating this automatic process yields a full dialog…”) the requesting, instructing and storing such that the first statement is responsive to the second statement of a preceding iteration of the requesting (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. Iterating this automatic process yields a full dialog…” ), the second statement is responsive to the first statement of a current iteration of the requesting instructing and storing (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. Iterating this automatic process yields a full dialog…” ), the storing comprises adding the first and second statements of a current iteration to the dialog such that the dialog comprises the first and second statements of each previous iteration of the requesting, instructing and storing ( Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. Iterating this automatic process yields a full dialog…”), and each instance of the requesting and instructing comprises providing the at least one LLM with the dialog of a preceding iteration of the storing (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. 
In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. Iterating this automatic process yields a full dialog…”), ceasing said iterating in response to a termination condition to obtain the stored dialog in a final iteration of the iterating, wherein the stored dialog in the final iteration is the synthetic data (Page 1, 2nd column Figure 1 description teaches “… Iterating this automatic process yields a full dialog, which is then validated by annotators, who further augment it with response attributions.” The iteration process necessarily requires a termination condition in order to trigger the validation process; therefore, the iteration process is inherently ceased upon detection of the termination condition); and training a neural network, implemented by at least one second processor, based on the synthetic data (Page 7, Section 7.2 teaches “We train the finetuned agent models using the MISeD training set…”).

Regarding claim 2, Golany teaches all of the elements of claim 1 (see detailed mapping above). In addition, Golany further teaches the training comprises performing a first learning by the neural network based on other data and performing a second learning by the neural network based on the synthetic data to refine the neural network (Page 1, Abstract teaches “Models finetuned with MISeD demonstrate superior performance…” and page 7, section 7.1 teaches “Finetuned Encoder-Decoder…We finetuned the open-source LongT5 XL…on the MISeD training set, using a context length of 16 thousand tokens”; by definition, finetuning is performed on a pre-trained model).

Regarding claim 3, Golany teaches all of the elements of claim 2 (see detailed mapping above).
In addition, Golany further teaches the other data is real data based on at least one real dialog (page 7, section 7.1 teaches “Finetuned Encoder-Decoder…We finetuned the open-source LongT5 XL…on the MISeD training set, using a context length of 16 thousand tokens”. As evidenced by Guo et al., “LongT5: Efficient Text-To-Text Transformer for Long Sequences,” Findings of the Association for Computational Linguistics: NAACL 2022, pages 724-736, July 10-15, 2022, the LongT5 XL was trained on the MediaSum dataset, which is a large-scale media interview (i.e., dialog) dataset).

Regarding claims 4 and 10, Golany teaches all of the elements of claims 1 and 9 (see detailed mapping above). In addition, Golany further teaches the synthetic data is text data (Page 3, 1st column description of Figure 2 teaches “The agent receives the source text (meeting transcript), dialog history…” Thus, the meeting transcript, which corresponds to the LLM-generated dialog transcript, is text).

Regarding claims 6 and 15, Golany teaches all of the elements of claims 1 and 8 (see detailed element mapping above). In addition, Golany further teaches at least one of the roles of the first speaker or the second speaker comprises characteristics of the first speaker or the second speaker (page 1, 2nd column teaches “We utilize separate prompts to guide the LLM’s generation of both the user queries and the agent responses”; the user or agent role is interpreted as a characteristic of the speaker, i.e., either user or agent).

Regarding claim 9, Golany teaches all of the elements of claim 8 (see detailed mapping above). In addition, Golany further teaches the at least one LLM module provides each instance of the first and second statement as text data (Page 1, Figure 1, query 1 and response 1, and Page 3, 1st column description of Figure 2 teaches “The agent receives the source text (meeting transcript), dialog history…” Thus, the meeting transcript, which corresponds to the LLM-generated dialog transcript, is text).
Regarding claim 11, Golany teaches all of the elements of claim 9 (see detailed mapping above). In addition, Golany further teaches the dialog is modeled for implementation on a dialog channel that is a text-based platform (Page 3, 1st column description of Figure 2 teaches “The agent receives the source text (meeting transcript), dialog history…” Thus, the meeting transcript, which corresponds to the LLM-generated dialog transcript, is text).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.
Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 5, 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Golany as applied to claim 1 above, and further in view of Lee et al., “Exploring The Viability of Synthetic Audio Data for Audio-Based Dialogue Tracking,” arXiv:2312.01841v1 [cs.SD], December 4, 2023 (herein “Lee”).

Regarding claim 5, Golany teaches all of the elements of claim 1 (see detailed mapping above). However, Golany fails to disclose that the synthetic data is audio data. Lee teaches a method and system for extending text-based dialogue state tracking (DST) to the audio domain. More specifically, Lee teaches that by leveraging text-to-speech (TTS) models it is possible to generate a diverse and customizable synthetic audio dialogue dataset that covers a wide range of scenarios and contexts (see Lee page 1, 2nd column, 2nd to last paragraph). More specifically, Lee teaches in Section 4.1 that “SynthWOZ is a comprehensive multi-turn synthetic audio dataset…utilizing the text transcripts required for audio generations, which were obtained from MultiWOZ 2.1…a widely acknowledged benchmark resource for text-based DST.” Accordingly, Lee teaches generating synthetic audio datasets for DST from text-based transcripts. Golany differs from the claimed invention, as defined in claim 5, in that Golany fails to disclose the synthetic data is in the audio domain. Generating synthetic audio dialogue data from text-based dialogue is known in the art as evidenced by Lee.
Therefore, it would have been obvious to one having ordinary skill in the art to have modified the synthetic dialog generation method/system of Golany to include utilizing text-to-speech (TTS) models to generate a diverse and customizable synthetic audio dialogue dataset that covers a wide range of scenarios and contexts (Lee, page 1, 2nd column).

Regarding claim 12, Golany teaches all of the elements of claim 9 (see detailed mapping above). However, Golany fails to disclose a Text-to-Speech (TTS) Service module, implemented by the at least one processor, wherein the TTS Service module is configured to convert the text data to audio data; and a voice cloning service module, implemented by the at least one processor, wherein the voice cloning service module is configured to clone at least one voice and convert the audio data into cloned audio data in the at least one voice such that the synthetic data is stored as the cloned audio data.

Lee teaches a Text-to-Speech (TTS) Service module, implemented by the at least one processor, wherein the TTS Service module is configured to convert the text data to audio data (page 1, 2nd column teaches “By leveraging text-to-speech (TTS) models…to generate a diverse and customizable synthetic audio dialogue dataset”); and a voice cloning service module, implemented by the at least one processor, wherein the voice cloning service module is configured to clone at least one voice and convert the audio data into cloned audio data in the at least one voice such that the synthetic data is stored as the cloned audio data (page 3, 1st column teaches “…we employed multiple speaker voices during the synthesis process by passing speaker labels to a fixed lookup table”). Golany differs from the claimed invention, as defined in claim 12, in that Golany fails to disclose the synthetic data is in the audio domain. Generating synthetic audio dialogue data from text-based dialogue is known in the art as evidenced by Lee.
Therefore, it would have been obvious to one having ordinary skill in the art to have modified the synthetic dialog generation method/system of Golany to include utilizing text-to-speech (TTS) models to generate a diverse and customizable synthetic audio dialogue dataset that covers a wide range of scenarios and contexts (Lee, page 1, 2nd column).

Regarding claim 13, the combination of Golany and Lee teaches all of the elements of claim 12 (see detailed mapping above). In addition, Lee further teaches the dialog is modeled for implementation on a dialog channel that is a voice-based platform (page 1, section 1 teaches “Audio-based DST offers several advantages in enhancing user experience…”). Golany differs from the claimed invention, as defined in claim 13, in that Golany fails to disclose the synthetic data is modeled for a voice-based platform. Generating synthetic audio dialogue data for voice-based platforms is known in the art as evidenced by Lee. Therefore, it would have been obvious to one having ordinary skill in the art to have modified the synthetic dialog generation method/system of Golany to include utilizing text-to-speech (TTS) models to generate a diverse and customizable synthetic audio dialogue dataset that covers a wide range of scenarios and contexts (Lee, page 1, 2nd column).

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over the combination of Golany and Lee as applied to claim 12 above, and further in view of Lee et al. (US 2024/0143916 A1, herein “Lee ‘916”).

Regarding claim 14, the combination of Golany and Lee teaches all of the elements of claim 12 (see detailed element mapping above). However, the combination fails to explicitly disclose the dialog is modeled for implementation on a dialog channel that is both a textual-based platform and a voice-based platform.
Lee ‘916 teaches an electronic apparatus that performs voice assistant functions using a language model, wherein the voice assistant provides a response to user interactions through text or voice. Thus, Lee ‘916 teaches an LLM system which includes both a textual-based and a voice-based platform (see ¶[0003] of Lee ‘916). The combination of Golany and Lee differs from the claimed invention, as defined in claim 14, in that the combination fails to disclose the synthetic data is modeled for a system that is both a textual-based platform and a voice-based platform. LLM-based assistants which are implemented on both textual-based and voice-based platforms are known in the art as evidenced by Lee ‘916. Therefore, it would have been obvious to one having ordinary skill in the art to have utilized the synthetic audio and textual dialog generated by the system taught by the combination of Golany and Lee to train an LLM provided in a textual and voice based platform as taught by Lee ‘916, as it merely constitutes the combination of known systems to achieve the predictable result of providing synthetic dialogue data for both text-based and audio-based systems.

Claims 7 and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Golany as applied to claim 1 above, and further in view of Wang et al., “NoteChat: A Dataset of Synthetic Patient-Physician Conversations Conditioned on Clinical Notes,” arXiv:2310.15959v1 [cs.CL], Oct 24, 2023 (herein “Wang”).

Regarding claims 7 and 16, Golany teaches all of the elements of claims 6 and 15 (see detailed mapping above). However, Golany fails to explicitly teach that the characteristics comprise at least one of: name, gender, age, address or occupation. Wang teaches a system and method for generating a synthetic dialog dataset using LLM chatbots that includes, inter alia, a Roleplay module where two ChatGPT agents take on the roles of doctor and patient, respectively. See Wang Figure 1 and page 2, 2nd column.
Thus, Wang teaches at least one of the role characteristics includes an occupation, i.e., doctor or patient. Golany differs from the claimed invention, as defined by claims 7 and 16, in that Golany fails to disclose that the role characteristics for the LLM include a speaker’s occupation. Using occupation roles to generate synthetic dialog data is known as evidenced by Wang. Therefore, it would have been obvious to one having ordinary skill in the art, before the effective filing date of the invention, to have modified the system of Golany to include the occupation roles, such as doctor, for the LLMs to roleplay as taught by Wang, as it merely constitutes the combination of known processes to achieve the predictable result of generating synthetic medical datasets without violating privacy regulations associated with real doctor-patient dialog data.

Regarding claim 17, Golany teaches a method for refining a conversation analytics platform (under a BRI, a conversation analytics platform is interpreted as any platform which processes conversational data to generate an output or response; page 3, section 3 teaches “our goal is to generate datasets for agent models in source-grounded information-seeking dialogs”; the source-grounded information-seeking dialog system is interpreted as a conversation analytics platform inasmuch as it is designed to process/analyze conversational data to provide a response/output) comprising: generating synthetic data by defining roles for a plurality of speakers (Page 1, Abstract teaches “Instead of the labor-intensive Wizard-of-Oz (WOZ) method, where two annotators generate a dialog from scratch, role-playing agent and user, we use LLM generation to simulate the two roles.” The agent and user roles are interpreted as a plurality of speakers), inputting the roles to at least one Large Language Model (LLM), implemented by at least one first processor (Page 1, 2nd column teaches “We utilize separate prompts to guide the LLM’s generation of both
the user queries and the agent responses” See also, Fig. 1, “Query Prompt”), requesting the at least one LLM to generate a first statement based on the role of a first speaker of the plurality of speakers (Page 1, 2nd column teaches “We utilize separate prompts to guide the LLM’s generation of both the user queries and the agent responses” See also, Fig. 1, “Query Prompt”), instructing the at least one LLM to generate a second statement based on the role of a second speaker of the plurality of speakers that is responsive to the first statement (Page 1, 2nd column teaches “We utilize separate prompts to guide the LLM’s generation of both the user queries and the agent responses” See also, Fig. 1, “Response Prompt”), storing a dialog between the first speaker and the second speaker comprising the first and second statements (Fig. 1, transcript of full generated dialog), iterating (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. Iterating this automatic process yields a full dialog…”) the requesting, instructing and storing such that the first statement is responsive to the second statement of a preceding iteration of the requesting (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. 
Iterating this automatic process yields a full dialog…”), the second statement is responsive to the first statement of a current iteration of the requesting instructing and storing (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. Iterating this automatic process yields a full dialog…”), the storing comprises adding the first and second statements of a current iteration to the dialog such that the dialog comprises the first and second statements of each previous iteration of the requesting, instructing and storing (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. Iterating this automatic process yields a full dialog…”), and each instance of the requesting and instructing comprises providing the at least one LLM with the dialog of a preceding iteration of the storing, ceasing said iterating in response to a termination condition to obtain the stored dialog in a final iteration of the iterating, wherein the stored dialog in the final iteration is the synthetic data (Page 1, 2nd column Figure 1 description teaches “Iterative dialog generation flow. In each turn, a query prompt guides the LLM to generate a user query given the transcript, the accumulated dialog history and a query template. Then, a response prompt, accompanied by the full context so far, generates the agent response. 
Iterating this automatic process yields a full dialog…”); inputting the synthetic data to the conversation analytics platform, which is implemented by at least one second processor (page 7, section 7.2 teaches “We train the finetuned agent models using the MISeD training set…”; as discussed above and noted in Section 3 of Golany, the agent model is part of the source-grounded information-seeking dialog system, i.e., the conversation analytics platform); receiving feature results characterizing the synthetic data from the conversation analytics platform (page 7, section 7.2 teaches “We train the finetuned agent models using the MISeD training set…” training the finetuned agent models); comparing the feature results to initial parameters (under a broadest reasonable interpretation, this limitation is interpreted as an automated validation of the conversation analytics platform based on the synthetic training dataset; page 6, section 6 teaches “We evaluate the agent models along two dimensions; the quality of the generated responses (§6.1), and the accuracy of the provided attributions (§6.2), through both automatic and human evaluations” See also Sections 6.1.2 and 6.2.2); refining the at least one model portion of the conversation analytics platform in response to determining that the at least one model portion of the conversation analytics platform is deficient (page 7, section 7.1 teaches “we finetuned the open-source LongT5 XL (3 billion parameters) on the MISeD training set…we finetuned the Gemini Pro Model on the MISeD training set…”).

Golany fails to explicitly disclose that the comparison parameters utilized for the validation of the conversation analytics platform include the roles for the plurality of speakers. Wang teaches a system and method for generating a synthetic dialog dataset using LLM chatbots that includes, inter alia, a Roleplay module where two ChatGPT agents take on the roles of doctor and patient, respectively.
See Wang Figure 1 and page 2, 2nd column. In addition, Wang further teaches that the extrinsic evaluation parameters/training labels include the role, i.e., physician/doctor, of the speakers (see section 3.2 of Wang). Golany differs from the claimed invention, as defined by claim 17, in that Golany fails to explicitly disclose that the validation features include the roles for the plurality of speakers. Utilizing speaker role as a training label/validation feature is known as evidenced by Wang. Therefore, it would have been obvious to one having ordinary skill in the art, before the effective filing date of the invention, to have modified the system of Golany to include validation of the speaker role features as taught by Wang, as it merely constitutes the combination of known processes to achieve the predictable result of generating synthetic medical datasets without violating privacy regulations associated with real doctor-patient dialog data.

Regarding claim 18, the combination of Golany and Wang teaches all of the elements of claim 17 (see detailed element mapping above).
In addition, Golany further teaches [wherein] the synthetic data is first synthetic data (page 3, section 4 teaches “Its first stage automatically generates a dialog using LLMs (§4.1), simulating the typical WOZ process…”) and the method further comprises: generating second synthetic data (page 4, section 4.2 teaches “following the automatic generation of the dialog, we present it to trained annotators who assess the generated query and response, and identify corresponding attributions within the source text”; the addition of the attributions is interpreted as second synthetic data), wherein the refining comprises refining the at least one model portion of the conversation analytics platform with the second synthetic data (page 7, section 7.1 teaches “we finetuned the open-source LongT5 XL (3 billion parameters) on the MISeD training set…we finetuned the Gemini Pro Model on the MISeD training set…”).

Regarding claim 19, the combination of Golany and Wang teaches all of the elements of claim 18 (see detailed element mapping above). In addition, Golany further teaches the second synthetic data is provided in a model training dataset and wherein the refining comprises training the at least one model portion with the model training dataset (page 3, Figure 3 teaches the dialog history includes the Attribution information; and page 7, section 7.1 teaches “we finetuned the open-source LongT5 XL (3 billion parameters) on the MISeD training set…we finetuned the Gemini Pro Model on the MISeD training set…”).

Regarding claim 20, the combination of Golany and Wang teaches all of the elements of claim 19 (see detailed element mapping above). In addition, Golany further teaches the model training dataset comprises the first synthetic data (page 3, Figure 3 teaches that the Dialog history includes the User and Agent responses, i.e., the first synthetic data).
Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PENNY L CAUDLE whose telephone number is (703) 756-1432. The examiner can normally be reached M-Th 8:00 am to 5:00 pm eastern.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at 571-272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PENNY L CAUDLE/
Examiner, Art Unit 2657

/DANIEL C WASHBURN/
Supervisory Patent Examiner, Art Unit 2657
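The §102 rejection leans entirely on the iterative generation flow quoted repeatedly from Golany's Figure 1 caption: per turn, a query prompt produces a user statement from the source transcript plus accumulated dialog history, a response prompt then produces the agent statement with the full context, both are appended to the stored dialog, and the loop repeats until a termination condition. A minimal illustrative sketch of that flow, not Golany's actual code: the `generate` callable stands in for any LLM call, and the fixed turn limit is an assumed termination condition (Golany does not specify one, which is why the examiner argues inherency).

```python
# Illustrative sketch of the iterative dialog-generation flow described
# in the §102 rejection. `generate` is a placeholder for any LLM call;
# the max-turn termination condition is assumed for illustration.
from typing import Callable, List, Tuple

def generate_dialog(
    transcript: str,
    generate: Callable[[str], str],  # placeholder LLM call: prompt -> text
    max_turns: int = 5,              # assumed termination condition
) -> List[Tuple[str, str]]:
    dialog: List[Tuple[str, str]] = []
    for _ in range(max_turns):
        # Each prompt carries the dialog of all preceding iterations.
        history = "\n".join(f"User: {q}\nAgent: {r}" for q, r in dialog)
        # First statement: user query, conditioned on transcript + history.
        query = generate(
            f"Transcript:\n{transcript}\n\nHistory:\n{history}\n\n"
            "Write the next user query:"
        )
        # Second statement: agent response, conditioned on the full context.
        response = generate(
            f"Transcript:\n{transcript}\n\nHistory:\n{history}\n"
            f"User: {query}\n\nWrite the agent response:"
        )
        # Storing: append both statements so later turns see all prior turns.
        dialog.append((query, response))
    return dialog  # the stored dialog from the final iteration
```

A stub `generate` (for example, one returning canned strings) is enough to exercise the loop; in Golany the resulting dialogs are then validated by annotators and used as finetuning data.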

Prosecution Timeline

May 03, 2024
Application Filed
Feb 26, 2026
Non-Final Rejection — §102, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592243
METHOD AND ELECTRONIC DEVICE FOR PERSONALIZED AUDIO ENHANCEMENT
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12573371
VOCABULARY SELECTION FOR TEXT PROCESSING TASKS USING POWER INDICES
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12566924
Apparatus for Evaluating and Improving Response, Method and Computer Readable Recording Medium Thereof
Granted Mar 03, 2026 (2y 5m to grant)
Patent 12567433
AUTOMATED EVALUATION OF SYNTHESIZED SPEECH USING CROSS-MODAL AND CROSS-LINGUAL TRANSFER OF LANGUAGE ENCODING
Granted Mar 03, 2026 (2y 5m to grant)
Patent 12554937
FEW SHOT INCREMENTAL LEARNING FOR NAMED ENTITY RECOGNITION
Granted Feb 17, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 67%
With Interview: 82% (+15.5%)
Median Time to Grant: 3y 2m
PTA Risk: Low
Based on 69 resolved cases by this examiner. Grant probability derived from career allow rate.
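The projection arithmetic is simple to verify from the figures above, assuming (as the footnote states for the base rate, and as the "+15.5%" label suggests for the interview figure) that the with-interview probability is the career allow rate plus the additive interview lift:

```python
# Check the projection arithmetic shown above. The additive combination
# of base rate and interview lift is an assumption about the tool's model;
# the raw figures (46/69, +15.5%) come from this page.
base = 46 / 69 * 100   # career allow rate as grant probability, ~66.7%
lift = 15.5            # interview lift from the examiner statistics

print(round(base))         # → 67, the displayed grant probability
print(round(base + lift))  # → 82, the displayed with-interview figure
```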
