DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
In response to the Office Action mailed October 1, 2025, applicant filed an amendment on October 30, 2025, in which applicant traversed the rejections and requested reconsideration.
Response to Arguments
Applicant argues that the cited prior art fails to teach generating an audio waveform representing the audio data and outputting a visual representation corresponding to the audio waveform, wherein the visual representation of the audio waveform is displayed along a vertical axis, and generating a visual representation of the transcribed text along a horizontal axis perpendicular to the vertical axis, such that lines of the transcribed text, as output for display, are synchronized with and align with corresponding portions of the vertically displayed audio waveform. Applicant's arguments are persuasive, but are moot in view of the new grounds of rejection.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5, 6, 8-18, and 24-33 are rejected under 35 U.S.C. 103 as being unpatentable over Kahn et al. (PGPUB 2006/0149558), hereinafter referenced as Kahn, in view of Sorensen et al. (USPN 8,676,590), hereinafter referenced as Sorensen, and in further view of Hodge (PGPUB 2017/0270627).
Regarding claims 1, 17, and 33, Kahn discloses a computer-implemented method and system, hereinafter referenced as a method, comprising:
receiving audio data (computer software for speech audio input data; p. 0064, 0520);
transcribing the audio data into text, based on one or more languages being spoken within the audio data (transcribe the speaker dependent language into text; p. 0065, 0520-0521);
identifying a potential violation of a predetermined standard within at least the transcribed text, wherein the identified potential violation corresponds to a match of a potential violation of a predetermined policy, wherein the policy corresponds to at least one scenario and target population, and wherein the predetermined standard is based on the at least one scenario (recognition of text against a hypothetical standard (predetermined standard) in the transcription, corresponding to a recognition (match) model rule (a predetermined policy) based on a target foreign language/dialect context/condition (at least one scenario); p. 0132, 0646-0647, 0652, 0720). Kahn, however, fails to disclose generating an audio waveform representing the audio data and outputting a visual representation corresponding to the audio waveform, wherein the visual representation of the audio waveform is displayed along a vertical axis; generating a visual representation of the transcribed text along a horizontal axis perpendicular to the vertical axis, such that lines of the transcribed text, as output for display, are synchronized with and align with corresponding portions of the vertically displayed audio waveform; and a violation.
Sorensen discloses a method comprising:
generating an audio waveform representing the audio data and outputting a visual representation corresponding to the audio waveform, wherein the visual representation of the audio waveform is displayed along a vertical axis (figs. 2-3 with column 3, lines 17-64); and
generating a visual representation of the transcribed text along a horizontal axis perpendicular to the vertical axis, such that lines of the transcribed text, as output for display, are synchronized with and align with corresponding portions of the vertically displayed audio waveform (figs. 2-3 with column 3, lines 17-64), to assist with ensuring alignment and synchronization.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method of Kahn with the teachings of Sorensen, to facilitate concurrent transcription display and synchronization.
Hodge discloses a violation (rule violations; p. 0026, 0034), to protect sensitive information by using transcription to review communications.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method as described above with the teachings of Hodge, for the advantage of providing enhanced transcription review.
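By way of illustration only, the combined arrangement mapped to claim 1 above can be summarized in a minimal Python sketch. All names, patterns, and values below are hypothetical, and a text rendering stands in for the recited graphical display; nothing in the sketch is drawn from Kahn, Sorensen, or Hodge.

    import re
    from dataclasses import dataclass

    @dataclass
    class Segment:
        start: float   # segment start time (seconds)
        end: float     # segment end time (seconds)
        text: str      # transcribed line for this time span

    # Hypothetical policy: one scenario expressed as regex patterns (the predetermined standard).
    POLICY = {"scenario": "restricted-terms",
              "patterns": [r"\baccount number\b", r"\bpassword\b"]}

    def identify_potential_violations(segments):
        """Match each transcribed line against the policy patterns."""
        hits = []
        for seg in segments:
            for pat in POLICY["patterns"]:
                if re.search(pat, seg.text, re.IGNORECASE):
                    hits.append((seg, pat))
        return hits

    def render(segments, amplitudes):
        """Waveform drawn down a vertical axis, one row per time span;
        each transcribed line extends horizontally, aligned with its row."""
        for seg, amp in zip(segments, amplitudes):
            bar = "#" * max(1, int(amp * 10))
            print(f"{bar:<12}| {seg.start:5.1f}s  {seg.text}")

    segments = [Segment(0.0, 2.5, "Hello, thanks for calling."),
                Segment(2.5, 5.0, "Can you read me your password?")]
    render(segments, amplitudes=[0.4, 0.9])
    print(identify_potential_violations(segments))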
Regarding claims 2 and 18, these claims are interpreted and rejected for similar reasons as set forth above. In addition, Kahn discloses a method further comprising receiving text data from an electronic communication between persons, wherein identifying the potential violation of the predetermined standard further comprises identifying the potential violation based on both the audio data and the received text data from the electronic communication (identifying the potential violation of the standard based on conversations between speakers; p. 0060, 0616, 0720).
Regarding claim 3, Kahn discloses a method wherein transcribing the audio data into text is based on multiple languages being spoken within the audio data (transcribing based on multilingual speech in audio; p. 0574, 0647).
Regarding claim 5, Kahn discloses a method wherein the at least one scenario comprises a lexicon representing one or more terms or regular expressions (lexicon defines recognizable words; p. 0034-0036).
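For illustration only, a scenario lexicon of the kind recited in claim 5 might mix literal terms with regular expressions; the entries below are hypothetical and are not taken from Kahn.

    import re

    # Hypothetical lexicon: literal terms plus regular-expression entries.
    LEXICON = {
        "terms": ["wire transfer", "routing number"],
        "regexes": [r"\b\d{3}-\d{2}-\d{4}\b"],  # an SSN-shaped pattern
    }

    def lexicon_matches(text):
        """Return every lexicon entry found in the transcribed text."""
        found = [t for t in LEXICON["terms"] if t in text.lower()]
        found += [p for p in LEXICON["regexes"] if re.search(p, text)]
        return found

    print(lexicon_matches("Please confirm the wire transfer to 123-45-6789."))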
Regarding claim 6, Kahn discloses a method wherein the at least one scenario comprises at least one of features corresponding to a machine learning model, features corresponding to a lexicon, and natural language features (p. 0128-0133).
Regarding claims 8 and 24, Kahn discloses a method further comprising outputting, for display, a play head configured to move through the audio waveform and/or corresponding, aligned text during playback (playback; p. 0066-0068, 0096, 0256-0258).
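For illustration only, the recited play head can be sketched as a mapping from playback time to the aligned waveform row and text line; the times and lines below are hypothetical.

    from bisect import bisect_right

    # (start_time_seconds, text_line) pairs, sorted by start time.
    LINES = [(0.0, "Hello, thanks for calling."),
             (2.5, "Can you read me your password?"),
             (5.0, "I need it to verify the account.")]

    def playhead_row(t):
        """Index of the text line (and waveform row) under the play head at time t."""
        starts = [s for s, _ in LINES]
        return max(0, bisect_right(starts, t) - 1)

    for t in (0.0, 3.1, 6.0):  # the play head moving forward in playback
        i = playhead_row(t)
        print(f"t={t:4.1f}s -> row {i}: {LINES[i][1]}")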
Regarding claims 9 and 25, Kahn discloses a method wherein the play head moves vertically along the audio waveform as the transcribed text moves forwards or backwards in playback (forward or backwards; p. 0308).
Regarding claim 10, Kahn discloses a method wherein the movement of the play head is controllable by a user based on received user input (user input; p. 0223, 0271).
Regarding claims 11 and 27, Kahn discloses a method wherein as the audio data corresponding to the audio waveform moves forwards or backwards in playback, the location in time within the audio waveform is visually emphasized within the visual representation of the audio waveform (visualize waveform; p. 0274-0275).
Regarding claims 12 and 28, Kahn discloses a method wherein as the transcribed text corresponding to the audio data moves forwards or backwards in playback, the location in time corresponding to a corresponding segment of the transcribed text is visually emphasized (emphasizing text; p. 0413).
Regarding claims 13 and 29, Kahn discloses a method wherein the emphasized segment of the transcribed text is underlined and/or highlighted (color-coded/underlined; p. 0413).
Regarding claims 14 and 30, Kahn discloses a method wherein, as displayed during playback, the audio waveform and transcribed text scroll in synchronization with the playback (synchronize text; p. 0072-0079).
Regarding claims 15 and 31, Kahn discloses a method wherein the audio waveform and transcribed text are displayed within a webpage and the audio waveform and transcribed text scroll content of the webpage in synchronization with the playback (p. 0028, 0442, 0521, 0589).
Regarding claims 16, 26 and 32, Kahn discloses a method wherein the visual representation corresponding to the audio waveform and the visual representation of the transcribed text are output for display to a user in an interactive graphical user interface (display; p. 0081, 0216-0220, 0225-0227).
Claims 4 and 7 are rejected under 35 U.S.C. 103 as being unpatentable over Kahn in view of Sorensen and Hodge, and in further view of Beringer et al. (PGPUB 2019/0294718), hereinafter referenced as Beringer.
Regarding claim 4, Kahn in view of Sorensen and Hodge discloses a method as described above, but does not specifically teach wherein identifying the potential violation comprises implementing a machine learning model.
Beringer discloses a method wherein identifying the potential violation (p. 0511) comprises implementing a machine learning model (p. 0437-0447, 0471), to assist with recommending additional data sources.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method as described above with the teachings of Beringer, to assist with training the machine learning model.
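For illustration only, a machine learning model for identifying potential violations, as recited in claim 4, could take the form of the following hypothetical scikit-learn sketch; the cited references do not disclose this particular model or training data.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Tiny hypothetical training set: transcribed lines labeled 1 (violation) or 0 (benign).
    texts = ["read me your password", "what is the account number",
             "thanks for calling today", "have a nice afternoon"]
    labels = [1, 1, 0, 0]

    # TF-IDF text features feeding a logistic-regression classifier.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)

    # Probability that a new transcribed line is a potential violation.
    print(model.predict_proba(["please tell me the password"])[0][1])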
Regarding claim 7, it is interpreted and rejected for similar reasons as set forth above. In addition, Beringer discloses a method wherein the at least one scenario is formed by joining the machine learning features and the lexicon features using Boolean operators (p. 0277).
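For illustration only, the Boolean joining recited in claim 7 can be sketched as combining a machine learning score with lexicon matches using AND/OR; the threshold values and names below are hypothetical.

    def scenario_triggered(ml_score, lexicon_hits, threshold=0.5):
        """A scenario formed by joining machine-learning features and lexicon
        features with Boolean operators: flag when the classifier score meets
        the threshold AND the lexicon matched, OR when the score alone is high."""
        return (ml_score >= threshold and len(lexicon_hits) > 0) or ml_score >= 0.9

    print(scenario_triggered(ml_score=0.62, lexicon_hits=["password"]))  # True
    print(scenario_triggered(ml_score=0.30, lexicon_hits=[]))            # False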
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. This information is detailed in the attached PTO-892 (Notice of References Cited).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAKIEDA R JACKSON, whose telephone number is (571) 272-7619. The examiner can normally be reached Monday through Friday, 6:30 AM to 2:30 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Daniel Washburn, can be reached at (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JAKIEDA R JACKSON/ Primary Examiner, Art Unit 2657