Prosecution Insights
Last updated: April 19, 2026
Application No. 18/674,750

Systems, Methods, and User Interfaces for Communicating Data Uncertainty

Status: Non-Final OA (§103)
Filed: May 24, 2024
Examiner: LEE, JANGWOEN
Art Unit: 2656
Tech Center: 2600 (Communications)
Assignee: Salesforce Inc.
OA Round: 1 (Non-Final)
Grant Probability: 82% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 11m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 82% (36 granted / 44 resolved), +19.8% vs Tech Center average (above average)
Interview Lift: +24.2% higher allowance rate on resolved cases with an interview than without (strong)
Typical Timeline: 2y 11m average prosecution; 23 applications currently pending
Career History: 67 total applications across all art units
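
Illustration: the allow-rate and interview-lift figures above are simple ratios over the examiner's resolved cases. A minimal TypeScript sketch of how such numbers could be computed (the CaseRecord shape and function names are assumptions for illustration, not the analytics provider's actual schema):

```typescript
// Hypothetical record for one resolved application before this examiner.
interface CaseRecord {
  granted: boolean;      // resolved as a grant (vs. abandonment)
  hadInterview: boolean; // at least one examiner interview of record
}

// Fraction of resolved cases that ended in a grant, e.g. 36 / 44 ≈ 0.818 → "82%".
function allowRate(cases: CaseRecord[]): number {
  return cases.length === 0
    ? 0
    : cases.filter((c) => c.granted).length / cases.length;
}

// Difference in allow rate between cases with and without an interview,
// e.g. +0.242 → "+24.2% interview lift".
function interviewLift(cases: CaseRecord[]): number {
  const withIv = cases.filter((c) => c.hadInterview);
  const withoutIv = cases.filter((c) => !c.hadInterview);
  return allowRate(withIv) - allowRate(withoutIv);
}
```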

Statute-Specific Performance

§101: 26.5% (-13.5% vs TC avg)
§103: 54.6% (+14.6% vs TC avg)
§102: 11.0% (-29.0% vs TC avg)
§112: 4.1% (-35.9% vs TC avg)
Tech Center averages are estimates. Based on career data from 44 resolved cases.
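
The per-statute figures follow the same pattern, grouped by rejection type and compared against a Tech Center baseline. A sketch under assumptions (the dashboard does not state exactly what the percentage measures; the code simply treats it as a per-statute rate and takes the baseline values as given):

```typescript
type Statute = "§101" | "§102" | "§103" | "§112";

// Hypothetical outcome record for a single rejection raised by this examiner.
interface RejectionOutcome {
  statute: Statute;
  favorable: boolean; // placeholder for whatever the plotted metric counts
}

// Per-statute rate plus delta against a Tech Center average estimate.
function statutePerformance(
  outcomes: RejectionOutcome[],
  tcAverage: Record<Statute, number>,
): Record<Statute, { rate: number; deltaVsTc: number }> {
  const statutes: Statute[] = ["§101", "§102", "§103", "§112"];
  const result = {} as Record<Statute, { rate: number; deltaVsTc: number }>;
  for (const s of statutes) {
    const subset = outcomes.filter((o) => o.statute === s);
    const rate = subset.length
      ? subset.filter((o) => o.favorable).length / subset.length
      : 0;
    result[s] = { rate, deltaVsTc: rate - tcAverage[s] };
  }
  return result;
}
```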

Office Action

§103
DETAILED ACTION

This communication is in response to the Application filed on 05/24/2024. Claims 1-20 are pending and have been examined. Claims 1, 19 and 20 are independent. This Application was published as U.S. Pub. No. 2025/0094488.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

Applicant's claim for benefit of provisional application 63/538,497, submitted on 09/14/2023, is acknowledged.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-9, 11, 13-16 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Evans (US Pat No. 12,536,602, hereinafter Evans) in view of Ivers et al. (US Pat No. 12,431,112, hereinafter Ivers).
Regarding Claim 1, Evans discloses a method of communicating data uncertainty, comprising: at a computing device having a display, one or more processors, and memory storing one or more programs configured for execution by the one or more processors (Evans, Fig.3, "…viewing screen layout of a mobile computing device, partitioned into several areas…"; Fig.6, col.17, lls.7-64, "…a mobile device may comprise a display, a battery, a user interface, such as a keyboard, touchscreen, etc., and a memory…"); in response to a user query regarding a dataset that includes variability (Evans, Fig.6, col.14, lls.48-54, "…the search logic 6-11 may enable searching of all text Transcripts…The search logic 6-11 also may enable searching of corresponding options for alternative applications…"); obtaining a multimodal data representation of the dataset (Evans, col.15, lls.9-50, "…a system may use multimedia with a synchronized text transcription thereof…for viewing and manipulating the multimedia…"); displaying an interactive media playback element in a first region of a user interface of the computing device (Evans, Fig.1, col.11, ll.43 - col.13, ll.28, "…viewing screen layout of a mobile computing device…the display screen may be divided into several main areas: the multifunction area 1, the header area 8, the media area 9, the text area 10, and the scrub-bar area 11…"); in response to receiving a user input via the interactive media playback element, causing playback of the multimodal data representation on the user interface, including: presenting audio content describing data in the multimodal representation (Evans, Fig.1, col.11, ll.43 - col.13, ll.28, "…using a speaker, audio that corresponds to the displayed media, such as…audio of a movie being displayed, audio of a music video being displayed, audio of an electronic book for which an illustration…"; Fig.7, col.19, lls.27-52, "…a user may view a deposition video and a deposition transcript in a synchronized manner…"). Evans discloses the multimodal representation of the data using media (e.g., images or video) and corresponding text transcript (Evans, Fig.1, col.11, ll.43 - col.13, ll.28, Fig.7), and the time-synchronization of text and multimedia (Evans, col.16, lls.35-67, "…The synchronization index itself may, then, include the transcript and the timing values, or positional values, for the associated media (e.g., multimedia)…"), but does not explicitly disclose the presentation of audio content and visualization content in separate regions and modification of a playback portion of the visual content.

However, Ivers discloses while presenting the audio content describing the data in the multimodal representation, simultaneously presenting visual content via a visualization in a second region of the user interface that is different from the first region (Ivers, col.5, ll.15 - col.6, ll.43, "…system for optional platform independent visualization of audio content…"; Fig.13A-C, col.38, lls.34-66, "…platform-independent visualization of audio content system (1300)…In the lower part of the figure is a progress bar (1302) that represents an entire audio track (1304)…the text (1320) of the spoken content of the audio track (1304), generated by a voice recognition module such as voice recognition module (112) of FIG. 5…Based on the textual elements, e.g., (1331-1333), algorithms directed to generating and suggesting visual content as described above, will offer matching visual assets (1340)…"), wherein the visual content is time-synchronized with the audio content (Ivers, Fig.3, col.18, lls.40-62, "…FIG. 3 illustrates a digital audio track (10) divided into distinct topical audio segments (1, 2, 3, 4)…These visual assets (31-34) serve to enhance a corresponding audio segments (1-4) of conversation with a visual match or counterpart…each visual asset is displayed on a user device, and timed to coincide along with the audio discussion taking place…"; Claim 20, "…the user interface is configured to: (1) provide time-synchronized playback of the audio segment together with the corresponding transcript text…"); detecting a user interaction with the interactive media playback element (Ivers, col.6, lls.29-43, "…system for creating multimedia moments from audio data, wherein the user interface based on the multimedia moment optionally comprises a control usable to provide a user feedback to the server…(i) receive the user feedback…"); and in response to detecting the user interaction, modifying a playback portion of the visual content and the audio content that is time-synchronized with the visual content (Ivers, col.6, lls.29-43, "…(ii) update the training dataset based upon the user feedback…"; Fig.13C, col.38, lls.56-66, "…users, administrators, and automated processes/devices may select certain visual assets (1340) for pairing with the audio segment (1306, 1308)…AI Image Generator (1400) is utilized for the creation and/or assignment of visual assets, such as visual assets (128)…").

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified a computing system/method for synchronous playback of media (e.g., multimedia) and associated text of Evans with a system for optional platform-independent visualization of audio content of Ivers with a reasonable expectation of success to make digital audio readily searchable, indexable and shareable (Ivers, Abstract, col.6, ll.15 - col.6, ll.43).

Regarding Claim 2, Evans in view of Ivers discloses the method of claim 1, further comprising: Ivers further discloses while simultaneously presenting the audio content and the visual content, concurrently presenting text content in a third region of the user interface that is different from the first region and the second region, wherein the text content is time-synchronized with both the audio content and the visual content (Ivers, Fig.13A-C, col.38, lls.34-66, "…In the lower part of the figure is a progress bar (1302) that represents an entire audio track (1304)…the text (1320) of the spoken content of the audio track (1304)…matching visual assets (1340)…").

Regarding Claim 3, Evans in view of Ivers discloses the method of claim 2, wherein the visual content and the text content are time-synchronized with the audio content according to a timestamp of the audio content (Evans, Fig.39, col.8, lls.33-49, "…FIG. 39 illustrates an exemplary data structure for the synchronization index…The synchronization index assists in synchronizing multimedia content, such as audio and/or video clips, with transcripts of text…"; Ivers, Fig.3, col.18, lls.40-62, "…FIG. 3 illustrates a digital audio track (10) divided into distinct topical audio segments (1, 2, 3, 4)…each visual asset is displayed on a user device, and timed to coincide along with the audio discussion taking place…"; Claim 20, "…the user interface is configured to: (1) provide time-synchronized playback of the audio segment together with the corresponding transcript text…").

Regarding Claim 4, Evans in view of Ivers discloses the method of claim 2, wherein the text content is a text transcript of the audio content (Ivers, col.5, ll.34 - col.6, ll.28, "…a transcript database configured to store a plurality of transcript datasets, wherein each transcript dataset of the plurality of transcript datasets comprises text derived from corresponding audio data and is time indexed to the corresponding audio data…").

Regarding Claim 5, Evans in view of Ivers discloses the method of claim 4, wherein presenting the text content includes presenting the text transcript sentence-by-sentence in the third region of the user interface (Evans, Fig.7, col.19, lls.27-44, "…an example interface that may be used in handling errata sheet operations. As shown, in FIG. 7, a user may view a deposition video and a deposition transcript in a synchronized manner…").

Regarding Claim 6, Evans in view of Ivers discloses the method of claim 2, wherein: the text content includes hedge words; and presenting the text content in the third region of the user interface includes presenting the hedge words with a different visual characteristic than other text in the text content (Evans, Figs.6-8, col.15, ll.63 - col.17, ll.7, "…The video, display, and playlist logic 6-13 may control synchronous text/video using a synchronization index…The synchronization index also may include information relating to predetermined text display settings, for example defaults of text font, size, color formatting and so on selected to optimize an orderly display…"; Ivers, Fig.22B, col.43, lls.1-67, "…Generated (328) dynamic components may include, for example, selecting background colors, text colors, text font, and other stylistic options for how the shared moment will appear on the target platform(s), and may also include selection of generic images, icons, or other assets that are unavailable or unusable…").

Regarding Claim 7, Evans in view of Ivers discloses the method of claim 2, wherein the text content includes data values of a data field; and presenting the text content in the third region of the user interface includes presenting the data values with a different visual characteristic than other text in the text content (Evans, Figs.6-8, col.15, ll.63 - col.17, ll.7, "…The video, display, and playlist logic 6-13 may control synchronous text/video using a synchronization index…The synchronization index also may include information relating to predetermined text display settings, for example defaults of text font, size, color formatting and so on selected to optimize an orderly display…"; Ivers, Fig.22B, col.43, lls.1-67, "…Generated (328) dynamic components may include, for example, selecting background colors, text colors, text font, and other stylistic options for how the shared moment will appear on the target platform(s), and may also include selection of generic images, icons, or other assets that are unavailable or unusable…").
Regarding Claim 8, Evans in view of Ivers discloses the method of claim 7, further comprising: Ivers further discloses when the audio content corresponds to a respective data value of the data field, visually emphasizing the respective data value in the text content (Ivers, Fig.21E, col.41, lls.53-67, "…such text may be displayed with visual characteristics that identify the text associated with the moment (e.g., in FIG. 21E, the moment text is displayed as bolded and underlined)…").

Regarding Claim 9, Evans in view of Ivers discloses the method of claim 1. Ivers further discloses wherein simultaneously presenting the visual content while presenting the audio content includes presenting the visual content as an animated dot plot that is time-synchronized with the audio content (Ivers, col.23, lls.41-61, "…A visual asset (128) may be any form of visual information, such as an image or photograph (e.g., dot plot). The visual asset (128) paired with the indexed audio segment (126) is a cinemograph. These are generally published as an animated GIF or other video formation and give the illusion that the viewer is watching an animation…").

Regarding Claim 11, Evans in view of Ivers discloses the method of claim 2. Ivers further discloses wherein simultaneously presenting the visual content while presenting the audio content includes presenting the visual content as an animated density plot that is time-synchronized with the audio content (Ivers, col.23, lls.41-61, "…A visual asset (128) may be any form of visual information, such as an image or photograph (e.g., density plot). The visual asset (128) paired with the indexed audio segment (126) is a cinemograph. These are generally published as an animated GIF or other video formation and give the illusion that the viewer is watching an animation…").

Regarding Claim 13, Evans in view of Ivers discloses the method of claim 2. Evans further discloses wherein: the dataset includes data values of a first data field; and simultaneously presenting the visual content and the text content while presenting the audio content (Evans, Fig.7, col.18, ll.4 - col.19, ll.44, "…As shown, in FIG. 7, a user may view a deposition video and a deposition transcript in a synchronized manner, and manipulate an errata (i.e., dataset) display area to make a change to a deposition transcript to be added as a change in an errata sheet…") includes: when the audio content corresponds to a respective data value of the first data field, simultaneously visually emphasizing: one or more portions of the visual content corresponding to the respective data value; and a portion of the text content that matches the respective data value (Evans, Fig.7, "…Altered text displays differently…"; col.12, lls.9-18, "…text area 10 also may include a highlight bar that serves as a position indicator…by highlighting the current line of text being output as audio for the media displayed in the media area 9…").
Regarding Claim 14, Evans in view of Ivers discloses the method of claim 1, further comprising: Evans further discloses prior to causing playback of the multimodal data representation on the user interface, displaying, in the user interface, a plurality of affordances, each affordance of the plurality of affordances corresponding to a respective visualization type for visualizing the visual content (Evans, Fig.1, col.13, lls.13-20, "…The multifunction area 1 may include a media select icon 2…"); and the method further includes: in response to user selection of a first affordance of the plurality of affordances, corresponding to a first visualization type, presenting the visual content in the first visualization type (Evans, Fig.1, col.13, lls.13-20, "…The media select icon 2 may enable a user to select media to display in the media area 9 by, for example, causing display of a directory from which the user may select a desired media file or causing display of a list of available media files for user selection…").

Regarding Claim 15, Evans in view of Ivers discloses the method of claim 2, wherein: Evans further discloses the audio content describing the data includes a first data field and one or more hedge words; and concurrently presenting the text content while simultaneously presenting the audio content and the visual content includes displaying the text content so that one or more data values of the first data field have a first visual characteristic and the one or more hedge words have a second visual characteristic that is different from the first visual characteristic (Evans, Fig.7, "…Altered text displays differently…"; col.12, lls.9-18, "…text area 10 also may include a highlight bar that serves as a position indicator…by highlighting the current line of text being output as audio for the media displayed in the media area 9…"; Examiner interprets 'hedge words' as a dataset of a second data field, and the similar rationale could be applied as in claim 13 with 'errata display.').

Regarding Claim 16, Evans in view of Ivers discloses the method of claim 15, further comprising: Evans further discloses in response to receiving a user interaction with a first hedge word of the one or more hedge words (Evans, Fig.7, col.18, lls.18-35, "…A Graphical User Interface (GUI) allows a user to edit the text of any line of testimony…"), causing the first hedge word to be displayed with a third visual characteristic that is distinct from the first and second visual characteristics (Evans, Fig.7, col.18, ll.4 - col.19, ll.44, "…The system notes the original text of the line and compares it to the altered text…New or added text can, at the user's selection, be color coded to denote it as a change from the original text…Font characteristics may include strikethrough, font color, font size, font, background highlighting, and similar characteristics…").

Regarding Claim 18, Evans in view of Ivers discloses the method of claim 1, further comprising: Evans further discloses after playback of the multimodal data representation, displaying a replay icon on the user interface; and in response to receiving user selection of the replay icon, replaying the multimodal data representation on the user interface (Evans, Fig.1, col.13, lls.6-12, "…The scrub-bar area 11 includes a scrub bar that may be used to control display of the media displayed in the media area 9 and the text displayed in the text area 10…"; Fig.6, col.29, lls.46-51, "…The video, display, and playlist logic 6-13 may perform video clip sequencing, jump to location features, runtime calculation, display of font/appearance, display and control of a scrub bar, timestamp editing, header/exhibits formatting, and variable speed playback control…"; col.34, lls.1-20, "…navigational scrub bar…").

Claim 19 is a computer device claim with limitations similar to the limitations of Claim 1 and is rejected under similar rationale. Rationale for combination is similar to that provided for Claim 1.

Claim 20 is a non-transitory computer-readable storage medium claim with limitations similar to the limitations of Claim 1 and is rejected under similar rationale. Rationale for combination is similar to that provided for Claim 1.

Allowable Subject Matter

Claims 10, 12 and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Crowder (US Pub No. 2021/0389868) discloses a media development tool that facilitates the creation and insertion of audio-synchronized events within a dynamic user-influenced media experience (Crowder, paras [011-014]). Chen et al. (US Pub No. 2015/0296228) discloses systems and methods providing users with personalized video content feeds. A multi-modal segmentation process is utilized that relies upon cues derived from video, audio and/or text data present in a video data stream (Chen, Abstract).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JANGWOEN LEE whose telephone number is (703)756-5597. The examiner can normally be reached Monday-Friday 8:00 am - 5:00 pm ET. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, BHAVESH MEHTA, can be reached at (571)272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users.
To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JANGWOEN LEE/
Examiner, Art Unit 2656

/BHAVESH M MEHTA/
Supervisory Patent Examiner, Art Unit 2656
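
For orientation, the §103 combination centers on time-synchronized playback: Evans is cited for a synchronization index tying transcript text to media timestamps, and Ivers for pairing visual assets with timestamped audio segments. A minimal TypeScript sketch of that general concept (illustrative only; the SyncEntry type and lookup below are assumptions, not the structure claimed in the application or disclosed in either reference):

```typescript
// One entry of a hypothetical synchronization index: at startMs, this transcript
// sentence and this visual segment become the active content.
interface SyncEntry {
  startMs: number;   // position in the audio/video timeline
  text: string;      // transcript sentence for the text region
  visualId: string;  // visualization segment for the visual region
}

// Given an index sorted by startMs, return the entry active at currentMs
// (binary search for the last entry whose startMs <= currentMs).
function activeEntry(index: SyncEntry[], currentMs: number): SyncEntry | null {
  let lo = 0;
  let hi = index.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].startMs <= currentMs) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found >= 0 ? index[found] : null;
}

// Seeking (e.g. via a scrub bar) simply re-queries the index at the new time,
// so the audio, text, and visual regions stay aligned to the same timestamp.
```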

Prosecution Timeline

May 24, 2024
Application Filed
Feb 01, 2026
Non-Final Rejection — §103
Apr 13, 2026
Interview Requested

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597432
HUM NOISE DETECTION AND REMOVAL FOR SPEECH AND MUSIC RECORDINGS
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12586571
EFFICIENT SPEECH TO SPIKES CONVERSION PIPELINE FOR A SPIKING NEURAL NETWORK
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12573381
SPEECH RECOGNITION METHOD AND APPARATUS, STORAGE MEDIUM, AND ELECTRONIC DEVICE
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12567430
METHOD AND DEVICE FOR IMPROVING DIALOGUE INTELLIGIBILITY DURING PLAYBACK OF AUDIO DATA
Granted Mar 03, 2026 (2y 5m to grant)
Patent 12566930
CONDITIONING OF PRODUCTIVITY APPLICATION FILE CONTENT FOR INGESTION BY AN ARTIFICIAL INTELLIGENCE MODEL
Granted Mar 03, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 82%
With Interview: 99% (+24.2%)
Median Time to Grant: 2y 11m
PTA Risk: Low
Based on 44 resolved cases by this examiner. Grant probability derived from career allow rate.
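
A sketch of how these projections could be combined (assumption: the with-interview figure is the career allow rate plus the interview lift, capped at 99%; the dashboard only states that grant probability is derived from the career allow rate):

```typescript
// Illustrative projection, not the provider's actual model.
function projectedGrantProbability(
  careerAllowRate: number, // e.g. 36 / 44 ≈ 0.818
  interviewLift: number,   // e.g. 0.242
  planInterview: boolean,
): number {
  const p = careerAllowRate + (planInterview ? interviewLift : 0);
  return Math.min(p, 0.99); // cap: no projection above 99%
}

// projectedGrantProbability(36 / 44, 0.242, false) ≈ 0.82
// projectedGrantProbability(36 / 44, 0.242, true)  ≈ 0.99 (capped)
```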

Free tier: 3 strategy analyses per month