Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Specification
The disclosure is objected to because of the following informalities: Para. 4, line 6 recites “refers to the scanning of glass slides are scanned to produce digital images”, which should read “refers to the scanning of glass slides to produce digital images”.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1, 3-14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Way et al. (NPL, “Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders”, published 2018, pdf attached) in view of Ho et al. (NPL, “Cascaded Diffusion Models for High Fidelity Image Generation”, published 2021, pdf attached) further in view of Schmauch et al. (NPL, “A deep learning model to predict RNA-Seq expression of tumours from whole slide images”, published 2020, pdf attached).
Regarding claim 1, Way teaches a method comprising: obtaining a plurality of RNA-Seq records (Pg. 81, “In the following report, we extend the autoencoder framework by training and evaluating a VAE on TCGA RNA-seq data.”); translating each record in the plurality of RNA-Seq records into a latent space using an encoder component of a variational autoencoder (Pg. 81, “We aim to demonstrate the validity and specific latent space benefits of a VAE trained on gene expression data… We shall name this model ‘Tybalt’”; Pg. 85, “Tybalt compressed tumors into a lower dimensional space, acting as a nonlinear dimensionality reduction algorithm.”); obtaining a given RNA-Seq record (Pg. 81, “In the following report, we extend the autoencoder framework by training and evaluating a VAE on TCGA RNA-seq data.”); translating the given RNA-Seq record into the latent space using the encoder component of the variational autoencoder (Pg. 81, “We aim to demonstrate the validity and specific latent space benefits of a VAE trained on gene expression data… We shall name this model ‘Tybalt’”; Pg. 85, “Tybalt compressed tumors into a lower dimensional space, acting as a nonlinear dimensionality reduction algorithm.”).
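For illustration only, the latent-space translation that Way's Tybalt model performs can be sketched as a deterministic encoder forward pass. This is not Tybalt's actual implementation: the gene count and latent dimensionality below are assumptions chosen to roughly match the paper's setup, and the randomly initialized weights stand in for a trained encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions for illustration: several thousand genes compressed
# into a 100-dimensional latent space.
N_GENES, N_LATENT = 5000, 100

# Randomly initialized encoder weights stand in for a trained encoder.
W_mu = rng.normal(scale=0.01, size=(N_GENES, N_LATENT))
b_mu = np.zeros(N_LATENT)

def encode(rna_seq: np.ndarray) -> np.ndarray:
    """Translate an RNA-Seq record (or batch of records) into the latent
    space. Returns the mean of the approximate posterior q(z|x), the
    deterministic latent representation typically used downstream."""
    return rna_seq @ W_mu + b_mu

# A batch of normalized RNA-Seq records, plus a single "given" record.
records = rng.random((8, N_GENES))
latent_batch = encode(records)       # shape (8, 100)
latent_query = encode(records[0])    # shape (100,)
print(latent_batch.shape, latent_query.shape)
```

The same `encode` function serves both claimed translation steps: the plurality of records used for training and the given record used at generation time.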
Way does not explicitly disclose obtaining a plurality of histological slide images, where each histological slide image is associated with one of the RNA-Seq records; training a first diffusion model to produce a first synthetic histological slide image at a lower resolution using the translated plurality of RNA-Seq records and the associated histological slides; training a second diffusion model to upscale lower resolution synthetic histological slide images to higher resolution synthetic histological slide images using lower resolution images produced by the first diffusion model and the associated histological slide images; providing the latent representation of the given RNA-Seq record to the trained first diffusion model; generating a given lower resolution synthetic histological slide image using the trained first diffusion model; providing the given lower resolution synthetic histological slide image to the trained second diffusion model; and generating a given higher resolution synthetic histological slide image using the trained second diffusion model.
Ho teaches a method of generating synthetic images, comprising: training a first diffusion model to produce a first synthetic image at a lower resolution using an external input signal for conditioning (Pgs. 4-5, “In the conditional generation setting, the data x0 has an associated conditioning signal c, for example a label in the case of class-conditional generation, or a low resolution image in the case of super-resolution. The goal is then to learn a conditional model pθ(x0|c)”); training a second diffusion model to upscale lower resolution synthetic images to higher resolution synthetic images using lower resolution images produced by the first diffusion model (Fig. 4, shows the cascading architecture where the lower resolution image is fed into another diffusion model for upscaling); providing the conditional signal of the given input to the trained first diffusion model (Fig. 4); generating a given lower resolution synthetic image using the trained first diffusion model (Fig. 4); providing the given lower resolution synthetic image to the trained second diffusion model (Fig. 4); and generating a given higher resolution synthetic image using the trained second diffusion model (Fig. 5).
Ho does not explicitly teach obtaining a plurality of RNA-Seq records associated with histological slides, translating them into a latent space using an encoder, and then using the latent space as input to the diffusion models. However, Ho does teach that it is straightforward to condition an entire cascading pipeline on other conditioning information (Pg. 5).
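Purely for illustration, the cascading pipeline Ho describes can be sketched as two stages sharing one external conditioning vector. Both model functions below are hypothetical stand-ins (a real implementation would run a learned reverse diffusion process conditioned on the signal); the 64→256 resolutions are one of the configurations Ho discloses.

```python
import numpy as np

rng = np.random.default_rng(1)

def base_model(z_cond: np.ndarray, size: int = 64) -> np.ndarray:
    """Stand-in for the base diffusion model: maps a conditioning vector
    to a low-resolution sample. A real model would denoise from Gaussian
    noise conditioned on z_cond."""
    return rng.random((size, size, 3))

def super_res_model(low_res: np.ndarray, z_cond: np.ndarray,
                    scale: int = 4) -> np.ndarray:
    """Stand-in for the super-resolution diffusion model: upscales the
    low-resolution image, again conditioned on z_cond (nearest-neighbor
    upsampling stands in for learned refinement)."""
    return np.repeat(np.repeat(low_res, scale, axis=0), scale, axis=1)

z = rng.normal(size=100)          # e.g. a VAE latent vector as condition
low = base_model(z)               # 64x64x3 sample from the base model
high = super_res_model(low, z)    # 256x256x3 upscaled sample
print(low.shape, high.shape)
```

The point of the sketch is structural: the same conditioning vector `z` is provided to every stage of the cascade, which is what makes substituting an RNA-Seq latent representation for the conditioning signal a drop-in change.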
Schmauch teaches obtaining a plurality of histological slide images, where each histological slide image is associated with one of the RNA-Seq records (Pg. 2, Col. 1, “We used matched WSIs and RNA-Seq profiles from TCGA data, including 8725 patients and 28 different cancer types, to develop HE2RNA, a deep-learning model based on a multitask weakly supervised approach. The model was trained to predict normalized gene expression data from WSIs.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Way to incorporate the teachings of Ho and Schmauch to include obtaining a plurality of histological slide images, where each histological slide image is associated with one of the RNA-Seq records; training a first diffusion model on the translated RNA-Seq records and the associated histological slides to produce a first synthetic image at a lower resolution; training a second diffusion model to upscale the lower resolution image; and using both trained models to generate a higher resolution synthetic histological slide image. Schmauch establishes that RNA-Seq records and histological whole-slide images derived from the same patient are meaningfully correlated, and that deep learning models can make use of this correspondence. Way teaches that a variational autoencoder is well-suited for reducing the high dimensionality of pan-cancer RNA-Seq data into a compact and meaningful latent representation, demonstrating that this latent representation preserves enough information for downstream predictive tasks involving cancer. Ho teaches that cascaded diffusion models represent the state of the art in conditioned image synthesis, and explicitly discloses conditional image generation with an external input. One of ordinary skill in the art would have had clear motivation to substitute the VAE-encoded RNA-Seq latent representation taught by Way in place of the conditioning signal in the cascaded diffusion system of Ho, applying it to the associated RNA-Seq record and histological slide data described by Schmauch, with a reasonable expectation of success. The cascaded diffusion models produce higher quality and more robust images than GAN-based approaches, as disclosed by Ho.
Schmauch further teaches that the relationship between RNA-Seq records and histological slide data is important when attempting to make a prediction on gene expression and that the size of a dataset impacts the accuracy of predictions (Pg. 2, Col. 2), providing a skilled artisan with a reason to utilize a meaningful conditioning signal derived from associated RNA-Seq data to guide diffusion-based image synthesis toward realistic histological outputs.
Regarding claim 3, Way as modified above teaches all of the elements of claim 1, as stated above, as well as training the first and second diffusion models on a second plurality of RNA-Seq records, where each of the RNA-Seq records in the second plurality of RNA-Seq records is associated with one image of a second plurality of histological slide images, and where the second plurality of RNA-Seq records is associated with a specific cancer classification (Schmauch teaches to associate RNA-Seq records with histological slides. Doing this with a second set of RNA-Seq records that are associated with a specific cancer classification is a straightforward use of the same method).
Regarding claim 4, Way as modified above teaches all of the elements of claim 1, as stated above, as well as wherein the first and second diffusion models comprise a UNet architecture (Ho; Fig. 3).
Regarding claim 5, Way as modified above teaches all of the elements of claim 1, as stated above, as well as wherein the lower resolution is 64x64 pixels (Ho; Pg. 13, “Our cascading pipelines are structured as a 32×32 base model, a 32×32→64×64 super resolution model, followed by 64×64→128×128”).
Regarding claim 6, Way as modified above teaches all of the elements of claim 1, as stated above, as well as wherein the higher resolution is 256x256 pixels (Ho; Pg. 13, “64×64→256×256 super-resolution models”, multiple lower resolution and higher resolution values are disclosed).
Regarding claim 7, Way as modified above teaches all of the elements of claim 1, as stated above, as well as wherein the given higher resolution synthetic histological slide image is a tile of a larger synthetic histological slide image (Schmauch; Pg. 2, Col. 1-2, “WSIs were partitioned into “tiles” (squares of 112 × 112 μm) and aggregated into clusters, called supertiles.”, partitioning WSIs into tiles is well-known in the art).
Regarding claim 8, Way as modified above teaches all of the elements of claim 1, as stated above, as well as generating a plurality of higher resolution synthetic histological slide images; and combining the plurality of higher resolution synthetic histological slide images to form the larger synthetic histological slide image (Schmauch; Pg. 2, Col. 1-2, “WSIs were partitioned into “tiles” (squares of 112 × 112 μm) and aggregated into clusters, called supertiles.”).
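The tile-then-combine step recited in claims 7 and 8 can be illustrated with a minimal stitching sketch. The tile size and grid layout below are assumptions for illustration only, not values taken from Schmauch.

```python
import numpy as np

def stitch_tiles(tiles: list, rows: int, cols: int) -> np.ndarray:
    """Combine a row-major list of equally sized image tiles into one
    larger image, mirroring the claimed tile-then-combine step."""
    strips = [np.concatenate(tiles[r * cols:(r + 1) * cols], axis=1)
              for r in range(rows)]
    return np.concatenate(strips, axis=0)

# Four 256x256 RGB "synthetic tiles" (each filled with its own index
# value so the layout is visible) combined into one 512x512 slide image.
tiles = [np.full((256, 256, 3), i, dtype=np.uint8) for i in range(4)]
slide = stitch_tiles(tiles, rows=2, cols=2)
print(slide.shape)
```

Each generated higher resolution image plays the role of one tile; the stitched array plays the role of the larger synthetic histological slide image.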
Regarding claim 9, Way as modified above teaches all of the elements of claim 1, as stated above, as well as wherein the synthetic histological slide image depicts a plurality of human tissue types (Schmauch; Fig. 1).
Regarding claim 10, Way as modified above teaches all of the elements of claim 1, as stated above, as well as training the encoder component of the variational autoencoder using a decoder component of the variational autoencoder, wherein the encoder component and the decoder component are trained together to minimize reconstruction error at the output of the decoder component (Pg. 81-82, “The VAE is based on an autoencoding framework… A traditional autoencoder consists of an encoding phase and a decoding phase where input data is projected into lower dimensions and then reconstructed”).
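The joint encoder/decoder training described in the cited passage can be illustrated with a minimal (non-variational) linear autoencoder trained to minimize reconstruction error at the decoder output. All sizes, the learning rate, and the step count are assumptions for illustration, not values from Way.

```python
import numpy as np

rng = np.random.default_rng(2)
D, H = 20, 4  # assumed input and latent sizes for illustration

W_enc = rng.normal(scale=0.1, size=(D, H))
W_dec = rng.normal(scale=0.1, size=(H, D))

def train_step(x: np.ndarray, lr: float = 1e-3) -> float:
    """One joint gradient step on squared reconstruction error; the
    encoder and decoder weights are updated together, as in the cited
    training scheme."""
    global W_enc, W_dec
    z = x @ W_enc              # encode: project into lower dimensions
    x_hat = z @ W_dec          # decode: reconstruct the input
    err = x_hat - x            # reconstruction error
    # Gradients of 0.5 * ||err||^2 with respect to both weight matrices.
    g_dec = z.T @ err
    g_enc = x.T @ (err @ W_dec.T)
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
    return float(np.mean(err ** 2))

x = rng.random((16, D))
losses = [train_step(x) for _ in range(300)]
print(losses[0], losses[-1])
```

Because both weight matrices receive gradients from the same reconstruction loss, the encoder is shaped by how well the decoder can invert it, which is the sense in which the two components are "trained together."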
Regarding claim 11, the recited system performs substantially the same function as that of claim 1. It is rejected under the same analysis.
Regarding claim 12, the recited elements perform substantially the same function as those of claim 10. It is rejected under the same analysis.
Regarding claim 13, the recited elements perform substantially the same function as those of claim 1. It is rejected under the same analysis.
Regarding claim 14, the recited elements perform substantially the same function as those of claim 3. It is rejected under the same analysis.
Regarding claim 16, the recited elements perform substantially the same function as those of claim 4. It is rejected under the same analysis.
Regarding claim 17, the recited elements perform substantially the same function as those of claim 5. It is rejected under the same analysis.
Regarding claim 18, the recited elements perform substantially the same function as those of claim 6. It is rejected under the same analysis.
Regarding claim 19, the recited elements perform substantially the same function as those of claim 7. It is rejected under the same analysis.
Regarding claim 20, the recited elements perform substantially the same function as those of claim 8. It is rejected under the same analysis.
Claim(s) 2 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Way et al. as modified in view of Ho et al. and Schmauch et al., further in view of Higgins et al. (NPL, “β-VAE: LEARNING BASIC VISUAL CONCEPTS WITH A CONSTRAINED VARIATIONAL FRAMEWORK”, published 2017, pdf attached).
Regarding claim 2, Way as modified above in view of Ho and Schmauch teaches all of the elements of claim 1, as stated above. The combination does not explicitly disclose wherein the variational autoencoder is a β-VAE encoder model.
Higgins teaches wherein the variational autoencoder is a β-VAE encoder model (Pg. 1, “We introduce β-VAE, a new state-of-the-art framework for automated discovery of interpretable factorised latent representations from raw image data in a completely unsupervised manner.”).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Way, Ho, and Schmauch to incorporate the teachings of Higgins to include a β-VAE encoder model. Way teaches the use of a variational autoencoder and also discloses a parameter that weights the KL divergence term in the loss function, which is functionally similar to β. Higgins discloses the use of a β-VAE encoder model. One of ordinary skill in the art would have understood that substituting the VAE in Way with the β-VAE encoder model of Higgins would provide improved performance, as disclosed by Higgins (Pg. 1).
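For illustration, the β-VAE objective Higgins describes differs from a standard VAE objective only in the scalar weight β applied to the KL divergence term. The sketch below assumes a diagonal Gaussian posterior and squared-error reconstruction; β = 4.0 is an arbitrary illustrative default, not a value prescribed by Higgins for this application.

```python
import numpy as np

def beta_vae_loss(x: np.ndarray, x_recon: np.ndarray,
                  mu: np.ndarray, logvar: np.ndarray,
                  beta: float = 4.0) -> float:
    """Illustrative β-VAE objective: reconstruction error plus a KL
    divergence term weighted by β. Setting β = 1 recovers the standard
    VAE objective; Higgins report that β > 1 encourages disentangled
    latent factors."""
    recon = np.sum((x - x_recon) ** 2)
    # KL divergence between the diagonal Gaussian q(z|x) and N(0, I).
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return float(recon + beta * kl)

# With a perfect reconstruction and a posterior matching the prior
# (mu = 0, logvar = 0), both terms vanish and the loss is zero.
print(beta_vae_loss(np.zeros(3), np.zeros(3), np.zeros(3), np.zeros(3)))
```

The substitution argued above amounts to changing this single hyperparameter in the training objective, which is why it carries a reasonable expectation of success.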
Regarding claim 15, the recited elements perform substantially the same function as those of claim 2. It is rejected under the same analysis.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID A WAMBST whose telephone number is (703)756-1750. The examiner can normally be reached M-F 9-6:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Gregory Morse can be reached at (571)272-3838. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DAVID ALEXANDER WAMBST/Examiner, Art Unit 2663
/GREGORY A MORSE/Supervisory Patent Examiner, Art Unit 2698