Diffusion-Models-Improve-Points

Date: September 6th 2024, 5:24:25 am
Journal: xxx

Directions

  1. Add a condition block at every sampling step.

    1. [Bidirectional Condition Diffusion Probabilistic Models for PET Image Denoising](###Bidirectional Condition Diffusion Probabilistic Models for PET Image Denoising)
    2. [ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models](###ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models)
    3. [Denoising Diffusion Restoration Models](###Denoising Diffusion Restoration Models)
    4. [SRDiff: Single Image Super-Resolution with Diffusion Probabilistic Models](###SRDiff: Single Image Super-Resolution with Diffusion Probabilistic Models)
    5. [***SegDiff: Image Segmentation with Diffusion Probabilistic Models](###SegDiff: Image Segmentation with Diffusion Probabilistic Models)
    6. [Person Image Synthesis via Denoising Diffusion Model](###Person Image Synthesis via Denoising Diffusion Model)
    7. [***Learning Reliability of Multi-Modality Medical Images for Tumor Segmentation via Evidence-Identified Denoising Diffusion Probabilistic Models](###Learning Reliability of Multi-Modality Medical Images for Tumor Segmentation via Evidence-Identified Denoising Diffusion Probabilistic Models)
    8. [Diffusion Autoencoders: Toward a Meaningful and Decodable Representation](###Diffusion Autoencoders: Toward a Meaningful and Decodable Representation)
    9. [Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image](###Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image)
  2. Build a new diffusion model out of several diffusion models.

    1. [Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation](###Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation)
    2. [Erasing-inpainting-based data augmentation using denoising diffusion probabilistic models with limited samples for generalized surface defect inspection](###Erasing-inpainting-based data augmentation using denoising diffusion probabilistic models with limited samples for generalized surface defect inspection)
    3. [RePaint: Inpainting using Denoising Diffusion Probabilistic Models](###RePaint: Inpainting using Denoising Diffusion Probabilistic Models)
    4. [Face Morphing Attack Detection with Denoising Diffusion Probabilistic Models](###Face Morphing Attack Detection with Denoising Diffusion Probabilistic Models)
    5. [Cloud Removal in Remote Sensing Using Sequential-Based Diffusion Models](###Cloud Removal in Remote Sensing Using Sequential-Based Diffusion Models)
  3. Introduce time-series models into the diffusion model.

    1. [DiffTAD: Denoising diffusion probabilistic models for vehicle trajectory anomaly detection](##DiffTAD: Denoising diffusion probabilistic models for vehicle trajectory anomaly detection)
  4. Combine a latent space with a diffusion model (AE + DDM, Stable Diffusion-style designs, etc.)

    1. [Radio Anomaly Detection Based on Improved Denoising Diffusion Probabilistic Models](###Radio Anomaly Detection Based on Improved Denoising Diffusion Probabilistic Models)
    2. [D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation](###D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation)
    3. [TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction](###TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction)
    4. [VAEs meet Diffusion Models: Efficient and High-Fidelity Generation](###VAEs meet Diffusion Models: Efficient and High-Fidelity Generation)
    5. [Denoising Diffusion Models on Model-Based Latent Space](###Denoising Diffusion Models on Model-Based Latent Space)
    6. [Diverse Hyperspectral Remote Sensing Image Synthesis With Diffusion Models](###Diverse Hyperspectral Remote Sensing Image Synthesis With Diffusion Models)
  5. Accelerated sampling

    1. [Denoising Diffusion Implicit Models](###Denoising Diffusion Implicit Models)
    2. [gDDIM: Generalized denoising diffusion implicit models](###gDDIM: Generalized denoising diffusion implicit models)
    3. [DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models](###DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models)
    4. [Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models](###Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models)
    5. [Accelerating Diffusion Models via Early Stop of the Diffusion Process](###Accelerating Diffusion Models via Early Stop of the Diffusion Process)
    6. [***Parallel Sampling of Diffusion Models](###Parallel Sampling of Diffusion Models)
    7. [***Diffusion Posterior Sampling for General Noisy Inverse Problems](###Diffusion Posterior Sampling for General Noisy Inverse Problems)
    8. [***Efficient Denoising Diffusion via Probabilistic Masking](###Efficient Denoising Diffusion via Probabilistic Masking)
  6. Parameter optimization

    1. [Improved Denoising Diffusion Probabilistic Models](###Improved Denoising Diffusion Probabilistic Models)
    2. [Bilateral Denoising Diffusion Models](###Bilateral Denoising Diffusion Models)
  7. Training process

    1. [Residual Denoising Diffusion Models](###Residual Denoising Diffusion Models)
  8. Replacing the noise distribution

    1. [Denoising Diffusion Gamma Models](###Denoising Diffusion Gamma Models)
    2. [AnoDDPM: Anomaly Detection with Denoising Diffusion Probabilistic Models using Simplex Noise](###AnoDDPM: Anomaly Detection with Denoising Diffusion Probabilistic Models using Simplex Noise)
    3. [Non Gaussian Denoising Diffusion Models](###Non Gaussian Denoising Diffusion Models)
  9. Replacing the network

    1. [Spiking Denoising Diffusion Probabilistic Models](##Spiking Denoising Diffusion Probabilistic Models)
    2. [DiffTAD: Denoising diffusion probabilistic models for vehicle trajectory anomaly detection](##DiffTAD: Denoising diffusion probabilistic models for vehicle trajectory anomaly detection)
    3. [***Pyramidal Denoising Diffusion Probabilistic Models - a network that accepts images at different scales, interesting](###Pyramidal Denoising Diffusion Probabilistic Models)
    4. [Diffusion Probabilistic Model Made Slim](###Diffusion Probabilistic Model Made Slim)
    5. [SAR Despeckling using a Denoising Diffusion Probabilistic Model](###SAR Despeckling using a Denoising Diffusion Probabilistic Model)
    6. [DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion](##DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion)
    7. [Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition](###Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition)
    8. [TCDM: Effective Large-Factor Image Super-Resolution via Texture Consistency Diffusion](###TCDM: Effective Large-Factor Image Super-Resolution via Texture Consistency Diffusion)
    9. Other
      1. [***Denoising Diffusion Bridge Models](###Denoising Diffusion Bridge Models)
      2. [Advancing Realistic Precipitation Nowcasting With a Spatiotemporal Transformer-Based Denoising Diffusion Model](###Advancing Realistic Precipitation Nowcasting With a Spatiotemporal Transformer-Based Denoising Diffusion Model)

Network improvements

DiffTAD: Denoising diffusion probabilistic models for vehicle trajectory anomaly detection

DiffTAD1

DiffTAD2

Spiking Denoising Diffusion Probabilistic Models

SDDPM

SDDPM2

Uses a spiking-neural-network (SNN) based UNet, which still cannot match an ANN-based UNet.

Diffusion Probabilistic Model Made Slim

![Diffusion Probabilistic Model Made Slim1](../images/Diffusion-Models-Improve-Points/Diffusion Probabilistic Model Made Slim1.png)

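The paper first shrinks the DPM, then replaces the up- and down-sampling blocks of the UNet with operations in the image frequency domain.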

![Diffusion Probabilistic Model Made Slim2](../images/Diffusion-Models-Improve-Points/Diffusion Probabilistic Model Made Slim2.png)

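However, the idea of learning low frequencies first and high frequencies afterwards is not new, and it is a property of the UNet network rather than of diffusion models. So the paper essentially only swaps the backbone and brings in the frequency-domain concept.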

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

DDFM1

DDFM2

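Under the DDPM sampling framework, the fusion task is formulated as a conditional generation problem, which is further split into an unconditional generation subproblem and a maximum-likelihood subproblem.

The latter is modeled in a hierarchical Bayesian manner with latent variables and inferred with the expectation-maximization (EM) algorithm. By integrating the inference solution into the diffusion sampling iterations, the method can generate high-quality fused images from natural-image generative priors and cross-modality information from the source images.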

Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition

![Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition1](../images/Diffusion-Models-Improve-Points/Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition1.png)

![Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition2](../images/Diffusion-Models-Improve-Points/Zero-shot Generation of Training Data with Denoising Diffusion Probabilistic Model for Handwritten Chinese Character Recognition2.png)

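Only the network is modified: a condition is added, but it is injected into the network rather than into the conditional probability of the diffusion model.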

ICDAR International Conference on Document Analysis and Recognition CCF-C

Diffusion model improvements

Overall

Pyramidal Denoising Diffusion Probabilistic Models

PDDPM1

PDDPM2

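Uses a network that accepts images at multiple scales, so a single multi-scale network is trained. Diffusion sampling and upsampling are then alternated repeatedly to obtain a high-resolution image. Interesting.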

Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation

![Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation1](../images/Diffusion-Models-Improve-Points/Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation1.png)

![Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation2](../images/Diffusion-Models-Improve-Points/Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation2.png)

Erasing-inpainting-based data augmentation using denoising diffusion probabilistic models with limited samples for generalized surface defect inspection

![Erasing-inpainting-based data augmentation using denoising diffusion probabilistic models with limited samples for generalized surface defect inspection1](../images/Diffusion-Models-Improve-Points/Erasing-inpainting-based data augmentation using denoising diffusion probabilistic models with limited samples for generalized surface defect inspection1.png)

![Erasing-inpainting-based data augmentation using denoising diffusion probabilistic models with limited samples for generalized surface defect inspection2](../images/Diffusion-Models-Improve-Points/Erasing-inpainting-based data augmentation using denoising diffusion probabilistic models with limited samples for generalized surface defect inspection2.png)

Radio Anomaly Detection Based on Improved Denoising Diffusion Probabilistic Models

![Radio Anomaly Detection Based on Improved Denoising Diffusion Probabilistic Models1](../images/Diffusion-Models-Improve-Points/Radio Anomaly Detection Based on Improved Denoising Diffusion Probabilistic Models1.png)

D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation

D2C1

D2C2

Besides the autoencoder latent-space idea, it also uses contrastive learning.

Denoising Diffusion Gamma Models

DDGM1

DDGM2

AnoDDPM: Anomaly Detection with Denoising Diffusion Probabilistic Models using Simplex Noise

Uses simplex noise in place of Gaussian noise.

AnoDDPM1

AnoDDPM2

RePaint: Inpainting using Denoising Diffusion Probabilistic Models

Repaint1

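The idea is fairly intuitive: after each denoising step the known and generated parts are added together, which amounts to applying a condition after every denoising step.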
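The per-step merge in RePaint can be sketched in a few lines. This is a minimal illustration on flat pixel lists, assuming a binary mask in which 1 marks known pixels; the helper names `noise_known` and `repaint_merge` are hypothetical and not taken from the RePaint code, and the "generated" values stand in for the output of one reverse step.

```python
import math
import random

def noise_known(x0_known, alpha_bar_prev, rng):
    """Sample the known pixels at noise level t-1 with the closed-form forward process."""
    return [math.sqrt(alpha_bar_prev) * v
            + math.sqrt(1.0 - alpha_bar_prev) * rng.gauss(0.0, 1.0)
            for v in x0_known]

def repaint_merge(mask, known_prev, generated_prev):
    """Keep known pixels where mask == 1 and generated pixels where mask == 0."""
    return [m * k + (1 - m) * g for m, k, g in zip(mask, known_prev, generated_prev)]

rng = random.Random(0)
mask = [1, 1, 0, 0]
x0_known = [0.5, -0.5, 0.0, 0.0]        # values outside the mask are ignored
generated_prev = [9.0, 9.0, 0.2, -0.3]  # stand-in for one reverse-step output

# alpha_bar = 1.0 corresponds to the final step, where no noise is left.
known_prev = noise_known(x0_known, alpha_bar_prev=1.0, rng=rng)
merged = repaint_merge(mask, known_prev, generated_prev)
print(merged)
```

At intermediate steps `alpha_bar_prev` is below 1, so the known region is noised to the same level as the generated region before the two are combined.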

Denoising Diffusion Restoration Models

DDRM1

DDRM2

TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction

TransFusion1

Face Morphing Attack Detection with Denoising Diffusion Probabilistic Models

MAD-DDPM

MAD-DDPM1

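Applies diffusion models to the downstream task of face morphing attack detection. One point worth learning from is that the model is built from two diffusion models.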

Accelerating Diffusion Models via Early Stop of the Diffusion Process

ES-DDPM

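Equivalent to combining DDPM with an AE.

Important:

  1. The forward process adds noise for only T' steps rather than the full T steps, producing $x^{T'}$.
  2. Train an AE, VAE, or GAN model p, which is used to recover $x^{T'}$.
  3. In the reverse process, first sample a latent variable z from a Gaussian distribution, then use model p to recover $x^{T'}$.
  4. Denoise starting from the recovered $x^{T'}$.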
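The four steps above can be sketched on a 1-D toy problem. This is only an illustration under stated assumptions: the data distribution is a single point `MU`, so the exact noise predictor is known in closed form; `toy_generator` stands in for the pretrained AE/VAE/GAN p; and I assume p outputs a rough clean sample that is then noised to level T' with the closed-form forward process. All function names are hypothetical.

```python
import math
import random

def linear_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Return (betas, alpha_bars) for a linear beta schedule; index t-1 holds step t."""
    betas = [beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)]
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return betas, alpha_bars

def early_stop_sample(generator, eps_model, t_prime, betas, alpha_bars, rng):
    """Start the reverse chain at step t_prime < T from a generator sample."""
    z = rng.gauss(0.0, 1.0)                 # step 3: latent from a Gaussian
    x0_rough = generator(z)                 # model p produces a rough sample
    ab = alpha_bars[t_prime - 1]
    # Noise the rough sample to level t_prime (assumed way of obtaining x^{T'}).
    x = math.sqrt(ab) * x0_rough + math.sqrt(1.0 - ab) * rng.gauss(0.0, 1.0)
    for t in range(t_prime, 0, -1):         # step 4: only t_prime DDPM reverse steps
        beta, ab_t = betas[t - 1], alpha_bars[t - 1]
        eps = eps_model(x, t)
        mean = (x - beta / math.sqrt(1.0 - ab_t) * eps) / math.sqrt(1.0 - beta)
        noise = rng.gauss(0.0, 1.0) if t > 1 else 0.0
        x = mean + math.sqrt(beta) * noise
    return x

MU = 2.0                 # toy data distribution: every sample equals MU
T, T_PRIME = 1000, 100
betas, alpha_bars = linear_schedule(T)

def toy_generator(z):
    """Hypothetical stand-in for the pretrained generator p (deliberately imprecise)."""
    return MU + 0.5 * z

def toy_eps_model(x, t):
    """Exact noise predictor for the point-mass toy data."""
    ab_t = alpha_bars[t - 1]
    return (x - math.sqrt(ab_t) * MU) / math.sqrt(1.0 - ab_t)

sample = early_stop_sample(toy_generator, toy_eps_model, T_PRIME,
                           betas, alpha_bars, random.Random(0))
print(T_PRIME, "reverse steps instead of", T, "->", sample)
```

The point of the sketch is the step count: the reverse loop runs T' times instead of T, and the generator only has to be good enough that its error is hidden by the noise at level T'.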

Person Image Synthesis via Denoising Diffusion Model

PIDM

Applies DDPM to person pose transfer, so two conditions ($x_p, x_s$) are added.

Learning Reliability of Multi-Modality Medical Images for Tumor Segmentation via Evidence-Identified Denoising Diffusion Probabilistic Models

EI-DDPM1

EI-DDPM2

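  1. Uses multiple diffusion models, each trained on a different dataset.
  2. Adds a conditional probability when sampling from each diffusion model.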

Diffusion Posterior Sampling for General Noisy Inverse Problems

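Diffusion models have recently been studied as powerful generative inverse problem solvers, owing to their high-quality reconstructions and the ease of combining them with existing iterative solvers. However, most works focus on simple linear inverse problems in noiseless settings, which significantly under-represents the complexity of real-world problems. In this work, we extend diffusion solvers to efficiently handle general noisy (non)linear inverse problems via an approximation of posterior sampling. Interestingly, the resulting posterior sampling scheme is a blended version of diffusion sampling with the manifold-constrained gradient, without a strict measurement-consistency projection step, yielding a more desirable generative path in noisy settings than previous studies. Our method demonstrates that diffusion models can incorporate various measurement noise statistics such as Gaussian and Poisson, and can also efficiently handle noisy nonlinear inverse problems such as Fourier phase retrieval and non-uniform deblurring. Code is available at https://github.com/DPS2022/diffusion-posterior-sampling.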

DPS
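As a rough sketch of what posterior sampling "without a strict measurement-consistency projection step" looks like, the update can be written as an ordinary unconditional reverse step followed by a gradient correction. The notation is my own shorthand and should be checked against the paper: $\mathcal{A}$ is the forward (measurement) operator, $y$ the measurement, $\zeta_t$ a step size, $\hat{x}_0$ the denoised estimate, and $x'_{t-1}$ the result of the unconditional reverse step.

```latex
\hat{x}_0(x_t) = \frac{1}{\sqrt{\bar{\alpha}_t}}
  \left( x_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta(x_t, t) \right),
\qquad
x_{t-1} = x'_{t-1} - \zeta_t \, \nabla_{x_t}
  \left\| y - \mathcal{A}\big(\hat{x}_0(x_t)\big) \right\|_2^2
```

Because the correction is a gradient step rather than a hard projection onto $\{x : \mathcal{A}(x) = y\}$, measurement noise is not forced into the sample.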

VAEs meet Diffusion Models: Efficient and High-Fidelity Generation

Optimizes the VAE and DDPM loss functions jointly to obtain better results.

Denoising Diffusion Models on Model-Based Latent Space

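Defines a latent-space diffusion model. However, the latent space is not learned by a model; it is obtained by a predefined procedure (a sequence of traditional methods).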

![Denoising Diffusion Models on Model-Based Latent Space1](../images/Diffusion-Models-Improve-Points/Denoising Diffusion Models on Model-Based Latent Space1.png)

![Denoising Diffusion Models on Model-Based Latent Space2](../images/Diffusion-Models-Improve-Points/Denoising Diffusion Models on Model-Based Latent Space2.png)

![Denoising Diffusion Models on Model-Based Latent Space3](../images/Diffusion-Models-Improve-Points/Denoising Diffusion Models on Model-Based Latent Space3.png)

Denoising Diffusion Bridge Models

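Diffusion models are powerful generative models that map noise to data using stochastic processes. However, in many applications such as image editing, the model input comes from a distribution that is not random noise. Diffusion models must therefore rely on cumbersome methods such as guidance or projected sampling to incorporate this information into the generative process. In our work, we propose Denoising Diffusion Bridge Models (DDBMs), a natural alternative based on diffusion bridges. Our method learns the score of the diffusion bridge from data and maps one endpoint distribution to the other by solving a (stochastic) differential equation based on the learned score. Our method naturally unifies several classes of generative models, such as score-based diffusion models and OT flow matching, allowing us to adapt existing design and architectural choices to our more general problem. Empirically, we apply DDBMs to challenging image datasets in both pixel and latent space. On standard image translation problems, DDBMs achieve significant improvements over baseline methods; when we reduce the problem to image generation by setting the source distribution to random noise, DDBMs achieve FID scores comparable to state-of-the-art methods despite being built for a more general task.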

DDRM

Diverse Hyperspectral Remote Sensing Image Synthesis With Diffusion Models

![Diverse Hyperspectral Remote Sensing Image Synthesis With Diffusion Models](../images/Mamba-papers/Diverse Hyperspectral Remote Sensing Image Synthesis With Diffusion Models.png)

  1. Uses a CNN encoder-decoder.
  2. Uses an RGB image as the condition.

Cloud Removal in Remote Sensing Using Sequential-Based Diffusion Models

![Cloud Removal in Remote Sensing Using Sequential-Based Diffusion Models1](../images/Mamba-papers/Cloud Removal in Remote Sensing Using Sequential-Based Diffusion Models1.png)

![Cloud Removal in Remote Sensing Using Sequential-Based Diffusion Models2](../images/Mamba-papers/Cloud Removal in Remote Sensing Using Sequential-Based Diffusion Models2.png)

Uses diffusion models to handle sequential image data.

Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image

![Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image1](../images/Mamba-papers/Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image1.png)

![Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image2](../images/Mamba-papers/Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image2.png)

![Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image3](../images/Mamba-papers/Conditional Stochastic Normalizing Flows for Blind Super-Resolution of Remote Sensing Image3.png)

Two conditions:

  1. LR
  2. Degradation representation

Advancing Realistic Precipitation Nowcasting With a Spatiotemporal Transformer-Based Denoising Diffusion Model

![Advancing Realistic Precipitation Nowcasting With a Spatiotemporal Transformer-Based Denoising Diffusion Model1](../images/Mamba-papers/Advancing Realistic Precipitation Nowcasting With a Spatiotemporal Transformer-Based Denoising Diffusion Model1.png)

![Advancing Realistic Precipitation Nowcasting With a Spatiotemporal Transformer-Based Denoising Diffusion Model2](../images/Mamba-papers/Advancing Realistic Precipitation Nowcasting With a Spatiotemporal Transformer-Based Denoising Diffusion Model2.png)

TCDM: Effective Large-Factor Image Super-Resolution via Texture Consistency Diffusion

![TCDM Effective Large-Factor Image Super-Resolution via Texture Consistency Diffusion1](../images/Mamba-papers/TCDM Effective Large-Factor Image Super-Resolution via Texture Consistency Diffusion1.png)

![TCDM Effective Large-Factor Image Super-Resolution via Texture Consistency Diffusion2](../images/Mamba-papers/TCDM Effective Large-Factor Image Super-Resolution via Texture Consistency Diffusion2.png)

SpectralDiff: A Generative Framework for Hyperspectral Image Classification With Diffusion Models

![SpectralDiff A Generative Framework for Hyperspectral Image Classification With Diffusion Models1](../images/Mamba-papers/SpectralDiff A Generative Framework for Hyperspectral Image Classification With Diffusion Models1.png)

![SpectralDiff A Generative Framework for Hyperspectral Image Classification With Diffusion Models2](../images/Mamba-papers/SpectralDiff A Generative Framework for Hyperspectral Image Classification With Diffusion Models2.png)

Training process

Residual Denoising Diffusion Models

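Noise is added on top of the difference between the input image $I_{in}$ and the current image $I_0$ (this difference is denoted $I_{res}$).

To some extent this is also guidance, and it is very close to the idea of denoising.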

RDDM1

RDDM2

RDDM3
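A minimal sketch of the residual forward process in RDDM, as I read it: the residual $I_{res} = I_{in} - I_0$ is injected gradually alongside the noise. The form $I_t = I_0 + \bar{\alpha}_t I_{res} + \bar{\beta}_t \epsilon$ and the coefficient names are my recollection of the paper and should be treated as an assumption; the function name `rddm_forward` is hypothetical.

```python
import random

def rddm_forward(i0, i_in, alpha_bar_t, beta_bar_t, eps):
    """Return I_t = I_0 + alpha_bar_t * I_res + beta_bar_t * eps, with I_res = I_in - I_0."""
    i_res = [a - b for a, b in zip(i_in, i0)]
    return [x0 + alpha_bar_t * r + beta_bar_t * e
            for x0, r, e in zip(i0, i_res, eps)]

rng = random.Random(0)
i0 = [0.2, 0.8]      # target (clean) image, flattened
i_in = [0.5, 0.4]    # degraded input image, flattened
eps = [rng.gauss(0.0, 1.0) for _ in i0]

start = rddm_forward(i0, i_in, alpha_bar_t=0.0, beta_bar_t=0.0, eps=eps)  # t = 0
end = rddm_forward(i0, i_in, alpha_bar_t=1.0, beta_bar_t=0.0, eps=eps)    # residual fully added
noisy_end = rddm_forward(i0, i_in, alpha_bar_t=1.0, beta_bar_t=0.5, eps=eps)
print(start, end, noisy_end)
```

With the residual coefficient at 1 and no noise, the process lands exactly on the input image, which is why the note above compares it to guidance: the chain runs between the target and the degraded input rather than between the target and pure noise.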

Non Gaussian Denoising Diffusion Models

![Non Gaussian Denoising Diffusion Models1](../images/Diffusion-Models-Improve-Points/Non Gaussian Denoising Diffusion Models1.png)

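Analyzing the residual $x_t - x_0$ shows that fitting it with gamma noise is more effective and can reduce the number of sampling steps. (I think the right distribution still depends on the image: this residual shape only appears when the image has many dark regions, otherwise it is the opposite. So the approach has some issues, but it is still an inspiring idea.)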
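To make the gamma-versus-Gaussian point concrete, here is a small sketch that draws gamma noise matched to a standard Gaussian in mean and variance, so that only the skewness differs. It is not the paper's parameterization; the shape value and the helper name `centered_gamma_noise` are illustrative assumptions.

```python
import math
import random

def centered_gamma_noise(n, shape, rng):
    """Draw n samples of zero-mean, unit-variance gamma noise.

    Gamma(shape, scale) has mean shape*scale and variance shape*scale**2,
    so scale = 1/sqrt(shape) gives unit variance; subtracting the mean centers it.
    """
    scale = 1.0 / math.sqrt(shape)
    mean = shape * scale
    return [rng.gammavariate(shape, scale) - mean for _ in range(n)]

rng = random.Random(0)
samples = centered_gamma_noise(20000, shape=2.0, rng=rng)
n = len(samples)
sample_mean = sum(samples) / n
sample_var = sum((s - sample_mean) ** 2 for s in samples) / n
sample_skew = sum((s - sample_mean) ** 3 for s in samples) / n / sample_var ** 1.5
print(round(sample_mean, 3), round(sample_var, 3), round(sample_skew, 3))
```

The positive skew is the asymmetry the note refers to: whether it matches the true residual depends on the image statistics, which is exactly the concern raised above.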

Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models

Modiff

There seems to be no real improvement; it only adds a condition, the label c.

SRDiff: Single Image Super-Resolution with Diffusion Probabilistic Models

SRDiff

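SRDiff is a single-image super-resolution model built on denoising diffusion models. Its $x_0$ is not the SR or HR image we want to predict: in the diffusion process, $x_0$ is the difference between the HR image and the upsampled LR image, which the paper writes as $x_r = x_H - UP(x_L)$. In the reverse process, the final $x_0$ is added to the upsampled LR image to obtain the predicted super-resolved image $x_{SR} = x_0 + UP(x_L)$.

Predicting the residual instead of the image directly speeds up training, makes the network easier to optimize, and focuses on the high-frequency information that is most easily lost.

The reverse process is iterative: each step predicts from the result of the previous step, and every step uses the same prediction network. This network plays the role of $\epsilon_{\theta}$ in Eq. (5), but in SRDiff it is not identical to the one in DDPM; the paper calls it the conditional noise predictor.

The input of the conditional noise predictor contains not only $x_t$ from the previous step but also, as a condition, $x_e$, which is produced from the LR image by an encoder and has the same spatial size as $x_t$.

The reason is that during training we can obtain $x_t = \sqrt{\bar{\alpha}_t} x_r + \sqrt{1-\bar{\alpha}_t}\epsilon$ directly from the diffusion process, and it carries latent information about the HR image at that step. After training, when super-resolving an LR image, each $x_t$ can only be obtained from the previous step, and inference starts from $x_T \sim \mathcal{N}(0, I)$. Without $x_e$ in the conditional noise predictor, the information about the corresponding LR image would be missing. The LR image is therefore passed through the encoder and fed into the conditional noise predictor as a latent variable to guide generation of the corresponding HR information, similar in principle to the encoder in a VAE.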
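The residual formulation $x_r = x_H - UP(x_L)$ and $x_{SR} = x_0 + UP(x_L)$ can be checked on a tiny 1-D example. This sketch swaps in nearest-neighbour upsampling for UP purely to stay dependency-free; the helper names are hypothetical, and the diffusion model itself is replaced by a perfect prediction of the residual.

```python
def upsample_nearest(x_low, factor):
    """Repeat every low-resolution value `factor` times (stand-in for UP)."""
    return [v for v in x_low for _ in range(factor)]

def to_residual(x_high, x_low, factor):
    """Diffusion target: x_r = x_H - UP(x_L)."""
    up = upsample_nearest(x_low, factor)
    return [h - u for h, u in zip(x_high, up)]

def from_residual(x0_pred, x_low, factor):
    """Final output: x_SR = x_0 + UP(x_L)."""
    up = upsample_nearest(x_low, factor)
    return [r + u for r, u in zip(x0_pred, up)]

x_high = [1.0, 2.0, 3.0, 5.0]   # HR signal
x_low = [1.5, 4.0]              # LR signal (pairwise means of the HR signal)

x_r = to_residual(x_high, x_low, factor=2)   # what the diffusion model learns to generate
x_sr = from_residual(x_r, x_low, factor=2)   # a perfect residual recovers the HR signal
print(x_r, x_sr)
```

The residual contains only the detail that upsampling cannot recover, which is the high-frequency content the note says the model should focus on.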

SAR Despeckling using a Denoising Diffusion Probabilistic Model

![SAR Despeckling using a Denoising Diffusion Probabilistic Model1](../images/Diffusion-Models-Improve-Points/SAR Despeckling using a Denoising Diffusion Probabilistic Model1.png)

No real improvement, it seems.

SegDiff: Image Segmentation with Diffusion Probabilistic Models

SegDiff1

Still a conditional probability, but the difference is that an extra F and an extra G are added here, which is worth noting.
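If I read the SegDiff paper correctly, F and G are two small encoders whose outputs are summed before the usual UNet encoder E and decoder D: F embeds the current noisy segmentation map $x_t$ and G embeds the conditioning image $I$. Treat the exact form below as my recollection rather than a quotation:

```latex
\epsilon_\theta(x_t, I, t) = D\big( E( F(x_t) + G(I),\ t ),\ t \big)
```

The condition therefore enters by feature addition rather than by channel concatenation, which is what sets it apart from the other conditional variants in these notes.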

T2V-DDPM: Thermal to Visible Face Translation using Denoising Diffusion Probabilistic Models

T2V-DDPM

Again adds a condition y, here a background color.

MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model

MedSegDiff

Again adds a condition l, this time applied to medical image segmentation.

Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

DDIM-AE

In theory, this is still a conditional DDIM.

Sampling

Bidirectional Condition Diffusion Probabilistic Models for PET Image Denoising

BC-DPM

BC-DPM2

Adds a condition block at every sampling step.

ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models

ILVR1

ILVR2

Denoising Diffusion Implicit Models

Extends the sampling process to a non-Markovian chain, which accelerates sampling.

DDIM1

DDIM2

DDIM3

DDIM4
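A minimal sketch of deterministic DDIM sampling (eta = 0) on a sub-sequence of timesteps, assuming a linear beta schedule and a 1-D toy problem whose data distribution is the single point `MU`, so that the exact noise predictor is available in closed form. The function names are hypothetical.

```python
import math

def linear_alpha_bars(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear schedule; index t-1 holds step t."""
    alpha_bars, prod = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / (T - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def ddim_sample(x_T, eps_model, alpha_bars, timesteps):
    """Deterministic DDIM over a decreasing subsequence of timesteps."""
    x = x_T
    for i, t in enumerate(timesteps):
        ab_t = alpha_bars[t - 1]
        # alpha_bar of the next (smaller) timestep; 1.0 means fully denoised.
        ab_prev = alpha_bars[timesteps[i + 1] - 1] if i + 1 < len(timesteps) else 1.0
        eps = eps_model(x, t)
        x0_pred = (x - math.sqrt(1.0 - ab_t) * eps) / math.sqrt(ab_t)
        x = math.sqrt(ab_prev) * x0_pred + math.sqrt(1.0 - ab_prev) * eps
    return x

MU = -1.0
T = 1000
alpha_bars = linear_alpha_bars(T)

def toy_eps_model(x, t):
    """Exact noise predictor when every data point equals MU."""
    ab_t = alpha_bars[t - 1]
    return (x - math.sqrt(ab_t) * MU) / math.sqrt(1.0 - ab_t)

timesteps = list(range(T, 0, -100))   # 10 steps instead of 1000
result = ddim_sample(0.7, toy_eps_model, alpha_bars, timesteps)
print(len(timesteps), "steps ->", result)
```

Because each update depends only on the predicted $x_0$ and the predicted noise, the sampler can jump between non-adjacent timesteps, which is where the speed-up comes from.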

gDDIM: Generalized denoising diffusion implicit models

Extends DDIM to diffusion models (DDMs) other than DDPM.

DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

DPM-Solver++

Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models

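Diffusion probabilistic models (DPMs) are a class of powerful generative models. Despite their success, inference with DPMs is expensive because it generally requires iterating over thousands of timesteps. A key problem in inference is estimating the variance at each timestep of the reverse process. In this work, we present a surprising result: both the optimal reverse variance and the corresponding optimal KL divergence of a DPM have analytic forms with respect to its score function. Building on this, we propose Analytic-DPM, a training-free inference framework that estimates the analytic forms of the variance and KL divergence using the Monte Carlo method and a pretrained score-based model. Further, to correct the potential bias caused by the score-based model, we derive both lower and upper bounds of the optimal variance and clip the estimate for a better result. Empirically, Analytic-DPM improves the log-likelihood of various DPMs and produces high-quality samples, while enjoying a 20x to 80x speed-up.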

Parallel Sampling of Diffusion Models

NeurIPS 2023

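Rather than investigating further techniques for reducing the number of denoising steps, which can degrade quality, we look at other ways to accelerate sampling. In particular, we investigate the idea of trading compute for speed: can we accelerate sampling by taking denoising steps in parallel? We clarify that our goal is not to improve sample throughput, which can be done with naive parallelization by producing multiple samples at the same time. Our goal is to improve sample latency: minimizing the wall-clock time required to generate a single sample by solving the denoising steps for that sample in parallel. Lowering sample latency without sacrificing quality can greatly improve the experience of using diffusion models and enable more interactive and real-time generation applications. However, parallelizing the denoising steps seems challenging because of the sequential nature of existing sampling methods. The computation graph has a chain structure (Fig. 1), which makes it difficult to propagate information quickly through the graph. To make progress, we propose the use of Picard iteration, a technique for solving ODEs through fixed-point iteration. An ODE is defined by a drift function s(x, t) with position and time arguments, and an initial value x0.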

PDDPM
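Picard iteration can be illustrated on a scalar ODE. The sketch below is a simplified toy, not the paper's algorithm (which, as far as I recall, adds a sliding window and an error tolerance): every iteration evaluates the drift at all grid points independently, which is the part that could run in parallel, and then rebuilds the whole trajectory with a cumulative sum. Function names are hypothetical.

```python
def euler_sequential(drift, x0, t0, h, n_steps):
    """Ordinary sequential Euler solve of dx/dt = drift(x, t)."""
    xs = [x0]
    for i in range(n_steps):
        xs.append(xs[-1] + h * drift(xs[-1], t0 + i * h))
    return xs

def picard_parallel(drift, x0, t0, h, n_steps, n_iters):
    """Picard fixed-point iteration on the discretized ODE.

    x_i^{k+1} = x_0 + h * sum_{j < i} drift(x_j^k, t_j)
    """
    xs = [x0] * (n_steps + 1)   # initial guess: constant trajectory
    for _ in range(n_iters):
        # All drift evaluations use the previous iterate, so they are independent.
        drifts = [drift(xs[j], t0 + j * h) for j in range(n_steps)]
        new_xs, acc = [x0], 0.0
        for d in drifts:
            acc += h * d
            new_xs.append(x0 + acc)
        xs = new_xs
    return xs

def decay(x, t):
    """Toy drift s(x, t) = -x."""
    return -x

reference = euler_sequential(decay, 1.0, 0.0, 0.1, 10)
few_iters = picard_parallel(decay, 1.0, 0.0, 0.1, 10, n_iters=3)
converged = picard_parallel(decay, 1.0, 0.0, 0.1, 10, n_iters=10)
max_err_few = max(abs(a - b) for a, b in zip(few_iters, reference))
max_err_conv = max(abs(a - b) for a, b in zip(converged, reference))
print(max_err_few, max_err_conv)
```

After k iterations the first k+1 grid points agree with the sequential solution, so the worst case needs as many iterations as there are steps; the latency gain comes from trajectories that converge in far fewer iterations.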

Efficient Denoising Diffusion via Probabilistic Masking

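This paper tries to speed up sampling. The main idea is to use a mask $m_t$ to control whether a given step $t$ is needed.

$\alpha_t(m) = 1 - \beta_t m_t$

That is, $m$ is reparameterized as a binary random vector whose components $m_t$ are independent Bernoulli random variables: $m_t = 1$ with probability $s_t \in [0, 1]$ and $m_t = 0$ with probability $1 - s_t$.

A method is also used to update $m_t$ automatically.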
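The masking rule $\alpha_t(m) = 1 - \beta_t m_t$ is easy to demonstrate: when $m_t = 0$ the step has $\alpha_t = 1$, so it adds no noise and can be skipped. The sketch below samples the Bernoulli mask from fixed scores; how the paper learns the scores $s_t$ is not covered, and the function names are hypothetical.

```python
import random

def sample_mask(scores, rng):
    """Draw m_t ~ Bernoulli(s_t) independently for every step."""
    return [1 if rng.random() < s else 0 for s in scores]

def masked_alphas(betas, mask):
    """alpha_t(m) = 1 - beta_t * m_t; a masked step has alpha_t = 1 (identity)."""
    return [1.0 - b * m for b, m in zip(betas, mask)]

betas = [0.1, 0.2, 0.3, 0.4]
scores = [1.0, 0.0, 1.0, 0.0]   # extreme scores make the outcome deterministic

mask = sample_mask(scores, random.Random(0))
alphas = masked_alphas(betas, mask)
kept_steps = [t for t, m in enumerate(mask, start=1) if m == 1]
print(mask, alphas, kept_steps)
```

With scores strictly between 0 and 1 the mask is random, and pushing the scores toward 0 or 1 during training is what would select a sparse set of steps.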

Parameter improvements

Improved Denoising Diffusion Probabilistic Models

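  1. Learn the variance with a neural network

    Since $L_{simple}$ does not depend on $\Sigma_\theta(x_t, t)$, a new hybrid objective is defined:

    IDDPM1

    $L_{hybrid} = L_{simple} + \lambda L_{vlb}$

  2. Propose a cosine noise schedule

    IDDPM2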
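Both points can be sketched briefly. The cosine schedule defines $\bar{\alpha}_t$ through a squared cosine and derives $\beta_t$ from consecutive ratios; the offset `s=0.008` and the clipping value `0.999` are the defaults as I recall them from the paper, so treat them as assumptions. The hybrid objective is shown only as the weighted sum from the formula above, with `lam` left as a free parameter.

```python
import math

def cosine_alpha_bar(t, T, s=0.008):
    """alpha_bar(t) = f(t) / f(0) with f(u) = cos^2(((u / T + s) / (1 + s)) * pi / 2)."""
    def f(u):
        return math.cos((u / T + s) / (1.0 + s) * math.pi / 2.0) ** 2
    return f(t) / f(0)

def cosine_betas(T, s=0.008, max_beta=0.999):
    """beta_t = 1 - alpha_bar(t) / alpha_bar(t - 1), clipped to avoid a singular last step."""
    return [min(1.0 - cosine_alpha_bar(t, T, s) / cosine_alpha_bar(t - 1, T, s), max_beta)
            for t in range(1, T + 1)]

def hybrid_loss(l_simple, l_vlb, lam):
    """L_hybrid = L_simple + lam * L_vlb."""
    return l_simple + lam * l_vlb

T = 1000
betas = cosine_betas(T)
alpha_bar_mid = cosine_alpha_bar(T // 2, T)
print(betas[0], betas[-1], alpha_bar_mid)
print(hybrid_loss(1.0, 2.0, lam=0.5))
```

Compared with a linear schedule, $\bar{\alpha}_t$ falls off more gently in the middle of the chain, so fewer steps are spent on images that are already almost pure noise.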

Bilateral Denoising Diffusion Models

BDDM

Designs an algorithm and a network to compute β, which enables faster sampling. There are therefore actually two networks:

  1. a score network $\epsilon_{\theta}$ for sampling
  2. a scheduling network $\sigma_\phi$ for estimating a noise schedule for sampling