2026 - Thesis - Accelerate 3D Brain Mapping
Alzheimer's Disease, warp
Readings
- Demos
- 2019 - Point-Voxel CNN for Efficient 3D Deep Learning, NeurIPS
- 2020 - Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
- 2026 - nvidia/NV-Generate-MR-Brain
Comparison between Conventional Structural MRI and Quantitative Susceptibility Mapping (QSM MRI)
| Property | Conventional Structural MRI (e.g., T1/T2) | Quantitative Susceptibility Mapping (QSM MRI) |
|---|---|---|
| Signal origin | Measures proton relaxation properties (T1 and T2 relaxation times) following RF excitation. | Measures phase shifts induced by local variations in the proton Larmor precession frequency. |
| Primary contrast mechanism | Differences in longitudinal and transverse relaxation of hydrogen protons. | Local magnetic field perturbations caused by tissue magnetic susceptibility. |
| What it depicts | Macroscopic anatomical structures (e.g., gray matter–white matter boundaries, cortical thickness). | The absolute magnetic susceptibility (χ) distribution of tissue. |
| Physical nature | Qualitative or semi-quantitative signal intensity (relative brightness or darkness). | Quantitative physical parameter expressed in parts per million (ppm). |
| Biophysical interpretability | Indirect and non-specific; contrast reflects multiple tissue properties simultaneously. | Directly linked to underlying biophysical sources of magnetism in tissue. |
| Sensitivity to pathology | Sensitive to gross structural changes such as atrophy or lesions, but largely insensitive to early molecular pathology. | Highly sensitive to paramagnetic substances (e.g., iron deposition) and diamagnetic components (e.g., calcification or protein aggregates). |
Topics
- EU Open Research Repository, AiiDA.net
- 2026 - A generalizable foundation model for analysis of human brain MRI
- 2026 - How the brain's wiring changes
- 2026 - SpaceX
Coding
- 2026 - Let your training go from 8 hrs to 13 mins
- Armadillo
- Differentiable Physics with Nvidia - Warp 1.11.1
Alzheimer's Disease Neuroimaging Initiative (ADNI)
- A large-scale longitudinal multi-center study initiated in 2004. The dataset includes 3D brain MRI and PET images with associated diagnostic labels and clinical metadata, and is publicly available via the ADNI Image and Data Archive under a data use agreement.
- ADNI Database
- The essence of Alzheimer's disease (AD) is the breakdown of neuronal connections caused by the deposition of amyloid plaques at the microscopic level. PATHFINDER (bioRxiv 2025) addresses how to precisely reconstruct damaged neurons; the QSM/MRI framework (arXiv 2503) addresses how to quantify plaque burden in vivo using imaging.
- Data alignment: microscopic data (PATHFINDER) and MRI data (ADNI) differ in spatial scale by several orders of magnitude. Instead of directly feeding them into the same model, you need to learn their representation mapping, e.g., with a 3D U-Net or a comparable 3D architecture.
A Medical GAN - Python + PyTorch (deep learning) + ANTs (image registration) + MEDI (QSM reconstruction)
- Based on 3D deep learning, spatial mapping reconstruction from QSM magnetic signals to amyloid pathological signals is achieved. Why:
  - PET scan: can directly visualize amyloid plaques in the brain, but it is expensive, involves radiation, and is not available in many hospitals.
  - QSM MRI (input): a newer MRI technique, highly sensitive to magnetic materials in the brain (such as iron deposits and plaques). It is inexpensive and safe.
  - Thesis task: use AI to find patterns between QSM signals and PET plaque distribution.
ADNI Cohort
│
├── QSM MRI (in vivo)
│     ├── QSM reconstruction & normalization
│     ├── Spatial registration to PET space
│     └── 3D volume cropping / resampling
│
├── Amyloid PET (reference standard)
│     ├── ADNI-standard preprocessing
│     ├── Intensity normalization
│     └── Co-registration with QSM
│
▼
3D QSM Volume
│
▼
Encoder: BrainIAC-Pretrained 3D Vision Transformer
│   (global contextual representation learning)
│
▼
Latent Cross-Modal Representation
│
▼
Alignment Module
│   (Conditional Diffusion or GAN-based refinement)
│
▼
Decoder
│
▼
Predicted Amyloid Burden Map
(continuous voxel-wise 3D estimate)
│
▼
Loss Optimization
│   ├── Structural Similarity (SSIM)
│   ├── Perceptual Loss (VGG-based)
│   └── Intensity Consistency Loss
│
▼
Voxel-wise Quantification of Cerebral Amyloid Plaque Burden
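The pipeline above can be sketched end to end in PyTorch. The tiny 3D encoder-decoder and the single loss term below are illustrative stand-ins of my own (a real run would use the BrainIAC ViT encoder, registered ADNI volumes, and the full SSIM/perceptual/intensity loss mix):

```python
import torch
import torch.nn as nn

class QSMToAmyloid(nn.Module):
    """Toy 3D encoder-decoder: QSM volume -> voxel-wise amyloid map."""
    def __init__(self, ch=8):
        super().__init__()
        self.enc = nn.Sequential(                      # downsample x2
            nn.Conv3d(1, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(                      # upsample back
            nn.ConvTranspose3d(ch, ch, 2, stride=2), nn.ReLU(),
            nn.Conv3d(ch, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

model = QSMToAmyloid()
qsm = torch.rand(1, 1, 32, 32, 32)    # toy "QSM volume"
pet = torch.rand(1, 1, 32, 32, 32)    # toy "amyloid PET" target
pred = model(qsm)

# Intensity-consistency term only; SSIM and VGG-perceptual losses would
# be added with the same reduction via library implementations.
loss = nn.functional.l1_loss(pred, pet)
loss.backward()
print(pred.shape)
```

The key property checked here is that the prediction keeps the input's voxel grid, so the output really is a continuous voxel-wise 3D estimate.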
1. Overview of the ADNI Dataset
| Item | Description |
|---|---|
| Study Name | Alzheimer's Disease Neuroimaging Initiative (ADNI) |
| Start Year | 2004 |
| Current Phase | ADNI4 |
| Phases | ADNI1, ADNIGO, ADNI2, ADNI3, ADNI4 |
| Study Type | Longitudinal, multi-center, multi-modal |
| Primary Goal | Early detection and progression modeling of Alzheimer's disease |
| Access | IDA portal (login + Data Use Agreement required) |
2. Participant Identifiers and Longitudinal Indexing
| Field | Description | Usage |
|---|---|---|
| PTID | Participant ID (format: XXX_S_XXXXX) | Primary key across all tables |
| RID | Numeric subject ID derived from PTID | Easier joins and indexing |
| VISDATE / EXAMDATE / SCANDATE | Visit / exam / scan date | Temporal alignment for longitudinal analysis |
| Phase Indicator | ADNI1 / GO / 2 / 3 / 4 | Cohort and protocol stratification |
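The identifier scheme above is what makes cross-table joins possible. A minimal sketch of joining a diagnosis table to a scan table on RID plus nearest exam date, assuming pandas (the toy rows are mine, not real ADNI data):

```python
import pandas as pd

# Hypothetical toy tables; real ADNI CSVs expose these key columns.
dx = pd.DataFrame({
    "RID": [2, 2, 5],
    "EXAMDATE": pd.to_datetime(["2011-01-10", "2012-01-15", "2011-03-01"]),
    "DX": ["CN", "MCI", "AD"],
})
scans = pd.DataFrame({
    "RID": [2, 5],
    "SCANDATE": pd.to_datetime(["2012-01-20", "2011-02-25"]),
})

# Nearest-date join per subject; merge_asof needs sorted date keys.
dx = dx.sort_values("EXAMDATE")
scans = scans.sort_values("SCANDATE")
merged = pd.merge_asof(
    scans, dx,
    left_on="SCANDATE", right_on="EXAMDATE",
    by="RID", direction="nearest",
)
print(merged[["RID", "SCANDATE", "DX"]])
```

`direction="nearest"` matters for longitudinal alignment: diagnosis visits and scan dates rarely coincide exactly, so each scan is labeled with the temporally closest diagnosis for that RID.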
3. Diagnostic Group Distribution
| Group | Description | Number of Subjects |
|---|---|---|
| CN | Cognitively Normal | 1,272 |
| SMC | Significant Memory Concern | 97 |
| EMCI | Early Mild Cognitive Impairment | 315 |
| LMCI | Late Mild Cognitive Impairment | 180 |
| MCI (total) | EMCI + LMCI | 495 |
| AD | Alzheimer's Disease | 523 |
| Total Patients | All non-CN subjects | 1,115 |
4. Neuroimaging Data (Raw and Processed)
| Modality | Access Path | Format | Dimensionality | Typical Use |
|---|---|---|---|---|
| Structural MRI | Advanced Image Search | DICOM / NIfTI | 3D | Brain atrophy analysis, 3D CNN |
| Functional MRI | Advanced Image Search | NIfTI | 4D | Functional connectivity |
| Amyloid PET | Advanced Image Search | DICOM / NIfTI | 3D | Amyloid burden estimation |
| FDG-PET | Advanced Image Search | DICOM / NIfTI | 3D | Glucose metabolism analysis |
| Pathology Slides | Advanced Image Search | Whole-slide images | 2D/3D | Neuropathological validation |
Brain Signals (Why Median + MAD)
| Property | Meaning | Impact |
|---|---|---|
| Non-stationary | The mean varies across time and sessions | Mean and standard deviation become unstable |
| Heavy-tailed distribution | Strong artifacts or high-amplitude spikes | Standard deviation is inflated by outliers |
| Weak signal + mixed noise | High-frequency oscillations + low-frequency drift | Large mean variation, clear skewness |
| Inter-channel variation | Each sensor has different sensitivity | Requires independent per-channel normalization |
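The table above motivates robust per-channel normalization. A minimal numpy sketch (the function name `robust_normalize` and the 1.4826 Gaussian-consistency factor are my choices):

```python
import numpy as np

def robust_normalize(x, axis=-1, eps=1e-9):
    """Robust z-score: (x - median) / (1.4826 * MAD).

    Median and MAD are far less sensitive to spikes than mean/std,
    which matters for heavy-tailed, non-stationary brain signals;
    keepdims lets the same call normalize each channel independently.
    """
    med = np.median(x, axis=axis, keepdims=True)
    mad = np.median(np.abs(x - med), axis=axis, keepdims=True)
    return (x - med) / (1.4826 * mad + eps)

# A single artifact spike barely shifts the robust statistics:
spiky = np.sin(np.linspace(0, 10, 1000))
spiky[500] = 100.0          # high-amplitude outlier
z = robust_normalize(spiky)
print(np.median(z))          # center stays at 0
```

With mean/std normalization the spike would inflate the scale and squash the real signal; with median/MAD the spike remains a visible outlier while the baseline keeps a stable center.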
References
- CVPR
- If a team / mentor can tolerate you saying "This has no information" and listens carefully to the rest of your sentence, then it is a very good peer / team. - DiffusionDrive, CVPR highlight 2025.
- Disentangling Monocular 3D Object Detection, ICCV 2019.
- The core method of 3D perception that does not rely on LiDAR laid the foundation for many subsequent 3D Tracking and 3D MOT vision methods.
- Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction, CVPR 2019.
- Monocular Quasi-Dense 3D Object Tracking, 2021.
- Multi-Level Fusion based 3D Object Detection from Monocular Images, CVPR 2018.
- Development of the Nervous System, Prof. Dr. Stoeckli Esther
Multi-sensor Input Fusion From Space, Safety Detection
- 1960 - A New Approach to Linear Filtering and Prediction Problems, Kalman
- 2005 - Probabilistic Robotics, Multi-sensor Input Fusion
- 2025 - ACDC Dataset, training and testing semantic perception on adverse visual conditions
- 2019 - Calibration Wizard: A Guidance System for Camera Calibration Based on Modelling Geometric and Corner Uncertainty
Topics
0. Sensor Modalities and Data Types
| Modality | Sensor Type | Data Representation |
|---|---|---|
| Optical | Visible-light satellite camera | 3-channel RGB image (8-bit) |
| SAR | Synthetic Aperture Radar | 1-channel SAR image (32-bit float) |
1. Maritime Search and Rescue
Optical satellite images + SAR satellite images
→ Ship Detection
→ Ship Re-Identification (ReID)
→ Trajectory generation & route prediction
| Platform | Strength | Fundamental Limitation |
|---|---|---|
| GEO satellites | Wide coverage, high temporal resolution | Low spatial resolution |
| Video satellites | High spatial & temporal resolution | Short duration, small coverage |
| AIS-based systems | Accurate identity info | Only works for cooperative targets |
| Axis | Examples |
|---|---|
| Sensors | Optical, SAR, LiDAR, multispectral |
| Tasks | Detection, ReID, tracking, mapping |
| Scale | Local → Global |
| Time | Snapshot → Long-term monitoring |
2. Input Data Type
| Modality | Data Type | Format |
|---|---|---|
| Optical | RGB image | 3-channel, 8-bit TIF |
| SAR | Radar backscatter | 1-channel, 32-bit float TIF |
| Geometry | Ship size (derived) | Numeric vector (length, width, aspect ratio) |
3. Fusion Space
Optical image ──┐
                ├── Dual-head tokenizer → Shared Transformer Encoder → Unified embedding
SAR image ──────┘
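The fusion step above can be sketched in PyTorch. All module names and sizes below are my own illustrative choices, not a published model; the point is one tokenizer head per modality (3-channel optical vs. 1-channel SAR) feeding a shared Transformer:

```python
import torch
import torch.nn as nn

class DualHeadFusion(nn.Module):
    """Modality-specific patch tokenizers -> one shared Transformer encoder."""
    def __init__(self, dim=64, patch=16):
        super().__init__()
        # Separate heads: optical is 3-channel RGB, SAR is 1-channel float.
        self.opt_tok = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.sar_tok = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, optical, sar):
        t_opt = self.opt_tok(optical).flatten(2).transpose(1, 2)  # (B, N, dim)
        t_sar = self.sar_tok(sar).flatten(2).transpose(1, 2)
        tokens = torch.cat([t_opt, t_sar], dim=1)   # shared token space
        emb = self.encoder(tokens)                  # unified embedding
        return emb.mean(dim=1)                      # pooled descriptor

model = DualHeadFusion()
opt = torch.rand(2, 3, 64, 64)   # toy optical batch
sar = torch.rand(2, 1, 64, 64)   # toy SAR batch
out = model(opt, sar)
print(out.shape)
```

Keeping the encoder shared while only the tokenizers are modality-specific is what yields a single embedding space usable for downstream ReID.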
4. Output Data
| Stage | Output Used |
|---|---|
| ReID | Feature distance matrix |
| Tracking | Identity association |
| Trajectory | Time-ordered identity matches |
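The ReID-to-tracking handoff in the table can be sketched as a cosine distance matrix followed by identity association via the Hungarian algorithm (scipy); the toy embeddings are mine:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cosine_distance(a, b):
    """Pairwise cosine distance between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T

# Toy ReID embeddings: 3 ships at time t, 3 detections at time t+1.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query   = np.array([[0.1, 1.0], [0.9, 0.9], [1.0, 0.1]])

D = cosine_distance(gallery, query)       # ReID: feature distance matrix
rows, cols = linear_sum_assignment(D)     # Tracking: identity association
matches = list(zip(rows.tolist(), cols.tolist()))
print(matches)
```

Repeating the association per frame and chaining the matched identities over time yields the time-ordered identity matches the trajectory stage consumes.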
A Dynamic Camera with Multi-modal Input Signal Fusion
Human perception
┌────────────────────┐
│ Vestibular         │
│ Vision             │
└────────────────────┘
          ▲
          │
┌─────────┴──────────────────┐
│ Wearable System Estimation │
└─────────┬──────────────────┘
          │
┌────────┬────────┬─────────┬─────────┬────────┐
│ Camera │  IMU   │   Eye   │  Depth  │ Others │
│        │        │ tracker │  / ToF  │        │
└────────┴────────┴─────────┴─────────┴────────┘
Core Formulation: Bayesian Multi-Modal Sensor Fusion
Latent State Definition
- At time step t, the latent state is defined as $x_t = \{\, T_t, \theta, \psi_t \,\}$,
- where $T_t$ denotes the device pose, $\theta$ represents the calibration parameters shared across time, and $\psi_t$ denotes user-centric latent variables.
Multi-Modal Observations
- Given heterogeneous sensor measurements at time t: $z_t = \{\, z_t^{cam}, z_t^{imu}, z_t^{eye} \,\}$,
- where observations are obtained from the camera, IMU, and eye-tracking modalities.
Bayesian Fusion Objective
- Multi-modal fusion is defined as inference over the joint posterior $p(x_{1:T} \mid z_{1:T})$.
- Using the Markov assumption and conditional independence of observations, the posterior factorizes as $p(x_{1:T} \mid z_{1:T}) \propto \prod_{t=1}^{T} p(z_t \mid x_t)\, p(x_t \mid x_{t-1})$.
Multi-Modal Likelihood Factorization
- Assuming conditional independence between sensor modalities given the latent state: $p(z_t \mid x_t) = p(z_t^{cam} \mid x_t)\, p(z_t^{imu} \mid x_t)\, p(z_t^{eye} \mid x_t)$
State Transition Model
- The temporal evolution of the latent state is modeled as $p(x_t \mid x_{t-1}) = p(T_t \mid T_{t-1})\, p(\psi_t \mid \psi_{t-1})\, p(\theta)$,
- where $\theta$ is treated as a time-invariant latent variable and $p(\theta)$ enforces temporal consistency of the calibration parameters.
Interpretation
- Fusion thus corresponds to Bayesian state estimation under uncertainty, where heterogeneous sensor observations impose probabilistic constraints on a shared latent state evolving over time. Calibration parameters are inferred jointly with pose and user states, enabling online self-calibration.
Sensor Models
- $z_t^{imu} = h_{imu}(T_{t-1}, T_t) + \epsilon_{imu}$
- $z_t^{cam} = h_{cam}(T_t, \theta) + \epsilon_{cam}$
- $z_t^{eye} = h_{eye}(T_t, \psi_t) + \epsilon_{eye}$
Filtering Approximation
For online inference, we approximate the posterior using Bayesian filtering.
- Prediction: $p(x_t \mid z_{1:t-1}) = \int p(x_t \mid x_{t-1}) p(x_{t-1} \mid z_{1:t-1}) dx_{t-1}$
- Update: $p(x_t \mid z_{1:t}) \propto p(z_t \mid x_t) p(x_t \mid z_{1:t-1})$
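For linear-Gaussian models the prediction/update recursion above reduces to the Kalman filter. A minimal 1-D numpy sketch (the random-walk state model and the noise values q, r are illustrative assumptions, not from the formulation above):

```python
import numpy as np

def kalman_step(mu, var, z, q=0.01, r=0.5):
    """One Bayesian filtering step for a 1-D random-walk state.

    Prediction: p(x_t | z_{1:t-1}) -- mean carried over, variance
    grows by process noise q.
    Update: p(x_t | z_{1:t}) -- fold in measurement z (noise var r).
    """
    mu_pred, var_pred = mu, var + q           # prediction
    k = var_pred / (var_pred + r)             # Kalman gain
    mu_post = mu_pred + k * (z - mu_pred)     # update: mean
    var_post = (1.0 - k) * var_pred           # update: variance
    return mu_post, var_post

# Fuse 200 noisy measurements of a constant true state x = 1.0.
rng = np.random.default_rng(0)
mu, var = 0.0, 1.0                            # vague prior
for _ in range(200):
    z = 1.0 + rng.normal(0.0, np.sqrt(0.5))
    mu, var = kalman_step(mu, var, z)
print(mu, var)
```

Each sensor modality would contribute one such Gaussian update to the shared state per step, which is exactly the "multiple Gaussian constraints on the same state" picture below.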
Multiple sensors = multiple Gaussian constraints on the same state

                  z_t^cam
            (camera likelihood)
                     │
                     ▼
                 ┌────────┐
z_t^imu ────────▶│  x_t   │◀──────── z_t^eye
(IMU likelihood) │ latent │ (eye-tracking likelihood)
                 │ state  │
                 └────────┘
4D and LiDAR Free
- 2024 - Interactive4D: Interactive 4D LiDAR Segmentation
- 4D Lidar L1 Application Scenarios - Robots - Unitree
- Aeva - 4D LiDAR for Autonomous Navigation - Auto Driving - beyond Beam
- A Digital Geneva / Zurich