BrainHemNet

Automated Intracranial Hemorrhage Detection and Segmentation using DINOv3 and SAM2

Sep 7th 2025

Authors: Mohsen Mostafa Sayed

Abstract:
Intracranial hemorrhage (ICH) is a life-threatening medical emergency requiring rapid diagnosis. This paper presents BrainHemNet, a deep learning framework that automates the detection and pixel-level segmentation of ICH in non-contrast head CT scans. Our method combines the self-supervised visual features of DINOv3 for anomaly detection with the zero-shot segmentation capabilities of the Segment Anything Model 2 (SAM2). The pipeline first extracts feature representations from CT slices using a pre-trained DINOv3 model. A Principal Component Analysis (PCA)-based anomaly detector is then fit on features from normal scans to identify deviations indicative of hemorrhage. Finally, the resulting anomaly heatmap is used to generate automatic prompts for SAM2, which segments the hemorrhagic region without any manual intervention. Applied to the RSNA Intracranial Hemorrhage Detection dataset, BrainHemNet identifies and segments hemorrhages in qualitative examples, offering a step towards automated, explainable AI-assisted diagnosis in neuroradiology.

Keywords: Intracranial Hemorrhage, Computer-Aided Diagnosis, Self-Supervised Learning, Foundation Models, DINOv3, Segment Anything Model, Medical Image Segmentation, Anomaly Detection.

Methodology: Detailed Code Explanation

The key components of the code are described below.

1. Data Preprocessing & Handling (read_dicom_image function)

Medical imaging data, particularly DICOM files, are highly heterogeneous. A major challenge addressed in this work is handling unusual and problematic DICOM formats, such as arrays with shape (1, 1, 3) and dtype int16, which contain a single pixel's RGB values. Our preprocessing pipeline includes:

Robust DICOM Reading: The read_dicom_image function incorporates special logic to convert these single-pixel representations into standard 2D images.
Medical Windowing: A Hounsfield Unit (HU) window is applied to each CT slice (e.g., brain window: width 80, level 40, which covers 0 to 80 HU) to optimize contrast for soft tissue and hemorrhage visualization.
Normalization: Images are normalized to a [0, 1] range and resized to a fixed input size of 512x512 pixels.
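
The windowing and normalization steps can be sketched as follows. This is a minimal illustration, not the code of read_dicom_image: it assumes the slice has already been converted to Hounsfield Units, the function names apply_window and resize_nearest are hypothetical, and nearest-neighbor resizing stands in for whatever interpolation the original pipeline uses.

```python
import numpy as np

def apply_window(hu, level=40.0, width=80.0):
    """Clip a HU image to [level - width/2, level + width/2] and scale to [0, 1]."""
    lo = level - width / 2.0
    hi = level + width / 2.0
    clipped = np.clip(np.asarray(hu, dtype=np.float64), lo, hi)
    return (clipped - lo) / (hi - lo)

def resize_nearest(img, size=(512, 512)):
    """Nearest-neighbor resize of a 2D array (assumed stand-in for a real resizer)."""
    img = np.asarray(img)
    h, w = img.shape
    # Map each output row/column back to a source row/column index.
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows[:, None], cols]

# Example: air (-1000 HU) maps to 0, window centre (40 HU) to 0.5, bone to 1.
slice_hu = np.array([[-1000.0, 0.0], [40.0, 3000.0]])
windowed = apply_window(slice_hu)
model_input = resize_nearest(windowed, (512, 512))
```

With the brain window, everything below 0 HU collapses to 0 and everything above 80 HU to 1, so the available contrast is spent on the range where acute blood and soft tissue lie.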

2. Feature Extraction with DINOv3

We use a DINO-family self-supervised vision transformer as the feature extractor. Although the pipeline is described in terms of DINOv3, the checkpoint identifier used in the code, facebook/dinov2-base, refers to a DINOv2 model. Its pre-trained weights provide rich, contextual representations of image patches without requiring task-specific training from scratch.

Process: Each preprocessed CT slice is fed through the DINOv3 model.
Output: We extract the last_hidden_state features, which encapsulate contextual information for each image patch. These features are then pooled (average pooling) to form a global image descriptor vector for anomaly detection.
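
The pooling step can be illustrated on a plain array. In this sketch the input stands in for the last_hidden_state tensor of shape (batch, tokens, dim); the function name pool_patch_tokens is hypothetical, and dropping the first token assumes the model prepends a class token, which may or may not match what the original code does before averaging.

```python
import numpy as np

def pool_patch_tokens(last_hidden_state, has_cls_token=True):
    """Return (patch_tokens, global_descriptor) from a (batch, tokens, dim) array."""
    tokens = np.asarray(last_hidden_state, dtype=np.float64)
    if has_cls_token:
        # Assumption: token 0 is a class token, not an image patch.
        tokens = tokens[:, 1:, :]
    # Average over the patch axis to get one descriptor per image.
    return tokens, tokens.mean(axis=1)

# Synthetic stand-in: 2 images, 1 class token + 16 patches, 8 feature dimensions.
fake_hidden = np.random.default_rng(0).normal(size=(2, 17, 8))
patches, descriptor = pool_patch_tokens(fake_hidden)
```

Keeping the per-patch tokens alongside the pooled descriptor is convenient because the same features can later be scored patch by patch.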

3. Unsupervised Anomaly Detection

To identify hemorrhages as anomalies, we employ an unsupervised learning approach:

Training on Normals: An anomaly detector is fit only on features from scans labeled as normal (no hemorrhage). This teaches the model the distribution of a "healthy" feature space.
PCA-Based Reconstruction: The core of the detector is a Principal Component Analysis (PCA) model with a reduced number of components (n_components=5). The anomaly score for a new image is calculated as the mean squared reconstruction error after projecting its features onto the principal components learned from normal data. A high error indicates a significant deviation from the normal feature distribution, signaling a potential hemorrhage.
Explainable Heatmaps: The method generates an anomaly heatmap by calculating the reconstruction error for each individual image patch, providing a visual explanation for the model's decision, as shown in the result figure.
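
A minimal sketch of the PCA reconstruction-error detector is given below. It uses a NumPy SVD in place of a library PCA class so that it is self-contained; the class name PCAAnomalyDetector and the helper patch_heatmap are hypothetical, and the original implementation may differ in details such as feature scaling.

```python
import numpy as np

class PCAAnomalyDetector:
    """Scores samples by mean squared PCA reconstruction error."""

    def __init__(self, n_components=5):
        self.n_components = n_components

    def fit(self, normal_features):
        X = np.asarray(normal_features, dtype=np.float64)
        self.mean_ = X.mean(axis=0)
        # Rows of vt are principal directions, ordered by explained variance.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def score(self, features):
        X = np.atleast_2d(np.asarray(features, dtype=np.float64)) - self.mean_
        # Project onto the retained components, then map back to feature space.
        recon = X @ self.components_.T @ self.components_
        return np.mean((X - recon) ** 2, axis=1)

def patch_heatmap(detector, patch_features, grid_shape):
    """Score each patch separately and arrange the errors on the patch grid."""
    return detector.score(patch_features).reshape(grid_shape)

# Synthetic example: "normal" features lie in a 5-dimensional subspace.
rng = np.random.default_rng(0)
basis = rng.normal(size=(5, 32))
normal = rng.normal(size=(200, 5)) @ basis
detector = PCAAnomalyDetector(n_components=5).fit(normal)
outlier = normal[0] + 5.0 * rng.normal(size=32)
normal_scores = detector.score(normal)
outlier_score = detector.score(outlier)[0]
heatmap = patch_heatmap(detector, rng.normal(size=(16, 32)), (4, 4))
```

In this toy setup the normal samples are reconstructed almost exactly, while the perturbed sample leaves a large residual; on real features the separation would be less clean and a decision threshold would need to be chosen on held-out data.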

4. Prompt Engineering & Segmentation with SAM2

The generated anomaly heatmap is used to automatically guide the Segment Anything Model 2 (SAM2):

Prompt Generation: The generate_sam_prompts_from_heatmap function thresholds the heatmap and uses computer vision techniques (morphological operations, contour detection) to derive two types of prompts:
- Point Prompts: Foreground (green) points are sampled along high-anomaly contours. Background (red) points are sampled from low-anomaly regions.
- Box Prompts: Bounding boxes are generated around large anomalous regions.
Zero-Shot Segmentation: These automated prompts are fed to SAM2, which performs a zero-shot segmentation, outputting a precise binary mask of the suspected hemorrhagic area. This leverages SAM2's general-purpose segmentation capability without any fine-tuning on medical data.
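
The idea of turning a heatmap into prompts can be sketched in a simplified form. This is not the generate_sam_prompts_from_heatmap function: it omits the morphological operations and contour detection, draws a single box around all above-threshold pixels, and picks foreground and background points as the highest and lowest scoring pixels. The function name, the quantile threshold, and the defaults are assumptions; points are given as (x, y) with label 1 for foreground and 0 for background, and the box as [x_min, y_min, x_max, y_max], which is the layout SAM-style predictors commonly expect.

```python
import numpy as np

def prompts_from_heatmap(heatmap, quantile=0.95, n_points=3):
    """Derive point and box prompts from an anomaly heatmap (simplified)."""
    h = np.asarray(heatmap, dtype=np.float64)
    # Keep pixels at or above the chosen quantile of anomaly scores.
    mask = h >= np.quantile(h, quantile)
    ys, xs = np.nonzero(mask)
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])

    order = np.argsort(h, axis=None)      # flat indices, ascending score
    fg_idx = order[-n_points:][::-1]      # highest scores first
    bg_idx = order[:n_points]             # lowest scores

    def to_xy(flat_idx):
        rows, cols = np.unravel_index(flat_idx, h.shape)
        return np.stack([cols, rows], axis=1)  # (x, y) order

    points = np.concatenate([to_xy(fg_idx), to_xy(bg_idx)])
    labels = np.concatenate([np.ones(n_points, dtype=int),
                             np.zeros(n_points, dtype=int)])
    return points, labels, box

# Toy heatmap with one anomalous blob and a single peak inside it.
toy = np.zeros((32, 32))
toy[10:14, 20:25] = 1.0
toy[12, 22] = 2.0
points, labels, box = prompts_from_heatmap(toy, quantile=0.99, n_points=3)
```

A fuller version would label connected components so that each large anomalous region gets its own box, as the description above indicates.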

5. Result Visualization & Interpretation

The final output provides a comprehensive visual analysis, crucial for clinical interpretability:

Original Image: The input CT slice.
Anomaly Heatmap: Highlights regions the model deems anomalous (hot colors = high anomaly score, suggesting possible hemorrhage).
SAM2 Prompts: Shows the automatically generated points and boxes used to guide the segmentation, enhancing trust and transparency.
Segmented Anomaly: The final output, with the hemorrhagic region overlaid on the original image.
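
The overlay in the last panel can be produced with simple alpha blending. The sketch below is one assumed way to do it, with a hypothetical function name, color, and opacity; the original visualization code may use a plotting library instead.

```python
import numpy as np

def overlay_mask(gray, mask, color=(1.0, 0.0, 0.0), alpha=0.4):
    """Blend a colored binary mask onto a grayscale image with values in [0, 1]."""
    # Replicate the grayscale channel to get an (H, W, 3) RGB image.
    rgb = np.repeat(np.asarray(gray, dtype=np.float64)[..., None], 3, axis=2)
    m = np.asarray(mask, dtype=bool)
    # Mix the mask color into masked pixels only.
    rgb[m] = (1.0 - alpha) * rgb[m] + alpha * np.asarray(color)
    return rgb

# Tiny example: a uniform gray image with one masked pixel.
gray = np.full((4, 4), 0.5)
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = True
blended = overlay_mask(gray, mask)
```

Pixels outside the mask are left unchanged, so the underlying CT intensities remain readable around the highlighted region.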

Result Figure Interpretation:
The provided result image illustrates the pipeline on a single case. The anomaly heatmap (second panel) shows a focused area of high activation, which corresponds to a hyperdense (bright) region on the CT scan, a classic signature of acute hemorrhage. SAM2 used the prompts generated from this heatmap to produce a segmentation (fourth panel) that outlines the bleed. This qualitative example supports the practical viability of the integrated DINOv3-SAM2 approach, although it does not replace quantitative validation.

Conclusion

The BrainHemNet framework demonstrates the powerful synergy between modern self-supervised vision models and foundational segmentation models. By using DINOv3 for feature extraction and anomaly detection and SAM2 for zero-shot segmentation guided by automated prompts, we create an effective, explainable, and computationally efficient pipeline for ICH analysis. This work paves the way for leveraging large foundation models in specialized medical imaging tasks without extensive domain-specific training, potentially improving the speed and accuracy of diagnosis in emergency settings. Future work will involve quantitative validation on a larger dataset and integration of classification for hemorrhage subtypes.

Code Availability: The full implementation code is available at: https://www.kaggle.com/code/babydriver1233/brain-diagnosis-using-dinov3-sam2