Improved understanding of disease heterogeneity has driven a steady increase in research on more “personalized and targeted therapy”, which has the potential to improve patient outcomes and quality of life through better target selection. Prostate cancer (PC) is an inherently heterogeneous disease with variable behavior, ranging from indolent to highly aggressive. PC is the most common malignancy in elderly men and the second leading cause of cancer death worldwide, with an incidence of about 1.3 million patients and 360,000 related deaths.1,2 The main therapeutic strategies for in-situ PC considered intermediate to high risk are radical prostatectomy and radiotherapy. For metastatic PC, systemic strategies (chemotherapy, hormone therapy, targeted therapy) are indicated.3–5 Despite the high success rate of primary definitive therapy, PC mortality is not decreasing. In fact, depending on the stage at diagnosis, 20–50% of patients experience biochemical recurrence (BCR).1–6 Imaging plays a fundamental role in both the primary diagnosis of PC and the early identification of relapse. However, conventional diagnostic techniques such as trans-rectal ultrasound, computed tomography (CT), magnetic resonance imaging (MRI), and bone scans have limitations.7,8 Therefore, in recent years, attention has focused on positron emission tomography (PET) combined with CT (PET/CT) or MRI (PET/MRI). In a single whole-body hybrid exam, PET enables assessment of the extent of disease, detection of metastases, and evaluation of functional and metabolic information not achievable with conventional techniques.9 Several promising radiopharmaceuticals are currently in use or under investigation for the assessment of PC characteristics and are introduced in the following section. 
Interpretation of these complex data may be further improved by artificial intelligence (AI), which, through several approaches (machine learning, radiomics, and deep learning), makes it possible to handle so-called “big data”, condense information, improve patient selection, and streamline the workflow.
MAIN PET RADIOPHARMACEUTICALS FOR INDIVIDUALIZED ASSESSMENT IN PC
The limited value of 2-deoxy-2-[18F]fluoro-D-glucose (FDG) for PC staging and early recurrence has been well established.10 However, its ability to detect dedifferentiated tumors with high glucose metabolism in patients with castration-resistant disease is bringing [18F]FDG PET back for PC patients, especially for accurate patient selection for internal radiotherapy.11 Choline is essential for the synthesis of phosphatidylcholine, a major constituent of the cell membrane. Phosphorylation by choline kinase is an important step in incorporating choline into phospholipids and is relevant for cell viability. Cancers, including PC, often show increased cellular transport and phosphorylation of choline. Several studies have examined the value of [18F] or [11C]Choline PET/CT, with interesting results for initial staging in high-risk patients12 and for the assessment of response to therapy.13 Several studies have also evaluated [18F] or [11C]Choline PET for restaging in patients with BCR.14 More recently, it was confirmed that the detection rate of [18F] or [11C]Choline PET/CT in PC restaging depends on the prostate-specific antigen (PSA) level, ranging from 36% for PSA<1 ng/mL to 73% for PSA>3 ng/mL.13 Acetate is a fatty acid precursor and a substrate for beta-oxidation. Cancer cells are generally characterized by overexpression of fatty acid synthase and an increased need for lipid precursors. 11C-labeled acetate has been used in several neoplasms,15 including PC, in which a correlation between its uptake and Gleason Score (GS)/aggressiveness has been demonstrated.16 [18F]-fluorocyclobutane-1-carboxylic acid (FACBC) is a synthetic amino acid probe that assesses the expression of L-amino acid transporters, which are overexpressed in several tumors, including PC. FACBC is mainly used in the United States. 
Its main advantages are the absent or reduced urinary uptake, owing to gastrointestinal/hepatic elimination, and the ability to improve the assessment of small lesions in the pelvis and the prostatic bed.17 The prostate-specific membrane antigen (PSMA), also known as glutamate carboxypeptidase II, is an enzyme bound to the cell membrane; PSMA-targeted PET is considered the state of the art for PC molecular imaging. PSMA is physiologically expressed in several tissues (e.g., brain, neural ganglia, small intestine, kidneys) and overexpressed in various pathologies such as PC, brain tumors, and Paget’s disease.18 PSMA ligands can be labeled with [68Ga] (e.g., [68Ga]PSMA11)19 or [18F] (e.g., [18F]DCFPyL, [18F]PSMA-1007), the latter allowing cyclotron production.20 [68Ga]PSMA has a higher detection rate in BCR patients than [18F]Choline: 50% vs 12% for PSA<0.5 ng/mL and 69%–86% vs 31%–57% for higher PSA values, respectively (Table 1).21
Despite the huge success of PSMA, up to 10% of PC tumors do not express PSMA; further alternatives are therefore still under investigation. Gastrin-releasing peptide receptors (GRPRs) bind bombesin (BBN), a neuropeptide mainly distributed in the peripheral nervous system and the gastrointestinal tract. GRPRs are overexpressed in several tumors such as PC, breast cancer, and small-cell lung cancer. Radiolabeled BBN analogs (targeting GRPRs) and PSMA ligands could allow selective imaging of PC using β+ emitting isotopes for PET (such as 68Ga or 18F) and open the possibility of peptide receptor radionuclide therapy (PRRT) using β− emitting isotopes in a theragnostic approach.22,23 One of the most promising antagonists is [68Ga]-labeled DOTA-4-amino-1-carboxymethyl-piperidine-d-Phe-Gln-Trp-Ala-Val-Gly-His-Sta-Leu-NH2 ([68Ga]RM2), studied in the literature since 2013, which reached a detection rate of 71.8% in patients with negative conventional imaging.24 Because PC is androgen dependent, imaging androgen receptor (AR) expression may offer wide and interesting potential for therapy assessment. Namely, PC growth is directly related to the levels of testosterone and its derivative dihydrotestosterone (DHT), and preliminary results have been described for imaging with [18F]FDHT.25 For the assessment of bone disease in PC, the bone scan has been used in the past despite its known limited sensitivity and specificity. A newer approach to assess the bone reaction to PC is [18F]NaF PET, which offers improved accuracy owing to higher resolution.26 An overview of currently used tracers for PC is shown in Figure 1.
ARTIFICIAL INTELLIGENCE APPROACHES
In medical imaging, artificial intelligence (AI) is a system’s ability to interpret data precisely, learn from them, and acquire knowledge to achieve specific goals and complete tasks through flexible adaptation. AI can handle far larger amounts of data than traditional statistical methods. Most AI systems are analytical and are classified as machine learning (ML) or deep learning (DL) techniques. ML studies algorithms that learn and improve from data: a model is built on a large data set and then applied to unseen data. Learning can be supervised (labels are needed to infer a regression or classification), unsupervised (patterns are found directly in unlabeled data), or semi-supervised (a small amount of labeled data combined with a large amount of unlabeled data). DL is a subset of ML in which neural networks are organized in multiple successive, interconnected layers. Namely, DL approaches simultaneously learn relevant features and prediction models from input images without the need for so-called “feature engineering” (i.e., computation and extraction of “custom-tailored” imaging variables). Because they use more layers than traditional approaches, they are well suited for large, diverse, complex data and for tasks such as image segmentation and classification.27 Radiomics is a promising methodology for the quantitative analysis and description of medical images using advanced mathematics and statistics. 
It aims to provide quantitative features that cannot be assessed by the human eye from biomedical images of different kinds, such as computed tomography (CT), magnetic resonance imaging (MRI), single-photon emission computed tomography (SPECT), and positron emission tomography (PET).28 In other words, radiomics assumes that even the smallest image constituent (i.e., voxel and/or pixel) may carry “features” of the tumor phenotype that are potentially related to tumor outcome and the patient’s response to therapy, thus reflecting the pathophysiological process and supporting medical decisions. The radiomics process can be summarized in five main steps, as described in Figure 2: a) acquisition/collection of images; b) pre-processing (registration, deconvolution, de-noising, and so on) and volume of interest (VOI) delineation; c) feature extraction; d) reduction and selection of radiomics features; e) predictive modeling using AI-based classifiers. Despite the great potential of these methods, it must be remembered that the analysis of grey levels of neighboring voxels in PET depends largely on imaging parameters, especially filters, voxel size, and scatter correction.29 Strict standardization of image reconstruction is therefore mandatory, and results need external validation to be meaningful.
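To make the feature extraction step c) more concrete, the following minimal Python sketch computes a few first-order features from the voxel values of a single VOI. It is an illustration under stated assumptions and not the method of any study cited here: the function name `first_order_features`, the example intensities, and the fixed number of bins are hypothetical, and dedicated radiomics toolboxes compute many more features, including texture features.

```python
import math
from collections import Counter


def first_order_features(voxels, n_bins=8):
    """Return a few first-order features for the voxel values of one VOI.

    `voxels` is a flat list of intensities (e.g., SUVs); `n_bins` is an
    assumed, fixed number of grey-level bins used for the entropy.
    """
    n = len(voxels)
    v_min, v_max = min(voxels), max(voxels)
    mean = sum(voxels) / n
    variance = sum((v - mean) ** 2 for v in voxels) / n

    # Discretize intensities into n_bins equal-width grey levels.
    # A uniform VOI has zero range, so fall back to a width of 1.0.
    width = (v_max - v_min) / n_bins or 1.0
    levels = [min(int((v - v_min) / width), n_bins - 1) for v in voxels]

    # Shannon entropy of the grey-level histogram (in bits).
    probabilities = [count / n for count in Counter(levels).values()]
    entropy = -sum(p * math.log2(p) for p in probabilities)

    return {"max": v_max, "mean": mean, "variance": variance, "entropy": entropy}


# Hypothetical VOI with four voxels, discretized into two grey levels.
features = first_order_features([1.0, 1.0, 3.0, 3.0], n_bins=2)
print(features)
```

In this sketch the entropy depends on the chosen number of bins, which is one small example of why acquisition, reconstruction, and discretization settings need to be standardized before features from different centers are compared.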
The strength of AI models is their ability to analyze huge amounts of data and generate predictions30,31; however, in real life, the large, well-annotated data sets needed for AI are often unavailable, and data sets that are too small are a potential source of error. Furthermore, harmonizing data from different centers is possible but difficult because of differing protocols, equipment, and software. Another AI application of paramount importance concerns VOI delineation, a major issue in medical image analysis32,33: manual delineation is characterized by high inter- and intra-observer variability, and for this reason an operator-independent, automatic segmentation system is nowadays mandatory.34 After segmentation, quantitative features can be extracted using an automated high-throughput analysis tool, as shown by several authors.35,36 In summary, the fundamental blocks of a radiomics workflow are: 1) target segmentation; 2) feature extraction; 3) feature selection; and 4) classification model. In PC molecular imaging, AI may improve several technical aspects of the daily workflow by enabling automatic radiopharmaceutical synthesis37 and by improving target delineation/segmentation,38 image registration,39 and attenuation correction procedures40–42; however, clinical AI applications still warrant investigation. Here we present a systematic review of the state-of-the-art literature on clinical applications of AI in PET-based molecular imaging of PC.
MATERIALS AND METHODS
We searched the PubMed, PMC, Scopus, Google Scholar, Embase, Web of Science, and Cochrane Library databases (up to October 2020), using the following as both free-text and MeSH terms: “prostate”, “radiomic”, “PET”, “convolutional”, “neural”, “network”, “machine”, “learning”. No language restriction was applied to the search, but only articles in English were reviewed. The systematic literature search returned 88 articles. Following the PRISMA flow-chart, after duplicate removal, 12 articles were selected according to their title and abstract, read in full, analyzed, and described in detail, following a previously described approach.43 We also checked the references of the included papers for further relevant articles.
MACHINE LEARNING AND RADIOMICS STUDIES
The publications on machine learning and radiomics presented here are summarized in Table 2.
One of the first publications on ML in prostate cancer imaging, by Gatidis et al. (2015), considered several parameters of [11C]Choline PET/MRI to improve the diagnostic accuracy of PC detection. In 16 patients, they included T2 signal intensities, apparent diffusion coefficients (ADC), parametric maps of the volume transfer (Ktrans) and extracellular volume (Kep) constants from dynamic contrast-enhanced (ce) MRI, and PET standardized uptake values (SUVs). A spatially constrained fuzzy c-means algorithm (sFCM) was applied to the single datasets, and the resulting labeled data were used to train a support vector machine (SVM) classifier. The accuracy and the false-positive and false-negative rates of the proposed algorithm were determined in comparison with manual tumor delineation or histopathology correlation in 5 of 16 patients. The combined sFCM/SVM algorithm yielded reliable classification results that were consistent with the histopathological reference standard and comparable to those of manual tumor delineation. In addition, sFCM/SVM generally performed better than unsupervised sFCM alone. The authors observed that accuracy improved as more imaging parameters (particularly SUVs) were used for clustering and SVM training.44 In another study, Perk et al. (2018) developed the first whole-body automatic bone-lesion classification tool for [18F]NaF PET using machine learning algorithms. A total of 598 lesions from 14 metastatic castration-resistant PC (mCRPC) patients were analyzed and classified by four nuclear medicine physicians (moderate agreement, Fleiss’ κ=0.53). A statistically optimized regional thresholding algorithm (SORT) and two simple threshold-based lesion-detection techniques (SUV>10 g/mL and SUV>15 g/mL) were assessed to perform a fully automated lesion classification. 
For each ROI in the image, 172 imaging features based on [18F]NaF PET, CT, and spatial probability were extracted and used as inputs for different machine-learning algorithms. The factors that most strongly affected classification performance were the ML algorithm (random forests, RF) and the lesion identification method. SORT (area under the curve [AUC]=0.95) yielded a superior classification performance compared with the simple cut-offs SUV>10 g/mL and SUV>15 g/mL (p<0.001). Therefore, this RF-based whole-body automatic bone-lesion classification technique enables a fully automated analysis of [18F]NaF PET/CT images for the detection of bone metastases.45 Recently, Alongi et al. (2020) retrospectively applied the CGITA toolbox to extract 106 radiomics features from 46 high-risk PC lesions assessed with [18F]Choline PET/CT. Using neighborhood component analysis (NCA) for feature selection, they identified 13 of the 106 features able to discriminate the disease status at follow-up; these reached the best performance in discriminant analysis (DA) classification (sensitivity [se]=74%, specificity [sp]=58%, and accuracy [acc]=66%) compared with the use of all features (se=40%, sp=52%, and acc=51%). The highest per-site performance of the 13 combined features in DA classification was described for lymph node lesions (se=87%, sp=91%, and acc=90%).46 For PSMA PET, one of the first machine-learning studies was published by Zamboglou et al., who aimed to assess the performance of [68Ga]PSMA PET/CT radiomics features for intraprostatic tumor discrimination, non-invasive GS characterization, and pelvic lymph node status. They prospectively enrolled 20 patients who underwent radical surgery for PC and used the coregistered histopathological gross tumor volume (GTV-Histo) as the gold standard, together with a manually created segmentation of the intraprostatic tumor volume on [68Ga]PSMA PET (GTV-Exp). 
The authors obtained 133 radiomics features from GTV-Histo and GTV-Exp and found that 76% and 81% of them, respectively, significantly discriminated between PC and non-PC tissue. Furthermore, the radiomics feature QSZHGE discriminated between GS 7 and GS≥8 on GTV-Histo (AUC=0.93) and between pN1 and pN0 disease on GTV-Histo (AUC=0.85).47 In a larger prospective cohort study (76 pre-surgical PC patients), Cysouw et al. (2021) assessed ML-based quantitative analysis of PET with an [18F]-labeled PSMA tracer ([18F]DCFPyL) to predict metastatic disease or high-risk pathological tumor features. The primary tumor was segmented using 50–70% peak isocontour thresholds; then, 480 radiomics features were extracted per tumor, and RF models were trained to predict lymph node involvement, presence of metastasis, extracapsular extension, and GS≥8. Model performance was validated using 50-times repeated 5-fold cross-validation, yielding the mean AUC. The models significantly predicted (p<0.01) lymph node involvement, presence of metastasis, extracapsular extension, and GS≥8, with mean AUCs of 0.86, 0.86, 0.76, and 0.81, respectively. For comparison, models were also trained using standard PET features (i.e., SUV, volume, total PSMA uptake), but their highest AUC was lower than those of the RF models.48 Moazemi et al. (2020) also applied ML analysis to [68Ga]PSMA PET/CT and compared its performance with that of physicians. The authors compared five ML methods (RF, Extra Trees, and SVMs with linear, polynomial, and RBF kernels) on 2419 hotspots from 72 patients, manually classified as benign or malignant. They extracted 40 textural features from both the PET and low-dose CT images (data from 48 of the 72 patients were used for training) and achieved an AUC of 98% (se=94%, sp=89%) with the ML-based algorithm. Of note, CT radiomics features increased the accuracy significantly. 
Therefore, the authors conclude that ML models based on [68Ga]PSMA-PET/CT can classify hotspots with high precision, comparable to that of experienced physicians.49
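The performance figures quoted in this section (se, sp, acc, and AUC) can be related to one another with a short Python sketch. The code below is a minimal illustration under stated assumptions, not the evaluation code of any cited study: the function names `confusion_metrics` and `auc`, the labels, the scores, and the 0.5 cut-off are all hypothetical. The AUC is computed here as the fraction of (malignant, benign) pairs in which the malignant case receives the higher score, with ties counted as one half.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = malignant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "se": tp / (tp + fn),            # true positive rate
        "sp": tn / (tn + fp),            # true negative rate
        "acc": (tp + tn) / len(pairs),   # overall fraction correct
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        (1.0 if p > n else 0.5 if p == n else 0.0)
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four hotspots (1 = malignant, 0 = benign).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed cut-off of 0.5

print(confusion_metrics(labels, predictions))
print(auc(labels, scores))
```

Sensitivity, specificity, and accuracy depend on the chosen cut-off, whereas the AUC summarizes performance over all cut-offs, which is why studies often report both.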
DEEP LEARNING STUDIES
The publications on deep learning presented here are summarized in Table 3.
One of the first clinical DL applications in PC was described by Rubinstein et al. (2019), who, in a small series of PC patients (n=20), presented a method for primary detection/localization of PC using pathology as the gold standard. In their prospective paper, the authors proposed an unsupervised learning approach able to extract three feature classes (statistical, kinetic biological, and deep features, the latter learned by a deep-stacked convolutional autoencoder) from 4D imaging data including dynamic PET, CT, and pathology. Then, anomalies classified as tumors are detected in the feature space using density estimation. The proposed algorithm generated promising results for the detection of large cancer foci on [11C]Choline PET images (AUC=0.899 vs AUC=0.812 using only the mean SUV features).50 In the study of Mortensen et al. (2019), a convolutional neural network (CNN) was trained on [18F]Choline PET/CT scans obtained before radical surgery in 45 patients to assess the feasibility of a fully automated AI-based PC quantification. The following values were calculated both automatically and manually: prostate volume, SUVmax, SUVmean, the volume of abnormal voxels (Volabn), and total lesion uptake (TLU). The weight of the prostate specimens ranged from 20 to 109, versus 31 to 108 for the CNN-estimated volume. The mean differences between CNN-based and manually derived PET measures of SUVmax, SUVmean, Volabn, and TLU were only 0.37, 0.08, 1.4, and 9.61, respectively. Therefore, the proposed automated CNN segmentation can provide volume and quantitative PET measures similar to manually derived values.51 A similar approach with a larger cohort was presented by Polymeri et al. (2020), again on [18F]Choline PET/CT, with a DL algorithm trained to determine the prostate volume. The manually segmented CT images of 100 patients were used for training, and validation was carried out in 45 patients with biopsy-proven hormone-naïve PC. 
The automated measurements of prostate volume were compared with manual ones performed independently by two observers. Then, PET/CT measurements of tumor burden, based on the volume and SUV of abnormal voxels, were calculated using a fixed SUV threshold of 2.65 and compared with manual measurements. A univariate Cox regression model evaluated the associations of the automated PET/CT biomarkers, age, PSA, and GS with overall survival (OS). The Sørensen-Dice index (SDI) between the automated and the manual volume segmentations was 0.78 and 0.79 for the two observers, respectively. The automated TLU and the ratio of the volume of abnormal voxels to total prostate volume were significantly associated with overall survival (p=0.02).52 Borrelli et al. (2021) used an AI-based tool for the detection of lymph node metastases, a challenging task with high inter-observer variability. The authors considered a cohort of 399 patients with biopsy-proven PC who had undergone staging [18F]Choline PET/CT before treatment. Of these, 319 and 80 were used to train and test the AI-based CNN, respectively. In the test set, the AI-based lymph node detection was compared with that of two independent readers. The AI-based tool detected more lymph node metastases than reader B (98 vs. 87; p=0.045, using reader A as reference) and performed similarly to reader A (90 vs. 87; p=0.63, using reader B as reference). Furthermore, the AI-based number of lymph node metastases, as well as PSA and curative treatment, was significantly associated with PC-specific survival.53 Based on [68Ga]PSMA11 PET/CT, Zhao et al. (2020) developed a deep neural network based on a triple-combining 2.5D U-Net to automatically characterize PC tumor burden. They assessed 193 metastatic PC patients with 1003 bone metastases, 626 lymph node metastases, and 127 local lesions. Their network simultaneously extracted features from the axial, coronal, and sagittal planes, mimicking the physicians’ workflow and aiming to reduce computational and memory requirements. 
Then, the fusion component of the tool synthesized all the information to predict the final characterization result by a majority voting strategy. The proposed segmentation tool achieved an accuracy of 99% for bone metastases and 89% for lymph node metastases.54 Another DL application, on [18F]FACBC pelvic images, was published by Lee et al. (2020), who described Stanford’s experience comparing different CNN architectures for discriminating normal from abnormal PET scans (n=251), based on the presence of recurrence and/or metastases, in 233 BCR PC patients. PET images were labeled as normal or abnormal using clinical reports as the gold standard. CNN models were trained using two different architectures: a 2D-CNN (ResNet-50) using a slice-based approach versus the same 2D-CNN plus a 3D-CNN (ResNet-14) using a hundred slices per PET image (case-based approach). The models’ performance was evaluated on independent test datasets. The sensitivity, specificity, and AUC of the 2D-CNN slice-based approach for discriminating abnormal images were 90.7%, 95.1%, and 97.1% (p<0.001), respectively. For the case-based approaches using the 2D-CNN and 3D-CNN architectures, the sensitivity, specificity, and AUC were 85.7%, 71.4%, and 75.0% (p=0.013) and 71.4%, 71.4%, and 69.9% (p=0.053), respectively. Therefore, a DL classifier using 2D slice-based prediction was superior to the case-based approach.55
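Several of the segmentation studies above report the Sørensen-Dice index (SDI) as a measure of overlap between an automated and a manual segmentation. As a minimal sketch, assuming each segmentation is represented as a set of voxel coordinates, the index can be computed as twice the number of shared voxels divided by the total number of voxels in both masks. The function name `dice_index` and the example masks are hypothetical and are not taken from any cited study.

```python
def dice_index(mask_a, mask_b):
    """Sørensen-Dice index between two segmentations.

    Each mask is an iterable of voxel coordinates, e.g. (z, y, x) tuples.
    Returns 1.0 for identical masks and 0.0 for masks with no overlap.
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks are treated as identical by convention
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical automated and manual prostate masks (three voxels each).
automated = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
manual = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}

print(round(dice_index(automated, manual), 3))  # two shared voxels out of 3 + 3
```

An index of 1 indicates perfect overlap and 0 indicates none, so values close to 0.8 correspond to substantial, but not complete, agreement between the two delineations.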
Machine learning (ML), radiomics, and deep learning (DL) approaches are receiving growing attention from the scientific community worldwide in different neoplasms.56–60 As mentioned above, there are several potential applications in the clinical assessment of PC patients, ranging from semi-automation of the technical aspects of image preparation, through image interpretation and the calculation of additional factors from data obtained during scanning, to prognosis prediction and risk-group selection. Furthermore, AI has the potential to optimize timing, reduce costs, and eliminate the subjective human factor responsible for intra-observer differences. In the theragnostic PC field, AI may also improve the workflow, as preliminarily described in the literature.61,62 Limited data availability, privacy concerns, and the lack of transparent, standardized, and universally agreed procedures are limiting the development of AI approaches. However, there are preliminary attempts to overcome these limitations,63 and fully automated processing and high-level computer interpretation of imaging are nowadays becoming a reality. AI applications will be essential to handle the vast amounts of quantitative data generated for each exam and to integrate them with clinical data, facilitating a shift towards more personalized medicine.
Personalized diagnostic approaches are becoming increasingly available for the assessment of prostate cancer at different stages, with a growing number of tracers that image specific features of the disease.
There is growing evidence for potential applications of artificial intelligence (AI) in the clinical assessment of prostate cancer.
Despite some limitations, AI may optimize the workflow, reduce costs, and eliminate the subjective human factor in the analysis of complex molecular imaging data sets.
AI applications will help to construct large, standardized databases with complex information (clinical, imaging, laboratory), an essential step toward creating and training automated diagnostic/prognostic models that can help clinicians make unbiased and faster decisions aimed at personalized healthcare.
Study concept and design: RL and IAB
Literature search: RL and AC
Articles screening: AI and SB
Figures: RL, AI, AC, and IAB
Drafting of the manuscript: RL, AI, AC, and IAB
Critical revision of the manuscript: AI, AC, SB, IAB
Project supervision: AI, SB, IAB
ACKNOWLEDGMENTS AND FUNDING
This study was partially supported by a grant from the “Sick Foundation” and the “Jimmy Wirth Foundation”, whom we sincerely thank.
CONFLICT OF INTEREST
The authors declare that the review was done in the absence of any commercial or financial relationships that could be a potential conflict of interest.