Identification of Autism Spectrum Disorder with Salivary RNA
Objectives: This study characterized the oral micro-transcriptome of 179 children with autism spectrum disorder (ages 2-6 years) and 106 age-matched controls with typical development, to establish a panel of small RNAs with screening potential in autism spectrum disorder.
Methods: A prospective case-control design compared salivary RNA profiles of participants with autism spectrum disorder (ASD; n=179) and typical development (TD; n=106). ASD status was established by physician DSM-5 diagnosis, and autistic traits were quantified with the Autism Diagnostic Observation Schedule, Second Edition. TD status was confirmed by physician assessment at a regularly scheduled well-child visit. Adaptive behaviors were assessed in all participants with the Vineland Adaptive Behavior Scales, Second Edition. Salivary RNA was collected in a non-fasting state with P157 swabs (DNA Genotek, Ottawa, Canada) following an oral tap-water rinse. RNA was quantified by high-throughput sequencing on a NextSeq 500 instrument (Illumina, San Diego, California). Human RNA reads were aligned in Partek Flow to RefSeq Transcripts v82 and miRBase v21 with the SHRiMP2 algorithm. Microbial RNA reads were aligned to the human microbiome database using K-SLAM. RNA features with read counts ≥10 in ≥10% of samples were tested for differential expression with the Mann-Whitney U test, and RNA profiles were visualized with partial least squares discriminant analysis (PLSDA). The 35 factors most important to the PLSDA vector projection were used to construct a logistic regression model in the first half of samples (ASD=89, TD=53). This model was then tested in the remaining half of naïve “hold-out” samples (ASD=90, TD=53).
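The expression filter and group comparison described in the Methods can be illustrated with a minimal sketch. It assumes a simple in-memory table of read counts. The function names, feature names, and counts are hypothetical and are not taken from the study pipeline. Only the U statistic is computed here; a p-value would come from a statistics library.

```python
def passes_prevalence_filter(counts, min_count=10, min_fraction=0.10):
    """True if at least `min_fraction` of samples have >= `min_count` reads."""
    if not counts:
        return False
    n_expressed = sum(1 for c in counts if c >= min_count)
    return n_expressed / len(counts) >= min_fraction


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs (a, b) with a > b, ties counted as 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical toy counts: 10 samples per feature (not study data).
toy_counts = {
    "feature_a": [0, 0, 0, 0, 0, 0, 0, 0, 11, 12],   # 2/10 samples >= 10: kept
    "feature_b": [9, 9, 9, 9, 9, 9, 9, 9, 9, 9],     # 0/10 samples >= 10: dropped
    "feature_c": [50, 40, 0, 0, 30, 0, 0, 0, 0, 0],  # 3/10 samples >= 10: kept
}
kept = sorted(name for name, c in toy_counts.items() if passes_prevalence_filter(c))

# Hypothetical per-group expression values for one retained feature.
asd_values = [12, 30, 25]
td_values = [5, 12, 8]
u_stat = mann_whitney_u(asd_values, td_values)
print(kept, u_stat)
```

The pairwise form of the U statistic is quadratic in sample size, which is acceptable for illustration. A rank-based implementation would be preferable for 285 samples across more than a thousand features.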
Results: Among the 285 samples, 1343 RNA factors were explored: 337 microRNAs, 85 small nucleolar RNAs, 170 mRNAs, and 751 microbial taxa. Seven RNA factors displayed significant differences between the ASD and TD groups (fold change >1.5; false discovery rate <0.1). A PLSDA employing all 1343 RNA factors and 11 medical/demographic characteristics distinguished ASD from TD in two dimensions while accounting for 7.5% of the variance in the dataset (Figure 1). The 35 factors with the highest variable importance in projection scores on PLSDA were used to create a logistic regression model of ASD status. This model included 14 microRNAs, 4 mRNAs, 13 microbial RNAs, and 1 small nucleolar RNA, while controlling for sex, disordered sleep, and gastrointestinal disturbance. On receiver operating characteristic curve analysis, it demonstrated an area under the curve (AUC) of 0.830 (95% CI: 0.733-0.913) in the training set and an AUC of 0.895 in the hold-out set (Figure 2).
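For readers less familiar with the AUC metric reported above, the following sketch computes a trapezoidal area under a receiver operating characteristic curve from predicted scores. It is an illustration only: the function name `roc_auc`, the labels, and the scores are hypothetical and do not reproduce the study's model or data.

```python
def roc_auc(labels, scores):
    """Trapezoidal area under the ROC curve; labels are 1 (case) or 0 (control)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    # Sort by descending score so each step lowers the decision threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    auc = 0.0
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before adding a ROC point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        # Area of the trapezoid between the previous and current ROC points.
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return auc


# Hypothetical hold-out predictions (1 = ASD, 0 = TD); not study data.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.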