Use of Data Linkage Methods to Investigate Healthcare Interactions in Individuals Who Self-harm and Die by Suicide: A Scoping Review
Abstract
Objectives
In this review, the primary objective was to comprehensively summarize and evaluate the themes and analytical strategies of studies that used data linkage methods to examine the healthcare engagement of individuals with self-harming and suicidal tendencies. Additionally, the review sought to identify gaps in the existing literature and suggest directions for future research in this area.
Methods
The PubMed, PsycINFO, and Scopus databases were searched. Using a scoping review methodology, we analyzed 27 papers.
Results
One particularly common data source is the routine information collected by government agencies. However, some studies supplement this data with newly collected information. Compared to other research methods, data linkage offers the advantage of incorporating participants from diverse backgrounds into the analysis. Most relevant studies using data linkage methods have primarily focused on identifying socio-demographic correlates of self-harm, suicide deaths, and healthcare interactions. Additionally, some studies have used cluster analysis to identify patterns of healthcare utilization within affected populations. Certain papers have employed unique methods to measure self-harm and healthcare interactions, while one study utilized a moderator analytical approach.
Conclusions
Data linkage offers a promising approach for researching the dynamics between self-harm, suicide, and healthcare contact. A notable challenge, however, is the focus of most studies on the associations between socio-demographic factors and the risks of self-harm and suicide.
INTRODUCTION
The healthcare interactions of individuals at risk of self-harm and suicide have been extensively studied. This focus stems from the recognition that the effectiveness of evidence-based treatments for self-injury and suicide risk depends on individuals actively seeking professional help within the healthcare system [1]. Consequently, research on healthcare engagement within this vulnerable population may necessitate the use of innovative research methods, such as data linkage. Data linkage involves the combination of information about an individual from at least 2 different databases [2,3]. This approach enables the integration of healthcare data, including hospital visit histories and pharmacy records, with information on suicide-related deaths [4]. Employing this method yields a more accurate and thorough understanding of healthcare service utilization among at-risk populations.
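As a minimal sketch of what "combining information about an individual from at least 2 databases" can look like in practice, the following Python example links a toy hospital visit dataset to a toy death register using a shared identifier. All field names, identifiers, and records are hypothetical and serve only as illustration; real linkage projects typically rely on de-identified keys or probabilistic matching, which this sketch does not attempt.

```python
# Illustrative deterministic linkage of two toy datasets on a shared key.
# Every field name and record below is hypothetical.

hospital_visits = [
    {"person_id": "A1", "visit_date": "2020-03-01", "reason": "self-harm"},
    {"person_id": "A1", "visit_date": "2020-06-15", "reason": "other"},
    {"person_id": "B2", "visit_date": "2021-01-10", "reason": "self-harm"},
]

death_records = [
    {"person_id": "A1", "death_date": "2020-07-01", "cause": "suicide"},
]


def link_records(visits, deaths, key="person_id"):
    """Attach each person's death record (if any) to their visit history."""
    deaths_by_key = {row[key]: row for row in deaths}
    linked = {}
    for visit in visits:
        person = linked.setdefault(
            visit[key],
            {"visits": [], "death": deaths_by_key.get(visit[key])},
        )
        person["visits"].append(visit)
    return linked


linked = link_records(hospital_visits, death_records)
for person_id, record in linked.items():
    print(person_id, len(record["visits"]), record["death"] is not None)
```

In this sketch, a person without a matching death record simply carries `None` in the `death` field, so the linked dataset retains individuals who are still alive as a comparison group.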
Researchers from countries with advanced healthcare systems, such as Australia [5], Canada [6], and France [7], have utilized data linkage methods to conduct high-quality studies on healthcare interactions among vulnerable individuals [8,9]. Given the growing interest in this area, the current state of research must be examined through a scoping review approach to identify gaps in the literature and establish future research directions [10]. To date, at least 2 scoping reviews have been conducted on this topic. However, these reviews were limited to studies that linked ambulance and police records [4] or those involving pregnant females [11]. Furthermore, they focused exclusively on suicidal behaviors. This study aims to provide a more complete assessment of data linkage methodology in examining healthcare interactions among individuals with self-harming behaviors and suicidal tendencies.
METHODS
Design
This scoping review followed the “5 stages” guideline from Arksey and O’Malley [12]: (1) identifying the research question, (2) identifying relevant literature, (3) selecting the literature, (4) charting the data, and (5) collating, summarizing, and reporting the results. To guide the process, the review began with the following research questions [13]: (1) What topics have recent data linkage studies explored regarding healthcare interactions among individuals who self-injure or engage in suicidal behaviors? (2) Which types of data have been used in data linkage studies related to this subject? (3) How have researchers employed data linkage approaches to analyze healthcare interactions among individuals who self-injure or engage in suicidal behaviors?
Search Strategy and Results
The report adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines [14]. The databases used for this review included PsycINFO, PubMed, and Scopus. The review focused on peer-reviewed journal articles written in English and published between 2013 and August 31, 2023. Supplemental Material 1 contains the full list of search terms used. The inclusion and exclusion criteria were established based on previous reviews [4,11].
The inclusion criteria were as follows: (1) the article presents findings from a quantitative data linkage study involving at least 2 datasets; (2) at least 1 dataset in the data linkage contains healthcare-related information, such as hospital attendance, ambulance records, or referrals; and (3) the article examines self-harm and/or suicidal behaviors, as either a primary research objective, an outcome of interest, or characteristics of the participants.
The exclusion criteria were as follows: (1) articles on assisted suicide or terrorist suicide; (2) articles on animal models and “suicide genes”; (3) non-empirical studies, such as those involving the review or development of data linkage systems; (4) articles on the validity and reliability of suicide and/or self-harm measurements; and (5) articles that do not focus on suicide and/or self-harm in the context of healthcare contact in their titles and abstracts, for instance, if suicide and/or self-harm were only secondary outcomes of the study.
Charting the Data
FWD and ZP (see Acknowledgements) performed the paper selection and data extraction process with the approval of the supervisory panel. The flowchart detailing this process is presented in Figure 1. Records were managed in EndNote (Clarivate Analytics, Philadelphia, PA, USA) to remove duplicates and irrelevant papers [15]. Initially, FWD retrieved a total of 522 articles from PubMed, 514 from PsycINFO, and 2209 from Scopus, yielding 1208 unique articles. FWD and ZP then screened out papers that did not directly relate to the review’s objectives by assessing their titles, which reduced the pool to 330 papers. Further assessment of titles and abstracts narrowed the selection to 29 papers for comprehensive full-text review. Two papers were excluded based on the predefined exclusion criteria. Finally, FWD used Microsoft Excel (Microsoft, Redmond, WA, USA) to extract data from the remaining 27 papers, continuing until all information necessary to address the research questions was collected.
Collating, Summarizing, and Reporting Results
The results of the collation and summarization process are presented in Table 1 (Supplemental Material 2), which includes sections for title, aims, participants and location, study design and statistical analysis, outcome measures, and results. The findings of this review were conveyed through a narrative synthesis [16]. Notably, unlike systematic reviews, scoping reviews do not assess the strength or weakness of the evidence in the included papers [12]. Hence, the results were discussed with an awareness of these limitations.
Ethics Statement
This study is a scoping review that analyzes only peer-reviewed journal articles.
RESULTS
Characteristics of Included Studies
This review synthesized peer-reviewed journal articles that utilized a data linkage approach to examine healthcare interactions among individuals engaging in self-harm and/or suicidal behaviors from 2013 to 2023 (n=27).
All studies (n=27) utilized data routinely collected by government agencies. For instance, Carr et al. [8] used data from the Clinical Practice Research Datalink, a government agency that anonymizes primary healthcare data for research purposes in the United Kingdom [17]. This dataset contained details on demographics, symptoms, diagnoses, therapies, referrals, and other relevant information. It was linked to death records obtained from the Office for National Statistics.
Some research projects incorporated data from new studies and integrated this information with government data. For example, 3 papers [18–20] utilized data from the Passport study, which assessed the effectiveness of an intervention designed to increase healthcare utilization among former prisoners in Queensland, Australia. The researchers linked the newly collected data from the Passport study with subsequent administrative records of the participants’ healthcare visits. Additionally, 2 published papers originated from the Multicentre Study of Self-Harm (MSSH) [21,22]. This project examined self-harm in 3 cities in the United Kingdom and integrated the newly collected data with routine healthcare information.
Most of the research was conducted in 3 countries: Australia (n=13), the United Kingdom (n=8), and Korea (n=3). Additional studies were contributed by Canada, France, Iceland, Norway, and the United States, with each country contributing at least 1 study. Most studies focused on the general population (n=15). However, several studies targeted specific populations, including former prisoners (n=3), younger people (n=3), and individuals with psychiatric diagnoses (n=2). Furthermore, 2 articles addressed marginalized communities in Australia, namely the Aboriginal community [5] and the culturally and linguistically diverse (CALD) population [23].
Measurement of Self-harm and Suicide-related Outcomes in the Studies
We identified 21 studies focused on suicide deaths. Fifteen of these employed the International Classification of Diseases (ICD) for coding [24]. A Korean study utilized the Korean Standard Classification of Diseases coding system [25], which is similar to the ICD but also incorporates Korean traditional medicinal knowledge [26]. Additionally, a Scottish study used the coding framework of the Advanced Medical Priority Dispatch System, which recorded individuals who died after presenting with self-harm or a suicide attempt within the ambulance system [27]. Two studies employed the Victorian Suicide Register (VSR) database [28,29], which consolidated data from 2 other databases: the Victorian Admitted Episodes Dataset, which uses ICD coding, and the Victorian Emergency Minimum Dataset, which has its own coding framework [30]. The remaining 4 studies did not specify the classification system used [6,7,23,31]. All studies treated suicide death as a binary outcome.
Three papers specifically investigated the intent of participants in cases of suicide death. The prevalent research practice in the United Kingdom involves categorizing individuals who died with undetermined intent as part of the “death by suicide” group [32]. However, Dougall et al. [33] diverged from this norm by omitting participants from their analysis when clear suicidal intent was not established. In turn, Carr et al. [8] and Vuagnat et al. [7] conducted separate analyses for individuals who definitively died by suicide and those who did not.
Two papers included suicidal ideation in their analyses. Leckning et al. [5] classified participants as either experiencing suicidal ideation or engaging in self-harm. Participants who exhibited both self-harm and suicidal ideation were categorized as self-harming individuals. Meanwhile, Clapperton et al. [29] treated suicidal ideation as a binary covariate to investigate whether its presence in cases of self-harm elevated the risk of suicide death.
Eighteen studies investigated self-harm, either as the primary variable of interest or as a characteristic of the participants. Most of these employed a binary approach to differentiate between individuals who had experienced self-harm and those who had not. Thirteen used the ICD system to categorize self-harm. The remaining studies utilized various other classification systems: 3 studies [8,34,35] applied the Read coding system, which is used in the United Kingdom, while 2 papers [19,27] used ambulance data coding frameworks independently established in their respective regions. Clapperton et al. [28] obtained their self-harm data from the VSR. Two studies [18,19] created their own classification systems, inspired by the work of Moran et al. [36], and re-examined ambulance and healthcare records to confirm the cases.
Several articles provided no clear reference for their methodology in categorizing self-harm. Two studies utilized data from the MSSH project [21,22], which introduced a specific coding system; however, they did not clearly explain its development process. One paper offered no information regarding the coding of self-harm [37].
Self-harm history [9] and method [18] were common covariates in the studies. Furthermore, some studies considered the frequency of self-harm behaviors as a variable [5,22]. Pham et al. [23] documented the location of self-injury prior to seeking medical attention, such as the home, a healthcare center, or another location. Research derived from the Passport study included the flagging of individuals at high risk of self-harm within the prison system as a covariate [20].
Clapperton et al. [29] developed a novel code for self-harm, termed “escalation of self-harm.” Their research focused on 2 dimensions of self-harm escalation: the severity of the method, to assess whether employing a more lethal method heightened the risk of suicide, and the interval between self-harm episodes, where a decreasing timeframe could indicate an increased risk of repeated self-harm [38]. Meanwhile, Geulayov et al. [21] considered the area of self-harm (e.g., neck, hands, or other regions) as an exposure variable. They also adopted a unique approach of using the self-harm episode, and not the individual, as the unit of analysis.
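The interval dimension of escalation described above can be illustrated with a short sketch. The rule used here (every gap between consecutive episodes is shorter than the one before it) is an assumption chosen for illustration and is not the coding scheme of Clapperton et al. [29], whose exact definition is not reproduced in this review. The function names and dates are hypothetical.

```python
from datetime import date


def intervals_in_days(episode_dates):
    """Return the gaps, in days, between consecutive self-harm episodes."""
    ordered = sorted(episode_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def intervals_are_shortening(episode_dates):
    """Flag a person whose gaps between episodes shrink at every step.

    Assumed rule for illustration only: at least 3 episodes are required,
    and each gap must be strictly shorter than the previous one.
    """
    gaps = intervals_in_days(episode_dates)
    if len(gaps) < 2:
        return False
    return all(current < previous for previous, current in zip(gaps, gaps[1:]))


# Hypothetical episode dates for one person.
episodes = [date(2020, 1, 1), date(2020, 7, 1), date(2020, 9, 1), date(2020, 9, 20)]
print(intervals_in_days(episodes))
print(intervals_are_shortening(episodes))
```

Because linked administrative records carry episode dates, a derived variable of this kind can be built without any new data collection.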
Healthcare Contact Measurement in the Studies
One of the most common metrics evaluating healthcare contact is the assessment of the index and prior instances of patients’ self-injury in medical settings [39]. Self-harm can be analyzed as either a binary variable [28] or a count of admissions/incidents [5]. Another frequently examined covariate is the psychiatric diagnosis given by professionals [40]. Additionally, 2 studies [41,42] reported on the duration of participants’ hospital stays. Morgan et al. [35] examined medical prescriptions as another key aspect of healthcare contact. Often, researchers have distilled the wide array of healthcare interactions into 2 primary categories: non-psychiatric and psychiatric healthcare, considering both inpatient and outpatient encounters [25].
Several studies have adopted unique approaches to measure healthcare interactions. For instance, Kvaran et al. [43] used the number of visits to the emergency department (ED) as their measure of exposure. Lee et al. [42] quantified continuity of psychiatric care as a study variable using the Bice-Boxerman formula. Park et al. [44] incorporated the utilization of traditional medicine services into their assessment of healthcare contact. Lastly, Chitty et al. [24] included the most comprehensive data on healthcare utilization, encompassing both psychiatric and non-psychiatric visits. They also provided detailed accounts of the types of mental healthcare professionals consulted, as well as the use of psychiatric and non-psychiatric medications.
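For readers unfamiliar with the Bice-Boxerman continuity of care index mentioned above, it is conventionally defined as (Σ n_j² − N) / (N(N − 1)), where N is a person's total number of visits and n_j is the number of visits to provider j. The index ranges from 0 (every visit to a different provider) to 1 (all visits to the same provider). The sketch below implements this standard definition; the function name and the example provider labels are hypothetical, and the specific visit types and time window used by Lee et al. [42] are not specified here.

```python
from collections import Counter


def bice_boxerman(provider_per_visit):
    """Compute the Bice-Boxerman continuity of care index.

    provider_per_visit: one provider label per visit, e.g. ["a", "a", "b"].
    The index is undefined for fewer than 2 visits.
    """
    n_total = len(provider_per_visit)
    if n_total < 2:
        raise ValueError("At least 2 visits are required to compute the index.")
    visits_per_provider = Counter(provider_per_visit)
    sum_of_squares = sum(count * count for count in visits_per_provider.values())
    return (sum_of_squares - n_total) / (n_total * (n_total - 1))


# Hypothetical visit histories.
print(bice_boxerman(["clinic_a"] * 4))                        # one provider only
print(bice_boxerman(["clinic_a", "clinic_b", "clinic_c", "clinic_d"]))  # all different
print(bice_boxerman(["clinic_a", "clinic_a", "clinic_b", "clinic_c"]))  # mixed
```

Studies usually dichotomize or categorize the resulting score before entering it into a model, but the cut-off is a study-specific choice.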
Methodological Approaches in the Data Linkage Studies
Descriptive studies
Five papers focused on describing the prevalence of incidents without conducting hypothesis testing. For instance, Morgan et al. [35] reported the number of suicide deaths following visits to general and psychiatric hospitals. The authors only compared the proportions of individuals who did and did not die by suicide after visiting these 2 types of hospitals. Mallon et al. [31] documented the percentage of men and boys in Northern Ireland who had not accessed healthcare services in the year prior to their suicide between 2007 and 2009.
Correlational and regression studies
Most research primarily examined the relationship between socio-demographic factors and their impact on suicide risk, self-harm risk, or healthcare interactions (n=19). For instance, Borschmann et al. [18] investigated the influence of 10 socio-demographic variables on the frequency of self-harm presentations in the ED. Other studies focused on the specific impact of a single demographic variable on the outcomes of interest, while adjusting for other demographic factors. For example, among individuals who presented for self-harm, Pham et al. [23] found that individuals with a CALD background had lower odds of subsequent self-harm admissions compared to those from non-CALD backgrounds, after adjusting for other demographic characteristics.
Six articles investigated the correlation between self-harm and suicide, while considering the role of covariates. These covariates commonly included healthcare contact and various socio-demographic factors. For instance, Vuagnat et al. [7], utilizing data from France, established a connection between self-harm and an elevated risk of suicide death, while adjusting for healthcare-related variables such as hospital stay.
Some studies incorporated unique variables into their analyses. For instance, Spittal et al. [41] found that previous engagement with primary mental health care predicted an increased likelihood of returning to primary mental health care within 30 days following discharge from a hospitalization for self-harm. Additionally, Hu et al. [39] explored various socio-demographic variables in relation to a heightened risk of ED admission for self-harm within 7 days of a prior self-harm episode, which is an exceptionally short time frame. Lee et al. [25] took a different approach by examining the impacts of various types of healthcare contacts on the risk of suicide death among individuals with psychiatric diagnoses. The survival analysis revealed that only visits to non-psychiatric outpatient care were unassociated with an elevated risk of suicide death. Finally, Lee et al. [42] investigated how poor continuity of care interacts with residing in a more impoverished region of Korea, finding that their combination increased the risk of suicide death among people with psychiatric disorders. That study was unique in its use of interaction analysis.
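To make the idea of interaction (moderator) analysis concrete, a generic regression model with an interaction term can be written as below. This is a textbook form offered as an assumption-laden sketch, not the model fitted by Lee et al. [42]: the symbols are introduced here for illustration, with C indicating poor continuity of care, D indicating residence in a more deprived region, Z denoting the adjustment covariates, and g standing for whichever link function the chosen model uses (for example, the log hazard in a survival model).

```latex
\[
g\bigl(\text{risk of suicide death}\bigr)
  = \beta_0 + \beta_1 C + \beta_2 D + \beta_3 (C \times D)
  + \boldsymbol{\gamma}^{\top}\mathbf{Z}
\]
```

Under this formulation, the coefficient β₃ captures the extent to which the association between poor continuity of care and suicide death differs by regional deprivation; an estimate of β₃ that differs from 0 indicates that the two factors jointly carry more (or less) risk than their separate effects would suggest.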
The studies employed various statistical methods to assess the relationships among the study variables. The choice of statistical method was contingent upon the data type, which could be binary, count-based, or time-related. Notably, DelPozo-Banos et al. [34] was the only study that utilized an artificial neural network method for analyzing the association.
Other research methods
Four studies examined the patterns of healthcare contact among individuals who ultimately died by suicide. Park et al. [44] monitored the healthcare interactions of individuals in Korea who died by suicide, specifically during the year leading up to their death. This study documented the cumulative number of people who visited tertiary hospitals, secondary hospitals, primary local clinics, aged-care facilities, EDs, and Oriental medicine practitioners. These visits were recorded for both psychiatric and non-psychiatric reasons over 3 time frames: 1 year, 4 weeks, and 1 week prior to death. Schaffer et al. [6] followed a similar approach but focused solely on mental health contacts, which they reported on a weekly basis. The primary objective of both studies was to determine the overall proportion of participants who had utilized various healthcare services in the year before their suicide.
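The windowed reporting described above can be sketched as follows. The example computes, for each of the 3 time frames, the proportion of people with at least 1 recorded contact in that window before death. The window lengths in days (365, 28, and 7), the data structure, and the toy records are assumptions made for illustration and do not reproduce the data or exact definitions of Park et al. [44].

```python
# Assumed window lengths in days; the source names the windows but not day counts.
WINDOWS_DAYS = {"1 year": 365, "4 weeks": 28, "1 week": 7}


def proportion_with_contact(days_before_death_by_person, windows=WINDOWS_DAYS):
    """Return the share of people with any contact inside each window.

    days_before_death_by_person maps a person identifier to a list of
    contact times, each expressed as days between the contact and death.
    """
    n_people = len(days_before_death_by_person)
    if n_people == 0:
        raise ValueError("At least 1 person is required.")
    proportions = {}
    for label, limit in windows.items():
        n_with_contact = sum(
            1
            for contacts in days_before_death_by_person.values()
            if any(0 <= days <= limit for days in contacts)
        )
        proportions[label] = n_with_contact / n_people
    return proportions


# Hypothetical records: person -> days before death of each healthcare contact.
toy_contacts = {"p1": [3, 200], "p2": [20], "p3": [300], "p4": []}
print(proportion_with_contact(toy_contacts))
```

The same function can be run separately on psychiatric and non-psychiatric contacts, or on each service type, to produce the kind of stratified table these studies report.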
Two studies employed cluster analysis to categorize typologies of healthcare contact in the year preceding suicide death. Chitty et al. [24] used longitudinal, 3-dimensional k-means cluster analysis to identify 5 distinct patterns of healthcare contact among the deceased. Meanwhile, Myhre et al. [9] employed state sequence analysis and an agglomerative nesting algorithm to discern 4 trajectories of contact patterns.
DISCUSSION
This scoping review examines data linkage studies published between 2013 and 2023 that explore healthcare interactions among individuals who self-harm and those who die by suicide. The review reveals that data linkage has been effectively utilized in 27 peer-reviewed studies to investigate factors related to self-harm and suicidal behavior. The review primarily emphasized the benefits and challenges of using data linkage methods, rather than the specific outcomes. Nevertheless, the collective findings of the studies suggest that continuity of care, particularly in community settings, may serve as a protective factor against suicide. Conversely, visits to EDs, along with diverse socio-demographic and health risk factors, are associated with a higher risk of subsequent suicide. Data linkage has clearly gained popularity as a method for studying healthcare utilization in high-risk populations. By facilitating the analysis of longitudinal data, data linkage also mitigates potential biases associated with recall, thus improving the validity of research findings [44].
Many studies included in this review employed longitudinal analysis, representing a key strength of data linkage methods. This approach leverages government administrative datasets, which contain routinely collected public data spanning long periods. The use of a longitudinal approach increases the validity of the findings from these studies. Additionally, data linkage methods appear to promote the inclusion of individuals from minority groups. For example, multiple studies were focused on Indigenous populations [5] and CALD communities [23] in Australia. Individuals from minority cultural backgrounds often exhibit a lower propensity to participate in health studies [45]. Moreover, participation rates in studies addressing self-harm and suicide are often low due to the stigma attached [46]. Consequently, minority individuals with a history of self-harm or suicidal behavior may be reluctant to participate. Nonetheless, they are likely to have had at least 1 healthcare interaction, such as seeking hospital treatment for self-harm [29]. Data linkage enables the collection of valuable information from individuals who might otherwise decline to take part in conventional surveys, thus improving the representation of those from marginalized groups. This underscores a principal benefit of data linkage studies.
Several limitations are associated with the data linkage method in this area of study. First, many studies face the potential for linkage failures due to missing data [18]. Another limitation is that data linkage may not capture certain healthcare access, especially traditional treatments, that are not documented within the system. The Korean government has addressed this by incorporating traditional treatments into its healthcare coding system [44], and the same consideration may be relevant in countries with substantial migrant populations, like Australia [47]. The lack of documentation for these treatments could exclude individuals who initially seek alternative therapies before their symptoms intensify [48]. Consequently, these particular healthcare interactions might be overlooked by Australian policymakers if they are not recognized as part of the country’s official healthcare system.
Furthermore, we have highlighted the abundance of research focusing on socio-demographic determinants of outcomes such as suicide, self-harm, and healthcare contact. While this approach helps identify sub-populations that may benefit from targeted interventions, it does not pinpoint the specific mechanisms that should be addressed to improve healthcare contact for this particularly high-risk group. Many researchers have emphasized the gaps in the availability of useful variables in data linkage studies. For instance, Hu et al. [39] noted that variables such as stressful life events and family dysfunction could not be included in their study because the routine data collection by the relevant government (in Western Australia) primarily captured socio-demographic information.
A potential solution to this issue is to initiate a new data collection process and link it to routine administrative data. For instance, Young et al. [20] included measures of psychological distress and physical health-related functioning that were collected as part of the Passport study. Future research should consider the inclusion of psychological variables or family characteristics in their survey data collection efforts and should aim to connect these with datasets produced by the government. Nevertheless, beginning a new research project carries additional financial costs, which may be prohibitive [49]. Therefore, in situations where financial burdens cannot be overcome, future research should encourage more innovative methods for analyzing variables available in government datasets, such as employing self-harm “escalation” coding [29] or utilizing the moderator framework in regression analysis [42], among other approaches.
A major limitation of this scoping review is that many papers utilizing the data linkage approach were not captured due to the absence of standardized terminology to describe this method [50,51]. Researchers often neglect to explicitly mention the use of data linkage in the titles and abstracts of their papers, which impedes effective scientific communication. Consequently, we recommend that future researchers reach a consensus to explicitly include the data linkage method in the titles and abstracts of their papers.
CONCLUSION
Data linkage offers promising avenues for innovative research into self-harm and suicide. The primary challenge lies in identifying methods to incorporate variables beyond socio-demographic characteristics, which alone provide limited insights into healthcare contact dynamics among populations at high risk for self-harm and suicide. While adopting new data collection approaches can be advantageous for including psychological and family-related variables, they may also introduce financial constraints. Researchers and government policymakers must more clearly identify the key information that should be gathered. Despite certain limitations, the methodology of data linkage is critical to understanding the relationships of service use with suicide and self-harm.
Supplemental Materials
Supplemental materials are available at https://doi.org/10.3961/jpmph.24.448.
Notes
Conflict of Interest
The author has no conflicts of interest associated with the material presented in this paper.
Funding
None.
Author Contributions
All work was done by FWD.
Acknowledgements
This research was conducted as part of an internship with the Australian National Internships Program at the Australian National University (ANU), under the guidance of Ms. Zoe Pollock, Dr. Louise Freebairn, and Mr. Glenn Draper from the ACT Health Directorate. The author also extends gratitude to Dr. Dan Chateau and Prof. Philip J. Batterham from ANU, as well as Dr. Indra Yohanes Kiling from Nusa Cendana University (Indonesia), for their additional mentorship. The author received the Australia Awards scholarship to support his postgraduate studies at ANU.