Durometry as an alternative tool to the modified Rodnan’s skin score in the assessment of diffuse systemic sclerosis patients: a cross-sectional study

The reproducibility and reliability of the modified Rodnan’s Skin Score (mRSS) are debated due to investigator-related subjectivity. Here, we evaluate whether durometry correlates with mRSS in patients with diffuse systemic sclerosis (SSc). This cross-sectional study was conducted from December 2018 to June 2019 and included 58 diffuse SSc patients. Two certified researchers, blind to each other’s scores, performed the mRSS, followed by durometry, at 17 predefined skin sites. For durometry and mRSS, individual scores per skin site were recorded. Durometry and mRSS results measured by each researcher, as well as scores from different researchers, were compared. Skin thickness measurements from forearm skin biopsies were available for comparison in a subset of the patients. Statistical analyses included Cohen’s Kappa Coefficient, Intraclass Correlation Coefficient, Kendall’s Coefficient and Spearman’s test. Mean (standard deviation, SD) patient age was 44.8 (12.9) years, and 88% were female. Inter-rater agreement varied from 0.88 to 0.99 (Intraclass Correlation Coefficient) for durometry, and from 0.54 to 0.79 (Cohen’s Kappa Coefficient) for mRSS, according to the specific evaluated sites. When data were compared with skin thickness assessed in forearm biopsies, durometry correlated better with skin thickness than mRSS. Durometry may be considered as an alternative method to quantify skin involvement in patients with diffuse SSc. The strong inter-rater agreement suggests that the method may be useful for the assessment of patients by multiple researchers, as in clinical trials.


Background
Systemic sclerosis (SSc) is a chronic autoimmune connective tissue disease characterized by microvasculopathy, inflammatory phenomena and visceral and cutaneous fibrosis, associated with high morbidity and mortality [1,2]. Skin involvement is the hallmark of SSc and affects more than 95% of the patients. Excess collagen and other extracellular matrix components and increased myofibroblasts in the dermis, epidermis and subcutaneous tissue lead to progressive thickening of the cutaneous tissue [3-5].
The modified Rodnan's Skin Score (mRSS) is the most commonly used noninvasive method to assess skin involvement in SSc [6]. However, the method is subject to evaluator skills and interpretation, and thus imposes limitations [7]. Despite adequate training and experience of health professionals in performing mRSS, the reproducibility among professionals is low [8]. Therefore, alternative methods to assess skin thickness are warranted [6,9,10]. Albeit invasive, skin biopsies are the gold-standard method to quantify cutaneous involvement, and have been used in the context of clinical research [10]. Biopsies have the limitation of assessing skin from only one location of the skin surface, usually the ventral area of a forearm, while mRSS enables a more global evaluation. Recently, durometry has been evaluated as an alternative tool for skin assessment, aiming to improve accuracy and consistency of skin hardness or stiffening evaluations [11][12][13].

Methods
We aimed to compare mRSS and durometry results in diffuse SSc patients, and validate these measurements through comparisons with skin biopsies available in a subset of the patients.
From December 2018 to June 2019, 58 SSc patients, diagnosed according to the ACR/EULAR 2013 criteria [14], were included in this cross-sectional study, at the Ribeirão Preto Medical School, University of São Paulo. Subjects older than 16 years of age and presenting the diffuse form of SSc were enrolled. Patients were excluded if they had amputation or absence of any limb, or skin abnormalities not related to SSc which, in the opinion of the researchers, would have affected the evaluation. The Institutional Research Ethics Committee approved the study, under registration number 3.045.124. Subjects, or their legal guardians when patients were younger than 18 years of age, read and signed informed consent forms at study enrollment.
Modified Rodnan's skin score and durometry assessments
Two researchers, blind to each other's scores, evaluated each patient for mRSS, followed by durometry, on the same date. Patients were comfortably placed on a bed, at room temperature, with relaxed muscles. For mRSS, skin thickness was assessed as previously described [7]. Briefly, each of the 17 predetermined skin areas was scored from zero to three, according to the severity of skin thickening, and the scores from all 17 areas were added into a final score ranging from zero to 51 points [6]. For durometry, a portable analog dial durometer (model 1600, type 00, Rex Gauge Company, Inc., Buffalo Grove, IL, USA) was used to measure skin hardness. The durometer (Fig. 1), a device measuring 2.5″ × 6.125″ and weighing 6 oz., is the international standard for the hardness measurement of rubber, plastic and other non-metallic materials. The type 00, used in this study, is applied to animal tissue. It has a 3/32″ spherical indentor and a spring force of 113 g. When the durometer is pressed against the cutaneous tissue, it assesses skin hardness through the resistance to indentation. Measurements are read in durometer units, on a continuous scale accurate to one decimal point [12]. Lower numbers indicate less resistance and softer materials. For each skin area, durometry was assessed three times, and the mean value was taken as the final result. For each patient, mRSS and durometry were assessed by the same researcher and at the same areas covered by mRSS [11], with some specific adjustments of sites to place the durometer, as described in Table 1.
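The scoring arithmetic described above can be illustrated with a minimal sketch. It is not the software used in the study; the function names are hypothetical, and rounding the site mean to one decimal is an assumption made only to mirror the one-decimal scale of the device.

```python
from statistics import mean

N_SITES = 17        # number of skin areas scored in the mRSS (from the text)
MAX_SITE_SCORE = 3  # each area is scored from 0 to 3 (from the text)


def mrss_total(site_scores):
    """Add the 17 per-site scores into a total mRSS (0 to 51)."""
    if len(site_scores) != N_SITES:
        raise ValueError(f"expected {N_SITES} site scores, got {len(site_scores)}")
    for score in site_scores:
        # Only the integer grades 0, 1, 2 and 3 are valid.
        if score not in range(MAX_SITE_SCORE + 1):
            raise ValueError(f"site score out of range: {score}")
    return sum(site_scores)


def durometry_site_value(readings):
    """Mean of the three repeated durometer readings taken at one site.

    Rounding to one decimal is an assumption, not stated in the text.
    """
    if len(readings) != 3:
        raise ValueError("expected three readings per site")
    return round(mean(readings), 1)
```

For example, `mrss_total([3] * 17)` gives the maximum score of 51, and three hypothetical readings of 30.0, 32.5 and 31.0 at one site yield a site value of 31.2 durometer units.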

Skin biopsies
A subgroup of ten patients included in this study underwent forearm skin biopsies through a punch technique on the medial area of the left forearm, immediately after mRSS and durometry assessments. Biopsies were processed, fixed and stained with Hematoxylin-Eosin (HE). An experienced pathologist, blind to mRSS and durometry scores, evaluated the skin samples for quantification of epidermal plus dermal thickness in micrometers (μm). Results were available for 9 samples, as one sample was of insufficient quality to be measured.

Statistical analysis
Data were analyzed in the following sequence. First, inter-investigator scores for mRSS and for durometry were evaluated for consistency. For mRSS, inter-rater agreement was analyzed by Cohen's Kappa coefficient [15,16], and for durometry, agreement was assessed by the Intraclass Correlation Coefficient [17]. Then, measurements from durometry and mRSS, for each investigator, were correlated through Kendall's coefficient. For each of the correlation tests, a result of zero indicates absence of agreement and a result of one, total agreement. Finally, mRSS, durometry scores and skin thickness were correlated (Spearman's test), with skin thickness considered as the gold-standard reference for skin involvement. The R software, version 3.4.1 [18], and SAS 9.2 [19] were used for statistical analyses. Significance was set at 0.05.

Results
Fifty-eight diffuse SSc patients, 51 (88%) female, were included in this study. Mean (SD) age was 44.8 (12.9) years and mean (SD) time from diagnosis was 22 (42.3) months. All patients underwent complete mRSS and durometry evaluations. The mRSS ranged from 0 to 49, with a median value of 22.3. All patients had visceral involvement, either gastrointestinal or pulmonary. Eighty-one percent had already been treated with one or more lines of immunosuppressant medications. Table 2 shows the mean values for mRSS and durometry for investigators 1 and 2, respectively, at each evaluated skin site, as well as the inter-rater agreement evaluations. Inter-rater agreement ranged from 0.88 (finger) to 0.99 (abdomen) for durometry, and from 0.54 (thigh and abdomen) to 0.79 (foot) for mRSS. Table 3 shows the intra-rater correlations between mRSS and durometry. For specific sites, such as face, fingers and legs, poor correlation was detected, with Kendall's coefficient ranging from 0.03 to 0.30. Table 4 shows correlations between skin thickness, mRSS and durometry scores for the nine patients with available skin biopsies.

Discussion
Skin involvement is an important characteristic of SSc patients and associates with prognosis [20]. Skin thickness is usually assessed by the mRSS, but limitations are associated with this method, mainly the evaluator-associated subjectivity, which compromises its reproducibility and reliability. Here, we show that durometry may be used as an alternative method to quantify skin involvement. Durometry scores are reproducible between investigators and directly reflect the severity of skin involvement assessed by mRSS and skin biopsies.
We evaluated the inter-rater agreement for mRSS and durometry scores. Correlation coefficients between investigators were higher for durometry than for mRSS, which is in accordance with Merkel et al. [13] and with Kissin et al. [11]. Durometry, therefore, is more reproducible between different researchers than mRSS in cross-sectional evaluations. Specifically in the abdomen and thighs, we detected an inter-rater agreement for mRSS that was lower than average for the remaining skin sites, indicating that reproducibility of mRSS is uneven across evaluated areas. In our perspective, this is an opportunity to use durometry as a complementary tool, enabling better assessment of the areas where mRSS is not sufficiently accurate.
When each investigator's mRSS and durometry scores were compared (intra-rater evaluations, Table 3), weak correlation indices were observed in areas where bone tissue was closer to the skin surface. These results may reflect the smaller amounts of subcutaneous tissue at specific body sites, such as the face, fingers, and legs. Durometry measures the hardness of a material; therefore, underlying bone may be misevaluated as thick skin. Moon et al. [21] have already described similar difficulties for leg evaluations, and Kissin et al. [11] have suggested, for future applications, modifications in sites for durometry, aiming to further increase the accuracy of the method. In further studies, the durometer should be placed in areas with less underlying bone, replacing the fingers, forehead and anterior areas of the legs with other sites.
Table 4 (legend). In the left column, skin thickness from the forearm is correlated with total mRSS and durometry scores. In the right column, forearm skin thickness is correlated with forearm mRSS and durometry. Analysis by Spearman's test. mRSS: modified Rodnan's skin score; #1 and #2 indicate investigators 1 and 2, respectively.
The use of durometry, complementary to mRSS assessments, should be considered in the follow-up of SSc patients. First, the device does not require certification or extended learning time to be used, as it consists of a simple place-and-measure procedure. The method is reproducible between evaluators, so different investigators may perform follow-up assessments when the same investigator is not available [13]. This may be a valuable tool to simplify the logistics of clinical trials [11]. Moreover, in our patient cohort, durometry scores, both global and next to the site of skin biopsy, correlated better than mRSS with skin thickness evaluated by histopathology. We believe that these discrepant results reflect the lack of reliability of mRSS across different evaluators and reinforce the need for alternative (or complementary) skin assessment strategies in SSc patients. Finally, durometry measurements are provided on a continuous scale, accurate to one decimal, and may reveal more subtle changes in skin involvement than mRSS. This hypothesis should be further evaluated in longer, prospective studies.
Nine patients from this study had evaluable forearm skin biopsies, coincidentally obtained for clinical purposes on the same date that durometry and mRSS were assessed. Skin thickness measured in the biopsy fragment served as a gold-standard reference for comparison with durometry and mRSS values. In our study, we did not evaluate myofibroblasts or hyalinized collagen, but Kissin et al. [12] showed positive correlations of mRSS and durometry with both collagen and myofibroblast content. In healthy individuals, conversely, analyses of skin biopsy results did not show myofibroblasts, and the amount of collagen in the control sample was lower than in the diseased samples.
This study was limited to diffuse SSc patients as a strategy to homogenize the sample of patients. We found it interesting that at specific skin areas, mRSS and durometry correlated well with each other, with a few already described exceptions in areas with scarce subcutaneous tissue. In addition, the skin thickness correlated well with mRSS and durometry in both global and forearm (site of biopsy) evaluations. These results suggest that the findings may extend to limited forms of SSc, which remains to be evaluated in the future.
An interesting future addition would be to compare mRSS and durometry measurements with those from skin ultrasound, which is another available tool for skin assessment. Ultrasound requires a specialist and is not as readily available as the other tools, but may contribute to the field. Ultrasound has low intra- and inter-observer variability, but correlates only low-to-moderately with mRSS [11,22,23]. In a prior study, ultrasound showed moderate correlation with durometry and with total mRSS, but when specific skin sites were analyzed, there was poor correlation in the fingers [11]. Aspects such as imaging standardization and the best skin sites to be measured are still to be defined in ultrasonography. In the future, rather than serving only as alternatives to the already established mRSS, these methods may be combined with it as a composite measure in the assessment of the scleroderma skin.
When compared to previous reports, the specific sites of durometry were slightly different in our study. The decision reflected our initial belief that durometry sites should be similar to those used in mRSS and that patients should be evaluated in anterior areas of the body. We also chose areas that we considered more stable to place the durometer, instead of sites subject to fat accumulation. For the facial evaluations, we used the center of the forehead, which proved inadequate due to the great divergence between mRSS and hardness measured by the durometer. We therefore agree that alternative approaches, using the cheeks instead of the forehead, may be more reliable [21]. A study that used the calves, instead of the anterior surface of the legs, also yielded unreliable results [21]. The durometry of the fingers, however, still stands as a challenge. Our results show that it is not reliable to place the durometer on the hard dorsal surface of the phalanges. The alternative of placing it on the palmar surface of the fingers, however, does not add benefit, as some patients have claw-like contractures in the hands. For future studies, investigators should consider excluding the fingers from durometry assessments. Whether this may affect the overall evaluation of a patient, especially those with limited SSc or with initial manifestations of the disease, remains unknown.

Conclusions
We were able to show that, for single-point cross-sectional analyses of skin involvement, durometry may reproduce, with possible advantages, the results obtained with the mRSS. The inter-rater reliability of durometry may prove useful for the assessment of large numbers of patients in single or multicenter trials. In future studies, we seek to investigate whether durometry is also reliable in longitudinal evaluations of patients, with multiple assessments over time.