Laparoscopic distal gastrectomy skill evaluation from video: a new artificial intelligence-based instrument identification system

May 30, 2024

Participants

This study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of Jichi Medical University (No. 23–043) in Japan. The study analyzed videos of previous surgeries, and patients were given the opportunity to opt out. Written informed consent was obtained from the surgeons whose skills were evaluated. Six gastric surgeons from the Department of Surgery, Division of Gastroenterological, General and Transplant Surgery, Jichi Medical University participated in the study: three were experts who had performed more than 100 laparoscopic gastrectomies, and three were novices who had performed fewer than 20 laparoscopic gastrectomies (Table 1).

Table 1 Descriptive data of surgeon expert and novice group participants.

Videos for analysis

All laparoscopic distal gastrectomy videos were recorded between 2019 and 2022. Three videos from each surgeon, 18 in total, were evaluated. Expert group members selected videos of their three most recent cases during the study period, and novice group members selected videos of their first three career cases. The videos were recorded at 30 fps in HD quality using a 3D curved tip video scope (LTF-S300-10-3D; Olympus, Tokyo, Japan) and a 3D video processor (VISERA ELITE II OTV-S300; Olympus, Tokyo, Japan). Because all surgeons came from a single institution, the surgical procedure was standardized and consistent: dissection of the greater omentum segment, takedown of the transverse mesocolon, subpyloric lymph node dissection, right gastroepiploic vascular dissection, and vascular dissection of the greater curvature side of the duodenum.

AI development

Analysis was performed using a custom AI algorithm developed by Anaut Co., Ltd. (Tokyo) and based on DeepLab v2, developed by Google LLC. The training dataset consisted of 1,080 still images extracted from surgical videos of 6 cases other than the 18 cases to be evaluated. A pair of ultrasonic coagulation shears (HARMONIC HD1000i, Johnson & Johnson, New Jersey, USA) was used in all surgeries. Models were trained on annotations of the active blade tip, active blade, and tissue pad of the HARMONIC shears (Fig. 1a), and the trained models were then used to analyze the videos.

Figure 1

(a) Annotated image. The active blade tip, active blade, and tissue pad of the HARMONIC shears were annotated. (b) Screen shot of AI surgical video analysis showing coloring of HARMONIC shears. The active blade was colored blue, the tissue pad was colored green, and the tip of the active blade was surrounded by a purple circle and colored to leave an afterimage of the previous 10 frames. The X and Y coordinates, velocity, acceleration, and jerk of the tip are displayed at the top left of the screen.

Video analysis

In the AI analysis of the surgical video, the active blade was colored blue, the tissue pad was colored green, and the tip of the active blade was surrounded by a purple circle and colored to leave an afterimage of the previous 10 frames. The X and Y coordinates, velocity, acceleration, and jerk of the tip were displayed at the top left of the screen (Fig. 1b). Part of the video can be found in Online Resource 1.

Evaluation accuracy

As the test dataset for accuracy evaluation, each video was divided into 11 parts of equal length, and the first frame of the second and each subsequent part was extracted. For example, in an 11,000-frame video, the ten frames numbered 1001, 2001, 3001, and so on up to 10,001 were extracted. With ten images extracted per case, the test dataset comprised 180 images in total.
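The sampling rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sample_frame_numbers` is hypothetical, frames are assumed to be numbered from 1, and the handling of videos whose length is not a multiple of 11 (integer division) is an assumption.

```python
def sample_frame_numbers(total_frames: int, n_parts: int = 11) -> list[int]:
    """Return the 1-based number of the first frame of parts 2..n_parts.

    Assumption: part length is total_frames // n_parts; the source does not
    say how a remainder is handled.
    """
    part_len = total_frames // n_parts
    # Part k (k = 1 .. n_parts - 1, counting from 0) starts right after
    # k full parts, i.e. at frame k * part_len + 1.
    return [k * part_len + 1 for k in range(1, n_parts)]


frames = sample_frame_numbers(11_000)
print(frames)  # ten frame numbers, matching the 11,000-frame example in the text
```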

Accuracy was evaluated using True Positive Rate (TPR), False Positive Rate (FPR), and Dice. Dice and TPR are the most commonly used indices for machine learning performance evaluation¹⁸.
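For reference, the three indices can be computed from a predicted mask and a ground-truth mask as in the sketch below. It assumes a pixel-wise comparison of binary masks, which the text does not state explicitly; the function name `segmentation_scores` and the toy masks are hypothetical.

```python
import numpy as np


def segmentation_scores(pred, truth):
    """Pixel-wise TPR, FPR, and Dice for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))    # predicted instrument, truly instrument
    fp = int(np.sum(pred & ~truth))   # predicted instrument, truly background
    fn = int(np.sum(~pred & truth))   # missed instrument pixels
    tn = int(np.sum(~pred & ~truth))  # correctly ignored background
    nan = float("nan")
    return {
        "TPR": tp / (tp + fn) if (tp + fn) else nan,
        "FPR": fp / (fp + tn) if (fp + tn) else nan,
        "Dice": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan,
    }


# Toy 2x4 masks (illustrative only, not study data).
truth = [[1, 1, 0, 0], [1, 1, 0, 0]]
pred = [[1, 1, 1, 0], [1, 0, 0, 0]]
scores = segmentation_scores(pred, truth)
print(scores)
```

Dice rewards overlap between the two masks, while FPR captures how much background is wrongly labeled as instrument, so the three values together describe both missed and spurious pixels.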

Fluctuation analysis

When the time-series power spectral density S(f) in the following formula has β close to 1, there is said to be a 1/f fluctuation.

$$S(f)\propto \frac{1}{{f}^{\beta }}$$

Taking the logarithm of both sides shows that, on a log-log plot, the power spectrum follows a straight line whose slope is −β. β = 0 corresponds to white noise and represents disorderly movement; β = 1 is a 1/f fluctuation; and as β approaches 2, the movement becomes more regular¹⁹.

$$\log S(f)=-\beta \log f+\text{const.}$$

We characterized movement by frequency analysis of the travel distances arranged as a time series. Using Python 3.9, the time series was split into 3 s segments, and power spectra were calculated by fast Fourier transform and plotted on logarithmic scales. Fluctuations were analyzed by fitting a straight line using the least squares method and evaluating β from its slope.
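The authors' own code is in the online resources; the following is only a rough sketch of how such a slope could be estimated with NumPy. Several details are assumptions not given in the text: the function name `fluctuation_beta`, removing each segment's mean, averaging the power spectra across segments before fitting, and using base-10 logarithms.

```python
import numpy as np


def fluctuation_beta(series, fps=30.0, window_s=3.0):
    """Estimate beta in S(f) ~ 1/f**beta from a per-frame time series."""
    x = np.asarray(series, dtype=float)
    n = int(round(fps * window_s))            # samples per segment (90 at 30 fps)
    n_windows = len(x) // n
    if n_windows < 1:
        raise ValueError("series is shorter than one segment")
    # Cut the series into non-overlapping 3 s segments.
    windows = x[: n_windows * n].reshape(n_windows, n)
    # Assumption: remove each segment's mean so the zero-frequency bin is empty.
    windows = windows - windows.mean(axis=1, keepdims=True)
    power = np.abs(np.fft.rfft(windows, axis=1)) ** 2
    mean_power = power.mean(axis=0)           # assumption: average over segments
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    keep = (freqs > 0) & (mean_power > 0)     # log is undefined at zero
    # Least-squares line on the log-log plot; its slope is -beta.
    slope, _intercept = np.polyfit(
        np.log10(freqs[keep]), np.log10(mean_power[keep]), 1
    )
    return float(-slope)


# Synthetic check (not study data): white noise versus a random walk.
rng = np.random.default_rng(0)
white = rng.normal(size=9000)
walk = np.cumsum(white)
print(fluctuation_beta(white), fluctuation_beta(walk))
```

On synthetic input, white noise should give a β near 0 and a random walk a clearly larger β, which is a quick sanity check before applying the function to real travel-distance data.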

Code availability

Part of the data and the code for calculating the slope β can be found in Online Resources 2 and 3.

Statistical analysis

The total number of frames and the number of frames with the surgical shears in view were calculated. The X and Y coordinates on screen of the shear tips were calculated from the video, followed by the kinematic indices of distance traveled, velocity, acceleration, jerk, and the fluctuation index β. R (The R Foundation for Statistical Computing, Vienna, Austria, version 4.1.0) was used for statistical evaluation. Statistical post hoc power was calculated using G*Power 3.1.9.7²⁰.
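One way to derive the kinematic indices from the tip coordinates is by finite differences, as sketched below. This is an illustration under stated assumptions rather than the study's implementation: the function name `kinematic_indices` is hypothetical, coordinates are in pixels, the tip is assumed visible in every frame, and acceleration and jerk are taken as successive differences of the scalar speed.

```python
import numpy as np


def kinematic_indices(x, y, fps=30.0):
    """Distance, speed, acceleration, and jerk from per-frame tip coordinates."""
    dt = 1.0 / fps
    pos = np.column_stack([x, y]).astype(float)
    # Euclidean distance moved between consecutive frames (pixels).
    step = np.linalg.norm(np.diff(pos, axis=0), axis=1)
    velocity = step / dt                      # pixels per second
    acceleration = np.diff(velocity) / dt     # pixels per second^2
    jerk = np.diff(acceleration) / dt         # pixels per second^3
    return {
        "distance": float(step.sum()),
        "velocity": velocity,
        "acceleration": acceleration,
        "jerk": jerk,
    }


# Toy trajectory (illustrative only): steady motion of 5 pixels per frame.
x = np.arange(10) * 3.0
y = np.arange(10) * 4.0
result = kinematic_indices(x, y)
print(result["distance"], result["velocity"][:3])
```

Frames in which the shears are out of view would need to be excluded or handled separately before differencing, since a gap in the coordinates would otherwise appear as a sudden jump.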

A density plot was used to evaluate the distance traveled per second. This plot expresses the shape of the data distribution as a curve based on kernel density estimation and represents the distribution more smoothly than a histogram. The area between the curve and the x-axis equals 1 for each group, which adjusts for the difference in surgical time between the two groups.
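The normalization property can be illustrated with a small Gaussian kernel density estimate. The study used R, so this NumPy sketch is not the authors' code; the function name `gaussian_kde_curve`, the Silverman rule-of-thumb bandwidth, and the synthetic samples are all assumptions.

```python
import numpy as np


def gaussian_kde_curve(samples, grid):
    """Gaussian kernel density estimate of `samples`, evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    # Assumption: Silverman's rule-of-thumb bandwidth.
    sd = samples.std(ddof=1)
    q75, q25 = np.percentile(samples, [75, 25])
    iqr = q75 - q25
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    bw = 0.9 * spread * n ** (-0.2)
    # Sum one Gaussian bump per sample, then normalize to unit area.
    z = (grid[:, None] - samples[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * bw * np.sqrt(2.0 * np.pi))


# Synthetic groups of different size (not study data).
rng = np.random.default_rng(1)
group_a = rng.gamma(2.0, 10.0, size=300)
group_b = rng.gamma(2.0, 15.0, size=600)
grid = np.linspace(-100.0, 400.0, 2001)
dx = grid[1] - grid[0]
area_a = gaussian_kde_curve(group_a, grid).sum() * dx
area_b = gaussian_kde_curve(group_b, grid).sum() * dx
print(area_a, area_b)
```

Because each curve integrates to approximately 1 regardless of how many seconds of video contributed to it, the shapes of the two groups can be compared directly even when one group operated for longer.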

The primary outcome of this study was to identify indices that differ according to skill level, and the secondary outcome was to identify cutoff values for those indices. The design of this study is shown in Fig. 2.