Compared with traditional laparoscopic surgery, preoperative planning for robot-assisted laparoscopic surgery is more complex and more essential. Through analysis of the surgical procedures and surgical environment, a laparoscope arm preoperative planning algorithm based on the artificial pneumoperitoneum model and the lesion parametrization model is proposed, which ensures that the laparoscope arm satisfies both the distance principle and the direction principle. The algorithm is divided into two parts, the optimum incision and the optimum angle of laparoscope entry, so that the laparoscope provides a reasonable initial visual field. A set of parameters based on the actual situation is given to illustrate the algorithm flow in detail. The preoperative planning algorithm offers significant improvements in planning time and quality for robot-assisted laparoscopic surgery. An improved method that combines the preoperative planning algorithm with the deep deterministic policy gradient (DDPG) algorithm is applied to laparoscope arm automatic positioning for robot-assisted laparoscopic surgery. It takes the fixed-point position and lesion parameters as input, and outputs the optimum incision, the optimum angle and motor movements without kinematics. The proposed algorithm is verified through simulations in a virtual environment built with pyglet. The results validate the correctness, feasibility, and robustness of this approach.
With the development of robotic technology and the application of minimally invasive surgery (MIS), laparoscopic MIS robotic systems have been widely used in surgical specialties such as urology (prostate, bladder and kidney cancer) and gynecology (hysterectomy and myomectomy). Compared with traditional laparoscopic surgery, robot-assisted laparoscopic surgery displays a high-definition 3-D image of the lesion to the surgeon via the console and allows the surgeon to perform complex operations by manipulating the master controls. Robot-assisted laparoscopic surgery is more precise, flexible, and controllable than conventional techniques, so it has become a research hotspot in recent years.
Although robot-assisted surgery has many advantages over traditional surgery, it also poses some thorny problems, such as control switching between master controls and robotic arms, real-time synchronization of master-slave position and attitude, and preoperative planning of the MIS robotic system. In particular, reasonable preoperative planning can significantly reduce the operation time, whereas poor planning may increase surgical risks.
For MIS robotic system preoperative planning, scholars have proposed many different methods, which fall into three categories: (1) heuristic methods based on surgeon experience; (2) methods based on a virtual surgical environment; (3) methods based on multi-objective optimization algorithms.
Hanna et al. (1997a) investigated the impact of port placement on endoscopic manipulations, especially knotting. The optimal azimuth and elevation angles were obtained by comparing the execution time and performance quality score of tying a surgeon's knot (Hanna et al., 1997a). Austad et al. (2001) completed coronary artery bypass grafting procedures on pigs using the Zeus robot-assisted surgical system. The Zeus system configurations, such as port placement and the pigs' position, were set based on recommendations from hospitals and surgeon experience (Austad et al., 2001). Ferzli and Fingerhut (2004) proposed recommendations for trocar placement in laparoscopic surgery. The abdominal cavity is divided into six parts according to the operation area, and recommendations are given according to different operations and patient posture characteristics (Ferzli and Fingerhut, 2004). Pick et al. (2004) proposed an anatomic guide to port placement for laparoscopic radical prostatectomy, which was performed on the da Vinci robot-assisted surgical system. In contrast to traditional port placement, the pubic bone was used as the optimal landmark (Pick et al., 2004). Badani et al. (2008) proposed a novel technique of port placement for robotic renal surgery, which aimed to maximize the range of motion and eliminate external collisions (Badani et al., 2008). Cestari et al. (2010) proposed a new method of port placement for laparoscopic radical prostatectomy, which used a nautical inclinometer and a homemade triangle mold (Cestari et al., 2010).
The heuristic method based on the surgeon experience is convenient and practical for the surgeon, so it is widely used in clinical practice. However, this method is related to the surgeon's operating habits and requires extensive surgical experience. More importantly, the advantages of the surgical robot system are not fully developed.
Hayashibe et al. (2005) developed a simulation system for preoperative planning of abdominal surgery. The core of the simulation system was kinematics and haptics; the effectiveness of preoperative planning was validated by the surgeon's evaluation (Hayashibe et al., 2005). Hayashibe et al. (2006) developed a new simulation system with volume rendering of medical images and automatic positioning by kinematics (Hayashibe et al., 2006). Sun et al. (2007) developed a simulator of the da Vinci system, which was mainly used for surgeon training. Its primary functions were the simulation of port placement and the practice of simple surgical operations (Sun et al., 2007). Bauernschmitt et al. (2007) developed a simulator for port placement and enhanced guidance in robot-assisted heart surgery. The simulation was performed off-line, and the simulation model was established from the patient's computed tomography (CT) images to obtain the best port positions. Through this system, preoperative planning was optimized, the operation time was reduced, and operation quality was improved (Bauernschmitt et al., 2007). Konietschke et al. (2011) developed a simulator of the DLR MiroSurge system, which used the VR-Map device to establish the simulator quickly. Its primary functions were preoperative optimization and intraoperative simulation (Konietschke et al., 2011).
The method based on the virtual surgical environment visualizes the port placement and verifies the effect in advance. Compared with the former method, this method simplifies the steps of port placement and reduces the time required. However, this method also requires surgeons with extensive surgical experience, and due to the lack of analysis of surgical robot performance and finite attempts, it is difficult to obtain optimized preoperative planning.
Sun and Yeung (2007) proposed the selection of optimal port placement and the determination of optimal robot attitude based on multi-objective optimization. This method used two performance indices, the global isotropy index (GII) and the efficiency index (EI). Through the interaction of these two indicators, the flexibility and operability of the robot were improved, and the workspace and visual space were also increased (Sun and Yeung, 2007). Azimian et al. (2010) proposed a preoperative planning method for robot-assisted minimally invasive CABG. This method used sequential quadratic programming to optimize the kinematic and geometric requirements. In the optimization process, individualized preoperative planning can be achieved by taking the surgeon's experience into account (Azimian et al., 2010). Ma et al. (2014) proposed a preoperative positioning method aimed mainly at the collision problem of the multi-arm system. It used the maximum distance index to achieve collision-free optimal preoperative positioning (Ma et al., 2014). Yu et al. (2014) proposed a preoperative positioning method aimed mainly at cooperation between two instrument arms. It used the percentage of collaboration workspace to achieve optimal cooperation between the two manipulators (Yu et al., 2014). Wang et al. (2016) proposed a preoperative planning algorithm for robot-assisted minimally invasive CABG. This algorithm used two performance indices, the isotropy index based on CV (IICV) and the index of instrument collaboration space (IICS), to select the optimal port placement and determine the manipulator poses (Wang et al., 2016).
Compared with the former two methods, the method based on multi-objective optimization algorithm is more scientific. More importantly, in addition to the surgeon experience, the robot's characteristics are also taken into account, so the preoperative planning is more conducive to the operation.
In general, after obtaining the preoperative planning by the above method, the joint variables of the manipulator are obtained by inverse kinematics. At present, the telecentric fixed-point positioning mechanism of the surgical robot system is mostly an undriven mechanism, which needs to be manually adjusted to the target position. Due to errors of manual adjustment and mechanical kinematics parameters, the actual preoperative planning is not the optimal solution previously determined. Therefore, it is necessary to use a new method to complete preoperative planning instead of manual configuration.
Traditional manipulator control calculates joint variables by inverse kinematics for a given target position. At present, the trend has turned to end-to-end solutions. In other words, the controller learns diverse strategies directly from sensor data, rather than relying on fixed strategies such as kinematics (James and Johns, 2016; Otte et al., 2016; Phaniteja et al., 2017; Gu et al., 2017; Mohammadi et al., 2018). James and Johns (2016) proposed a method that took images as its input and output motor movements and the target position. Thus, control of a 7-DOF robot arm can be realized in a virtual environment without any prior knowledge (James and Johns, 2016). The telecentric fixed-point positioning mechanism is a redundant mechanism; an accurate inverse kinematic solution can only be obtained under appropriate constraints. In order to improve the effect of preoperative planning, it is necessary to explore a new method to tackle the problems caused by previous methods.
This paper proposes a laparoscope arm preoperative planning algorithm, which is based on the lesion parametrization model and evaluation indexes. Besides, an improved method based on reinforcement learning algorithm is proposed to achieve preoperative laparoscope arm automatic positioning. More importantly, it is a crucial step towards the automation of robot-assisted laparoscopic surgery.
The rest of the paper is organized as follows. Section 2 introduces surgical procedures and MIS robotic system. The laparoscope arm preoperative planning algorithm is introduced in Sect. 3. The improved DDPG algorithm is introduced in Sect. 4. The simulation results are presented in Sect. 5. Discussion and conclusion are given in Sects. 6 and 7, respectively.
A common MIS procedure has three steps: (1) According to the actual surgical needs, a surgeon makes several small incisions (usually 5–15 mm) and inserts a thin tube called a trocar into each. The trocar serves as the means of introducing the laparoscope or laparoscopic instruments, such as scissors and graspers, and provides an access port during surgery. (2) A pneumoperitoneum is created by inflating the abdomen with carbon dioxide, which separates the organs and increases the operating space of the surgical instruments. (3) The surgeon views the magnified image of the patient's internal organs provided by the laparoscope on a video monitor. Using different instruments, the surgeon performs a series of surgical operations in the pneumoperitoneum.
This paper takes laparoscopic cholecystectomy (LC) as an example. LC is performed with the patient in a supine position. The surgeon makes three incisions and inserts a trocar into each. The three incisions are arranged in an isosceles triangle for better operating space, as shown in Fig. 1. A laparoscope is placed through one trocar, and specialized instruments are placed through the other trocars. By operating the laparoscope and instruments, the surgeon delicately separates the gallbladder from its attachments to the liver and the bile duct and then removes it through one incision.
The schematic diagram of surgical incisions.
The MIS robotic system includes a master-slave manipulator system and a depth camera. The slave manipulator consists of one laparoscope arm and two instrument arms. The laparoscope arm is equipped with a laparoscope, and the instrument arms are equipped with different laparoscopic instruments. The laparoscope arm and the instrument arms are located on both sides of the operating bed. A depth camera is installed above the operating bed to acquire the positions of the incisions and robotic arms, as shown in Fig. 2.
The three arms have the same mechanical structure. Each arm is divided into three parts: the telecentric fixed-point positioning mechanism, the remote center of motion mechanism and the end effector, as shown in Fig. 3. The first part adjusts the spatial position of the telecentric fixed-point by three revolute joints and one linear joint. The second part adjusts the position and posture of the end effector under the control of the master manipulator operated by a surgeon; at its end, there is a versatile quick-change mechanism for end-effector installation.
The MIS robotic system.
The structure of the robotic arm.
One of the critical issues for MIS is preoperative planning, including preparation for interventions and the decision about the optimum surgical incisions. Currently, the surgeon often completes preoperative planning by trial and error or from experience, which may not meet the requirements of the optimum incisions. Therefore, it is necessary to use a preoperative planning algorithm instead of these methods. Preoperative planning covers both the laparoscope arm and the instrument arms. This paper studies the former, including the optimum incision and the optimum angle of laparoscope entry.
The mathematical model of pneumoperitoneum is established before
preoperative planning. The shape of artificial pneumoperitoneum is
approximately ellipsoid (Mulier et al., 2008; Oda et al., 2012), so the
abdominal wall is simplified to ellipsoid, defined as Eq. (1). The
artificial pneumoperitoneal coordinate frame is established by combining the
patient's CT images and anatomy. According to anatomy, there are three
principal planes, namely the sagittal plane, the coronal plane, and the
transverse plane. In the coordinate frame, there are also three reference
planes, namely A plane, B plane, and C plane. A plane coincides with the
sagittal plane; B plane coincides with the coronal plane; C plane is
parallel to the transverse plane, and the pneumoperitoneum is divided
equally by C plane. The origin of the coordinate frame is at the
intersection of three reference planes.
During actual operation, the model parameters (
The coordinate frame and pneumoperitoneum model.
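The ellipsoid approximation of the abdominal wall can be illustrated with a minimal sketch. It assumes the standard axis-aligned form x²/a² + y²/b² + z²/c² = 1 centred at the origin of the pneumoperitoneal frame; the symbol names a, b, c, the assignment of axes to the reference planes, and the numeric semi-axes in the usage note are hypothetical and are not taken from the paper.

```python
import math


def ellipsoid_value(point, semi_axes):
    """Evaluate x^2/a^2 + y^2/b^2 + z^2/c^2 for a point in the model frame."""
    x, y, z = point
    a, b, c = semi_axes  # assumed semi-axis lengths of the ellipsoid model
    return (x / a) ** 2 + (y / b) ** 2 + (z / c) ** 2


def on_abdominal_wall(point, semi_axes, tol=1e-6):
    """True if the point lies on the modelled wall (value equals 1 within tol)."""
    return math.isclose(ellipsoid_value(point, semi_axes), 1.0, abs_tol=tol)


def wall_height(x, y, semi_axes):
    """Height of the upper wall above (x, y), or None outside the footprint."""
    a, b, c = semi_axes
    remainder = 1.0 - (x / a) ** 2 - (y / b) ** 2
    if remainder < 0.0:
        return None  # (x, y) is outside the ellipse where the wall meets z = 0
    return c * math.sqrt(remainder)
```

With hypothetical semi-axes `(150.0, 120.0, 60.0)` in millimetres, `wall_height(0.0, 0.0, (150.0, 120.0, 60.0))` gives the apex height of the modelled wall, and any candidate incision can be checked with `on_abdominal_wall`.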
The surgeon should be clear about the information of the surgical site,
including lesion location, lesion anatomy, and surrounding tissues. At
present, the conventional method is imaging (radiology) test, and the lesion
model and its surrounding environment are obtained by the 3-D reconstruction
technology. Describe the relationship between lesion and incision in
parametric form, as shown in Fig. 5. Plane
The definition of lesion parameters.
Through the study of the mathematical model of artificial pneumoperitoneum, lesion parametrization model and preoperative planning principles, the laparoscope arm preoperative planning algorithm is proposed, as shown in Fig. 6, that includes three stages: data processing and modeling, optimum incision determination and optimum angle determination.
In the first stage, obtain patient information from the medical images, and then establish the mathematical model of artificial pneumoperitoneum, and lastly determine the location and lesion parametrization model. This stage is the basis of the entire algorithm, and also the most time-consuming stage.
In the second stage, all allowable surgical incisions are obtained from the first stage, and then the candidate incisions are determined according to the two principles. The candidate base positions are obtained by the candidate incisions. According to the actual situation of the operating room, select one of the positions as the base position. Combine candidate incisions and the base position to determine the optimum incision.
In the third stage, the candidate entry angles are determined by combining
the optimum incision, lesion location, and initial entry angle. Determine
the optimum angle according to the observation direction principle. Since
there may be no direction in which the visual axis is perpendicular to the
plane
Finally, the laparoscope arm preoperative planning algorithm is completed, including the optimum incision and the optimum angle.
Flow chart of the laparoscope arm preoperative planning algorithm.
The telecentric fixed-point positioning mechanism has four degrees of
freedom; the mechanism diagram is shown in Fig. 7.
The mechanism diagram of the telecentric fixed-point positioning mechanism.
First, candidate incisions are determined based on the distance principle.
Assume that the positions of the candidate incision and the lesion are
The
The schematic diagram of candidate incisions.
The projection of three planes.
Besides, the base position also affects the surgical incisions. Removing the
prismatic joint, the telecentric fixed-point positioning mechanism is a
3-RRR planar redundant mechanism. When The link The link The link
The schematic diagram of the simplified mechanism.
According to Sect. 3.4, the allowable base range is an ellipse. Compared
with the projection of the candidate incisions on the plane
Given a set of parameters based on the actual situation, the steps of the
algorithm are described in detail,
Take the data in Sect. 3.1,
The candidate base positions are located on the curve, as shown in Eq. (
The schematic diagram of optimum incisions.
First, determine
The sketch map of the optimum incision and angle.
Reinforcement learning describes the set of learning problems in which an agent should learn how to map states to actions in an environment so as to maximize a defined reward function. Throughout the learning process, the agent is not told which actions to take but instead should find out which actions yield the most reward by trying various actions. In most cases, actions may affect not only the immediate reward but also the next state, and through that all subsequent rewards. To solve a practical problem, one should define a reasonable reward function to compute the reward for taking actions and set a goal relating to the state of the environment. One should also quantify all the variables that describe the environment and have access to these variables at each step or state.
In this paper, the agent is the 3-RRR planar redundant mechanism which is a simplified model of telecentric fixed-point positioning mechanism plus laparoscope. The environment is the lesion and the surgical incision obtained through the preoperative planning algorithm. The actions are the movement of three revolute joints. The agent–environment interaction is shown in Fig. 13.
The agent–environment interaction in reinforcement learning.
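The interaction loop described above can be sketched generically. The following is a toy illustration only: `ToyReachEnv` is a hypothetical one-dimensional stand-in, not the pyglet environment used in the paper, and the policy passed to `run_episode` is any function from state to action.

```python
class ToyReachEnv:
    """Hypothetical 1-D environment: move a position toward a fixed target."""

    def __init__(self, target=1.0, tol=0.05, max_steps=50):
        self.target = target
        self.tol = tol
        self.max_steps = max_steps
        self.pos = 0.0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.pos = 0.0
        self.steps = 0
        return self.pos

    def step(self, action):
        """Apply an action; return (next_state, reward, finished)."""
        self.pos += action
        self.steps += 1
        dist = abs(self.target - self.pos)
        reward = -dist  # the agent only ever sees this scalar feedback
        finished = dist < self.tol or self.steps >= self.max_steps
        return self.pos, reward, finished


def run_episode(env, policy):
    """Run one episode; return the total reward and the number of steps."""
    state = env.reset()
    total, steps, finished = 0.0, 0, False
    while not finished:
        action = policy(state)                      # agent maps state to action
        state, reward, finished = env.step(action)  # environment responds
        total += reward
        steps += 1
    return total, steps
```

For example, `run_episode(ToyReachEnv(), lambda s: 0.5 * (1.0 - s))` halves the remaining distance at every step and ends once the position is within the tolerance.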
In this paper, laparoscope arm automatic positioning is achieved by DDPG, which is a model-free, off-policy actor-critic algorithm based on the deterministic policy gradient (DPG) (Silver et al., 2014). Deep neural network (DNN) function approximators are used to estimate the action-value function. Thus, the algorithm can learn policies in high-dimensional, continuous action spaces.
Based on DPG, DDPG combines the ideas underlying the success of Deep
DDPG contains a parameterized actor function
The actor function is updated by the chain rule (Eq.
Every
In the training process, the telecentric fixed-point (marked point) position and the lesion location are taken as the input of the DDPG algorithm. The fixed-point position is obtained by a depth camera, while the optimum incision, the optimum angle and the base position are obtained by the preoperative planning algorithm. Combined with the preoperative planning algorithm, the DDPG algorithm can learn policies directly from the inputs to achieve laparoscope arm automatic positioning for robot-assisted laparoscopic surgery. The reward function is essential for the algorithm to learn policies successfully. It consists of an intermediate reward and a final reward: the former is a continuous, guiding negative reward given while the task is not completed, and the latter is a positive reward, one to two orders of magnitude larger than the former, given when the task is completed. A continuous reward function improves the convergence of the algorithm.
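The two-part reward structure described above can be sketched as follows. This is an assumption-laden illustration rather than the paper's reward: the use of the two distances as the guiding term, their normalization to the range [0, 1], the equal weighting, and the value `final_bonus=10.0` are all hypothetical choices made only so that the final reward is about one order of magnitude larger than the intermediate one.

```python
def shaped_reward(dist_fixed_to_incision, dist_tip_to_lesion, reached,
                  final_bonus=10.0):
    """Return a continuous negative reward until the task is completed.

    Both distances are assumed to be normalized to [0, 1]; the terms and
    weights are hypothetical, not the ones used in the paper.
    """
    if reached:
        # Final reward: positive and much larger than any intermediate value.
        return final_bonus
    # Intermediate reward: closer to zero as the arm approaches its targets,
    # which gives the agent a continuous guiding signal.
    return -(dist_fixed_to_incision + dist_tip_to_lesion)
```

Because the intermediate term is continuous in the distances, every small improvement in position changes the reward, which is the property the text credits with better convergence.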
In the
In addition to the reward function, the state variables play a crucial role in the convergence of the algorithm. If the state variables adequately represent the environment, the algorithm can learn policies quickly. Because the image from the depth camera contains all the state information of the environment, it is reasonable to use the image directly as input. However, due to the limitations of the hardware, processing image data is very slow. To speed up training, the algorithm uses a low-dimensional states description, such as joint variables and positions, instead of high-dimensional renderings of the environment.
The goal of the algorithm is to make the laparoscope arm move to the target position, so the joint variables are used as the state variables. However, the training results show that these variables alone cannot adequately describe the environment; in other words, the algorithm cannot achieve automatic movement of the laparoscope arm. Therefore, the distance from the telecentric fixed-point to the incision, the distance from the laparoscope end to the lesion, and whether the target is reached are added to the state variables. The experimental results for these two states descriptors are described in Sect. 5.2.
The environment is simulated using Pyglet and includes a lesion point, a surgical incision and a simplified model of the telecentric fixed-point positioning mechanism. In this environment, a lesion point is randomly specified within a reasonable range, and an incision and a base location are obtained by the preoperative planning algorithm. Batch normalization is used on the state input, all layers of the actor network and all layers of the critic network before the action input. In this way, the algorithm can learn effectively across tasks with different types of units, without the need to manually ensure that the units are within a set range.
TensorFlow is used in the code for high-performance numerical computation.
The simulations use Adam (Kingma and Ba, 2015) for learning neural network
parameters with a learning rate of 10
Two simulations are set up to evaluate the performance of the improved method applied to laparoscope arm automatic positioning for robot-assisted laparoscopic surgery. The two simulations differ only in the states description used during training, and use the same network architecture, learning algorithm and hyperparameter settings. States descriptor one consists of the three joint variables, and states descriptor two is the former plus the distance from the fixed-point to the incision, the distance from the laparoscope end to the lesion, and whether the target is reached.
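The difference between the two states descriptors can be made concrete with a small sketch. It is a simplified illustration under stated assumptions: positions are taken as planar (x, y) tuples supplied by the caller, the "reached" test and its tolerance `tol` are hypothetical, and the resulting six-element vector does not reproduce the full state vector used in the paper's simulations.

```python
import math


def descriptor_one(joints):
    """States descriptor one: the three joint variables only."""
    return list(joints)


def descriptor_two(joints, fixed_point, incision, tip, lesion, tol=1.0):
    """States descriptor two: joints plus two distances and a reached flag.

    fixed_point, incision, tip and lesion are (x, y) positions; tol is a
    hypothetical tolerance for deciding that the target has been reached.
    """
    d_fixed_incision = math.dist(fixed_point, incision)
    d_tip_lesion = math.dist(tip, lesion)
    reached = 1.0 if (d_fixed_incision <= tol and d_tip_lesion <= tol) else 0.0
    return list(joints) + [d_fixed_incision, d_tip_lesion, reached]
```

Descriptor one says nothing about where the targets are, whereas descriptor two exposes how far the arm still is from the incision and the lesion, so the same joint configuration yields different states for different targets.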
The two simulations evaluate the policy periodically during training by testing it without exploration noise. The improved method, with 3 action dimensions and 20 state dimensions, is run ten times in the simulated environment. Performance is evaluated after training in the environment for at most 2000 episodes. The results of the ten training sessions are reported as both total reward per episode and steps to target, as shown in Figs. 14–17. The solid line in each figure represents the average over the ten sessions, the upper boundary of the shaded region represents the maximum over the ten sessions, and the lower boundary represents the minimum.
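The way the curves are summarized across sessions can be sketched as below. The function name and the input layout (one list of per-episode values per training session, all of equal length) are assumptions made for illustration; no results from the paper are reproduced.

```python
def summarize_sessions(sessions):
    """Return per-episode (mean, minimum, maximum) across training sessions.

    sessions: list of equal-length lists, one per training session, holding
    a per-episode quantity such as total reward or steps to target.
    """
    mean, low, high = [], [], []
    for values in zip(*sessions):  # values of one episode across all sessions
        mean.append(sum(values) / len(values))  # solid line
        low.append(min(values))                 # lower boundary of the band
        high.append(max(values))                # upper boundary of the band
    return mean, low, high
```

Plotting `mean` as a line and filling between `low` and `high` gives the kind of figure described above.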
The total reward per episode with states descriptor one.
The steps to target with states descriptor one.
The total reward per episode with states descriptor two.
The steps to target with states descriptor two.
Figure 14 shows that the average of total reward per episode is stabilized to
negative and only a few episodes total reward are positive. Figure 15 shows
that the steps to target are always 600. These two figures show that it
never reaches the goal. Figure 16 shows that the average of total reward per
episode increases from
The preoperative planning algorithm, based on the artificial pneumoperitoneum model and the lesion parametrization model, appears to offer significant improvements in planning time and quality for robot-assisted laparoscopic surgery over experience-based or literature-based methods. The distance principle and the direction principle ensure that the proposed algorithm can meet the surgeon's surgical requirements. Furthermore, preoperative planning does not require an additional landmark on the abdominal wall or particular patient positioning.
The proposed algorithm is designed to simulate the actual clinical procedure of robot-assisted surgery or to be applied to a virtual surgery training system, and a standardized procedure is proposed for preoperative planning. Taking LC as an example, the results indicate that the port placement and laparoscope entry angle selection perform satisfactorily, which is especially helpful for less experienced surgeons.
Preoperative laparoscope arm automatic positioning is achieved based on DDPG. In this algorithm, the states descriptor plays a crucial role and affects the performance of the algorithm. The results show that states descriptor two outperforms states descriptor one. Although the controller does not learn a reasonable strategy directly from states descriptor one, it still improves over its initial performance as the episodes progress. Therefore, it is crucial to select the states descriptor reasonably. The controller learns a reasonable strategy from states descriptor two, but there is room to reduce the steps to target and thereby improve the learning efficiency of the controller. Furthermore, the laparoscope arm automatic positioning is independent of robot configuration and can be extended to any surgical robot system.
This method successfully learns a controller in simulation. The next step is to study how to learn a controller on real robots without lengthy training. The method can also be extended to the preoperative planning of other operations or even other surgical procedures. Thus, implementing the algorithm for robot-assisted surgery can further the realization of telesurgery, thereby improving the medical level in many areas.
This paper completes the preoperative planning by analyzing the surgical procedures and surgical environment of robot-assisted laparoscopic surgery. Based on the lesion parametrization model, two principles of laparoscope arm preoperative planning are designed, including the distance principle and the direction principle. According to the two principles, the laparoscope arm preoperative planning algorithm is divided into two parts, the optimum incision and the optimum angle of laparoscope entry. A set of parameters based on the actual situation is given to verify the effectiveness of the algorithm. Preoperative laparoscope arm automatic positioning is achieved by the improved method which combines the preoperative planning algorithm with the DDPG algorithm. The improved method takes the fixed-point position captured by a depth camera and the lesion location obtained by imaging test as input. Based on the input information, optimum incision and optimum angle are obtained through the algorithm, and then the laparoscope arm can automatically move to the target position. Compared to the traditional method, kinematics is not used to calculate the motor movements, so that it can reduce errors caused by inaccuracy of kinematic parameters and improve the effectiveness of preoperative planning. The simulation results show that the improved method can realize preoperative laparoscope arm automatic positioning and it is also robust.
The automatic positioning algorithm provides a theoretical basis for the laparoscope arm preoperative planning of robot-assisted laparoscopic surgery. It avoids the disadvantage of the heuristic method based on surgeon experience, and it also simplifies the preoperative planning process and reduces the operation time. However, the algorithm is implemented in a virtual environment, and there is a certain gap with the actual system. Therefore, how to implement the algorithm in the actual system is the primary direction of subsequent research.
The data in this study can be requested from the corresponding author.
LY, XY, XC and FZ discussed and decided on the methodology in the study. The preoperative planning algorithm, the reinforcement learning algorithm and simulations have been performed by XY, XC and FZ. LY completed literature review and overall plan.
The authors declare that they have no conflict of interest.
This work was supported by the Natural Science Foundation of Heilongjiang Province (grant no. F2015034). We also greatly appreciate the efforts of the reviewers and our colleagues.
This paper was edited by Jinguo Liu and reviewed by Yi Yang and two anonymous referees.