Use of communication robots to converse with people suffering from schizophrenia

Medication is an important treatment approach for patients with schizophrenia; however, the availability of visiting nurses and other human support is limited. This study aims to build a system in which service robots support the medication treatment of individuals with schizophrenia at home. Moreover, medical staff can seamlessly monitor the status of their patients through the robots using this system. In this study, to develop a support system, interactions between a communication robot and patients were surveyed, with a focus on the patients’ impression of the robot and actual utterance times. Communication between a commercially available communication robot and schizophrenic patients was investigated, particularly the participants’ feelings about the robot. In addition, the utterance data between the participants and the robot were extracted and the durations of the conversations were assessed. An examination of the robot’s interaction mode (talkative or not talkative) and whether the participant spoke to it (spoke or did not speak) revealed no significant association for any of the adjectives. A co-occurrence network graph showed differences in the participants’ impressions of the robot depending on how talkative the robot was. That is, when the robot did not initiate conversation often, the patient was more likely to initiate interactions and use longer phrases than when the robot initiated conversation frequently. Conversations also lasted longer when the participant, not the robot, initiated conversation. People with schizophrenia converse with a robot regardless of whether the robot is talkative or not. Since the content of the conversation was not analyzed qualitatively, it is necessary to carefully examine whether people with schizophrenia can develop partnerships with robots.


Background
Service robots that can interact with people are expected to help patients with their daily routines despite shortages of medical and nursing caregivers [1,2].
Service robots are expected to not only work for humans but also serve as their partners. PARO, a mental commitment robot, provides psychological care to elderly individuals through robotic therapy [3]. Similarly, AIBO, a robot pet, provides companionship to single people, which creates a sense of attachment [4]. When service robots are recognized as partners, they may be able to provide psychological encouragement.
Service robots may be able to support medication management, because long-term medication treatment requires psychological support for the patient. Such support, which targets medication adherence, is effective because it raises the patient's willingness to take their medication by emphasizing the importance of doing so [5,6]. The World Health Organization (WHO) has shown the importance of improving medication adherence to support medication treatment [7]. Therefore, it is preferable to avoid ordering patients to take their medication according to dosage schedules [8]. Rather, it is desirable to provide support via companionship so that schizophrenic patients can improve their lives independently [9].
Many of the existing medication support technologies have a fixed installation location and an LCD display or alarm that alerts the user when it is time to take the medicine [10,11]. A smartphone-based system, designed to improve medication adherence, provides support with a medication time alarm and self-monitoring function on an operation screen composed of letters and buttons [12]. Conversely, visual agents that display animations of human faces on desktop computers and build social relationships through dialogue increase the satisfaction of schizophrenic patients when completing self-reporting tasks [13]. According to a survey by the Consumer Affairs Agency, only 40% of Japanese psychiatric patients (including schizophrenic patients) use the Internet for shopping, etc., which is a lower rate than among healthy individuals [14]. Schizophrenic patients living alone in rural areas have difficulty using smartphones and PCs. Therefore, in consideration of cognitive decline in schizophrenic patients, it is necessary to examine the usability of non-verbal methods other than textual information. In addition, it is effective for the robot to have an appearance that enables social interaction, which is considered effective for improving medication adherence, and a function that helps to build a relationship between the patient and the robot.
Medication treatment is important for individuals with schizophrenia, but the availability of nurses who visit patients at home and of other human support is limited. When a patient's motivation is low, their reduced nodding and answering impede their social relationships [15]. Visiting nurses and families understand the status of people with schizophrenia and help to improve their medication adherence by speaking to them appropriately [16]. To similarly support schizophrenic patients, it is necessary for robots to understand a patient's characteristics and have a trusting relationship that allows the patient to recognize a robot as a partner. A study on the relationship between schizophrenic patients and humanoid robots showed that the facial expressions of robots could convey emotions [17]. However, there are few studies on the interaction between schizophrenic patients and robots. Therefore, to develop support service robots for people with schizophrenia, it is necessary to investigate how the patients communicate with robots.
In contrast, multiple investigations of the mechanisms by which healthy individuals interact with robots have been conducted in recent years [18]. Interactions with store sales robots using appropriate conversation participation models have performed well [19]. A talkative robot that actively greeted users made the users feel welcome [20]. In conversations with robots, voice recognition technology is identified as a social entity more than touch-based technology [21]. It is necessary to study effective technical approaches, such as interactive design and speech recognition, based on the characteristics of schizophrenia to build partnerships between these robots and healthy people.
This study aims to use robot communication technology that enables the robots to talk positively with patients with schizophrenia in the physical world and support them through communication so they can reliably take their medication at home. The goal is to build a system that allows schizophrenic patients to be treated at home with the support of robots, through which medical and nursing staff can seamlessly monitor the patient's status (see Fig. 1). Medication support robots for schizophrenic patients are being developed through collaboration with experts in psychiatric nursing who are familiar with the work of visiting nurses, robotics experts who have developed medical and welfare robots, and cognitive science experts who can conduct examinations and analyses of user-centered design. In this paper, we report how humanoid communication robots, which were intended to prompt social interaction, were accepted by schizophrenic patients using a basic survey that was focused on the patients' impressions and actual utterance times. To examine how well schizophrenic patients accepted communication with the robots, we investigated the relationship between the degree of a robot's talkativeness and the patient's impression of it, as well as the relationship between the duration of conversations and whether the patient or robot initiated them.
This study was approved by the Medical Review Board of Gifu University of Medical Science (2018.11.30) and the three psychiatric daycare centers that participated in the investigation. All participants and their support staff at the daycare facilities gave oral informed consent before participation. The author explained to the participants that they were free to participate or not, and that they would not suffer any disadvantages by not participating. The author also explained that both the contents of the impression survey questionnaire and the voice data recorded would be anonymized prior to data analysis. In addition, the participants were informed that their submission of the completed questionnaire constituted their consent to participate in the study.

Experimental design
This study examined the interaction between commercial communication robots and schizophrenic patients. Using the questionnaire to record their responses, participants reported their opinion when they first met the robot. In addition, data from the voice recordings of the conversations between the participant and the robot were extracted and their durations were recorded. In mental health, nursing, and welfare studies, it is important to analyze interviews and statement records to document the situation and characteristics of the subjects. Therefore, utterance data, which is observational data, was collected and its contents were verified.

Communication robot
The communication robots used in this study were programmed with input from three psychiatric nursing specialists familiar with the three daycare facilities involved in the investigation, robotics specialists with medical robot development experience, and cognitive science specialists with mental health and welfare research experience. The following three conditions were confirmed by the committee members. First, the robot communicated in Japanese, the participants' native language. Second, a chat function was provided as standard. Third, the robot had a well-established record of use in welfare facilities, which served to ensure safety. A humanoid communication robot that met these conditions was selected from products currently available in Japan.
The PALRO (Fujisoft Co., Ltd.), which has been widely used in communication support for elderly welfare facilities in Japan, was selected (see Fig. 2) [22,23]. PALRO is a small robot approximately 40 cm tall that weighs approximately 1.8 kg. It uses open architecture, general-purpose products and is equipped with communication and autonomous movement functions, such as human emotions and learning (speech recognition/speech synthesis/motion detection/personal recognition/face recognition, etc.). Table 1 shows the communication functions that are standard in PALRO. Bickmore et al. [13] presented visual agents that display animations of human faces on desktop computers and build social relationships for schizophrenic patients. Considering cognitive decline in schizophrenic patients, it might be more appropriate to provide audio information in addition to the visual information. PALRO has a switch that allows the user to control how often it initiates communication when it recognizes that there are people around it: the "talkative" mode increases the number of speaking actions when PALRO recognizes that a person is nearby, whereas the "not talkative" mode reduces it. In this survey, we used these two interaction settings, "not talkative", in which PALRO rarely speaks, and "talkative", in which PALRO speaks often, and examined whether participants' impressions and conversation time changed depending on the amount of talking.

Impression survey
Kanda et al. [24] performed a seven-step evaluation of the impression of the robot using the structure design method using a set of 28 adjective pairs. However, in the opinion of psychiatric nursing specialists and cognitive science experts, it was possible that number might be burdensome for our schizophrenic participants to perform. The adjective "wise" was also considered potentially difficult for our participants because it was determined that it might cause delusion and anxiety. Therefore, we selected adjectives that were applicable to the communication robot from the 27 positive expression words of the adjective pair excluding "wise" (kind, likable, familiar, safe, warm, cute, relaxing, comprehensible, casual, light, helpful, natural, full, exciting, pleasant, favorite, interesting, good, complex, fast, quick, chaotic, active, strong, ebullient, fun, and sensitive). Additionally, we provided a simplified version if the participant had any other impressions (see Fig. 3). Participants observed the communication robot, and they were able to describe it freely and submit the questionnaire survey whenever they finished answering it. The impression survey was a simplified version that excluded negative adjective choices and elements; therefore, it might not have been possible to collect a fair and subjective impression of the robot.
In addition, the participants were asked to write down how they felt about the robot. These comments were statistically analyzed with KH Coder [25]. At the end of the questionnaire, the participant chose from the words provided to describe the robot and answered whether they had spoken to the robot. If the participant said that they had not spoken to the robot, they were asked why not.

Utterance survey
Concerning the utterance data, while PALRO was set up for the questionnaire surveys on participants' impressions, only audio data were recorded to check whether the communication robot and participants were communicating. The conversation between the participant and the communication robot was recorded, and continued conversations were extracted. The data collected by the voice recorder were anonymized for ethical considerations. We prepared data in the frequency band of 200-2000 Hz from the recordings of participants at one facility over 10 days (interaction mode: talkative 5 days/not talkative 5 days), excluding the data from the first day, when the interviewers informed the participants about the survey. The utterances determined suitable for the purpose of this survey were extracted, anonymized, and arranged as a transcript. An analysis was conducted using a modified version of the grounded theory approach (M-GTA), which is a mainstream analysis method in psychiatric nursing research, as a guideline [26,27]. However, conversations that were judged to contain inappropriate content related to personal information and medical conditions were excluded.

Participants providing impression and utterance data
The participants were schizophrenia patients who attended a psychiatric daycare facility in Gifu Prefecture. The participants were recruited by word of mouth at the facility. Only participants who agreed to cooperate were given the questionnaire for the impression survey. Table 2 shows the information the participants provided on the questionnaire; they submitted 127 completed surveys. However, since the participants were allowed to submit completed surveys multiple times during the implementation period, there may have been duplicate submissions. The proportion of participants by age showed no significant differences in a comparison with Japanese schizophrenic patients during outpatient treatment, although the proportion of those aged 20-39 was almost twice that of outpatients.
In the utterance survey, the conversations between the communication robot and each participant were recorded using a voice recorder, but no individual was identified for ethical considerations. Though recording was performed at three facilities, we decided to use data from only one facility because the recording conditions were the best at this facility. Table 3 shows the recording and utterance times. A statistical power analysis to determine the necessary sample size was conducted using G*Power [28]. We opted for an effect size estimate of f = .25, which is a medium effect according to Cohen's conventions [29]. The power and alpha error levels were set to 1 − β = .80 and α = .05, respectively. To identify a possible significance in an ANOVA (fixed effects, special, main effects, and interactions), G*Power indicated a minimum total sample size of 128.
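The link between the chosen effect size and the reported sample size can be written out as follows. This is a sketch of the standard noncentral F power calculation, not a reproduction of the authors' G*Power session: the numerator degrees of freedom (df_1 = 1) and the number of groups (k = 4, as in a 2 × 2 design) are assumptions, since the text does not state them.

```latex
% Noncentrality parameter for Cohen's f and total sample size N
\lambda = f^{2} N = (0.25)^{2} \times 128 = 8

% Degrees of freedom (df_1 = 1 and k = 4 are assumed, not given in the text)
df_{1} = 1, \qquad df_{2} = N - k = 128 - 4 = 124

% N is the smallest total sample size for which the power reaches the target
1 - \beta = P\!\left( F'_{df_{1},\, df_{2}}(\lambda) > F_{1-\alpha;\; df_{1},\, df_{2}} \right) \geq .80,
\qquad \alpha = .05
```

Here F' denotes the noncentral F distribution and F_{1-α} the critical value of the central F distribution.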

Measurements
In the impression survey, incomplete data such as incomplete/unfilled questionnaires were excluded from the collected data forms. Whether a participant talked with the robot was noted, and the set of adjectives that the participant had selected was collected. The questionnaire also collected age and gender data. Kanda et al. [24] collected participants' experience, such as watching movies about robots and playing with robot toys, and revealed that those activities influenced their impressions. The experience of interaction with robots may differ depending on age and gender. Therefore, this study also examines whether there is a relationship between demographics and impression reporting.
In the utterance survey, we noted whether it was the participant or the robot that initiated the conversation. A conversation was considered to occur if the speaker alternated between robot and participant with no more than a 5-second gap. The utterances were counted and their durations were measured.
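The conversation criterion above can be sketched in code. The following is a minimal illustration, not the authors' procedure: the `Utterance` record, its speaker labels, and the choice to start a new run when the same speaker talks twice in a row are all assumptions made for the example; only the 5-second gap comes from the text.

```python
from dataclasses import dataclass
from typing import List

MAX_GAP_S = 5.0  # maximum silence between alternating speakers (from the text)


@dataclass
class Utterance:
    speaker: str  # "robot" or "participant" (hypothetical labels)
    start: float  # start time in seconds
    end: float    # end time in seconds


def group_conversations(utterances: List[Utterance],
                        max_gap: float = MAX_GAP_S) -> List[List[Utterance]]:
    """Group utterances into conversations.

    An utterance extends the current run only if the speaker differs from
    the previous utterance and the silence between them is <= max_gap.
    Runs with a single utterance are dropped (no alternation occurred).
    Assumption: a repeated speaker starts a new run.
    """
    conversations: List[List[Utterance]] = []
    current: List[Utterance] = []
    for u in sorted(utterances, key=lambda x: x.start):
        if current:
            prev = current[-1]
            if u.speaker != prev.speaker and u.start - prev.end <= max_gap:
                current.append(u)
                continue
            if len(current) >= 2:
                conversations.append(current)
        current = [u]
    if len(current) >= 2:
        conversations.append(current)
    return conversations


def initiator(conversation: List[Utterance]) -> str:
    """Speaker of the first utterance in a conversation."""
    return conversation[0].speaker


def duration(conversation: List[Utterance]) -> float:
    """Elapsed time from the first utterance's start to the last one's end."""
    return conversation[-1].end - conversation[0].start
```

With conversations grouped this way, the initiator and the duration of each conversation can be tabulated per interaction mode.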

Procedure
The purpose of this study is to investigate whether a communication robot can function to support improved medication adherence through social interaction. In order to promote social interaction, it is desirable that communication robots not be regarded as unilateral information providers, but as peers with social cognition with whom schizophrenics may want to engage. Therefore, we assessed whether schizophrenics would show interest in communication robots. To record natural conversation with schizophrenic patients, the authors placed the robot in a public space of the daycare facility.
A communication robot was installed at a psychiatric daycare facility in Gifu Prefecture. Participants were provided with an environment where they could freely interact with the robot. They participated arbitrarily in impression surveys using questionnaires and speech surveys. The survey was conducted from February 2019 to July 2019 at monthly intervals at three psychiatric daycare facilities. In all facilities, the communication robot was introduced during a morning meeting on the first day of PALRO installation, and both the participants and staff were informed of the survey method. In the first week, PALRO was set up in the "talkative" interaction mode (i.e., the robot spoke often) or in a "not talkative" mode, in which the robot rarely spoke. In the second week, the interviewers distributed an impression survey questionnaire after removing PALRO. In the third week, PALRO was set up in the opposite interaction mode ("not talkative" if "talkative" was used in the first week and vice versa). In the fourth week, an impression survey questionnaire was conducted without PALRO.
When we introduced PALRO to our users, we used the popular "Please sing some song" communication function. During the installation period, a sheet outlining the contents of Table 1 enclosed with PALRO was posted for guidance. Experimenters and staff did not guide patients to talk to PALRO.

Impression survey
A total of 127 completed single-sheet questionnaires were collected at the three facilities. Of those, 16 incomplete sheets were excluded from the analysis, and 29 sheets on which participants answered "I don't feel anything" for the question about the robot's presence were also excluded from the analysis. Another 9 sheets were excluded because the participants had not answered whether they had spoken with the robot or not, as those sheets would have reduced the accuracy of the answer. Therefore, 73 sheets were analyzed, as shown in Table 4. Figure 4 shows the adjectives the participants used to describe their feelings about the robot. A chi-square test was performed on four groups of answers, divided by interaction mode (talkative or not) and whether the participant spoke to the robot or not (spoke or did not speak). No significant association was found for any of the adjectives. Therefore, the interaction mode of the robot and whether or not the participant spoke to the robot were not related to a participant's impression of the robot. A regression analysis was performed to determine whether participants' gender, age, and the "talkative / not talkative" robot condition were related to the selected impression. We found that the selection of "cute" was related to lower age groups (p < 0.1), while "casual" was related to whether or not the robot was talkative (p < 0.1). "Sensitive" was statistically significant, with male participants selecting this descriptor when the robot was talkative (p < 0.001).
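The sheet bookkeeping and the form of the chi-square test can be illustrated with a short sketch. The sheet counts (127, 16, 29, 9) come from the text; the contingency table is HYPOTHETICAL and only shows the shape of one possible per-adjective test (four groups × selected / not selected), since the actual counts and table layout are not given here.

```python
from typing import List, Tuple


def remaining_after_exclusions(collected: int, exclusions: List[int]) -> int:
    """Number of sheets left after removing each excluded batch."""
    return collected - sum(exclusions)


def pearson_chi_square(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Sheet counts reported in the text: 127 collected; 16, 29 and 9 excluded.
analyzed_sheets = remaining_after_exclusions(127, [16, 29, 9])

# HYPOTHETICAL counts for one adjective (not data from the study).
# Rows: talkative/spoke, talkative/did not speak,
#       not talkative/spoke, not talkative/did not speak.
# Columns: adjective selected, adjective not selected.
hypothetical_table = [[12, 8], [9, 11], [10, 10], [7, 6]]
statistic, dof = pearson_chi_square(hypothetical_table)
print(analyzed_sheets, round(statistic, 3), dof)
```

The statistic would then be compared against the chi-square distribution with the returned degrees of freedom, once per adjective.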
The comments provided by the participants in the empty text box were converted into text data. In the "talkative" condition, 156 sentences of free-description answers were analyzed. A total of 618 words were extracted from these sentences (242; hereafter, the number of words used in the analysis is given in parentheses), and the number of word types was 232 (146). In the "not-talkative" condition, 208 sentences were analyzed; the total number of extracted words was 1170 (457), and the number of word types was 342 (237). The text data on the impression of the robot were statistically processed using KH Coder. Figure 5 is a co-occurrence network graph based on the text data collected in talkative mode, and Fig. 6 is that for the not-talkative mode. Since the collected comments had many short sentences, only responses consisting of at least three words were analyzed. In order to search for patterns in word-to-word connections, graphing was performed by subgraph detection (betweenness) using color-coded groups. The co-occurrence network graph confirmed a difference in the descriptive reporting, with words and groupings being more strongly connected in the not-talkative mode than in the talkative mode. In Fig. 5, the Jaccard coefficient representing the mean degree of association between the extracted words was 0.09, and the displayed graph had a weak degree of association. On the other hand, in Fig. 6, the Jaccard coefficient was 0.25, and the displayed graph had a moderate degree of association. There were five color-coded groups in Fig. 5 and six in Fig. 6. Therefore, not-talkative had stronger ties between words than talkative, and slightly more groups of co-occurring words, suggesting that there may be a difference in the descriptive reporting of impressions. Table 5 shows the number of utterances and conversations of every speaker.
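For readers unfamiliar with the measure, the Jaccard coefficient between two words is the number of text units containing both words divided by the number containing at least one of them. The sketch below computes it for every word pair; it is an illustration of the definition only, not KH Coder's implementation. It assumes the sentence is the unit of co-occurrence, and the toy sentences are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard_cooccurrence(sentences: List[List[str]]) -> Dict[Tuple[str, str], float]:
    """Jaccard coefficient for every pair of words.

    J(a, b) = |units containing a and b| / |units containing a or b|,
    where each unit is one tokenized sentence (an assumption of this sketch).
    Keys are alphabetically ordered word pairs.
    """
    # Map each word to the set of sentence indices in which it appears.
    occurrences: Dict[str, Set[int]] = {}
    for index, words in enumerate(sentences):
        for word in set(words):
            occurrences.setdefault(word, set()).add(index)

    scores: Dict[Tuple[str, str], float] = {}
    for a, b in combinations(sorted(occurrences), 2):
        both = len(occurrences[a] & occurrences[b])
        either = len(occurrences[a] | occurrences[b])
        scores[(a, b)] = both / either
    return scores


# HYPOTHETICAL tokenized comments (not data from the study).
toy_sentences = [
    ["robot", "cute", "talk"],
    ["robot", "talk", "fun"],
    ["robot", "cute"],
    ["fun", "song"],
]
toy_scores = jaccard_cooccurrence(toy_sentences)
```

In a co-occurrence network, word pairs whose coefficient exceeds a chosen threshold are drawn as edges, so a higher average coefficient corresponds to a more tightly connected graph.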
A chi-square test for independence indicated no significant association between the type of speaker and the interaction mode concerning the number of utterances (χ²(1) = 2.87, p > .05). Further, it indicated a significant association between the type of speaker and the interaction mode of the robot concerning the number of conversations (χ²(1) = 7.81, p < .01). Consequently, it was shown that the interaction mode affected whether the participant or the robot was more likely to initiate conversation. Figure 7 summarizes the number of conversations depending on the interaction mode and the speaker. When the interaction mode was not talkative, fewer robot-initiated conversations occurred, indicating that the interaction mode of the robot affects the initiation of the conversation. On the other hand, the participants had the same number of conversations regardless of the interaction mode of the robot; hence, the interaction mode did not affect the likelihood of the participant initiating conversation. Figure 8 shows the means and standard deviations of the robot's and the participants' conversations. A two-way 2 (interaction mode: talkative, not talkative; within-participants) × 2 (type of speaker: Robot, Participants; within-participants) ANOVA was applied. This revealed a significant main effect of the type of speaker (F(1, 332) = 6.91, p < .01, ηp² = .02, 1 − β = .76), indicating that conversations lasted significantly longer when the robot spoke for longer periods of time than the participant.
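The reported statistics can be sanity-checked from the values in the text alone. The sketch below uses two standard identities: for one degree of freedom, the upper-tail p-value of a chi-square statistic x is erfc(√(x/2)), and partial eta squared can be recovered from F and its degrees of freedom. The function names are illustrative; no study data beyond the reported statistics are used.

```python
import math


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    With 1 df the statistic is a squared standard normal variable, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Statistics as reported in the text.
p_utterances = chi2_p_value_df1(2.87)     # number of utterances
p_conversations = chi2_p_value_df1(7.81)  # number of conversations
eta_p2 = partial_eta_squared(6.91, 1, 332)

print(f"p (utterances)    = {p_utterances:.4f}")
print(f"p (conversations) = {p_conversations:.4f}")
print(f"partial eta^2     = {eta_p2:.3f}")
```

Both p-values fall on the sides of the thresholds stated in the text (above .05 and below .01, respectively), and the recovered effect size rounds to the reported .02.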

Utterance survey
The dialogues that psychiatric nursing specialists compiled into transcripts ranged from a minimum of 1 turn to a maximum of 39 turns per dialogue. Counting one exchange between the communication robot and a participant as one turn, the mean was 7.9 turns and the median was 6 turns.
The verbal transcripts created from the collected utterance data were classified into topics based on the content and conclusions of the utterances in a process called open coding in the M-GTA, and finally 88 concepts were generated. Next, these concepts were summarized into 13 categories, as shown in Table 6, through axial coding, which organized the phenomena that occurred in conversation. In this categorization, to examine how much the communication functions of PALRO shown in Table 1 were used, categories were retained without being aggregated even if they contained few concepts. Eight of the twelve functions presented in the manual as communication functions were used: No. 2, No. 5, No. 6, No. 7, No. 8, No. 10, No. 11, and No. 12 in Table 6.
Of the 88 concepts, 39 had 0-5 turns, below the median, and 49 had 6 or more turns, indicating that some conversations involved the patient waiting for a response. Some conversations with 6 or more turns involved negative utterances ("It's irritating"), and many positive utterances ("Ah, PALRO. Can you talk about something?") could also be identified in the script. The free description of the paper questionnaire survey included comments such as "I was happy to come to the day care and talk to the robot". Some participants reported that PALRO was welcomed as a conversation partner.

Discussion
The investigation of the robot's degree of talkativeness and the opinions of the schizophrenic patients indicated no relationship between the talkativeness of the robot and the impression felt. However, the statistical processing of comments gathered in the questionnaire revealed that, when the robot was not talkative, more complex word connections were formed in the co-occurrence network. Conversations lasted longer when the robot initiated them. However, the robot's interaction mode was not related to the establishment or length of the conversation. The results suggest that people with schizophrenia participate in conversations regardless of the talkativeness of the communication robot. Since the content of the conversation was not analyzed quantitatively, it is necessary to carefully examine whether individuals with schizophrenia are able to form partnerships with robots.
In an experiment with healthy people, participants evaluated several models of interaction and decided that the best model was based on interaction with a shopkeeper [19]. A robot that greeted customers actively gave the participants an outgoing and friendly impression [20]. These results suggest that a robot that implements an interaction design tailored to the target and context could change people's impression of the interaction. Our results did not show evidence that schizophrenia affects the impression of whether the robot's interaction mode is talkative or not. Participants in the present study may have found it difficult to form an impression of, or relate to, the robot based on its talkativeness. However, the complexity of the co-occurrence network graph obtained from the questionnaire did depend on the robot's interaction mode. To make a communication robot that is well evaluated by schizophrenic patients, it would be necessary to study more interaction designs and models that promote natural conversation based on the characteristics of schizophrenic patients. It is also necessary to analyze how visiting nurses interact with schizophrenic patients [16].
Healthy young people reported feeling friendly with robots in conversations based on speech recognition [21]. Bickmore et al. [13] showed that visual agents that display animations of human faces had an influence on schizophrenic patients' satisfaction. However, the "not-talkative" condition made the free descriptions in the questionnaire more complex. In addition, the younger schizophrenic participants had the impression that PALRO was "cute". Additionally, schizophrenic participants reported an "active" impression in the "talkative" condition, and male participants in particular reported an impression of "sensitive". These findings suggest that the attributes and experiences of schizophrenic participants might have an influence on their impression. In addition, Kanda et al. [24] showed that participants' experience influenced their impressions of a robot's behavior. The PALRO used in this study has physical poses and behaviors reminiscent of human movement (bowing, turning the face in the direction of people, dancing, etc.) that were originally incorporated in [17]. People with schizophrenia responded faster with negative facial expressions than with positive facial expressions. However, it has been reported from a comparative analysis of healthy people that it is harder for a robot to convey emotions via facial expression than for a human to do so. In addition, schizophrenic patients tend not to function well in social interactions if their symptoms are negative [15]. The communication robot employed in this study can recognize speech; however, its facial expressions are not similar to those of humans, unlike the case with humanoid-type robots (for example, PALRO does not have eyebrows, eyes, a nose, or a mouth). The schizophrenic patients who participated in the survey were patients at mental daycare facilities and were not personally identified. The participants' symptoms were relatively stable but not uniform. Therefore, it is necessary to scrutinize an effective approach to schizophrenic features and to know a patient's symptoms in real time.
The establishment of a conversation between a schizophrenic person and the robot improved when it was the robot that initiated the conversation. The robot used in the present research was effective in communication care for people with dementia in Japan [23]. Therefore, there is a possibility that the effect of initiating conversation with dementia patients would be similar to that with people with schizophrenia. That possibility could not be evaluated in this survey; however, PALRO may be a suitable robot for such an evaluation.
Finally, this study had several limitations. No relationship was found between the patients' impression of the robot and speaking as interaction. However, more people who reported that they did not feel anything about the presence of the robot were excluded from the analysis data for the not-talkative condition than for the talkative condition (see Table 7). The chi-square test showed a significant relationship between interaction mode and feeling the robot's presence (χ²(1) = 8.54, p < .01). In other words, there is a possibility that the degree of talkativeness was related to whether the robot made an impression on the participant. Therefore, it is necessary to consider a different approach than the one taken in this study in order to explore the factors related to impression. Moreover, for speech analysis, the contents of speech and conversation were not analyzed. Therefore, we did not sufficiently examine the social interaction between the patients and the robot. It is also possible that a lack of social interaction with the robot prevented the patients from forming a relevant impression about the robot. It will also be necessary to further examine whether the content of the conversation with the robot was appropriate for schizophrenic patients. For that reason, it is necessary to scrutinize the contents of utterances; however, this must be done with great care.

Conclusions
In this study, we investigated the impressions and utterances between schizophrenic patients and communication robots. In summary, more than two-thirds of the daycare facility users participated in the impression survey using the free participation questionnaire, and we obtained the patients' opinions about the communication robots. During the installation period, conversations occurred daily between the participants and the communication robot. Among these, half of the dialogs, which were analyzed and arranged in transcripts, were conversations of six or more turns, and there were some events that could be interpreted as the patient considering the communication robot as a conversation partner. In addition, there were no complaints or cancellations from users or daycare facility staff during the experiment, and no direct denials claiming harm or damage by the communication robot. Therefore, we considered that even in the living environment of the schizophrenic patients who participated, they did not feel a sense of rejection and considered themselves to have been accepted, and it was observed that social interaction could occur. The patients conversed with the robot, and did so regardless of the robot's talkativeness. When the robot was not talkative, schizophrenic patients initiated conversation with the robot about twice as often as the robot did.
For future research, it will be necessary to examine in detail how the amount of conversations from the communication robot affected schizophrenic patients from the viewpoint of user-centered design. In this study, some of the utterances that could not be recorded in the script were derived from the symptoms of schizophrenic patients, so it will be necessary to establish a system that can respond to such data and conduct a resurvey. At that time, a survey method using video data will be needed to capture the characteristics of body motion. To improve medication adherence, which was the purpose of this study, and develop a robot that can support schizophrenic patients in their daily lives, it will be necessary to further explore the characteristics of social interactions of schizophrenic patients. We believe that studying an interaction model based on the characteristics of schizophrenia during the patients' daily life will lead to a better understanding of schizophrenia and its various symptoms, and will contribute towards improved home medical care for people. Finally, in terms of the limitations of this study, the data may be biased because the questionnaire survey had only 127 participants, and the utterance data was as low as 63 times for approximately 170 users. At some facilities, there were participants who could not talk to the robots due to the daily schedule of recreation and outings, so it will be necessary to consider the installation method and period of studies. In addition, many of the schizophrenic participants who participated this time are living with relatively stable symptoms, so it is highly likely that the survey results show only a part of the overall effect of schizophrenia. It will be necessary to continue the research to establish the relationship between schizophrenic patients and communication robots compared to other research in the field of mental health.