catlab

[PUBLISHED] Polite AI mitigates user susceptibility to AI hallucinations

Pak, R., Rovira, E., & McLaughlin, A. C. (in press). Polite AI mitigates user susceptibility to AI hallucinations. Ergonomics. https://doi.org/10.1080/00140139.2024.2434604

Abstract

With their increased capability, AI-based chatbots have become increasingly popular tools to help users answer complex queries. However, these chatbots may hallucinate, or generate incorrect but very plausible-sounding information, more frequently than previously thought. Thus, it is crucial to examine strategies to mitigate human susceptibility to hallucinated output. In a between-subjects experiment, participants completed a difficult quiz with assistance from either a polite or neutral-toned AI chatbot, which occasionally provided hallucinated (incorrect) information. Signal detection analysis revealed that participants interacting with polite-AI showed modestly higher sensitivity in detecting hallucinations and a more conservative response bias compared to those interacting with neutral-toned AI. While the observed effect sizes were modest, even small improvements in users’ ability to detect AI hallucinations can have significant consequences, particularly in high-stakes domains or when aggregated across millions of AI interactions.

Practitioner Summary

This study examined how AI chatbot etiquette affects users’ susceptibility to AI hallucinations. Through a controlled experiment, results showed polite-AI led to modestly higher sensitivity in detecting hallucinations and a more conservative response bias. This suggests a potential design strategy that may enhance users’ critical evaluation of AI-generated content.
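For readers unfamiliar with the signal detection measures mentioned in the abstract, here is a minimal sketch of how sensitivity (d') and response bias (criterion c) are conventionally computed from response counts. It is not the analysis code from the paper. The function name, the log-linear correction, and the mapping of "hit" to "participant rejects a hallucinated answer" are assumptions made for illustration.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from raw response counts.

    Assumed mapping (hypothetical, for illustration only):
      hit               = hallucinated answer correctly rejected
      miss              = hallucinated answer accepted
      false alarm       = correct answer wrongly rejected
      correct rejection = correct answer accepted
    """
    # Log-linear correction keeps rates away from 0 and 1,
    # where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # standard normal quantile function
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion


if __name__ == "__main__":
    # Made-up counts, not data from the study.
    d, c = sdt_measures(hits=8, misses=2, false_alarms=2, correct_rejections=8)
    print(f"d' = {d:.2f}, c = {c:.2f}")
```

A larger d' means hallucinated and correct answers are told apart more reliably. Under the mapping assumed here, a positive c means the respondent is less willing to flag an answer as hallucinated; which direction counts as "conservative" depends on how the response categories are defined in a given study.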

[PUBLISHED] Attention control measures improve the prediction of performance in navy trainees

Burgoyne, A. P., Mashburn, C. A., Tsukahara, J. S., Pak, R., Coyne, J. T., Foroughi, C., Sibley, C., Drollinger, S. M., & Engle, R. W. (in press). Attention control measures improve the prediction of performance in navy trainees. International Journal of Selection and Assessment. https://doi.org/10.1111/ijsa.12510

Abstract

Military selection tests leave room for improvement when predicting work-relevant outcomes. We tested whether measures of attention control, working memory capacity, and fluid intelligence improved the prediction of training success above and beyond composite scores used by the U.S. Military. For student air traffic controllers, commonality analyses revealed that attention control explained 9.1% (R = .30) of the unique variance in academic performance, whereas the Armed Forces Qualification Test explained 5.2% (R = .23) of the unique variance. For student naval aviators, incremental validity estimates were small and nonsignificant. For student naval flight officers, commonality analyses revealed that attention control measures explained 11.8% (R = .34) of the unique variance in aviation preflight indoctrination training performance and 4.3% (R = .21) of the unique variance in flight performance. Although these point estimates are based on relatively small samples, they provide preliminary evidence that attention control measures might improve training outcome classification accuracy in real-world samples of military personnel.
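The percentages and the parenthetical coefficients in the abstract appear to be linked by the usual relation, percent variance = 100 × R². The short sketch below checks that reading, assuming each coefficient was rounded to two decimals. The helper name and the rounding assumption are ours, not the authors'.

```python
def percent_range_from_r(r, decimals=2):
    """Range of 100 * R^2 compatible with a coefficient rounded to `decimals`."""
    half_step = 0.5 * 10 ** -decimals
    low = 100 * (r - half_step) ** 2
    high = 100 * (r + half_step) ** 2
    return low, high


# (coefficient, reported percent of unique variance), as given in the abstract
reported = [(0.30, 9.1), (0.23, 5.2), (0.34, 11.8), (0.21, 4.3)]

# True where the reported percentage falls inside the compatible range
consistent = [
    percent_range_from_r(r)[0] <= pct <= percent_range_from_r(r)[1]
    for r, pct in reported
]

if __name__ == "__main__":
    for (r, pct), ok in zip(reported, consistent):
        low, high = percent_range_from_r(r)
        print(f"R = {r:.2f}: {low:.1f}% to {high:.1f}% (reported {pct}%) ok={ok}")
```

Under that assumption, every reported percentage falls within the range implied by its coefficient.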

PUBLISHED: Evaluating Attitudes and Experience With Emerging Technology in Cadets and Civilian Undergraduates

Our new research has just been published. The full text PDF is available by clicking here.

Citation:
Pak, R., Rovira, E., McLaughlin, A. C., & Leidheiser, W. (2017, April 10). Evaluating Attitudes and Experience With Emerging Technology in Cadets and Civilian Undergraduates. Military Psychology. http://dx.doi.org/10.1037/mil0000175

Abstract: Existing research on the characteristics of digital natives, traditionally defined as those born after 1980, has shown subtle differences in how they approach technology compared with other cohorts. However, much of the existing research has focused on a limited set of conventional technologies, mostly related to learning. In addition, prior research has shown differences within this cohort in how they respond to autonomous technology (e.g., trust, reliance; Pak, Rovira, McLaughlin, & Baldwin, 2016). The purpose of this short report, representing the first wave of data collection in a larger study examining technology experience and attitude change, is to directly address 2 shortcomings in the literature on digital natives, which tends to emphasize: (a) civilian students; and (b) conventional, often learning technologies. We addressed these 2 issues by recruiting 2 subgroups of digital natives (students and military cadets) and assessing attitudes and experience with a wide range of technology spanning from conventional (e.g., mobile) to emerging (e.g., robotics). The results showed that both groups were surprisingly unfamiliar with emerging consumer technologies. Additionally, contrary to expectations, cadets were significantly, albeit only slightly, less experienced with mobile technologies, VR/augmented reality, social media, and entertainment technology as compared to civilian undergraduates.

PUBLISHED: Effects of individual differences in working memory on performance and trust with various degrees of automation

Our latest article “Effects of individual differences in working memory on performance and trust with various degrees of automation” has been published on Taylor & Francis Online. It is available at: http://www.tandfonline.com/doi/full/10.1080/1463922X.2016.1252806.

ABSTRACT
Previous studies showed performance benefits with correct automation, but performance costs when the automation was incorrect (i.e. provided an incorrect course of action), particularly as degrees of automation increased. Automation researchers have examined individual differences, but have not investigated the relationship between working memory and performance with various degrees of automation that is both correct and incorrect. In the current study, working memory ability interacted with automation reliability and degree of automation. Higher degrees of correct automation helped performance while higher degrees of incorrect automation worsened performance, especially for those with lower working memory. Lower working memory was also associated with more trust in automation. Results illustrate the interaction between degree of automation and individual differences in working memory on performance with automation that is correct and automation that fails.

PUBLISHED: The effect of individual differences in working memory in older adults on performance with different degrees of automated technology

Our latest research is published and available here: http://www.tandfonline.com/doi/full/10.1080/00140139.2016.1189599

Pak, R., McLaughlin, A. C., Leidheiser, W., & Rovira, E. (2016). The effect of individual differences in working memory in older adults on performance with different degrees of automated technology. Ergonomics. http://doi.org/10.1080/00140139.2016.1189599

ABSTRACT

A leading hypothesis to explain older adults’ overdependence on automation is age-related declines in working memory. However, it has not been empirically examined. The purpose of the current experiment was to examine how working memory affected performance with different degrees of automation in older adults. In contrast to the well-supported idea that higher degrees of automation, when the automation is correct, benefits performance but higher degrees of automation, when the automation fails, increasingly harms performance, older adults benefited from higher degrees of automation when the automation was correct but were not differentially harmed by automation failures. Surprisingly, working memory did not interact with degree of automation but did interact with automation correctness or failure. When automation was correct, older adults with higher working memory ability had better performance than those with lower abilities. But when automation was incorrect, all older adults, regardless of working memory ability, performed poorly.

Practitioner Summary: The design of automation intended for older adults should focus on ways of making the correctness of the automation apparent to the older user and suggest ways of helping them recover when it is malfunctioning.

PUBLISHED: Effects of information visualization on older adults’ decision-making performance in a medicare plan selection task

The paper can be downloaded here:

Price, M. M., Crumley-Branyon, J., Leidheiser, W., & Pak, R. (2016). Effects of information visualization on older adults’ decision-making performance in a medicare plan selection task: a comparative usability study. JMIR Human Factors.

ABSTRACT

Background: Technology gains have improved tools for evaluating complex tasks by providing environmental supports (ES) that increase ease of use and improve performance outcomes through the use of information visualizations (info-vis). Complex info-vis emphasize the need to understand individual differences in abilities of target users, the key cognitive abilities needed to execute a decision task, and the graphical elements that can serve as the most effective ES. Older adults may be one such target user group that would benefit from increased ES to mitigate specific declines in cognitive abilities. For example, choosing a prescription drug plan is a necessary and complex task that can impact quality of life if the wrong choice is made. The decision to enroll in one plan over another can involve comparing over 15 plans across many categories. Within this context, the large amount of complex information and reduced working memory capacity puts older adults’ decision making at a disadvantage. An intentionally designed ES, such as an info-vis that reduces working memory demand, may assist older adults in making the most effective decision among many options.

Objective: The objective of this study is to examine whether the use of an info-vis can lower working memory demands and positively affect complex decision-making performance of older adults in the context of choosing a Medicare prescription drug plan.

Methods: Participants performed a computerized decision-making task in the context of finding the best health care plan. Data included quantitative decision-making performance indicators and surveys examining previous history with purchasing insurance. Participants used a colored info-vis ES or a table (no ES) to perform the decision task. Task difficulty was manipulated by increasing the number of selection criteria used to make an accurate decision. A repeated measures analysis was performed to examine differences between the two table designs.

Results: Twenty-three older adults between the ages of 66 and 80 completed the study. There was a main effect for accuracy such that older adults made more accurate decisions in the color info-vis condition than the table condition. In the low difficulty condition, participants were more successful at choosing the correct answer when the question was about the gap coverage attribute in the info-vis condition. Participants also made significantly faster decisions in the info-vis condition than in the table condition.

Conclusions: Reducing the working memory demand of the task through the use of an ES can improve decision accuracy, especially when the selection criteria focus on only a single attribute of the insurance plan.

PUBLISHED: Does the domain of technology impact user trust? Investigating trust in automation across different consumer-oriented domains in young adults, military, and older adults

Our new paper can be downloaded at: http://www.tandfonline.com/eprint/HJrFr5ChDd6xvFjv5pjA/full

Pak, R., Rovira, E., McLaughlin, A. C., & Baldwin, N. (2016). Does the Domain of Technology Impact User Trust? Investigating trust in automation across different consumer-oriented domains in young adults, military, and older adults. Theoretical Issues in Ergonomics Science. doi:10.1080/1463922X.2016.1175523.

ABSTRACT

Trust has been shown to be a determinant of automation usage and reliance. Thus, understanding the factors that affect trust in automation has been a focus of much research. Despite the increased appearance of automation in consumer-oriented domains, the majority of research examining human-automation trust has occurred in highly specialised domains (e.g. flight management, military) and with specific user groups. We investigated trust in technology across three different groups (young adults, military, and older adults), four domains (consumer electronics, banking, transportation, and health), two stages of automation (information and decision automation), and two levels of automation reliability (low and high). Our findings suggest that trust varies on an interaction of domain of technology, reliability, stage, and user group.

PUBLISHED: A multi-level analysis of the effects of age and gender stereotypes on trust in anthropomorphic technology by younger and older adults

Our recent paper on anthropomorphic technology and stereotypes has just been published.

Pak, R., McLaughlin, A. C., & Bass, B. (In press). A Multi-level Analysis of the Effects of Age and Gender Stereotypes on Trust in Anthropomorphic Technology by Younger and Older Adults. Ergonomics.

Abstract: Previous research has shown that gender stereotypes, elicited by the appearance of the anthropomorphic technology, can alter perceptions of system reliability. The current study examined whether stereotypes about the perceived age and gender of anthropomorphic technology interacted with reliability to affect trust in such technology. Participants included a cross-section of younger and older adults. Through a factorial survey, participants responded to health-related vignettes containing anthropomorphic technology with a specific age, gender, and level of past reliability by rating their trust in the system. Trust in the technology was affected by the age and gender of the user as well as its appearance and reliability. Perceptions of anthropomorphic technology can be affected by pre-existing stereotypes about the capability of a specific age or gender.

Practitioner Summary: The perceived age and gender of automation can alter perceptions of the anthropomorphic technology such as trust. Thus, designers of automation should design anthropomorphic interfaces with an awareness that the perceived age and gender will interact with the user’s age and gender.