STaPLab

LLM-driven Conversational Survey and Classification Frameworks for Reporting Privacy Incidents

Technology users experience various types of privacy harms, including harms to reputation, psychological well-being, or autonomy, and some of these harms can even lead to discrimination. However, we know very little about exactly which technologies and features cause these harms, and in what contexts. In this project we aim to create a research infrastructure to track privacy harms, mitigation techniques, and best practices for preventing such harms.

Problem

Conversational Surveys

Traditionally, qualitative data have been gathered through in-person interactions, such as interviews. Because these take time, money, and human resources, many research institutions and companies use free-response surveys instead. We’ve all taken free-response surveys online, but these surveys do not always capture the full depth of information that the researchers are looking for. To make matters worse, the institution that runs the survey has no way to make sure that responses are accurate, complete, and relevant. Researchers and companies have begun using large language models (LLMs) to guide surveys and make them more effective, with some success. Sometimes these surveys are even formatted as conversations with an AI agent. Our research focuses on how we can use these LLM-driven conversational surveys to gather reports of privacy incidents and infractions. What survey formats encourage people to share more information? Can an AI agent help gather all the information needed for a helpful privacy incident report? Are people comfortable sharing their privacy incidents with an AI agent in a chat conversation? These are the questions our research is attempting to address.

Classification

Free-response data, such as survey responses, are most useful when each response is tagged with important information. Tagging helps us understand what certain responses have in common and what kinds of things people talked about. Our lab analyzes survey responses about privacy incidents and infractions, which means we need to review each response and tag it with information about what the respondent said about the incident. However, this process takes a significant amount of time, and at large scale it requires a great deal of human involvement. Consequently, we are exploring how well LLMs can tag survey responses using an established coding scheme, Solove’s taxonomy of privacy. With careful prompt tuning and an ensemble architecture, we aim to automate much of the classification (tagging) process.
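As a rough illustration of what an ensemble tagger could look like, the sketch below combines several classifiers by majority vote and sends low-agreement responses to a human reviewer. It is a minimal sketch under stated assumptions, not our lab's implementation: the function names, the agreement threshold, and the keyword-based stand-in classifiers are all hypothetical, and in practice each classifier would wrap a differently prompted LLM call. The labels are the four top-level groups of Solove's taxonomy.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# The four top-level groups in Solove's taxonomy of privacy.
SOLOVE_GROUPS = (
    "information collection",
    "information processing",
    "information dissemination",
    "invasion",
)

# Hypothetical interface: a classifier maps response text to one label.
Classifier = Callable[[str], str]


def ensemble_tag(
    response: str,
    classifiers: List[Classifier],
    min_agreement: float = 0.5,
) -> Tuple[Optional[str], float]:
    """Majority-vote label plus the share of classifiers that chose it.

    Returns (None, agreement) when agreement does not exceed the
    threshold, signalling that a human should review the response.
    """
    votes = [clf(response) for clf in classifiers]
    # Ignore anything that is not a known taxonomy label.
    valid = [v for v in votes if v in SOLOVE_GROUPS]
    if not valid:
        return None, 0.0
    label, count = Counter(valid).most_common(1)[0]
    agreement = count / len(classifiers)
    if agreement <= min_agreement:
        return None, agreement
    return label, agreement


def keyword_stub(keyword: str, label: str, fallback: str) -> Classifier:
    """Toy stand-in for an LLM call (assumption, for illustration only)."""
    return lambda text: label if keyword in text.lower() else fallback


classifiers = [
    keyword_stub("shared", "information dissemination", "invasion"),
    keyword_stub("advertisers", "information dissemination", "information collection"),
    keyword_stub("camera", "information collection", "information processing"),
]

label, agreement = ensemble_tag(
    "An app shared my location with advertisers without asking.", classifiers
)
```

Returning the agreement score alongside the label lets low-confidence cases stay with human coders, so automation reduces the workload without removing human oversight.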

Current Research Efforts

Preferred Formats for Conversational Surveys

Our team has developed a prototype web application that allows us to administer a survey about privacy incidents in different formats and with varying degrees of AI involvement. We want to understand which of these approaches helps people share detailed and relevant information about privacy incidents that have happened to them. Participants will be asked to take the survey in one of four formats: traditional free-response, free-response with AI-generated follow-up questions, text chat with restricted AI involvement, and text chat with full AI involvement. We expect that some level of AI involvement will be needed to help people report all of the right details about their privacy incidents. Related research might lead one to assume that the text chat with full AI involvement will perform best. However, we expect to see interesting and promising results from the other AI-assisted formats as well. It is quite possible that people are more comfortable sharing information when the survey is formatted like a traditional free-response survey, even if AI is still involved. The findings of this study will inform our work going forward in the larger Privacy Infrastructure Project, which aims to create an infrastructure that collects and analyzes privacy incidents that would otherwise go unreported.
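One way to place participants into the four formats is balanced random assignment, sketched below. The description above does not say how participants are assigned, so block randomization is an assumption here, and the function name, participant IDs, and seed are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List

# The four survey formats described above.
FORMATS = [
    "traditional free-response",
    "free-response with AI-generated follow-up questions",
    "text chat with restricted AI involvement",
    "text chat with full AI involvement",
]


def assign_formats(participant_ids: List[str], seed: int = 0) -> Dict[str, str]:
    """Assign each participant to one format using shuffled blocks.

    Each block of four consecutive participants receives the four formats
    in a random order, so group sizes never differ by more than one.
    """
    rng = random.Random(seed)  # Seeded so the assignment is reproducible.
    assignment: Dict[str, str] = {}
    block_size = len(FORMATS)
    for start in range(0, len(participant_ids), block_size):
        block = FORMATS[:]
        rng.shuffle(block)
        for pid, fmt in zip(participant_ids[start:start + block_size], block):
            assignment[pid] = fmt
    return assignment


participants = [f"P{i:02d}" for i in range(1, 9)]  # Hypothetical IDs.
assignment = assign_formats(participants, seed=42)
group_sizes = Counter(assignment.values())
```

With eight participants, every format receives exactly two people. Keeping group sizes equal makes the formats easier to compare than simple per-participant random draws, which can leave groups uneven in a small study.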

Related Works

  1. Barari, S., Angbazo, J., Wang, N., Christian, L.M., Dean, E. and Sepulvado, B. AI-Assisted Conversational Interviewing: Effects on Data Quality and User Experience.
  2. Bernasconi, E., Redavid, D. and Ferilli, S. 2025. Integrated Survey Classification and Trend Analysis via LLMs: An Ensemble Approach for Robust Literature Synthesis. Electronics. 14, 17 (Aug. 2025), 3404. https://doi.org/10.3390/electronics14173404.
  3. Celino, I. and Re Calegari, G. 2020. Submitting surveys via a conversational interface: An evaluation of user acceptance and approach effectiveness. International Journal of Human-Computer Studies. 139, (July 2020), 102410. https://doi.org/10.1016/j.ijhcs.2020.102410.
  4. Cuevas, A., Scurrell, J.V., Brown, E.M., Entenmann, J. and Daepp, M.I.G. 2024. Collecting Qualitative Data at Scale with Large Language Models: A Case Study. arXiv.
  5. Đuka, I. and Njeguš, A. 2021. Conversational Survey Chatbot: User Experience and Perception. Proceedings of the International Scientific Conference - Sinteza 2021 (Beograd, Serbia, 2021), 322–327.
  6. Ge, Y., Xiao, Z., Diesner, J., Ji, H., Karahalios, K. and Sundaram, H. 2023. What should I Ask: A Knowledge-driven Approach for Follow-up Questions Generation in Conversational Surveys. arXiv.
  7. Jacobsen, R.M., Cox, S.R., Griggio, C.F. and Van Berkel, N. 2025. Chatbots for Data Collection in Surveys: A Comparison of Four Theory-Based Interview Probes. Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (Yokohama Japan, Apr. 2025), 1–21.
  8. Kaiyrbekov, K., Dobbins, N.J. and Mooney, S.D. 2025. Automated Survey Collection with LLM-based Conversational Agents. arXiv.
  9. Karpov, A. and Delić, V. eds. 2025. Speech and Computer: 26th International Conference, SPECOM 2024, Belgrade, Serbia, November 25–28, 2024, Proceedings, Part I. Springer Nature Switzerland.
  10. Kim, S., Lee, J. and Gweon, G. 2019. Comparing Data from Chatbot and Web Surveys: Effects of Platform and Conversational Style on Survey Response Quality. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow Scotland Uk, May 2019), 1–12.
  11. Kim, T.W., Jiang, L., Duhachek, A., Lee, H. and Garvey, A. 2022. Do You Mind if I Ask You a Personal Question? How AI Service Agents Alter Consumer Self-Disclosure. Journal of Service Research. 25, 4 (Nov. 2022), 649–666. https://doi.org/10.1177/10946705221120232.
  12. Knijnenburg, B.P., Page, X., Wisniewski, P., Lipford, H.R., Proferes, N. and Romano, J. eds. 2022. Modern Socio-Technical Perspectives on Privacy. Springer International Publishing.
  13. Mellon, J., Bailey, J., Scott, R., Breckwoldt, J., Miori, M. and Schmedeman, P. 2024. Do AIs know what the most important issue is? Using language models to code open-text social survey responses at scale. Research & Politics. 11, 1 (Jan. 2024), 20531680241231468. https://doi.org/10.1177/20531680241231468.
  14. Meng, Y., Zhang, Y., Huang, J., Xiong, C., Ji, H., Zhang, C. and Han, J. 2020. Text Classification Using Label Names Only: A Language Model Self-Training Approach. arXiv.
  15. Papneja, H. and Yadav, N. 2025. Self-disclosure to conversational AI: a literature review, emergent framework, and directions for future research. Personal and Ubiquitous Computing. 29, 2 (Apr. 2025), 119–151. https://doi.org/10.1007/s00779-024-01823-7.
  16. Parker, M.J., Anderson, C., Stone, C. and Oh, Y. 2025. A Large Language Model Approach to Educational Survey Feedback Analysis. International Journal of Artificial Intelligence in Education. 35, 2 (June 2025), 444–481. https://doi.org/10.1007/s40593-024-00414-0.
  17. Petronin, P. 2023. Automatic Augmentation of Online Surveys Using Conversational AI. (2023).
  18. Rouzegar, H. and Makrehchi, M. 2024. Enhancing Text Classification through LLM-Driven Active Learning and Human Annotation. arXiv.
  19. Shim, H., Cho, J. and Sung, Y.H. 2024. Unveiling Secrets to AI Agents: Exploring the Interplay of Conversation Type, Self-Disclosure, and Privacy Insensitivity. Asian Communication Research. 21, 2 (Aug. 2024), 195–216. https://doi.org/10.20879/acr.2024.21.019.
  20. Tanwar, H., Shrivastva, K., Singh, R. and Kumar, D. 2024. OpineBot: Class Feedback Reimagined Using a Conversational LLM. arXiv.
  21. Vajjala, S. and Shimangaud, S. 2025. Text Classification in the LLM Era -- Where do we stand? arXiv.
  22. Wambsganss, T., Zierau, N., Söllner, M., Käser, T., Koedinger, K.R. and Leimeister, J.M. 2022. Designing Conversational Evaluation Tools: A Comparison of Text and Voice Modalities to Improve Response Quality in Course Evaluations. Proceedings of the ACM on Human-Computer Interaction. 6, CSCW2 (Nov. 2022), 1–27. https://doi.org/10.1145/3555619.
  23. Williams, R.T. and Ingleby, E. 2025. The online survey in qualitative research: can AI act as a probing tool? Frontiers in Research Metrics and Analytics. 10, (Sept. 2025), 1519008. https://doi.org/10.3389/frma.2025.1519008.
  24. Xiao, Z., Zhou, M.X., Liao, Q.V., Mark, G., Chi, C., Chen, W. and Yang, H. 2020. Tell Me About Yourself: Using an AI-Powered Chatbot to Conduct Conversational Surveys with Open-ended Questions. ACM Transactions on Computer-Human Interaction. 27, 3 (June 2020), 1–37. https://doi.org/10.1145/3381804.
  25. Yu, H., Yang, Z., Pelrine, K., Godbout, J.F. and Rabbany, R. 2023. Open, Closed, or Small Language Models for Text Classification? arXiv.