ABSTRACT
There has been little research into how end users might communicate advice to machine learning systems. If the users themselves could work hand-in-hand with these systems, the accuracy of the learning systems could improve, and so could the users' understanding of and trust in them. We conducted a think-aloud study to see how willing users were to provide feedback and to understand what kinds of feedback they could give. Users were shown explanations of machine learning predictions and asked to provide feedback to improve the predictions. We found that users had no difficulty providing generous amounts of feedback. The kinds of feedback ranged from suggestions for reweighting features to proposals for new features, feature combinations, relational features, and wholesale changes to the learning algorithm. The results show that user feedback has the potential to significantly improve machine learning systems, but that learning algorithms need to be extended in several ways to be able to assimilate this feedback.