Smartening the crowds: computational techniques for improving human verification to fight phishing scams

Authors:

Gang Liu
City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong and Carnegie Mellon University, Pittsburgh, PA

Guang Xiang
Carnegie Mellon University, Pittsburgh, PA

Bryan A. Pendleton
Carnegie Mellon University, Pittsburgh, PA

Jason I. Hong
Carnegie Mellon University, Pittsburgh, PA and Wombat Security Technologies, Pittsburgh, PA

Wenyin Liu
City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong

SOUPS '11: Proceedings of the Seventh Symposium on Usable Privacy and Security, July 2011, Article No. 8, Pages 1–13
https://doi.org/10.1145/2078827.2078838

Published: 20 July 2011

ABSTRACT

Phishing is a persistent kind of semantic attack that tricks victims into inadvertently sharing sensitive information. In this paper, we explore novel computational techniques that improve the effectiveness of human effort in combating phishing. Using tasks posted to the Amazon Mechanical Turk human effort market, we measure the accuracy of minimally trained humans in identifying potential phish and consider methods for making the best use of individual contributions. We then present experiments that use clustering and vote weighting to improve the results of human effort in fighting phishing. We found that these techniques could increase coverage over, and were significantly faster than, the blacklists in use today.
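
The abstract does not spell out how votes are weighted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each worker's weight is their accuracy on a small set of URLs with known labels; the names `worker_weights`, `weighted_verdict`, and the `default_weight` parameter are hypothetical.

```python
from collections import defaultdict


def worker_weights(gold_answers, votes):
    """Estimate a weight per worker from URLs whose true label is known.

    gold_answers: dict mapping url -> True (phish) / False (legitimate).
    votes: iterable of (worker, url, is_phish) tuples.
    Assumption: a worker's weight is simply their accuracy on the gold URLs.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, url, label in votes:
        if url in gold_answers:
            seen[worker] += 1
            correct[worker] += int(label == gold_answers[url])
    return {worker: correct[worker] / seen[worker] for worker in seen}


def weighted_verdict(url, votes, weights, default_weight=0.5):
    """Return True if the weighted votes for `url` favor "phish".

    Workers without a gold-set history get `default_weight` (an assumption).
    """
    score = 0.0
    for worker, voted_url, label in votes:
        if voted_url != url:
            continue
        weight = weights.get(worker, default_weight)
        score += weight if label else -weight
    return score > 0
```

With such a scheme, one historically reliable worker can outvote several unreliable ones, which a plain majority vote cannot do.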

Reviews

Reviewer: Pieter Hartel (Online Computing Reviews Service)

A good phishing site should resemble the target site as much as possible, and it should hide the differences with the target site, at least to the unsuspecting user. This paper leverages this observation to cluster similar suspected phishing sites. Then, instead of crowd-sourcing the verification of a single suspected phishing site, a whole cluster can be verified at once. This is reported to improve both the timeliness and the accuracy of the results on the basis of an experiment with 239 participants. Unfortunately, the control group and the experimental group had a large overlap (174 participants). The authors argue that this does not invalidate the results because of minimal learning effects, but they have no evidence for this. I believe that the main contribution of the paper is putting forward the idea of clustering similar suspected phishing sites. The paper shows that such clusters abound and that standard techniques (for example, shingling) are effective in discovering those clusters. This suggests important further research not identified in the paper: Is it possible to distinguish suspected phishing sites from genuine sites simply by searching for look-alikes? It would be prudent to keep humans in the loop to avoid liability issues surrounding false positives, and it would be wise to consider the countermeasures that phishers would use to defeat automatic look-alike detection.
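
To make the shingling idea mentioned in the review concrete, here is a rough sketch of clustering page text by shingle overlap. It is an illustration under stated assumptions, not the paper's pipeline: the shingle size `k`, the similarity `threshold`, the greedy single-pass grouping, and all function names are hypothetical choices.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_pages(pages, threshold=0.6, k=3):
    """Greedily group pages whose shingle sets are similar.

    pages: dict mapping a page name to its visible text.
    Each page joins the first cluster whose representative (the first page
    placed in that cluster) is at least `threshold` similar; otherwise it
    starts a new cluster.
    """
    clusters = []  # list of (representative_shingles, member_names)
    for name, text in pages.items():
        page_shingles = shingles(text, k)
        for representative, members in clusters:
            if jaccard(representative, page_shingles) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((page_shingles, [name]))
    return [members for _, members in clusters]
```

Once pages are grouped this way, a single human verdict can in principle be applied to every member of a cluster, which is the labor saving the review describes.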

Published in
SOUPS '11: Proceedings of the Seventh Symposium on Usable Privacy and Security
July 2011
253 pages
ISBN: 9781450309110
DOI: 10.1145/2078827
General Chair:
Lorrie Faith Cranor
Carnegie Mellon University
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
clustering
crowdsourcing
phishing
voting
wisdom of crowds
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 15 of 49 submissions, 31%