Bot management drives ethical data use, curbs image scraping

Bot management tools can help enterprises combat bad bots, prevent web and image scraping, and ensure ethical data use -- all while maintaining a positive end-user experience.

Bots are a force on the internet, accounting for nearly 38% of all web traffic, according to Imperva. While some bots are good bots, such as web crawlers that catalog websites and improve searchability, close to 20% of internet traffic is generated by bad bots, including those dedicated to credential stuffing, card fraud and inventory hoarding.

Implementing bot management tools helps enterprises identify and stop bot attacks, while still ensuring positive UX for end customers. If customers have to jump through hoops to prove they aren't bots, they will simply go elsewhere.

The range of bot attacks

Let's explore a few ways bots frustrate organizations and erode customer trust:

  • Credential stuffing. Most people have had a username/password pair compromised in a breach. Attackers purchase previously stolen credentials and try them against new sites. Since so many people reuse passwords, trying 1 million credentials is likely to yield several successful logins. Sites that can't stop credential stuffing bots risk having customers' accounts compromised, which raises the potential for fraud, costs time and money to remediate, and breeds customer mistrust.
  • Card fraud. Bots try to pay for purchases by rapidly attempting gift card number and PIN combinations, hoping to find available card balances. Customers will be furious to have their card balances stolen, and merchants face the cost of refunding gift card funds, not to mention the potential loss of future business.
  • Inventory hoarding. Ever wanted to buy tickets to a popular show or a special edition pair of sneakers and been frustrated to find them sold out? Bots jump on exclusive sales and buy up most of the inventory before humans can. The attackers then resell the tickets or products on other sites at a tidy profit. Potential customers are disappointed to miss out on the sale and may end up spending more to buy from a reseller. Companies, meanwhile, end up with frustrated customers who take their business elsewhere.
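
Credential stuffing and gift card guessing share a signature: one client produces many failed attempts in a short time. A minimal sketch of how a site might spot that pattern follows. The class name, the threshold of five failures and the 60-second window are illustrative assumptions, not settings from any bot management product, and real tools combine many more signals, since attackers often spread attempts across many IP addresses.

```python
from collections import defaultdict, deque


class FailedAttemptMonitor:
    """Hypothetical helper: flags a client with too many recent failures."""

    def __init__(self, max_failures=5, window_seconds=60.0):
        # Both limits are assumed values chosen for illustration.
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # client_id -> timestamps of recent failures, oldest first
        self._failures = defaultdict(deque)

    def record_failure(self, client_id, now):
        """Record one failed attempt; return True if the client looks automated."""
        window = self._failures[client_id]
        window.append(now)
        # Drop failures that have aged out of the sliding window.
        while window and now - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_failures


monitor = FailedAttemptMonitor(max_failures=5, window_seconds=60.0)

# One client fails ten logins, one second apart: flagged from the sixth on.
fast_flags = [monitor.record_failure("203.0.113.7", float(t)) for t in range(10)]

# Another client mistypes a password every 30 seconds: never flagged.
slow_flags = [monitor.record_failure("198.51.100.9", t * 30.0) for t in range(10)]
```

Counting only failures, rather than all requests, keeps the check out of the way of legitimate customers, who rarely fail more than a few times in a row.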

Enter image theft

And then there's web scraping. Some web scraping bots are good bots, such as ones tied to search engines or partner bots scraping inventory data to help companies extend their sales reach. But web scrapers can be malicious, too -- scraping pricing information from competitors' sites or, as recently reported, scraping pricing information about in-demand consumer products in an attempt to hoard. Bots can scrape images, too. A few months ago, a researcher with New York City's task force on cyber sexual assault discovered bots had scraped 70,000 pictures users had uploaded to their Tinder profiles and fed them to a cybercrime forum. Disturbingly, all the stolen pictures were of women, which leads to concerns about cyber-stalking or worse.

Yet, it's not only cybercriminals engaging in image scraping. Some corporations scrape images to feed AI engines. Clearview AI recently made the news for the facial recognition software it supplies to law enforcement and for feeding its AI images scraped from various social media sites, including Facebook and Twitter. Twitter responded with a cease-and-desist letter.

The legality of web scraping is working its way through the courts. The case of HiQ Labs scraping data from LinkedIn to warn HiQ customers about employees who might be job hunting is one example. Appellate rulings indicate this might be legal. Whether image scraping is ultimately deemed legal is another question, but it's certainly unethical and dangerous.

Bot management for ethical data enforcement

Social media and other sites that collect user data have terms of use that specify how they will or will not use the data their users supply. Customers may not always read those terms carefully when they sign up, but when the data is shared in a way they didn't expect or don't like, they understandably get angry. In the case of web and image scraping, it isn't that the social media site is misusing customer data; it's that the site is failing to stop others from misusing that data.

Whether the perpetrator is a career cybercriminal or an unethical corporation -- and whether you think there is a difference between the two -- web scraping threatens citizens' privacy and safety. Even where regulations are murky, organizations dealing in user-supplied data, including images, have an ethical mission to avoid data misuse. While we don't often talk about bot management as an ethical tool, blocking bots that are trying to scrape your photos is a legitimate use case that demands more attention. Social media companies -- and any other site whose business relies on collecting customer photos -- must take steps to block image scraping bots and protect their customers.

Even if social media sites already specify in their terms and conditions the purposes for which they allow data and image scraping, they need to take the next step to prevent unethical use of scraped data. Technical controls, such as bot management, will help them restrict data and image scraping to their trusted partners.
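
One way to picture restricting scraping to trusted partners is an allowlist check on automated requests. The sketch below is an illustration under stated assumptions, not how any particular bot management product works: the partner registry, the `sign` and `allow_automated_request` functions and the signing scheme are all hypothetical. Each approved partner holds a shared secret and signs the path it requests; anything unsigned or signed by an unknown party is refused.

```python
import hashlib
import hmac

# Hypothetical registry of approved partners and their shared secrets.
PARTNER_SECRETS = {"partner-search": b"example-secret-1"}


def sign(partner_id, path, secret):
    """Compute the signature a partner would attach to its request."""
    message = f"{partner_id}:{path}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def allow_automated_request(partner_id, path, signature):
    """Allow an automated request only if a known partner signed this path."""
    secret = PARTNER_SECRETS.get(partner_id)
    if secret is None:
        return False  # not on the allowlist
    expected = sign(partner_id, path, secret)
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected, signature)


good = sign("partner-search", "/images/123.jpg", b"example-secret-1")
print(allow_automated_request("partner-search", "/images/123.jpg", good))
print(allow_automated_request("unknown-bot", "/images/123.jpg", good))
```

A production control would also need an expiry or nonce in the signed message so a captured signature cannot be replayed, along with a way to detect scrapers that pose as ordinary browsers rather than declaring themselves.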

Remember, once images are scraped, getting perpetrators to remove them is almost impossible. Even if a company responds to a cease-and-desist order, it will be difficult to validate compliance. Therefore, make sure bot management systems are configured and updated to block unauthorized image and data scrapers. This helps sites walk the talk on protecting their customers' data.

Sandy Carielli

Sandy Carielli is a principal analyst at Forrester, advising security and risk professionals on application security, with a particular emphasis on the collaboration among security and risk, application development, operations and business teams. Her research covers topics such as proactive security design, security testing in the software delivery lifecycle, protection of applications in production environments, and remediation of hardware and software flaws. Carielli has over 15 years of experience in the security industry, working in software engineering, consulting, product management and technology strategy roles. Her most recent experience was at Entrust Datacard, where she guided the organization's technology strategy and researched the impact of emerging technologies on the business. Carielli is co-author of the Industrial Internet Consortium's IoT Security Maturity Model and has spoken at RSA Conference, Source Boston, Information Systems Security Association International and many other regional security events. Carielli has an ScB in mathematics from Brown University and an MBA from MIT Sloan School of Management.
