Opinion

Bot management drives ethical data use, curbs image scraping

Bot management tools can help enterprises combat bad bots, prevent web and image scraping, and ensure ethical data use -- all while maintaining a positive end-user experience.

By

Sandy Carielli

Published: 14 Apr 2020

Bots are a force on the internet, accounting for nearly 38% of all web traffic, according to Imperva. While some bots are good bots, such as web crawlers that catalog websites and improve searchability, close to 20% of internet traffic is generated by bad bots, including those dedicated to credential stuffing, card fraud and inventory hoarding.

Implementing bot management tools helps enterprises identify and stop bot attacks, while still ensuring positive UX for end customers. If customers have to jump through hoops to prove they aren't bots, they will simply go elsewhere.

The range of bot attacks

Let's explore a few ways bots frustrate organizations and erode customer trust:

Credential stuffing. Most people have had a username/password pair compromised in a breach. Attackers purchase previously stolen credentials and try them against new sites. Since so many people reuse passwords, trying 1 million credentials is likely to yield several successful logins. Sites that can't stop credential stuffing bots risk having customers' accounts compromised, causing greater potential for fraud, time and money to remediate, and customer mistrust.
Card fraud. Bots try to pay for purchases by rapidly attempting gift card number and PIN combinations, hoping to find available card balances. Customers will be furious to have their card balances stolen, and merchants face the cost of refunding gift card funds, not to mention the potential loss of future business.
Inventory hoarding. Ever wanted to buy tickets to a popular show or a special edition pair of sneakers and been frustrated to find them sold out? Bots jump on exclusive sales and buy up most of the inventory before humans can. The attackers then resell the tickets or products on other sites at a tidy profit. Potential customers are disappointed to miss out on the sale and may end up spending more to buy from a reseller. Companies, meanwhile, end up with frustrated customers that take their business elsewhere.

Enter image theft

And then there's web scraping. Some web scraping bots are good bots, such as ones tied to search engines or partner bots scraping inventory data to help companies extend their sales reach. But web scrapers can be malicious, too -- scraping pricing information from competitors' sites or, as recently reported, scraping pricing information about in-demand consumer products in an attempt to hoard. Bots can scrape images, too. A few months ago, a researcher with New York City's task force on cyber sexual assault discovered bots had scraped 70,000 pictures users had uploaded to their Tinder profiles and fed them to a cybercrime forum. Disturbingly, all the stolen pictures were of women, which leads to concerns about cyber-stalking or worse.

Yet, it's not only cybercriminals engaging in image scraping. Some corporations scrape images to feed AI engines. Clearview AI recently made the news for the facial recognition software it supplies to law enforcement and for feeding its AI images scraped from various social media sites, including Facebook and Twitter. Twitter responded with a cease-and-desist order.

The legality of web scraping is working its way through the courts. One example is the case of HiQ Labs, which scraped data from LinkedIn to warn HiQ customers about employees who might be job hunting. Appellate rulings indicate this might be legal. Whether image scraping is ultimately deemed legal is another question, but it's certainly unethical and dangerous.

Bot management for ethical data enforcement

Social media and other sites that collect user data have terms of use that specify how they will or will not use the data their users supply. Customers may not always read those terms carefully when they sign up, but when the data is shared in a way they didn't expect or don't like, they understandably get angry. In the case of web and image scraping, it isn't that the social media site is misusing customer data; it's that the site is failing to stop others from misusing that data.

Whether the perpetrator is a career cybercriminal or an unethical corporation -- and whether you think there is a difference between the two -- web scraping threatens citizens' privacy and safety. Even where regulations are murky, organizations dealing in user-supplied data, including images, have an ethical mission to avoid data misuse. While we don't often talk about bot management as an ethical tool, blocking bots that are trying to scrape your photos is a legitimate use case that demands more attention. Social media companies -- and any other site whose business relies on collecting customer photos -- must take steps to block image scraping bots and protect their customers.

Even if social media sites already specify in their terms and conditions the purposes for which they allow data and image scraping, they need to take the next step and prevent unethical use of scraped data. Technical controls, such as bot management, will help them restrict data and image scraping to their trusted partners.
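To make the idea of a technical control concrete, here is a minimal sketch of one narrow piece of it: letting trusted partners fetch images freely while capping how fast any other client can request them. It is an illustration under stated assumptions, not a description of how any bot management product works. The class name `ImageRequestGate`, the partner key and the limits are all hypothetical.

```python
import time
from collections import defaultdict, deque

# Hypothetical values, chosen only for illustration.
TRUSTED_PARTNER_KEYS = {"partner-key-123"}
MAX_IMAGE_REQUESTS = 30   # image requests allowed per window
WINDOW_SECONDS = 60.0     # length of the sliding window


class ImageRequestGate:
    """Allow trusted partners; rate-limit everyone else per client."""

    def __init__(self, trusted_keys, max_requests, window_seconds,
                 clock=time.monotonic):
        self.trusted_keys = set(trusted_keys)
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the gate can be tested
        self._history = defaultdict(deque)  # client_id -> request times

    def allow(self, client_id, api_key=None):
        # Trusted partners bypass the rate limit entirely.
        if api_key in self.trusted_keys:
            return True

        now = self.clock()
        history = self._history[client_id]

        # Discard timestamps that have aged out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()

        # Refuse the request if the client has used up its allowance.
        if len(history) >= self.max_requests:
            return False

        history.append(now)
        return True


gate = ImageRequestGate(TRUSTED_PARTNER_KEYS, MAX_IMAGE_REQUESTS,
                        WINDOW_SECONDS)
```

A site would call `gate.allow(client_id, api_key)` before serving each image and refuse or challenge the request when it returns `False`. A request rate cap on its own is easy for a determined scraper to evade by spreading requests across many clients, so treat this as one signal among several rather than a complete defense.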

Remember, once images are scraped, getting perpetrators to remove them is almost impossible. Even if a company responds to a cease-and-desist order, it will be difficult to validate compliance. Therefore, make sure bot management systems are configured and updated to block unauthorized image and data scrapers. This helps sites walk the talk on protecting their customers' data.

Sandy Carielli is a principal analyst at Forrester, advising security and risk professionals on application security, with a particular emphasis on the collaboration among security and risk, application development, operations and business teams. Her research covers topics such as proactive security design, security testing in the software delivery lifecycle, protection of applications in production environments, and remediation of hardware and software flaws. Carielli has over 15 years of experience in the security industry, working in software engineering, consulting, product management and technology strategy roles. Her most recent experience was at Entrust Datacard, where she guided the organization's technology strategy and researched the impact of emerging technologies on the business. Carielli is co-author of the Industrial Internet Consortium's IoT Security Maturity Model and has spoken at RSA Conference, Source Boston, Information Systems Security Association International and many other regional security events. Carielli has an ScB in mathematics from Brown University and an MBA from MIT Sloan School of Management.
