Questions tagged [screen-scraping]

Screen scraping is the act of "scraping" or copying content from a Stack Exchange site and republishing that content on a different site. Stack Exchange content is licensed under Creative Commons BY-SA 4.0 and can be freely redistributed provided the attribution requirements are met. Use this tag for posts about sites that use Stack Exchange content without proper attribution.

9 votes
1 answer
240 views

At the time of writing, the FAQ on scraping states: What is a "scraper" and why is that bad? Historically, SCRAPER here on Stack Exchange meant "Stack Content Republishers Attributing ...
Rebecca J. Stones
25 votes
5 answers
863 views

The 2024 changes to the data dump were made because: Simultaneously, we know that companies have scraped or otherwise ingested Stack Overflow and Stack Exchange data to train models without proper ...
wizzwizz4
  • 34.1k
14 votes
0 answers
212 views

I reported two scrapers which were violating the license* recently, as I'd done some time last year as well. I got this email in return: Hello, All content on Stack Exchange is licensed under either ...
bobble
  • 19.5k
11 votes
1 answer
840 views

I reported a website (Quora) copying a significant amount of Stack Exchange content via https://meta.stackexchange.com/contact and got the following response from the Stack Overflow support team via ...
Franck Dernoncourt
-8 votes
1 answer
180 views

Is there any software one can use to scrape Stack Exchange review queues of choice and e.g. print a summary to standard output? A possible input document: - stackoverflow.com - triage - first ...
202324
  • 79
1 vote
1 answer
104 views

I have seen some old posts on Stack Overflow that are repeated in forums and mailing lists. Does Stack Overflow scrape other question sites, forums or mailing lists?
Fernando V
9 votes
1 answer
225 views

Altmetric is a service that attempts to track the online impact of scholarly articles, which it does by keeping score of mentions on Wikipedia, Facebook, Twitter, blogs, and the like and then using ...
E.P.
  • 19.6k
1 vote
0 answers
24 views

I've seen a lot of scraped and repackaged content in my time, but I just ran across http://readquestion.com which seems to have scraped the entire network of sites and re-posted the entire thing ...
Caleb
  • 23.7k
7 votes
1 answer
151 views

According to the Stack Exchange scraper policy, you shouldn't report sites that: follow all the attribution requirements, and don't outrank us on Google. I've been thinking of several ways to use SE ...
marcanuy
  • 241
4 votes
2 answers
326 views

I recently posted a question in Mathematica SE, Can a package append its context to $DistributedContexts?. Out of idle curiosity I then googled for some of the code in the question, and I was very ...
E.P.
  • 19.6k
6 votes
0 answers
102 views

The topic of content scraping from Stack Exchange and Stack Overflow is very well known. I thought the admins would like to know that the scrapers have boosted their efforts and are now obfuscating ...
crockpotveggies
7 votes
0 answers
55 views

Recently, I answered a question on Stack Overflow that is very specific to a certain platform. I tried Googling the same question to see who else was having this issue, and if the answer was available ...
Matt Clark
11 votes
1 answer
700 views

I'm building an application that detects plagiarized answers on Stack Overflow, so I need to retrieve the content of answers programmatically. I know I can do this using the Stack Exchange API, but ...
Bob
  • 817
5 votes
0 answers
733 views

I've been web scraping the chat room with the goal of retrieving some statistics (have a look at this meta post on security.SE: What do you guys want to know about what is said in the DMZ?). After indexing ...
Lucas Kauffman
2 votes
0 answers
38 views

I have a Google alert on my name and this turned up the link http://meta.bet6e.com/users/512728/jan-doggen: If you go there you get redirected to a supposed WordPress blogging site about ...
Jan Doggen
  • 2,341
