Recently, a side-project I've been working on to scrape Twitch chat logs has been causing quite a lot of confusion, both on Twitch and Twitter. I'll be doing a follow-up post detailing exactly what it is I've been up to, but in this post, I'll attempt to summarise some of the most frequently asked questions, and my responses to them.
Who are you?
Why are you in every channel on Twitch?
As a recent side-project, I've been writing a Twitch chat scraper, which connects to multiple Twitch chat channels and stores any messages it sees in a database for future analysis.
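To give a rough idea of what "seeing a message" involves: Twitch chat is exposed over IRC, so each chat message arrives as a `PRIVMSG` line. The following is a minimal sketch of pulling the channel, user and text out of such a line, not the scraper's actual code. The `ChatMessage` type, the `parsePrivmsg` function and the sample usernames are all made up for illustration, and the sketch assumes plain lines without IRCv3 tags.

```go
package main

import (
	"fmt"
	"strings"
)

// ChatMessage is a hypothetical record for one chat line; the field names are
// illustrative and not taken from the scraper's real database schema.
type ChatMessage struct {
	Channel string
	User    string
	Text    string
}

// parsePrivmsg extracts the channel, user and text from a raw IRC line of the
// form ":nick!nick@host PRIVMSG #channel :text". It returns false for any
// other kind of line (PING, JOIN, and so on).
func parsePrivmsg(line string) (ChatMessage, bool) {
	line = strings.TrimRight(line, "\r\n")
	if !strings.HasPrefix(line, ":") {
		return ChatMessage{}, false
	}
	// Split into prefix, command, target and trailing text.
	parts := strings.SplitN(line[1:], " ", 4)
	if len(parts) < 4 || parts[1] != "PRIVMSG" {
		return ChatMessage{}, false
	}
	// The nickname is the part of the prefix before the "!".
	user := parts[0]
	if i := strings.Index(user, "!"); i >= 0 {
		user = user[:i]
	}
	return ChatMessage{
		Channel: strings.TrimPrefix(parts[2], "#"),
		User:    user,
		Text:    strings.TrimPrefix(parts[3], ":"),
	}, true
}

func main() {
	raw := ":some_viewer!some_viewer@some_viewer.tmi.twitch.tv PRIVMSG #some_channel :Kappa nice play"
	if msg, ok := parsePrivmsg(raw); ok {
		fmt.Printf("%s <%s> %s\n", msg.Channel, msg.User, msg.Text)
	}
}
```

Each parsed `ChatMessage` would then be handed to whatever storage layer is in use; that part is left out here because it depends entirely on the database chosen.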
Why are you doing this?
Mainly exam-season boredom. I'm also using this as an opportunity to learn a bit more about the Go programming language, and I'm curious whether Twitch chat really is as mindless as its reputation would have you believe.
What are you doing with my chat messages?
Currently, nothing. As mentioned above, I'm in full-blown exam-panic mode at university, with little time to devote to this project. In the future, I'd like to do some more analysis on the corpus of messages collected, largely inspired by the kind of thing Twitch's Drew Harry has done recently. Some possible areas of interest include:
- Number of emoticons/second for each channel
- Chat sentiment (happy/sad etc.) during eSports events
- Spectator behaviour (what percentage of chat is spammers, what kind of events inspire people to post in chat etc.)
Why are you hiding it?
I in no way intend to hide what I'm doing, though I'm aware that it could definitely be more open. The lack of openness so far is mainly because I have limited time to work on the project, which is still very much a proof of concept. Hopefully, this post goes some way towards addressing people's (natural) curiosity. In the future I also plan to:
- Create a dedicated Twitch account for the scraper (something more obvious, like ChatCollector)
- Make the bot respond to PMs with a message directing the sender to this page
- Give channel owners an opt-out of data collection (more information below)
This is evil - please stop
Firstly, I'm sorry you think so. Doing something evil was never my intention. As far as I'm aware, all the data I'm collecting is posted in a public forum, collecting it is not against the Twitch TOS [1], and what I'm doing is really no different to what the hundreds of people and organisations scraping sites like Twitter do every day. If you disagree with this rationale, stick a comment down below or hit me up on Twitter.
If you do find this data collection intolerable, though, there are a couple of things you can do. Firstly, just ban me from your channel. I don't mind - seriously. In case you hadn't already guessed, I'm much more of a chat spectator/lurker than a participant. Secondly, if banning doesn't work, I'm working on a mechanism to blacklist channels whose owners have indicated they want to opt out of data collection. This is currently still a manual process, so anyone interested in being added to this list should contact me (either in the comments below, or via Twitter). At some point, this will be made self-serve, so channel owners can add themselves to the blacklist. For progress on this, see the GitHub issue.
Give me technical details
I'll do a full post at some point over the next week, with the technical details. In the meantime, feel free to check out the code on GitHub.
Can you remove UniqBot/Some other bot from my channel?
Sadly not. The only username my scraper currently runs under is my own, 'fire_eater64'. There's nothing to stop somebody else from building my project off GitHub, and running it themselves - or writing their own. In the case of bots/scrapers like 'UniqBot' - you'll have to contact the person responsible for running it in order to get it removed from your channel/answer any questions you may have.
Hopefully this has helped with some of the questions and concerns people have about what I'm doing. If you have any questions or comments, feel free to stick them down below, and I'll do my best to answer them.
[1] If somebody from Twitch wants to/can correct me on this, please reach out to me.