Mafia Scrape-apalooza

Cobalt
Posts: 991
6/29/2016 3:44:08 AM
Posted: 5 months ago
I've been scraping data from DDO mafia games for a few hours now. The process is going as fast as it can, which is quite slow. (To abide by the ToS, I cannot automate requests at a rate faster than a human could make them manually.)
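
For anyone curious what "no faster than a human" could look like in a script, here is a minimal Python sketch of a request throttle. It is not the scraper actually used for this thread; the `Throttle` class, the 5-second gap, and the `fetch` and `thread_urls` names in the comments are all assumptions made for illustration.

```python
import time


class Throttle:
    """Enforce a minimum gap (in seconds) between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without real waiting
        self._sleep = sleep
        self._last = None      # time of the previous request, if there was one

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Hypothetical usage (fetch and thread_urls are placeholders, not real helpers):
#   throttle = Throttle(min_interval=5.0)   # 5 s is an assumed "human" pace
#   for url in thread_urls:
#       throttle.wait()
#       page = fetch(url)
```

Calling `wait()` before every request keeps the pace at or below one request per `min_interval`, no matter how quickly the rest of the loop runs.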

I'll try to collect a few dozen games, then start using R to really see what's going on inside. Which is what you can help me with!

What do you want to know about mafia, from a statistical point of view?

For example, some things I'll be looking at:

1. Do mafia members tend to post in clusters?
2. Is there a correlation between a player's ELO/debates/forum posts and their post volume?
3. What percentage of posts are made by mafia, relative to the actual percentage of mafia players?
4. Etc, etc.

If you have any suggestions of things you'd like to know, I'll crunch those numbers for you. I'm going to make a pretty good-sized report, but I can't find what I'm not looking for.
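
As a rough illustration of question 3, the comparison could be sketched in Python like this. The function name, the input layout (one author name per post plus a name-to-affiliation table), and the toy players are assumptions, not the real scraped data.

```python
from collections import Counter


def mafia_post_share(posts, affiliation):
    """Return (share of posts made by mafia, share of players who are mafia).

    posts: list of author names, one entry per post.
    affiliation: dict mapping each player name to "mafia" or "town".
    """
    posts_by_side = Counter(affiliation[author] for author in posts)
    post_share = posts_by_side["mafia"] / len(posts)
    n_mafia = sum(1 for side in affiliation.values() if side == "mafia")
    player_share = n_mafia / len(affiliation)
    return post_share, player_share


# Made-up example: one mafia player out of four, one mafia post out of eight.
toy_affiliation = {"A": "mafia", "B": "town", "C": "town", "D": "town"}
toy_posts = ["A", "B", "B", "C", "D", "B", "C", "D"]
post_share, player_share = mafia_post_share(toy_posts, toy_affiliation)
```

If the first number sits well below the second across many games, mafia would be posting less than their head count suggests.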
TheGreatAndPowerful
Posts: 3,012
6/29/2016 12:17:39 PM
Posted: 5 months ago
At 6/29/2016 3:44:08 AM, Cobalt wrote:

I advise you against this, but having done this myself, I understand the personal enjoyment that comes from such endeavors. However, don't hold out hope that what you find will sway anyone. I've thrown cold hard facts down to try and prove points about mafia, only for people to basically shrug and ignore them.

Here are some things I've come up with previously:

http://www.debate.org...

I gathered a bunch of day phase results and compared their lengths with the outcomes (lynch, no lynch, mislynch). Conclusion? The optimal time for a successful lynch is sooner rather than later; mislynches peak, and then outcomes trend toward a no lynch as day phases get longer.
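
A comparison like that can be sketched in Python by bucketing day phases by length and counting outcomes per bucket. This is not the original wget/Perl workflow; measuring length in hours, the 24-hour bucket width, and the sample rows are all assumptions, with the rows made up purely to exercise the function.

```python
from collections import Counter, defaultdict


def outcomes_by_length(day_phases, bin_hours=24):
    """Count outcomes per length bucket.

    day_phases: iterable of (length_in_hours, outcome) pairs, where outcome
    is "lynch", "mislynch", or "no lynch".
    Returns {bucket_start_hour: {outcome: count}} ordered by bucket.
    """
    table = defaultdict(Counter)
    for hours, outcome in day_phases:
        bucket_start = int(hours // bin_hours) * bin_hours
        table[bucket_start][outcome] += 1
    return {bucket: dict(counts) for bucket, counts in sorted(table.items())}


# Made-up rows, not real game results.
toy_phases = [
    (10, "lynch"), (20, "lynch"), (26, "lynch"),
    (30, "mislynch"), (40, "mislynch"),
    (60, "no lynch"), (70, "no lynch"),
]
toy_table = outcomes_by_length(toy_phases)
```

Dividing each count by its bucket total would give the per-bucket outcome rates that a graph of this kind plots.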

http://www.debate.org...

I forget what I was doing here. I blame the unlabeled axes. I think the x-axis here might be hours or pages? So I might have been showing the number of day phases of a certain length.

http://www.debate.org...

Another day phase/duration type graph.
Cobalt
Posts: 991
6/30/2016 1:57:28 AM
Posted: 5 months ago
At 6/29/2016 12:17:39 PM, TheGreatAndPowerful wrote:

Those are very interesting! Thank you for those.

And I'm doing this mostly because I'm studying various machine learning techniques, and I need a solid understanding of R and of analyzing large amounts of data before I begin. I just thought this data set would be more interesting than, say, temperatures of the past 30 years.
VelCrow
Posts: 1,273
6/30/2016 3:01:15 AM
Posted: 5 months ago
At 6/29/2016 3:44:08 AM, Cobalt wrote:

Try number crunching the town win rate against keeping scummy/anti-town players alive. Note that the term scummy does not necessarily mean mafia.
"Ah....So when god "Taught you" online, did he have a user name like "Darthmaulrules1337", and did he talk in all caps?" ~ Axonly

http://www.debate.org...
Cobalt
Posts: 991
6/30/2016 3:55:27 AM
Posted: 5 months ago
At 6/30/2016 3:01:15 AM, VelCrow wrote:

Try number crunching the town win rate against keeping scummy/anti-town players alive. Note that the term scummy does not necessarily mean mafia.

Well, I'm basically pulling an entire thread at a time (just one DP), then collecting any data that can be easily extracted. I highly doubt that even a seasoned software developer could write a script capable of figuring out which players are scummy/anti-town just from the context of the posts.

I am manually entering affiliation data, though. We should be able to get some exciting insights from that alone. For instance, how does a person's average number of posts when town compare against their average while mafia? Is this trend very common?
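
A rough Python sketch of that town-versus-mafia comparison, assuming the scraped data has been reduced to one row per player per day phase. The row layout, the function name, and the toy numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_posts_by_role(records):
    """Average post count per player, split by affiliation.

    records: iterable of (player, affiliation, post_count) rows,
    one row per player per day phase.
    Returns {player: {affiliation: mean post count}}.
    """
    grouped = defaultdict(list)
    for player, side, n_posts in records:
        grouped[(player, side)].append(n_posts)

    averages = defaultdict(dict)
    for (player, side), counts in grouped.items():
        averages[player][side] = mean(counts)
    return dict(averages)


# Made-up rows: player A appears twice as town and once as mafia.
toy_records = [
    ("A", "town", 10), ("A", "town", 20), ("A", "mafia", 6),
    ("B", "town", 4),
]
toy_averages = average_posts_by_role(toy_records)
```

Players who only ever appear on one side (like B here) get a single entry, so they can be filtered out before comparing the two averages.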

I also want to figure out if mafia tend to post together, as opposed to at different intervals. The answer to that could be actually useful.
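
One possible way to test the "posting together" idea, sketched in Python: count how often two neighbouring posts are both by mafia, then see how often a random reshuffle of the same posts does at least as well. This is a shuffle (permutation) test chosen for illustration, not necessarily the analysis that will end up in the report, and the function names and defaults are assumptions.

```python
import random


def adjacent_mafia_pairs(labels):
    """Count neighbouring posts where both authors are mafia.

    labels: list of "mafia"/"town" strings in posting order.
    """
    return sum(1 for a, b in zip(labels, labels[1:]) if a == b == "mafia")


def clustering_p_value(labels, n_shuffles=2000, seed=0):
    """Estimate how often a random post order clusters at least as much.

    Small values suggest mafia posts sit next to each other more often
    than chance alone would explain.
    """
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    observed = adjacent_mafia_pairs(labels)
    shuffled = list(labels)
    at_least_as_clustered = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if adjacent_mafia_pairs(shuffled) >= observed:
            at_least_as_clustered += 1
    # The +1 terms count the observed order itself, so the estimate is never 0.
    return (at_least_as_clustered + 1) / (n_shuffles + 1)
```

One caveat: a single mafia player double-posting also counts as a pair here, so consecutive posts by the same author may need to be merged first.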
Cobalt
Posts: 991
6/30/2016 11:32:38 PM
Posted: 5 months ago
At 6/30/2016 11:25:13 AM, TheGreatAndPowerful wrote:
What program/script language are you using?

I was initially going to use Python and pandas, but I decided R might be a better thing to learn in the long run. So I'm using R/RStudio. It has a lot of tools that let me see what I'm doing at every step along the way, which is really nice.

What did you use?
TheGreatAndPowerful
Posts: 3,012
7/1/2016 12:22:48 AM
Posted: 5 months ago
At 6/30/2016 11:32:38 PM, Cobalt wrote:

wget and perl

If I were doing it now, I'd probably use Python.