Statistics and Python

So, dubs. I am aware that they are a matching pair in the last two digits of a post ID, and thus the chance of rolling dubs is 1 in 10 (0.1). I got to thinking, though: what would be the chance of getting double digits anywhere in the post ID? I asked my maths professor, and he said "ah fuck that would take a while to figure out". So out of sheer curiosity I made a Python script to calculate it for me (included in reply). The script goes through every possible post number, zero-padded to 9 digits, and checks for dubs. As it turns out... the chance is exactly 0.56953279, a little over half. So according to stat theory there is a slightly better than even chance that this very thread will have double digits...


also I am well aware the script is probably about as efficient as a Zimbabwean farmer, but hey, It was something I whipped up in 15 mins.


>script below
nums = ["00","11","22","33","44","55","66","77","88","99"]
dubs = 0
nondubs = 0
print("calculating dubs...")
for postID in range(0, 999999999 + 1):
    postID = str(postID)
    postID = postID.zfill(9)
    hasDubs = False
    for i in nums:
        if i in postID:
            hasDubs = True
            break
    if hasDubs:
        dubs = dubs + 1
    else:
        nondubs = nondubs + 1
    if int(postID) % 1000000 == 0:
        print("done: " + str(int(postID) / 10000000) + "%", end="\r")
print("DUBS: " + str(dubs))
print("NON DUBS: " + str(nondubs))


#RESULTS
#DUBS: 569532790
#NON DUBS: 430467210
#chance to get dubs: 0.56953279

But the post ID is just incremented with each post on the entire board, so how can you state the percent chance of getting dubs on any arbitrary board unless you aim to be much less general?

If I'm on a board that gets one post a day, then my odds of dubs are much higher on the day after a post whose last two digits were x, x-1 (the next ID then ends in x, x).

...

Sorry, I didn't read carefully. But I think my point still stands.

Unless you wish to disregard relative board activity (understandable).

>what would be the chance of getting double digits anywhere in the post ID
Not technology. I would recommend you delete your post and repost in /sci/. I understand you wrote code, but this post really belongs on /sci/.

Also, please read the sticky. When you post code you should use code tags because it makes it easier for the rest of us to read. Finally, if after getting your answer on /sci/ you do something interesting with Python (or any language) then feel free to share what you are doing in the daily programming thread: Good luck.

What are the chances of getting triples anywhere in a post?

OP here. I actually re-scripted it earlier because I had the same question; as it turns out, it is exactly 0.06315481.
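
The triples figure can be cross-checked without looping over a billion IDs. This is only a sketch, not OP's re-scripted version (which was never posted). It assumes 9-digit zero-padded IDs like the original script, and count_without_run is a made-up helper name. It counts the ID strings that avoid a run of equal digits by tracking the length of the trailing run.

```python
def count_without_run(n_digits, run_len):
    """Count zero-padded digit strings of length n_digits (>= 1) that contain
    no run of run_len (>= 2) equal adjacent digits."""
    # ends[k] = number of valid strings whose trailing run has length k + 1
    ends = [10] + [0] * (run_len - 2)
    for _ in range(n_digits - 1):
        total = sum(ends)
        # A different digit (9 choices) starts a new run of length 1.
        # Repeating the last digit extends each run by one, and runs that
        # would reach run_len are dropped.
        ends = [9 * total] + ends[:-1]
    return sum(ends)


total_ids = 10 ** 9
trips = total_ids - count_without_run(9, 3)
print(trips, trips / total_ids)
```

Under those assumptions this gives 63154810 IDs with trips or better, i.e. 0.06315481. With run_len=2 the same function gives 430467210 IDs without dubs, matching the NON DUBS count from the script.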

>let's assume the post ID is n digits long
>we're going to choose digits, one by one
>choose first one
>odds of dubs: 0, obviously
>choose second one
>the odds of it not making dubs are 9/10 (the criterion for dubs is that you win the 10% odds of matching the digit in the previous slot)
>choose third one
>the odds of us still not having dubs are 9/10*9/10
>repeat n-1 times as the first digit doesn't contribute
>the odds of a random n-digit post not having dubs (trips and higher are considered dubs, just like in your script) are (9/10)^(n-1)
>therefore the odds of a random n-digit post having dubs are 1-0.9^(n-1)
Your maths professor does not deserve to be allowed to teach maths
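
The greentext formula is easy to sanity-check. A minimal sketch, assuming every n-digit string (leading zeros allowed) is equally likely; p_dubs and p_dubs_brute are illustrative names, not anything from the thread. It compares 1-0.9^(n-1) with an exhaustive count for small n, then evaluates the formula at n=9.

```python
from fractions import Fraction


def p_dubs(n_digits):
    """Chance that a uniformly random n-digit string (leading zeros allowed)
    has at least one pair of equal adjacent digits: 1 - (9/10)^(n-1)."""
    return 1 - Fraction(9, 10) ** (n_digits - 1)


def p_dubs_brute(n_digits):
    """Same quantity by checking every string; only practical for small n."""
    hits = 0
    for i in range(10 ** n_digits):
        s = str(i).zfill(n_digits)
        if any(a == b for a, b in zip(s, s[1:])):
            hits += 1
    return Fraction(hits, 10 ** n_digits)


# The closed form and the exhaustive count agree for 1 to 5 digits.
for n in range(1, 6):
    assert p_dubs(n) == p_dubs_brute(n)

print(float(p_dubs(9)))
```

For n=9 the formula gives 56953279/100000000, the same 0.56953279 the brute-force script reports.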

What is the chance of NOT getting dubs in a post number? For convenience limit the question to 8 digit post numbers.

The number of all 8 digit post numbers is 9*10^7 (the first digit cannot be 0).

The number of all post numbers not containing a dub:
9 (first digit can be anything between 1 and 9)
*9 (second digit can be anything BUT the first digit)
*9 (and so on)
...
*9
= 9^8

The number of post numbers containing at least one dub = 9*10^7 - 9^8
The chance of getting a post number with a dub anywhere in the post number:
(9*10^7 - 9^8)/(9*10^7) ~= 0.52
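
The counting argument can be checked the same way. A rough sketch, assuming post numbers are plain n-digit integers with a nonzero first digit as in the post above; count_no_dubs_brute is a hypothetical helper. It confirms the 9^n count by exhaustion for small n, then plugs in the 8-digit case.

```python
def count_no_dubs_brute(n_digits):
    """Count n-digit numbers (first digit 1-9) with no equal adjacent digits."""
    count = 0
    for i in range(10 ** (n_digits - 1), 10 ** n_digits):
        s = str(i)
        if all(a != b for a, b in zip(s, s[1:])):
            count += 1
    return count


# 9 choices for the first digit, then 9 for each later digit: 9^n in total.
for n in range(1, 6):
    assert count_no_dubs_brute(n) == 9 ** n

total = 9 * 10 ** 7          # all 8-digit post numbers
with_dubs = total - 9 ** 8   # those containing at least one dub
print(with_dubs, with_dubs / total)
```

The ratio simplifies to 1 - 0.9^7 = 0.5217031, so the ~0.52 above holds. It differs from OP's 0.56953279 only because OP's script pads to 9 digits and allows leading zeros.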

...

Nice digits

Mate c'mon, think about this problem properly.

If you really want to flex your maths and explore Python, consider a Bayesian approach.

en.m.wikipedia.org/wiki/Binomial_distribution

homo

What's funny is that there's (on average right now) a 0.2 rate of people getting double digits ITT, lower than the statistical average. Ironic, considering all the work that went into OP's code.

>exactly
that's not how numerics work

nice script desu

yeah OP status = btfo

But he posted it right after the epoch of dubs at 63077000-63077999. Since the leading digits obviously aren't dubs, the probability is going to be a lot lower until we reach 63088000, which is still thousands of posts away.
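
To put a number on that, here is a small sketch that measures the share of IDs with dubs anywhere inside a given window of consecutive post numbers. The helper names has_dubs and dubs_fraction are made up for illustration, and the windows are simply the ranges mentioned above.

```python
def has_dubs(post_id):
    """True if the decimal digits of post_id contain two equal adjacent digits."""
    s = str(post_id)
    return any(a == b for a, b in zip(s, s[1:]))


def dubs_fraction(start, stop):
    """Fraction of IDs in range(start, stop) that contain dubs anywhere."""
    hits = sum(has_dubs(i) for i in range(start, stop))
    return hits / (stop - start)


print(dubs_fraction(63077000, 63078000))  # every ID here contains "77"
print(dubs_fraction(63078000, 63088000))  # the stretch right after it
```

The first window comes out at 1.0 and the second at 0.271, so for posts made in that stretch the relevant rate is well under the 0.57 you get when averaging over all IDs.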

The issue with calculating this empirically is that you cannot sample consecutive posts since they're not independent. You can do this with the last 2 digits though since they're practically independent, at least on high-traffic boards.

Czech’d. Czech out my doubles, too. Nigger.

There's actually a formula for it. It might take me a while to find, but you can calculate it. An insurance embezzlement was found that way once, when they proved a certain number was appearing in the sheets too regularly and was so statistically improbable it couldn't have occurred naturally.

The way you calculate probability is embarrassing. Study probability and use the proper formula instead of writing scripts.

Right off the bat you didn't account for the chance of dubs when the post number is 66xxxxxxx or 6355xxxxx, etc.

Has it right, and phrased it correctly in the end (picking a post at random). But as somebody else mentioned, the sequence matters here, so the conditional probability of any post at the current point in time having matching double digits differs. On high frequency boards you can probably assume that the last two digits are a random iid draw and 0.1 is accurate.

is this the designated dubs thread?