What is _wrong_ with Python? a.k.a. PoR probabilities
Seriously. What is its problem? So you have the loop

for x in range(2,12):
    sum += var[x]

If all values of var[] are 1, you'd expect sum to be 11, right?
Wrong. It's only ten, because range(2,12) stops just before 12. If you want to include var[12], the range needs to be (2,13).
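A quick sketch of the off-by-one (Python 3 syntax; var here is just a stand-in list):

```python
# range(start, stop) excludes stop, so range(2, 12) yields 2 through 11: ten indices.
var = [1] * 13                               # stand-in list indexed 0..12
print(len(list(range(2, 12))))               # 10
print(sum(var[x] for x in range(2, 12)))     # 10
print(sum(var[x] for x in range(2, 13)))     # 11, now that var[12] is included
```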
Maybe I got it rite this time.


Join the scorn.
Python.... Bother, I'm procrastinating learning Joomla. I have two web sites to transfer to my hosting and have avoided Joomla for years. Sigh. Not that ugly Drupal is better. CMSs may be sweet for bloggers, but straight coding is better and easier to deal with overall IMO.

Python. Reminds me of all that wasted time learning Perl.

I also need to heavily improve my e-commerce store setup skills.
"Save inches for the bathroom; we're using feet here." ~ Rob Kuntz (2014)

--brought to you by TOLHosting, the service without the site--
Finally dug down and used Akeeba to move a Joomla site. Not seamless, as I had to tinker with the database a bit manually, but not bad other than some system plugin that blocked access to admin.

Now, editing and securing the darn thing.

Apparently CMSs are reputated.... er... reputed... to be easy to hack. I want to secure this bugger down as the next one I transfer is government and that one has been hacked before.

Also, trying to figure out this selling domains game.

Python, though, throws me for a 'loop.'
So. What's the difference between a d4 and a d6 that hasn't rolled a 5 or 6?

I decided I wanted to figure out the probability of getting N successes on M dice to beat TN using PoR mechanics (a standard comparative dice pool, a la White Wolf, with varying die sizes, where dice <= TN in size can be added together). When the dice are larger than TN, it's no big deal. It's not Binomial Distribution easy (that requires all dice to be the same size), but easy. I wrote a Python script that calculated, in less than 15 minutes, all probabilities for target numbers from 1 to 11 and all possible dice pools from {2d4} to {11d12} where every die is greater than the target number. It's not an especially efficient script, but I took measures to eliminate a lot of overhead. For instance, a pool is a set, so order doesn't matter. Because order doesn't matter, we're only dealing with combinations, not permutations. Also, because order doesn't matter, there's no reason not to keep the dice in order by size. Here's my code for the die sequence:
if pool[0:MAX] == [ 12 ]*MAX:    #pool is a list of dice indicated by size (4, 6, 8, 10, 12), MAX is the number of dice in the pool
    break    #All combinations have been used, go to the next TN
for idx in xrange(0, MAX):
    if idx == MAX-1:    #pool[0] through pool[idx-1] are unchanged
        pool[idx] += 2    #the next larger die
    elif pool[(idx+1):MAX] == [pool[idx]]*(MAX-idx-1):    #all dice from idx on are the same size
        pool[idx] += 2
        for j in xrange(idx+1, MAX):
            pool[j] = 4
        break    #only one die changes, so stop looking

The pools iterated thusly:
{4, 4, 4}
{6, 4, 4}
{6, 6, 4}
{6, 6, 6}
{8, 4, 4}
{8, 6, 4}
{8, 6, 6}
{8, 8, 4}
{8, 8, 6}
{8, 8, 8}
{10, 4, 4}

Now, if the target number is 7, then {8, 4, 4}, {8, 6, 4}, and {8, 6, 6} are all equivalent to {8}, and each was calculated as such. Duplicate processing. Wasteful and inefficient. I should have set it up so that the dice ranged not from 4 but from MIN = max([(TN/2+1)*2, 4]). (For TN=7, MIN is 8; for TN=8, MIN is 10. Just how we want it.) But like I said earlier, working this non-binomial distribution is easy. You just need to know the odds for each Bernoulli trial (a single die) and figure out the 2^n combinations of successes. Easy. Done in less than 15 minutes.
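To make that concrete, here's a sketch of the MIN floor and the all-large-dice (Poisson binomial) distribution. min_die and pool_distribution are my names, not the script's, and I'm using Python 3's // for the integer division:

```python
def min_die(TN):
    # Smallest standard die size strictly greater than TN, floored at d4.
    return max((TN // 2 + 1) * 2, 4)

def pool_distribution(pool, TN):
    # P(exactly k successes) for independent dice that are all larger than TN.
    # Each die is a Bernoulli trial with p = (die - TN) / die (the roll must beat TN).
    dist = [1.0]
    for die in pool:
        p = float(die - TN) / die
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # this die fails
            new[k + 1] += q * p        # this die succeeds
        dist = new
    return dist

print(min_die(7), min_die(8))          # 8 10
print(pool_distribution([8, 8], 6))    # [0.5625, 0.375, 0.0625]
```

Folding one die in at a time like this visits each die once instead of enumerating all 2^n subsets, but it computes the same distribution.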

Adding dice to exceed a target number, on the other hand....

Not only do you need to iterate through every pool that contains dice smaller than TN, you then need to iterate through the dice rolls, then you need to see how many successes you can get from each roll of the pool. Now, to help speed things along: if the sum of all the dice is less than the target number, there aren't any successes, and if the sum is less than twice the target number, there can be at most one success. But if the number of dice is greater than 3 and the sum of the dice is greater than 2TN, well, now you need to do some calculating. Here's my recursive function:
def cycle( rolls, TN ):    # rolls is a user Class that contains a list where the index indicates
                           # the result of a die and the value is the number of dice that rolled it.
   rollsum = rolls.rollsum()
   if rollsum < TN:
     return 0
   elif rollsum < 2 * TN:
     return 1
   large = rolls.newlarge()
   if rolls.rolls[ TN - large ]:     # Here TN is actually matched not exceeded. Makes for cleaner code to get the same result.
     rolls.rolls[ TN - large ] += -1
     rolls.rolls[large] += -1
     return 1 + cycle( rolls, TN )
   else:
     total = large
     rolls.rolls[large] += -1
     while total < TN:
       small = rolls.newsmall()
       total += small
       rolls.rolls[small] += -1
     return 1 + cycle( rolls, TN )

This is a greedy algorithm that starts with the largest number and, if its complement doesn't exist, adds the smallest numbers until TN is matched or exceeded. I don't have proof that this method always gives the correct result, but anecdotally, I couldn't create a situation where it didn't.
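One way to back the anecdote up is to pit a plain-list version of the greedy strategy against an exhaustive search on small rolls. This is a sketch of mine, not the script's code; greedy_successes mirrors the strategy described above (values are the usable die results, 1s already dropped), and max_successes brute-forces every disjoint grouping:

```python
import itertools

def greedy_successes(values, TN):
    # Largest value first; pair it with an exact complement if one exists,
    # otherwise add smallest values until TN is matched or exceeded.
    vals = sorted(values)
    count = 0
    while vals and sum(vals) >= TN:
        large = vals.pop()                # take the largest remaining value
        if TN - large in vals:            # exact complement -> one success
            vals.remove(TN - large)
            count += 1
            continue
        total = large
        while total < TN and vals:
            total += vals.pop(0)          # add the smallest values
        if total >= TN:
            count += 1
    return count

def max_successes(values, TN):
    # Exhaustive: the most disjoint groups, each totaling at least TN.
    best = 0
    for r in range(1, len(values) + 1):
        for idx in itertools.combinations(range(len(values)), r):
            if sum(values[i] for i in idx) >= TN:
                rest = [v for j, v in enumerate(values) if j not in idx]
                best = max(best, 1 + max_successes(rest, TN))
    return best

roll = [4, 3, 2, 2]
print(greedy_successes(roll, 5), max_successes(roll, 5))   # 2 2
```

The exhaustive version is exponential, so it's only good for spot checks, but any mismatch it finds would be a counterexample to the greedy approach.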

And yes, I'm aware of the dangers of recursive functions in terms of time and memory. However, the number of times the function is called is at most ceiling(MAX/2). For a pool of 11 dice, that's 6 times. So the recursive function isn't the problem. The problem is that if you have a pool of 8 d10s, you're looking at 100,000,000 different rolls. You know that fifteen minutes I mentioned earlier? It takes longer than that to calculate 100,000,000 die rolls to beat 10. Then we get to do it again for 11. And 12.

Which brings us back to my original question: What's the difference between a d4 and a d6 that hasn't rolled a 5 or 6? The answer: nothing. Let's look at the roll combinations between the two for {2d4} and {d4, d6}.
d4     d4           d4     d6 that hasn't rolled 5 or 6
0      0            0      0    dice that roll 1 cannot be used, thus 0.
0      2            0      2
0      3            0      3
0      4            0      4
2      0            2      0
2      2            2      2
2      3            2      3
2      4            2      4
3      0            3      0
3      2            3      2
3      3            3      3
3      4            3      4
4      0            4      0
4      2            4      2
4      3            4      3
4      4            4      4

Looks identical to me. So after calculating {2d4}, I only need to calculate
5    0
5    2
5    3
5    4
6    0
6    2
6    3
6    4

Remember those 100,000,000 calculations? Now it's only 20,000,000, the other 80,000,000 having already been done. Now that I've finally figured that out, all I have to do is actually code it.
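The equivalence is easy to check mechanically. In this sketch (my own naming, with a roll of 1 recorded as 0 since it can't be used), the rolls of {2d4} are exactly the rolls of {d4, d6} where the d6 hasn't shown a 5 or 6:

```python
import itertools

FACES = {4: [0, 2, 3, 4], 6: [0, 2, 3, 4, 5, 6]}   # a 1 is stored as 0: unusable

d4_d4 = set(itertools.product(FACES[4], FACES[4]))
d4_d6 = set(itertools.product(FACES[4], FACES[6]))
overlap = {r for r in d4_d6 if r[1] <= 4}           # d6 rolls that a d4 could have made

print(overlap == d4_d4)       # True: those 16 rolls are already done
print(len(d4_d6 - d4_d4))     # 8: only the rolls with a 5 or 6 are new work
```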
Well, that was fun. I finally got the additive portion of the pool working at a (relatively) decent rate. And I reworked the non-binomial distribution to only include dice greater than the target number. Wow, that ran quickly. I'm currently running two instances of additive on the "good" computer: one modified to calculate pools of two to eight dice (which I mistakenly uploaded instead of the 2-to-11 version) and the other set to process pools of 9 dice. I figured that since one script used only ~50% of the processor, I may as well run two.

But I attached both scripts if you want to take a look. A couple of notes about them: Additive gives the number of successes. To get the probabilities, divide by the product of the pool. Additive also gives a number 1 greater than the target number, i.e. a number to match or beat, not just beat. It's left to the student as an exercise to modify one script to match the output type of the other.

Next step is to do some matrix cross-multiplication to come up with a final probability vector. And yes, mathematically, that last sentence was complete gibberish. But bear with me. Say we have a pool {2d4, d8, d10, 2d12} and the target number is 9. What's the probability of failure? Of only one success? Or of three? To start, there are two sub-pools to look at: {2d4, d8} and {d10, 2d12}; one has small dice (add to beat the TN) and the other's dice are large. Doing lookups on these two tables gives us:

9 [12, 12, 10] [1.0125, 0.7875, 0.18749999999999997, 0.012499999999999997]

Something very wrong.

Let me get back to you.

Attached Files
.txt   non-binomial (Size: 1.77 KB / Downloads: 88)
.txt (Size: 3.99 KB / Downloads: 88)
Okay. To correct the non-binomial file, change i**(MAX+1) to i**MAX.

Also, now that I've gotten some actual sleep, let me apologize for that word salad, above.

The method for determining the probability of any number of successes in a mixed pool (some dice larger, some dice smaller than the TN) is simple dot-product multiplication. If the probabilities for the two sub-pools are arranged thusly:
             # ->    0         1         2         3
             % ->    0.5063    0.3938    0.0938    0.0062
 #    %           -----------------------------------------
 0    0.5547   |     0.2808    0.2184    0.0520    0.0034
 1    0.4453   |     0.2255    0.1754    0.0418    0.0028
(blecch. A preview shows that things do not align properly.)

The i,jth element of the 2x4 matrix is the product of the ith element of the column and the jth element of the row. To get the probability of a certain number of successes (T), just add along the diagonal where i+j=T.
no successes: 28.08%
one success: 44.39%
two successes: 22.74%
three: 4.52%
four: 0.28%
sum: 100.01% (<- It's called rounding error and it's why you use the calculated values not the recorded values for further calculations.)
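Put another way, those diagonal sums are just the convolution of the two sub-pools' success distributions. A sketch (combine is my name for it; the vectors are the ones from the table above):

```python
def combine(a, b):
    # Convolve two independent success distributions: out[t] = P(t successes total).
    out = [0.0] * (len(a) + len(b) - 1)
    for i, p in enumerate(a):
        for j, q in enumerate(b):
            out[i + j] += p * q
    return out

small = [0.5547, 0.4453]                    # {2d4, d8} vs TN 9, from the table
large = [0.5063, 0.3938, 0.0938, 0.0062]    # {d10, 2d12} vs TN 9
print([round(x, 4) for x in combine(small, large)])
# [0.2808, 0.4439, 0.2274, 0.0452, 0.0028]
```

Note the rounded output matches the success percentages listed above, rounding error included.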

Now, in telling you that processing {3d6,d8,4d12} > 12 (>=13 in output terms) took ~4:30, you're probably thinking, "Why don't you just use recursion? Starting with a base of {2d4} to {2d12}, keeping track of which combinations are exact pairings (d1+d2=TN) and the number of successes for each, when you add a die, all you need to do is see how adding the die affects the remainder, then keep track of that." To which I reply, "That might actually work. It would definitely use a lot of memory."
Maybe I should stop huffing glue before posting. Change 2**(MAX+1) to 2**MAX. 2 not i.
Warning: thinking "out loud". Do not expect coherency.

So let's think this through. What would it take to rewrite this thing (additive) "recursively"? The idea is that when we add a die D to a pool N, we look at each pool roll of N and see how each roll of D affects it. It isn't necessary to cycle through N, just the results of N, i.e. the number of successes and which rolls weren't used for those successes. As with my greedy algorithm, I don't have a formal proof, only anecdotal evidence: I tried to create a situation whereby a pool roll p of size n uses up too much of the roll, such that looking only at the remainder of p when d is added causes a success to be missed. I was able to build one where a success was missed after several dice were added in a specific sequence. This loss of a single success out of 4^6 isn't large, but it can propagate. So the question is, are there enough of these chains to skew the probabilities down enough to exceed the rounding error?

That out of the way, we start with a Pool, which contains 2 dice to start. A Pool informs us of all the possible rolls, the number of which is the product of the sizes of the dice; [4,4] gives us {[0,0],[0,2],[0,3],[0,4],[2,0],[2,2],[2,3],[2,4],[3,0],[3,2],[3,3],[3,4],[4,0],[4,2],[4,3],[4,4]}. (A set of lists? Python won't even allow that, since lists aren't hashable. Lists for pools make sense, lists for rolls make sense, but is it necessary to keep a record of the die rolls? Especially since the strategy is to track the remainder.)

This is meaningless to us without a target number, the smallest of which, for our example, is 5: {[0,0]:0,[0,2]:0,[0,3]:0,[0,4]:0,[2,0]:0,[2,2]:0,[2,3]:1,[2,4]:1,[3,0]:0,[3,2]:1,[3,3]:1,[3,4]:1,[4,0]:0,[4,2]:1,[4,3]:1,[4,4]:1}. Successes: [8,8]. (Do we need Successes, or can we calculate it on demand? The database classes I took over a decade ago say the latter.)

Oh, yeah. Remainders. If I reuse my cycle() code that I've already written, and why wouldn't I, we'd have {[0,0]:[0,0,0,0,0],[0,2]:[0,0,1,0,0],[0,3]:[0,0,0,1,0],[0,4]:[0,0,0,0,1],[2,0]:[0,0,1,0,0],[2,2]:[0,0,2,0,0],[2,3]:[0,0,0,0,0],[2,4]:[0,0,0,0,0],[3,0]:[0,0,0,1,0],[3,2]:[0,0,0,0,0],[3,3]:[0,0,0,0,0],[3,4]:[0,0,0,0,0],[4,0]:[0,0,0,0,1],[4,2]:[0,0,0,0,0],[4,3]:[0,0,0,0,0],[4,4]:[0,0,0,0,0]} after condensing. (That's two dictionaries using die rolls as the key. May need to keep that set of rolls.)
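That {roll: successes} bookkeeping for {2d4} against TN 5 can be generated directly; a quick sketch, with itertools.product standing in for the Pool's roll enumeration:

```python
import itertools

faces = [0, 2, 3, 4]      # a d4 with a roll of 1 recorded as 0 (unusable)
TN = 5
successes = {roll: (1 if sum(roll) >= TN else 0)
             for roll in itertools.product(faces, faces)}

print(len(successes))              # 16 rolls in all
print(sum(successes.values()))     # 8 of them succeed, matching the [8, 8] split above
```

Tuples work as dictionary keys here where lists would not, since tuples are hashable.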

Now add a die. Continuing the precedent of least typing, let's look at adding a d4: [4,4,4]
(At this point I'm seriously questioning the wisdom of tracking successes per roll. Seems completely unnecessary and definitely a waste of memory if unneeded. Just need a running total.)
It becomes clear that my above questioning of tracking successes by roll was a mistake. How else to determine if a roll has two successes for larger pools? So we need a pool roll, the number of successes in that roll, and the remainder in a single storage unit. Makes me yearn for Pascal's struct. Or is that C? I'm thinking C. I believe Pascal had records.
class Poolroll:
  roll = []
  rolls = []    #class-level placeholders; __init__ gives each instance its own copies

  def rollsum(self):
    total = 0
    for i in xrange(2, 13):
      total += self.rolls[i] * i
    return total

  def newlarge(self):
    x = 12
    while x > 0:
      if self.rolls[x]:
        return x
      x += -1    #outside the if, or the loop never advances
    return 0

  def newsmall(self):
    x = 2    #[0] and [1] will always be 0
    while x < 13:
      if self.rolls[x]:
        return x
      x += 1
    return 0

  def __init__( self, oldpoolroll, newdieroll ):
    self.TN = oldpoolroll.TN
    self.roll = oldpoolroll.roll[:]    #if I understand this stuff correctly, slicing is the fastest copy method for complex data types,
                             #and without the copy, any changes made to roll would show up in oldpoolroll too.
                             #Something to do with Python names being references to objects.
    self.rolls = oldpoolroll.rolls[:]  #roll is the list of the numbers rolled, rolls is the number of each value rolled or remainder, as the case may be.
                             #roll would be [4,4,4] and rolls [0,0,0,0,3,0,0,0,0,0,0,0,0]
    self.rolls[newdieroll] += 1
    self.successes = oldpoolroll.successes
    #Methods defined on the class can be called from __init__, so we can evaluate the new successes right here.
    #And because adding one die can add at most one success, the process doesn't need to be recursive.
    if self.rollsum() < self.TN:
      pass    #no new successes
    else:
      large = self.newlarge()
      if self.rolls[ self.TN - large ]:     #Here TN is actually matched not exceeded. Makes for cleaner code to get the same result.
        self.rolls[ self.TN - large ] += -1
        self.rolls[large] += -1
        self.successes += 1
      else:
        total = large
        self.rolls[large] += -1
        while total < self.TN:
          small = self.newsmall()
          total += small
          self.rolls[small] += -1
        self.successes += 1
Anything missing? Roll result: check, poolroll.roll; successes: check, poolroll.successes; remainder: check, poolroll.rolls. And it's set to take a smaller-sized poolroll instance. But how do we get the initial seed? Crap, I'm going to have to look up overloading, aren't I? Mmmm... Maybe not. Python doesn't make one declare the types of function inputs, so we might be able to make a seed class that satisfies our needs.

class Seed:
  def __init__(self, roll, TN):
    self.rolls = [0]*13    #built per instance so separate Seeds don't share one list
    self.successes = 0
    self.roll = [roll]
    if roll:
      self.rolls[roll] = 1  #0 is never counted so why muck things up by including 0 rolls
    self.TN = TN

This has all the requirements for initializing a Poolroll instance, and to get our seed:

seed = {}
for i in (4, 6, 8, 10, 12):
  for j in (0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12):
    if j <= i:
      seed[i, j] = Seed(j, TN)    #a dict keyed by (die, roll) tuples does the job of a multidimensional array

Meh. Tired. Time for sleep.
We have just proven that we do not deserve what we have.


Pool rolls. They derive from pools. Pools have 1 to 11 dice. They have all those rolls. They have a means of determining all those successes from those rolls. So let's see if I can figure out this inheritance stuff, if it's necessary.

class Pool(Poolroll):
  pool = []
  def __init__(self, oldpool, newdie):    #and maybe other stuff
    self.pool = oldpool.pool[:]
    self.pool.append(newdie)    #don't forget to actually add the new die
    self.pool.sort()    #largest die to the right; add self.pool.reverse() to put it to the left


You know. Suddenly I realize just how unimportant this is right now.
I need to try to distract myself.

With the Pool class, I shifted TN from PoolRoll to Pool. I'll be editing that in a bit, if I remember.
class Pool:
  def __init__(self, oldpool, newdie):
    self.TN = oldpool.TN
    self.rolls = set()
    for oldpoolroll in oldpool.rolls:
      for dieroll in (0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12):
        if dieroll <= newdie:
          newpoolroll = PoolRoll(oldpoolroll, dieroll, self.TN)
          self.rolls.add( newpoolroll )

  def successes(self):
    totals = [0, 0, 0, 0, 0, 0]
    for roll in self.rolls:
      totals[roll.successes] += 1
    return( totals )

class SeedRoll:
  def __init__( self, roll ):
    self.roll = [roll]
    self.rolls = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    self.rolls[roll] += 1
    self.successes = 0

class SeedPool:
  def __init__(self, die, TN):
    self.TN = TN
    self.rolls = set()
    for i in (0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12):
      if i <= die:
        self.rolls.add( SeedRoll( i ) )    #turns out self.rolls.add( SeedRoll( i ) ) works directly, so no temporary and no del needed

And this is a recursive function that will, hopefully, cycle through all combinations (not permutations) of pools of 2 to 11 dice where each die is less than or equal to the first die in the pool, which is itself limited by the target number. If my understanding is correct, the dictionary pools is passed by reference, so changes made to it persist through the function calls, but oldpool and pool are limited by scope. MAX is unchanged throughout, so it doesn't matter. I suppose I could just use it as a global, but, you know, can't do it.
def iterate( pools, oldpool, lastpool, MAX ):
  pools[str(lastpool)] = oldpool
  if len(lastpool) == MAX:
    return pools
  for newdie in (4, 6, 8, 10, 12):
    if newdie <= lastpool[-1]:
      currentpool = lastpool + [newdie]    #the new die has to make it into the pool list
      pool = Pool(oldpool, newdie)
      pools = iterate( pools, pool, currentpool, MAX )
  return pools
What I think should happen is it'll create dice pools in the following pattern (seeded with [8] where MAX = 4):
[8,4], [8,4,4], [8,4,4,4]
[8,6], [8,6,4], [8,6,4,4]
       [8,6,6], [8,6,6,4]
                [8,6,6,6]
[8,8], [8,8,4], [8,8,4,4]
       [8,8,6], [8,8,6,4]
                [8,8,6,6]
       [8,8,8], [8,8,8,4]
                [8,8,8,6]
                [8,8,8,8]
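As a cross-check on the enumeration (my own sketch, not part of the scripts), itertools.combinations_with_replacement produces the same non-increasing pools directly:

```python
import itertools

def pools_from(first_die, MAX):
    # All pools of 2..MAX dice starting with first_die, every later die <= first_die,
    # with the dice listed in non-increasing order.
    sizes = [d for d in (4, 6, 8, 10, 12) if d <= first_die]
    out = []
    for n in range(1, MAX):
        for tail in itertools.combinations_with_replacement(sizes, n):
            out.append([first_die] + sorted(tail, reverse=True))
    return out

pools = pools_from(8, 4)
print(len(pools))    # 3 + 6 + 10 = 19 pools of 2 to 4 dice
```

Any divergence between this list and what the recursion emits would flag a bug in iterate().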

I'm hoping that this is a properly functioning main:
MAX = 11    #maximum size of a pool
for TN in xrange(5, 14):    #target numbers
  limit = ((TN-1)/2)*2    #should return the largest even number below TN
  for die in (4, 6, 8, 10, 12):
    if die <= limit:
      currentpool = [die]
      seed = SeedPool(die, TN)
      pools = {}
      pools = iterate(pools, seed, currentpool, MAX)    #seed the recursion with the single-die pool
      file = open("iterative additive", "a")
      for key in pools:
        file.write("%s %d %s\n" % (key, pools[key].TN, pools[key].successes()))
      file.close()
      del pools, seed

Holy crap, that's going to suck up RAM, especially when we get to d12s. But where can we do a dump? When we get to the end condition in iterate()? That would require putting pools into scope. Or maybe not. The last pool can definitely be deleted; the next-to-last can be deleted if its last die is the same size, and so on. Everything will eventually be deletable, but we want to be sure to get the output first, and we don't want to output anything until it's ready to be deleted. This makes things fun because we're tracking the pools in a dictionary, which cannot use lists as keys (lists are mutable, so they aren't hashable), which is why I'm converting the lists into strings, and I'm leaving the brackets in for output purposes. We could slice the string, looking for number characters, but that probably creates problems with single- and double-digit numbers. Converting the string back into a list is doable but still messy. Another method is to store the pool list in pool, something I had removed as duplicate information to keep memory usage down. I'm thinking it's worth putting back in.

def iterate( pools, oldpool, lastpool, MAX ):
  if len(lastpool) == MAX:
    pools[str(lastpool)] = oldpool
    file = open("iterative additive", "a")
    for key in pools.keys():    #keys() gives a separate list in Python 2, so deleting entries mid-loop is safe
      if pools[key].pool[-1] == lastpool[-1]:
        file.write("%s %d %s\n" % (key, pools[key].TN, pools[key].successes()))
        del pools[key]
    file.close()
That should do it. ( [-1] looks at the last entry, -2 the second last, etc.) If my reasoning is correct (again).
Join the scorn.
