Short answer: if your operations are commutative, the order of the inputs doesn't matter, which makes the operations much easier to schedule. This is a nice property for parallel processing, since you need much less synchronization.
Rock-Paper-Scissors Problem: (Add spock and lizard for bonus points ;)
Say you have a long string of R, P, and S moves and want to know who is left standing. Assume the output is the same as when the game is played by repeatedly matching up the first two elements of the list and replacing them with the winner, until only one is left.
RPSSPSRSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPRPSSPRRRRSPSPSPPPRSPSPSRP
Multiplication table:
RR->R
PP->P
SS->S
RS->R
SR->R
SP->S
PS->S
RP->P
PR->P
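As a baseline for the puzzle, here is a minimal sequential sketch of the game described above. The names `TABLE` and `play` are made up for illustration; the entries are copied from the multiplication table.

```python
# The multiplication table, keyed by (current survivor, next challenger).
TABLE = {
    ("R", "R"): "R", ("P", "P"): "P", ("S", "S"): "S",
    ("R", "S"): "R", ("S", "R"): "R",
    ("S", "P"): "S", ("P", "S"): "S",
    ("R", "P"): "P", ("P", "R"): "P",
}


def play(game):
    """Match the survivor against each following element, left to right."""
    survivor = game[0]
    for challenger in game[1:]:
        survivor = TABLE[(survivor, challenger)]
    return survivor


print(play("RPS"))  # R vs P leaves P, then P vs S leaves S, so this prints S
```

This takes one step per character, and each step depends on the previous one, which is exactly what the parallel version has to work around.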
Sit down and figure out how you can evaluate a huge RPS string in parallel.... (solution at the bottom)
Check out their test library for use cases:
https://github.com/twitter/algebird/tree/develop/src/test/scala/com/twitter/algebird
Or the code itself:
https://github.com/twitter/algebird/tree/develop/src/main/scala/com/twitter/algebird
It is a generic programming framework for data types that support concepts such as 0, 1, plus, times, and inverse. The user has to write a wrapper in Scala that takes care of the memory management and carries out the operations. Depending on which of these concepts the data structure implements, it falls under some abstract algebra concept (semigroup, monoid, group, ...), and you can then use algorithms specialized to that concept to achieve your computational goals.
For example, a commutative operation may be faster to evaluate than one that is merely associative, since you are allowed to reorder the elements; however, the framework needs to know this in order to make optimizations.
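To make the difference concrete, here is a small Python sketch (not algebird code; `reduce_in_chunks` is a hypothetical helper). Associativity lets you reduce chunks independently and then combine the partial results in their original order. Commutativity additionally lets you combine the partial results in whatever order they arrive, which is simulated here by reversing them.

```python
from functools import reduce
from operator import add


def reduce_in_chunks(items, op, chunk_size, reorder=False):
    """Reduce each chunk on its own, then combine the partial results."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    partials = [reduce(op, chunk) for chunk in chunks]
    if reorder:
        partials.reverse()  # stand-in for "whichever worker finishes first"
    return reduce(op, partials)


numbers = [3, 1, 4, 1, 5, 9, 2, 6]
words = ["a", "b", "c", "d", "e", "f"]

# Integer addition is associative and commutative: reordering is harmless.
print(reduce_in_chunks(numbers, add, 3, reorder=True))  # 31

# String concatenation is associative only: chunking is fine, reordering is not.
print(reduce_in_chunks(words, add, 2))                  # abcdef
print(reduce_in_chunks(words, add, 2, reorder=True))    # efcdab
```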
IMHO a C++/Python/Java/Ruby framework based around parallel prefix would have been more useful. Also, the documentation is sort of lacking. I couldn't easily find a concise example of how to take your data structure and wrap it with all the functions it needs for the framework.
Parallel RPS solution.
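Think of it as a finite state machine with three states (R, P, S) and three input symbols (R, P, S): the state is the current survivor and each input symbol is the next challenger.
First encode R -> 0, P -> 1, S -> 2.
Rock transform {RR->R, PR->P, SR->R} | {00->0, 10->1, 20->0} | {0,1,0}
Paper transform {RP->P, PP->P, SP->S} | {01->1, 11->1, 21->2} | {1,1,2}
Scissors transform {RS->R, PS->S, SS->S} | {02->0, 12->2, 22->2} | {0,2,2}
Now apply a parallel prefix to the sequence of transformations {0,1,0}, {1,1,2}, and {0,2,2} under composition of transformations, just like permutation composition/multiplication, where you compute f(g(x)). The table itself is not associative ((RP)S = S but R(PS) = R), so the raw symbols cannot be regrouped; composition of transformations is always associative, which is what lets the work be split across processors.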
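The idea can be sketched in Python as follows. This is an illustration under a few assumptions: it only needs the final survivor, so it uses a pairwise tree reduction (the reduction half of a parallel prefix) instead of a full scan, and it runs serially, although the compositions within each level are independent and could be handed to separate workers. All names (`TRANSFORM`, `compose`, `tree_reduce`, and so on) are made up for this sketch.

```python
ENCODE = {"R": 0, "P": 1, "S": 2}
DECODE = "RPS"

# TRANSFORM[c][state] = survivor when the current state meets challenger c.
TRANSFORM = {
    "R": (0, 1, 0),
    "P": (1, 1, 2),
    "S": (0, 2, 2),
}
IDENTITY = (0, 1, 2)


def compose(f, g):
    """Return the transformation 'apply f first, then g'."""
    return tuple(g[f[x]] for x in range(3))


def tree_reduce(maps):
    """Compose neighbours pairwise, level by level, preserving order."""
    if not maps:
        return IDENTITY
    while len(maps) > 1:
        paired = [compose(maps[i], maps[i + 1])
                  for i in range(0, len(maps) - 1, 2)]
        if len(maps) % 2:
            paired.append(maps[-1])  # odd one out moves up unchanged
        maps = paired
    return maps[0]


def evaluate_tree(game):
    """The first symbol is the start state; the rest act as transformations."""
    total = tree_reduce([TRANSFORM[c] for c in game[1:]])
    return DECODE[total[ENCODE[game[0]]]]


def evaluate_sequential(game):
    """Reference left-to-right evaluation, for comparison."""
    state = ENCODE[game[0]]
    for c in game[1:]:
        state = TRANSFORM[c][state]
    return DECODE[state]


print(evaluate_tree("RPS"))  # S, the same as the sequential evaluation
```

With n symbols the tree has about log2(n) levels, so given enough workers the depth of the computation drops from n steps to roughly log2(n) compositions.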
How would algebird help you with that? Eh.... not sure.
I've linked these before, but the notes here and here are a pretty good (free) set that I used in my mathematical statistics courses at Rice University. They heavily influenced this book (its author had the author of the notes as an instructor at UW Madison), which is probably my favorite statistical theory book.
Those books are more theoretical accounts of statistical theory, and give a good understanding of what's going on.
If you want good books on Machine Learning specifically, then these three books are probably good places to go, with the last being kind of canonical in the field, and all three written by major players in Machine Learning.
None of this stuff is cutting edge, but would get you started for sure.
Best Answer
You're on the right path. According to the definition of $IDF$, $IDF_i=\log_2 (N/n_i)$, so your answers should be
Case A: $TF.IDF \text{ score} = \log_2 (10^{7}/320) \cdot (1/15) = \log_2 (31250)/15 \approx 0.995$
Case B: $TF.IDF \text{ score} = \log_2 (10^{7}/320) \cdot (5/15) = \log_2 (31250)/3 \approx 4.977$
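A quick numerical check in Python, as a sketch: the TF factors 1/15 and 5/15 are taken as given from the two cases, and `tf_idf` is a hypothetical helper name. Note that the TF factor multiplies the logarithm; it does not go inside it.

```python
import math


def tf_idf(tf, num_docs, docs_with_term):
    """TF.IDF score: term frequency times log2(N / n_i)."""
    idf = math.log2(num_docs / docs_with_term)
    return tf * idf


case_a = tf_idf(1 / 15, 10**7, 320)
case_b = tf_idf(5 / 15, 10**7, 320)

print(round(case_a, 3))  # 0.995
print(round(case_b, 3))  # 4.977
```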