Twitter released its Algebird library, an abstract algebra library for scala.
At a high level, what are the maths involved in abstract algebra that have an impact on aggregating large datasets?
abstract-algebradata analysis
Twitter released its Algebird library, an abstract algebra library for scala.
At a high level, what are the maths involved in abstract algebra that have an impact on aggregating large datasets?
"A Book on Abstract Algebra" by Charles Pinter is a great book for the casual reader. It's an easy read yet maintains rigour throughout all of the topics discussed.
What constitutes a "graduate algebra' course in the United States has undergone a lengthy evolution. As MTurgon and rschwieb said earlier, it's highly subjective from teacher to teacher. But I think the evolution of the subject gives some insight of what can be expected at most programs today for a year long graduate course.
The first modern graduate text was,of course, Van Der Waerden's Moderne Algebra,based on the legendary lectures of Emil Artin, Emmy Noether and Van Der Waerden himself at the the University of Gottingen before World War II. It was the first real abstract algebra textbook since these lectures emerged from the researches of the authors.The syllabus became the standard mantra for algebra courses, "Groups,rings and fields". Until the 1960's, algebra was considered a graduate course and it was very unusual for most undergraduates to have had much exposure to algebra with the exception of the world's top programs, such as Harvard or Yale. The first undergraduate course in algebra was developed and presented by Saunders MacLane and Garrett Birkoff at Harvard in 1941 and it eventually became,of course, the basis for their classic text, A Survey of Modern Algebra. But it was very unusual for undergraduates to have a solid course in abstract algebra before the 1960's. When undergraduate and graduate algebra courses became standard courses in math departments, the curricula was fairly well-established: undergraduate courses were based on the Survey while graduate courses where based on Van Der Waerden. By the late 1960's and early 1970's, Van Der Waerden's book was no longer representative of the frontiers of algebra, which were now nearly unrecognizable with the explosive growth of categorical and homological methods,noncommutative algebra, modern commutative algebra and modern algebraic geometry. The first edition of Serge Lang's Algebra was published in 1965, concurrent with the peak popularity of the Bourbaki volumes. Lang's book effectively replaced Van der Waerden as the graduate algebra text of choice at top programs due its completely modern approach and it's emphasis on categorical and homological methods in all areas of algebra. It still is,to a large degree-but its sheer difficulty and dry austerity,coupled with the mammoth size of later editions and the explosively rapid growth of algebra at the research level-has recently lead to a new generation of algebra books at the graduate level, such as Grillet (my personal favorite), Rowen and Rotman.All these books have continued the hard categorical slant of Lang while trying to bring more recent developments into the standard courses.
From the prior discussion as well as my own experiences, I can state that graduate algebra generally differs from undergraduate courses in the subject in 3 ways:
1) Much the same way undergraduate analysis covers the "classical" analysis of the late 19th century and graduate analysis courses cover modern topic of the last century, graduate algebra differs mainly from undergraduate algebra in the emphasis on category theory and homological methods. There are programs that attempt to present category theory and diagram chasing to undergraduates in their algebra courses, but I think this is mainly at the top research programs,where the goal is to speed students to the frontiers as quickly as possible.In general, the categorical approach isn't tackled full on until the graduate algebra sequence and consequently the topics that are most strongly developed by these methods-i.e. homological algebra, noncommutative ring and module theory, algebraic geometry-are not discussed in depth until the graduate course.
2) Expect the course to be much deeper, terser and problem oriented then your undergraduate course. This for 2 reasons: a) A graduate course in algebra needs to survey most of the subject as it stands today to prepare the students for research in either algebra or other fields-and unless the student is ready to learn actively, there simply will not be time to cover the bulk of this work. Also b) graduate students are now beginning to make the transition to being professional mathematicians and they can't very well do that if they're still learning simple proofs off lectures or textbooks. They have to not only learn material much more quickly,they have to learn to build vast tracts of theory themselves. The best way to do both is to give the student a large chunk of the classwork to learn themselves.
3)Depending on whether your instructor is a prominent researcher in the field of algebra, a graduate algebra course may be much more closely tied to the frontiers of research then is usual. If so, he or she may cover the standard material in a "need to know" fashion in order to cover the maximal amount of material relevant to his or her research interests and a large chunk of the course would then be more like a research seminar, relying much more on published papers then standard textbooks. If the professor is not an active algebraicist, expect the course to follow a much more standard path through a conventional textbook like one of the ones stated above.
Specifically what topics can you expect in a graduate algebra class? At the minimum, I would expect the following topics to be covered: group theory through the Sylow theorems, free groups and presentations and the Fundamental Theorum of Abelian Groups, ring and module theory including both the noncommutative and commutative aspects, field theory including a large section on Galois theory, linear and multilinear algebra including a full discussion of tensor products,basic category theory and homological algebra,universal algebra, semisimple rings and algebras and perhaps some algebraic geometry and algebraic number theory.
Hope that helped-good luck!
Best Answer
Short answer, if your operations are communicative the order of your output/input doesn't matter making it much easier to schedule operations since you don't have to worry about order. This is a nice feature when parallel processing since you have to do much less synchronization.
Rock-Paper-Scissors Problem: (Add spock and lizard for bonus points ;)
Say you have a long string of RPS and want to know who is left standing. Assume the ouput is the same as the game carried out when you iteratively match up the first two elements in the list until there is only one left.
RPSSPSRSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPSRPRPSSPRRRRSPSPSPPPRSPSPSRP
Multiplication table:
RR->R
PP->P
SS->S
RS->R
SR->R
SP->S
PS->S
RP->P
PR->P
Sit down and figure out how you can evaluate a huge RPS string in parallel.... (solution at the bottom)
Check out their test library for use cases: https://github.com/twitter/algebird/tree/develop/src/test/scala/com/twitter/algebird
Or the code itself: https://github.com/twitter/algebird/tree/develop/src/main/scala/com/twitter/algebird
It is a generic programming framework for data types that support the concepts of 0,1,plus, times, inverse ... The user has to write a wrapper in Scala that takes care of the memory management, and carries out the operations. Depending on which concepts the data structure implements it will fall into some abstract algebra concept: group, semigroup, monoid,... where you can use specialized algorithms specific to that generic abstract algebra concept to achieve your computational goals.
For example if you have a communicative operation, it may be faster than an associative one since you are allowed to reorder the elements; however the framework needs to know this in order to make optimizations.
IMHO a C++/Python/Java/Ruby framework based around parallel prefix would have been more useful. Also, the documentation is sort of lacking. I couldn't easily find a concise example of how to take your data structure and wrap it with all the functions it needs for the framework.
Parallel RPS solution.
Think of it as a three state finite state machine R, P, S; with transitions R, P, S. First encode R->0, P-> 1, S ->2.
Rock transform {RR->R, PR -> P, SR -> R} | {00->0, 10->1, 20->0} | {0,1,0}
Paper transform {RP->P, PP->P, SP -> S} | {1,1,2}
Scissors transform {RS->R, PS->S, SS->S } | {0,2,2}
Now apply a parallel prefix on the transformations {0,1,0}, {1,1,2}, and {0,2,2} under transformation composition; just like permutation composition/multiplication where you do f(g(x)).
How would algebird help you with that? Eh.... not sure.