I'm a math major who has recently graduated, and I will be starting full-time work in 'data analysis'.
Having finished with decent marks and still being incredibly interested in mathematics, I was thinking of pursuing graduate study/research at some point in the future. I was reading up about possible areas of study for this when I came across topological data analysis, which (as I understand it) is an application of algebraic topology to data analysis.
Given my situation, I was intrigued by the concept and I would like to do some self study so I can have a working understanding of the subject. I have only done basic undergraduate abstract algebra, analysis and point set topology, and I am currently reading Munkres' Topology (Chapter 9 onwards). How do I get from where I am now to understanding the theory behind TDA and being able to apply it?
My knowledge of further mathematics is far from extensive, and I would appreciate any advice on links/texts which I could use to learn the relevant material.
Best Answer
Before answering your question I would like to discuss some points:
And here comes my suggestion for how to draw your own roadmap:
Statistics. A book that, like Ghrist’s book, could help you design your own roadmap is Larry Wasserman’s All of Statistics. Also, note that applying statistical methods to techniques from topological data analysis is an active area of research; while there are some tools and libraries that can be used for applications, this area is still in its infancy. I list here the libraries and relevant references for statistical tools for topological data analysis that I know off the top of my head (these are all related to persistent homology):
Data science. Finally, as for data science more broadly, I don’t know of any good text, but you might get an idea of some of the general themes from the book Mathematical Problems in Data Science.
Aside: to finish off, I give some additional references to books/papers and software packages.
References for topological data analysis and computational topology:
Topology and data, Carlsson
Computational Topology, Edelsbrunner and Harer
Open source libraries that implement some of the methods from topological data analysis: