Google Summer of Code 2022

Introduction

In this work we present a C++ implementation of the CRHMC (Constrained Riemannian Hamiltonian Monte Carlo) algorithm, together with an R interface. The project was built through the Google Summer of Code program at GeomScale. Its main goal was to implement CRHMC in VolEsti with high performance. With this project, the VolEsti library can now sample from large sparse polytopes; previously, all VolEsti random walks processed the constraint matrix in dense form. One can view the project proposal here.
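To illustrate why the dense-versus-sparse distinction matters, here is a minimal sketch in plain Python. It is not VolEsti code: the helper names `to_csr` and `csr_matvec` are hypothetical, and the compressed sparse row (CSR) layout is only one common way to store a sparse constraint matrix. The example uses the box constraints -1 <= x_i <= 1 written as A x <= b, where A stacks the identity on top of its negation.

```python
def to_csr(rows):
    """Convert a list of dense rows into CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends inside `values`
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute A @ x touching only the stored nonzero entries."""
    y = []
    for i in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y


# Hypothetical example: box constraints in n dimensions, A = [I; -I].
n = 4
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
negated = [[-1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
A = identity + negated

values, col_idx, row_ptr = to_csr(A)
dense_entries = len(A) * n      # entries a dense layout stores: 2 * n * n
sparse_entries = len(values)    # entries CSR stores: 2 * n

x = [0.5, -0.25, 0.0, 1.0]
Ax = csr_matvec(values, col_idx, row_ptr, x)
```

For this matrix the dense layout stores 2n² entries while CSR stores only 2n, and a matrix-vector product costs time proportional to the number of nonzeros rather than to the full matrix size.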

A mathematical theory for Occam's razor

Simplicity is often one of the main goals of any learning process. If a theory that explains a phenomenon is simple, we can understand it more quickly and use it more efficiently. It is also commonly believed that simpler models are more credible. That belief, known as Occam’s razor, has greatly influenced scientific thinking as a heuristic, and it is applied frequently in the field of Data Science. Occam’s original statement is often interpreted as follows: all other things being equal, the simplest explanation of the observations is the most likely to be true. Although this statement is qualitative in nature, there are mathematically quantifiable statements that attempt to capture the meaning of this heuristic. However, as of today there is no single unifying theory that explains all of its occurrences. In this essay we present the results of some candidate theories, as well as their shortcomings.

Importance of Mathematics in Big Data

Over the last decade, massive datasets have been constantly generated by Science and the Internet, to be used for hypothesis testing or for exploratory purposes. However, these datasets can be so large and complex that it is infeasible to process them with traditional techniques. Polynomial complexity, which is often acceptable in other computer science applications, can be prohibitive when working with big data. There, due to computational constraints, we often have to look for algorithms with sub-linear complexity in time and space. We argue for the importance of mathematics in this field by presenting significant paradigms of its use that lie at the core of the field.
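As one illustration of the sub-linear space regime, here is a minimal sketch of reservoir sampling (Algorithm R). It is a standard textbook technique chosen here as an example, not necessarily one of the paradigms the essay covers. It draws a uniform sample of k items from a stream of unknown length in one pass, storing only k items. Note that its running time is still linear in the stream length; only the memory is sub-linear. The function name and parameters are illustrative.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly at random from an iterable, using O(k) memory."""
    rng = rng or random.Random()
    reservoir = []
    for t, item in enumerate(stream):
        if t < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (t + 1),
            # replacing a uniformly chosen slot.
            j = rng.randint(0, t)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Usage: sample 10 values from a stream of 100,000 without storing the stream.
sample = reservoir_sample(iter(range(100_000)), 10, random.Random(0))
```

If the stream holds fewer than k items, the function simply returns all of them.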