AI Revolution and Finance
Following 2022 came a phenomenon known as the "Artificial Intelligence (AI) Boom," in which emerging markets and businesses received enormous valuations for having anything to do with AI. It has been exciting to be a part of, and in many cases people profited greatly from the enthusiasm. Amid the innovation, it is important to note that the basic computer science techniques and many of the foundational ideas underlying AI in the financial sector have not fundamentally changed since the 1980s; distinguishing true innovation from market hype is necessary. Since Staga began using NVIDIA's CUDA in 2015 to process large volumes of data in a highly distributed manner on GPUs, the software algorithms we use to trade and evaluate market flows have remained fairly stable.
ML and AI are not black boxes in which data is fed into a GPU and magic happens. They are meticulously tailored: huge sets of weights are fitted through regression in a highly parallelized fashion, which requires intensive computation. These models need massive parallelism to calculate activations quickly, and that is exactly what NVIDIA's CUDA, an SDK for general-purpose GPU programming, provides. That innovation was a significant advancement and has long since become the industry standard. Although LLMs have advanced significantly in the years since 2022, and it will be interesting to see where they land in the market, affordable access to very powerful hardware (at least for the models we use) is what is really driving innovation forward.
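To make that parallelism concrete, here is a minimal CUDA sketch (illustrative only, not Staga's production code; the layer sizes and values are made up) that computes one dense layer's activations by giving every output neuron its own GPU thread:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Each GPU thread computes one output activation: the dot product of
    // one weight row with the input vector, plus a bias, passed through
    // ReLU. Thousands of such threads run concurrently, which is the
    // massive parallelism discussed above.
    __global__ void denseForward(const float *W, const float *x,
                                 const float *b, float *y,
                                 int inDim, int outDim) {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row >= outDim) return;
        float sum = b[row];
        for (int i = 0; i < inDim; ++i)
            sum += W[row * inDim + i] * x[i];
        y[row] = sum > 0.0f ? sum : 0.0f;  // ReLU activation
    }

    int main() {
        const int inDim = 4, outDim = 2;
        float hW[] = {0.1f, 0.2f, 0.3f, 0.4f,
                      0.5f, 0.6f, 0.7f, 0.8f};   // 2 x 4 weight matrix
        float hx[] = {1.0f, 2.0f, 3.0f, 4.0f};   // input vector
        float hb[] = {0.5f, -0.5f};              // biases
        float hy[outDim];

        // Copy inputs to the GPU, run the kernel, copy the result back.
        float *dW, *dx, *db, *dy;
        cudaMalloc(&dW, sizeof(hW));
        cudaMalloc(&dx, sizeof(hx));
        cudaMalloc(&db, sizeof(hb));
        cudaMalloc(&dy, sizeof(hy));
        cudaMemcpy(dW, hW, sizeof(hW), cudaMemcpyHostToDevice);
        cudaMemcpy(dx, hx, sizeof(hx), cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, sizeof(hb), cudaMemcpyHostToDevice);

        // 256 threads per block is a common default; one block suffices here.
        denseForward<<<(outDim + 255) / 256, 256>>>(dW, dx, db, dy,
                                                    inDim, outDim);
        cudaMemcpy(hy, dy, sizeof(hy), cudaMemcpyDeviceToHost);

        for (int i = 0; i < outDim; ++i)
            printf("y[%d] = %f\n", i, hy[i]);
        cudaFree(dW); cudaFree(dx); cudaFree(db); cudaFree(dy);
        return 0;
    }

Because each thread's work is independent, the same pattern scales from this toy layer to the millions of weights in a production model.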
GPUs have long been the accepted norm for linear algebra and matrix computation. Open access to GPU compute through a powerful SDK is one of the main reasons NVIDIA has cornered the AI market, and it is also the reason we use it at Staga. We had humble beginnings in an engineering lab ourselves, and the release of CUDA made it possible for the average researcher and engineer to write GPU code in an easy and intuitive manner; that accessibility is how CUDA became well established within the open-source and research communities. As of 2024, hardware remains the primary limitation on advancing models: the expense and processing power of GPUs continue to be a barrier to the advancement of the technology. To put it simply, data volumes still far exceed the transistors available to process them.
To push this further, we need a way to break free from the monopoly of a single GPU provider. In the future, I see GPUs increasingly being offered as a service, especially through cloud compute offerings. To propel ML/AI to new heights and potentially challenge the current market dominance, a shared abstraction layer that lets the same code execute on GPUs from multiple vendors will be essential. That would make it easier to develop hardware solutions that are more widely available and reasonably priced. While products exist that compile code into multiple hardware profiles, the risks and overhead still make them challenging to use. New research publications might be the catalyst for the next "AI boom," but accessible, affordable IaaS (Infrastructure as a Service) solutions are what will really advance the sector, much as the cloud did for the web.
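To make the idea of a shared abstraction layer concrete, here is a minimal, hypothetical sketch in C++. The interface and backend names are invented for illustration and do not refer to any existing product; a CPU backend stands in for the vendor implementations so the sketch runs anywhere:

    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>

    // Hypothetical vendor-neutral interface: model code targets this,
    // and each GPU vendor supplies an implementation behind it.
    struct GpuBackend {
        virtual ~GpuBackend() = default;
        virtual void *alloc(std::size_t bytes) = 0;
        virtual void release(void *ptr) = 0;
        // General matrix multiply: C (m x n) = A (m x k) * B (k x n).
        virtual void gemm(const float *A, const float *B, float *C,
                          int m, int k, int n) = 0;
    };

    // A CPU reference backend. A CudaBackend or RocmBackend would
    // implement the same interface on top of the vendor's libraries.
    struct CpuBackend : GpuBackend {
        void *alloc(std::size_t bytes) override { return std::malloc(bytes); }
        void release(void *ptr) override { std::free(ptr); }
        void gemm(const float *A, const float *B, float *C,
                  int m, int k, int n) override {
            for (int i = 0; i < m; ++i)
                for (int j = 0; j < n; ++j) {
                    float s = 0.0f;
                    for (int p = 0; p < k; ++p)
                        s += A[i * k + p] * B[p * n + j];
                    C[i * n + j] = s;
                }
        }
    };

    int main() {
        CpuBackend backend;         // swap in a GPU backend here
        float A[] = {1, 2, 3, 4};   // 2 x 2
        float B[] = {5, 6, 7, 8};   // 2 x 2
        float C[4];
        backend.gemm(A, B, C, 2, 2, 2);
        printf("%f %f\n%f %f\n", C[0], C[1], C[2], C[3]);
        return 0;
    }

In a design like this, changing GPU vendors means swapping the backend object rather than rewriting the model code, which is precisely the portability that could loosen a single provider's grip on the market.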