How the top open source AI software drives innovation

In the world of AI, open source software is driving most of the innovation. But with vendor tools largely sidelined, what does this mean for things like security and technical support?

Open source software and tools have long been a mainstay of the computing ecosystem, especially over the past two decades. From the popularity of Linux in the enterprise server environment to the Firefox browser, open source has found a successful place in the computing hierarchy. It should come as little surprise that open source AI software is finding significant popularity and use within the machine learning and deep learning ecosystem, as well.

In fact, much of the technology that powers AI is open source. This stands in contrast to other enterprise technologies, such as operating systems and databases, which had their initial roots in closed, licensed software.

Considering the high value of AI and the billions of dollars invested in the industry, the fact that the most popular and widely used AI tools are available as open source is a significant win for many interested parties. The high quality of open source AI software makes it possible for a wide range of users, including researchers and academics, startups and entrepreneurs, government organizations, and large enterprises, to explore and experiment with AI without having to make significant upfront investments in software licenses or proprietary systems.

Furthermore, large open source communities are giving organizations wider access to talent and skills, while also enabling individuals to improve their knowledge without requiring upfront investment in tools and technology.

Open source machine learning tools and toolkits

One of the biggest drivers for machine learning is the popularity of Python, an open source general-purpose programming language that has found widespread adoption in the AI community. Through libraries such as scikit-learn, the hugely popular Jupyter data science notebook, and open source projects based on Python, AI developers have found no shortage of tools. Open source community members frequently make preassembled toolkits and open source projects available through GitHub and other repositories to help accelerate fellow community members' AI development.

The R programming language -- and its supporting ecosystem -- is another hugely popular and free, open source environment supporting a large number of machine learning and AI researchers, developers and technology packages.

Major technology companies, including Facebook, Google, IBM and Microsoft, have further enriched the open source AI ecosystem through their contributions. Google spearheaded the development of the machine learning platform TensorFlow as an open source project, while Facebook's AI research group and others helped to develop PyTorch and Caffe.

Microsoft released its Cognitive Toolkit as an open source package for enterprise-quality distributed deep learning. Similarly, Amazon and others helped shepherd the Apache MXNet project, which comes with the Gluon interface and provides simple and quick building blocks for machine learning development.

There has also been open source development at higher levels of the stack. Popular open source computer vision tools include OpenCV and SimpleCV, among others. There is a wide range of open source natural language toolkits, including the Natural Language Toolkit for Python, spaCy and PyTorch-NLP, as well as Java-based OpenNLP and many others.

While not AI-specific, the Robot Operating System has gained widespread adoption in the robotics communities, which has, in turn, enabled machine learning and AI capabilities to be used in robotics and autonomous implementations without the use of any licensed software.

Similarly, there has been a major open source push for predictive analytics and data science. Open source tools include H2O and Apache Mahout, as well as Apache Spark and Hadoop for big data analytics, with machine learning capabilities available through Apache Spark MLlib. In addition to these popular systems, there are a couple of automated machine learning tools that are slowly gaining traction, including TPOT and auto-sklearn.

In much the same way that Red Hat pioneered commercial enterprise support and add-ons for the open source Linux operating system, so too are AI and big data companies such as Cloudera and Databricks providing commercial enterprise support and add-ons for open source AI software tools.

Whither commercial enterprise licensed software for AI?

Open source tools appear to be more popular than some of the commercially available alternatives. Certainly the momentum, traction and movement toward open source tools for AI show no signs of relenting. Despite this trend, enterprises, researchers and government organizations are still heavily invested in commercial tools such as SAS and MATLAB, both of which are widely used for analytics, data science and machine learning applications.

These commercial, licensed tools have widespread adoption and communities that have invested heavily in them. Likewise, they have built a considerable ecosystem of supporting tools and applications that depend on their underlying functionality. As such, despite open source's continued strength and growth in AI and machine learning, commercial tools continue to grow, as well.

While the main drawbacks of commercial tools are their cost and licensing restrictions, large organizations are often cautious about using open source tools in critical environments or in more restrictive settings. In these settings, developers tend to worry about how they are legally allowed to use open source tools, whereas commercial offerings might have more explicit licensing terms.

Commercial tool vendors also like to press the idea that their products are more secure because they come with security updates and technical support. However, companies are increasingly emerging to offer similar benefits for open source tools. While there is no doubt the surge in interest in AI and machine learning is driving adoption of both open source and commercial offerings, commercial vendors can no longer rely on an entrenched customer base to secure their future.

Artificial intelligence continues to show tremendous growth. Because open source tools offer a way for organizations to quickly and easily get started with AI, as well as move their systems to large-scale production, there is no doubt open source AI software will continue to show remarkable growth.