Over the years, the amount of data businesses capture has skyrocketed, largely thanks to the Internet of Things and cheap storage.
Now, companies are focusing on the next step: doing as much as possible with this data. Although typical statistical and analytical tools can extract valuable insight, making the most of data requires a burgeoning technology: machine learning. Here are some of the ways businesses are leveraging machine learning, along with some of the most powerful and popular tools.
Conceptualizing Machine Learning
Machine learning is a subset of artificial intelligence, and it relies on the concept of allowing algorithms to train themselves to analyze data to find patterns and trends. When it’s possible to write down a complete list of rules, traditional rule-based AI is often sufficient. Machine learning shines on more complex tasks where such rules are difficult to spell out. Complex games, such as Go, are best analyzed using machine learning by providing a bit of training up front and letting the algorithms determine their own rules through trial and error. Similarly, tasks such as facial recognition improve dramatically with machine learning.
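To make the idea of an algorithm finding its own rule concrete, here is a minimal sketch of a perceptron, one of the simplest learning algorithms. It is given four labeled examples of the logical AND function and adjusts its weights by trial and error until its guesses match. The data set, the function names, and the learning rate are assumptions chosen for illustration; real systems such as Go engines or facial recognition are vastly larger, but the principle of correcting mistakes over many passes is similar.

```python
# Toy, hypothetical example: learn the logical AND rule from labeled examples
# instead of hard-coding it.

def predict(weights, bias, x1, x2):
    """Return 1 if the weighted sum is positive, otherwise 0."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Nudge the weights toward the right answer after every wrong guess."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias

# Each sample is ((input_1, input_2), expected_output).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
learned = [predict(weights, bias, x1, x2) for (x1, x2), _ in AND_SAMPLES]
print(learned)
```

Nothing in the code states what AND means; the rule emerges from the examples alone, which is the difference from a hand-written rule list.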
We tend to think in terms of discrete computer programs. When it comes to machine learning, however, it’s important to note that frameworks rule. Using machine learning requires piecing together various tools within a framework to set up algorithms to learn on their own. Popular frameworks vary dramatically in terms of how they’re used and where they excel. Those looking to invest in machine learning need to dedicate a significant amount of planning time before settling on a framework or set of frameworks, as development time can be expensive. Furthermore, data scientists often specialize in specific tools, and switching to a new tool can be difficult.
Although it lacks the speed of C, Java, and other compiled languages, Python is popular among beginning programmers and professionals alike due to its clean syntax and powerful built-in capabilities. Pure Python is comparatively slow at the raw calculations that machine learning and neural networks rely on, but the NumPy library hands that work to optimized compiled code, offering advanced and performant mathematical capabilities that make it a popular tool for data scientists and others working with large volumes of data. Python is also popular for more specific frameworks: Google’s open source TensorFlow suite provides a robust and fast framework for deep learning, and it’s a great foundation for building neural networks. Those seeking yet more abstraction can turn to tools such as Keras, which runs on top of TensorFlow and provides easier neural network creation and deployment. Python’s ability to scale from a raw programming language to more and more layers of abstraction makes it a robust and powerful machine learning solution, and companies looking for flexibility and a wealth of choice may want to make Python their primary focus.
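As a rough illustration of the step from plain Python to NumPy, the sketch below computes the same weighted sum twice: once with an explicit Python loop and once with a single NumPy call. The sample values and the helper name dot_pure_python are made up for this example, and it assumes NumPy is installed. On large arrays the NumPy version is typically much faster because the loop runs in compiled code, though the exact difference depends on the machine.

```python
import numpy as np

def dot_pure_python(a, b):
    """Weighted sum computed element by element in the interpreter."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

# Hypothetical sample data: 1,000 values, each given a weight of 0.5.
values = list(range(1, 1001))
weights = [0.5] * 1000

# Same calculation, two layers of abstraction.
slow = dot_pure_python(values, weights)
fast = float(np.dot(np.array(values, dtype=np.float64), np.array(weights)))

print(slow, fast)
```

Frameworks such as TensorFlow and Keras add further layers on top of this kind of array arithmetic.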
Apache’s MXNet is a relative newcomer to deep learning, but it’s garnered substantial interest from some of tech’s giants. Machine learning relies on CPUs and GPUs to crunch data fast enough to generate results in a reasonable timeframe, and the popular frameworks are all designed to scale across multiple processors. MXNet, however, was designed in the era of cloud computing, which may explain why Amazon Web Services adopted it as its deep learning framework of choice. It’s also designed to be portable. Considering how quickly computing can change, this portability makes MXNet a notably flexible option, and if a new type of processor comes to market, MXNet might be among the first to support it. Companies often struggle to decide whether to invest in their own hardware or tap into a public cloud. MXNet makes switching between the two simple, and it’s a good choice for hybrid clouds.
Apache also fosters another well-known machine learning framework: Mahout. Although it’s not strictly tied to the platform, Mahout is a common choice for use with Hadoop, a widely used platform for distributed storage and data processing. Machine learning is powerful, but many organizations are better served by focusing first on their data storage and management, which makes Mahout’s easy integration with Hadoop a compelling feature. Furthermore, Mahout runs on the Java Virtual Machine, and Java remains one of the most popular programming languages by most metrics. Because expertise in machine learning is relatively rare, many programmers are looking to transition to data science work. Those seeking candidates for machine learning tasks might be able to tap into a large developer base by sticking with one of the world’s most widely used programming languages and platforms.
Although general-purpose machine learning tools provide a powerful foundation to build on, some organizations might benefit more from business-specific tools. Many machine learning tools in use by businesses were originally designed for use in science, but business-focused tools are coming to market. Infosys Nia, for example, is designed to capture and analyze data for “people, processes, and legacy systems,” making it a compelling option for businesses looking to take advantage of artificial intelligence and machine learning without building their own tools. IBM, long a leader in artificial intelligence, now offers a range of tools based on its Watson artificial intelligence system. Before investing in a more general machine learning framework, companies should note their use cases and take a look at more tailored solutions to see if any offer the suite of capabilities they’re looking for.
Making a Decision
Deciding on the right machine learning platform is far more complex than most other tech decisions. The major cloud providers, for example, all offer similar capabilities, and cost differences tend to be fairly small. Similarly, companies hosting their own websites have great options to choose from, and operational costs and performance will vary little when sticking with popular web servers. Choosing a machine learning framework, by contrast, requires deciding on a paradigm that best matches an organization’s needs. Companies looking to improve their efficiency and use data in a smarter way might be best served by tools designed with specific use cases in mind, as they provide excellent value. Companies looking to invest more heavily in machine learning for other core functions, on the other hand, might want to look at more general-purpose tools. Building a team of data scientists able to uncover and exploit value hidden within the ever-growing mountain of data typical businesses collect can be a powerful investment.