The risk of using Windows kernel-mode drivers in systems management

Windows kernel-mode drivers can pose risks if used in IT efficiency systems management tools.

Implementing innovative IT efficiency solutions is becoming a top priority for many businesses, so it is critical that solutions providers understand how those efficiency solutions can impact customers' systems management infrastructure, as well as which practices may introduce unnecessary risk.

Some IT efficiency solutions use Windows kernel-mode drivers, introducing inherent risk, extra maintenance and the very real possibility of a major system crash or a blue screen of death (BSOD) epidemic. Microsoft even noted that 90% of BSODs are caused by third-party drivers -- which is why driver signing was introduced.

But even a signed Windows kernel-mode driver may not be up to standard. Other third-party software (such as vendor hardware drivers, third-party disk encryption, or security and antivirus tools) may run in the kernel or use the same memory space that your customer's infrastructure tool wants to use. This can compromise system stability: even if the products initially appear to work together, an update, patch or new version of any kernel-mode product could suddenly introduce a clash. The result would be the dreaded BSOD, as well as a significant loss of time, effort and money.

Some IT professionals get so excited by how much a technology can improve systems management efficiency that they unconsciously downplay the risk it poses. Even lab tests and replica environments cannot fully simulate what will happen in your customer's production environment -- some things simply may not go as smoothly as planned.

Memory management options and risks

There are three types of memory management that drivers can use: direct, shared or buffered. With shared or buffered memory, the software has some ability to correct itself. If an issue occurs, it will not crash the system; the driver simply exits and releases the memory it was using for the system to reclaim.

This is not the case with direct memory, which makes some types of kernel drivers much more dangerous: They directly use memory that the operating system uses. File system filter drivers are a prime example, since they're dealing with file system I/O and have speed requirements that make shared or buffered management an unacceptable option.

The risk these products introduce to your customers' environments through coexistence issues is serious. An IT efficiency solution running in the kernel space of the operating system would likely take down the system, and possibly the entire network, in the event of an issue. In fact, a widespread failure may require a visit to every single system to fix it or rebuild it completely.

On top of this, IT efficiency products relying on kernel-level drivers have been shown to consume 40 to 60 MB of system resources because of the overhead that Java Runtime Environment-based solutions require. Relying on a third-party runtime also adds complexity through additional layers of code.

A recent example of this is the Java zero-day exploit that was found to allow malicious code to be downloaded to client machines. With Oracle known for slow patch updates, companies have taken radical steps such as disabling or removing Java from the clients to avoid exposure to this threat. If your customer's system management infrastructure runs on Java, they have no choice but to disable their IT efficiency solution or risk exposure. Even with a patch to fix the vulnerabilities, planning, testing and pushing the patch to the environment would negate the anticipated IT efficiency savings.

Solutions providers need to leverage stable, secure IT systems management efficiency tools that can handle such coexistence issues. Doing so will ensure that customers will not need to deal with adverse user, system or network ramifications in the event of any failure.

Bandwidth throttling essentials

Some IT systems management tools rely on hardware-based technology to perform bandwidth throttling -- further increasing IT risk and overhead. This requires your customers to make specific router configurations, such as enabling DiffServ, a Quality of Service (QoS) mechanism that each router vendor configures differently. However, the role of QoS is to reserve network capacity, not to free it up for other, possibly more important, traffic. QoS allocates set amounts of bandwidth to certain types of traffic at the router level. A 10 Mbps connection, for example, could be divided to give 5 Mbps to VoIP traffic, 2 Mbps to email, 2 Mbps to Web browsing, and so on.
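The per-class split in the example above can be sketched in a few lines of Python. This is only an illustration: it treats the figures as megabits per second, and the class names and the unreserved_capacity helper are hypothetical, not part of any router or QoS API. It shows how little of the link is left for traffic that has no reservation of its own.

```python
# Hypothetical illustration of the 5/2/2 split described in the text.
# Assumption: the figures are megabits per second.
LINK_CAPACITY_MBPS = 10

# Per-class reservations, as a router-level QoS policy might define them.
reservations_mbps = {
    "voip": 5,
    "email": 2,
    "web": 2,
}


def unreserved_capacity(link_mbps, reservations):
    """Return the capacity left over after all per-class reservations."""
    reserved = sum(reservations.values())
    if reserved > link_mbps:
        raise ValueError("reservations exceed link capacity")
    return link_mbps - reserved


remaining = unreserved_capacity(LINK_CAPACITY_MBPS, reservations_mbps)
print(f"Unreserved: {remaining} Mbps")
```

Under these assumed figures, any traffic outside the three named classes, such as software distribution content, would compete for whatever remains after the reservations.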

Often, these solutions require a Network Driver Interface Specification (NDIS) device driver on the operating system to create the network packets that DiffServ requires. They also cannot cope with networks that have multiple router hops from source to destination -- which is extremely common in larger networks that encompass branch offices. With these IT efficiency solutions, NDIS kernel drivers become a necessity for managing Microsoft System Center Configuration Manager (SCCM) network content bandwidth because the type of packets that can take advantage of these QoS fields cannot be created in user mode.

In addition, if an IT efficiency solution uses DiffServ to check the queue length of the edge router, it may not give your customers a proper measurement of unused bandwidth and SCCM traffic. Because not all routers are configured to use QoS, an edge router may not support it or respond to the DiffServ request. Furthermore, querying only the edge router works only if there is a single router between the client and the server. If there are two, five or 17, looking only at the edge router will not provide accurate traffic information or a true reading of end-to-end bandwidth availability.
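A small Python sketch can make the multi-hop point concrete. It assumes a simplified model in which the bandwidth available along a path is limited by its most constrained hop; the per-hop figures and the end_to_end_available helper are hypothetical and do not come from any measurement or product.

```python
def end_to_end_available(hop_available_mbps):
    """Return the path's available bandwidth, bounded by its tightest hop."""
    if not hop_available_mbps:
        raise ValueError("path must contain at least one hop")
    return min(hop_available_mbps)


# Hypothetical headroom per hop in Mbps, listed from the client's edge
# router onward toward the server.
path = [80.0, 45.0, 3.5, 60.0]

edge_only = path[0]                  # what an edge-router-only check sees
actual = end_to_end_available(path)  # what the whole path can carry

print(f"Edge router reports {edge_only} Mbps; the path can carry about {actual} Mbps")
```

In this made-up path the edge router shows plenty of headroom while a congested hop further along caps the transfer, which is the kind of gap an edge-only reading cannot reveal.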

Relying on a systems management solution that looks at router queues or other similar additions to the hardware stack means more points of failure, more configurations, and more time from both the systems management and network teams. Solutions providers need to deliver systems management platforms that won't break down when the speed between routers degrades or when non-ConfigMgr data (i.e., business data) travels a different network path to a destination over the same router.

Doing away with drivers

Solutions providers should make sure that any IT efficiency tool implemented in a customer's IT environment does not introduce new risks or create extra administrative overhead by using third-party drivers to manage a systems management infrastructure component. Help your customers fully reap the benefits of IT efficiency initiatives with regard to time, effort, bandwidth and budget by doing away with drivers, thus helping them protect business data by seamlessly augmenting their systems infrastructure.

About the author: Richard Threlkeld is a technical product manager at 1E specializing in enterprise systems management. Richard oversees the tactical and strategic implementation of 1E technologies. He also has expertise in several areas of software development and a background in mathematics. Prior to 1E, Richard worked for Qualcomm CDMA Technologies and ran the SMS infrastructure for the engineering division. Richard was a speaker for several years at the Microsoft Management Summit and was one of the first MVPs in SMS. He has also contributed to several books around scripting and patch management.

