For the impatient reader… I’ll cut to the point. Yes, it was a mistake. We are well beyond the point where a seemingly good idea has been taken to excess and evolved into a bad implementation.
How did I come to this conclusion?
I spend a lot of time evaluating applications (mobile, server, network elements, desktop, mainframe, and “other”). I frequently encounter problems where a component was included in such a way that the software simply won’t work without a framework or library that should have been considered optional.
Frequently this occurs during a vulnerability assessment where an application is installed on a server which is denied GUI capabilities [a lot of developers hard code a “# Include” for GUI libraries even when providing command line capabilities, even for software targeting Unix servers].
I’ve also been encountering a lot of mobile apps which only provide single-user, offline features and have no use for or need of network communications capabilities. Unfortunately I have to do a lot of extra work assessing these applications because OS Kernels, System Libraries, and Application Frameworks have all been “IP Enabled” by their vendors.
Consider this Apple iOS situation… If an application does not include the CoreLocation Framework, I can reasonably assume it’s not likely to use my GPS or location information and I can spend less time looking at those issues. However, even a Notepad or Solitaire App has unfettered usage of NSUrl. NSUrl is a primary mechanism for reading and writing files, both local and remote. So I spend a lot of time looking for remote communications activity in Apps which shouldn’t even include the capability.
Some may believe the CFNetwork framework is required for network communications in an iOS App. It’s not. CFNetwork is only required when the developer wants to interact directly with the protocols at a lower level. APIs like NSUrl are fully capable of interacting with a wide variety of media (file) types from most anywhere; and they are not limited to the HTTP protocol.
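The pattern is not unique to iOS. As a rough illustration in Python (this is not the iOS API, just a comparable standard library call), `urllib.request.urlopen` accepts a `file://` URL and an `https://` URL through exactly the same function. The helper name `read_url` and the file name `note.txt` are made up for this sketch.

```python
import tempfile
import urllib.request
from pathlib import Path


def read_url(url: str) -> bytes:
    """Return the bytes behind any URL scheme urllib understands."""
    with urllib.request.urlopen(url) as response:
        return response.read()


# Write a small local file, then read it back through the URL API.
with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "note.txt"
    note.write_bytes(b"local notes")
    local_bytes = read_url(note.as_uri())  # a file:// URL

# The identical call would reach across the network (not executed here):
# remote_bytes = read_url("https://example.com/")
```

Nothing in the calling code distinguishes the local read from the remote one, which is the assessment problem described above: the capability is present whether or not the app needs it.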
As we slowly move towards IPv6, and situations where devices will have a multitude of IPv6 addresses, the ability to distinguish desired communications activity from undesirable will get even more complicated.
Apple’s iOS NSUrl API isn’t the only example of this; it’s just one which a lot of folk are likely to recognize today. In reality, most modern operating systems are bursting at the seams with these sorts of IP Enabled “features”.
So how did we get to a point where the OS Kernel, System Libraries, and Application Frameworks are so “IP Enabled” that the same API is used whether reading a local text file or a remote file (of any kind whatsoever)?
Explaining this situation may need a little history review…
Once upon a time, computer systems (and networks) did not speak IP. There were numerous other communications protocols and each had its strengths, weaknesses, and appropriate applications.
During the 90’s, in the early days of what most people now recognize as the Internet, vendors of operating systems, programming languages, and application development tools embarked on an industry wide effort to adopt IP as the primary communication protocol for their products. At the time, this seemed like a great idea… nearly universal interoperability.
* Although it was an industry wide effort, it wasn’t particularly coordinated or thoughtful on an industry scale. Some folks tried to provide some thoughtful leadership, but mostly it was a Cannonball Run of vendors scrambling for an anticipated gold rush [which led to the industry’s financial implosion in the early 2000s].
Looking back, the effort occurred as a two phase process. During phase one (of the great IP adoption), the product’s core functions continued to use previously existing internal protocols and an “IP Stack” was added to the product. From a customer perspective, this usually satisfied initial expectations and requirements.
However, vendors felt competitive pressure to optimize their products. Remember, the early to mid 90’s were the time of CPUs measured in Megahertz, when 1MB of RAM was a high end system and storage media was often measured in Kilobytes up to a few MB. Network communications often utilized modems with speeds measured in bits per second.
Internally, CPUs and software applications need to be able to pass information around. Within a single application (or process), this is often done with memory pointers or some equivalent. However, between processes or separate applications there needs to be some communications protocol.
When systems or applications used one protocol for internal communications and then translated data to IP for external communications, many felt this translation process was too slow and consumed too many resources. Another customer frustration arose from the initial practice of vendors shipping their product with only its native internal protocols and requiring customers to obtain an “IP Stack” from a 3rd party. In the early days of Windows 3.x and even the initial version of Windows 95, it was common for the installed operating system to only contain a couple Microsoft LAN protocols. IPX/SPX (Novell), SDLC/HDLC (IBM), AppleTalk, and TCP/IP all required installation of 3rd party software which Microsoft provided little or no support for.
In the mid to late 90’s there were many products available which provided multi-protocol translation services to both desktop operating systems and servers. It was common to find “Multi-Protocol Router” products, usually software gateways, available for establishing (and controlling) communications between an organization’s WinTel, Apple, Mainframe, and other environments. These multi-protocol router applications could also serve as gateways and stateful application firewalls between internal environments and/or external EDI networks or the Internet. Many similar products were available to the desktop for print gateways, internet proxies, access to EDI networks, remote dial / desktop control, and other services.
Amazon, Netscape, and Yahoo all came on the scene in 1994. A lot of early investments were being made and many technical, economic, and social changes were coming together to increase demand for Internet technologies, products, and services. And that demand was growing in both consumer and corporate markets.
So… it’s the mid to late 90’s. All indicators are starting to scream that this Internet thing is going to be big. A lot of good multi-protocol technology already existed for getting people and systems connected to the Internet. But system performance and customer satisfaction were still poor. Vendors were shipping multi-processing systems, some multi-processor systems, and multi-threaded applications and customers were loading up more applications than anyone expected. Web sites were emerging and growing faster than bandwidth and modem capabilities. Vendors were scrambling to get in on the gold rush. And customers often perceived the multi-protocol stacks as performance bottlenecks and/or a source of many system errors.
In reality the problems often had more to do with poor thread management, synchronous queuing, and applications which generated excessive chatter or errors without actually crashing themselves (a misbehaving background task can easily convince a non-technical user that his foreground application isn’t working correctly).
In reality, many of the technologies available in the late 90’s simply were not ready for mass market. Many products were well suited to their intended task and performed quite well for organizations with appropriate support and reasonable expectations. Unfortunately… reality, appropriateness, and reasonable expectations are seldom priorities when it comes to mass marketing to consumers. Many technologies were sold to the consumer (and small business) markets before the products were sufficiently robust or stable. Revenues flowed, advertising dollars and customer perceptions overwhelmed technological realities and in fact perceptions often became the effective reality.
As a result of these and other factors, the industry entered phase two (of the great IP adoption) as vendors began a rush to “IP Enable” their operating systems, system libraries, and application frameworks. Across the industry a lot of products were being redesigned and code was being refactored. Engineering priorities often included improving (or implementing for the first time):
- multitasking – i.e., running (or appearing to run) multiple applications at the same time
- multithreading – splitting an application’s work across multiple threads. Typically some work is sent to a background thread while trying to ensure the user interface or other input queues are kept responsive to new requests.
- remote processing – enabling multiple applications to make service requests or share data. RPC, CORBA, OLE, DDE, and Java RMI are a few examples of remote processing technologies. Remote processing does not require, and is not restricted to, applications running on multiple physical servers in multiple locations. It can, and most often does, occur between applications running on a single host computer within a single operating system instance. A very common example of “remote processing” happening on a local computer would be using the MS Outlook application and selecting the “View Messages in Word” option. The Outlook app invokes the Word app, sends it data, and sends it instructions on what to do.
- asynchronous processing and communications – in an asynchronous process, components can work independently of each other. For software applications, this usually involves some optimization of logic for queuing up multithreaded workloads and handling results. For communications I/O (whether disk, memory, network, etc) this usually starts with increasing available channels so transmit and receive operations can occur simultaneously without collisions; next would be optimizing the distribution of activity across available channels.
Advancements in hardware technologies provided software engineers many reasons to rewrite and update their products. Marketing’s demands for “IP Enabled” fit in nicely with these other priorities. The engineers were also enamored with Internet communications and liked the idea of supporting fewer communications protocols.
- At the time I met a lot of software engineers who had little or no idea of the size and complexity of the “TCP/IP Suite” which already existed in the mid 90s. Even fewer could foresee the explosion of “protocol enhancements” which would follow. Of the software engineers who actually create protocol implementations, many happily left IPX, SNA, NetBios, AppleTalk, and others in the dustbin of history… but I doubt you’ll find many who’d say life has gotten simpler since then.
- IANA maintains a port numbering scheme for TCP and UDP protocols (mostly those which have been recognized thru the IETF RFC process). At this time there are about 1,200 TCP/UDP protocols identified by the IANA registry. Even within an “IP only” environment, this # is just a subset of the protocols available in the seven layer OSI stack.
What started for many product engineers (software and hardware) as an effort to make products compatible with IP soon became a very public optimization contest for vendors and their marketing organizations.
The race resulted in making IP the default protocol for inter-process communications and even intra-process communications… the birth of the “IP Enabled Operating System Kernel”.
The IP Enabled Kernel actually has two key characteristics. Some instances may only have one of these, but many now have both.
The first characteristic could be described as “optimization by inclusion”… or you might call it “kitchen sink compiling”. Many of the networking functions which were previously performed by software modules external to the kernel were compiled into the kernel’s source code. By doing this, the kernel and networking features share the same physical memory space. When the network function lived in a separate process, the kernel would need to physically copy data out to a new memory location which the network function could access. When the two are combined, they can pass pointers to physical memory. The result is a dramatic speed increase and reduction in I/O.
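The difference between copying data and passing a pointer can be sketched in a few lines of Python. This is only an analogy inside one process, not kernel code: `bytes(...)` stands in for the copy made across a process boundary, and `memoryview` stands in for a shared pointer. All variable names are illustrative.

```python
# A buffer standing in for data the kernel is holding.
payload = bytearray(b"packet-data" * 1000)

# Copy: roughly what handing data to a separate process costs.
copied = bytes(payload)

# Shared view: comparable to handing over a pointer to the same memory.
shared = memoryview(payload)

# The owner of the buffer now changes the first six bytes.
payload[0:6] = b"PACKET"

copy_sees_change = copied[0:6] == b"PACKET"         # the copy is isolated
view_sees_change = bytes(shared[0:6]) == b"PACKET"  # the view is not
```

The shared view avoids duplicating the data, which is where the speed comes from. It also means every holder of the view sees, and can be affected by, whatever any other holder does to that memory.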
Imagine the user wants to send a local file from disk to a network location, but is using a computer system where everything is strictly separated into different application processes. The System Kernel is in process #0. The user is currently running application process #1. The user request causes a file manager to be invoked in process #2. And a network stack needs to be invoked in process #3.
In a well designed / optimized system, the file could be read directly from disk to the buffers of the network interface. Proper process boundaries and virtual memory address management provided by the Kernel would prevent the User App, File Manager App, and Network App from knowing anything about each other or the Kernel… and the user’s request would be performed with a minimum of system resources.
Unfortunately, most systems today still aren’t that well designed. A more common result was for the I/O to occur multiple times as the data traversed the various processes. Or in even worse circumstances, the physical memory pointers were passed to all of the processes interested in this information and a bug in one would bring everything to a crash.
The “optimization by inclusion” approach has resulted in many of these functions being compiled into the OS Kernel [or into DLLs which are loaded into the kernel as the system boots… with pretty much the same run time result].
The second characteristic of the IP Enabled Kernel could be described as “process optimization”. This approach does several things:
- organizes the application (process) logic as close to the OSI Layers as practical
- arranges I/O and data chunks into sizes and patterns which are optimized for encapsulation within IP packets. Network IP Interfaces have a setting called MTU (Maximum Transmission Unit). If a Kernel process is handling some data which might eventually be sent to a network interface, passing that data around in chunks which fit perfectly into the MTU would be a potential optimization.
- prefers and implements IP Protocols for inter- (and sometimes even intra- ) process communications. This is one of the uses for the Loopback Address of 127.0.0.1
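As a minimal sketch of the MTU point in the list above, the following Python splits a buffer into pieces that each fit one packet. The numbers are assumptions chosen for illustration: a 1500 byte Ethernet MTU, and 40 bytes of headers (a 20 byte IPv4 header plus a 20 byte TCP header, both without options). The function name `chunk_for_mtu` is hypothetical.

```python
from typing import List


def chunk_for_mtu(data: bytes, mtu: int = 1500, header_bytes: int = 40) -> List[bytes]:
    """Split data into pieces that each fit in one packet of the given MTU."""
    payload_size = mtu - header_bytes
    if payload_size <= 0:
        raise ValueError("MTU must be larger than the header overhead")
    return [data[i:i + payload_size] for i in range(0, len(data), payload_size)]


# 4000 bytes become two full 1460 byte payloads and one partial payload.
chunks = chunk_for_mtu(b"x" * 4000)
chunk_sizes = [len(c) for c in chunks]
```

A kernel that already moves data around in payload-sized pieces never has to re-split it at the network interface, which is the optimization being described.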
Over time, IP capabilities were made native to more and more OS components, system libraries, and app frameworks.
Today we’ve reached a point where vendors are trying to IP Enable our kitchen appliances.
* Actually some of the vendors tried this in the 90s, but the Utilities ignored them and many of the rest of us laughed at them. Today the vendors are trying it again and people are starting to buy into it. In some cases utilities have deployed smart grid products which unintentionally introduced IP capabilities on what were thought to be private, non-IP networks (both wired and wireless). NERC has begun intervening and requiring stricter technology standards and security procedures for the utility industry.
I believe we’ve already gone too far.
I’m not a security by obscurity fan who wants some mysterious black box kernel setting at the heart of my technology products.
Nor am I some sort of closet luddite who wants to shut down the internet. I like shopping online, using electronic bill pay with automatic bookkeeping and no stamp licking, and digital media.
But I do think it’s time we seriously consider going back to core components which don’t have native Internet capability. Technology has reached the point where the potential workload from using multi-protocol gateway applications no longer presents a performance problem.
Firewalls and anti-malware tools have become de facto system requirements for everything. I.e., we’re already running the workload attempting to monitor IP-to-IP communications. If we stopped allowing every little app, gadget, widget, process, and thread access to every feature of the IP Stack known to man, we could actually reduce the Firewall/Anti-Malware workload on our systems and achieve a higher level of confidence in monitoring being effective.
Memory virtualization and address randomization have evolved to the point where I/O can be optimized while still preventing processes which share data from knowing about each other or interfering with each other.
There’s no reason for an application to have Internet communications capability without expressly asking permission to load and utilize an appropriate framework. At application run time the user / device owner should have the option of denying the application that capability when desired.
Security issues would improve with systems which:
- Move network interfaces and protocols out of the OS Kernel.
- Use a non-kernel process to access network interfaces.
- Use non-IP protocols for inter- and intra- process communications. When a process (even a kernel process) needs network services, require it to request permissions and translation services thru a non-root gateway.
- The entire network communication stack should be moved to a “multi-protocol router and stateful application firewall service” running under a non root account.
- One place to enable/disable communication services.
- One place to monitor communications services.
- But not an all-or-nothing architecture. It should be easy to control which protocols are enabled or disabled. Same with apps. And same with inter-process/service communication.
- These aren’t concepts which require a lot of “start-from-scratch” efforts to realize. The application logic already exists. We created the DEN and CIM specifications back in the late 90s specifically to provide an industry standard way of managing relationships between people, devices, applications, and services.
- In high security environments, this architecture is the required default. It’s usually achieved thru a combination of OS Hardening and 3rd party security products. The hardening process removes unnecessary packages from the system, restricts communications capabilities to specific services, and forces communications to pass thru the 3rd party security product for evaluation.
- Bluetooth devices are for personal area networks. They don’t need a publicly routable IP Enabled network stack.
- Nor do my USB, Firewire, Audio and HDMI interfaces!
- Start applications in a ‘least privilege’ mode and allow the user / device owner to approve activation of features. If the app doesn’t work, or fail gracefully, in least privilege mode, it shouldn’t pass QA. [And the operating system shouldn’t let it run without a user override.]
- The Apple iOS Privacy Settings panel demonstrates a good concept, that could be improved. Important services, features, etc., which have privacy/security concerns should be isolated to specific Libraries and Frameworks. Operating systems should provide users / device owners a mechanism to enable or disable entire frameworks as they choose.
- Organizations with high security requirements have been playing whack-a-mole with Mobile Device Vendors over features like cameras, microphones, location tracking, and more. While some organizations have had small successes getting policy management points built into Mobile Device Manager (MDM) products and Mobile Operating Systems… consumers have been left with little to no idea what their devices are doing or capable of doing.
- New features should be linked to a framework and privacy control mechanism before the feature’s GA release.
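To make the “non-IP protocols for inter- and intra- process communications” item above concrete, here is a small Python sketch using an ordinary OS pipe. No socket is opened and no IP address (not even the loopback address) is involved. For brevity both ends live in one process here; in practice the read end and write end would be held by different processes.

```python
import os

# An anonymous pipe: a kernel-managed byte channel with no network stack involved.
read_end, write_end = os.pipe()

# The "sending" side writes a message and closes its end.
os.write(write_end, b"status: ok")
os.close(write_end)

# The "receiving" side reads until end-of-file.
received = []
while True:
    chunk = os.read(read_end, 1024)
    if not chunk:
        break
    received.append(chunk)
os.close(read_end)

message = b"".join(received)
```

Pipes are only one option; the point is simply that local processes can exchange data without being handed IP capabilities.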
These issues don’t apply just to Smartphones, laptops, and other typical IT products. These issues are just as important for automobiles, appliances, electronic healthcare products, home automation products, industrial robots, the emerging market of home assistive / personal robotics products, and any other newfangled gadgets coming along with abilities to store, process, or communicate information.
Some time ago, a TED talk described a “moral operating system”. The speaker was describing the need for a system of morality for people… but I tend to take things literally, and kept returning to the idea of, “how could we improve computer operating systems to facilitate these ideas?”
The obvious first step has been known for years. Design systems so the default choice is typically the better choice.
Another requirement for this new operating system. It needs to begin with the principle that everything on the disk / storage media belongs to the user or device owner. It’s my information and I have a right to see it when I want to look at it. It’s my information and I have a right to monitor which applications or processes have been accessing or modifying it. And I have a right to restrict which applications or processes can access my information on the disk or storage media.
I’m not daft. I realize DRM isn’t going away anytime soon. And I’m not here to argue over which DRM system, if any, is better than the other. I believe an inherently secure, user-centric operating system can still accommodate a DRM’d service by:
- giving me the choice to delegate control of a storage location and control of an application sub process to the DRM service.
- the delegated storage location could be an external media device I choose to dedicate to the service or, more likely, be an encrypted sparse disk image I choose to allow the service to create (at a file location of my choosing).
- the delegated application “sub process” would likely be some sort of “certificate management” utility which kept the keys to the delegated storage location.
- so long as I permit the “sub process” to run and don’t tamper with it, it would be able to verify its code signature and verify its certificates to provide sufficient assurance to the DRM’d content provider I’m following the terms of our agreement.
- The DRM service should have absolutely no reach or influence within my computer system beyond its application sandbox, its delegated sub process, and its delegated storage location.
- If I wish to stop or delete the service, it should be as simple as exiting or deleting the application. The only negative consequence should be losing the ability to read the contents of the encrypted delegated storage area. Deleting that storage remains my decision, and so does the option of re-installing the App to restore access to the DRM’d media.
In addition to the Virtual Memory Addressing, Memory Address Randomization, and Memory Encryption architectures which have been implemented for computer RAM… I’d also like to see similar architectural changes for how applications are allowed to interact with the file system.
For example, some features might include:
- Restrict sandboxed applications to a virtual file system using encryption and address randomization instead of allowing the application access to any part of the real file system.
- Give the user controls to provide an application with access to “the file system framework” so it can interact with things outside its sandbox. Include some granular choices such as file, directory, or “other app’s data”… with standard file permission options still available also.
- Just as it may be reasonable to expect an application to ask permission to use a NetworkFramework to communicate outside of its sandbox, it should also be reasonable for an application to need permissions for a FileSystemFramework before interacting with data/media outside of its sandbox.
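One small piece of such a virtual file system can be sketched in Python: resolving every path an application asks for and refusing anything that lands outside its sandbox directory. This covers only the confinement check, not the encryption or address randomization suggested above, and the names `SandboxViolation` and `resolve_in_sandbox` are invented for the example.

```python
import os
import tempfile


class SandboxViolation(PermissionError):
    """Raised when a requested path would escape the sandbox root."""


def resolve_in_sandbox(sandbox_root: str, requested: str) -> str:
    """Map an app-supplied path to a real path, or refuse if it escapes."""
    root = os.path.realpath(sandbox_root)
    candidate = os.path.realpath(os.path.join(root, requested))
    try:
        inside = os.path.commonpath([root, candidate]) == root
    except ValueError:
        # For example, paths on different drives on Windows.
        inside = False
    if not inside:
        raise SandboxViolation(f"{requested!r} is outside the sandbox")
    return candidate


with tempfile.TemporaryDirectory() as root:
    # A normal request stays inside the sandbox.
    allowed_path = resolve_in_sandbox(root, "notes/todo.txt")

    # A traversal attempt is refused.
    try:
        resolve_in_sandbox(root, "../../etc/passwd")
        escaped = True
    except SandboxViolation:
        escaped = False
```

An application that needs something outside its sandbox would have to go through the separately permissioned file system framework instead of simply constructing a path.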
Again, to summarize in short form, these few key changes could improve the inherent security of many computer / electronic products:
- Take the network interfaces out of the OS Kernel.
- Take the network protocols out of the OS Kernel.
- Direct all network communications thru a non-root multi-protocol router and stateful packet inspection service.
- Wrap major product features in a system framework and give the user control over whether that framework is accessible on their device, and by which apps/services if they choose to enable.
- Always start things in least privilege mode until the owner approves more access.
- Always start from a place which acknowledges the user’s ownership of information and preserve the user’s ownership and rights.
- Only the owner can choose to delegate control [the operating system and 3rd party applications cannot arbitrarily grant themselves control over the user’s information].
- Only the owner can choose to provide access to information.
- And, only the owner can choose to disclose information.
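The least privilege and owner control items above can be sketched as a default-deny gateway. This is a toy model in Python, not any vendor’s API: the class `CapabilityGateway`, its methods, and the app and framework names are all hypothetical.

```python
class CapabilityGateway:
    """Hypothetical owner-controlled switchboard for system frameworks."""

    def __init__(self):
        # app name -> set of framework names the owner has approved
        self._grants = {}

    def grant(self, app, framework):
        """The owner approves one framework for one app."""
        self._grants.setdefault(app, set()).add(framework)

    def revoke(self, app, framework):
        """The owner withdraws an approval; revoking twice is harmless."""
        self._grants.get(app, set()).discard(framework)

    def is_allowed(self, app, framework):
        """Nothing is allowed unless the owner has explicitly granted it."""
        return framework in self._grants.get(app, set())

    def request(self, app, framework):
        """Called on behalf of an app; denied unless the owner said yes."""
        if not self.is_allowed(app, framework):
            raise PermissionError(f"{app} has no approval for {framework}")
        return f"{framework} handle for {app}"


gateway = CapabilityGateway()
gateway.grant("mail", "NetworkFramework")  # the owner's decision

mail_ok = gateway.is_allowed("mail", "NetworkFramework")
solitaire_ok = gateway.is_allowed("solitaire", "NetworkFramework")

gateway.revoke("mail", "NetworkFramework")
mail_after_revoke = gateway.is_allowed("mail", "NetworkFramework")
```

Only `grant` and `revoke` change what is allowed, and in this model they belong to the owner. An app can call `request`, but it cannot give itself access.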
In real, day to day terms… these architectural changes would not require large shifts in the way most developers and engineers go about building their products. Very few software engineers actually write protocol stacks, kernels, or system frameworks. For everyone else writing software, the difference between including a framework in your application vs “getting it for free” from the operating system can be as simple as a checkbox or a “# Include”.
The biggest effort, and most important work, is for the kernel and framework developers to adopt architectures which default to inherently safe security configurations and give users control over whether frameworks/features are enabled.
The Linux and Unix communities already have secure OS implementations which achieve some of these goals. Apple, Oracle, Red Hat, and Novell all share some responsibilities for completing the architecture and making it standard in their products. Microsoft probably has the most baggage to overcome.
Many others in the IT Industry also share responsibilities in making these sort of changes. Nokia, Siemens, Samsung, Blackberry, Google/Motorola, HP, IBM, and Cisco all need to step up.
Some, such as Symantec, stand to lose some market share if OS Vendors finally step up and fulfill their responsibilities.
Intel, AMD, Motorola, Qualcomm, and TI all have a stake in this as well. Intel is easily in the leadership position right now, since their acquisition of McAfee was explained as being done for the express purpose of introducing more security capabilities directly into the CPU and reducing the need for complex 3rd party products to be loaded by the customer after the system purchase.
Listing the CPU manufacturers brings me to my final point for the security architecture recommendations. To some extent, this one is mostly on the CPU makers, but coordinating with the OS makers will help.
Enough with the kitchen sink “system on chip” approach. Yeah, it’s a great idea. But overdoing it is like combining a Super-Walmart, a Cabela’s, a college dorm complex, a hospital, and a super-max prison all into a Seven Eleven. Who decided the most trusted processes and the least trusted processes should run on the same chip? The $99 smartphone of today provides as much or more processing capability as $200,000 systems available at the time many of these architectural decisions were made. Inertia has kept us on course. It’s time to reconsider old design decisions.
After decades of watching more capabilities be combined into a single chip, I’m no longer convinced it’s the great idea it started out to be. Keep making things more energy efficient, smaller, lighter, and faster. But consider backing off the physical co-mingling of chip capabilities. Consider fencing some of these untrusted communication services off to a component chip and working with the Secure OS makers to build good gateway processes and frameworks for controlling the flow of data.
As for multi-core CPUs… in consumer devices, I still haven’t seen many examples of workloads (applications) which can properly utilize four or more cores… and I’ve seen even fewer examples of consumer workloads which actually need to do so for more than a fraction of a second. Very few consumers run video rendering processes, and even fewer run multiple virtual machines in a continuous build development process.
On the other hand, I believe one currently underdeveloped chip feature could provide immediate benefits to consumer and business markets: combined secure I/O and secure storage. In our laptops and smartphones, and other similar devices, our media libraries (video, audio, photos) have grown large, but much of our critical personal information fits within just a few megabytes, up to a couple GB for those who have been paperless longer. CPU and OS makers should look at implementing physical and logical pathways dedicated to providing the user with a secure data vault. It could utilize any number of different implementation strategies: some Flash on the motherboard, a CPU pathway available to a specific USB/MicroSD slot or to an optional region/address within an SSD, or something we haven’t even thought of yet. Whatever it ends up being, just make sure it provides the user with a means of vaulting relatively small amounts of critical data away from their primary storage disk, in such a way that an app granted general file system access would still be physically and logically isolated from the vault. The best explanation of my interest here may be Keychain on steroids… put the data at a location physically separate from the regular storage disk, use a different file system, different encryption protocol and key, implement a coordinated CPU and OS architecture which requires all access to the vault be shunted thru specific/dedicated frameworks and gateway services (i.e., not directly accessible to regular OS and App processes).
Personally, I have three applications which I use in a “data vault” fashion and would benefit from this architecture. I only run them occasionally, when I need them. They already have extra security controls invoked with launching the apps. Placing the app data into a secure vault protected by physical and logical separations including a security framework and control gateway would both improve and simplify the scenario I currently use. There is a market for this kind of functionality. If there weren’t, products like 1Password and RSA SecurID would not exist.
In product areas outside traditional tech markets, vendors are already running into challenges to the System-On-a-Chip (SOC) trend. NERC recently began requiring smart-grid device makers producing residential smart meters to separate functionality into at least three physically separate portions within the device. One trusted portion of the device is required to undergo extensive certification testing (every version of hardware and every version of software, even minor updates). This portion would be allowed to communicate with utility grid control systems. A second, less trusted, and optional, portion could be implemented for local maintenance access (local/downstream only, no grid/upstream access). A third, and mostly untrusted, portion is provided for consumer facing services which have the high risk issues of being available to the consumer’s home network and also getting frequent software updates as consumer features are continuously developed. Overuse of SOC architectures compromised the entire smart grid. This mandatory chip/feature segregation is critical to the utility industry and provides benefits none of the other proposed architectures can match.
Automotive, aviation, and many other industry segments have similar requirements for architectural physical separation of features. We need to recognize the value of dis-integration in consumer products as well.
It’s almost ironic that much of the IT industry has been loudly espousing the benefits of loose coupling and dis-integration for a decade or more. Yet most of the industry overlooked the increasingly tight coupling between Operating Systems and Network Stacks.
Adopting a new operating system always involves a learning curve. But I look forward to learning a new modern and inherently secure OS that doesn’t have built in IP support.
And if anyone expects to sell me a self driving vehicle or a personal robot (next year or 20 years from now), consider this early notice of my #1 priority. An inherently secure design.
In fact, for cars and robots… the world would probably be better off if the communications capabilities were moved to physically separate chips, with the I/O pathways between the CPU and CommChips controlled by physical switches or keys. Turning off the switch or removing a key should permit the device to otherwise operate normally… just prevent it from getting new instructions from the neighbor kid while we’re sleeping or gone fishing.