Data residency made easy

ioFABRIC offers a solution to complex storage problems

Where does your data live? For most organizations, data locality is as simple as making sure that the data is close to where it is being used. For some, data locality must also factor in geographical location and/or legal jurisdiction. Regardless of the scope, data locality is a real-world concern for an increasing number of organizations.

If this is the first time you are encountering the term data locality, you are likely asking yourself, quite reasonably, “do I care?” and “should I care?” For a lot of smaller organizations that don’t use public cloud computing or storage, the answer is likely no.

Smaller organizations using on-premises IT likely aren’t using storage at scale or doing anything particularly fancy. By default, the storage small IT shops use lives with the workload that uses it. So far so good… except the number of organizations that fall into this category is shrinking rapidly.

It is increasingly hard to find someone not using public cloud computing to some extent. Individuals as well as organizations use it for everything from hosted email to Dropbox-like storage to line-of-business applications and more. For the individual, cost and convenience are usually the only considerations. Things get murkier for organizations.

The murky legal stuff

Whether you run a business, government department, not-for-profit, or other type of organization, at some point you take in customer data. This data can be as simple as name, phone number, and address. It could also be confidential information shared between business partners, credit card information, or other sensitive data.

Most organizations in most jurisdictions have a legal duty of care towards customers that requires their data be protected. Some jurisdictions are laxer than others, and some are more stringent with their requirements based on the type of data involved.

Of increasing concern is where that data lives. If it is on your own network, is it adequately defended? Can you prove that? If you chose a public cloud provider, can you prove that provider is adequately secure? Can you prove the government to which they are beholden respects data laws in your jurisdiction, or has equivalent laws? Are you allowed to transfer that data to a third party? Can you transfer that data to a third party in any country, or only some?

For some data types it is possible to handle these issues on a case-by-case basis. For example, if your business workflows are primarily driven by e-mail, and you are worried about the legalities of hosted e-mail, you can choose a provider in your jurisdiction that meets relevant data protection certifications.

When we start talking about Dropbox-like storage, public cloud workloads, online backups, disaster recovery, long-term archival storage and so forth, however, this can become monstrously difficult to manage.

Part of the problem is the limited number of suppliers. Part of the problem is that you are rarely dealing with just one application: you’re frequently dealing with copies of data, varying levels of encryption, and an ever-increasing number of departments and users wanting to use storage of various types for various purposes.

What’s needed is a storage solution that provides systems administrators with a single view of all of an organization’s data under management. One that lets the systems administrators decide where data lives based on various metrics ranging from performance to regulatory concerns.

Ideally, this solution would make adding new storage simple. Just buy new storage and put it in your datacenter. Alternatively, increase the amount of storage available on the various public storage accounts you control. The easier it is to provision and manage storage, the less likely it is that people will work around the system, and thus the easier it is to keep all the data where it legally should be.

Performance

In addition to the legal considerations, data locality has a role to play in performance. If you read the tech press you’d be excused for thinking that storage is free and/or that everyone has unlimited budgets for everything from all-flash arrays to public cloud instances.

In the real world, storage is still a limited resource. Organizations of all sizes have a mix of storage offerings, all with different features, ages, remaining time under warranty and performance. Making sure the right data is on the right storage can improve performance of critical workloads as well as lower overall storage costs by making sure cold or archival data is on the lowest cost storage available.

An obvious example of how data locality matters is how an all-flash array might be (mis)used. All-flash arrays are the highest performing and most expensive class of storage likely to be employed by an organization today. They’re great for mission-critical applications such as heavily used databases.

All-flash arrays are not, however, a good candidate to hold backups or archival storage. Given the cost, you probably don’t want to use them to store data that is accessed infrequently and by only a few users. You might instead push that data up to a public cloud provider. Public cloud storage is much slower, but it can be much less costly than your in-house all-flash tier.
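The trade-off described above can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the tier names, prices, latencies, and access thresholds below are all made-up assumptions. The idea is to work out how much latency a workload can tolerate from how often its data is read, then pick the cheapest tier that still meets that requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_gb_month: float  # hypothetical price, for illustration only
    latency_ms: float         # hypothetical typical access latency


# Assumed example tiers; real numbers vary widely between vendors.
TIERS = [
    Tier("all-flash", 0.50, 0.5),
    Tier("hybrid-disk", 0.10, 8.0),
    Tier("public-cloud-archive", 0.01, 200.0),
]


def latency_budget_ms(reads_per_day: float) -> float:
    """Map access frequency to a tolerable latency (thresholds are assumptions)."""
    if reads_per_day >= 1000:
        return 1.0     # hot data: needs the fastest storage
    if reads_per_day >= 10:
        return 20.0    # warm data: disk is good enough
    return 1000.0      # cold or archival data: almost anything will do


def choose_tier(reads_per_day: float, tiers=TIERS) -> Tier:
    """Return the cheapest tier that is fast enough for this access pattern."""
    budget = latency_budget_ms(reads_per_day)
    fast_enough = [t for t in tiers if t.latency_ms <= budget]
    if not fast_enough:
        raise ValueError("no tier meets the latency requirement")
    return min(fast_enough, key=lambda t: t.cost_per_gb_month)


if __name__ == "__main__":
    for label, reads in [("busy database", 50_000), ("file share", 100), ("old backups", 0.1)]:
        print(label, "->", choose_tier(reads).name)
```

With these assumed numbers, a heavily used database stays on flash while rarely touched backups drift to the cheapest tier, which is the behavior the paragraph above argues for.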

Next generation storage

Managing what data goes where can be a full-time job for an organization as small as 25 or 50 users. Imagine how difficult this can be for organizations with 100,000 users and tens of thousands of servers! At some point, the scale of it is just too large for any one person, or even a team of people, to manage.

Fortunately, a new generation of storage solutions is emerging. Among them, ioFABRIC.

ioFABRIC solves data locality issues automatically. Not only can administrators choose where data lives, they can define policies regarding data locality and let ioFABRIC’s algorithms place the data.

ioFABRIC continuously monitors data access patterns for various workloads, moving data to the best storage based on policy and performance requirements. It can be configured to ensure some workloads never use certain types of storage: for example, do not put backups of the medical database in the public cloud. Similarly, critical workloads can be “pinned” to a given storage tier to ensure performance never suffers.
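To make the idea of placement policies concrete, here is a minimal Python sketch of the two rules mentioned above: excluding a class of storage for a workload, and pinning a workload to a specific store. It is not ioFABRIC's API or algorithm. Every name in it (Policy, Store, place, the store names and prices) is hypothetical, and it simply picks the cheapest store the policy allows.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Store:
    name: str
    location: str             # e.g. "on-prem" or "public-cloud" (assumed labels)
    cost_per_gb_month: float  # hypothetical price, for illustration only


@dataclass
class Policy:
    # Locations this workload's data must never be placed in.
    forbidden_locations: set = field(default_factory=set)
    # If set, the workload is pinned to this store and nothing else.
    pinned_to: Optional[str] = None


def place(policy: Policy, stores: list) -> Store:
    """Pick the cheapest store that the policy permits."""
    if policy.pinned_to is not None:
        allowed = [s for s in stores if s.name == policy.pinned_to]
    else:
        allowed = [s for s in stores if s.location not in policy.forbidden_locations]
    if not allowed:
        raise LookupError("no store satisfies the policy")
    return min(allowed, key=lambda s: s.cost_per_gb_month)


# Assumed example environment.
STORES = [
    Store("flash-array", "on-prem", 0.50),
    Store("nas", "on-prem", 0.10),
    Store("cloud-bucket", "public-cloud", 0.01),
]

# "Do not put backups of the medical database in the public cloud."
medical_backups = Policy(forbidden_locations={"public-cloud"})
# A critical workload pinned to the fastest tier.
critical_db = Policy(pinned_to="flash-array")

if __name__ == "__main__":
    print(place(medical_backups, STORES).name)
    print(place(critical_db, STORES).name)
    print(place(Policy(), STORES).name)
```

A real system would also weigh performance and re-evaluate placement as access patterns change; this sketch only shows how exclusion and pinning rules narrow the set of candidate stores before any such optimization runs.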

Whether the scope of your data locality considerations is a few meters or a few continents, ioFABRIC provides a solution.

About Trevor Pott

Trevor Pott is a full-time nerd from Edmonton, Alberta, Canada. He splits his time between systems administration, technology writing, and consulting. As a consultant he helps Silicon Valley start-ups better understand systems administrators and how to sell to them.
