Data where?

@gikii and #gikii2013 on twitter
Attending GikII gave me a great opportunity to talk to people working at the junction of law and technology, who care deeply about personal privacy and are alarmed by the sea of tech washing over the population. I’d been laying out my strategy for dealing with the current tech and our research plans in this space, and was encouraged to get it written down in an easy-to-read version, i.e. not our research publications! So here goes...

The desire to access information anywhere has been leading to an increasing centralization of services in the cloud, so that one can have access to email, files, contacts, etc. from anywhere - I'll refer to this as "data in the cloud". Following closely on this has been a series of applications, either built-in (MacOS Mail, Android Mail), free (Dropbox, SkyDrive) or purchased (Outlook), that synchronize contents between the cloud servers and mobile devices and computers in the background - "data sync with cloud". This is done so that when the user interacts with the application, both the application and its data are local, which makes it more responsive and able to operate even when disconnected. One logical and privacy-enhancing conclusion to this trend is to arrange for the devices to synchronize information directly with each other and forget about maintaining a copy in the cloud - "data on my devices".
These "data on my devices" services are already emerging for file sharing - I currently run seven file-sharing applications, which fall into two distinct categories.
Files-in-the-cloud services include Dropbox, SkyDrive, Google Drive, Memopal and SpiderOak. The first three all maintain an unencrypted copy of my files in the cloud, while the latter two assert that they store encrypted copies of files. Your level of PRISM-related paranoia will dictate whether you trust the encryption of the latter two, but for the big three you need to trust the provider to maintain confidentiality. Hence, I use these services for my research talks, publications and random other storage uses where the information is not private or personal; the consequences of a breach of confidentiality for this information are nothing more than a minor irritation, such as someone seeing a work-in-progress paper or a half-baked presentation.
For private and personal information, including any data relating to other people, I use services that synchronize files across my devices without the provider maintaining a copy in the cloud or ever seeing the contents; examples include BitTorrent Sync and AeroFS. In these examples the cloud service merely provides the means for devices to find each other, and possibly provides an encrypted forwarding service if they cannot communicate directly (e.g. your iPad in the Internet cafe talking to your home computer behind your home router).
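To make the division of labour concrete, here is a minimal Python sketch of that "find each other, else relay" decision. It is an illustration only and not how BitTorrent Sync or AeroFS are actually implemented: the `Rendezvous` class, its methods and the addresses are all hypothetical, and the encryption of the relayed traffic is assumed rather than shown.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Rendezvous:
    """Hypothetical lookup service: it knows where devices are, never what they store."""

    # device name -> (network address, whether peers can reach it directly)
    devices: Dict[str, Tuple[str, bool]] = field(default_factory=dict)

    def register(self, name: str, address: str, reachable: bool) -> None:
        """A device announces its current address and reachability."""
        self.devices[name] = (address, reachable)

    def route(self, target: str) -> str:
        """Return a direct path if one exists, otherwise fall back to the relay."""
        address, reachable = self.devices[target]
        if reachable:
            return f"direct:{address}"
        # Assumed: anything sent via the relay is already encrypted end to end,
        # so the relay only ever forwards ciphertext.
        return f"relay:{target}"


rendezvous = Rendezvous()
rendezvous.register("laptop", "198.51.100.2:4000", reachable=True)
rendezvous.register("home-pc", "203.0.113.7:4000", reachable=False)  # behind the home router

print(rendezvous.route("laptop"))
print(rendezvous.route("home-pc"))
```

The point of the sketch is what the service does not hold: there is no file content anywhere in `Rendezvous`, only addresses.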
The impending serious concern is the simultaneous arrival of the “Internet of Things” and “personal data stores”: the scope for dangerous privacy breaches if these services are all in the cloud is significant. I have a simple take on this: don’t put the data in the cloud, synchronize it across your devices and run applications locally. I mean, it's not like we don't know how to do it and still make it easy for the user.
To this end we have been developing Nymote as a general solution for secure data synchronization across computing systems, one use of which is to securely store and share private information across my personal devices. Nymote is composed of three elements: Signpost, Irminsule and Mirage. Signpost provides the cloud service that allows your devices to find each other and establish secure communication paths. Irminsule provides the distributed data store, which moves beyond files to provide a robust database that tolerates simultaneous, conflicting updates to data items on different devices and gives applications hooks to resolve them - simply a more useful building block than files. Mirage is the underlying runtime environment that, in its most secure instantiation, runs within its own virtual machine.
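For a feel of what "conflicting updates with application hooks for their resolution" means, here is a small Python sketch of a three-way merge over a key-value store. It is not Irminsule's API (Irminsule is a separate system, and every name below is made up for illustration); it only shows the general idea of merging two devices' copies against a common ancestor and calling back into the application when both sides changed the same item.

```python
from typing import Callable, Dict, Optional

Store = Dict[str, str]
# Hook signature (hypothetical): key, ancestor value, my value, their value -> merged value
Resolver = Callable[[str, Optional[str], Optional[str], Optional[str]], Optional[str]]


def three_way_merge(base: Store, mine: Store, theirs: Store, resolve: Resolver) -> Store:
    """Merge two divergent copies of a store against their common ancestor."""
    merged: Store = {}
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:
            value = m  # both sides agree (including both having deleted the key)
        elif m == b:
            value = t  # only the other device changed it
        elif t == b:
            value = m  # only this device changed it
        else:
            value = resolve(key, b, m, t)  # genuine conflict: ask the application
        if value is not None:  # None means the key is deleted
            merged[key] = value
    return merged


def union_lines(key, base, mine, theirs):
    """Example hook: keep every distinct line from both sides, sorted."""
    lines = set((mine or "").splitlines()) | set((theirs or "").splitlines())
    return "\n".join(sorted(lines))


base = {"contacts": "alice", "note": "draft"}
phone = {"contacts": "alice\nbob", "note": "draft"}       # added bob
laptop = {"contacts": "alice\ncarol", "note": "final"}    # added carol, edited note

merged = three_way_merge(base, phone, laptop, union_lines)
print(merged)
```

Here the note merges automatically because only the laptop touched it, while the contact list is a real conflict and goes to the hook, which for a contact list can sensibly keep both additions. A plain file-sync tool would typically leave you with two "conflicted copy" files instead.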
That’s the tech side, aiming to build in “privacy by design”; it still needs underpinning legislation that provides consumer protection in digital services rather than relying on informed consent (blogs passim), especially if we are going to roll it out as a legal obligation on companies. So, over to you, JR...

Written on September 17, 2013