Archive for the ‘Data mining’ Category

9 Feb 2011

‘Fess up – where does my data go?


There truly is an app for everything.

Recently, the digital world has been aflutter with news of the first-ever app approved by the Catholic Church – Confession, an app that helps Catholics prepare for the sacrament of confession by guiding the user through “a personalized examination of conscience”:

“To help those that are feeling guilty ready themselves for the sacrament of confession, the app provides a checklist of the Ten Commandments — along with mini-questions based on each — to help in compiling an inventory of malfeasance. The app even lets one add in non-traditional transgressions not already listed.”

One of the selling points of the app appears to be the password-protection feature, enabling you to lock out anyone who may try to find out about your sinnin’ ways. But what seems to be missing is any explanation of what Little iApps, the developer of Confession, will do with the data it collects. According to reports, the app also asks users to provide information on their age, sex and marital status. Paired with detailed information on the user’s transgressions, that’s a potentially detailed profile that would be quite attractive to marketers and others.

Details on the collection and use of the user-provided data weren’t available on Little iApps’ site…so if the developer is collecting and using information without the user knowing, does that mean they’ve broken one of the commandments themselves – “Thou shalt not steal”?


4 Nov 2010

Big Broker is Watching You


Have you ever looked yourself up online and wondered how companies you have never heard of know your name? Where did they get your information and what are they using it for?

Data brokers were the subject of a recent article in The Wall Street Journal’s “What They Know” series on online privacy. These companies collect personal information from various sources such as public registries, telephone listings, product registration cards, and, notably, online social network sites and blogs. The information is then compiled into profiles and sold – to marketing companies, to individuals through “people finder” websites, to fundraisers.

In the past, data brokers relied primarily on organizations like retailers, subscription services, payment processing companies and charitable organizations to share their customer information by renting or selling customer lists. The lists would typically include names and contact information together with attributes like income, age group, number of children, purchase history, and interests. Data brokers would combine these lists with information from other sources, like survey and census data, to generate specialized lists they would then sell to marketers.

Now that we spend more and more time in a digital world, data brokers are able to tap into a wealth of rich new data sources. Vast amounts of customer information are being stored online. Public records, like court decisions, no longer live in musty filing cabinets but can be accessed by anyone anywhere with the click of a mouse. Our actions online – where we go, what we do – are tracked and recorded.

And then there is the information that we voluntarily share on social networking sites, blogs, chats and other social media services. Data brokers say this information has been made public and therefore should be free for the taking. Privacy advocates argue that combining and repurposing data exposes individuals in ways they may not have anticipated or consented to. Why do Canadians choose to reveal personal information online? Privacy by obscurity is a theme we at the OPC often hear when we ask that question. “I’m not a celebrity so why would anyone be interested in what I post?” Data brokers are interested because collecting and selling your information is how they generate revenue.

The implications of having so much of our personal information out in the ether are only beginning to be understood. Industry practices are not transparent and the average person knows little about companies that routinely harvest online information.  We have seen reports of information gathered from blogs and chats being used to help determine creditworthiness. But would we even know if that happened to us? And would we know how to stop it?

Many of these companies are based in the U.S., where consumers have to navigate a patchwork of laws and regulations that for some will mean being able to opt out of tracking or having information removed from brokers’ databases.  Here in Canada, we are more fortunate.  Canadian private sector privacy law gives individuals the right to see what information data brokers have about them, ask them to correct inaccurate information, and request that the information be deleted.  The only exception is contact information that appears in a public directory such as a telephone book.  American companies which operate in Canada are also subject to our privacy law. If, after contacting a data broker, you are not satisfied with the response, you can file a complaint with us.

You can also try to limit what information you willingly give away.  Be more selective when posting online, restrict your privacy settings, and think twice about filling out that customer satisfaction survey or warranty card. Stay vigilant and let us know if you come across practices that cause you concern.


12 Aug 2010

Badges? Badges? We don’t need no stinkin’ badges!


Loyalty discounts, the power of recommendations, serendipitous encounters with friends and colleagues, recognition badges, and stalkers. I think that’s a fair summary of most commentary about the growth of location-enabled services and tools.

Location is just one piece of information that can be generated by most smart phones, but it is the most relevant for a marketer eager to deliver precise and context-specific messages to a consumer on the move. It is also a highly useful data point for a social scientist trying to measure the flow of human migration and socioeconomic progress, as in the case of Nathan Eagle’s research in the slums of Kibera, Nairobi, Kenya.

Between June 2008 and June 2009, Eagle and his co-researcher evaluated the calls recorded by mobile phones across Kenya (with all callers’ identification replaced with unique hashed IDs) to focus on calls originating or ending in Kibera. Their research tracked between 53,000 and 74,000 calls a month and a total of 18,000 individual callers during the year.
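To make "unique hashed IDs" concrete, here is a minimal Python sketch of one common way to pseudonymize call records. It is an assumption for illustration only: the researchers' actual hashing method is not described here, and the function name `pseudonymize`, the key and the phone numbers are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(caller_id: str, secret_key: bytes) -> str:
    """Return a stable keyed hash (HMAC-SHA256) that stands in for a phone number."""
    digest = hmac.new(secret_key, caller_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and phone numbers, for illustration only.
KEY = b"example-secret-held-by-the-data-custodian"
records = [
    ("+254700000001", "+254700000002"),
    ("+254700000001", "+254700000003"),
]

# Replace both parties in every call record with their hashed IDs.
hashed_records = [(pseudonymize(a, KEY), pseudonymize(b, KEY)) for a, b in records]
```

Note that the same caller always maps to the same hashed ID. That is what lets researchers follow one person's calls over a year, and it is also why hashing alone does not make the data anonymous.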

What did this data reveal about individual mobile phone users? “With each call, we can infer a number of individual characters such as

  • spatial data (by the location of the cell tower that transmitted the call),
  • economic data (the average length of each call, the amount of pre-paid minutes an individual has put on their phone, the type of phone),
  • an individual’s regional or tribal affiliation, and
  • a radius of migration for groups of individuals (by the distance between locations of cell towers calls have been made from).”
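The last item in that list can be sketched in a few lines of Python. This is a simplified stand-in, not the paper's method: it assumes each call carries the latitude and longitude of its cell tower, and it uses the largest distance between any two towers as a rough proxy for how far a caller moves. The function names and coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def migration_span_km(towers):
    """Largest distance between any two towers a caller has used (0.0 if fewer than two)."""
    return max(
        (haversine_km(*a, *b) for i, a in enumerate(towers) for b in towers[i + 1:]),
        default=0.0,
    )


# Illustrative coordinates (roughly Nairobi and Mombasa), not real tower locations.
towers_used = [(-1.2921, 36.8219), (-1.3000, 36.7800), (-4.0435, 39.6682)]
span = migration_span_km(towers_used)
```

Even this crude calculation needs nothing beyond tower locations, which every call generates automatically.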

A first indication from this research is that Kenyans only live in the Kibera slum for a mean of 1.559 months. This high rate of movement and population turnover “supports the theory that slums act as a filter as opposed to a sink where there is a large amount of flux within the slum population.”

Amy Wesolowski, Nathan Eagle, Parameterizing the Dynamics of Slums

Eagle’s work in Kenya is an extension of a research project originally conducted at MIT, where 100 students were provided with mobile phones for 265 days. The mobile phones were equipped with custom survey software that recorded data and prompted the students with questions when certain conditions were met.

How much data?

“From the studies, we gathered 370 megabytes of raw data, including short recordings from 667 calls, 56,000 movements, 10,000 activations of the phone, 560,000 interaction events with our applications, 29,000 records of nearby devices, and 5,000 instant messages.”

Thankfully, from a privacy advocate’s point of view, the researchers also had to struggle with (a limited number of) weak points in their data sets – instances when the participants didn’t bring their phone with them, consciously turned the phone off, or simply ignored it. I would like to think that some of this reflected a conscious effort to mediate information collection, but it was probably just fatigue or forgetfulness.

There was one significant distinction between the two projects: the active involvement and acknowledgement of the participants. In Cambridge, the participating students were walked through the information collection process, provided with details about the information that would be collected, and required to complete a consent form (.pdf).

M. Raento, A. Oulasvirta, N. Eagle, “Smartphones: An Emerging Tool for Social Scientists”, Sociological Methods & Research 37:3, 426-454.

This is an important point when it comes to the collection of location data, especially when it is associated with other personal information: individuals want to know what is happening with their information, and would like some element of control over its use.

A recent and exhaustive examination of the 89 then-available location-sharing services (really, who can keep track?) by researchers from Carnegie Mellon University noted that “the willingness to share one’s location and the level of detail shared depends highly on who is requesting this information (or knowing who is requesting this information), and the social context of the request.”

Supplemental interviews confirmed that potential users had particular scenarios in mind when evaluating the benefits and risks of these services: scenarios that would best be addressed with more detailed privacy controls, rules and conditions (explained in detail in the paper):

  • Blacklists
  • Friends Only rules
  • Granularity of controls
  • Group-based rules
  • Invisible status
  • Location-based rules
  • Network permissions
  • Per request permission
  • Time-based rules
  • Time-expiring approval, or
  • No restrictions

Janice Y. Tsai, Patrick Gage Kelley, Lorrie Faith Cranor, Norman Sadeh, Location-Sharing Technologies: Privacy Risks and Controls
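A few of the controls listed above (friends-only rules, blacklists, invisible status and time-based rules) can be combined into a single check. The Python sketch below is a hypothetical illustration, not a design taken from the paper or from any real service; the class and field names are invented.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class SharingPolicy:
    """Hypothetical per-user rules deciding who may see a location, and when."""
    friends: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)
    invisible: bool = False
    share_from: time = time(0, 0)
    share_until: time = time(23, 59, 59)

    def allows(self, requester: str, at: time) -> bool:
        # Invisible status and blacklists override everything else.
        if self.invisible or requester in self.blacklist:
            return False
        # Friends-only rule: unknown requesters are refused.
        if requester not in self.friends:
            return False
        # Time-based rule: share only inside the chosen window.
        return self.share_from <= at <= self.share_until


policy = SharingPolicy(
    friends={"alice", "bob"},
    blacklist={"bob"},
    share_from=time(9, 0),
    share_until=time(17, 0),
)
```

With this policy, `policy.allows("alice", time(12, 0))` is true, while a request from the blacklisted "bob", from a stranger, or from "alice" at 8 p.m. is refused. The check defaults to refusing, so location is shared only when every rule agrees.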

Obviously, there are significant gaps in how personal privacy is protected when information is collected and analyzed in a large-scale research project, in a smaller experiment, and within the context of online commercial services.


21 Jul 2010

Location, location, location


Do you know how your location information is used? A recent survey commissioned by the security company Webroot asked 1,645 social network users in the U.S. and UK who own location-enabled mobile devices about their use of location-based tools and services. The survey found that 39 percent of respondents reported using geo-location on their mobile devices and more than half (55 percent) of those users are worried about their loss of privacy.

A few notable concerns over security and privacy: 49 percent of women (versus 32 percent of men) were highly concerned about letting a would-be stalker know where they are, and nearly half (45 percent) were very concerned about letting potential burglars know when they’re away from home (a very real risk outlined nicely by Pleaserobme.com).

The growing popularity of geo-location tools and services (including offerings by industry giants such as Twitter, Apple, Facebook and Google) means that location information is being collected on a colossal scale and the real and potential uses for this information are just starting to work themselves out – from iPhone photos tagged with GPS coordinates to location-based gaming platforms such as Scvngr that enable mobile users to create their own location-based games.

This increase in the collection and use of location information can also pose unique risks for users. The survey summary notes that a surprising number of respondents engaged in behaviors, such as sharing location information with people other than friends, that could put them and their private information at risk. A blogger recently wrote about her experience with location sharing gone wrong, and Foursquare was recently blasted for unintentional data leakage via its popular location-based service.

As we note in our recent submission to Industry Canada’s Digital Economy Consultation, good privacy practices can support innovation by reinforcing users’ confidence that they have the right to control their personal information and that the technology they use is secure. With location information, the usual privacy concerns abound, and they grow with each cool new service that hits the market. How to communicate these risks to consumers is something that occupies a great deal of our time. Dealing with the privacy concerns of location information during the design phase for new services would help businesses avoid expensive (both financial and reputational) after-the-fact privacy fixes and might even provide those privacy-friendly businesses with a significant competitive advantage.


31 May 2010

2010 Consumer Privacy Consultations – Montreal is all a-twitter!


Over the course of the year, the Office of the Privacy Commissioner of Canada is hosting consultations with Canadians on issues that pose a serious challenge to privacy. In an attempt to learn more about the privacy implications of new industries, the focus of the consultations has been on online tracking, profiling and targeting of consumers, and the increasing prevalence of cloud computing.

Following the first such consultation in Toronto, a second event was held in Montreal on May 19th, 2010. The event was a resounding success, due in part to the fact that the panels had a lively audience both online and offline.

Did you miss the event? You can still watch the webcast here, and you can check out what was happening on Twitter for each panel below.

Panel 1: Frontiers of Consumer Information Datamining and Analytics

Panel 2: Online Identity and Reputation

Panel 3: Online marketing methods: gaming, advertising, applications and social networks


27 Apr 2010

Meet Louise.


Meet Louise.

Louise is a central character in our upcoming Consumer Privacy Consultations – not because of her great hair, but because she’s engaged online the way many Canadians are…she buys clothing and books online, she updates her Facebook profile regularly, she’s got an iPhone.

She’s also our fictional case study for examining how our data travels as we engage with the online world – who’s got our data? What are they doing with it?

Below is just one of several scenarios we’ve developed to help ground our conversations during the consultation process. This one will be used during the Advertising panel this week in Toronto. As you read it, ask yourself:

Is Louise aware of how her information may be used when she searches for and buys materials at online bookstores?

How accurate is the advertising profile developed for Louise, given that she shares the computer with other members of her family including her nine-year-old brother?

How could Louise’s profile information be matched with publicly available information to draw inferences about her? What types of decisions are or could be made based on her profile information?  What are the risks of combining online and offline profiles? Or the risks involved in combining different online profiles, like Louise’s Facebook profile with the profile her favourite online bookstore has of her?

Louise is a stylish 21-year-old college student who likes to meet people and try new things. She is active online and does everything from buying trendy clothing and concert tickets to keeping in touch with friends by posting updates and photos to her Facebook page. Now in her final year of college, Louise is starting to look for a job. She is putting herself through school by making jewellery and selling it online. She is also a collector of specialty comic books and belongs to an international network of comic book enthusiasts. Louise also has a younger brother, David, who is nine years old.

Louise bought some designer jeans at a store in her local mall with her credit card. She also had the clerk swipe her loyalty card.

When Louise arrived home, she signed into her new account at the store’s web site to learn more about the clothes she had carried into the changing room but not bought. In her excitement to see the store’s merchandise, she clicked through the site’s lengthy privacy policy.

In looking on the store’s web site for a blouse to go with her new jeans, Louise saw an advertisement for jewellery that really appealed to her, so she followed it. Louise felt comfortable at the small Canadian jewellery site because the style of the site was as though she were visiting a friend’s page.

She also liked the styles of jewellery on the site so she bought a necklace and clicked on the “Like” button to update her friends on her latest purchase. From there, she left the store site and searched for the listing of a concert and bought 2 tickets. After that, she checked the status of the online auction she was participating in to get a new specialty comic book.

After this, Louise updated her Facebook page to let her friends know about her purchases and to see who else would be attending the concert. From Facebook, she checked out her favourite online bookstore where she purchased a book that was recommended to her by another comic book expert.

We’re hoping to generate some discussion around Louise’s activities – join the discussion by commenting on our blog, or jumping into the Twitter-stream on Thursday (hashtag #priv2010). We also invite you to check out the live webcast.


26 Mar 2010

Locational services and cool data visualizations


Earlier this month, a rich subset of social media users and technology evangelists descended upon Austin, Texas for the annual SxSW interactive conference. Some see SxSW (South by SouthWest) as an early indicator of developing technology trends. Twitter, the popular microblogging service, broke out as a popular consumer application at the conference two years ago.

This year, the dominant trend seems to be locational services. The video embedded below was produced by a company called SimpleGeo: it uses a data visualization tool to demonstrate how attendees, performers and regular old Austinites were using various consumer locational services during the conference.

Obviously, there are many people who find these services useful, either to meet up with friends, create the opportunity to meet new friends, or simply brag about getting into the most exclusive parties and shows.

As an Office, we are interested in how information from these locational services might be integrated into larger efforts to collect and aggregate data about consumers’ behaviour and preferences.

We also like really cool data visualizations.


11 Feb 2010

Love is in the air, and on the Net…


It’s that time of year again: greeting card stores are decked out in pink, red and white, candy hearts are on sale at the end of every grocery store aisle, roses fly off the shelves by the dozen, and Cupid is a-hunting. Jewellery stores proclaim the only way to “show your love for her” is with a diamond, and teddy bears holding hearts and flowers have taken over gift shops.

For those without a “significant other”, the pressure might be on to get out there and find one, especially at this time of year. This is why many people turn to social networking or online dating sites to find potential love interests. It can be easy to “stalk” or “creep” someone’s profile page, their pictures, and what their friends, colleagues, or coworkers have been saying about them. While these public profiles can provide conversation starters (“So I saw you liked Led Zeppelin/Hootie and the Blowfish/The Spice Girls…”), they might lead to an inaccurate judgment of character.

On many social networking sites, people remain “friends” with their exes. This can be a difficult situation, so consider being sensitive about what you post. If you’ve just recently broken up with someone, it may not be the most diplomatic idea to post 20 pictures of yourself canoodling with a new paramour—but if you do, remember to adjust your privacy settings to control who gets a detailed look into your love life.

For those who are still searching for love, look no further than your cell phone. There are many applications that have been created to provide the battlefield of love with military intelligence. Mobile dating applications, such as MeetMoi and MIT’s Serendipity, alert potential matches when a compatible mate is nearby. Users provide information about their interests, and these services use GPS capabilities of their cell phones to locate other compatible users nearby, letting you know about that special someone buying a coffee ahead of you in line. These applications, however, may use your personal information to serve up targeted ads—perhaps about flowers or engagement rings.

If your matchmaking efforts have been in vain, you may be tempted to try a novel approach: genetic testing to find your perfect partner. You send a genetic sample, such as a cheek swab, to a company that determines your match with other members based on factors such as immune system genes. Unfortunately, there is little evidence to suggest that romance lies in your DNA, and releasing your genetic information can expose you to serious privacy risks.

Other applications are designed to help you find that special someone safely. Funbers is a phone application that allows people to give out an alternate phone number that they can then use to screen calls. If a date went terribly wrong and they haven’t quite got the hint, Funbers allows the user to send out busy signals or out-of-service messages to make it clear. On the darker side, for those who are suspicious of their loved one’s past, new applications from BeenVerified and Date Check allow the user to do a background check on their loved one from their cell phone. A better approach to this suspicion might be to simply ask your partner about his or her past. It is very important to trust your partner, but also to respect their privacy!


17 Nov 2009

Audit of the Financial Transactions and Reports Analysis Centre of Canada


(from our news release)

The Financial Transactions and Reports Analysis Centre of Canada (FINTRAC) has more personal information in its database than it needs, uses or has the legislative authority to receive.

This was one of the key findings of the Privacy Commissioner of Canada’s in-depth audit of the independent agency mandated to analyze financial transactions and identify suspected money laundering and terrorist financing in Canada …

Legislative changes passed in 2006 expanded the types of transactions that must be reported to FINTRAC, as well as the number of professionals and organizations that are required to collect information about clients and to report it to FINTRAC. Examples of entities required to report to FINTRAC include financial institutions, life insurance companies, accountants and casinos.

The audit found that FINTRAC needs to do more to ensure that the amount of personal information it acquires is kept to an absolute minimum. A random sample of files examined in the audit turned up several reports that did not clearly demonstrate reasonable grounds to suspect money laundering or terrorist financing.  For example:

  • A reporting entity filed several reports stating it was “taking a conservative approach in reporting this … because there are no grounds for suspecting that this transaction is related to the commission of a money laundering offence, but there is a lack of evidence to prove that the transaction is legitimate.”
  • An individual deposited a government cheque for an amount less than $300 and then withdrew the entire amount. The financial institution filed a suspicious-transaction report, but did not indicate why the transaction was deemed suspicious.
  • A financial institution filed a report about an individual who had deposited a cheque from a law firm.  The institution was satisfied that the individual had provided legitimate reasons for the source of funds, but decided to notify FINTRAC anyway because of the individual’s ethnic origin and the fact that this person had visited a particular country.

“It is clear that such reports, containing not a shred of evidence of money laundering and terrorist financing, should not be making their way into the FINTRAC database,” says Commissioner Stoddart.

“It is a bedrock privacy principle that you collect only the personal information you need for a specific purpose,” she says. “The federal government needs to have a justifiable need to collect someone’s personal information. Clearly, FINTRAC needs to do more work with organizations to ensure it does not acquire personal information that it has no legislative authority to receive – and that it does not need or use.”

The audit recommended enhanced front-end screening of reports; stronger ongoing monitoring and review to ensure that information holdings are relevant and not excessive; and the permanent deletion of information that FINTRAC did not have the statutory authority to receive.

Under amendments passed in 2006, the Proceeds of Crime (Money Laundering) and Terrorist Financing Act requires the Privacy Commissioner to review FINTRAC every two years and report the results to Parliament.


1 Oct 2009

Survey says Americans Reject Tailored Advertising


A survey commissioned by American academics and privacy advocates reveals that Americans are generally suspicious of efforts to track their behaviour online and to target advertising based on this tracking.

While you might expect older Americans to be suspicious of efforts to track their behaviour on individual websites, and even more so of tracking across multiple sites, there seems to be opposition from younger Americans as well. 55% of 18 to 24 year-olds do not want to be subject to tailored advertising – and this number increases significantly if the advertiser is compiling data from a number of sources in order to target.

Interestingly, promises to anonymize the data do not seem to win many supporters:

“Even when they are told that the act of following them on websites will take place anonymously, Americans’ aversion to it remains: 68% “definitely” would not allow it,  and 19% would “probably” not allow it.”

The June/July survey was conducted by telephone interviews with a national sample of 1,000 adult internet users living in the continental United States, using both land line and cellular service.

The report by Joseph Turow, Jennifer King, Chris Hoofnagle, Amy Bleakley and Michael Hennessy is available on the Social Science Research Network.