Michal Kosinski: “The End of Privacy” | Talks at Google

Thank you for inviting me here. Thank you guys for coming. Fun fact. Well, I never intended
to be an academic. I actually got
expelled three times from my undergraduate
psychology degree. And my professors,
when my professor was writing me a reference
letter for my PhD course, he was, like, laughing
at me and I was like, I would never get in and
I would never make it. And here I am. I actually met him
a few years ago, and we went back to
those times when no one believed I would make it. But, well, great to be here,
and I probably don’t really have to spend too much
time convincing you guys that you are leaving an enormous
amount of digital footprints behind when using
Google and Google Maps and Gmail and Facebook and
other products and services. And quite some time
ago, back in 2012, IBM estimated that
an average person produces 500 megabytes
of digital footprints every single day. And this includes people
that are too young or too old to own a
computer or a phone, smartphone, or people too poor
or living in too remote areas to own devices. So essentially, you
guys here in the room are most likely producing much– or used to produce much more
digital footprints than that. And of course, our capability
to produce digital footprints is growing extremely quickly. Recently, the Economist
estimated that, in 2025, we’ll be producing 62 gigabytes
of digital footprints every single day. And now, to help you imagine
how much data that is, let us actually go back to this
number here, 500 megabytes. If you wanted to back
up 500 megabytes of data on letter-sized paper,
double-sized, double-sided, single-lined, Times
New Roman size 12, let’s print it out
in 0’s and 1’s. How tall would be a stack
of paper containing just one day worth of data that humanity
used to produce back in 2012? Any accountants in the room? Any guesses? Come on, guys. 100 feet. Great. More than that. Great. More than that. 100 stories. Even more than that. Like from here to the
sun four times over. This is just one day worth of
data backed up on paper that we used to produce in 2012. Now, and we’re quickly
ramping up our capability, so let’s hope that no one wants
to back this amount of data that we’re going to
produce in 2025 on paper. I actually calculated
that the stack of DVDs, if you wanted to back it up
on DVDs, 100,000 kilometers. So one quarter of
the way to the moon. OK, so we’re producing all
of these digital footprints, and now the question is, what
can be gleaned from them? And I know that
you guys at Google do a lot of work trying to
analyze digital footprints in order to improve
people’s experience when using your products,
but of course there are companies and
actors out there that may want to use it to infer
other facts about the users. Perhaps their intimate traits. And I have a short
video here that shows you results of
one of our early studies where we looked at
people’s Facebook likes. Facebook likes are one
of those generic types of digital footprints very
similar to your browsing log, very similar to your
purchase records, to your search records. Essentially, connections
between people and some pieces of content. In the context of
Facebook, those connections can be created by clicking Like. So we’ve collected likes
of around 60,000 users. And we also collected a bunch
of their intimate traits, such as their personality,
IQ, their sexual orientation, political views using
questionnaires and psychometric tests, and then tried
to see whether you can train simple
machine learning models to predict those intimate traits
from their Facebook likes. And no one really likes
anything overly intimate. It’s actually not
possible on Facebook, as there is some
censorship of what you can like going on there. And yet, just by looking
at those connections between people and
pieces of content, you can make very
broad-range predictions about their intimate
psychological or demographic traits. And this is because the data
you can see on the surface is essentially just
a tip of an iceberg. And now, when you
take these data and apply some AI
and machine learning, you can make inferences
spanning much beyond what’s directly
observable in the data. So if you like, you know,
Barack Obama and the Democrats and other left-leaning
pages on Facebook, then you don’t really need
to run any predictions. It’s pretty obvious what are the
political views of this person. The whole idea here is that
you don’t need to do this. You can just read books that
might be completely not related to political views and
then like them on Facebook, or like those pictures
of funny goats or status updates describing what an
older cousin is doing. And yet, just by looking
at those things that are on the surface, absolutely
unrelated to political views, a very simple algorithm can make
extremely accurate predictions when it comes to
a very broad range of intimate psychological
and demographic traits. Which brings me
to the next question, how accurate are
such predictions? And now, when we started looking
at this, we thought, hey, maybe we’ll try
to predict gender and age and political views. But what happened is
that those predictions are close to perfect. So, you know, when
I show this slide, it’s just 99% accuracy of
predicting age and gender, and probably the inaccuracy
stems from some imperfection of the ground truth. And so we thought, this is not
really a fun thing to look at, not to mention that people
were not that impressed. Because they say, hey, I
can judge people’s gender or their age or their political
views fairly easily, as well. So I’m not that impressed that
a computer can do it accurately, as well. So what we thought,
we thought, hey, let’s give a computer a
slightly more complicated, slightly more difficult task,
something that people cannot do so easily. So we thought, hey,
let’s try to see whether we can
predict personality from people’s Facebook likes. And now, we are all actually
somewhat good at predicting other people’s personalities. If you have a friend, if
you hang out with them, you will know their
personality after some time. In fact, even if I showed
you a picture of someone, you would be able to
judge their personality with some minimal,
absolutely minimal accuracy. So we have some ability to judge
other people’s personality. But it turns out that
simple algorithms based on Facebook likes are
much, much better than us. So what you can see here is a
plot showing how accurately can we predict Big Five
personality traits given the number of
Facebook likes that we are revealing to the algorithm. And so you can see
here it’s a log scale, but as you can see
here, the more likes revealed to an algorithm,
the more accurate the predictions are getting. And you can also see that the
personality trait of openness, it’s the one that can be most
easily predicted from Facebook likes. Now, if there were any
psychologists in the room, I would get a wow
for these plots, because this shows that
those predictions are extremely accurate. Now, let me show you how
accurate those predictions are by comparing this accuracy
with the accuracy that can be achieved by human judges. So what we’ve done, we’ve judged
personality of our participants using their Facebook
likes, and we also asked their friends and
family members and co-workers to fill in a personality
questionnaire in their name. Now, if you know someone,
if you hang out with them at work or at home,
if you raise them because they’re your
child, you should be able to accurately
answer questions about whether this person likes
poetry or that they are on time when it comes to the meetings,
or whether they are orderly or whether they’re extroverted
or introverted, right? The essence of
knowing someone is to be able to answer
these questions. So people are good at
answering those questions, as I mentioned before. But now let’s see how
our accuracy compares with the accuracy of the
computer algorithm based on all of those funny goats
that you liked on Facebook. So how many likes
do you think do you need to reveal to
an algorithm for it to be better than
your work colleagues at predicting your responses
to a personality questionnaire? Do you have great
work colleagues? It’s quite likely. On average, our work colleagues
don’t know us that well. And 10 Facebook likes is
enough to reach the accuracy of your work colleague. Now, family members,
friends, and cohabitants, you need around 100 to 150 likes
in order to outperform them. And now, the most
accurate of human judges, people that know you
the most, your spouses, you need around 300,
250, 300 Facebook likes for an
algorithm to be more accurate at predicting your
answers to a personality questionnaire. Now, this is somewhat
upsetting to anyone who has a spouse
because this essentially means that, in one
day, probably you produce enough digital
footprints for a [INAUDIBLE] algorithm to know you better
than your spouse knows you, to be able to predict
your future behavior and preferences better
than your spouse can do it. And this is really upsetting. Look, spouses know
us really well. They see us across many
different contexts. They see us in intimate moments. They see us angry. They see us happy. They have conversations with
us on many different subjects. If you guys got
married recently, your spouse still pays a
lot of attention to you. And yet, it turns out
that 250 likes on Facebook allow a linear
regression to be better at predicting your personality
than your own spouse. So now the question
is, how does it work? Why does this tip of an iceberg? So essentially, this data that
is visible on the surface, why, when combined with algorithms,
it allows revealing all of this information that
humans are actually not that great at judging? And we’ll go through this
equation really quickly. But before we do this, let me
give you one practical example. So what you can see here is the
history of my GPS locations, my geographical locations
from Google Maps. So you can see I
work at Stanford. I live next to Page Mill Road. You can see I
cycled to work, then I gave a lecture somewhere
here, then I went to Zola then to Nut House, and back home. It’s all geo-tagged,
essentially, and time-tagged. On the surface, it
just looks like a bunch of numbers describing my
latitude and longitude. But what happens if you
match it with some metadata, like you match it with the
names of the restaurants that I have attended? Then you can match it with
the menu of those restaurants. And what you now know is
what my food taste is. But you also know how
affluent I am because you know how much I spend on my dinner. You know where I live. In many places in the world,
just by knowing where I live, you can, again, check
what my credit score is and what my mortgage is
and whether I own my house or do I rent it. You know where I work. So again, you know
how much I earn. You know what I’m doing. You know that I’m drinking. No one goes to the Nut House if
they don’t– you must like nuts a lot if you are not drinking
and still are going there. And so you know that
I’m drinking alcohol. Now combine it with some
other types of metadata, like combine it with the history
of geographical locations of other people. And suddenly, you can start
discovering my social network, right? If I’m in the same
geographical space with a user of another phone
more often than randomly, then hey, maybe there’s
some connection between us. And now, on top of
that, you can apply what we know from psychology,
which is [INAUDIBLE]. We know that people like
to hang out with others that are similar to them. So now if you know political
views or religiosity or sexual orientation of a few people
in my social network, now we can propagate
this information, and now you can guess what
my sexual orientation is and what my political views
are and what my religiosity is, how much I earn, whether I’m
married, do I, you know– where do I go for
dates, and so on. You can learn all of
those intimate things just from looking at the
history of GPS locations. And let me show you
another example here. Imagine you have this
anonymous Facebook user, and the only thing you
know about them is that they liked Hello Kitty the brand. Now, you know nothing about
them, just this one fact. Now, can anyone here guess? Female. So judgmental. In fact, this retired
Japanese gentleman has the largest collection of
Hello Kitty toys in the world. But generally,
you’re right, female. Those stats, they’re not
coming from my studies. This is just Facebook
marketing, what is it called, Audience Insights
platform, where you can check demographic
profile of the audience of any Facebook like. And you can see here
99% of people who like Hello Kitty are females. Great. Well done, guys. So if you don’t know
anything about a user, you just know that
they like Hello Kitty, guess that they’re
a young female. Now, do they work in
sales or in the military? Great. Of course. Are they using a
PC or an iPhone? iPhone, of course. The self-respecting Hello
Kitty fan doesn’t even know what PC stands for. Great. So as you guys can
see, just based on one single digital footprint
of an anonymous person, you can make a very broad range
of accurate psychodemographic predictions. And you don’t need
to be a computer algorithm or a psychologist
or a computer scientist, or I presume you guys are
all computer scientists. Now, in fact, this task
is not always as easy as in the context
of Hello Kitty. Let me show you another Facebook
like, Forrest Gump the movie. Now, what is the gender
of the person that likes Forrest Gump the movie? Both. Great. Gender fluid. No, actually, yes,
probably, to some extent, but because everyone likes
Forrest Gump, you essentially would say, great, both
genders love Forrest Gump because it’s an amazing movie. But if you look
a bit closer, you can see that there’s a slight
overrepresentation of men among people who like
Forrest Gump, and also older people in general, right? So the information, the
signal here is not as clear, but now what algorithms
can do very easily, they can take tens or hundreds
of thousands or millions of the digital
footprint that you leave behind every
single day, and each one of these footprints contains a
little tiny bit of information about you, a signal about you. And now you can use an
algorithm to very easily extract this signal from those
megabytes of data that you are leaving behind. And now, I was talking about
Facebook likes and the history of GPS locations so far,
but of course, there’s virtually any type of
the digital footprint that you leave behind can
be used in prediction. So likes, history of GPS
locations, your playlist, what you are saying when you are
at home or around your phone, what you write in your emails. Now, there’s a wealth
of studies showing that we can use the
language that people use, and even without interpreting
what they’re talking about, right, you don’t need
an advanced algorithm that actually understands
what people say. Just by looking at frequencies
of the words and phrases they’re using, we can make
extremely accurate predictions related to the
psychological traits. What you can see here
is two word clouds that essentially describe
language or contain words that are most diagnostic of being
high or low on the personality dimension of openness to
experience, which is very close with political views. Essentially, low
openness is conservative, high openness tends
to be very liberal. And you can see there’s
a very clear difference between the types of words
that people are using. And those studies, those methods
are not just used in the lab. They’re widely used
by companies and governments around the world. One example here,
IBM Watson I think for a decade, over a decade now,
has an API where you can apply, upload samples of
text, and IBM Watson will give you a feedback
on the personality profile. And I’m sure you guys
can come up with hundreds of other examples, like browsing
logs, search queries, data from people’s phones, from
their computer games, and so on. Now, one interesting
type of digital footprint that I got really
excited about recently, for many different
reasons, because it’s so interesting and surprising,
but also because it’s so scary and poses so many privacy
risks, is people’s faces. It turns out that
this is just yet another type of
digital footprint that can be used to extract
intimate information about people, about their
psychodemographic traits. And now, when I say
that, people say whoa, this is completely crazy. Are you going now to say
that computer can just look at someone’s face
and judge their intimate psychodemographic traits? This is just like some kind
of medieval pseudoscience. Well, it sounds like this,
and this was certainly my impression when I first
started thinking about this, but then I realized that we
are constantly doing this. We are looking at people’s
faces and making conclusions, making inferences about their
intimate psychodemographic traits, like age or gender. When we look at people’s
faces, we make those judgments automatically. Emotions. Emotions are intimate
psychological states, and yet we have no trouble
judging these emotions from other people’s
faces, even when they want to hide their emotions
that they’re experiencing. Genes. Genes are pretty
personal, and yet we have no trouble judging
genetic similarity between parents and children,
children, siblings, twins, and so on. Genetic disorders are often, in
many cases, clearly displayed on people’s faces. And the same applies to
developmental disorders. And those disorders affect not
only people’s faces, but also their behavior. Political views. Which of those guys
voted for Trump? Right? We are not going to be
accurate 100% of the time, but there are studies
showing that you are going to be much better
than random at judging from such profile
pictures political views and religious views of people. Now, so we are great at
judging a broad range of psychodemographic
traits from people’s faces. But then, we are also pretty
bad at judging other traits from people’s faces. What you can see here is two
famous psychologists, Sandrine and Sandra. Now, they differ on the
personality dimension of extroversion. One of them is an
outgoing person, another one is an
introverted one. Can you guess from
those two images which one is an extrovert? Who believes that Sandra here
on the left is an extrovert? Hands up. Who believes that Sandrine here
on the right is an extrovert? Wow, guys. You were very accurate. Sandra here on the
left is an extrovert. So there was a lot of
accuracy in this room. On average, people are
accurate in about 54% of cases. So it’s better than random,
better than just toss of the coin, but
not much better, which essentially
means that there seems to be some signal
in people’s faces, but not that much. So that may imply two things. Either there’s essentially
not that much signal in people’s faces, or our brains
did not evolve a capability to either perceive this
signal or use this signal to make predictions about
people’s personality. It might be that
personality is displayed on Sandra’s and Sandrine’s
faces as much as gender is, it’s that we just can’t see it. And let me prove to you
that it’s the latter. What you can see here is 10
images of extroverted females and 10 images of
introverted females overlaid on top of each other. No computer vision magic here. Now, no one will
have any trouble guessing which of those
groups are extroverts, right? You can see extroverted
women much more likely to smile on the picture,
wear makeup, dye their hair. You can see also
here on this side introverted women tend to
wear glasses, tend to not dye their hair. There’s a different
hairstyle for sure. When you look at the larger
versions of these pictures, you also can see that extroverts
tend to dress more liberally and also wear contact lenses,
because their eyes tend to be green or blue. Now, there’s another
difference here. As you guys can see,
introverts, nostrils. Extroverts, no nostrils. Can anyone guess why
introverts– sorry, why extroverts have no nostrils? AUDIENCE: The way they’re
holding their head. MICHAL KOSINSKI: Great. If you are to remember just
one thing from this talk, extroverts know how
to take selfies. And if you want to look
good in your selfie, you want to take a picture
like this from above. And then your nostrils don’t
show up in the picture. Now, we have rerun
those studies. So those are profile pictures,
but we have rerun those studies on images taken
in the lab, where people would face the camera
directly, makeup is removed, facial hair is removed,
hair is hidden, and you can also see in those
carefully taken pictures that also shapes
of facial features differ slightly between
introverts and extroverts. Now, those differences
are not large. Those are not Hello Kitties. Those differences are
more like Forrest Gumps. So humans have trouble
interpreting them if we don’t amplify them by
overlaying many pictures on top of each other. But it turns out that
computer algorithms can do it pretty easily. And now when I say
that, people still say, well, maybe you’re
making some mistake. Why would anyone
make a claim that one can predict that your
intimate traits are displayed on your face? And please forgive
me this little foray here into a
psychological theory, but I really want to make
this point very clear that, according to
psychological science, it would be an
extraordinary claim to make that your
face is not linked with your psychological traits. And this is for
three main reasons. First reason is that your
psychological traits are going to affect your facial features. And there’s plenty
of examples of that, but let me just show you one. If you’re a cheerful person
that smiles and laughs a lot, over many years you will
develop smile lines. So your face is going
to look different. If you are a person that
likes outdoors, guess what? Over many years,
your complexion will change because of the
exposure to sun and wind and other elements. If you prefer playing computer
games in the basement, your face would also
look slightly different over many years. Now, there’s also a
reverse mechanism. So your face affects your
psychological traits. And let me just show
you again one example. Beautiful babies. We all here were lucky enough
to be beautiful babies, as I can judge from the
faces in the audience. Now, we were also lucky because
our mothers smiled at us more than they would
smile at ugly babies. This is a very sad
fact, but also true. Mothers tend to smile
more at beautiful babies. Not only that, when
those beautiful babies go to primary school, kids want
to hang out with them more. They are invited
to parties more. When they start dating,
they’re invited to dates more. When they start
working, they are getting promotions more
easily and getting pay rises more easily. When they want to
run for office, like to become the president
of United States, for example, people tend to
vote for them more if they have a pretty face. Now, if throughout your life you
constantly are getting things more easily and constantly, when
you interact with other people, they just smile and
melt in front of you because they love hanging out
with such a pretty-faced person, guess what? This is going to affect
your psychological traits. And there’s a lot of evidence
that people that have pretty faces as children become
more extroverted with time because they get constant
reinforcement from society for interacting
with other people. And now, people who
are not that fortunate tend to become more
introverted with time. And finally, there
is a broad range of factors that
affect both your face and your psychological
traits, genes. We know that many psychological
traits are heritable, and we know that our
faces are also heritable. We know that there are
genetic disorders that affect both faces and behavior. Hormones. Hormones such as testosterone. Half of the audience,
roughly, is suffering from overdosing testosterone. And I’m sorry, guys, but
over a long period of time, it is going to
affect your hairline. So enjoy it now
that you have it. This testosterone is also
going to affect your behavior. It’s going to increase
your social dominance and aggression. Now, it turns out that
even prenatal hormones, so the hormones that you were
exposed to in your mother’s womb, will affect both the
shape of your facial features and the development of your
brain and your future character traits and so on. And there’s some very
clear examples of that, like, for example, gender
conversion therapy, where people start taking hormones in
order to change their gender. As you can see here, this person
was transitioning from a male to female. And you see how essentially
face is becoming prettier. The distribution of
collagen is changing, the health of the
skin is improving, the density of facial
hair is going down. So you can see here
that, even in adults, you can have dramatic changes
in people’s facial features just because the exposure
to hormones has changed. And finally,
developmental history. If your mother abused
alcohol during pregnancy, this will affect
both your behavior and your facial features, such
as in fetal alcohol syndrome, which can be recognized
by clinicians just from a short exposure
to someone’s face. OK, so the question now is– I hope I made it clear that it
would be an extraordinary claim to make that people’s
intimate traits are not related to their faces. Now, so the question
is can AI be used to predict
people’s intimate traits on a large scale. So what we’ve done here,
we’ve done several studies. In one of these studies,
we took profile pictures of over six million people,
their Facebook profile pictures. We used facial recognition, the
algorithm to detect the face on a picture. Then we used it
to convert facial, essentially faces into
strings of vectors of numbers summarizing
most important features of these faces. And then we started building
very simple predictive models, like logistic regressions
or linear regressions, where we took essentially those
face vectors for many people and tried to see whether
we can predict traits such as political views,
personality, and so on. And as you can see
on this plot here, you have to look
at the blue bars. Computer can predict
people’s personality with the same accuracy as
your work colleagues can after having interacted
with you for some time. Again, a pretty shocking fact. Just from exposure to one
single profile picture of yours, with background cropped,
with your hair removed, a computer algorithm can
know your personality with the same
accuracy as people who spend eight hours a
day with you at work for an extended period of time. And by the way,
the red bars here represent the accuracy
of human judges when shown with the
same facial images. You can see that computers
are significantly better across the board,
and you can also see that those face-based
predictions of personality can be now used to
predict other life outcomes, such as depression
or political views and so on. AUDIENCE: Are we saying
that it’s like 20% accuracy? MICHAL KOSINSKI:
This is r squared. So this is essentially
correlation squared. So the correlation here would
be 0.4, which is actually 0.4 correlation
because people say, oh, what is 0.4 correlation? Well, 0.4 correlation
is the correlation between the risk of you
having cancer and you smoking. So it’s a very huge– I think it’s defined
as a large effect size. Great. Political views–
in this case, we use profile pictures to predict
people’s political views by showing a computer a
liberal and conservative person and then checking how often
a computer will be correct. And you can see here that
in nearly 70% of cases, computer can distinguish–
can pick the liberal person or conservative person
correctly, where 50% would be a random baseline. Now, all of those studies,
so far, based on faces were random profile
pictures taken from people’s public profile. So no one can argue, hey, maybe
it’s not really people’s face, but maybe it’s makeup,
or facial hair, or glasses, or hairstyle,
or other grooming, or self-presentation
strategies that people use. Now to explore this subject
in a bit more depth, we collected not
profile pictures, but standardized
images from 600 people. We invited them to the lab,
controlling the lighting, controlling the
camera, controlling the angle, removed makeup,
facial hair, and made sure that their facial expression
was neutral, and tried to see whether we can
replicate those results. So what you can see here is
two composite images of males. There’s about 20
males behind each one of those composite images. And one of these
groups is liberal and another one is conservative. Now, can you guess which
of these groups is liberal? Who thinks that people here
on your left are liberal? Who thinks that people on
your right are liberal? OK, let’s see. Great. When you look at those images– and I spent a lot of
time looking at them– there are no
obvious differences. Nothing really jumps at
me saying, I’m liberal. And yet, I show those slides
to so many audiences, including sometimes hundreds of
people in the room, and there’s tremendous accuracy. People always know
which of those are liberals, despite not being
able to actually pinpoint what precisely makes the difference. You can see they’re
very similar. Those are 20 different people
here from those people. And yet, when you
average their faces, they end up being
pretty similar. Now, females. Who thinks that the group
on the left is liberal? Who thinks the group on
the right is liberal? Again, guys, that’s
100% accuracy. That’s very impressive. And now we tried to– so
what I’m trying to do now is I’m trying to look
more deeply into– we know that computers
are very accurate. We know that humans
are very accurate when they look at aggregate images. Computers can also do it
on the individual level. But now it’s actually
pretty difficult to identify all of those Forrest
Gumps, those tiny little bits of signal that actually help
us to make those predictions. For females, for
example, you can see that liberal females have
slightly different outline of their jaws. All right, it’s not
a huge difference. Actually, it’s halfway
already to the difference between males and females. So females have generally
narrower jaws than males. Now, liberal females are
halfway between an average man and an average woman, which is
actually a pretty significant difference. So now let’s see how
accurately can we distinguish between
liberals and conservatives just based on two
images taken in the lab. And as you can see
here, the accuracy is even higher than what we’re
able to achieve when running those studies on social media. So it’s pretty clear that it’s
not self-presentation, not grooming. It’s actually your face
that gives out information about your political views. And again, it’s not just
happening in our labs. There are APIs out there– Face++, Microsoft Cognitive
Services, and others– that allow uploading
facial images and then will give
you predictions related to the psychodemographic
profiles of faces on the picture. Great. So what does it mean that it’s
possible to predict people’s intimate traits based on
their digital footprints, ranging from Facebook likes
to images of their faces? One of the most
important outcomes, most significant
outcomes is, in my view, the revolution in marketing. And in the past,
essentially people were marketing to large
groups, males and females, wealthy and poor. Nowadays, you can essentially
change marketing in such a way as to target people
as individuals. And this is great
if you are trying to help people to
stop smoking, or take this new amazing course on
AI so their job prospects are improving. But it’s pretty
problematic if you want to sell people
cigarettes, or alcohol, or make them vote for a
candidate that doesn’t really have their best
interests in mind. And now, how can you do that? Well, if you are a
large company that has a lot of digital
footprints of many users, you can just run those models,
create people’s profiles, and then target them
with appropriate ads. And in fact, all of the
large online advertisers are doing exactly that. They may not call the dimensions
that they are extracting from users’ profiles–
they may not call them personality
or political views. They may call it just
multidimensional user spaces. But essentially what
they are doing– they’re extracting intimate
traits of people and then using machine learning
models to optimize marketing. But you can do it
more explicitly. And you can also do it if you
don’t have access to any data. And we, in fact, tried
to do that– exactly that– to measure the efficiency
of personality-based marketing. So what we have done,
we used Facebook’s online advertising platform. And we used the fact that
likes predict personality. We know that introverts and
extroverts, open minded people and conservative people– they like different
things on Facebook. So what we can do, we can just
use Facebook’s marketing platform to target them separately
with different ads. So what we would do here,
we’d essentially say, hey, please target for me
an introverted audience. We don’t call it an
introverted audience. We just call it an audience
of people who like computers, and “Stargate”, and “Serenity”,
and separately target extroverted audiences who
like meeting new people, and “Entourage”, and [INAUDIBLE]
and drinking, and beer pong, and theater– so essentially, a typical
extroverted audience. And then we would design– so first of all, we would show
those audiences personality tests, just to check if
our manipulation worked. And guess what. You can see that extroverted
audiences scored significantly higher in extroversion. So we just checked that
our manipulation works, that, in fact, we can target
extroverts and introverts separately on Facebook,
even despite the fact that Facebook doesn’t provide
you this option explicitly. We can just use Facebook
likes to achieve psychological targeting. And then what we
would do is design ads that should
attract the attention of extroverts and introverts, right? So we have an ad for extroverts: “Love the spotlight,
and feel the moment.” And we have an ad
for introverts: “Beauty doesn’t have
to shout,” and just the one person sitting
calmly and putting on makeup. And we would use
those ads to try to sell to people on Facebook
the same line of cosmetics. So this is not just
a lab experiment. It’s actually selling
cosmetics on Facebook. And as you can
see here, this bar describes sales
among introverts. This bar describes
sales among extroverts when being shown
an introverted ad. And as you can see,
introverts are much more likely to buy than
extroverts when presented with an introverted ad. And the opposite happens
for an extroverted ad. Here, actually the difference
is one point– so essentially 80% more likely to
click if they’re psychologically matched
with an ad they are seeing. And we calculated those
scores for conversion rates and also return on investment. Now, if anyone here works
with online advertising, you guys know that sometimes
boosting performance of an ad or a campaign
by a few percent is already a great success. If you’re spending $100 million
a year on online marketing, improving accuracy of
targeting– performance of targeting by 3% just saves
you $3 million potentially, right? You can achieve
a boost of up to 80%– so nearly double
your performance– just by using a single
personality trait, and just by using some
designs that we just made up in the lab while
planning for the experiment. Now, if you’re a real
marketing company, you would spend a lot
of money in producing a lot of different designs,
and targeting people based on their many different
psychological traits, and do some A/B testing, and
potentially boost this accuracy even further. If you were not in a coma
in the last few years, you know that it’s
not only cosmetics that are being sold in this way. Everyone has heard about
Cambridge Analytica that essentially was using
exactly this approach to try to optimize political
communication for Brexit, then Ted Cruz, and,
finally, Donald Trump. And the outrage was
so huge that even our neighbor Mark got
dragged to the other coast to explain to the senators
what the hell is going on. And I’m not sure if you
guys remember the testimony. Mark Zuckerberg was
absolutely surprised. How is that possible
that someone can take people’s Facebook likes
and predict their personality? This is just some
outrageous technology. And he was very shocked
to learn about this, forgetting about the
fact that back in 2012– so when Cambridge
Analytica people were still in high school, Facebook
patented the technology that Cambridge Analytica later
used to essentially extract people’s personality from their
Facebook likes, which brings me to my last few slides here. Now, as we are
moving forward, we’re going to be producing
more and more Facebook likes, Facebook
status updates, search queries, emails, and all sorts
of other digital footprints. And on the other
hand, algorithms are going to get
better at turning those digital footprints
into accurate predictions of our intimate traits,
psychological traits, and our future behavior,
which essentially means that going forward, we’re going
to have much less privacy, if any, than we have now. And when I say that, people
say, well, OK, that’s great. Thank you for warning us. Let’s just take an action and
give users control over data. This will solve all of
the problems, right? Of course not. Even the government is not
able to control their data. If there’s a
motivated third party, they’re going to get your data,
whether you want it or not. And no one– no citizens
have enough time, education, and assets to actually be
able to successfully protect their data against, again,
a motivated third party. Not to mention that if
you actually gave people an option of just switching
off access to the data, taking full control
of it, then many of the wonderful
technologies that we have now would stop working. Google Maps and the
navigation capability that it’s providing– it’s an amazing technology–
perhaps the greenest and one of the most
world and life changing technologies of recent years. Imagine how many
liters of gas are being saved every day by people
optimizing their route to work. Imagine how many
hours that we can spend working more
or with family are being saved because
of the same feature. Imagine how many people are
enabled to travel far and wide, that go and work in new
cities, because they’re not lost in geographical spaces. Now, if we stopped sharing
our geographical location with Google Maps, the
system stops working. And people may say,
well, that’s OK. Maybe it’s a price worth paying. I don’t know, maybe it is. I don’t know. But we definitely have to
think about those trade-offs. And I’m sure we can
come up with gazillions of other examples of
technologies that essentially would stop working or would
work much less efficiently if user data was not
essentially shared with the service provider. Not to mention that even if we
shut down all of those services and you are taking full
control over your data, Homo sapiens is just a
really narcissistic species. We are going to be tweeting,
and blogging, and posting our pictures on Instagram. And this is already enough data
for a half-witted algorithm to already predict
your intimate traits with extremely high accuracy. And at this stage,
I hope essentially to have convinced
you that there’s very little chance that
we will find a way out of having lost our privacy. To which people say,
well, whatever, that’s OK. Let’s just move on with our
lives and see what happens. And I just want to
pause here for a second because losing our privacy is
not just some creepy marketing or more Cambridge Analyticas. And this is nicely illustrated
by those two composite images here. You have two groups of males. And those males differ
on an important psycho-demographic trait. Can you guys guess what is the
difference between those two groups of males? AUDIENCE: Sexual orientation. MICHAL KOSINSKI: Great. It’s essentially an
aggregate picture of 100 gay and 100
straight males. And if you show an image
of a straight and a gay man to a computer algorithm– and
I’m talking about an algorithm that I can train on my laptop. I’m not talking about sci-fi
AI that you guys can train on your super computers
here at Google. You can essentially reach
an accuracy of over 80%. And given five facial images,
the accuracy grows to 90%. We have now replicated those
results in the lab data where people’s facial hair, and
makeup, and hair were hidden, and predictions are even
more accurate than that. And now if you match it
with this map that shows you where the government is going
to kill you if you are gay, this nails down the points,
this emphasizes the point that losing our privacy is not
just some creepy marketing, that losing our privacy is
a matter of life and death for many people out there. And guys, also
consider the following. People get really upset
about those studies showing that we can predict sexual
orientation from people’s faces. It is upsetting. Those studies are made to be
upsetting because I’m a privacy researcher. And I want to study
the privacy risks. I’m not creating new algorithms. I’m using algorithms that
are already widely used by governments and
companies around the world to predict, for example, whether
someone is a criminal or not. But they would not boast,
of course, how accurate those algorithms are. But what people
keep forgetting is that predictions based on faces
are actually not that accurate compared with predictions based
on your browsing logs, credit card use, Facebook likes, status
updates, emails, and so on. Predictions based
on faces are just perhaps the most
difficult and most noisy types of predictions. If you’re afraid of the
government predicting your sexuality
based on your face, you should be really afraid
of your government predicting your sexuality based on your
Facebook likes, or status updates, or tweets, or
all of those other things that we all share
online at all times. And this is just my
last slide, I promise. So I would like
you to encourage– I would like to encourage you
to think about this problem as you would think
about a tornado. We can all agree in this
room that tornadoes are bad. And we can probably
all agree in this room that we should make
tornadoes illegal. We should probably
all agree in this room that we should make
intrusions of privacy illegal. But very much like
with the tornadoes, the fact that we want
to make them illegal will not stop them from coming. And the sooner we
accept the fact that even if we de-legalize
tornadoes or de-legalize fires, those things are
going to be happening, the sooner we can move on
to having a mature adult discussion about how to make
sure that a post-privacy world in which a motivated third
party can invade virtually anyone’s privacy– how to make sure that this
world is still habitable and a safe place to live. Thank you, guys. [APPLAUSE] AUDIENCE: In terms
of manipulation, it’s a word that we’re,
I think, starting to hear about a lot more
with respect to algorithms and its outcomes on society. You pointed out some
examples that I think people are starting to either– that are either known or
they’re starting to get to know. Are there certain risks
or benefits to society that the end of privacy
actually will provide in a way that maybe the layman might
not know and you would from just being deep
in the research? MICHAL KOSINSKI: This is
a great, great question. And I’m a big fan of privacy. And I would like
to have my privacy. And I would like you guys
to have your privacy. But essentially my point is that
despite our best intentions, the technological progress
that is unstoppable– because we are not giving
up those shiny toys, we are not giving
up AI because it’s going to invade our privacy
but also cure cancer, and drive our cars, and
improve our economy, and improve our
education, and so on. So we’re not giving it up. And in the process, it will
also invade our privacy. So while I’m not
a fan of this, I’m also seeing it as inevitable. But while I see
it as inevitable, I also cannot stop recognizing
there are some advantages of it. Look, transparency– and
everyone would say, hey, we should have
transparency, right? Especially if
you’re a politician, you should be transparent
with your taxes. Or if you’re a
businessman, you should be transparent with whether
you commit crime or not. If you are a citizen,
you should be transparent with whether
you commit crimes or not. Transparency is the
opposite of privacy. When we have privacy guaranteed
and technologically enforced, child traffickers and
abusers, corrupt politicians, tax avoiding businessmen,
also have privacy. It’s much more
difficult to police. And there are systems out
there where essentially when we remove privacy
of certain outcomes, the system becomes much more
efficient and much better. There’s more trust in it. Example, reputation
system on eBay. The moment we remove the
privacy of the interaction between the seller
and the buyer, and now the record of the
interaction is public, and both sides can
judge each other, suddenly the market became
much more filled with trust. People can buy products
without worrying that they will lose their money and so on. Credit score is another example. Credit score is a big
invasion of privacy. But suddenly your past financial
performance and failing on debts and whatnot is nearly
publicly known to anyone that you want to
buy something from. And yet, those credit
scores broaden the access to financial products
to people that were historically deprived of
it, underprivileged people, right? In the past, if you’re a
banker and you have no access to information about
people’s financial history, you will judge them
by their looks. You say, oh, you
look poor, or you look like you are a minority. I’m not giving you a
loan because I’m not risking my bank’s money. It’s very unfair. I’m not going to admit to
it, but I’m going to do this. And there’s evidence that
that’s what people do. But now when people
have credit scores, you can say, well, I have
now reason to trust you because you were performing
well in the past. So again, there are some
problems again with this type of invasion of
privacy– credit score, or a citizen score in China– but there are also clear benefits,
and especially benefits to underprivileged people who
traditionally were completely cut off from jobs,
and loans, and so on. AUDIENCE: Very good, very scary. And I thought I already had
gotten peace with the fact that we have no privacy. But this is making me
a little more worried. So I’m active on
Facebook and like to support my friends
by liking their posts and their pages and
things like that. I have a lot of likes. And I’m worried by this. What can I do to
have more privacy– unlike things, stop
liking, is there anything? I do know already that I
get facially recognized. And I had a college
reunion recently. And I put up pictures
from college. Three of my friends
Facebook facially recognized several years
later, a couple decades later, even though we had changed. And so what do I do to– stop liking, is
that going to help? Stop using Waze? MICHAL KOSINSKI: Don’t. Keep using– keep living
your life in a convenient way because you can stop using
Facebook and maybe Waze. But you can’t stop
using credit cards. You can’t cover your face when
you’re walking the streets. You can’t give up online banking. That’s essentially
impossible these days. And those types of data are
much more revealing about you than your Facebook likes
and status updates. And also by removing yourself
from those online social platforms, you’re essentially
leaving those platforms for– you essentially take
your opinions as well. And you take your social support
that you give to other people, so essentially making
the experience worse for everyone else. And I would actually– let me
make a radical statement here. That I would actually
argue that if you can afford losing your privacy– and everyone in this room can
afford losing their privacy because we live in a liberal,
peaceful, wonderful place in the world. We’re much luckier than
99.9% of other people. Now, we are all
going to lose privacy or we already did. Now, if you actively
deprive yourself of privacy, if you use those
channels openly, you are essentially
moving the world in a direction where it’s going
to be safer for people who now live in countries
like Saudi Arabia, or places like some
inner cities in America where losing the privacy
of political views, or sexual orientation,
or religiosity can be, essentially, a matter
of life and death. And this is not a new situation. Harvey Milk was arguing that
if you can afford to come out as gay, you should do
this because then you essentially show your neighbors
that you are just like anyone else. You are removing
the taboo subject. And he paid the highest price
for doing exactly that himself. But now because of the price
he paid and many other people paid over the many
years, we are now living in a completely
different world. And one could make a
very similar argument for the online space and all
sorts of other intimate traits. AUDIENCE: So most
of my Facebook posts are public because I’ve done
social media for a living. So you think continue with that? MICHAL KOSINSKI:
Keep it this way. And you’ll pay the
price because there will be a bigot
somewhere there that will be prejudiced
against you because of your religious views,
or sexual orientation, or political views. But you know what? The problem of our world
is not us losing privacy. The problem of our
world is bigotry. The problem that I cannot freely
reveal my sexual orientation when I’m traveling to
other places is not– the privacy itself
is not the problem. The problem is that
there are those assholes there that are going to
be prejudiced against me and potentially even hurt me. Right? And in our actions, including
voting for politicians, we should essentially adopt
a view that privacy is gone. Because if you adopt this
view, it changes everything. Can you really trade
with Saudi Arabia where they kill gay
people and kill atheists? Well, politicians,
they say, well, of course you can because if
you’re gay and born in Saudi Arabia, just don’t tell anyone. Well, this doesn’t
work anymore, right? This thinking breaks. We need a radical
change in how we think about interactions
in this country and with other countries. And we need to essentially
not fight for privacy because it’s gone. And I love it. I would love it back. But it’s gone. We should fight the
bigotry because that’s the war that we can still win. We kind of lost
the war on privacy. SPEAKER 1: All
right, I think we are going to wrap today’s session. Thank you so much, Michal,
for coming here. It was great having you
and listening to your talk. MICHAL KOSINSKI: Thanks. Thanks for having me. SPEAKER 1: Thank you so much. All right. [APPLAUSE]

54 thoughts on “Michal Kosinski: “The End of Privacy” | Talks at Google

  • Well, this guy thinks that Google only collects data to improve the experience. Please do more research. Read Shoshana Zuboff's book: Surveillance Capitalism.

  • Endless fear mongering- no, sorry- they just want to target people better for advertisers, it pays more- not complicated…..

  • I recently went into my phone settings and under MY ACTIVITIES, I changed all the data collection settings and this program controls all media platforms. Point being, before I reset things to my favor, my searches and especially my YouTube experience were crappy. Like while watching a video on consciousness or spirituality, it wouldn't recommend anything but Trump news and politics. I'm 38 so that's what I get. No videos relating to anything I like or relating to the video I was watching.
    Now that I switched things my recommended videos are things I like and things that are very similar. So nice! I forgot how good YouTube was 2 years ago. They've been slowly creeping on us for a while.

  • In just two days after being fired, YouTube ads presented me anxiolytic medication… I just smiled. What else could I do… click skip.

  • Still, I get BS ads for gun supporters and Trump just because of my age and living location even though I am very progressive and have oodles of progressive channels.

  • P.S. about credit scores, they are targeted at the most profitable – i.e. most economically incompetent, and not the most likely to pay the loan back.

  • I wonder if Google has the technology to determine whether I've had a stroke or not so I don't have to go whine to my doctor or waste time, money or gas. They probably bank off of advertising dollars so it would be cool if they invested in themselves and myself and built a Google stroke detector.👍❤🌎🐐

  • I bet this technology could tell whether a person is lying or not too, which is pretty cool I think. It's as if we're all about to become aware that God has existed inside of us and around us all along.

  • Why are you people blocking Android users apps from loading just because we don't want your play store running on our systems?

  • Too rich coming from Google, you cannot ever pretend to respect privacy or diversity of thought at this point, no amount of day speakers will convince us otherwise, shady business attempting to become omniscient and influence behaviours. Rotten leftist faux-virtuous organisation.

  • A lot of what people think they "know" about other people actually consists of learned stereotypes that they have internalized over a lifetime and then apply instantly to people they see on the street. I can tell you with absolute certainty that most of the assumptions people make most frequently about me, based upon nothing more than my appearance, are quite wrong. There are also many stereotypes we all learn about behaviors, preferences, etc. and these can be inferred from these. Thus, most people think they can, for example, "tell" who is "gay" based upon what clothing they are wearing, what car they drive, what movies they like, etc. One of the most ridiculous examples of this that I have ever encountered is the idea that there are "gay cars," i.e., cars that are somehow preferred by women and, thus, "must" be indicative of homosexuality if a man is driving one. Volkswagen was influenced by this to such an extent that they redesigned their New Beetle to flatten out the curves and make the car more "masculine." But, here's another thing: the assumption that homosexuality has anything to do with being effeminate (if you're male) or masculine (if you're female). Neither has anything at all to do with whether one is a homosexual or not and, in fact, there are as many hyper-masculine looking men who are homosexuals as there are effeminate women who are lesbians. There are really only two factors that decide, ultimately, who is and who is not a homosexual: (1) an attraction to members of one's own sex and (2) either acting upon that attraction or considering acting upon it. Nothing else has anything to do with it and most of the rationale people use to decide who is "gay" are nothing more than stereotypes. The supposed accuracy of tests to determine who is "gay" are based upon stereotypes, not upon actual factual data about people.

  • Can u use camera on phone to tell if someone is a cop? Or a crook? Or new BFF? Someone should make app.. That is all.

  • The irony of how Bigoted Google actually is. Project Veritas recently exposed Google. Google's days are numbered for the great migration away from Google Platforms. Can't wait to see all those dollars walk to their competitors.


  • The comment at 26:00 about pretty people having their psychological traits affected over time as a result of being attractive is something I've come to believe just from life experience. Glad to see someone else mentioning it here, with some data to back it up.

  • He had me until the end, where he basically says, 'Privacy is dead, but just be ok with it. Keep liking things on Facebook. Hate bigotry instead.' I'm not sure that's the answer to what was otherwise a very good talk.

  • Democracy is not primarily a way of life, but a
    set of procedures for organising and operating government.
    There are no inherent substantive ends or core beliefs that are
    essential to democratic rule. The majority
    of a community may embrace any set of core beliefs that it

  • the algos ain't all that , unless you're hypnotized enough to be predictable utube is REALLY confused , their recommends are so wrong , so often , and it's a hobby to analyze their analysis
    but i don't tweet , my phone is dumb , never use Facebook , don't need their ,maps , etc.,etc……………………if you're "normal" , maybe what he says is relevant , but you get what you deserve if you ain't got nothin' , you got nothin' to lose

  • Democracy existed only in ancient Greece. We live in a republic. It means that "we" choose representatives who act according to their personal interests. Sometimes, their interests align with those of their voters, sometimes (most of the time) not. Most of the time voters don't even know what their representative is working on nor how it will affect their life.

  • Look at David Icke, we will be here in 50 years, this global warming is a swindle. Google are EVIL, they even said it when they started up "do no evil" LOL, they got rid of that real fast. Don't use or update face book, get rid of Alexia and such devices, use alternative search engines like DuckDuckGo, use Brave browser, use encrypted email, turn location services off on devices, use Linux (elementary, pop, mint etc…) as a desktop, tape up over cameras and microphones. Basically take control of what YOU can do and tell others and help others do the same.

  • Saying Goodbye to privacy is simplistic and convenient until the thought police come for you. Or until you are blackballed for posting politically incorrect content online. How does your No-Privacy theory address this threat of the thought police?

  • Google Maps is great at saving fuel by optimising routes. But they don't provide the best possible routes. Because they work at scale, they only provide good enough routes. Just a little nitpick. 🙂
