Marc Al-Hames, Cliqz, European DataEthics Forum 2017


I’m Marc Al-Hames. I’m the CEO of Cliqz. We’ve
built a German search engine and a browser. It’s a combined product. We don’t store any
data about our users, and we also prevent everyone else from tracking you across the internet.
We’re a team of one hundred-ish people in Munich. We’re financed by Hubert Burda Media, which
is a big publisher in Europe and Mozilla invested in us – the maker of Firefox. You shouldn’t
trust me, by the way. I think this is a theme that came across today, because I could just
be a sleazy presenter. You should actually check all the facts. This is why everything
we do is open source, and I’ll talk about this in a second. I’m not against data. Data is really
important, and the last thing we want to achieve is a world without data, because that would be a really
terrible and awful world. I’ll give you one example: We built a search engine. I’m sure Brian has
the same problem. This is a typical query in Germany, it says “Red Roses” (Rote Rosen).
The best answer for this – and we actually find it – is a TV series. It’s a famous
German TV series, I think it airs once a week and it’s called “Red Roses”. You can pretty
much learn this from crawling the web. Except if you crawl the web you’ll make one terrible
mistake and this is on the 14th of July, no 14th of February, I’m a bad husband (laughs).
It’s the one day where if people search for “Red Roses” the TV series really doesn’t count
because the first result has to be “How to order red roses”. You cannot learn this from
crawling, you can only learn this if you observe what people do in a certain time frame. And
this is why data is really, really important. And you cannot just say: Let’s not collect
any data about what people do because you’ll miserably fail if you want to build a search
engine. This is the second half now: Data is really, really bad and no one should be
allowed to collect it. People actually have a very good intuition about it. There was a
Forsa questionnaire where people were asked: “Do you trust the internet?”. What you can
see is 80 percent don’t trust the internet with their data. Now people don’t yet change
because of convenience and laziness and all the good reasons for that. But they really
should because they’re right not to trust it and I want to give you an example. I brought
anonymous marketing data. So you can go to the market to one of those big exchanges and
say “Look I want to target 20 to 30-year-old women with good household income because I
want to market my new product to them” and what you get is – by the way completely
GDPR compliant – data back. I give you some examples. This is one of the data points you
get back. It’s something that a user has visited. Now I blacked out some things, it doesn’t
really matter. I have a second one: This is a German doctor. Anyone concerned with me
showing that web page? Better. You should see a hammer coming (to the audience). I have
a third one: You can buy this for around 1 (Euro) Cent per thousand. So it’s really cheap.
Anyone concerned? (to the audience) You should really all raise your arms by now. But you
obviously don’t because how should you know. This is a bank. You can see a bank account
number in there. It got into the data set by accident but it happens more often than
you would believe. Now, all three of these URLs by definition are actually not private.
They’re public. You can find them in the search index. It’s really not a problem but conveniently
the marketing exchanges give you one more data point and that’s “One person has visited
all these three domains”. Now you’ve learned quite a bit. You learned that someone is visiting
the Fitness Profile of “LittleVegan…” maybe her or himself. That same person has access
to the administrator account of this doctor and has an account at a local bank in Munich.
With these three data points and any search engine like FindX, Cliqz, Google, Bing you
will find out it’s her. It’s just three data points. Whenever I have these sets it usually
takes 3 to 9 URLs and you have any person identified. There are organizations that have
significantly more than just 3 URLs. We did a large study in Germany seeing who monitors
what of the internet traffic. So you find Google has around 60 percent, Facebook around
23 – this is 2 years old, Facebook by now is at around 40 percent. Just to make this
number clear, because very often you hear “Well, then don’t log into Google if you don’t
like them, don’t log into Facebook”. This number is if you don’t use their services.
So if today you’re annoyed, you destroy your Android phone, you remove Chrome from your
Computer, you never watch a YouTube video again and you never do a Google search again.
If you do all these steps and actively opt out, and by this make your life very miserable
– if you really do this, they (Google) will still see 6 out of 10 things you do on the
internet. If you never had a Facebook account, they (Facebook) still know 2-4 out of 10 things
you do on the internet. You have never opted into that. That’s why it’s so important that
you install Better or something like Cliqz because we stop this. Now you might say “I
have nothing to hide”, and we heard this example in the morning, maybe you don’t even mind a
microphone on the toilet, but it’s not that simple. I, for example, have a cold today and
I have no problem sharing this information with you and if my voice goes down I’m sorry
because I’m having a cold. I might, however, have a problem sharing with all of you that
I have done an HIV test this morning. There’s a good reason why we don’t communicate by
coming on stage “Guys, I’m really excited being here, I did an HIV test this morning”.
It’s usually up to me to decide if I share this information or not. Except for when I
use the internet. If I inform myself about an HIV test at Mayo Clinic, you’ll find
that all these companies (by the way this is the Cliqz blocking) are spying on me and
storing this data. And there are companies like for example BlueKai, they now belong
to Oracle. Their business model is selling profiles. You can go there and ask for profiles
of people who did an HIV test. This is not okay. Now, even if you’ve never had any sickness
or don’t mind sharing it: all of us have a bank account. At least most of us. Most banks
actually have trackers when you’re logged in. This is when you’re logged into HSBC and
you see trackers getting that information. Now I’ve never opted into that and actually
I believe most of the banks don’t know about it but the way the infrastructure today is
built is: People use frameworks to build Apps, to build websites and they come pre-packaged
with all this shit. It means that all my private information is just public and I can actually go
to a market exchange and buy all this data, and it’s incredibly cheap. Don’t even get me
started once you’re logged into something. Because then they actually have all your data.
So if you’re using Google while being logged in or if you log into Facebook, they literally
have everything. And it doesn’t stop there. That’s the ultimate problem. The complete
ecosystem is built on hundreds of companies exchanging all this data. So even if you actually
trust Google and Facebook, which probably have the best security engineers in the world,
the business model is that all these tiny companies exchange all the information and
you just need to find the weakest link, the one guy on that small island without GDPR
who’s going to sell you that data and there is always the weakest link. Even if you trust
companies, you really shouldn’t. Really not. Because all data – and this is Marc’s law
of data (my law of data) – that gets collected eventually becomes public. Even if the company
is the most trustworthy company in the world – and I know there is a lot of Google bashing,
I actually think it’s not a really bad company – you have to trust them forever and you
have to trust each and every individual in that company. Now, MySpace used to be a huge
company. It used to be the Google of its time. Anyone remember MySpace (to the audience)?
I think they’re bankrupt or sold or something. I don’t even know. Yahoo just got sold. And
whenever a company gets sold all your data gets sold, too. So this is the example of
BlueKai with their 700 million profiles, like the HIV example: they got acquired by Oracle.
Now Oracle might decide to just change the privacy policy; the data will never be deleted.
You just need one unethical employee and all your privacy protection is gone, you just
need one hacker going into the data and stealing it and all your data is gone. Now, I said,
“We need data because the internet as we know it will fall apart if we don’t have data”.
We believe in privacy by design; we call it “Human Web”, and actually with Cliqz we collect
a hell of a lot of data, and “Red Roses” is a very good example for that. However, you shouldn’t
trust us: we strictly distinguish between what’s on your device and what reaches the
Cliqz server. There are actually third party proxies between us and the users. All the
code is open source and the rule is: The moment the data reaches our technology, our backend,
on principle it should be possible to make that data public and none of our users should
be in danger. So it’s much more than taking out an IP address like the GDPR asks or not
having a Unique Identifier (UID) for a user. It is for example not having the linkage between
two subsequent actions a user does. This is really important. And this is our internal
mantra: If one of our employees turned really evil or if someone decided to hack
our computers, it would be bad and we wouldn’t be happy about it, but we are relatively certain
none of our users would ever be in danger. I say “relatively certain” because there is
no 100 percent certainty in technology, there never is but to the best of our knowledge
and all the penetration tests that we do: it is safe! We’re at the beginning of the
industry, every industry that starts is really ugly, dirty and shitty because growth is money
in the beginning. We’re only 20-some years old with the internet. The iPhone is just
10 years old, we’re babies in that perspective. And it’s our job to change this picture to
what Europe looks like today. We wouldn’t accept coal mines and industry looking like this
in Europe and we shouldn’t accept this happening for data either. I can’t promise when this
will happen but I’m absolutely sure eventually customers will stand up and fight against
this. Thanks a lot.

Danny Hutson
