WOOT '17 Paper #43 Reviews and Comments
===========================================================================
Paper #43 Hello, Facebook! This is the stalkers' paradise!


Review #43A
===========================================================================

Overall merit
-------------
2. Weak reject

Reviewer expertise
------------------
3. Knowledgeable

Paper summary
-------------
The authors dissected a more or less new feature on facebook.com and found a way to bypass several privacy and anti-automation safeguards in order to extract a couple of thousand phone numbers of people living in a certain area of the US.

The authors describe how they did it and what could be recommended for better privacy protection.

Strengths
---------
The paper is well written and has good structure. The attack is well documented and clear. Few questions are left open after reading the paper. 

The ethical concerns that a reader develops a few lines in are also discussed later in the paper.

Weaknesses
----------
It seems that the authors confuse security issues with privacy issues. They uncovered a privacy bug and, to successfully exploit it, bypassed an anti-automation feature (i.e., CAPTCHA).

It is not clear what the academic value is. Some website leaks phone numbers. This is almost like a phone book with reverse search, which is available for the region the authors investigated.

Why not just report the issue and leave it at that?

Was the bug reported? How is such a thing relevant enough for a paper?

What happened to the data after harvesting? The authors only say "discarded"; did someone verify this?

What did the ethics board say? 

Was there any cross-correlation with reverse phone book searches offered by, e.g., the White Pages? How many numbers were actually meant to be private?

The numbers, 20k in seven days, are also not very convincing. This is roughly one number every 30 seconds. It could be done by a person; there is no need for a "framework".
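
For reference, the 30-second figure follows directly from the reported totals. A minimal sketch of the arithmetic, assuming the 20k results were spread evenly over the seven days (the function name is illustrative and not taken from the paper):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_item(items: int, days: float) -> float:
    """Average time between results if `items` are collected evenly over `days`."""
    return days * SECONDS_PER_DAY / items


# 20k numbers in seven days, as quoted in this review.
interval = seconds_per_item(20_000, 7)
print(f"about {interval:.0f} s per number")
```

This only gives an average; it says nothing about how many accounts or parallel sessions were needed to sustain it.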

The paper, despite being well written and well structured, leaves too many questions unanswered and appears shallow. 

Not enough contribution for a conference, hence a weak reject.


Review #43B
===========================================================================

Overall merit
-------------
2. Weak reject

Reviewer expertise
------------------
3. Knowledgeable

Paper summary
-------------
This paper demonstrates an attack on Facebook that allows the attacker to enumerate phone numbers and associate them with Facebook profiles by using Facebook's search functionality.

Strengths
---------
The attack is realistic, practical and threatens Facebook's users' privacy.

Weaknesses
----------
The attack is not novel and it has been demonstrated against other online services before (WhatsApp and KakaoTalk).

Comments to authors
-------------------
Overall, this is a realistic attack that threatens users' privacy. Facebook does have some defenses in place, but these are not enough and can be bypassed, as shown in this paper (multiple account creation and multiple queries in a short amount of time). Nevertheless, this is not a novel attack. Enumerating phone numbers in online services has been studied before [7,9], and discovering the same vulnerability in a different online service does not provide enough novelty to justify a new publication.

The authors should inform Facebook of the problems identified in this paper. It seems that Facebook does not care about the phone number enumeration, but it might be interested in the account creation bypass. This blog post performed the same attack and informed Facebook, which did not fix it: https://averagesecurityguy.github.io/2016/09/07/facebook-private-phone-enumeration/
In this paper only 87k phone numbers were tested, and Facebook's response was that you cannot enumerate a large batch. This means that you might not have hit their limit, or that after some point you simply started getting many empty results, as you mention in the paper. It is not clear how many false results you got from this countermeasure.

There are many spelling and grammar mistakes in the paper. Please proofread before submitting a paper.

Minor comments:
* You cite [2] twice in the related work.


Review #43C
===========================================================================

Overall merit
-------------
2. Weak reject

Reviewer expertise
------------------
3. Knowledgeable

Paper summary
-------------
The paper describes the possibility of crawling a social network for user profiles using randomly generated phone numbers, similar to how email addresses have been used by prior work. The authors develop a framework that uses Facebook's search tool to search for a user profile associated with a phone number and extracts the public details from the profile. The proposed framework also overcomes Facebook's rate-limiting by using a set of Facebook accounts with the crawler, instead of a single account, and simulating a different device environment for each account. The authors discover over 20000 valid profiles using over 80000 phone numbers with California prefixes over a period of 7 days, and identify numerous instances where users have made private information (e.g., friends, relations, hometown, date of birth) public.

Strengths
---------
1. While the idea of crawling for accounts using Facebook's search is not new, the paper demonstrates that phone numbers can be successfully used to crawl for user accounts (i.e., they discover over 20k user accounts, from only 80000 randomly generated phone numbers, in 7 days). 

2. The paper demonstrates that Facebook's rate-limiting techniques are rudimentary at best. Specifically, it seems that Facebook does not rate-limit the IP address, further lowering the bar for even novice attackers.

Weaknesses
----------
1. The writing is in poor shape, and needs major revision. See my detailed comments in the "Comments to authors" section.

2. The paper does not discuss related work to an acceptable depth. For instance, the work by Balduzzi et al., which uses emails for crawling for profiles, is closest to the proposed framework. However, the paper does not provide any comparison with the techniques used by Balduzzi et al. for improving the crawler's efficiency or for overcoming Facebook's rate limiting mechanisms. Are there any lessons that can be learned from the work of Balduzzi et al., which have been applied in this paper?

3. The proposed countermeasures are very abstract, and do not provide much information. For instance, it is clear that the paper exploits the fact that Facebook does not use any sophisticated means to detect misuse across accounts. However, the following proposed countermeasure does not provide any meaningful recommendation to solve the problem: "To solve this vulnerability, the service needs to have solutions for detecting the use of numerous accounts by one user." In some cases, where the suggested countermeasures make sense, the paper fails to elaborate. For instance, the paper recommends using server-side scripts for generating all content, instead of static content, to make crawling difficult. However, would this be feasible for Facebook's scale, and would it lead to other security issues?

Comments to authors
-------------------
The idea seems feasible, but the quality of the writing, as well as the content, needs a major revision. I have outlined some suggestions/comments/questions below:

1. The paper describes a "crawler", just like prior work that uses emails, or other work that crawls for Android applications on Google Play using randomly generated package names. The term "enumeration attack" has a completely different meaning, and does not apply to the attack described in this paper. My understanding is that an enumeration attack involves forcing an entity to enumerate some information it knows (e.g., forcing a server to enumerate its users). Follow the example of Balduzzi et al., who describe the attack as "crawling for profiles". 

2. "Actually on Facebook, there are more than 99% open profiles." Is there any evidence you can cite for this?

3. Please correct the following citations, and thoroughly check other citations in the paper for format and spelling:
Balduzzi, Marco, et al --> Balduzzi et al. 
Malduzzi, Marco, at el --> Balduzzi et al.
Mahmood "at" al --> Mahmood "et" al.

4. Many sentences in the paper are possibly fragments, and do not make sense. For example, consider this: "According to the work implemented by Malduzzi, Marco, at el [2], a small number of users who can see their profile by changing their privacy settings."

5. The paper needs a thorough grammar check. For example, 
a. "For creating multiple Facebook accounts against from these protections..." --> "For creating multiple Facebook accounts in spite of these protections..."
b. "Facebook can not aware that two attacks..." --> "Facebook cannot be aware that the two attacks..."
c. "...protection against anomaly searching behavior." --> "...protection against anomalous searching behavior."

6. At the beginning of Section 4, the paper states that Facebook's rate limiting mechanisms can be evaded using multiple accounts, and by introducing a delay. However, the discussion does not include anything about introducing delays. 

7. (Editorial Comment:) If possible, position tables and figures at the top of the page.

8. Was Facebook ever contacted and made aware of the vulnerability described in the paper?